Iceberg, the Right Idea – The Wrong Spec (database-doctor.com)
3 points by DannyPage 8 months ago | 1 comment


We just finished implementing Iceberg on top of a large set of Parquet files stored in S3. It’s a neat idea to be able to turn a lot of data files into a SQL database, but I absolutely understand the pain and confusion the author writes about, especially around how it handles metadata. It creates a lot of metadata files and makes a large mess of the directory. Some queries that I know should only need to read a single Parquet file take up to 30 seconds.

I don’t think we’ll scrap it, and there are certainly ways to speed up the problematic aspects of querying the catalog, but I’m also rooting for DuckLake to make this kind of setup a lot more approachable by not completely shying away from the idea of a database.



