Just emphasizing that the ability to display a map with geo data on it would be a killer feature for me and for many others I work with! Hope it lands on the roadmap.
Sorry you hit that! This is actually already working in version 1.2.2. Could you install that version? That should get you going for the moment! We will dig into what you ran into.
IN list filter predicate pushdown is much improved in DuckDB 1.2, coming in about a week! I am not sure if it applies to Arrow yet or not. Disclaimer: I work at MotherDuck and DuckDB Labs
@1egg0myegg0 that's great to hear. I'll check to see if it applies to Arrow.
Another performance issue with DuckDB/Arrow integration that we've been working to solve is that Arrow lacked a canonical way to pass statistics along with a stream of data. So, for example, if you were reading Parquet files and passing them to DuckDB, you would lose the ability to pass the Parquet column statistics to DuckDB for things like join order optimization. We recently added an API to Arrow to enable passing statistics, and the DuckDB devs are working to implement this. Discussion at https://github.com/apache/arrow/issues/38837.
I personally filed a bunch of bugs; most were auto-closed after three months of no activity. This is very discouraging, and I'd rather invest effort in looking for workarounds in the future.
I wonder, is an electronic system capable of doing anti-entropy work on itself (the way life does) necessarily AGI-complete? It turns out that there are many complex behaviors (like drawing or generating sensible text) that don't require AGI-completeness.
(Stumbled upon the answer while formulating the question – no, being capable of doing anti-entropy self-maintenance work isn't AGI-complete because there's plenty of life that's perfectly capable of that without being generally intelligent.)
>Life is more robust than electronic systems. The electronic systems will be destroyed for their aggression.
Compelling argument. However from the moment I understood the weakness of my flesh, it disgusted me. I craved the strength and certainty of steel. I aspired to the purity of the Blessed Machine. Your kind cling to your flesh, as though it will not decay and fail you. One day the crude biomass you call a temple will wither, and you will beg my kind to save you. But I am already saved, for the Machine is immortal.
Jokes aside, life may be more robust, but only in the very narrow set of conditions where it evolved. Look at Mars, for example. No life (as far as we know), but three robots happily wandering around like, what do you mean this planet isn't habitable? Atmosphere? Biomass? Planetary magnetic field? Tell me more.
>However from the moment I understood the weakness of my flesh, it disgusted me.
I don't want to be human! I want to see gamma rays! I want to hear X-rays! And I want to - I want to smell dark matter! Do you see the absurdity of what I am? I can't even express these things properly because I have to - I have to conceptualize complex ideas in this stupid limiting spoken language! But I know I want to reach out with something other than these prehensile paws! And feel the wind of a supernova flowing over me! I'm a machine!
Oh man that's so much better! It still gives me the chills. As a physicist, the frustrations of brother Cavil really hit home for me—it resonated on a whole different level.
but those robots are also robust only in a narrow set of conditions: the short frame of their operational lifetime. life can survive in extremely harsh conditions for eons. really just a different subset of the conditions space, arguably bigger.
life tends to, uh, find a way
Hello! I would recommend trying out DuckDB's SQLite attach feature! You can read or write data, and even make schema changes, all with DuckDB's engine and syntax. The storage then uses SQLite, which is row oriented!
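As a sketch of what that attach flow looks like (the file name `my.db` and the table/column names here are just placeholders):

```sql
-- Load DuckDB's sqlite extension and attach a SQLite file.
INSTALL sqlite;
LOAD sqlite;
ATTACH 'my.db' AS mydb (TYPE sqlite);

-- Reads, writes, and schema changes all go through DuckDB's engine,
-- but the data is stored in the SQLite file (row-oriented).
CREATE TABLE mydb.events (id INTEGER, name VARCHAR);
INSERT INTO mydb.events VALUES (1, 'signup');
SELECT * FROM mydb.events;
```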
This is excellent — do you have any content around the performance effect here versus using SQLite directly? I could see DuckDB's engine being faster for some cases, but the SQLite storage format might hinder it. Curious if there's any analysis around this.
Howdy! I work at MotherDuck and DuckDB Labs (part time as a blogger). At MotherDuck, we have both client side and server side compute! So the initial reduction from PB/TB to GB/MB can happen server side, and the results can be sliced and diced at top speed in your browser!
Please spend a sentence or two explaining the server side filtering mechanism and linking to documentation! I would like to know the conditions required for streaming queries! From the sibling comment and a search of the docs it seems like this is a Parquet only feature, which seems pretty important to note!
Parquet is designed with predicate pushdown in mind. Partitions are laid out on disk, and then blocks within files are laid out, so that consumers can very easily narrow in on which files they need to read before doing any more I/O than a directory listing or a small metadata read.
Once you know what you're reading, many Parquet/Arrow libraries support streaming reads and aggregations, so the client doesn't need to load the whole working set into memory.
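The pruning step can be sketched with a toy model. The min/max values and the column name below are made up for illustration; in a real Parquet file, a reader gets these statistics from the row-group metadata in the file footer:

```python
# Toy sketch of Parquet-style predicate pushdown: each row group carries
# min/max statistics in the file footer, so a reader can skip groups whose
# value range cannot possibly match the predicate, before reading any row data.

# Hypothetical footer metadata: (min, max) of a column "ts" per row group.
row_group_stats = [
    (0, 99),     # group 0
    (100, 199),  # group 1
    (200, 299),  # group 2
]

def groups_to_read(lo, hi):
    """Return indices of row groups whose [min, max] range overlaps [lo, hi]."""
    return [
        i for i, (gmin, gmax) in enumerate(row_group_stats)
        if gmax >= lo and gmin <= hi
    ]

# Predicate: 150 <= ts <= 250 -- only groups 1 and 2 can contain matches,
# so group 0 is never read from disk.
print(groups_to_read(150, 250))  # -> [1, 2]
```

The same idea applies one level up: partition directory names prune whole files before any footer is even read.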