Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Bloom Filters: http://en.wikipedia.org/wiki/Bloom_filter

Sort of a probabilistic hash, where you trade space for accuracy. But it's also like a memory function - it can remember if it has seen a piece of data before.



What I love about bloom filters is that when checking to see if something is in a bloom filter, you can only get false positives, not false negatives. That property is just awesome to me.


Also the fact that you can use more bits per element and reduce the error rate for false positives. Using around 3 bytes per element, can bring the error rate down to 0.001%.


also the fact that you can use multiple hash functions to reduce the error rate even further is pretty nice. this is of course without incurring the overhead of extra storage space...


Using multiple hash functions requires that you use additional storage in order to maintain the same occupancy ratio.


I love that they're fast enough to use in hardware. Some of the newer schemes for hardware-accelerated transactional memory use Bloom filters to probabilistically detect when two transactions are using the same memory addresses. This is so much easier and more efficient than previous approaches, which had to use content associative memories and expensive broadcast operations.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: