At very high scale, graphs see less usage, or clustering is layered on top of the graphs.
Graphs can be complex to build and rebalance. Graph-like data structures, where one node points out to another node somewhere else in memory, aren't that cache friendly.
Add to that, people almost always want to *filter* vector search results. And this is a huge blind spot for both consumers and providers; it's where the ugly performance surprises come from. Filtered HNSW isn't straightforward: it requires you to just keep traversing the graph looking for results that satisfy your filter.
HNSW came out of a benchmark regime where we just indexed some vectors and tried to maximize recall at a given query latency. It doesn't account for the filtering / indexing almost everyone wants.
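To make that traversal cost concrete, here's a minimal sketch of filtered best-first graph search. This is a toy (a 1-D line "graph" and a trivial distance function, all names mine, nothing like a real HNSW with layers), but it shows why a selective filter forces the search to visit many more nodes than an unfiltered query:

```python
import heapq

def filtered_greedy_search(graph, dist, entry, k, predicate, max_visits=1000):
    """Best-first traversal that keeps expanding until k nodes satisfying
    `predicate` are found, or the visit budget runs out."""
    visited = {entry}
    frontier = [(dist(entry), entry)]  # min-heap ordered by distance to query
    results, visits = [], 0
    while frontier and len(results) < k and visits < max_visits:
        d, node = heapq.heappop(frontier)
        visits += 1
        if predicate(node):
            results.append((d, node))
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                heapq.heappush(frontier, (dist(nb), nb))
    return results, visits

# Toy 1-D "graph": node i links to i-1 and i+1; the query sits at node 50.
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 100] for i in range(100)}
dist = lambda n: abs(n - 50)

# Unfiltered: the first 3 pops already satisfy the (trivial) predicate.
_, visits_all = filtered_greedy_search(graph, dist, 50, 3, lambda n: True)
# Selective filter (1 node in 10 passes): the traversal must go much deeper.
hits, visits_filtered = filtered_greedy_search(graph, dist, 50, 3,
                                               lambda n: n % 10 == 0)
```

With the trivial predicate the search stops after 3 visits; with the selective one it has to pop its way through every non-matching node in between. That blow-up is exactly the performance surprise described above.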
Turbopuffer, for example, doesn't use graphs at all, it uses SPFresh. And they recently got 200ms latency on 100B vectors.
There is an entire section of the post about that. I believe it's more the illusion of a problem, created by product design issues, than a real challenge, since far results that match the filter are totally useless.
Doesn't this depend on your data to a large extent? In a very dense graph "far" results (in terms of the effort spent searching) that match the filters might actually be quite similar?
The "far" here means "with vectors having a very low cosine similarity / very high distance". So in vector use cases where you want near vectors matching a given set of filters, far vectors matching the filters are useless. That's why Redis Vector Sets have another "EF" (effort) parameter just for filters: if not enough results have been collected so far, you decide how much effort to spend. If you want to scan the whole graph, that's fine, but by default Redis will do the sane thing and stop early once the remaining vectors are already far anyway.
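As a rough illustration of those two knobs, here's a toy sketch (my own code, not Redis's actual implementation; `filter_ef` and `max_dist` are stand-ins for the filter-effort and "stop when already far" behavior described above):

```python
import heapq

def filtered_search_early_stop(graph, dist, entry, k, predicate,
                               filter_ef=500, max_dist=None):
    """Best-first search with two knobs for the filtered case:
    `filter_ef` caps how many nodes we inspect while hunting for filter
    matches, and `max_dist` stops the search once the closest remaining
    candidate is already far from the query."""
    visited = {entry}
    frontier = [(dist(entry), entry)]  # min-heap ordered by distance
    results, visits = [], 0
    while frontier and len(results) < k and visits < filter_ef:
        d, node = heapq.heappop(frontier)
        if max_dist is not None and d > max_dist:
            break  # everything left is at least this far: stop early
        visits += 1
        if predicate(node):
            results.append((d, node))
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                heapq.heappush(frontier, (dist(nb), nb))
    return results

# Same toy 1-D graph as before: node i links to i-1 and i+1, query at 50.
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 100] for i in range(100)}
dist = lambda n: abs(n - 50)

# Matches exist at nodes 40 and 60, but both are "far" (> max_dist), so
# the search returns only node 50 instead of scanning the whole graph.
hits = filtered_search_early_stop(graph, dist, 50, 3,
                                  lambda n: n % 10 == 0, max_dist=5)
```

The early stop sacrifices the far-but-matching results on purpose, which is exactly the "sane default" argument: those results were useless anyway.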
I'm facing the problem you describe daily. It's especially bad because it's very difficult for me to predict if the set of filters will reduce the dataset by ~1% (in which case following the original vector index is fine) or by 99.99% (in which case you just want to brute force the remaining vectors).
Tried a million different things, but haven't heard of Turbopuffer yet. Any references on how they perform with such additional filters?
Lucene and ES implement a shortcut for filters that are restrictive enough. Since the engine is already optimized for figuring out which documents fall into your filter set, you first determine that set's cardinality. You then traverse the HNSW normally, and if you have visited more nodes than the filter set's cardinality, you switch to brute-forcing distance comparisons over the filter set. So the worst case is 2x your filter set's size in vector distance operations. Quite neat.
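A toy sketch of that fallback strategy (my own code, not Lucene's or ES's actual implementation; the `graph_search` callable stands in for a budget-limited HNSW traversal):

```python
import math

def l2(a, b):
    """Plain Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force(vectors, filter_ids, query, k):
    """Worst-case fallback: score only the vectors that pass the filter."""
    return sorted(filter_ids, key=lambda i: l2(query, vectors[i]))[:k]

def search_with_fallback(vectors, filter_ids, query, k, graph_search):
    """Try the graph index with a visit budget equal to the filter set's
    cardinality; if it can't collect k matches within that budget, switch
    to brute force. Total distance computations stay under roughly
    2x |filter set|: the budgeted traversal plus the brute-force pass."""
    hits = graph_search(query, k, set(filter_ids),
                        max_visits=len(filter_ids))
    if len(hits) < k:
        hits = brute_force(vectors, filter_ids, query, k)
    return hits

# Demo: a very restrictive filter (3 docs out of 1000) and a graph search
# that exhausts its budget without finding enough matches.
vectors = [[float(i)] for i in range(1000)]
filter_ids = [3, 500, 997]
failed_graph_search = lambda q, k, ids, max_visits: []  # budget exhausted
hits = search_with_fallback(vectors, filter_ids, [499.0], 2,
                            failed_graph_search)
```

The point of the budget is that you never pay more for the graph traversal than the brute-force scan would have cost in the first place.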
Oh that's nice! Any references on this shortcut? How do you activate that behavior? I was playing around with ES, but the only suggestion I found was to use `count` on filters before deciding (manually) which path to take.
Not really. The thing Vespa solved was taking existing ANN methods and fixing a disk/RAM tradeoff (and some other niceties). That's nowhere close to adequate when:
1. As softwaredoug mentioned, you might want to filter results, potentially with a high filtration rate.
2. ANN isn't good enough. Suppose you need bounded accuracy with meaningfully sublinear time on a high-dimensional dataset. You're hosed.
Point (1) is just a repeat of a general observation that composition of nice data structures doesn't usually give you a nice data structure, even if it technically works. Creating a thing that does what you want without costing both arms and the president's leg requires actually understanding DS&A and applying it in your solution from the ground up.
Point (2) might seem irrelevant (after all, people are "building" stuff with RAG and whatnot nowadays aren't they?), but it's crucial to a lot of applications. Imagine, e.g., that there exists one correct result in your database. The guarantees provided by SOTA ANN solutions (on high-dimensional data) have a steep compute/correctness tradeoff, giving you an abysmal chance of finding your document without searching an eye-watering fraction of your database. I usually work around that with the relaxation that the result needs to be the best one but that its information can be fuzzy (admitting solutions which merge a bunch of low-dimensional queries, corrected via some symmetry in a later step), but if you actually need good KNN results then you're kind of hosed.
Just curious what the state of the art around filtered vector search results is? I took a quick look at the SPFresh paper and didn't see it specifically address filtering.
https://turbopuffer.com/docs/vector