At Portkey, this is a problem we deal with quite a bit. Also the reason that Dat...

lmeyerov · on July 18, 2024

Clickhouse has text + vector indexes, so that may be native, though we've never used them, and I find vector indexes tricky to scale with other DBs. Text indexes alone, or neither, may be enough in practice though, as we mostly only care about searching on metadata dimensions like task.

We are thinking about sampled hot data for ops staff in otel DB+UIs, and long-term full data in S3/Clickhouse for custom tooling. It'd be cool if we could send Clickhouse historical otel sessions to grafana etc on demand, but likely a bridge too far...

nirga · on July 18, 2024

I think you can (pretty) easily set this up with an otel collector and something that replays data from S3 - there's a native implementation that converts otel to clickhouse

lmeyerov · on July 18, 2024

Our scenario would be more like using Clickhouse / a dwh for session cohort/workflow filtering and then populating otel tools for viz goodies. Interestingly, to your point, the otel python exporter libs are pretty simple, so SQL results -> otel spans -> Grafana temp storage should be straightforward!
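
A rough sketch of the "SQL results -> otel spans" step, stdlib only. Assumptions, none of which come from the thread: `sqlite3` stands in for Clickhouse / the dwh, the `steps` table and its columns are hypothetical, and the helper names (`make_demo_db`, `fetch_cohort`, `rows_to_otlp`) are made up for illustration. Instead of the otel Python SDK, it hand-builds an OTLP/JSON trace payload, so treat it as a shape reference rather than a real exporter.

```python
import hashlib
import json
import sqlite3


def hex_id(seed: str, n_bytes: int) -> str:
    """Derive a stable hex ID of n_bytes from a seed, so replays are repeatable."""
    return hashlib.sha256(seed.encode()).hexdigest()[: n_bytes * 2]


def make_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for the warehouse, with a hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE steps (session_id TEXT, step INTEGER, task TEXT,"
        " start_ns INTEGER, end_ns INTEGER)"
    )
    db.executemany(
        "INSERT INTO steps VALUES (?, ?, ?, ?, ?)",
        [
            ("s1", 0, "summarize", 1_000, 2_000),
            ("s1", 1, "summarize", 2_000, 5_000),
            ("s2", 0, "translate", 1_500, 1_900),
        ],
    )
    return db


def fetch_cohort(db: sqlite3.Connection, task: str) -> list:
    """Cohort filter on a metadata dimension (here: task)."""
    query = (
        "SELECT session_id, step, task, start_ns, end_ns FROM steps"
        " WHERE task = ? ORDER BY session_id, step"
    )
    return db.execute(query, (task,)).fetchall()


def rows_to_otlp(rows, service_name: str = "sql-replay") -> dict:
    """Map each row to one span; one trace per session."""
    spans = []
    for session_id, step, task, start_ns, end_ns in rows:
        spans.append(
            {
                "traceId": hex_id(session_id, 16),  # 16 bytes, hex-encoded
                "spanId": hex_id(f"{session_id}/{step}", 8),  # 8 bytes, hex-encoded
                "name": task,
                "kind": 1,  # SPAN_KIND_INTERNAL
                "startTimeUnixNano": str(start_ns),
                "endTimeUnixNano": str(end_ns),
                "attributes": [
                    {"key": "session.id", "value": {"stringValue": session_id}}
                ],
            }
        )
    resource = {
        "attributes": [
            {"key": "service.name", "value": {"stringValue": service_name}}
        ]
    }
    scope_spans = [{"scope": {"name": "sql-replay"}, "spans": spans}]
    return {"resourceSpans": [{"resource": resource, "scopeSpans": scope_spans}]}


if __name__ == "__main__":
    payload = rows_to_otlp(fetch_cohort(make_demo_db(), "summarize"))
    print(json.dumps(payload, indent=2))
```

The resulting JSON could then be POSTed with `Content-Type: application/json` to a collector's OTLP/HTTP traces endpoint (`/v1/traces`, port 4318 by default), which would forward it to whatever backend Grafana reads from. Parent/child links (`parentSpanId`) are left out here; a real replay would likely need them to render the session as a tree.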