Kind of amazes me how many people bitch about agent performance but don't hook t...

chrisweekly · 2025-09-21T22:32:11 1758493931

Good point. Also (tangent), I followed your profile link to https://sibylline.dev and am thoroughly impressed. Stoked to have found your treasure trove of repos and insights.

CuriouslyC · 2025-09-21T23:27:53 1758497273

Don't play with them unless you're good at debugging alpha code (Claude/Codex can do it fine). I haven't ironed out env-specific stuff or clarified the installation/usage, and I'm still doing UI polish/optimization passes (yay async SIMD Rust). I'll do showy releases once I've got the tools one-click-install ready. In the meantime, please feel free to drop an issue on any of my projects if you have feature requests or questions.

chrisweekly · 2025-09-22T01:10:54 1758503454

Sounds good, will do. Good luck getting them polished up!

yahoozoo · 2025-09-22T02:15:52 1758507352

Could you elaborate? How does knowing numerical usage metrics help?

CuriouslyC · 2025-09-22T03:23:33 1758511413

With Phoenix + ClickHouse being fed from OTel, you can run queries over your traces for deep analysis. If I want to see which tool calls are failing and why (or just get tool statistics), or find common patterns in flagged/failure traces ("simpler solution") and their causes, it's one query and some wiring.
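To give a rough idea of what "one query" could look like, here is a minimal sketch. It uses an in-memory SQLite table as a stand-in for the trace store, and the `spans` table, its columns (`trace_id`, `tool_name`, `status`, `error`), and the sample rows are all hypothetical. They are not the actual Phoenix, ClickHouse, or OTel schema, so treat this as the shape of the analysis rather than something to paste in.

```python
import sqlite3

# Stand-in for the trace store. The table and column names are assumptions
# made for illustration; a real OTel-fed ClickHouse schema will differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spans (trace_id TEXT, tool_name TEXT, status TEXT, error TEXT)"
)
conn.executemany(
    "INSERT INTO spans VALUES (?, ?, ?, ?)",
    [
        ("t1", "bash", "ok", None),
        ("t1", "edit_file", "error", "file not found"),
        ("t2", "bash", "error", "timeout"),
        ("t2", "bash", "ok", None),
        ("t3", "edit_file", "error", "file not found"),
    ],
)


def tool_failure_stats(conn):
    """Return (tool_name, calls, failures) rows, most failures first."""
    return conn.execute(
        """
        SELECT tool_name,
               COUNT(*) AS calls,
               SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) AS failures
        FROM spans
        GROUP BY tool_name
        ORDER BY failures DESC, tool_name
        """
    ).fetchall()


def top_errors(conn, tool_name):
    """Return (error, count) rows for one tool, most common error first."""
    return conn.execute(
        """
        SELECT error, COUNT(*) AS n
        FROM spans
        WHERE tool_name = ? AND status = 'error'
        GROUP BY error
        ORDER BY n DESC, error
        """,
        (tool_name,),
    ).fetchall()


if __name__ == "__main__":
    for row in tool_failure_stats(conn):
        print(row)
    print(top_errors(conn, "edit_file"))
```

The first function answers "which tool calls are failing", the second answers "why" for a single tool. Against a real store, the same two GROUP BY queries would presumably run over whatever span attributes your collector actually records.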

thewisenerd · 2025-09-22T04:45:20 1758516320

you mention traces

but the documentation only specifies metrics and logs

https://docs.claude.com/en/docs/claude-code/monitoring-usage

am i missing something here? or is this a case where they were too lazy to do traces in the claude code sdk, so it's logs + log attributes?

CuriouslyC · 2025-09-22T06:40:53 1758523253

You can get traces out of every agent framework that doesn't suck. You might need to collect from a proxy to get them from Claude because Anthropic is bigtime amateur hour.