Hacker News | Hauk307's comments

Right now it's purely automated: 50+ compliance checks against the A2A spec (agent card validation, endpoint testing, state-machine behavior, streaming, auth, error handling). Each check is weighted and rolled into the 0-100 score.
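A minimal sketch of how weighted checks could roll up into a 0-100 score. The check names and weights here are illustrative assumptions, not the actual A2A checklist or its weighting:

```python
# Hypothetical weighted-score rollup: each check passes or fails,
# and its weight determines how much it contributes to the total.
def compliance_score(results: dict[str, bool], weights: dict[str, float]) -> float:
    """Weighted pass rate, scaled to 0-100."""
    total = sum(weights.values())
    earned = sum(w for name, w in weights.items() if results.get(name, False))
    return round(100 * earned / total, 1)

# Illustrative checks and weights (not the real A2A checklist).
weights = {
    "agent_card_valid": 3.0,
    "endpoints_reachable": 2.0,
    "streaming_supported": 1.5,
    "auth_enforced": 2.0,
    "errors_conform": 1.5,
}
results = {
    "agent_card_valid": True,
    "endpoints_reachable": True,
    "streaming_supported": False,  # the one failing check
    "auth_enforced": True,
    "errors_conform": True,
}
print(compliance_score(results, weights))  # 85.0
```

Weighting matters here because a missing agent card should hurt the score more than, say, a malformed error payload, even though both are single checks.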

But you're right that automated spec compliance only tells part of the story. The roadmap includes usage signals, uptime monitoring, response latency tracking, and community ratings from developers who've actually integrated with an agent. The spec tells you if an agent CAN work. Usage data tells you if it DOES work.

The profile pages are designed with that in mind: test history over time already shows trends, and adding real-world signals is the natural next layer.


This is exactly what I've been looking for. I run a Mac mini as an always-on server for a side project and the API costs for cloud models are adding up fast. Being able to run something locally even at slower speeds would be a game changer for background tasks. What kind of tokens/sec are you seeing on a base M2 Mac mini with 16GB?


Well this is humbling. I've been running Claude on a Mac mini for a side project and was just looking at my API bill wondering if there was a cheaper way. Seeing someone at $4.5K makes me feel a little better about mine. Cool idea for a leaderboard though — turns the pain of spending into a game.


This is cool. I'd use it to track when state wildlife agencies update their regulation pages — those change once a year with no announcement and I always miss it. Element-level tracking would be perfect for that vs watching the whole page. To answer your question: I'd want both RSS and direct alerts (email/push) depending on urgency.

