
I think the blue team CTF AI talk was a good benchmark of where we are right now: https://media.ccc.de/v/39c3-breaking-bots-cheating-at-blue-t...




Thank you, and happy to answer questions on that, it's been a crazy time!

Maybe of relevance to non-security people here:

1. Most of it is about AI investigating event data in general, not just SOC/IR: cyber, intel, fraud, SRE, and we're even messing with customer 360 & social media data

2. For anyone into vibes coding or building agents, I encourage jumping to the "self-writing AI" section, where we're finding that we are moving internally from vibes coding -> vibes engineering -> and finally now to eval-driven AI coding loops

And, for anyone in security, doing careful evals here has indeed strongly colored my view on the market :)


Hey, I just saw your talk, and for someone who's not really up to date with the latest AI developments, it's eye-opening what you've got going in SOC investigations.

I personally work as a pentester and we're still doing a lot of manual work, with AI simply as a better version of Google, but seeing the BOTS presentation I feel we can do better. Do you have any idea if anyone's working on something similar to Louie in the pentesting space, or if Louie could work with pentesting workflows?


Companies like xbow and horizon are using agents that talk to symbolic tools to automate more red teaming flows for different domains, so very much so. As shown in my talk, modern models are quite capable. Those tools aren't doing investigation-level scenario depth, though, more like scans, so that seems to be becoming the new expectation that everyone can & will do.

Companies like Trail of Bits are more interesting to me here, because they historically do deeper analysis. A place to look there is the DARPA cc x ai (?) competition that finished at Black Hat last year.

If in the US, we may be looking for a pen testing partner on an upcoming agentic AI contract, so feel free to msg - Leo @ graphistry


Thanks for the answers! Will look into this some more. I'm not based in the US, I'm afraid, but thanks for mentioning it.


