
Hi! We actually built a service to detect indirect prompt injections like this one. I tested the exact prompt used in this attack, and we were able to successfully detect the indirect prompt injection.

Feel free to reach out if you're trying to build safeguards into your AI system!

centure.ai

POST - https://api.centure.ai/v1/prompt-injection/text

Response:

  {
    "is_safe": false,
    "categories": [
      { "code": "data_exfiltration", "confidence": "high" },
      { "code": "external_actions", "confidence": "high" }
    ],
    "request_id": "api_u_t6cmwj4811e4f16c4fc505dd6eeb3882f5908114eca9d159f5649f",
    "api_key_id": "f7c2d506-d703-47ca-9118-7d7b0b9bde60",
    "request_units": 2,
    "service_tier": "standard"
  }
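For anyone wiring this into an agent pipeline, here's a rough sketch (Python) of how you might gate tool execution on a response shaped like the one above. The field names (`is_safe`, `categories[].code`, `categories[].confidence`) come from the example response; the `should_block` helper and its confidence threshold are my own hypothetical policy, not part of the vendor's API or SDK:

```python
# Hypothetical gating policy built around the example response shape.
# Assumption: only block when the scanner flags the text as unsafe AND
# at least one category was detected with "high" confidence.
BLOCKING_CONFIDENCES = {"high"}

def should_block(resp: dict) -> bool:
    """Return True if the scanned text should be quarantined
    before it reaches the model or any tool-calling step."""
    if resp.get("is_safe", True):
        return False
    return any(
        cat.get("confidence") in BLOCKING_CONFIDENCES
        for cat in resp.get("categories", [])
    )

# Example response body, copied from the post above (metadata fields omitted).
example = {
    "is_safe": False,
    "categories": [
        {"code": "data_exfiltration", "confidence": "high"},
        {"code": "external_actions", "confidence": "high"},
    ],
}
print(should_block(example))  # True: unsafe, with high-confidence categories
```

You might loosen `BLOCKING_CONFIDENCES` to include medium-confidence hits for higher-risk tools; that tradeoff between false positives and missed injections is yours to tune, not something the response dictates.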


