
IIRC the SWE-bench dataset gives you the full repo snapshot + the issue text. The evaluation pipelines typically run some kind of retriever (e.g. grep, BM25) to pick a subset of files to place in the model's context. The provided context is usually limited to ~50k tokens.
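
For illustration, here is a rough sketch of what a retrieve-then-truncate step like that could look like. This is not the actual SWE-bench pipeline: the function names (`tokenize`, `bm25_rank`, `select_context`), the regex tokenizer, the BM25 parameters, and the "one identifier = one token" cost estimate are all assumptions made up for the example. A real setup would count tokens with the model's own tokenizer.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase identifier-like words only.
    return re.findall(r"[a-z_][a-z0-9_]*", text.lower())


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return paths of files that match the query, best BM25 score first.

    `docs` maps file path -> file contents. Files scoring 0 are dropped.
    """
    tokenized = {path: tokenize(body) for path, body in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized.values()) / max(n_docs, 1)

    # Document frequency: in how many files does each term appear?
    df = Counter()
    for toks in tokenized.values():
        df.update(set(toks))

    query_terms = set(tokenize(query))
    scores = {}
    for path, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[path] = score
    return sorted(scores, key=scores.get, reverse=True)


def select_context(issue_text, docs, budget=50_000):
    """Greedily take the top-ranked files that still fit in the token budget."""
    chosen, used = [], 0
    for path in bm25_rank(issue_text, docs):
        cost = len(tokenize(docs[path]))  # crude stand-in for a real token count
        if used + cost > budget:
            continue  # too big for the remaining budget, try the next file
        chosen.append(path)
        used += cost
    return chosen


if __name__ == "__main__":
    # Toy "repo snapshot" and issue text, invented for the example.
    repo = {
        "dates.py": "def parse_date(value): return value",
        "html.py": "def render_html(page): return page",
        "util.py": "import os",
    }
    issue = "parse_date crashes on empty value"
    print(select_context(issue, repo, budget=50_000))
```

The greedy fill is the simplest choice; a pipeline could just as well cut off at the first file that doesn't fit, or truncate files instead of skipping them.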

