> If you ask Claude to fix a test that accidentally says assert(1 + 1 === 3), it'll say "this is clearly a typo" and just rewrite the test.
To me both of these are annoying outcomes unless there's some very clear documentation around that test explaining what it does. Ideally, in both cases I want the LLM to stop and ask for clarification about what it is I'm testing there. I don't trust LLMs enough to just let them loose yet; I use them more like a pair programmer who's never going to get annoyed with my bullshit. (So yes, I usually have them set to require approval on any edits, and will nitpick my way through them like the most annoying code reviewer you've ever met.)