My problem with the OpenAI models (GPT-5.2 in particular) recently is an extreme aversion to doing more than the smallest step in a task before asking for user input. Even if I explicitly instruct it to continue without input until the task is complete, it ignores the instruction.
I cannot imagine GPT-5.2 working on a task for more than 2 minutes, let alone 4 hours. I'm curious if you've run into this and figured out a way around it?
I've not had that problem at all with GPT-5.2 running in Codex CLI.
I use prompts like this:
Build a pure JavaScript library (no dependencies) for encoding and
decoding this binary format. Start by looking at how the lite3-python
library works - the JavaScript one should have the same API and probably the
same code design too. Build the JS one in lite3-javascript - it should be a
single JavaScript module which works in both Node.js and in the browser.
There should be a test script that runs with Node.js which runs against the
files in the lite3-python/format_suite folder. Write the test script first,
run it and watch it fail, then build the JavaScript library and keep running
the tests until they pass.
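For illustration, here's a minimal sketch of the kind of test script that prompt asks for. The lite3.js module name, its encode()/decode() exports, the round-trip check, and the sibling directory layout are all assumptions on my part - the real API is meant to come from studying lite3-python, which isn't shown here:

    // test.js - run with: node test.js
    // Assumes lite3-javascript and lite3-python are sibling directories,
    // and that lite3.js exports encode() and decode() (both hypothetical).
    const fs = require('fs');
    const path = require('path');
    const lite3 = require('./lite3.js');

    const suiteDir = path.join(__dirname, '..', 'lite3-python', 'format_suite');
    let passed = 0, failed = 0;

    for (const name of fs.readdirSync(suiteDir)) {
      const original = fs.readFileSync(path.join(suiteDir, name));
      try {
        // Round-trip: decoding then re-encoding should reproduce the bytes.
        const reencoded = lite3.encode(lite3.decode(original));
        if (Buffer.compare(Buffer.from(reencoded), original) === 0) {
          passed++;
        } else {
          failed++;
          console.error(`FAIL ${name}: round-trip mismatch`);
        }
      } catch (err) {
        failed++;
        console.error(`FAIL ${name}: ${err.message}`);
      }
    }

    console.log(`${passed} passed, ${failed} failed`);
    process.exit(failed === 0 ? 0 : 1);

Run before the library exists, every case fails (which is the point of writing the tests first); the model then iterates on lite3.js until the suite is green.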
I find that surprising. GPT-5.2 is the model I've had working the longest: it frequently works for more than 4 hours nonstop, while earlier models would stop every 10 minutes to ask if they should continue. 5.1 and earlier ignore me if I ask them to continue until a task is done, but 5.2 will usually finish it.