Trying to get GPT to generate comments at a particular level really highlights its limitations in my experience. For instance, I couldn't get it to focus its comments on the programming-language aspects of the code (or only in a crude way). There's some depth it's lacking; maybe it comes from RLHF, I don't know, but its commenting is like its writing.
Have you tried getting it to write a high-level description before reproducing the code with comments (via either few-shot examples or instructions)? Most of the reasoning ability in LLMs comes from the model rambling about something first, with the attention then picking up on that rambling when it needs to generate the conclusion. If you skip that step, the output will probably be much less coherent.
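To make the two-pass idea concrete, here's a minimal sketch: first ask for a prose summary, then feed that summary back in when asking for the commented code, so the model can attend to its own earlier "rambling". The `call_llm` stub and both prompt-builder functions are hypothetical names, and the messages follow the common chat-completion `role`/`content` shape; swap in whatever client and wording you actually use.

```python
# Two-pass prompting sketch: summarize first, then comment with the
# summary in context. call_llm is a placeholder, not a real client.

def call_llm(messages):
    # Placeholder: substitute a real chat-completion call here.
    raise NotImplementedError

def build_summary_prompt(source: str) -> list[dict]:
    # Pass 1: ask for a high-level description only, no code.
    return [
        {"role": "system", "content": "You are a careful code reviewer."},
        {"role": "user", "content": (
            "Describe, at a high level, what this code does and how its "
            "parts fit together. Do not reproduce the code.\n\n" + source)},
    ]

def build_comment_prompt(source: str, summary: str) -> list[dict]:
    # Pass 2: include the pass-1 summary so the model can draw on it
    # while writing comments at the level you actually want.
    return [
        {"role": "system", "content": "You are a careful code reviewer."},
        {"role": "user", "content": (
            "High-level summary of the code:\n" + summary +
            "\n\nNow reproduce the code below with comments that focus on "
            "the programming-language constructs in use.\n\n" + source)},
    ]
```

The point of the split is that the conclusion (the comments) is generated after, and conditioned on, the explicit reasoning (the summary), rather than both being produced in one shot.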
We played around with this for a bit actually. One idea we had was to generate a PlantUML diagram to show how the different components of a file, or even a repository, connect with one another. However, given GPT's current context-window limits, even with GPT-4, this quickly became impractical for large files. We would need a model with a much larger context length.
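The context problem shows up as soon as you try to size a file against the window. As a rough sketch (assuming ~4 characters per token as a crude stand-in for a real tokenizer, and 8192 tokens as the budget), splitting a source file into window-sized chunks looks something like:

```python
# Rough sketch of fitting a large file into a fixed context budget by
# splitting it line-by-line into chunks. The ~4 chars/token estimate is
# a crude heuristic, not a real tokenizer.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude heuristic

def chunk_source(source: str, budget_tokens: int = 8192) -> list[str]:
    chunks, current, current_tokens = [], [], 0
    for line in source.splitlines(keepends=True):
        line_tokens = estimate_tokens(line)
        # Flush the current chunk if adding this line would exceed the budget.
        if current and current_tokens + line_tokens > budget_tokens:
            chunks.append("".join(current))
            current, current_tokens = [], 0
        current.append(line)
        current_tokens += line_tokens
    if current:
        chunks.append("".join(current))
    return chunks
```

The catch, of course, is that a diagram of how components connect needs cross-chunk information, which is exactly what chunking throws away; that's what pushed us toward wanting a larger window in the first place.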
That said, perhaps if the entire repository is embedded into a vector database, a high-level overview would be possible? Just thinking aloud right now, and am happy to collaborate with anyone interested in exploring this further!
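The retrieval side of that idea can be sketched without any real database: embed each repository chunk, then pull back only the nearest chunks for a given question and hand those to the model. Everything below is a toy stand-in; in particular the bag-of-words "embedding" substitutes for a real embedding model, and the function names are mine.

```python
# Toy retrieval sketch: rank repository chunks by cosine similarity to a
# query. The bag-of-words embed() is a stand-in for a real embedding model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: lowercase word counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_chunks(query: str, chunks: list[str], k: int = 3) -> list[str]:
    # Return the k chunks most similar to the query.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

With a real embedding model in place of `embed`, the high-level overview would come from summarizing only the retrieved chunks, which sidesteps the context limit at the cost of possibly missing connections between chunks that were never retrieved together.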