
We built an adapter (a dedicated CLI interface) through which the LLM interacts with the app. It's a bit like an integration-test harness, just slightly more sophisticated.

The LLM gets a prompt with the CLI commands it may use, and its "personality", and then it does what it does.
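To make the idea concrete, here's a minimal sketch of what such an adapter could look like. Everything here is hypothetical (the command names, the prompt wording, the `parse_reply` helper are all invented for illustration): the point is just that the model only sees a fixed menu of CLI commands, and its replies get validated before anything touches the app.

```python
# Hypothetical sketch of the adapter idea: the model is prompted with a
# fixed menu of CLI commands, and its replies are parsed and validated
# before execution. All names here are invented for illustration.
import shlex

# Hypothetical command menu the LLM is allowed to use.
ALLOWED_COMMANDS = {"status", "list", "send"}

SYSTEM_PROMPT = """You are the app's assistant (the "personality" goes here).
You may only respond with one of these CLI commands:
  status            - show app state
  list <topic>      - list items for a topic
  send <user> <msg> - send a message
Reply with exactly one command per turn."""

def parse_reply(reply: str):
    """Validate an LLM reply against the allowed command menu.

    Returns (command, args) on success, raises ValueError otherwise,
    so a malformed reply never reaches the real app.
    """
    parts = shlex.split(reply.strip())
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"disallowed command: {reply!r}")
    return parts[0], parts[1:]
```

The validation step is what makes it feel like an integration test: the adapter, not the model, is the last line of defense before the app.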

On the hardware side, I personally run 2x RTX 3090 cards on an AMD Threadripper 79x platform with 128 GB RAM, which yields around 12 tokens/sec for Llama 3.3 or Qwen 2.5 72B (Q5_K_M), which is okay (prompt ingestion is roughly twice that speed).

If you want to know more details, feel free to drop me a message (username at liku dot social)


