
The source repo seems to be unavailable (private?), so I wanted to ask: is this a language model fine-tuned on your data, or are you using OpenAI's APIs with all of your own data stored as embeddings that are searched for the user's query?


Ah, I haven't open sourced it yet! I need to remove that link for now. I am going the Embeddings route. My approach is (roughly) documented here: https://indieweb.org/OpenAI

Embeddings are stored in Faiss. The ~25 nearest neighbours to the query are fed to GPT with a prompt written to ensure sources are cited (although how reliably it cites them varies; more work will be needed).
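
For anyone curious what that flow looks like, here is a rough sketch of the retrieve-then-prompt step, not my actual code. To keep it runnable with only the standard library, it swaps in a brute-force cosine search where Faiss would sit, uses made-up 3-dimensional vectors instead of real embeddings, and stops at building the prompt string rather than calling GPT. The names nearest, build_prompt and DOCS, the example.com URLs, and the prompt wording are all assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query_vec, docs, k=25):
    """Return the k documents most similar to query_vec.

    docs is a list of (url, text, vector) tuples. This linear scan is a
    stand-in for a Faiss index lookup (assumption, not the real setup).
    """
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[2]), reverse=True)
    return ranked[:k]


def build_prompt(question, hits):
    """Number each retrieved passage so the model can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] {url}: {text}" for i, (url, text, _vec) in enumerate(hits, 1)
    )
    # Hypothetical prompt wording; the real prompt is not shown in this thread.
    return (
        "Answer using only the sources below. Cite each claim as [n].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )


# Toy corpus with hand-written vectors (placeholders, not real embeddings).
DOCS = [
    ("https://example.com/a", "first toy passage", [1.0, 0.0, 0.0]),
    ("https://example.com/b", "second toy passage", [0.0, 1.0, 0.0]),
    ("https://example.com/c", "third toy passage", [0.9, 0.1, 0.0]),
]

if __name__ == "__main__":
    hits = nearest([1.0, 0.0, 0.0], DOCS, k=2)
    print(build_prompt("What does the first passage say?", hits))
```

In a real setup the vectors would come from an embeddings model and the search from the Faiss index, with k around 25; numbering the sources in the prompt is one simple way to give the model something concrete to cite.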


When there are open-source models that I can fine-tune, I'd love to make a version with one of them!



