Show HN: A Personal AI Bot Trained on My Blog, Wiki, IRC, and More

Tiberium · on March 13, 2023

The source repo seems to be unavailable (private?), so I wanted to ask: is this a language model fine-tuned on your data, or are you using OpenAI's APIs, with your own data stored as embeddings that are searched against the user's query?

zerojames · on March 13, 2023

Ah, I haven't open sourced it yet! I need to remove that link for now. I am going the Embeddings route. My approach is (roughly) documented here: https://indieweb.org/OpenAI

Embeddings are stored in Faiss. The ~25 nearest neighbours are fed to GPT with a prompt written to ensure sources are cited (although how well this works varies; more work will be needed).
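
A rough sketch of that retrieve-then-prompt flow, not the actual code: a brute-force cosine search in plain Python stands in for the Faiss index, and the toy vectors, `DOCS`, `nearest`, `build_prompt`, the example URLs, and the prompt wording are all made up for illustration. The resulting string is what would be sent to the GPT API; that call is left out.

```python
import math

# Hypothetical toy corpus: in practice each "vec" would be an embedding
# returned by an embeddings API and stored in a Faiss index.
DOCS = [
    {"url": "https://example.com/blog/coffee", "text": "I brew coffee with an Aeropress.", "vec": [1.0, 0.0, 0.0]},
    {"url": "https://example.com/wiki/irc", "text": "I hang out on IRC most evenings.", "vec": [0.0, 1.0, 0.0]},
    {"url": "https://example.com/blog/espresso", "text": "Espresso is my weekend treat.", "vec": [0.9, 0.1, 0.0]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query_vec, docs, k=25):
    """Return the k documents most similar to query_vec (brute force,
    standing in for a Faiss nearest-neighbour search)."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]


def build_prompt(question, neighbours):
    """Assemble a prompt that numbers each source and asks for citations.
    The wording is an assumption, not the prompt used by the bot."""
    sources = "\n".join(
        f"[{i}] {d['text']} (source: {d['url']})"
        for i, d in enumerate(neighbours, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite the source number for every claim.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Pretend this is the embedding of the user's question.
query_vec = [1.0, 0.0, 0.0]
top = nearest(query_vec, DOCS, k=2)
prompt = build_prompt("How do you make coffee?", top)
print(prompt)
```

With k=25 and a real index, the only parts that change are where the vectors come from and how the search is done; the prompt assembly stays the same.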

zerojames · on March 13, 2023

When there are open-source models that I can fine-tune, I'd love to make a version with one of them!