I'm certain that all providers are playing with quantization, the KV cache, and other model parameters to keep up with demand. One of the biggest advantages of running a local model is that you get predictable behavior.
I've tried to solve the problem of the agent running wild and found two solutions: the first is mounting the workspace folder inside a WASM sandbox to scope any potential damage; the second is running rquickjs with all APIs and module imports disabled, so the agent has to call a host function that checks permissions before accessing any files.
I would love it if you took the time to instruct Claude to re-implement inference in C/C++ and put an MIT license on it. It would be huge, but only if it actually works.
Not necessarily faster, but definitely easier. There are plenty of stories of proficient abacus-using accountants being faster than those using calculators. Those days are gone now, though, because a calculator is just so much easier to pick up.
They have been doomed for a while; it's just a matter of time. But honestly, I like them better than the Claude provider. If they can make OpenAI profitable, that would be good for all of us. We don't want a world where Gemini is the only winner or the Chinese labs take over.