Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I haven’t tried it yet, but Rapid MLX has a neat feature for automatic model switching. It runs a local model using Apple’s MLX framework, then “falls forward” to the cloud dynamically based on usage patterns:

> Smart Cloud Routing > > Large-context requests auto-route to a cloud LLM (GPT-5, Claude, etc.) when local prefill would be slow. Routing based on new tokens after cache hit. --cloud-model openai/gpt-5 --cloud-threshold 20000

https://github.com/raullenchai/Rapid-MLX

 help



Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: