As someone whose job is inference on the edge: no one is shipping a mass-market product that consistently does inference on the edge for $199, not today and not next year (and I'd bet not in 2026).
What do you mean by edge? There are already many apps that run LLMs and diffusion models locally, even on an M1 Mac, which means iPhones and Tensor chips aren't far behind.
This will perform inference on Phi-2 or Mistral on almost any iPhone. The first Neural Engine shipped with the A11 Bionic chip, which is in the iPhone 8. People can buy that phone used for ~$150 on Amazon.
I think that's evidence that inference on the edge is already here
You could even go up a few iPhone generations and still stay under $199. A used iPhone 12 (~$190) supports iOS 17 and can run the tiny models (Phi-2).
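A rough back-of-envelope sketch of why the tiny models fit on these phones: 4-bit quantized weights take half a byte per parameter, so Phi-2 (2.7B params) lands around 1.35 GB, within the iPhone 8's 2 GB of RAM, while a 7B model wants a phone with 4 GB or more. (Parameter counts are public figures; RAM numbers are approximate, and this ignores KV cache and runtime overhead.)

```python
# Back-of-envelope weight memory for quantized models.
# Ignores KV cache, activations, and runtime overhead.
def weight_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    """Approximate weight storage in GB for a quantized model."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

print(f"Phi-2   (2.7B) @ 4-bit: ~{weight_gb(2.7):.2f} GB")  # ~1.35 GB
print(f"Mistral (7B)   @ 4-bit: ~{weight_gb(7.0):.2f} GB")  # ~3.50 GB
# iPhone 8 has 2 GB RAM, iPhone 12 has 4 GB: a 4-bit Phi-2 fits
# on either, while 7B-class models are tight even on the newer one.
```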
For small models: today a used iPhone 14 costs ~$300, which means next year it will likely be down to ~$200.
I'm pretty sure that by 2026 we'll be able to run inference on "medium" models like Mistral/Llama 3 8B on a <$199 device. And that's not even factoring in how much better models will get by then.
That’s not what the parent comment is talking about. They're saying that no one can mass-produce these devices and sell them for $200. You don't mass-produce used iPhones to sell to people.
Your impression was spot on