As someone whose job is inference on the edge: no one is shipping a mass-market product that consistently does inference on the edge for $199, not today and not next year (and I'd bet not in 2026).
What do you mean by edge? There are already many apps that run LLMs and diffusion models locally, even on an M1 Mac, which means iPhones and Tensor chips aren't far behind.
This will perform inference on Phi-2 or Mistral on almost any iPhone. The first Neural Engine shipped with the A11 Bionic chip, which is in the iPhone 8. People can buy that phone used for ~$150 on Amazon.
I think that's evidence that inference on the edge is already here
You could even go up a few iPhone generations and still stay under $199. A used iPhone 12 (~$190) supports iOS 17 and can run the tiny models (Phi-2).
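A rough back-of-envelope sketch of why the tiny models fit on these phones: 4-bit quantized weights take half a byte per parameter, so Phi-2 (2.7B params) lands around 1.35 GB, within the iPhone 8's 2 GB of RAM, while a 7B model wants a phone with 4 GB or more. (Parameter counts are public figures; RAM numbers are approximate, and this ignores KV cache and runtime overhead.)

```python
# Back-of-envelope weight memory for quantized models.
# Ignores KV cache, activations, and runtime overhead.
def weight_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    """Approximate weight storage in GB for a quantized model."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

print(f"Phi-2   (2.7B) @ 4-bit: ~{weight_gb(2.7):.2f} GB")  # ~1.35 GB
print(f"Mistral (7B)   @ 4-bit: ~{weight_gb(7.0):.2f} GB")  # ~3.50 GB
# iPhone 8 has 2 GB RAM, iPhone 12 has 4 GB: a 4-bit Phi-2 fits
# on either, while 7B-class models are tight even on the newer one.
```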
For small models: today a used iPhone 14 costs ~$300, which means next year it will likely be down to ~$200.
I'm pretty sure that by 2026 we'll be able to run inference on "medium" models like Mistral/Llama 3 8B on a <$199 device. And that's not even factoring in how much better models will get by then.
That’s not what the parent comment is talking about. They're saying that no one can mass-produce these devices and sell them for $200. You don't mass-produce used iPhones to sell to people.
Your impression was spot on