Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This will sound snarky, so forgive me, but I honestly don't know the answer. Is this actually true? Is there a reliable source containing statistics on LLM compute usage that includes training vs inference for the whole market?
 help



I don’t understand why people don’t just use Gemini or some other AI web search to get an answer to these kinds of questions quickly (I excluded the sources, you can get them if you ask the same question).

> While AI training is often the most intense and expensive process for a single model, the majority of total AI compute usage (approximately 90%) is used for inference.

> Here is the breakdown of why this is the case: > Inference as High-Volume

> Activity: Inference occurs every time a user interacts with an AI model (e.g., asking ChatGPT a question, using image recognition, or generating code). While a model is trained once (or updated infrequently), it runs millions or billions of inferences continuously.

> Cost Scaling: Training is a massive, one-time upfront cost, while inference is an ongoing, daily operational cost. As the number of AI users grows, the demand for inference compute scales faster than the need for training new, large models.

> The Shift to Efficiency: While early AI hype focused on the immense compute needed for training, the industry has shifted toward making inference cheaper and faster through specialized hardware and techniques like optimization, quantization, and small language models (SLMs).


Gemini is not a reliable source. You posted the only part of the AI response that isn't useful in verifying whether it is true.

Sure, I guess. I asked Gemini to give me some markdown of citations and the claims made that address the question:

https://share.google/aimode/v3Y9P3rYIx1oj9VI2

And I finally figured out how to get links to answers instead of just inlining the content as before. Anyways, there it is. We live in a time where questions like "Does inference or training use more compute?" can be answered quickly by just pasting it into a search box.


The revenue numbers are public for the major AI companies. That's probably the best estimate for "inference for the whole market" we have, since most of that inference is billed in either API usage or subscriptions, and it won't include any in-house usage such as training.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: