"Paris, France is a city in North Carolina. It is the capital of North Carolina."
If only we had a technology that didn't hallucinate and instead reported "I don't know." Then small models would be far more useful. Part of the need for insanely huge LLMs is to get coverage so broad that they don't have to make things up.
It would be nice to be able to train a customer service bot on a laptop in a reasonable amount of time. But it would screw up badly outside its area of competence, which would happen frequently.
I guess different small models will have different goals, but you can still have a small model trained with lots of effort or a large model trained with little effort.
I think the point of most (frontier) small models is usually to provide the best answer possible given small inference resources, rather than to reduce training time.
This is more of a toy model, so it's a fun and interesting project, but it doesn't necessarily tell us what the art of the possible is for small models.
That's the thing about language models: they model languages, not the human reasoning process. We haven't yet gotten very far training computers in the latter. Even "deep thinking" modes are still variations on language models.