
I just thought about this, so bear in mind that I don't know much about the technical implications, but:

Couldn't we train a very good model by distributing the dataset along with the computing power using something similar to folding@home?



The limit for this sort of exercise is "holding everything in memory", because training a neural network requires updating the weights frequently. An NVIDIA A100 has a memory bandwidth of about 2 TB/sec; your home ADSL offers something on the order of 10 Mbit/sec. And then there's latency.
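To put that gap in perspective, here's a back-of-the-envelope calculation (illustrative numbers only, not a benchmark; the 7B-parameter model size is an assumed example, not from the thread):

```python
# Rough comparison: A100 HBM bandwidth vs. a typical ADSL link.
a100_bw_bytes_per_s = 2e12          # ~2 TB/s HBM2e memory bandwidth
adsl_bw_bytes_per_s = 10e6 / 8      # 10 Mbit/s link = 1.25 MB/s

ratio = a100_bw_bytes_per_s / adsl_bw_bytes_per_s
print(f"memory bus is ~{ratio:,.0f}x faster than the ADSL link")
# → memory bus is ~1,600,000x faster than the ADSL link

# Shipping one full copy of 7B fp16 weights (~14 GB) over that link:
weights_bytes = 7e9 * 2
seconds = weights_bytes / adsl_bw_bytes_per_s
print(f"one weight sync would take ~{seconds / 3600:.1f} hours")
# → one weight sync would take ~3.1 hours
```

A single weight synchronization taking hours, when a GPU cluster does the equivalent many times per second, is why the naive folding@home approach breaks down.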

Mind you, theoretically that is a limitation of our current network architectures. If we could conceive of a learning approach that was localised, to the point of being "embarrassingly parallel", perhaps it could work. It would probably be less efficient, but if it is sufficiently parallel to compensate for Amdahl's law, who knows?
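A quick sketch of the Amdahl's-law trade-off mentioned above (the fractions are made-up illustrative values):

```python
# Amdahl's law: speedup with n workers when a fraction p of the work
# is parallelizable. A per-step efficiency loss can be tolerated if
# almost all of the workload parallelizes cleanly.
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

print(amdahl_speedup(0.95, 1000))   # ~19.6x: the serial 5% dominates
print(amdahl_speedup(0.999, 1000))  # ~500x: nearly embarrassingly parallel
```

With 1000 volunteer nodes, a 5% serial fraction caps you at ~20x speedup; you'd need the training to be almost entirely local to each node before the swarm pays off.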

Less theoretically, one could imagine that we use the same approach that we use in systems engineering in general: functional decomposition. Instead of having one Huge Model To Rule Them All, train separate models that each perform a specific, modular function, and then integrate them.

In a sense this is already happening. Stable Diffusion has one model (img2depth) that estimates which parts of a picture are far away from the lens. It has another model to upscale low-res images to high-res, etc. This is also how the brain works.

But it is difficult to see how this sort of approach could be broken down into very small-scale, low-context tasks the way folding@home is.


The network communication overhead would be way too high to make this useful. At least for current methods of training large models.


You would likely be limited by the communication latency between nodes, unless you come up with some unique model architecture or training method. Most of these large scale models are trained on GPUs using very high speed interconnects.


The term for this is federated learning. Usually it’s used to preserve privacy since a user’s data can stay on their device. I think it ends up not being efficient for the model sizes used here.
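For anyone curious what federated learning looks like mechanically, here is a minimal federated-averaging (FedAvg-style) sketch in pure Python. The toy one-parameter model and the data are invented for illustration; real systems average full weight tensors over many devices:

```python
# FedAvg sketch: each "client" trains locally on its own private data,
# and only the resulting weights are averaged on a server. Raw data
# never leaves the client. The model is a toy linear fit y = w*x.

def local_step(w, data, lr=0.1):
    # One gradient step on squared loss, using only this client's data.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def fedavg_round(w_global, client_datasets):
    # Clients start from the shared global weight; the server averages
    # the locally updated weights into the next global weight.
    local_ws = [local_step(w_global, d) for d in client_datasets]
    return sum(local_ws) / len(local_ws)

clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]  # both fit w = 2
w = 0.0
for _ in range(50):
    w = fedavg_round(w, clients)
print(w)  # converges toward 2.0
```

The privacy property comes for free here, but every round still ships a full copy of the weights both ways, which is exactly the bandwidth problem discussed above at large model sizes.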


i think eventually someone will do it


stop trying to build skynet


Can't stop something that's already finished.

- Skynet



