Precisely: if we are talking about low latency, why would we let communications go from the application (in user-land) to the kernel, then through the network stack, then be received by the kernel on another node, and only then be delivered to the application in user-land? As a first guess, I would imagine that bypassing the Linux kernel on both ends and accessing the remote hardware directly is mandatory for the lowest latency.
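To get a rough feel for what the kernel path costs, here is a minimal sketch of my own (not taken from the system discussed here). It ping-pongs one byte over a local socket pair, so every message crosses the user-land/kernel boundary on both send and receive, but no real network or second node is involved. The function name `kernel_round_trip_us` and its parameters are made up for the illustration, and the number it returns will vary a lot between machines.

```python
import socket
import time


def kernel_round_trip_us(iterations=1000, payload=b"x"):
    """Average round-trip time, in microseconds, of a ping-pong over a
    local socket pair. Every send/recv is a system call into the kernel."""
    a, b = socket.socketpair()
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            a.sendall(payload)      # user-land -> kernel
            b.recv(len(payload))    # kernel -> user-land on the "other side"
            b.sendall(payload)      # and the same path back
            a.recv(len(payload))
        elapsed = time.perf_counter() - start
    finally:
        a.close()
        b.close()
    return elapsed / iterations * 1e6


if __name__ == "__main__":
    print(f"average round trip: {kernel_round_trip_us():.1f} us")
```

This only measures the local system-call overhead, so it is a lower bound on what a kernel-mediated exchange between two nodes would cost, not a measurement of any real interconnect.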
If there is some information online about the software stack/architecture of the entire system, I would like to read up on it. I haven't explored all the links I posted above yet.
I'm nowhere near an expert, and HPC is a really specific use case, but there are surely interesting bits to learn from it.