Both threads are waiting on the same memory pipeline. The actual performance has...

kevin_thibedeau · on Dec 22, 2022

If one thread has to hit main memory but the other can get what it needs from L1 cache, they aren't competing for the same resources.

slashdev · on Dec 22, 2022

But not the same memory operations. Unless your code fully saturates the memory bandwidth, which is rare, you get some gains here.

AdrianB1 · on Dec 22, 2022

It is almost never about saturating memory bandwidth; it is about waiting for instructions and data to load from memory. That wait, counted in clock cycles, is huge.

slashdev · on Dec 22, 2022

Yes, that’s precisely why hyperthreading is such a good deal.

hinkley · on Dec 23, 2022

“Such” a good deal is 2% to 30% in benchmarks (and they don’t say if that’s with cache leak protections turned on or off). In previous generations it was more like -20% to +20%. If one thread is having issues with L1 and L2 cache, splitting those caches evenly with a completely different workload isn’t going to help.

If cache contention weren’t a problem, and it was just a matter of jumping into previously unseen instructions and data (cold cache), you’d expect to see 50-300% numbers from hyperthreading, precisely because of how long the stalls are.
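To put rough numbers on that expectation, here is a minimal sketch of a contention-free toy model. It is an assumption for illustration, not a measurement: each thread alternates between `compute_cycles` of work and `stall_cycles` of waiting on memory, and sibling threads fill each other's stalls perfectly with no shared-cache penalty. The function name `ideal_smt_gain` and its parameters are hypothetical.

```python
def ideal_smt_gain(compute_cycles, stall_cycles, hw_threads=2):
    """Percent throughput gain from SMT in an idealized, contention-free model.

    Assumes (hypothetically) that every hardware thread has the same
    compute/stall pattern and that stalls overlap perfectly with other
    threads' compute, with no cache or execution-port contention.
    """
    if compute_cycles <= 0 or stall_cycles < 0:
        raise ValueError("compute_cycles must be > 0 and stall_cycles >= 0")
    # Fraction of time a single thread keeps the core busy.
    utilization = compute_cycles / (compute_cycles + stall_cycles)
    # The core cannot exceed 100% busy, and cannot run more than
    # hw_threads streams, so the speedup is capped by both limits.
    speedup = min(hw_threads, 1.0 / utilization)
    return (speedup - 1.0) * 100.0


if __name__ == "__main__":
    for compute, stall in [(80, 20), (50, 50), (20, 80)]:
        for threads in (2, 4):
            gain = ideal_smt_gain(compute, stall, hw_threads=threads)
            print(f"compute={compute} stall={stall} threads={threads}: "
                  f"+{gain:.0f}%")
```

In this model the gain is capped at (hw_threads - 1) × 100%, so two-way SMT tops out at +100% and the upper end of the range would need four hardware threads per core. Measured gains that fall far below these figures are consistent with contention for shared resources, which the model deliberately ignores.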