Hacker News

I wonder if "normal" RDIMM ECC would be enough to mitigate most of those radiation bit-flipping issues. If so, it wouldn't really make a difference relative to Earth-based servers, since most enterprise servers use RDIMM ECC too.


You'll get bitflips elsewhere besides just in RAM. A bitflip in L1 or L3 cache will be propagated to your DIMM and no one will be the wiser.


I thought server CPUs already handled this? E.g. for Epyc https://moorinsightsstrategy.com/wp-content/uploads/2017/05/...

> Because caches hold the most recent and most relevant data to the current processing, it is critical that this data be accurate. To enable this, AMD has designed EPYC with multiple tiers of cache protection. The level 1 data cache includes SEC-DED ECC, which can detect two-bit errors and correct single-bit errors. Through parity and retry, L1 data cache tag errors and L1 instruction cache errors are automatically corrected. The L2 and L3 caches are extended even further with the ability to correct double errors and detect triple errors.
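The quoted SEC-DED behavior (correct any single-bit error, detect any double-bit error) can be illustrated with a toy extended Hamming(8,4) code. This is a sketch of the general technique only, not AMD's implementation: real cache and DRAM ECC uses much wider codewords (e.g. 64 data bits plus 8 check bits), and all names here are my own.

```python
def secded_encode(nibble):
    """Encode a 4-bit value as an 8-bit extended Hamming(8,4) codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]          # data bits d0..d3
    p1 = d[0] ^ d[1] ^ d[3]                            # parity over positions 3,5,7
    p2 = d[0] ^ d[2] ^ d[3]                            # parity over positions 3,6,7
    p3 = d[1] ^ d[2] ^ d[3]                            # parity over positions 5,6,7
    bits = [p1, p2, d[0], p3, d[1], d[2], d[3]]        # codeword positions 1..7
    overall = 0
    for b in bits:                                     # extra overall-parity bit
        overall ^= b                                   # turns SEC into SEC-DED
    return bits + [overall]

def secded_decode(bits):
    """Return (status, nibble); status is 'ok', 'corrected', or 'double-error'."""
    syndrome = 0
    for pos in range(1, 8):                            # XOR of error positions
        if bits[pos - 1]:
            syndrome ^= pos
    overall = 0
    for b in bits:
        overall ^= b
    if syndrome == 0 and overall == 0:
        status = 'ok'
    elif overall == 1:                                 # odd flips => single error
        if syndrome:                                   # syndrome points at the bit
            bits = bits[:]
            bits[syndrome - 1] ^= 1                    # correct it in place
        status = 'corrected'
    else:                                              # even flips, syndrome != 0
        return 'double-error', None                    # detected but uncorrectable
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return status, nibble
```

Flipping one bit of a codeword decodes back to the original value with status 'corrected'; flipping two bits yields 'double-error', which in hardware would typically raise a machine-check rather than return corrupted data.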


Sun Microsystems famously had this problem with servers using the UltraSPARC II, whose cache SRAM lacked ECC. Later versions of the processor added it.


Those already have ECC.


What about the registers?


What about the ALU/FPU/TPU itself?




