
To nitpick the article a bit:

> It's important to note that this algorithm is very much of its time. Back when Quake 3 was released in 1999, computing an inverse square root was a slow, expensive process. The game had to compute hundreds or thousands of them per second in order to solve lighting equations, and other 3D vector calculations that rely on normalization. These days, on modern hardware, not only would a calculation like this not take place on the CPU, even if it did, it would be fast due to much more advanced dedicated floating point hardware.

Calculations like this definitely take place on the CPU all the time. It's a common misconception that games and other FLOP-heavy apps want to offload all floating-point operations to the GPU. In fact, it really only makes sense to offload large uniform workloads to the GPU. If you're doing one-off vector normalization--say, as part of the rotation matrix construction needed to make one object face another--then you're going to want to stay on the CPU, because the CPU is faster at that. In fact, the CPU would remain faster at single floating point operations even if you didn't take the GPU transfer time into account--the GPU typically runs at a slower clock rate and relies on parallelism to achieve its high FLOP count.
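For reference, the routine under discussion is the well-known Q_rsqrt from the released Quake 3 source. A sketch of it is below, with the original pointer type-punning replaced by memcpy to avoid undefined behavior in modern C (the magic constant and the single Newton-Raphson step are as released; the result is accurate to within roughly 0.2%):

```c
#include <stdint.h>
#include <string.h>

/* Fast approximate 1/sqrt(x), after the Quake 3 Q_rsqrt routine. */
float Q_rsqrt(float number)
{
    float x2 = number * 0.5f;
    float y  = number;
    uint32_t i;

    memcpy(&i, &y, sizeof(i));     /* reinterpret the float's bits as an integer */
    i = 0x5f3759df - (i >> 1);     /* magic-constant initial guess */
    memcpy(&y, &i, sizeof(y));     /* back to float */

    y = y * (1.5f - (x2 * y * y)); /* one Newton-Raphson refinement step */
    return y;
}
```

The trick works because a float's bit pattern, read as an integer, approximates a scaled-and-offset log2 of the value, so the shift-and-subtract yields a cheap first guess that one Newton step tightens up.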



I think the author is referring to the FPU, not the GPU.

In the old days, the FPU did its computation asynchronously, as a separate coprocessor.

The FPU is now considered an integrated part of the CPU.



