This is very relevant for compiler optimization and video codecs as well - both involve testing changes on lots of small benchmarks. Almost any change will hurt some benchmarks and help others. A change might improve one benchmark a lot while causing a small slowdown on many others. The overall improvement from any one change can be minuscule, less than one percent. And yet if you keep making these small improvements they add up, hopefully to the point of improving every benchmark over time.
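
To put rough numbers on that, here is a small sketch. Everything in it is made up for illustration: the speedup ratios, the count of 50 changes, and the choice of geometric mean as the overall score (a common way to aggregate benchmark ratios, but not the only one).

```python
import math

def geometric_mean(ratios):
    """Geometric mean of per-benchmark speedup ratios (above 1.0 means faster)."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical change: one benchmark gets 10% faster, nine get 0.5% slower.
one_change = [1.10] + [0.995] * 9
overall = geometric_mean(one_change)

# Hypothetical: land 50 changes that each have the same small overall gain.
# Speedup ratios multiply, so the gains compound.
compounded = overall ** 50

print(f"one change: {(overall - 1) * 100:.2f}% overall")
print(f"50 changes: {(compounded - 1) * 100:.1f}% overall")
```

With these invented numbers a single change comes out at well under one percent overall, while fifty of them compound into a gain in the tens of percent. The sketch only tracks the aggregate, so it says nothing about whether each individual benchmark ends up ahead.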