When your computer was anemic and could barely do the tasks required of it, eking out a few percent (or even 2x!) from an optimizer mattered.
Nowadays, the difference between "big compiler, optimized" and "little compiler, unoptimized" can be quite dramatic, but it is probably no more than about 4x, which is roughly the gap between a "systems programming language" and a highly tuned JITted scripting language. I think most people are perfectly fine with the performance of highly tuned scripting languages. The result is that all the overhead of the "big compiler" is just that: overhead. This is especially true for extremely well-tuned code, where the right algorithm, and as a last resort hand-written assembly, will easily beat the best optimizer by an order of magnitude or more.
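To make that last point concrete, here's a rough C sketch (my own example, with made-up function names): an optimizer might shave a constant factor off the linear scan, but picking a better algorithm changes the asymptotics, which no code generator will do for you.

    #include <stdlib.h>

    /* Linear scan: -O2 might make this a few times faster, but it stays O(n). */
    int contains_linear(const int *v, size_t n, int key) {
        for (size_t i = 0; i < n; ++i)
            if (v[i] == key) return 1;
        return 0;
    }

    static int cmp_int(const void *a, const void *b) {
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }

    /* Better algorithm on sorted data: O(log n), a win no optimizer can match. */
    int contains_sorted(const int *sorted_v, size_t n, int key) {
        return bsearch(&key, sorted_v, n, sizeof *sorted_v, cmp_int) != NULL;
    }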
>just a simple case of bad code generation render little compiler into a toy one
If you find some time to go through the gcc bugzilla, you'll find shockingly simple snippets of code that get miscompiled (often by optimization passes), with fixes never backported to the older versions that production environments like RHEL are still using.
I realized that on all the RHEL systems I'm using, we never use the default toolchains. We just use those old systems to run stuff, including newer toolchains.
I think a production-grade compiler not only can, but must, leave performance on the table when the cost is correctness (unless the performance gain is enormous and the correctness loss is minimal). Correctness is not all-important, but it is the most important thing. Unfortunately, compiler writers do not agree, and they do silly things like "let's assume UB can never happen and optimize based on that".
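For anyone who hasn't seen this in action, here's the classic shape of it (my own minimal example): signed overflow is UB in C, so the optimizer is allowed to assume it never happens and fold the check away.

    #include <limits.h>
    #include <stdio.h>

    /* Signed overflow is UB, so a compiler may treat `x + 1 > x` as always
       true and delete the check entirely (gcc and clang do this at -O2). */
    int will_not_overflow(int x) {
        return x + 1 > x;
    }

    int main(void) {
        /* May print 1 even though INT_MAX + 1 overflows. */
        printf("%d\n", will_not_overflow(INT_MAX));
        return 0;
    }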
This is about optimizations affecting the timing of cryptographic code, not the correctness of the computation; I think the argument for calling this a correctness bug in the compiler is quite weak.
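For context, the worry is usually code like this (my sketch, not the actual code under discussion): it's written to take the same time no matter where the inputs differ, and an optimizer that rewrites it into an early-exit loop produces the same answer but leaks timing.

    #include <stddef.h>
    #include <stdint.h>

    /* Constant-time comparison: accumulate all differences instead of
       branching on secret data. An optimizer could legally turn this into
       an early-exit loop; the result is unchanged, only the timing is. */
    int ct_equal(const uint8_t *a, const uint8_t *b, size_t n) {
        uint8_t diff = 0;
        for (size_t i = 0; i < n; ++i)
            diff |= (uint8_t)(a[i] ^ b[i]);
        return diff == 0;
    }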
I do not agree in the general case. There are very useful DSL compilers which do not consider performance at all, but just compile to a target which does the optimization for them (the JVM, LLVM IR, or even just C).
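As a toy illustration of that pattern (entirely made up, not a real DSL): the "compiler" just prints C and lets gcc or clang do every optimization.

    #include <stdio.h>

    /* Emit a fixed-size dot-product kernel as C source; pipe the output
       into `gcc -O2` and let the backend do all the optimization work. */
    static void emit_dot_product(FILE *out, int n) {
        fprintf(out, "double dot(const double *a, const double *b) {\n");
        fprintf(out, "    double s = 0.0;\n");
        fprintf(out, "    for (int i = 0; i < %d; ++i) s += a[i] * b[i];\n", n);
        fprintf(out, "    return s;\n}\n");
    }

    int main(void) {
        emit_dot_product(stdout, 1024);
        return 0;
    }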
if you aren't running on the gpu you're leaving 80+% of your computer's performance on the table. no optimizing compiler is going to make your legacy c or lisp or rust code run efficiently on the gpu, or even in most cases on a multicore cpu. nor, as thechao points out, can it compete with assembly-language programmers for simd vectorization on the cpu
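to be fair to the hand-vectorizers, here's the kind of thing they mean (my example, not thechao's): without -ffast-math the compiler isn't allowed to reorder the float additions, so the scalar loop stays scalar, while the person writing intrinsics just accepts the reassociation.

    #include <immintrin.h>
    #include <stddef.h>

    /* hand-vectorized float sum with avx intrinsics (compile with -mavx).
       the scalar equivalent usually isn't auto-vectorized without
       -ffast-math, because reordering float adds changes rounding. */
    float sum_avx(const float *a, size_t n) {
        __m256 acc = _mm256_setzero_ps();
        size_t i = 0;
        for (; i + 8 <= n; i += 8)
            acc = _mm256_add_ps(acc, _mm256_loadu_ps(a + i));
        float lanes[8], s = 0.0f;
        _mm256_storeu_ps(lanes, acc);
        for (int k = 0; k < 8; ++k) s += lanes[k];
        for (; i < n; ++i) s += a[i];   /* scalar tail */
        return s;
    }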
in summary, optimizing compilers for c or pascal or zig or rust or whatever can only be used for code where considerations like compatibility, ease of programming, security, and predictability are more important than performance
probably the vast majority of production code is already in python and javascript, which don't even have reasonable compilers at all