
"Your scientists were so preoccupied with whether they could, they didn't stop to think if they should."

With Optimizing settings on, the compiler should immediately treat unused variables as errors by default.





So here are your options:

1. Syntactically require initialization, i.e. you can't write "int k;", only "int k = 0;". This is easy to do and 100% effective, but for many algorithms complying has a notable performance cost.

2. Semantically require initialization, the compiler must prove at least one write happens before every read. Rice's Theorem says we cannot have this unless we're willing to accept that some correct programs don't compile because the compiler couldn't see why they're correct. Safe Rust lives here. Fewer but still some programmers will hate this too because you're still losing perf in some cases to shut up the prover.

3. Redefine "immediately" as "Well, it should report the error at runtime". This has an even larger performance overhead in many cases, and of course in some applications there is no meaningful "report the error at runtime".
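To make option 3 concrete, here is a rough C++ sketch of what "report the error at runtime" could look like. LateInit and pick are hypothetical names made up for illustration, not a standard facility, and the check on every read is exactly the overhead being described.

```cpp
#include <optional>
#include <stdexcept>
#include <utility>

// Hypothetical wrapper (not a standard type): every read checks a flag at runtime.
template <typename T>
class LateInit {
public:
    void set(T value) { slot_ = std::move(value); }

    const T& get() const {
        if (!slot_.has_value()) {
            throw std::logic_error("read before initialization");
        }
        return *slot_;
    }

private:
    std::optional<T> slot_;  // empty until set() is called
};

// Writes k on only one path, then reads it unconditionally.
int pick(bool flag) {
    LateInit<int> k;
    if (flag) {
        k.set(42);
    }
    return k.get();  // throws std::logic_error when flag is false
}
```

Calling pick(true) returns 42, while pick(false) throws instead of reading garbage. Whether throwing is a meaningful way to "report" depends entirely on the application.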

Now, it so happens I think option (2) is almost always the right choice, but then I would say that. If you need performance then sometimes none of those options is enough, which is why unsafe Rust is allowed to call core::mem::MaybeUninit::assume_init, an unsafe function which in many cases compiles to no instructions at all, but which marks the specific moment when you're taking responsibility for claiming this is initialized, and if you're wrong about that, too fucking bad.


With optimizations, 1. and 2. can be kind of equivalent: if initialization is syntactically required (or variables are defined to be zero by default), then the compiler can elide this if it can prove that value is never read.

That, however, conflicts with unused write detection which can be quite useful (arguably more so than unused variable as it's both more general and more likely to catch issues). Though I guess you could always ignore a trivial initialisation for that purpose.
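A small C++ sketch of that tension, with a made-up function name (scale) purely for illustration: the initializer satisfies an "always initialize" rule, but it is also precisely the kind of write an unused-write check would want to complain about.

```cpp
// Option 1 style: the declaration must carry an initializer.
int scale(bool wide) {
    int k = 0;      // dead store: both branches below overwrite it before any read
    if (wide) {
        k = 16;
    } else {
        k = 4;
    }
    return k;       // an optimizer may drop the "= 0"; an unused-write lint would flag it
}
```

Ignoring trivial initializers like this one in the lint is the escape hatch suggested above.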

There isn't just a performance cost to initializing at declaration all the time. If you don't have a meaningful sentinel value (does zero mean "uninitialized" or does it mean logical zero?) then reading from the "initialized with meaningless data just to silence the lint" data is still a bug. And this bug is now somewhat tricky to detect because the sanitizers can't detect it.
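As a sketch of the sentinel problem in C++ (Config and effective_retries are hypothetical names, assumed only for this example): a plain int initialized to zero cannot say whether anyone set it, whereas std::optional keeps "unset" separate from a logical zero.

```cpp
#include <optional>

// Hypothetical config type: zero retries is a legitimate setting.
struct Config {
    int retries_zeroed = 0;       // 0 could mean "unset" or "never retry"
    std::optional<int> retries;   // empty means "unset", 0 means "never retry"
};

// Falls back to a default only when the value was never set.
int effective_retries(const Config& c, int fallback) {
    return c.retries.value_or(fallback);
}
```

With the optional field, an explicit 0 stays 0 and only a genuinely unset value picks up the fallback. The retries_zeroed field gives no way to tell the two apart, and no sanitizer will complain when it is read.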

Yes, that's an important consideration for languages like Rust or C++ which don't endorse mandatory defaults. It may even literally be impossible to "initialize with meaningless data" in these languages if the type doesn't have such "meaningless" values.

In languages like Go or Odin where "zero is default" for every type and you can't even opt out, this same problem (which I'd say is a bigger but less instantly fatal version of the Billion Dollar Mistake) occurs everywhere, at every API edge, and even in documentation, you just have to suck it up.

Which reminds me of, in a sense, another option: you can have the syntactic behaviour but write it as though you don't initialize at all even though you do, which is the behaviour C++ silently has for user-defined types. If we define a Goose type (in C++ a "class") which we stubbornly don't provide any way for our users to make themselves (e.g. we make the constructors private, or we explicitly delete the constructors), and then a user writes "Goose foo;" in their C++ program, it won't compile because the compiler isn't allowed to leave this foo variable uninitialized, but it also can't just construct it, so, too bad, this isn't a valid C++ program.
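A minimal sketch of the Goose case, assuming the "explicitly delete the constructors" variant. The honks member is invented here just to give the class some state; only the deleted default constructor matters.

```cpp
#include <type_traits>

class Goose {
public:
    Goose() = delete;                 // users cannot default-construct a Goose
    int honks() const { return honks_; }

private:
    int honks_;                       // private member, so Goose is not an aggregate
};

static_assert(!std::is_default_constructible<Goose>::value,
              "Goose has no usable default constructor");

// Goose foo;   // rejected at compile time: the default constructor is deleted
```

Uncommenting the last line makes the program ill-formed, which is the point: the declaration looks uninitialized but is really a request to run a constructor that doesn't exist.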


If you have a program that will unconditionally access uninitialized memory then the compiler can halt and emit a diagnostic. But that's rarely what is discussed in these UB conversations. Instead the compiler is encountering a program with multiple paths, some of which would encounter UB if taken. But the compiler cannot just refuse to compile this, since it is perfectly possible that the path is dead. Like, imagine this program:

    int foo(bool x, int* y) {
      if (x) return *y;
      return 0;
    } 
Dereferencing y would be UB if y is nullptr. But maybe this function is only ever called with x=false when y is nullptr. This cannot be a compile error. So instead the compiler recognizes that certain program paths are illegal and uses that information during compilation.
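The classic illustration of "uses that information" is a null check placed after a dereference. This is a hypothetical function, not something from the thread, and whether a given compiler actually removes the branch depends on the optimizer and flags.

```cpp
// Hypothetical function, not from the thread.
int first_or_minus_one(const int* y) {
    int v = *y;              // UB if y is nullptr, so the compiler may assume it is not
    if (y == nullptr) {      // under that assumption this branch is dead code
        return -1;           // and an optimizing build may remove it entirely
    }
    return v;
}
```

For a valid pointer the function behaves as written. For nullptr, the -1 the author was hoping for is not something the language promises.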

Maybe we should make that an error.

More modern languages have indeed embedded nullability into the type system and will yell at you if you dereference a nullable pointer without a check. This is good.
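For comparison, about the closest you get in C++ is a convention rather than a type-system rule: treat pointers as "nullable", references as "non-nullable", and do the check where one becomes the other. A sketch with made-up names (read_value, foo_checked); nothing here is enforced by the compiler, and returning 0 for a missing value is an assumption of this example.

```cpp
// Non-nullable by construction: a reference has to be bound to an object.
int read_value(const int& y) { return y; }

// Nullable input: the pointer is checked once, then handed on as a reference.
int foo_checked(bool x, const int* y) {
    if (x && y != nullptr) {
        return read_value(*y);
    }
    return 0;   // assumption: a missing value maps to 0, as in the x == false case
}
```

The difference from languages with real nullability is that forgetting the y != nullptr test here still compiles without complaint.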

Retrofitting this into C++ at the language level is impossible. At least without a huge change in priorities from the committee.


Maybe not the Standard, but maybe not impossible to retrofit into:

    -Werror -Wlet-me-stop-you-right-there

That's what Golang went for. There are other possibilities: D has an `= void` initializer to explicitly leave variables uninitialized. Rust requires values to be initialized before use, and if the compiler can't prove they are, it's either an error or requires an explicit MaybeUninit type wrapper.


