
> That doesn’t mean they should.

Nobody’s stopping you from using non-optimising compilers, regardless of the strawmen you assert.





As if treating uninitialized reads as opaque somehow precludes all optimizations?

There’s a million more sensible things that the compiler could do here besides the hilariously bad codegen you see in the grandparent and sibling comments.

All I’ve heard amounts to “but it’s allowed by the spec.” I’m not arguing against that. I’m saying a spec that incentivizes this nonsense is poorly designed.


Why is the codegen bad? What result do you want? Do you specifically want whatever value happened to be on the stack, as opposed to a value the compiler picked?

> As if treating uninitialized reads as opaque somehow precludes all optimizations?

That's not what these words mean.

> There’s a million more sensible things

Again, if you don't like compilers leveraging UB, use a non-optimizing compiler.

> All I’ve heard amounts to “but it’s allowed by the spec.” I’m not arguing against that.

You literally are, though. Your statements so far have all been variations of, or nonsensical assertions around, "why can't I read from uninitialised memory when the spec says I can't do that".

> I’m saying a spec that incentivizes this nonsense is poorly designed.

Then... don't use languages that are specified that way? It's really not that hard.


From the LLVM docs [0]:

> Undef values aren't exactly constants ... they can appear to have different bit patterns at each use.

My claim is simple and narrow: compilers should internally model such values as unspecified, not actively choose convenient constants.
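
To make that concrete, here is a minimal C sketch of the quoted semantics (hypothetical function, but consistent with the docs above):

  int f(void) {
    int x;          /* never initialized: reads of x are undef    */
    return x ^ x;   /* naively always 0, but each use of an undef */
                    /* may see a different bit pattern, so this   */
                    /* may legitimately fold to any value at all  */
  }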

The comment I replied to cited an example where an undef is constant folded into the value required for a conditional to be true. Can you point to any case where that produces a real optimization benefit, as opposed to being a degenerate interaction between UB and value propagation passes?

And to be explicit: “if you don’t like it, don’t use it” is just refusing to engage, not a constructive response to this critique. These semantics aren't set in stone.

[0] https://llvm.org/doxygen/classllvm_1_1UndefValue.html#detail...


> My claim is simple and narrow: compilers should internally model such values as unspecified, not actively choose convenient constants.

An assertion for which you have provided no utility or justification.

> The comment I replied to cited an example where an undef is constant folded into the value required for a conditional to be true.

The comment you replied to did in fact not do that, and it's incredible that you misread it that way.

> Can you point to any case where that produces a real optimization benefit, as opposed to being a degenerate interaction between UB and value propagation passes?

The original snippet literally folds a branch and two stores into a single store, saving CPU resources and generating tighter code.
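
For reference, a sketch of the shape of that fold (a hypothetical reconstruction, not the exact snippet from upthread):

  struct S { int a, b; };

  struct S f(int x) {
    struct S s;          /* both fields start out undef       */
    if (x) s.a = 13;     /* s.b left undef on this path       */
    else   s.b = 37;     /* s.a left undef on this path       */
    return s;
  }

Because each path leaves one field undef, the compiler may pick the other path's constant for it, which removes the branch and lets the two stores be merged into a single store of the constant pair {13, 37}.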

> this critique

Critique is not what you have engaged in at any point.


Sorry, my earlier comments were somewhat vague and assumed we were on the same page about a few things. Let me be concrete.

The snippet is, after lowering:

  if (x)
    return { a = 13, b = undef }
  else
    return { a = undef, b = 37 }

LLVM represents this as two phi nodes, one per field:

  a = phi [13, then], [undef, else]
  b = phi [undef, then], [37, else]
Since undef isn’t “unknown”, it’s “pick any value you like, per use”, InstCombine is allowed to instantiate each undef to whatever makes the expression simplest. This is the problem.

  a = 13
  b = 37

The branch is eliminated, but only because LLVM assumes that those undefs will take specific arbitrary values chosen for convenience (fewer instructions).

Yes, the spec permits this. But at that point the program has already violated the language contract by executing undefined behavior. The read is accidental by definition: the program makes no claim about the value. Treating that absence of meaning as permission to invent specific values is a semantic choice, and precisely what I am criticizing. This "optimization" is not a win unless you willfully ignore everything about the program except its instruction count.

As for utility and justification: it’s all about user experience. A good language and compiler should preserve a clear mental model between what the programmer wrote and what runs. Silent non-local behavior changes (such as the one in the article) destroy that. Bugs should fail loudly and early, not be “optimized” away.
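
As a hypothetical sketch of that failure mode (invented names, not the example from the article):

  int sensor_ok(void);             /* hypothetical helpers         */
  void fire(void);

  void step(void) {
    int armed;                     /* bug: never initialized       */
    if (sensor_ok()) armed = 1;
    if (armed) fire();             /* the undef read may be folded */
                                   /* to 1, so the guard silently  */
                                   /* vanishes and fire() becomes  */
                                   /* unconditional                */
  }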

Imagine if the spec treated type mismatches the same way. Oops, assigned a float to an int, now it’s undef. Let’s just assume it’s always 42 since that lets us eliminate a branch. That’s obviously absurd, and this is the same category of mistake.



