I'm a bit skeptical either example is representative of "most" existing software...

112233 · 2025-12-31T15:18:41 1767194321

Thank you for putting it in a much more correct and understandable language than I could. That is exactly what I am talking about: if you call __builtin_malloc (e.g. via macro definition in the libc header), compiler is free to do whatever it wants. However, calling "malloc" library function should call "malloc" library function, and anything else is unacceptable and a bug. There should be no case where compiler could assume anything about a function it does not see based simply on its name. Neither malloc nor strlen.

aw1621107 · 2025-12-31T17:56:56 1767203816

> That is exactly what I am talking about: if you call __builtin_malloc (e.g. via macro definition in the libc header), compiler is free to do whatever it wants. However, calling "malloc" library function should call "malloc" library function, and anything else is unacceptable and a bug.

I think that's an overly narrow reading of the footnote. I don't see an obvious reason why "such names" in the footnote should only cover "some macro names beginning with an underscore" and not also "external identifiers". And if implementations are allowed to define special semantics for "external identifiers", then... well, that's exactly what they did!

In addition, there's still the as-if rule. The semantics of malloc/free are defined by the C standard; if the compiler can deduce that there is no observable difference between a version of the program that calls those and a version that does not, why does it matter whether the call is emitted? A function call in and of itself is not a side effect, and since the C standard dictates what malloc/free do, the compiler knows their possible side effects.
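A minimal sketch of the as-if argument, with a made-up helper name (`sum_via_heap` is hypothetical, not from any library). The result never depends on whether the heap was touched, so an optimizing compiler may reduce the whole body to `return a + b;` without emitting either call. Whether a given compiler actually does so depends on the compiler and flags.

```c
#include <stdlib.h>

/* Hypothetical helper: stores the sum in a heap-allocated int,
 * reads it back, and frees it. Nothing observable depends on the
 * allocation, so the malloc/free pair is a candidate for elision
 * under the as-if rule. */
int sum_via_heap(int a, int b)
{
    int *tmp = malloc(sizeof *tmp);
    if (tmp == NULL)
        return a + b;          /* same result if allocation fails */

    *tmp = a + b;              /* store through the heap pointer */
    int result = *tmp;         /* read it back */
    free(tmp);
    return result;
}
```

Comparing the generated assembly with and without optimization is one way to see whether the calls survive.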

Furthermore, the addition of memset_explicit and its footnote ("The intention is that the memory store is always performed (i.e. never elided), regardless of optimizations. This is in contrast to calls to the memset function (7.26.6.1)") implies that eliding calls is in fact acceptable behavior when optimizations are enabled. If eliding calls were not permissible when optimizing then what's the point of memset_explicit?
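A rough illustration of the case memset_explicit is aimed at, written without memset_explicit itself since not every libc ships it yet. Both function names are hypothetical. The plain memset writes to a buffer that is never read again, so it is a dead store the optimizer may drop. The volatile loop is a common pre-C23 workaround that mainstream compilers do not elide in practice; treat that as an assumption about current compilers, not a guarantee.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero a buffer through a pointer-to-volatile
 * so that every store is a volatile access. */
void wipe_volatile(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}

/* Hypothetical example: sums the bytes of a secret, then tries to
 * clear the local copy before returning. */
unsigned checksum_and_clear(const char *secret)
{
    char copy[32] = {0};
    strncpy(copy, secret, sizeof copy - 1);   /* keeps a trailing NUL */

    unsigned sum = 0;
    for (size_t i = 0; copy[i] != '\0'; i++)
        sum += (unsigned char)copy[i];

    memset(copy, 0, sizeof copy);      /* dead store: may be elided */
    wipe_volatile(copy, sizeof copy);  /* volatile stores: kept in practice */
    return sum;
}
```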

> There should be no case where compiler could assume anything about a function it does not see based simply on its name.

Again, external identifiers defined by the C standard are reserved. Reserved external identifiers aren't just for show. From the C89 standard:

> If the program defines an external identifier with the same name as a reserved external identifier, even in a semantically equivalent form, the behavior is undefined.

And from C23:

> If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), the behavior is undefined.

This means that yes, under modern compilers' interpretation of UB, compilers can assume things about functions based on their names, because modern compilers generally optimize assuming UB does not happen. The compiler does not need to see the function's implementation because it is the function's implementation as far as it is concerned.
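As a small sketch of what that looks like in practice (the wrapper name `greeting_length` is hypothetical): because strlen is a reserved external identifier with standard-defined semantics, a compiler may evaluate this call at compile time and emit the constant, with no call to the library at all. The observable result is the same either way.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: the argument is a string literal, so the
 * compiler knows the answer from the standard's definition of
 * strlen and may fold the call to a constant. */
size_t greeting_length(void)
{
    return strlen("hello");
}
```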

112233 · 2025-12-31T19:34:29 1767209669

Ah yes, N2625 "What we think we reserve". Basically any C program containing variable or function "top", "END", "strict", "member" and so on is non-conforming and subject to undefined behaviour, so they define "potentially reserved" identifiers and as usual compiler vendors go and do the sane right thing.

aw1621107 · 2025-12-31T21:23:46 1767216226

That paper isn't relevant here. From the paper (emphasis added):

> 7.1.3 Reserved Identifiers

> [snip]

> Macro names and identifiers with external linkage that are specified in the C standard library clauses.

> This proposal does not propose any changes to these reserved identifiers.

Furthermore, that paper doesn't make the use of reserved external identifiers not UB, so there's no change there either.