Interesting article! One thing that made me literally LOL was the fact that several exploits were enabled via a Google "style recommendation" that caused on-heap length fields to be signed and thus subject to sign-extension attacks.
The conversation-leading-up-to-that played out a bit like this in my head:
Google Engineer #1: Hey, shouldn't that length field be unsigned? Not like a negative value ever makes sense there?
GE#2: Style guide says no
GE#1: Yeah, but that could easily be exploited, right?
GE#2: Maybe, but at least I won't get dinged on code review: my metrics are already really lagging this quarter
GE#1: Good point! In fact, I'll pre-prepare an emergency patch for that whole thing, as my team lead indicated I've been a bit slow on the turnaround lately...
> The fact that unsigned arithmetic doesn't model the behavior of a simple integer, but is instead defined by the standard to model modular arithmetic (wrapping around on overflow/underflow), means that a significant class of bugs cannot be diagnosed by the compiler.
Fair enough, but signed arithmetic doesn't model the behavior of a "simple integer" (supposedly the mathematical concept) either. Instead, overflow in signed arithmetic is undefined behavior. Does that actually lead to the compiler being able to diagnose bugs? What's the claimed benefit exactly?
I feel like there should be a "sign-queer" integer. Its range is the intersection of the unsigned and signed ranges; in other words, the high bit must be unset. Wrapping around at either end is illegal and should be diagnosed in debug builds. In production builds it may either be silently treated as an unsigned integer or generate a diagnostic.
Implicitly casting to either a signed or an unsigned integer is allowed and does not generate a warning.
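A minimal C++ sketch of what such a type could look like (the name SignQueerInt and the exact policy choices are mine, purely for illustration):

#include <cassert>
#include <cstdint>

// Hypothetical "sign-queer" integer: the value must stay in the
// intersection of int32_t and uint32_t, i.e. [0, INT32_MAX].
struct SignQueerInt {
    uint32_t v;  // invariant: high bit unset

    SignQueerInt(uint32_t x) : v(x) {
        // Wrapping past either end sets the high bit: diagnosed in debug
        // builds; in release builds (NDEBUG) it silently behaves as unsigned.
        assert(x <= INT32_MAX && "sign-queer range violated");
    }

    // Implicit conversion to either signedness is safe, since every value
    // in [0, INT32_MAX] is representable as both int32_t and uint32_t.
    operator int32_t() const { return static_cast<int32_t>(v); }
    operator uint32_t() const { return v; }

    friend SignQueerInt operator+(SignQueerInt a, SignQueerInt b) {
        return SignQueerInt(a.v + b.v);  // overflow sets the high bit -> caught
    }
    friend SignQueerInt operator-(SignQueerInt a, SignQueerInt b) {
        return SignQueerInt(a.v - b.v);  // underflow wraps high -> caught
    }
};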
That's the behavior you get in Zig if you declare a variable of type `u31`: an unsigned 31-bit integer that's implicitly convertible to both `u32` and `i32`.
Also, in Zig, overflow is illegal for both signed and unsigned integers and is guaranteed to be detected when building in Safe mode (but not in Fast mode). There is a separate set of operators for wrapping arithmetic.
I believe part of the logic may be that with unsigned arithmetic you can't tell after the fact that an overflow has happened, whereas with signed arithmetic you can detect over- and underflow in certain cases simply by checking whether the result came out negative.
At least I believe Java settled on signed integers for similar reasons. But if signed overflow is indeed UB in C++, that reasoning doesn't carry over.
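A sketch of both idioms (the helper name add_checked is my own): the sign-check trick is only sound where signed arithmetic is defined to wrap, as in Java; in C or C++ the same expression is UB, and GCC/Clang provide a checked builtin instead.

#include <stdbool.h>

/* Java-style idiom: for non-negative a and b, an overflowing a + b wraps
 * to a negative value, so the sign of the sum reveals the overflow.
 * Well-defined in Java; undefined behavior in C and C++:
 *
 *     int sum = a + b;
 *     if (sum < 0) { ... handle overflow ... }
 */

/* The portable C/C++ equivalent, using a GCC/Clang builtin: */
bool add_checked(int a, int b, int *out) {
    return !__builtin_add_overflow(a, b, out); /* false on overflow */
}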
> One of the little experiments I tried was asking people about the rules for unsigned arithmetic in C. It turns out nobody understands how unsigned arithmetic in C works. There are a few obvious things that people understand, but many people don't understand it.
No, it's the opposite. UNSIGNED overflow wraps around. SIGNED overflow is undefined behavior.
This leads to fun behavior. Consider these functions which differ only in the type of the loop variable:
int foo() {
    for (int i = 1; i > 0; ++i) {}
    return 42;
}

int bar() {
    for (unsigned i = 1; i > 0; ++i) {}
    return 42;
}
If you compile these with GCC with optimization enabled, the result is:
foo():
.L2:
        jmp .L2
bar():
        mov eax, 42
        ret
That is, foo() gets compiled into an infinite loop, while the loop in bar() is eliminated entirely. Only in the first case may the compiler assume that i never overflows (signed overflow is UB), so it treats the condition i > 0 as always true. In bar(), unsigned wraparound is well defined: i eventually wraps to 0, the loop provably terminates, and the compiler optimizes it away.
A sanitizer, static analysis, or any other tool can unconditionally flag signed integer overflow as an error. The same is invalid for unsigned integers, since their wraparound is well-defined behavior and real code depends on it (hashing, bitwise magic, temporary wrapping that unwraps later, etc.).
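For example, Clang's UBSan has separate checks for the two cases (the unsigned check is opt-in precisely because unsigned wrapping is well defined and often intentional):

# Signed overflow is always a bug per the standard:
clang -fsanitize=signed-integer-overflow demo.c
# Unsigned wraparound is legal, so this check must be requested explicitly:
clang -fsanitize=unsigned-integer-overflow demo.c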
Ideally there'd be a third type for a non-wrapping unsigned integer (LLVM's IR even has a no-unsigned-wrap flag, `nuw`, on arithmetic ops that makes unsigned wrapping UB, but it largely goes unused for C/C++), but alas no such type exists. Half-relatedly, this previously came up as a discussion point in Linux kernel development (though Linus really did not like the concept of multiple unsigned types, and as such it didn't go anywhere iirc).
The signed length fields pre-date the sandbox, and at that point being able to corrupt the string length meant you already had an OOB write primitive and didn't need to get one via strings. The sandbox is the new weird thing, where now these in-sandbox corruptions can sometimes be promoted into out-of-sandbox corruptions if code on the boundary doesn't handle these sorts of edge cases.