That would be easier but is not required. Compilers these days need no hints to unroll loops or hoist invariants, even though doing either incorrectly could change the result. It would take some complicated analysis, but I think it could be done safely in some cases.
No? It depends on your definition of inner loop I guess.
If you're doing some sort of zero-copy IO, the time to clear the buffer can be non-trivial (not huge, but non-trivial). It's true that you need a large enough buffer that syscall/FFI overhead doesn't dominate, but that's not unrealistic.
It's rare that we care about this, that's true; that's why Rust has generally been fine with "just zero the buffers". There are definitely domains that care, though.
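To make the cost concrete, here's a minimal sketch of where the zeroing lives with today's stable API (the fill helper and the 1 MiB size are just illustration):

use std::io::Read;

fn fill(reader: &mut impl Read) -> std::io::Result<Vec<u8>> {
    // vec![0u8; n] must produce n already-zeroed bytes, because
    // Read::read takes &mut [u8] and that memory has to be initialized
    // before the call. That zeroing is the cost in question.
    let mut buf = vec![0u8; 1 << 20];
    let n = reader.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}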
In some languages, like Java, Go, and probably JavaScript, this is probably true, depending on how much memory needs to be initialized. But in Rust, FFI isn't any more expensive than any other non-inlined function call.
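To illustrate: binding read(2) directly (the declaration mirrors the POSIX signature; read_fd is a hypothetical wrapper), the call on the Rust side lowers to an ordinary call instruction, same as any non-inlined Rust function:

extern "C" {
    fn read(fd: i32, buf: *mut core::ffi::c_void, count: usize) -> isize;
}

fn read_fd(fd: i32, buf: &mut [u8]) -> isize {
    // SAFETY: buf is valid for writes of buf.len() bytes for the
    // duration of the call.
    unsafe { read(fd, buf.as_mut_ptr().cast(), buf.len()) }
}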
Loop unrolling and invariant hoisting are static transformations. What the "read" function does semantically isn't captured by those today, and the compiler can't automatically infer it; it would have to be told that information, and there would need to be unsafe annotations for things like syscalls and FFI boundaries. The other approach is to change the API, which is what BorrowedBuf is.
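Roughly, the BorrowedBuf approach looks like this; a sketch assuming the current nightly-only API (feature names may have shifted):

#![feature(read_buf, core_io_borrowed_buf)] // nightly-only as of this writing

use std::io::{BorrowedBuf, Read};
use std::mem::MaybeUninit;

fn read_to_uninit(reader: &mut impl Read) -> std::io::Result<usize> {
    // Storage starts uninitialized; nothing gets zeroed here.
    let mut storage = [MaybeUninit::<u8>::uninit(); 4096];
    let mut buf = BorrowedBuf::from(&mut storage[..]);
    // The cursor only lets read_buf append, so the type system tracks
    // which bytes became initialized instead of the optimizer having
    // to prove it.
    reader.read_buf(buf.unfilled())?;
    Ok(buf.filled().len())
}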
If you can think of a different approach by which the compiler could automatically figure out what memory has been initialized by an arbitrary function call, I'm all ears.
That's what I glossed over as "complicated analysis". In my mind, if a compiler can understand register and stack use (required for static transformations), it can (theoretically, and with some effort) understand heap use. Am I wrong?
Yes, you are wrong. This isn’t basic constant hoisting.
The compiler doesn't have any reasonable way to know what read is filling in, because that information exists purely at runtime, and the compiler has no reasoning mechanism that comes anywhere close to answering runtime data-flow questions.
There's also all sorts of complexity around which transformations are even possible: the legality information that exists at the language level is often erased before it reaches the stack/register level, and vice versa, the language layer knows nothing about registers and very little about the stack.
This is the same reason that the compiler, given something like:
for _ in 1..10 {
    let x: String = create_new_string();
    eprintln!("{x}");
}
fails to hoist x out of the loop even if the returned string is always String::from("ABC"), unless maybe LTO is on (and even then, maybe not). Basically, the compiler's "magic" is limited to static transformations that follow the as-if rule: the compiler must know the transformation is blindly identical, and its ability to reason about the structure is often very limited.
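To get the hoisted version you have to write it yourself, at which point the compiler only needs to check that it type-checks, not prove the two forms equivalent:

let x: String = create_new_string(); // same hypothetical helper as above
for _ in 1..10 {
    eprintln!("{x}");
}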
Said another way, if the compiler could do the optimizations you're hypothesizing, it would be equivalent to applying a mid-level performance engineer to every codebase it encounters.
The problem is that this kind of analysis would require whole-program optimization rather than optimizing individual functions, because the signature of read (sketched below) doesn't give the compiler enough information to know whether it ever reads the data. And indeed, it can't know until link time.
I think it's theoretically possible, but at the cost of much longer compile times, and greater complexity in the compiler.
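To make the signature point concrete, here is the relevant part of std::io::Read (the trait is real; the comments are my gloss):

pub trait Read {
    // This signature is all a call site sees. Nothing here says whether
    // an implementation reads from `buf` before overwriting it, or
    // whether it really filled the n bytes claimed by Ok(n), so the
    // caller can't soundly pass uninitialized memory.
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize>;
    // ...
}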