That would be easier but is not required. Compilers these days need no hints to unroll loops or hoist invariants, even though doing either incorrectly could change the result. It would take some complicated analysis, but I think it could be done safely in some cases.
No? It depends on your definition of inner loop I guess.
If you're doing some sort of zero-copy IO, the time to clear the buffer can be non-trivial (not huge, but non-trivial). It's true that you need a large enough buffer that syscall/FFI overhead doesn't dominate, but that's not unrealistic.
It's rare that we care about this, that's true; that's why Rust has generally been fine with "just zero the buffers". There are definitely domains that care, though.
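To make the cost concrete, here's a minimal sketch of where the zeroing lives with today's stable API (the fill helper and the 1 MiB size are just illustration):

use std::io::Read;

fn fill(reader: &mut impl Read) -> std::io::Result<Vec<u8>> {
    // vec![0u8; n] must produce n already-zeroed bytes, because
    // Read::read takes &mut [u8] and that memory has to be initialized
    // before the call. That zeroing is the cost in question.
    let mut buf = vec![0u8; 1 << 20];
    let n = reader.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}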
In some languages, like Java, Go, and probably JavaScript, this is probably true, depending on how much memory needs to be initialized. But in Rust, FFI isn't any more expensive than any other non-inlined function call.
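To illustrate: binding read(2) directly (the declaration mirrors the POSIX signature; read_fd is a hypothetical wrapper), the call on the Rust side lowers to an ordinary call instruction, same as any non-inlined Rust function:

extern "C" {
    fn read(fd: i32, buf: *mut core::ffi::c_void, count: usize) -> isize;
}

fn read_fd(fd: i32, buf: &mut [u8]) -> isize {
    // SAFETY: buf is valid for writes of buf.len() bytes for the
    // duration of the call.
    unsafe { read(fd, buf.as_mut_ptr().cast(), buf.len()) }
}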
Loop unrolling and invariant hoisting are static transformations. What the "read" function does semantically isn't captured by those today, and the compiler can't automatically infer it; it would have to be told that information, and there would need to be unsafe annotations for things like syscalls and FFI boundaries. The other approach is to change the API, which is what BorrowedBuf is.
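Roughly, the BorrowedBuf approach looks like this; a sketch assuming the current nightly-only API (feature names may have shifted):

#![feature(read_buf, core_io_borrowed_buf)] // nightly-only as of this writing

use std::io::{BorrowedBuf, Read};
use std::mem::MaybeUninit;

fn read_to_uninit(reader: &mut impl Read) -> std::io::Result<usize> {
    // Storage starts uninitialized; nothing gets zeroed here.
    let mut storage = [MaybeUninit::<u8>::uninit(); 4096];
    let mut buf = BorrowedBuf::from(&mut storage[..]);
    // The cursor only lets read_buf append, so the type system tracks
    // which bytes became initialized instead of the optimizer having
    // to prove it.
    reader.read_buf(buf.unfilled())?;
    Ok(buf.filled().len())
}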
If you can think of a different approach by which the compiler could automatically figure out what memory has been initialized by an arbitrary function call, I'm all ears.
That's what I glossed over as "complicated analysis". In my mind, if a compiler can understand register and stack use (required for static transformations), it can (theoretically, and with some effort) understand heap use. Am I wrong?
Yes, you are wrong. This isn’t basic constant hoisting.
The compiler doesn't have any reasonable way to know what read is filling in, because that information exists purely at runtime, and the compiler has no reasoning mechanism that comes anywhere close to answering runtime data-flow questions.
There's also all sorts of complexity around which transformations are even possible: the legality information that exists at the language level is often erased before it reaches the stack/register level, and vice versa, the language layer knows nothing about registers and very little about the stack.
This is the same reason that the compiler, given something like:
for _ in 1..10 {
    let x: String = create_new_string();
    eprintln!("{x}");
}
fails to hoist x out of the loop even if the returned string is always String::from("ABC"), unless maybe LTO is on (and even then, maybe not). Basically, the compiler's "magic" is limited to static transformations that follow the as-if rule: the compiler must know the transformation is blindly identical, and its ability to reason about the structure is often very limited.
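To get the hoisted version you have to write it yourself, at which point the compiler only needs to check that it type-checks, not prove the two forms equivalent:

let x: String = create_new_string(); // same hypothetical helper as above
for _ in 1..10 {
    eprintln!("{x}");
}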
Said another way, if the compiler could do the optimizations you're hypothesizing, it would be equivalent to applying a mid-level performance engineer to every codebase it encounters.
The problem is that this kind of analysis would require whole-program optimization rather than optimizing individual functions, because the signature of read (sketched below) doesn't give the compiler enough information to know whether it ever reads the data. And indeed, it can't know until link time.
I think it's theoretically possible, but at the cost of much longer compile times, and greater complexity in the compiler.
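To make the signature point concrete, here is the relevant part of std::io::Read (the trait is real; the comments are my gloss):

pub trait Read {
    // This signature is all a call site sees. Nothing here says whether
    // an implementation reads from `buf` before overwriting it, or
    // whether it really filled the n bytes claimed by Ok(n), so the
    // caller can't soundly pass uninitialized memory.
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize>;
    // ...
}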