> All memory must be statically allocated at startup. But why? If you do that yo...

matklad · 2025-12-29T19:20:10 1767036010

See https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TI... for motivation.

- Operational predictability --- latencies stay put, the risk of threshing is reduced (_other_ applications on the box can still misbehave, but you are probably using a dedicated box for a key database)

- Forcing function to avoid use-after-free. Zig doesn't have a borrow checker, so you need something else in its place. Static allocation is a large part of TigerBeetle's something else.

- Forcing function to ensure existence of application-level limits. This is tricky to explain, but static allocation is a _consequence_ of everything else being limited. And having everything limited helps ensure smooth operations when the load approaches deployment limit.

- Code simplification. Surprisingly, static allocation is just easier than dynamic. It has the same "anti-soup-of-pointers" property as Rust's borrow checker.

pron · 2025-12-29T20:10:43 1767039043

> Forcing function to avoid use-after-free

Doesn't reusing memory effectively allow for use-after-free, only at the progam level (even with a borrow checker)?

muvlon · 2025-12-29T22:07:24 1767046044

Yes, kind of. In the same sense that Vec<T> in Rust with reused indexes allows it.

Notice that this kind of use-after-free is a ton more benign though. This milder version upholds type-safety and what happens can be reasoned about in terms of the semantics of the source language. Classic use-after-free is simply UB in the source language and leaves you with machine semantics, usually allowing attackers to reach arbitrary code execution in one way or another.

pron · 2025-12-29T23:15:25 1767050125

That what happens can be reasoned about in the semantics of the source language as opposed to being UB doesn't necessarily make the problem "a ton more benign". After all, a program written in Assembly has no UB and all of its behaviours can be reasoned about in the source language, but I'd hardly trust Assembly programs to be more secure than C programs [1]. What makes the difference isn't that it's UB but, as you pointed out, the type safety. But while the less deterministic nature of a "malloc-level" UAF does make it more "explosive", it can also make it harder to exploit reliably. It's hard to compare the danger of a less likely RCE with a more likely data leak.

On the other hand, the more empirical, though qualitative, claim made by by matklad in the sibling comment may have something to it.

[1]: In fact, take any C program with UB, compile it, and get a dangerous executable. Now disassemble the executable, and you get an equally dangerous program, yet it doesn't have any UB. UB is problematic, of course, partly because at least in C and C++ it can be hard to spot, but it doesn't, in itself, necessarily make a bug more dangerous. If you look at MITRE's top 25 most dangerous software weaknesses, the top four (in the 2025 list) aren't related to UB in any language (by the way, UAF is #7).

matklad · 2025-12-30T01:47:11 1767059231

>If you look at MITRE's top 25 most dangerous software weaknesses, the top four (in the 2025 list) aren't related to UB in any language (by the way, UAF is #7).

FWIW, I don't find this argument logically sound, in context. This is data aggregated across programming languages, so it could simultaneously be true that, conditioned on using memory unsafe language, you should worry mostly about UB, while, at the same time, UB doesn't matter much in the grand scheme of things, because hardly anyone is using memory-unsafe programming languages.

There were reports from Apple, Google, Microsoft and Mozilla about vulnerabilities in browsers/OS (so, C++ stuff), and I think there UB hovered at between 50% and 80% of all security issues?

And the present discussion does seem overall conditioned on using a manually-memory-managed language :0)

pron · 2025-12-30T04:08:34 1767067714

You're right. My point was that there isn't necessarily a connection between UB-ness and danger, and stuck together two separate arguments:

1. In the context of languages that can have OOB and/or UAF, OOB/UAF are very dangerous, but not necessarily because they're UB; they're dangerous because they cause memory corruption. I expect that OOB/UAF are just as dangerous in Assembly, even though they're not UB in Assembly. Conversely, other C/C++ UBs, like signed overflow, aren't nearly as dangerous.

2. Separately from that, I wanted to point out that there are plenty of super-dangerous weaknesses that aren't UB in any language. So some UBs are more dangerous than others and some are less dangerous than non-UB problems. You're right, though, that if more software were written with the possibility of OOB/UAF (whether they're UB or not in the particular language) they would be higher on the list, so the fact that other issues are higher now is not relevant to my point.

kibwen · 2025-12-30T04:16:51 1767068211

> In fact, take any C program with UB, compile it, and get a dangerous executable. Now disassemble the executable, and you get an equally dangerous program, yet it doesn't have any UB.

I'd put it like this:

Undefined behavior is a property of an abstract machine. When you write any high-level language with an optimizing compiler, you're writing code against that abstract machine.

The goal of an optimizing compiler for a high-level language is to be "semantics-preserving", such that whatever eventual assembly code that gets spit out at the end of the process guarantees certain behaviors about the runtime behavior of the program.

When you write high-level code that exhibits UB for a given abstract machine, what happens is that the compiler can no longer guarantee that the resulting assembly code is semantics-preserving.

uecker · 2025-12-30T08:12:32 1767082352

Since it has UB it is easy for the compiler to guarantee that the resulting code is semantics-preserving: Anything the code does is OK.

matklad · 2025-12-29T21:05:15 1767042315

There's some reshuffling of bugs for sure, but, from my experience, there's also a very noticeable reduction! It seems there's no law of conservation of bugs.

I would say the main effect here is that global allocator often leads to ad-hoc, "shotgun" resource management all other the place, and that's hard to get right in a manually memory managed language. Most Zig code that deals with allocators has resource management bugs (including TigerBeetle's own code at times! Shoutout to https://github.com/radarroark/xit as the only code base I've seen so far where finding such bug wasn't trivial). E.g., in OP, memory is leaked on allocation failures.

But if you manage resources manually, you just can't do that, you are forced to centralize the codepaths that deal with resource acquisition and release, and that drastically reduces the amount of bug prone code. You _could_ apply the same philosophy to allocating code, but static allocation _forces_ you to do that.

The secondary effect is that you tend to just more explicitly think about resources, and more proactively assert application-level invariants. A good example here would be compaction code, which juggles a bunch of blocks, and each block's lifetime is tracked both externally:

* https://github.com/tigerbeetle/tigerbeetle/blob/0baa07d3bee7...

and internally:

* https://github.com/tigerbeetle/tigerbeetle/blob/0baa07d3bee7...

with a bunch of assertions all other the place to triple check that each block is accounted for and is where it is expected to be

https://github.com/tigerbeetle/tigerbeetle/blob/0baa07d3bee7...

I see a weak connection with proofs here. When you are coding with static resources, you generally have to make informal "proofs" that you actually have the resource you are planning to use, and these proofs are materialized as a web of interlocking asserts, and the web works only when it is correct in whole. With global allocation, you can always materialize fresh resources out of thin air, so nothing forces you to do such web-of-proofs.

To more explicitly set the context here: the fact that this works for TigerBeetle of course doesn't mean that this generalizes, _but_, given that we had a disproportionate amount of bugs in small amount of gpa-using code we have, makes me think that there's something more here than just TB's house style.

pron · 2025-12-29T23:18:17 1767050297

That's an interesting observation. BTW, I've noticed that when I write in Assembly I tend to have fewer bugs than when I write in C++ (and they tend to be easier to find). That's partly because I'm more careful, but also because I only write much shorter and simpler things in Assembly.

nickmonad · 2025-12-30T15:27:59 1767108479

Hey matklad! Thanks for hanging out here and commenting on the post. I was hoping you guys would see this and give some feedback based on your work in TigerBeetle.

You mentioned, "E.g., in OP, memory is leaked on allocation failures." - Can you clarify a bit more about what you mean there?

matklad · 2025-12-30T16:47:21 1767113241

In

    const recv_buffers = try ByteArrayPool.init(gpa, config.connections_max, recv_size);
    const send_buffers = try ByteArrayPool.init(gpa, config.connections_max, send_size);

if the second try throws, than the memory allocation created by the first try is leaked. Possible fixes:

A) clean up individual allocations on failure:

    const recv_buffers = try ByteArrayPool.init(gpa, config.connections_max, recv_size);
    errdefer recv_buffers.deinit(gpa);

    const send_buffers = try ByteArrayPool.init(gpa, config.connections_max, send_size);
    errdefer send_buffers.deinit(gpa);

B) ask the caller to pass in an arena instead of gpa to do bulk cleanup (types & code stays the same, but naming & contract changes):

    const recv_buffers = try ByteArrayPool.init(arena, config.connections_max, recv_size);
    const send_buffers = try ByteArrayPool.init(arena, config.connections_max, send_size);

C) declare OOMs to be fatal errors

    const recv_buffers = ByteArrayPool.init(gpa, config.connections_max, recv_size) catch |err| oom(err);
    const send_buffers = ByteArrayPool.init(gpa, config.connections_max, send_size) catch |err| oom(err);

    fn oom(_: error.OutOfMemory) noreturn { @panic("oom"); }

You might also be interesting in https://matklad.github.io/2025/12/23/static-allocation-compi..., it's essentially a complimentary article to what @MatthiasPortzel says here https://news.ycombinator.com/item?id=46423691

nickmonad · 2025-12-30T17:13:31 1767114811

Gotcha. Thanks for clarifying! I guess I wasn't super concerned about the 'try' failing here since this code is squarely in the initialization path, and I want the OOM to bubble up to main() and crash. Although to be fair, 1. Not a great experience to be given a stack trace, could definitely have a nice message there. And 2. If the ConnectionPool init() is (re)used elsewhere outside this overall initialization path, we could run into that leak.

The allocation failure that could occur at runtime, post-init, would be here: https://github.com/nickmonad/kv/blob/53e953da752c7f49221c9c4... - and the OOM error kicks back an immediate close on the connection to the client.

AnimalMuppet · 2025-12-29T17:23:11 1767028991

1. On modern OSes, you probably aren't "taking it away from other processes" until you actually use it. Statically allocated but untouched memory is probably just an entry in a page table somewhere.

2. Speed improvement? No. The improvement is in your ability to reason about memory usage, and about time usage. Dynamic allocations add a very much non-deterministic amount of time to whatever you're doing.

jeffreygoesto · 2025-12-29T18:18:41 1767032321

Using this as well in embddded. The whole point is to commit and lock the pages after allocation, to not experience what you correctly describe. You want to have a single checkpoint after which you simply can stop worrying about oom.

Ericson2314 · 2025-12-29T17:30:39 1767029439

If you use it and stop using it, the OS cannot reclaim the pages, because it doesn't know that you've stopped. At best, it can offload the memory to disk, but this waste disk space, and also time for pointless writes.

norir · 2025-12-29T17:55:02 1767030902

This is true, whether it matters is context dependent. In an embedded program, this may be irrelevant since your program is the only thing running so there is no resource contention or need to swap. In multi-tenant, you could use arenas in an identical way as single static allocation and release the arena upon completion. I agree that allocating a huge amount of memory for a long running program on a multi-tenant os is a bad idea in general, but it could be ok if for example you are running a single application like a database on the server in which you are back to embedded programming only the embedding is a database on a beefy general purpose computer.

Ericson2314 · 2025-12-29T18:56:42 1767034602

Yes it is context dependent, but the parent comment was acting as if it was just better. I wanted to correct that.

jrmg · 2025-12-29T19:20:02 1767036002

In response to (1) - you’re right, but that also implies that the added safety from static allocation when running on a modern OS is just an illusion: the OS may be unable to supply a fresh page from your ‘statically allocated’ memory when you actually write to it and it has to be backed by something real. The real stuff may have run out.

kibwen · 2025-12-29T18:17:05 1767032225

> On modern OSes, you probably aren't "taking it away from other processes" until you actually use it.

But if you're assuming that overcommit is what will save you from wasting memory in this way, then that sabotages the whole idea of using this scheme in order to avoid potential allocation errors.

pastage · 2025-12-29T21:46:09 1767044769

Use mlock as long as it is allocated it is going to be rather deterministic, of course you might be running in a VM on an over commited host. I guess you can "prefault" in a busy loop instead of only on startup, waste memory and cpu!