This is not always true. Compilers are quite good at register allocation, but sometimes they get it wrong, and sometimes you can make small changes to code that improve register allocation and thus performance.
Usually the problem is an unfortunately placed spill, so the operation is actually L1D$ traffic, but still.
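A minimal sketch (hypothetical, any optimising C compiler) of one flavour of this: if the compiler can't prove a store target doesn't alias the data being read, a running total is forced to live in memory rather than a register, and a tiny source change fixes it:

    void sum_in_memory(long *out, const long *a, int n) {
        *out = 0;
        for (int i = 0; i < n; i++)
            *out += a[i];   /* *out may alias a[i]: reload and store every iteration */
    }

    void sum_in_register(long *out, const long *a, int n) {
        long acc = 0;       /* local temporary: eligible to live in a register */
        for (int i = 0; i < n; i++)
            acc += a[i];
        *out = acc;         /* written back exactly once */
    }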
Unless you reserve on fork, you're still overcommitting, because after the fork a write to basically any page in either process will trigger memory commitment.
Thread stacks come up because reserving them completely ahead of time would incur large amounts of memory usage. Typically they start small and grow when you touch the guards. This is a form of overcommit. Even Windows dynamically grows stacks like this.
> Thread stacks come up because reserving them completely ahead of time would incur large amounts of memory usage. Typically they start small and grow when you touch the guards. This is a form of overcommit.
Ahead-of-time memory reservation entails a page entry being allocated in the process's page table («logical» allocation), and the page «sits» dormant until it is accessed and causes a memory access fault – that is the moment when the physical allocation takes place. So copying reserved but not-yet-accessed pages has zero effect on the physical memory consumption of the process.
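A minimal sketch of that lifecycle (assuming Linux; MAP_NORESERVE and the 1 GiB size are just for illustration):

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void) {
        size_t len = 1UL << 30;  /* reserve 1 GiB of address space */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* The process now "has" 1 GiB of virtual memory, but its resident
           set has not grown: no physical page has been allocated yet. */

        memset(p, 1, 16 * 4096);  /* first writes fault in just these 16 pages */

        /* Only ~64 KiB is physically backed now; the rest sits dormant until
           touched. Verify with: grep VmRSS /proc/self/status */
        return 0;
    }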
What actually happens to the thread stacks depends on the actual number of active threads. In modern designs, threads are consumed from thread pools that implement some sort of run queue where threads sit idle until they get assigned a unit of work. So if a thread is idle, it does not use its own thread stack and, consequently, there is no side effect on the child's COW address space.
Granted, if the child was copied with a large number of active threads, the impact will be very different.
> Even windows dynamically grows stacks like this
Windows employs a distinct process/thread design, making the UNIX concept of a process foreign to it. Threads are the primary construct in Windows, and the kernel is highly optimised for thread management rather than process management. Cygwin has extensively documented the significant challenges of supporting fork(2) semantics on Windows. However, I am veering off-topic.
I am aware that reserving excess memory doesn't commit said memory. But it does reserve memory, which is what we were talking about. The point was that because you can have a lot of threads, and restricting reserved stacks to some small value is annoying, all systems overcommit stacks. Windows initially commits some memory (reserving space in the page file/RAM) for each thread but will dynamically commit more when you touch the guard page. This is overcommit. Linux does similarly.
Idle threads do increase the amount of committed stack. Once a thread's stack grows it stays grown; it's not common to unmap the end and shrink the stack. In a system without overcommit these stacks will contribute to the total reserved physical/swap in the child, though of course the pages will be CoW.
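A minimal sketch (assuming Linux and glibc; compile with -pthread) of the reserve-vs-commit behaviour being discussed: the thread below reserves an 8 MiB stack, but RSS only grows as the recursion actually dirties the pages:

    #include <pthread.h>
    #include <stdio.h>

    /* VmRSS from /proc/self/status: resident (committed-and-touched) memory, in KiB. */
    static long rss_kib(void) {
        FILE *f = fopen("/proc/self/status", "r");
        char line[256];
        long kib = -1;
        while (f && fgets(line, sizeof line, f))
            if (sscanf(line, "VmRSS: %ld kB", &kib) == 1)
                break;
        if (f) fclose(f);
        return kib;
    }

    /* Dirty roughly one page of stack per recursive frame. */
    static void touch_stack(volatile char *prev, int depth) {
        volatile char frame[4096];
        frame[0] = (char)depth;
        if (depth > 0)
            touch_stack(frame, depth - 1);
        (void)prev;
    }

    static void *worker(void *arg) {
        printf("RSS before touching stack: %ld KiB\n", rss_kib());
        touch_stack(NULL, 1000);   /* dirty ~4 MiB of the reserved 8 MiB */
        printf("RSS after touching stack:  %ld KiB\n", rss_kib());
        return arg;
    }

    int main(void) {
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 8 * 1024 * 1024);  /* reserve 8 MiB */
        pthread_t t;
        pthread_create(&t, &attr, worker, NULL);
        pthread_join(t, NULL);
        return 0;
    }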
> Windows employs a distinct process/thread design, making the UNIX concept of a process foreign to it. Threads are the primary construct in Windows, and the kernel is highly optimised for thread management rather than process management. Cygwin has extensively documented the significant challenges of supporting fork(2) semantics on Windows. However, I am veering off-topic.
The NT kernel actually works similarly to Linux w.r.t. processes and threads. Internally they are the same thing. Userspace is what makes process creation slow. Actually, thread creation is also much slower than on Linux, but it's better than process creation. Defender also contributes to the problems here.
Windows can do CoW mappings; fork might even be implementable with undocumented APIs. Exec is essentially impossible, though: you can't change the identity of a process like that without changing the PID and handles.
Fun fact: the clone syscall will let you create a new task that both shares VM and keeps the same stack as the parent. Chaos results, but it is fun. You used to be able to share your PID with the parent too, which also caused much destruction.
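A sketch of that chaos (assumes x86-64, where the raw clone(2) argument order is flags, stack, parent_tid, child_tid, tls; it differs on other architectures). With CLONE_VM and a NULL stack, parent and child run concurrently on the same stack in the same address space, so expect corruption or a crash — that is rather the point:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        /* Raw syscall, not the glibc clone() wrapper (which insists on a
           separate child stack). NULL stack = child starts on the parent's. */
        long pid = syscall(SYS_clone, CLONE_VM | SIGCHLD, NULL, NULL, NULL, NULL);
        if (pid == 0) {
            /* Child: same VM, same stack pointer, same live frames as the
               parent. Touch as little as possible and leave immediately. */
            _exit(0);
        }
        waitpid(pid, NULL, 0);  /* reap the child before unwinding any frames */
        printf("survived; child was %ld\n", pid);
        return 0;
    }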
> Idle threads do increase the amount of committed stack.
I am not clear on why the stack of an idling thread would continue to grow. If a previously processed unit of work resulted in a large number of memory pages backing the thread stack getting committed, then yes, it is not common to unmap the no-longer-required pages. It is a deliberate trade-off: automatic stack shrinking is difficult to do safely and cheaply.
Idle does not actually make stacks grow, put simply.
> The nt kernel actually works similarly to Linux w.r.t. processes and threads.
Respectfully, this is slightly more than entirely incorrect.
Since Linux uses a single kernel abstraction («task_struct») for both processes and threads, it has one schedulable kernel object for both what user space calls a process and what user space calls a thread. A «process» is essentially a thread group leader plus a bundle of shared resources. Linux underwent this consolidation of abstractions decades ago, in a quest to support POSIX threads at the kernel level.
Since fork(2) is, in fact, clone(2) with a particular set of flags, what you get depends on the clone flags: whether you share the VM, files, FS context, and signal handlers, and whether you join the caller's thread group (CLONE_THREAD). fork(2) creates a new thread group with its own memory management (populated using copy-on-write), a separate signal disposition context, etc.
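A rough illustration of how the two familiar userspace entry points map onto clone(2) (glibc/NPTL flag sets; TLS and tid-address plumbing omitted):

    /* fork()           ~ clone(SIGCHLD, ...)
           new thread group; CoW copy of the VM; own file table, FS context,
           and signal handlers.

       pthread_create() ~ clone(CLONE_VM | CLONE_FS | CLONE_FILES |
                                CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM |
                                CLONE_SETTLS | CLONE_PARENT_SETTID |
                                CLONE_CHILD_CLEARTID, child_stack, ...)
           same thread group; shares VM, files, FS context, and signal
           handlers with the caller. */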
Windows has two different kernel objects: a process (EPROCESS) and a thread (ETHREAD/KTHREAD). Threads are the schedulable entities; a process is the container for address space, handle table, security token, job membership, accounting, etc. They are tightly coupled, but not «the same thing».
On Windows, «CreateProcess» is heavier than Linux fork for structural reasons: it builds a new process object, maps an image section, creates the initial thread, sets up the PEB/TEB, initialises the loader path, environment, mitigations, etc. A chunk of that work is kernel-side and a chunk is user-mode (notably the loader and, for Win32, subsystem involvement). Blaming only «userspace» is wrong.
Defender (and third-party AV/EDR) can measurably slow process creation because it tends to inspect images, scripts, and memory patterns around process start, not because of deficiencies in the kernel and system-call design.
> […] because after the fork writes to basically any page in either process will trigger memory commitment.
This is largely not true for most processes. For a child process to start writing into its own data pages en masse, there has to exist a specific code path that causes such behaviour. Processes do not randomly modify their own data space – it is either a bug or a peculiar workload that causes it.
You would have a stronger case if you mentioned, e.g., the JVM, which has a high-complexity garbage collector (rather, multiple types of garbage collectors – each with its own behaviour), but the JVM ameliorates the problem by attempting to lock in the entire heap size at startup, bailing if it fails to do so.
In most scenarios, forking a process has a negligible effect on the overall memory consumption in the system.
> In most scenarios, forking a process has a negligible effect on the overall memory consumption in the system.
Yes, that’s what they’re getting at. It’s good overcommitment. It’s still overcommitment, because the OS has no way of knowing whether the process has the kind of rare path you’re talking about for the purposes of memory accounting. They said that disabling overcommit is wasteful, not that fork is wasteful.
Yep. If you aren't overcommitting on fork it's quite wasteful, and if you are overcommitting on fork then you've already given up on not having to handle oom conditions after malloc has returned.
This is a crucial distinction and I agree when the problem is framed this way.
The original statement by another GP, however, was that fork(2) is wasteful (it is not).
In fact, I mentioned in a sister thread that the OS has no way to know what kind of behaviour the parent or the child will exhibit after forking[0].
Generally speaking, this is in line with the foundational ethos of the UNIX philosophy: UNIX gives its users a wide array of tools tantamount to shotguns that shoot both forward and backward simultaneously, and the responsibility for the resulting deaths and permanent maimings ultimately lies with the users. In comparison, memory management in the operating systems that run mainframes is substantially more complex and sophisticated.
[0] In a separate thread, somebody else mentioned a valid reverse scenario where the child idles after forking and it is the parent that dirties its data pages, causing physical memory consumption to balloon.
FWIW you can use pressure stall information (PSI) to shed load. This is superior to disabling overcommit and then praying that the first allocation to fail is in the process you actually want to respond to the resource starvation.
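A sketch of what PSI-based shedding can look like (assumes Linux 4.20+ with PSI enabled; the 5.0 threshold is an arbitrary example value and memory_pressure_avg10 is just a name I made up):

    #include <stdio.h>

    /* Read the "some" avg10 figure from /proc/pressure/memory: the share of
       recent wall time in which at least one task stalled on memory. */
    static double memory_pressure_avg10(void) {
        FILE *f = fopen("/proc/pressure/memory", "r");
        if (!f) return -1.0;
        double avg10 = -1.0;
        /* First line: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0" */
        fscanf(f, "some avg10=%lf", &avg10);
        fclose(f);
        return avg10;
    }

    int main(void) {
        double p = memory_pressure_avg10();
        if (p < 0) { fprintf(stderr, "PSI unavailable\n"); return 1; }
        if (p > 5.0)
            printf("pressure %.2f%%: shed load (reject or queue new work)\n", p);
        else
            printf("pressure %.2f%%: healthy\n", p);
        return 0;
    }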
The fact is that by the time small allocations are failing, you are almost no better off handling the NULL than you would be handling segfaults and the SIGKILLs from the OOM killer.
Often, for servers, performance will fall off a cliff long before the OOM killer is needed, too.
I don't see how you spending time in ways that work well with your challenges is different from your job providing accommodations, except that if your employer is willing to work with you, then you don't have to randomly roll the dice until you come up with an employer where things happen to work in whatever way you wanted.
It's not like one of the accommodations on the table is "not doing your job".
1. The career I would like to have, and the life I desire to live, is my free choice. Once I've made that choice, the community's responsibility is to give me whatever I need so that I can apply myself to that career and live the life I imagine for myself.
vs
2. I have certain capabilities and limitations. The community has certain needs. If there's any way for me to do so, it's my responsibility to figure out how my capabilities can serve the community's needs, respecting my limitations, and it's the community's reciprocal responsibility to make sure my contribution is fairly acknowledged so that I can live a secure and constructive life. I'll figure out the rest from there.
I see your point, but generally you don't need to roll the dice purely randomly. You generally do get some level of control over your career, your job, and the people you work with. With white-collar jobs, employers are also generally more willing to accommodate personal quirks as long as you get the actual job done.
The level of control and flexibility is much smaller when you're a student.
Also, in school there's a lot of emphasis on how you get the job done. There's a prescribed process, and learning the prescribed process is more important (to some of the teachers apparently) than getting the actual job done. This is often where neurodivergent people struggle more in school and at work.
The difference is how you relate to the job providing accommodations. If you know certain employers are more willing and able to provide accommodations then you can consciously weigh that piece of information when considering a job/career field.
Consciously accepting who you are and how you work with the world lets you navigate it better. For some people that is just feeling things out and ending up in a career that fits them. For others, it might be getting a diagnosis. The end result might be the same.
This is probably OK if you consistently sell stuff that exits the top 7 and buy stuff that enters, but I kinda doubt it's all that much better, much as the S&P 500 and the Wilshire 5000 have really similar returns.
You'll be more exposed to screwing things up when companies enter/leave.