
Unfortunately, there is a huge amount of cargo-culted cruft lying around in various Linux-on-workstation wiki guide sites that hasn’t been modernized since the 2000s. I don’t normally like to rant without providing a solution, but this is a problem I see my friends bump up against all the time when I tell them it’s finally the year of the Linux desktop. When something goes wrong, they land on the same search results that I did when I was a child, and the advice just never got updated.

There used to be a time when swapping out meant moving cogs and wheels full of heavy rocks, and RAM frequencies could be approximated by waving a stick until it made whistling noises. At that time suddenly dealing with memory swap made the system unusably unresponsive (I mean unusable, not just frustrating or irritating). Advice about disabling swap and zram for “resource-constrained” systems came from that era. Unfortunately the meme will never die, because the wikis (and now regurgitated LLM drivel) will just never let it go; nobody has gotten around to fixing it.



I had systems completely die from hitting swap just a few years ago. This is not a 2000s problem.


I’ve learned to disable swap on my scientific computing machines where we’re working on giant datasets. It’s better for the machine to crash when it exhausts its RAM than go to swap.

In my experience a machine is never going to recover when a workload pushes it into swapping because something has gone awry and that situation is not going to fix itself.


There are many reasons this situation could happen outside of your context, and swapping on SSDs is relatively harmless compared to the old days of HDDs. Random example: swapping caused by VMs; you just stop the VMs.


Yeah, on my current NVMe Linux systems, swap is just "the phase where the ongoing leak makes the system kind of sluggish, shortly before the oom killer goes to work". On 32GB, I ~never hit swap "legitimately".

Honestly, the most useful thing has been a memory usage applet in the task bar. Memory leaks usually have a very clean and visible signature that provides a few seconds of warning to hit alt-tab-tab-ctrl-c.


Was your kernel new enough to have MGLRU (kernel 6.1+)?

After that improvement, one can be swapping constantly and the machine is still responsive.
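A quick way to check, assuming a kernel built with CONFIG_LRU_GEN: MGLRU exposes its state under /sys/kernel/mm/lru_gen. A minimal Rust sketch:

```rust
use std::fs;

// Returns the raw MGLRU "enabled" bitmask string, if the kernel exposes it.
// On kernels without MGLRU (pre-6.1, or built without CONFIG_LRU_GEN),
// the sysfs file simply doesn't exist and we get None.
fn mglru_enabled() -> Option<String> {
    fs::read_to_string("/sys/kernel/mm/lru_gen/enabled")
        .ok()
        .map(|s| s.trim().to_string())
}

fn main() {
    match mglru_enabled() {
        Some(v) => println!("MGLRU enabled mask: {v}"),
        None => println!("MGLRU not available (older kernel or CONFIG_LRU_GEN off)"),
    }
}
```

A non-zero mask means the multi-generational LRU is active; writing to the same file toggles it at runtime.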


That's because memory management on a Linux workstation is an unsolved problem. I've tried every piece of advice, distro, and tool, and spent hundreds of hours trying to tune it over the years, and haven't been able to find a configuration that works as reliably as Windows or macOS do out of the box.

Linux memory management works well for servers where you can predict workloads, set resource limits, spec the right amount of memory, and, in most cases, don't care that much if an individual server crashes.

For workstations, it either kicks in too early (and kills your IDE to punish you for opening too many tabs in Chrome) or it doesn't kick in at all, even when the system has become entirely unresponsive and you have to either mash magic sysrq or reboot.


>At that time suddenly dealing with memory swap made the system unusably unresponsive

Interestingly, that was my experience on the Steam Deck with its default 1 GB swap. But after enabling both zram and a larger ordinary swap (now also the default setting for the upcoming release) it became much more stable and responsive.


Swapping in any form always sucks, period. The machine starts behaving strangely and does not tell you why, because it's trying its hardest to hide the fact that it ran out of resources.

Experience has shown me over and over that you just want to feel the limits of the machine hard and fast so you can change what you're asking of it rather than thinking that there is some perf issue or weird bug.

It's the idea that swap is somehow useful that's old. It's not, it never worked right for interactive systems. It's a mainframe thing that needs to die.


But where else are you going to put your anonymous pages when you don't want them for a while?

Lots of the stuff you're using is backed by disk anyway -- and will be removed from RAM when there's any memory pressure, whether or not you have any swap. If you've got swap then the system can put anonymous pages in it, otherwise it'll need to evict named files more frequently.

Unless you have enough RAM that you're literally never evicting anything from your page cache, in which case swap still doesn't hurt you.

I'll absolutely agree that swapping out part of the working set is unwanted, but most swapping is benign and genuinely helps performance by allowing the system to retain more useful data in RAM. You don't want to get into a state where you're paging code in and out of RAM because there's nowhere to put data that's not being used.


The whole concept of "virtual memory" has tainted systems design for decades. Treating RAM as a cache relies on the OS guessing what will be needed and what can be paged out, without actually knowing the application's requirements. And compared to CPU-level caching, the cost of page faults is big enough that the performance degradation is not linear and breaks the user experience. The idea that a 4GB machine can do the same work as an 8GB one, just slower, is simply not true. If you hit swap, you feel it badly. I'll concede that zram can work because the degradation is softer. But anything hitting the IO should be explicitly controlled by the app.

Other random semi-related thoughts:

- Rust having to define a new stdlib to be used in Linux kernel because of explicit allocation failure requirements. Why wasn't this possibility factored in from the beginning?

- Most software nowadays just abstracts memory costs away, partly explaining why a word processor that used to work fine with 64 MB of RAM now takes a gig to get anything done.

- Embedded development experience should be a requirement for any serious software engineer.


> Rust having to define a new stdlib to be used in Linux kernel because of explicit allocation failure requirements.

This is phrased in a way that’s a bit more extreme than in reality. Some new features are in the process of being added.

> Why wasn't this possibility factored in from the beginning?

So, there are a few ways to talk about this. The first is… it was! Rust has three major layers to its standard library: core, alloc, and std. core, the lowest level, is a freestanding library. alloc introduces memory allocation, and std introduces stuff that builds on top of OS functionality, like filesystems. What’s going on here is the kernel wanting to use the alloc layer. That layer is naturally a bit higher level, and so needs some more work to fit in. Just normal software development stuff.

Why didn’t alloc have fallible APIs? Because of Linux, ironically. The usual setup there means you won’t ever observe an allocation failure, so there hasn’t been a lot of pressure to add those APIs; they’re less useful than you might imagine at first. And it also goes the other way: a lot of embedded systems do not allocate dynamically at all, so for stuff smaller or lower level than Linux, there hasn’t been any pressure there either.

Also, I use the word “pressure” on purpose: like any open source project, work gets done when someone that needs a feature drives that feature forward. These things have been considered, for essentially forever, it’s just that finishing the work was never prioritized by anyone, because there’s an infinite amount of work to do and a finite number of people doing it. The Rust for Linux folks are now those people coming along and driving that upstream work. Which benefits all who come later.


Oh hello, thanks for the clarification! Having enjoyed writing some embedded Rust, I'm familiar with the core/alloc/std split. IIUC you're saying that the user-space Linux malloc API itself does not provide a reliable way for the application to think about hard memory limits? Which would fuel my pet theory about "infinite virtual memory" being a significant factor in the ever growing software bloat.


> I'm familiar with the core/alloc/std split.

Ah, okay. So yeah, it's not a new standard library, it's "things like Vec are adding .push_within_capacity() that's like push except it returns a Result and errors instead of reallocating" more than "bespoke standard library."
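For the curious: `push_within_capacity` is still unstable at the time of writing, but the already-stable `Vec::try_reserve` shows the same fallible shape. A minimal sketch (the `grow` helper is hypothetical, just for illustration):

```rust
use std::collections::TryReserveError;

// Grow a buffer fallibly: on failure the caller gets an Err back instead of
// the process aborting, so it can degrade gracefully (flush, shrink, retry).
fn grow(buf: &mut Vec<u8>, extra: usize) -> Result<(), TryReserveError> {
    buf.try_reserve(extra)
}

fn main() {
    let mut buf: Vec<u8> = Vec::new();
    assert!(grow(&mut buf, 4096).is_ok());
    // A request that can never be satisfied (capacity overflow) reports an
    // error immediately rather than aborting the process.
    assert!(grow(&mut buf, usize::MAX).is_err());
    println!("fallible allocation APIs: ok");
}
```

Note the caveat below, though: on a default Linux configuration, a "successful" reserve only means the kernel handed out address space, not that the memory is actually there.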

> IIUC you're saying that the user-space Linux malloc API itself does not provide a reliable way for the application to think about hard memory limits?

It's not the user-space malloc API, it's lower than that. See "/proc/sys/vm/overcommit_memory" in https://man7.org/linux/man-pages/man5/proc_sys_vm.5.html

The default is "heuristic overcommit." This page does a better job of explaining what that means: https://www.kernel.org/doc/Documentation/vm/overcommit-accou...

So, unless you've specifically configured this to 2, there are many circumstances where you simply will not get an error from the kernel, even if you've requested more memory than available.

What happens in this case is that your program will continue to run. At some point, it will access the bad allocation. The kernel will notice that there's not actually enough memory, and the "oom killer" will decide to kill a process to make space. It might be your process! It also might not be. Just depends. But this happens later, and asynchronously from your program. You cannot handle this error from inside your program.

So even if these APIs existed, they wouldn't change the behavior: they would faithfully report what the kernel reported to them: that the allocation succeeded.
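A program can at least discover which regime it is running under. A minimal Linux-only sketch (path and value meanings taken from proc(5); returns None on non-Linux systems):

```rust
use std::fs;

// Linux overcommit policy, per proc(5):
//   0 = heuristic overcommit (the default),
//   1 = always overcommit,
//   2 = strict accounting ("never overcommit"), the only mode in which
//       allocation failures are reliably reported to the caller.
fn overcommit_policy() -> Option<u8> {
    fs::read_to_string("/proc/sys/vm/overcommit_memory")
        .ok()?
        .trim()
        .parse()
        .ok()
}

fn main() {
    match overcommit_policy() {
        Some(2) => println!("strict accounting: allocation failures are reportable"),
        Some(p) => println!("overcommit mode {p}: oversized allocations may 'succeed'"),
        None => println!("not a Linux system (or /proc not mounted)"),
    }
}
```

Under modes 0 and 1, the fallible APIs discussed above can only report what the kernel told them, which is usually "success".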


Most of the time, you want to use RAM as a cache for the disk. I was trying to make the argument that sometimes that disk cache is more valuable than an under-used anonymous mapping.

Steve has responded to your comment about Rust; to your other comments:

Modern applications do a lot more than old ones. Even if you only use 20% of the features, you probably use a different 20% from any arbitrary other person. You also probably benefit from the OS being able to map everything into virtual memory but only actually load the bits you use :).

And I strongly disagree with your stance on being "serious". I'm sure you don't mean to gate-keep, but we need to teach people where they are rather than giving them hoops to jump through.

In my experience, some of the best software engineers have very little embedded development background. And I say that as someone who implemented 64-bit integer support for the compiler and RTL for a DSP part back in the day. It's useful to have people around with a variety of backgrounds; it's not necessary for everyone to share any particular experience.


Swapping to zram is just fine and it will improve experience on many machines.


Yeah, I agree. The memory-to-memory copies plus modern CPU power make it transparent, or at least give it a soft roll-off that IO-based swap never achieves. But it's still a hack, one too often used by manufacturers to cheap out on RAM in machines.

As the gas-powered engine people will say: "there's no replacement for displacement" (I won't push the analogy comparing zram to turbocharging but, you know, they both deal with "compression"...)


I have similar experiences. I've been digging into this more over the years and my two conclusions are: (a) Linux memory management is overall rather complex and contains many rather subtle decisions that speed up systems. (b) Most recommendations you find about it are old, rubbish, or not nuanced enough.

Like one thing I learned some time ago: swap-out in itself is not a bad thing. swap-out on its own means the kernel is pushing memory pages it currently doesn't need to disk. It does this to prepare for a low-memory situation, so that if push comes to shove, some pages are already written to disk. And if the page is dirtied later on before needing to swap it back in, alright, we wasted some iops. Oh no. This occurs quite a bit, for example, for long-running processes with rarely used code paths, or with processes that do something once a day or so.

swap-in on the other hand is nasty for the latency of processes. Which, again, may or may not be something to care about. If a once-a-day monitoring script starts a few milliseconds slower because data has to be swapped in... so what?

It just becomes an issue if the system starts thrashing and rapidly cycling pages in and out of swap. But in such a situation, the system would start randomly killing services without swap, which is also not entirely conducive to a properly working system. Especially because it'll start killing stuff using a lot of memory... which, on a server, tends to be the thing you want running.
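The swap-out vs. swap-in distinction is directly observable: /proc/vmstat exposes cumulative pswpin/pswpout counters (pages since boot). A minimal Linux-only sketch (returns None where /proc is unavailable):

```rust
use std::fs;

// Cumulative pages swapped in/out since boot, from /proc/vmstat (Linux-only).
// A growing pswpout with a flat pswpin is the benign pattern described above;
// both counters climbing rapidly together suggests thrashing.
fn swap_counters() -> Option<(u64, u64)> {
    let text = fs::read_to_string("/proc/vmstat").ok()?;
    let find = |key: &str| {
        text.lines()
            .find(|l| l.starts_with(key))
            .and_then(|l| l.split_whitespace().nth(1))
            .and_then(|v| v.parse().ok())
    };
    Some((find("pswpin ")?, find("pswpout ")?))
}

fn main() {
    match swap_counters() {
        Some((swapped_in, swapped_out)) => {
            println!("pswpin={swapped_in} pswpout={swapped_out}");
        }
        None => println!("no /proc/vmstat (not Linux?)"),
    }
}
```

Sampling these two numbers a few seconds apart tells you which of the two situations you are actually in.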


It is not just advice.

Default configs of most distros are set up for server-style work, even on workstation distros. So they’ll have CPU and IO schedulers optimized for throughput instead of latency, meaning a laggy desktop under load. The whole virtual memory system still runs things like it is on spinning rust (multiple page files in cache, low swappiness, etc).

The only distro without this problem is Asahi. It’s bespoke for MacBooks, so it’s been optimized all the way down to the internal speakers(!).


> Default configs of most distros are set up for server-style work, even on workstation distros. So they’ll have CPU and IO schedulers optimized for throughput instead of latency, meaning a laggy desktop under load. The whole virtual memory system still runs things like it is on spinning rust (multiple page files in cache, low swappiness, etc).

LOL. A Con Kolivas problem, circa 2008, still there :-)))


> At that time suddenly dealing with memory swap made the system unusably unresponsive (I mean unusable, not just frustrating or irritating).

I had a machine freeze this month because it was trying to zram swap, and have hit shades of the problem over the last few years on multiple machines running multiple distros. Sometimes running earlyoom helps, but at that point what's the point of swap? So no, this isn't out of date.


This is OS-agnostic. I love the old fact that you should have twice the amount of swap as your RAM size. I could rant, but no. Just don't.

Today, don't buy a computer (regardless of size) with less than 32 GB of RAM. Yes, this applies to fruity products as well. Apart from making it a more enjoyable experience, it will also extend the usable life of the computer immensely.

(The weird crap about apple computers not needing as much RAM comes from iOS vs. android and is for different reasons, does not apply to real computers)


I don’t understand the sentiment. People should analyze what they actually use and what they actually need. Sure, I bought a 64 GB RAM MacBook because I like toys and don’t want to think about it, but for 80% of my workload 8 GB is fine, and for my partner it’s fine for 100%.


8 GB can, even in this Electron world, barely work. But it won't tomorrow. Buying something with 8 GB today is wasting an otherwise perfectly good computer.

And when your partner gets a new computer, for whatever reason, the old one can easily live on for many, many years. But its utility will be limited if it only has 8 GB of RAM.

The product in the article is only 8 years old but already stretching its usefulness for no good reason.


> I love the old fact that you should have twice the amount of swap as your RAM size.

With a laptop that has 32 GB of RAM and a 256 GB SSD, it would be really weird to dedicate 64 GB of the disk to swap.


Maybe I was unclear, I despise that rule.

(also, a computer with 32 GB of RAM and a 256 GB disk is a very weird combination, not quite fitting a typical general-purpose computer)



