> Since everyone is treating containers as cattle CRIU doesn't seem to get much ...

__turbobrew__ · on June 22, 2024

The thing with VMs is that booting a Linux VM carries much more overhead, which makes checkpointing much more attractive. A container running with Linux namespaces/cgroups can be started in a few milliseconds.

I’m sure there are some niche applications for container checkpointing, but I don’t really see the complexity being worth it. Maybe checkpointing some long-running batch jobs could save you some money, but you should just make your jobs checkpoint their state to an external store such as Ceph or S3 and make the jobs smart enough to load any state from those stores if they are preempted.
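A rough sketch of what that application-level checkpointing could look like, in Python. Everything here is an assumption for illustration: the function names (`save_checkpoint`, `load_checkpoint`, `run_job`), the JSON state layout, and the `stop_after` knob that fakes a preemption are all made up, and a local file stands in for the external store so the example runs without Ceph or S3.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the old checkpoint.

    The rename is atomic, so a preemption mid-write leaves the previous
    checkpoint intact. (Hypothetical stand-in for an upload to Ceph/S3.)
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if none exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_item": 0, "total": 0}


def run_job(path, items, checkpoint_every=10, stop_after=None):
    """Sum items, resuming from the last checkpoint.

    stop_after simulates a preemption: the job returns None after
    processing that many items, without saving a final checkpoint.
    """
    state = load_checkpoint(path)
    processed = 0
    for i in range(state["next_item"], len(items)):
        if stop_after is not None and processed >= stop_after:
            return None  # simulated preemption
        state["total"] += items[i]
        state["next_item"] = i + 1
        processed += 1
        if state["next_item"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

If the job is killed between checkpoints it redoes at most `checkpoint_every` items on restart, because the running total is reloaded from the store together with the position rather than kept only in memory.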

yencabulator · on June 23, 2024

Firecracker starts running the application in as low as 125ms. Most of the overhead in a "cloud cold start" comes from the cloud infrastructure, not from the virtualization mechanism.

yourapostasy · on June 22, 2024

Yeah, I’ve always held a soft spot for CRIU, but I don’t see it as battle-tested enough to trust big, gnarly closed-source third-party vendor enterprise Java products to run under it. And if I’ve reduced an open source piece of kit’s execution footprint enough to trust it, I’d probably reach for Unikraft with checkpointing to a Persistent Volume before CRIU, which feels like the early days of VMware.

Hopefully though, my trepidation is wrong. What is the most complex piece of software others have run under CRIU in production, and for how long?