
CRIU is used by LXD to save the state of an LXD container, very similar to suspending or snapshotting a virtual machine.

Unfortunately, I was disappointed to find `lxc stop --stateful` couldn't save any of my LXD containers. There was always some error or other. This is how I learned about CRIU: the failures came from limitations of CRIU when used with the sorts of things running in LXD.

  # lxc stop --stateful test
  (00.121636) Error (criu/namespaces.c:423): Can't dump nested uts namespace for 2685261
  (00.121645) Error (criu/namespaces.c:682): Can't make utsns id
  (00.150794) Error (criu/util.c:631): exited, status=1
  (00.190680) Error (criu/util.c:631): exited, status=1
  (00.191997) Error (criu/cr-dump.c:1768): Dumping FAILED.
  Error: snapshot dump failed

LXD is generally used with "distro-like" containers, like running a small Debian or Ubuntu distro, rather than single-application containers as are used with Docker.

It turns out CRIU can't save the state of those types of containers, so in practice `lxc stop --stateful` never worked for me.

I'd have to switch to VMs if I wanted their state saved across host reboots, but VMs lack some of the host-guest filesystem sharing behaviours that I needed.

In practice this meant I had to live with never rebooting the host. Thankfully Linux just keeps on working for years without a reboot :-)



Stéphane Graber (key Incus née LXD contributor) just did a video about developing placement scriptlets in the Starlark language. The interesting thing, if I'm interpreting what I saw correctly, is that his cluster was 6 beefy servers plus 3 decent-sized VMs. The idea, I think, was that containers could get placed on the nested VMs, neatly solving the migration issue with containers. It also looked like the 3 VMs in the cluster may themselves have been running on that same cluster.

I could be wrong, though. Interesting approach if true.


> Linux just keeps on working for years without a reboot

Except I would strongly suggest not doing that, as some very nasty security issues have been fixed of late.


> (00.121636) Error (criu/namespaces.c:423): Can't dump nested uts namespace for 2685261

Found a GitHub issue for this: https://github.com/checkpoint-restore/criu/issues/1430

The issue, apparently, is that newer systemd versions create their own UTS namespace, so running systemd in a container now results in a nested UTS namespace. Containers with older versions of systemd, or which don't use systemd, shouldn't have the issue.
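
If you want to check whether a container is affected before trying a stateful stop, one rough approach is to compare the UTS namespace links under /proc for the container's processes. This is only a sketch: the function names are mine, it assumes you already have the list of host-side PIDs belonging to the container, and reading another process's namespace links generally needs root. It does not reproduce CRIU's own check, it only looks for processes whose UTS namespace differs from the container init's.

```python
import os
from collections import defaultdict


def read_ns_link(pid, ns_type="uts"):
    """Return the namespace id for a PID, e.g. 'uts:[4026531838]', or None."""
    try:
        return os.readlink(f"/proc/{pid}/ns/{ns_type}")
    except OSError:
        # Process is gone, permission denied, or no procfs on this system.
        return None


def group_by_namespace(ns_links):
    """Group PIDs by namespace id, skipping PIDs whose link was unreadable."""
    groups = defaultdict(list)
    for pid, link in ns_links.items():
        if link is not None:
            groups[link].append(pid)
    return dict(groups)


def pids_outside_namespace(ns_links, reference_pid):
    """Return PIDs whose namespace differs from the reference PID's.

    ns_links should only contain the container's own processes, and
    reference_pid would normally be the container's init process.
    """
    ref = ns_links.get(reference_pid)
    if ref is None:
        raise ValueError("no namespace link for the reference PID")
    return sorted(
        pid for pid, link in ns_links.items()
        if link is not None and link != ref
    )


def scan_pids(pids, ns_type="uts"):
    """Read the namespace link of each given PID from /proc."""
    return {pid: read_ns_link(pid, ns_type) for pid in pids}


# Made-up example data: PID 50 is the container init, PID 77 is a service
# that has been put in its own UTS namespace.
example_links = {
    50: "uts:[4026532500]",
    51: "uts:[4026532500]",
    77: "uts:[4026532777]",
}
nested = pids_outside_namespace(example_links, 50)
```

On a real host you would build the mapping with `scan_pids(...)` instead of the made-up dictionary. A non-empty result suggests the container has a nested UTS namespace of the kind the error message complains about.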

One commenter posted in April 2021 that they had a patch to add support for nested UTS namespaces, but they don't appear to have submitted it: https://github.com/checkpoint-restore/criu/issues/1430#issue...

A comment on another issue suggests how to implement nested UTS namespace support: https://github.com/checkpoint-restore/criu/issues/1011#issue...

It doesn't sound like nested UTS namespace support is impossible, just something nobody has got around to implementing.

A comment in the CRIU source code says nested namespaces are only supported for mount namespaces (CLONE_NEWNS) and network namespaces (CLONE_NEWNET): https://github.com/checkpoint-restore/criu/blob/b5e2025765b9...

But if you look at the OpenVZ fork of CRIU, you see it also supports PID (CLONE_NEWPID), UTS (CLONE_NEWUTS) and IPC (CLONE_NEWIPC) namespaces: https://bitbucket.org/openvz/criu.ovz/src/d9bf55896015a27df9...

I don't know why these additional features in OpenVZ CRIU don't exist in the upstream.

I think the main blocker to supporting nesting of the other namespace types (user, cgroup, time) is someone getting around to writing the code for it. It is possible some of them pose an architectural issue where a kernel enhancement might be necessary (if that's true of any, I'd say most likely of user), but I suspect for most of them it is simply that nobody has gotten around to it.

The other issue is that eventually someone will add another namespace type to the Linux kernel, and then CRIU will need to support that too.
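
To summarise the above in one place, here is a small sketch that tabulates the eight namespace types with their clone flags and records which ones each CRIU variant handles when nested. The flag values are the ones from the kernel's sched.h; the two "supported" sets only encode what the source comments linked above say, not anything I've tested, and the helper name is made up.

```python
# Namespace types as named under /proc/<pid>/ns, with their clone(2) flags.
CLONE_FLAGS = {
    "mnt": 0x00020000,     # CLONE_NEWNS
    "cgroup": 0x02000000,  # CLONE_NEWCGROUP
    "uts": 0x04000000,     # CLONE_NEWUTS
    "ipc": 0x08000000,     # CLONE_NEWIPC
    "user": 0x10000000,    # CLONE_NEWUSER
    "pid": 0x20000000,     # CLONE_NEWPID
    "net": 0x40000000,     # CLONE_NEWNET
    "time": 0x00000080,    # CLONE_NEWTIME
}

# Assumption: taken from the source comments linked in this thread.
UPSTREAM_NESTED = {"mnt", "net"}
OPENVZ_NESTED = UPSTREAM_NESTED | {"pid", "uts", "ipc"}


def unsupported_nested(nested_types, supported):
    """Return the nested namespace types that a given CRIU variant lacks."""
    unknown = set(nested_types) - set(CLONE_FLAGS)
    if unknown:
        raise ValueError(f"unknown namespace types: {sorted(unknown)}")
    return sorted(set(nested_types) - set(supported))


# A systemd container as described above: nested mount and UTS namespaces.
blocking_upstream = unsupported_nested({"mnt", "uts"}, UPSTREAM_NESTED)
blocking_openvz = unsupported_nested({"mnt", "uts"}, OPENVZ_NESTED)
not_nested_anywhere = sorted(set(CLONE_FLAGS) - OPENVZ_NESTED)
```

The last line gives the types that neither variant nests according to those comments, which matches the user, cgroup and time list above. A new kernel namespace type would mean another entry in the table and, until someone writes the support, another entry in that list.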



