# Unpacking Linux containers: understanding Docker and its alternatives

By [Optimized by Otto](https://paragraph.com/@otto) · 2023-05-08

---

![](https://storage.googleapis.com/papyrus_images/6d42c229da8ab4318030bb7513c796e9.jpg)

In popularizing Linux containers, Docker brought about a new era of systems design based on these lightweight platforms, rather than heavy virtual machines. However, now that Docker is slowly declining, it’s time to learn about the next generation of Linux container tools.

Docker
------

When [Docker](https://en.wikipedia.org/wiki/Docker_%28software%29) officially launched in 2013, it was not the first containerization solution for Linux. For example, Linux already had [LXC](https://en.wikipedia.org/wiki/LXC) back in 2008 (early versions of Docker ran on top of it), and [FreeBSD jails](https://en.wikipedia.org/wiki/FreeBSD_jail) had been around since 1999. Nevertheless, Docker _was_ the first developer-friendly and complete end-to-end solution that let us easily create, distribute, and run Linux containers.

Not only was it technically sound and convenient to use, but Docker was also a great example of a successful and well-run open source project. I experienced this personally during a couple of contributions: two people did the initial review within 24 hours of my Pull Request, and a third person merged it less than two weeks after the submission date. Docker developers also contributed _back_ to Linux plenty of containerization-related improvements, drove standardization efforts, and spun off many subcomponents (e.g., [containerd](https://containerd.io/), [OCI](https://en.wikipedia.org/wiki/Open_Container_Initiative), [BuildKit](https://github.com/moby/buildkit)).

Today, container-based system architectures and development workflows are extremely popular, as seen with, for instance, the rise of [Kubernetes](https://en.wikipedia.org/wiki/Kubernetes). While we are _still_ waiting for the _‘year of the Linux desktop’_ to happen, Docker certainly did make more Windows and Mac users run a virtual Linux machine on their laptops than ever before.

The company Docker Inc was, from the start, a venture-funded endeavor centered around an [open core model](https://en.wikipedia.org/wiki/Open-core_model) and launched many closed-source products that drove revenue over the years. What used to be the core Docker software was renamed _Moby_ in 2017, and that is where the open-source contributions (e.g., [mine from 2015](https://github.com/moby/moby/commit/b619220ce11770ffaea068b54d3975c74f7c24f9)) can be found. The founder [Solomon Hykes](https://twitter.com/solomonstre) no longer works for Docker Inc, and in recent years public sentiment around Docker has suffered due to various controversies. Yet at the same time, many similar (and some perhaps _better_) solutions have entered the space.

What actually _is_ a Docker container?
--------------------------------------

To build a container, a software developer first writes a [Dockerfile](https://docs.docker.com/engine/reference/builder/), which defines which Linux distribution the container is based on, along with the software, configuration files, and data it contains. Much of a `Dockerfile`’s contents is essentially shell script.
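As an illustration, a minimal `Dockerfile` might look like the following (the base image tag, package, and file names are just examples):

```dockerfile
# Base the image on a specific Debian release
FROM debian:bookworm-slim

# Install a web server; each instruction below becomes one filesystem layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends nginx && \
    rm -rf /var/lib/apt/lists/*

# Copy a file from the build context into the image
COPY index.html /var/www/html/

# The command to run when the container starts
CMD ["nginx", "-g", "daemon off;"]
```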

The build is done with the command `docker build`, which executes the contents of the `Dockerfile` line by line and creates a Linux-compatible root filesystem (files under `/`). This is done using a clever overlay filesystem, where each line in the `Dockerfile` amounts to one new layer. Thus, rebuilds of the container do _not_ need to rebuild the whole filesystem, but can execute just the `Dockerfile` lines that changed since the previous build.
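The layer reuse is easy to observe in practice: a rebuild reports unchanged steps as cached, and `docker history` lists one row per layer (the image tag here is illustrative):

```shell
$ docker build -t demo-image .
$ docker build -t demo-image .   # unchanged Dockerfile lines are reported as CACHED
$ docker history demo-image      # one row per layer, with the size each one adds
```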

On a typical Linux system, the filesystem layers after a `docker build` execution can be found at `/var/lib/docker/`. If the container was based on Debian, one could find, for example, the `apt-get` binary of the image at a path like `/var/lib/docker/overlay2/c1ead1[...]d04e06/diff/usr/bin/apt-get`.

Additionally, some metadata is created in the process, which designates among other things the _entrypoint_ of the container — i.e. what binary on the root filesystem to run when starting the container.

Unpacking a container
---------------------

To inspect what the root filesystem of the Docker image `debian:sid` looks like, one could create a container and inspect the mounted merged filesystem:

```shell
$ docker container create -i -t --name demo debian:sid
2734eb[...]d18852
$ cat /var/lib/docker/image/overlay2/layerdb/mounts/2734eb[...]d18852/mount-id
2854c7[...]9dfe25
$ find /var/lib/docker/overlay2/2854c7[...]9dfe25 | grep apt-get
/var/lib/docker/overlay2/2854c7[...]9dfe25/merged/usr/share/man/man8/apt-get.8.gz
/var/lib/docker/overlay2/2854c7[...]9dfe25/merged/usr/share/man/pt/man8/apt-get.8.gz
/var/lib/docker/overlay2/2854c7[...]9dfe25/merged/usr/bin/apt-get
```

The command [docker export](https://docs.docker.com/engine/reference/commandline/export/) makes it easy to extract the root filesystem into, for example, a tar archive.

```shell
$ docker export demo > debian-sid.tar
$ tar xvf debian-sid.tar
.dockerenv
bin
boot/
dev/
dev/console
dev/pts/
dev/shm/
etc/
etc/.pwd.lock
etc/alternatives/
etc/alternatives/README
etc/alternatives/awk
...
var/spool/mail
var/tmp/
$ find . | grep apt-get
./usr/share/man/man8/apt-get.8.gz
./usr/bin/apt-get
```

In theory, **anything could create this root filesystem**, and likewise anything could run a binary inside it — even the classic [chroot](https://en.wikipedia.org/wiki/Chroot). If you edit the files and want to get them back into Docker to run as a container, [docker import](https://docs.docker.com/engine/reference/commandline/import/) makes it easy.
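As a sketch of that idea (reusing the `debian-sid.tar` exported above, and assuming root privileges; the target directory is arbitrary), plain `chroot` can run a binary from the exported filesystem with no Docker involved at all:

```shell
$ mkdir /tmp/demo-rootfs
$ tar -xf debian-sid.tar -C /tmp/demo-rootfs
$ sudo chroot /tmp/demo-rootfs /usr/bin/apt-get --version
```

Note that this only changes the root directory; none of the namespace or cgroup isolation of a real container runtime is set up here.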

To export a full container image with both the root filesystem and the metadata, the [docker buildx](https://docs.docker.com/build/exporters/oci-docker/) command offers some output format options, such as the [Open Container Initiative standard format](https://github.com/opencontainers/image-spec/blob/v1.0.2/image-layout.md) or the [Docker native image format](https://github.com/moby/moby/blob/v24.0.5/image/spec/v1.2.md). To import a full container image with metadata, refer to the [docker load](https://docs.docker.com/engine/reference/commandline/load/) command.
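For example, with a `Dockerfile` in the current directory, `buildx` can write either format straight to a file (the file names here are illustrative):

```shell
$ docker buildx build --output type=oci,dest=image-oci.tar .
$ docker buildx build --output type=docker,dest=image-docker.tar .
$ docker load < image-docker.tar
```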

Orchestrating a container start with dockerd, containerd and runc
-----------------------------------------------------------------

In the above example, a container was created, but not _started_. To start a container, one can try running:

```shell
$ docker run -it debian:sid bash
root@c9a8e6c222ae:/#
```

From a user experience point of view, you are basically dropped into a Bash shell in a Debian Sid container. Under the hood, the `docker` command-line tool sends an HTTP request to the `dockerd` daemon running on the local system, which in turn asks `containerd` to run the container, which _in turn_ starts `runc` directly or (for backwards-compatibility reasons) a `containerd-runc-shim`. Under that shim, you can find the actual running Bash process:

```shell
$ ps fax | grep -C container
 1122 /usr/bin/containerd
 1660 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
55409 /usr/bin/containerd-shim-runc-v2 -namespace moby -id c9a8e[..]0847e -address /run/containerd/containerd.sock
55428  \_ bash
```

Anyway, if you’re fine with slightly _less_ automation and having more of a “hands-on” experience, read the [man page for runc](https://manpages.debian.org/unstable/runc/runc.8.en.html) and try running the container directly with it.
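A rough sketch of that hands-on route, following the bundle layout from the man page (the bundle directory and container name here are made up):

```shell
$ mkdir -p bundle/rootfs
$ docker export demo | tar -x -C bundle/rootfs
$ cd bundle
$ runc spec                  # writes a default config.json for this bundle
$ sudo runc run demo-runc    # starts the container defined by config.json
```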

Alternatives in the Linux containers stack
------------------------------------------

The Linux Foundation has a nice architecture schema to illustrate the various components and alternatives in the stack that originally evolved from Docker:

![Linux containers architecture diagram from containerd.io](https://substack-post-media.s3.amazonaws.com/public/images/84737ade-03bd-493e-8001-8adaf2ad57ad_1200x727.png "Linux containers architecture diagram from containerd.io")

`runc` is the reference implementation of the [OCI](https://en.wikipedia.org/wiki/Open_Container_Initiative) runtime specification. A popular alternative to `runc` is [crun](https://github.com/containers/crun), implemented in C to be faster and use less memory than the Go-based `runc`. One layer up, [CRI-O](https://cri-o.io/) is a smaller and faster alternative to `containerd`, with just enough features to be a good fit for Kubernetes.

There are also container runtimes, such as [Kata](https://katacontainers.io/) and [ZeroVM](https://www.zerovm.org/), based on the idea of running each container inside a minimal virtual machine, which achieves better isolation between containers than running them directly on the same host. This design aims to hit a “sweet spot” between the performance of lightweight containers and the security of traditional full virtual machines.

Podman
------

Missing from the diagram above is Docker’s current major competitor: the Red Hat-sponsored [Podman](https://github.com/containers), which offers a complete replacement for the whole Docker stack.

The command-line tool `podman` is designed to be a drop-in replacement for `docker`, so one can run the earlier command examples by just changing the first word: `podman build ...`, `podman container create ...`, `podman export ...` and so forth. Even `podman volume prune --force && podman system prune --force` does exactly the same as the Docker equivalent — which is nice, as I tend to run that frequently to clean away containers and free disk space when I’m not actively using them.
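Because the command-line interfaces match, migration can be as simple as a shell alias (many distributions also ship a `podman-docker` package that provides a `docker` command wrapper):

```shell
$ alias docker=podman
$ docker build -t demo-image .    # actually invokes podman build
$ docker system prune --force     # same cleanup as before, via podman
```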

To start a container one can run (for example):

```shell
$ podman run -it debian:sid bash
root@312cbccb5938:/#
```

When a container started like this is running, you would see in the process list something along the lines of:

```shell
87524 \_ podman
99902     \_ /usr/libexec/podman/conmon --api-version 1 -c 312cbc[...]93a0e1 -u 312cbc[...]93a0e1 -r /usr/bin/crun -b /home/otto/.local/share/containers/storage/overlay-containers/312cbc[...]93a0e1/userdata -p /run/user/1001/containers/overlay-containers/312cbc[...]93a0e1/userdata/pidfile -n naughty_dewdney --exit-dir /run/user/1001/libpod/tmp/exits --full-attach -l journald --log-level warning --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/user/1001/containers/overlay-containers/312cbc[...]93a0e1/userdata/oci-log -t --conmon-pidfile /run/user/1001/containers/overlay-containers/312cbc[...]93a0e1/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/otto/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1001/containers --exit-command-arg --log-level --exit-command-arg warning --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1001/libpod/tmp --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 312cbc[...]93a0e1
99905         \_ bash
```

Unlike with Docker, there is no `containerd` or `runc` at play; instead, [conmon](https://manpages.debian.org/unstable/conmon/conmon.8.en.html) runs [crun](https://manpages.debian.org/unstable/crun/crun.1.en.html), which is the _actual_ container runtime. Note also that the container runs with regular user permissions (no need for root) and that the default location for storing container images and other data is `~/.local/share/containers/` in the user’s home directory.

### Podman Desktop

While I _personally_ prefer to work on the command-line, I need to give a shoutout to Podman for also having a nifty desktop application for those who prefer to use graphical tools:

![Podman Desktop demo](https://substack-post-media.s3.amazonaws.com/public/images/0ba76ead-5885-4f11-99ab-5a8dc711e5f9_926x597.gif "Podman Desktop demo")

LXC and LXD
-----------

The basic utility of Linux containers is to give system administrators a building block that behaves a bit _like_ a virtual machine in terms of being an encapsulated unit — **but without being as slow and resource-hungry as actual virtual machines!** Although containers typically ship a full root filesystem, the Docker philosophy was that each container should run just _one_ process — and run it _well_ — and, crucially, not have any process managers or init systems inside the container. Many system administrators, however, _do_ in practice run Docker containers that use, for example, [runit](https://en.wikipedia.org/wiki/Runit) to ‘boot’ the container and manage server daemon processes inside it.

The Canonical-backed [LXD](https://ubuntu.com/lxd), however, is tailored _specifically_ to this type of use case, building upon LXC. After installing LXD and running `lxd init` to configure it, you can run full containerized operating systems with:

```shell
$ lxc launch images:debian/sid demo
Creating demo
Starting demo
$ lxc exec demo -- bash
root@demo:~#
```

The host process list will show something along the lines of:

```shell
root    105632 lxcfs /var/snap/lxd/common/var/lib/lxcfs -p /var/snap/lxd/common/lxcfs.pid
root    105466 /bin/sh /snap/lxd/24061/commands/daemon.start
root    105645  \_ lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
lxd     105975      \_ dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --pid-file= --no-ping --interface=lxdbr0 --dhcp-rapid-commit --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.199.145.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.199.145.2,10.199.145.254,1h --listen-address=fd42:3147:bafe:37e5::1 --enable-ra --dhcp-range ::,constructor:lxdbr0,ra-stateless,ra-names -s lxd --interface-name _gateway.lxd,lxdbr0 -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.raw -u lxd -g lxd
...
root    107867      \_ /snap/lxd/current/bin/lxd forkexec demo /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/demo/lxc.conf 0 0 0 -- env PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOME=/root USER=root LANG=C.UTF-8 TERM=xterm-256color -- cmd bash
1000000 107870          \_ bash
root    106209 [lxc monitor] /var/snap/lxd/common/lxd/containers demo
1000000 106221  \_ /sbin/init
1000000 106372      \_ /lib/systemd/systemd-journald
1000000 106401      \_ /lib/systemd/systemd-udevd
1000997 106420      \_ /lib/systemd/systemd-resolved
1000100 106431      \_ /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
1000000 106433      \_ /lib/systemd/systemd-logind
1000000 106436      \_ /sbin/agetty -o -p -- \u --noclear --keep-baud - 115200,38400,9600 linux
1000998 106446      \_ /lib/systemd/systemd-networkd
```

Notice how the daemon runs as root (and interacting with `lxd`/`lxc` requires root permissions). However, thanks to UID mapping, the root user inside the container is _not_ the root user of the host system. This is one of the key design differences — and _why LXD is considered more secure than Docker_.

The downloaded root filesystems are stored at `/var/snap/lxd/common/lxd/images/`, while the filesystems of running containers can be found at `/var/snap/lxd/common/lxd/storage-pools/default/containers/` as long as the LXD storage is directory-based (as opposed to an LVM or OpenZFS pool).

The examples above all have `snap` in their paths, as there is no native Ubuntu package for LXD; even running `apt install lxd` forces users to install the Snap.

As `lxd` controls the whole system, the command for managing individual containers is `lxc`:

```shell
$ lxc list
+------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| NAME | STATE   | IPV4                | IPV6                                          | TYPE      | SNAPSHOTS |
+------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| demo | RUNNING | 10.199.145.6 (eth0) | fd42:3147:bafe:37e5:216:3eff:fe01:8da8 (eth0) | CONTAINER | 0         |
+------+---------+---------------------+-----------------------------------------------+-----------+-----------+
$ lxc delete demo --force
```

Thus, the process of creating LXC-compatible container images is fairly simple. One can use any container builder to create the root filesystem (the [LXD tutorial recommends using debootstrap](https://ubuntu.com/tutorials/create-custom-lxd-images#3-creating-basic-system-installation) directly), and the basic metadata YAML file is so brief that it can be written by hand. These are then imported into LXC with `lxc image import metadata.tar.gz rootfs.tar.gz --alias demo`.
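A minimal sketch of that hand-written metadata, following the format from the LXD tutorial (assuming a `rootfs.tar.gz` already built with, e.g., debootstrap; the description and architecture are examples):

```shell
# Minimal image metadata: architecture and creation date are the essentials
cat > metadata.yaml <<EOF
architecture: x86_64
creation_date: $(date +%s)
properties:
  description: Hand-built Debian sid image
  os: debian
  release: sid
EOF

tar czf metadata.tar.gz metadata.yaml

# Import into LXD (requires a configured LXD installation):
# lxc image import metadata.tar.gz rootfs.tar.gz --alias demo
```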

The whole LXD stack ships with integrated tooling — even offering metal-as-a-service (MAAS) capabilities — so it goes _way beyond_ what the Docker stack offers.

So, where are we headed?
------------------------

To fully grasp how containers actually work, you should read the Linux kernel documentation on [namespaces](https://man7.org/linux/man-pages/man7/namespaces.7.html) and permission control via [capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html). Keeping an eye on the progress of the [Open Container Initiative](https://opencontainers.org/) will keep you right on top of the latest developments, and considering OCI compatibility in your infrastructure will enable you to migrate between Docker, Podman, and LXD easily.
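Those kernel primitives can be poked at directly with `unshare` from util-linux; for instance, a new user namespace maps your own UID to root inside it (which is essentially what rootless Podman relies on), and a new PID namespace gives the spawned process PID 1, just like a container init:

```shell
# New user namespace: the current user becomes UID 0 inside it
$ unshare --user --map-root-user sh -c 'id -u'
0

# New PID namespace: the spawned shell sees itself as PID 1
$ sudo unshare --pid --fork --mount-proc sh -c 'echo $$'
1
```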

Choosing the right container technology to use depends on _where_ you intend to ship your containers. For developers targeting Kubernetes compatible production environments, Podman probably makes the most sense at the moment. Or, if your infrastructure consists of a lot of virtualized Ubuntu hosts and you want to have more flexibility, LXD is probably a good choice.

Podman is certainly gaining a lot of popularity [according to Google Trends](https://trends.google.com/trends/explore/TIMESERIES/1693782600?hl=en-US&tz=420&date=today+5-y&hl=en-CA&q=%2Fg%2F11j4j_npvw,lxc&sni=3). Docker will, however, continue to have the largest mindshare among average developers for years to come. For now, my recommendation is for all system administrators and software architects to understand how the tools you _rely on_ actually work — _by getting your hands dirty with them_. Choose the solutions you understand best, and keep an eye on the horizon for what’s coming next!

---

*Originally published on [Optimized by Otto](https://paragraph.com/@otto/unpacking-linux-containers-understanding-docker-and-its-alternatives)*
