LXC howto for myself

2025-07-27

Pet was using Debian-based systems while writing this memo, and it holds the strong opinion that apt's --no-install-recommends option is extremely important. Without it you easily get lots of crap installed, both on the host system and in containers. Best to make it the default by creating /etc/apt/apt.conf.d/01-norecommends with the following content:

APT::Install-Recommends "0";
APT::Install-Suggests "0";

Links that helped pet

"Worse-Better" knobs on the host system

kernel.dmesg_restrict = 1 in /etc/sysctl.conf. This makes dmesg output inaccessible from unprivileged containers.
kernel.unprivileged_bpf_disabled = 1. Pet does not know exactly why, but thinks it's worth applying.
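
A minimal sketch of adding both knobs without duplicating lines on a second run, assuming they live in /etc/sysctl.conf as described. The helper name ensure_sysctl and its optional file argument are made up for illustration.

```shell
# ensure_sysctl KEY VALUE [FILE]
# Appends "KEY = VALUE" to FILE unless a line for KEY is already there.
# An existing line is left untouched, even if its value differs.
# FILE defaults to /etc/sysctl.conf; the helper name is hypothetical.
ensure_sysctl() {
    key=$1 value=$2 file=${3:-/etc/sysctl.conf}
    # Escape the dots so they match literally in the regular expression.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if ! grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$file" 2>/dev/null; then
        printf '%s = %s\n' "$key" "$value" >>"$file"
    fi
}

# Usage (as root on the host):
#   ensure_sysctl kernel.dmesg_restrict 1
#   ensure_sysctl kernel.unprivileged_bpf_disabled 1
#   sysctl -p
```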

Installing LXC

Pet uses a minimalistic approach:

apt install lxc lxcfs lxc-templates cgroupfs-mount

Pet prefers to create containers manually with debootstrap. Here's what is needed for that:

apt install debootstrap distro-info debian-keyring debian-archive-keyring

For networking configured through /etc/network/interfaces, the obsolete bridge-utils package might be required:

apt install bridge-utils

Other packages pet has seen in recommendations:

libvirt0: might be needed to run alien containers (e.g. arm on x86), but qemu-user-static plus binfmt-support are also required. Pet will revise this.
libpam-cgfs: pet has no idea why they recommend it
uidmap: pet has no idea why they recommend it

Networking

Pet prefers to configure networking with its own paws and does not use lxc-net.

On systems with systemd this can be turned off with:

systemctl stop lxc-net
systemctl disable lxc-net

On systems with sysvinit:

/etc/init.d/lxc-net stop
update-rc.d lxc-net remove

Pet uses two approaches for networking: bridged on its own systems and NATed on she-master's systems.

With the bridged approach all containers have direct access to the network (layer 2 in the OSI model, as far as pet can remember). But the host system has to be prepared for that: its primary network interface should be a bridge with the physical ethernet interface as a member.

The NATed approach does not require such major changes on the host system, so pet can use she-master's system and she does not notice anything. Only a couple of changes are required: one in /etc/nftables.conf that turns NAT on:

table ip nat {
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        oif eth0 masquerade random,persistent
    }
}

and another is in /etc/sysctl.conf that enables routing:

net.ipv4.ip_forward=1

No reboot is necessary, just

sysctl -p
nft -f /etc/nftables.conf
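
A small sketch for checking that routing really is on afterwards. It reads the kernel's own switch in /proc; the function name forwarding_enabled and its optional file argument (handy for dry runs) are made up for illustration.

```shell
# forwarding_enabled [FILE]
# Succeeds if IPv4 forwarding is on. FILE defaults to
# /proc/sys/net/ipv4/ip_forward, which contains 1 when routing is enabled.
forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}

# Usage (on the host):
#   forwarding_enabled && echo "routing is on"
#   nft list table ip nat    # should show the postrouting chain from above
```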

With the bridged approach pet configures networking in /etc/network/interfaces:

iface eth0 inet manual

auto br0
iface br0 inet static
    bridge_ports eth0
    address 192.168.0.2
    netmask 255.255.255.0
    gateway 192.168.0.1

Note that pet uses br0 instead of lxcbr0, which is configured in /etc/lxc/default.conf. As long as pet creates containers manually, that does not matter and no changes are required.

Subordinate uid/gid maps

To run unprivileged containers, UID and GID maps should be configured on the host system.

Pet simply adds as many as necessary to both /etc/subuid and /etc/subgid:

root:100000:65536
root:200000:65536
root:300000:65536
root:400000:65536
root:500000:65536
...
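
The entries follow a simple pattern: one 65536-wide range per container, starting at multiples of 100000. A sketch that generates them, assuming that layout; the helper name subid_lines and its arguments are hypothetical.

```shell
# subid_lines COUNT [USER]
# Prints COUNT entries in /etc/subuid format: one 65536-wide range per
# container, starting at 100000, 200000, ... as in the list above.
subid_lines() {
    count=$1 user=${2:-root}
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s:%d:65536\n' "$user" $((i * 100000))
        i=$((i + 1))
    done
}

# Usage (as root; appends blindly, so check for existing entries first):
#   subid_lines 5 | tee -a /etc/subuid /etc/subgid
```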

Creating LXC container

Pet's way:

mkdir -p /var/lib/lxc/mycontainer/rootfs
debootstrap --variant=minbase \
    --include=dialog,libc-l10n,locales,nano \
    --exclude=vim-common,vim-tiny \
    excalibur \
    /var/lib/lxc/mycontainer/rootfs \
    http://deb.devuan.org/merged

It's not a good idea to install everything with debootstrap; pet installs only the bare minimum.

Pet prefers nano because pet is too stupid for vim: each time it accidentally steps into vim, it has to reboot the system or ask AI how to exit.

Set hostname just in case:

echo mycontainer >/var/lib/lxc/mycontainer/rootfs/etc/hostname

By default debootstrap takes the hostname from the host system, which is confusing. The minimal setup does not use this file, because the host name is set by LXC; see lxc.uts.name below.

Now copy your favorite .bashrc, enter the chrooted environment and make some tweaks:

cp ~/.bashrc /var/lib/lxc/mycontainer/rootfs/root/
chroot /var/lib/lxc/mycontainer/rootfs
dpkg-reconfigure locales
echo 'APT::Install-Recommends "0";' >/etc/apt/apt.conf.d/01-norecommends
echo 'APT::Install-Suggests "0";' >>/etc/apt/apt.conf.d/01-norecommends
echo 'DSELECT::Clean "always";' >/etc/apt/apt.conf.d/90-autoclean

Pet's preferred set of packages for the minimal system:

apt install \
    apt-utils \
    bash-completion \
    bsdextrautils \
    ca-certificates \
    file \
    findutils \
    iputils-ping \
    iputils-tracepath \
    iproute2 \
    less \
    lsb-release \
    lsof \
    netbase \
    netcat-openbsd \
    procps \
    psutils \
    psmisc \
    runit \
    runit-init \
    tree \
    tzdata \
    xz-utils

Runit is pet's choice for containers. It's not perfect: the Debian package is buggy and the codebase is spooky, but other init systems are no better.

Pet uses runit in the native boot mode:

touch /etc/runit/native.boot.run
touch /etc/runit/no.emulate.sysv
mkdir /etc/runit/boot-run
mkdir /etc/runit/shutdown-run
rm -rf /etc/sv/getty* /etc/service/getty*

Minimal initialization needs two scripts only. First, /etc/runit/boot-run/10-sysctl.sh:

/sbin/sysctl -p

Second, /etc/runit/boot-run/20-mountall.sh:

# Based on /etc/init.d/mountall.sh and /etc/init.d/mountdevsubfs.sh

do_mount_all()
{
    . /lib/init/vars.sh
    . /lib/init/tmpfs.sh
    . /lib/init/mount-functions.sh

    TTYGRP=5
    TTYMODE=620
    [ -f /etc/default/devpts ] && . /etc/default/devpts

    MNTMODE=mount_noupdate

    mount -a  # mount everything from /etc/fstab

    mount_run $MNTMODE
    mount_lock $MNTMODE
    mount_shm $MNTMODE

    if [ ! -d /dev/pts ] ; then
        mkdir --mode=755 /dev/pts
        [ -x /sbin/restorecon ] && /sbin/restorecon /dev/pts
    fi
    domount "$MNTMODE" devpts "" /dev/pts devpts "-onoexec,nosuid,gid=$TTYGRP,mode=$TTYMODE"
}

do_mount_all

Now it's okay to exit the chrooted environment and create the container configuration file /var/lib/lxc/mycontainer/config:

lxc.apparmor.profile = unconfined

lxc.include = /usr/share/lxc/config/devuan.common.conf
lxc.include = /usr/share/lxc/config/devuan.userns.conf

lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536

lxc.rootfs.path = dir:/var/lib/lxc/mycontainer/rootfs
lxc.rootfs.options = idmap=container,nodiratime,relatime
lxc.uts.name = mycontainer

lxc.net.0.type = veth
lxc.net.0.name = eth0
lxc.net.0.link = br0
lxc.net.0.flags = up
lxc.net.0.ipv4.address = 192.168.0.3/24
lxc.net.0.ipv4.gateway = 192.168.0.1

lxc.start.auto = 0

The above configuration is for the bridged approach. Here's how the networking part would look for the NATed approach:

lxc.hook.version = 1

lxc.net.0.type = veth
lxc.net.0.veth.mode = router
lxc.net.0.ipv4.address = 192.168.10.2/24
lxc.net.0.ipv4.gateway = 192.168.10.1
lxc.net.0.flags = up
lxc.net.0.script.up = /bin/sh -c "ip address add 192.168.10.1 dev $LXC_NET_PEER"
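
The lxc.idmap lines in the config have to point at one of the ranges listed in /etc/subuid and /etc/subgid, so each container gets its own base. A sketch that prints the pair of lines for the Nth range, assuming the 100000-step layout used above; idmap_lines is a made-up helper name.

```shell
# idmap_lines N
# Prints the lxc.idmap lines for the Nth subordinate range
# (N=1 -> base 100000, N=2 -> base 200000, ...).
idmap_lines() {
    base=$(( $1 * 100000 ))
    printf 'lxc.idmap = u 0 %d 65536\n' "$base"
    printf 'lxc.idmap = g 0 %d 65536\n' "$base"
}

# Usage: print the lines for a second container and paste them into its config.
#   idmap_lines 2
```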

Block devices and file systems

To use a block device in an unprivileged container, change the group of the block device to the container's GID, e.g. 100000. The owner may remain root.

To automate this, create /etc/udev/rules.d/90-sda-permissions.rules with the following line (assuming the device is sda):

KERNEL=="sda", ACTION=="add", GROUP="100000"

Next, allow the container to use the block device. Add the following lines to the config file:

lxc.cgroup.devices.allow = b 8:0 rwm
lxc.mount.entry = /dev/sda dev/sda none bind,create=file

An open question is how to do this by UUID: sda and 8:0 may change, but the UUID is stable.

So the block device can be read and written, but filesystems cannot be mounted from an unprivileged container.
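
On the UUID question, one possible approach (untested here) is to resolve the UUID through the /dev/disk/by-uuid symlinks that udev maintains and to generate the two config lines from the result. A sketch assuming GNU stat and readlink; both helper names are hypothetical, and the output would have to be regenerated whenever the device numbering changes.

```shell
# blockdev_config MAJOR_HEX MINOR_HEX PATH
# Prints the two config lines for a block device. stat reports device
# numbers in hex (%t, %T), while the devices.allow rule wants decimal.
blockdev_config() {
    major=$(printf '%d' "0x$1")
    minor=$(printf '%d' "0x$2")
    path=${3#/}    # the mount target is relative to the container rootfs
    printf 'lxc.cgroup.devices.allow = b %d:%d rwm\n' "$major" "$minor"
    printf 'lxc.mount.entry = /%s %s none bind,create=file\n' "$path" "$path"
}

# blockdev_config_by_uuid UUID
# Resolves /dev/disk/by-uuid/UUID to the real device and prints its lines.
blockdev_config_by_uuid() {
    dev=$(readlink -f "/dev/disk/by-uuid/$1") || return 1
    [ -b "$dev" ] || return 1
    # $(stat ...) is left unquoted on purpose so it splits into two words.
    blockdev_config $(stat -c '%t %T' "$dev") "$dev"
}
```

For the udev side, matching on ENV{ID_FS_UUID} instead of KERNEL might work for devices that carry a filesystem, since udev's persistent-storage rules normally set that property. Pet would need to verify this.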

VPN

Pet has not tried to run servers so far. All notes are for clients only.

Wireguard

As a client it works without any tweaks in unprivileged containers.

OpenVPN

It works in unprivileged containers with the following tweaks in the config file:

lxc.cgroup.devices.allow = c 10:200 rwm
lxc.mount.entry = /dev/net dev/net none bind,create=dir 0 0
lxc.mount.entry = /dev/net/tun dev/net/tun none bind,create=file
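
A quick sanity check to run inside the container before starting the client: it compares the device numbers of /dev/net/tun with the 10:200 allowed above. A sketch assuming GNU or BusyBox stat; devnum is a made-up helper name.

```shell
# devnum PATH
# Prints "MAJOR:MINOR" in decimal for a device node; fails for anything else.
devnum() {
    [ -c "$1" ] || [ -b "$1" ] || return 1
    # stat prints the numbers in hex, so convert them for comparison.
    set -- $(stat -c '%t %T' "$1")
    printf '%d:%d\n' "0x$1" "0x$2"
}

# Usage (inside the container):
#   [ "$(devnum /dev/net/tun)" = "10:200" ] && echo "tun node is in place"
```

This only confirms that the node exists with the expected numbers. Whether the cgroup rule actually lets the container open it shows up when the client starts.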