# Cross-compiling C++ with Bazel and Wolfi

*rules_apko, toolchains_llvm and Wolfi*

By [Kyle Downey](https://paragraph.com/@kyle-downey) · 2025-11-30

---

Introduction
------------

[Chainguard](https://www.chainguard.dev/)'s [Wolfi](https://github.com/wolfi-dev/) is a distroless base for Docker containers: the bare minimum userland needed to run on top of a Linux kernel. This has two big implications:

*   the images are small, which can make them faster to build -- a key consideration if you are aiming for [rapid iteration and continuous delivery](https://paragraph.com/@kyle-downey/left-of-launch-questioning-the-speed-quality-tradeoff-in-software-engineering)
    
*   the number of packages is limited, which reduces attack surface
    

It goes hand-in-hand with [apko](https://github.com/chainguard-dev/apko), a tool that builds OCI-compliant containers from Alpine's apk distribution format. The companion [rules\_apko](https://github.com/chainguard-dev/rules_apko) module for the [Bazel](https://bazel.build/) build system then lets you automate creating those containers. However, there is a catch if you are building native code: you also need to package up your own binary, and that binary needs to be fully compatible with the container's target runtime, which could be different from the host operating system. For instance, you might be compiling on macOS running on an ARM64-based Apple Silicon chip, but you want to create an image that runs on Linux on an x86-64 Intel i9 chip.

In order to do this, you need to cross-compile the binary. [LLVM](https://llvm.org/) excels at this, and [toolchains\_llvm](https://github.com/bazel-contrib/toolchains_llvm) for Bazel helpfully takes care of things like downloading the toolchain and sysroot that you need to cross-compile. The sysroot, though, is its own tricky piece. A sysroot is a filesystem tree with things like [glibc](https://sourceware.org/glibc/), Linux kernel headers, etc. You need this because even if your compiler running on macOS on an ARM processor knows how to generate machine code for x86-64, it still needs headers to compile against and libraries to link against in order to create a complete Linux executable. Depending on how you compile and link, your new executable might end up with absolute paths or links to shared libraries that may or may not be available inside your container -- or worse, the versions present might have subtle differences, so what you compiled and tested might not be what you run.

To get around this hazard, Bazel encourages [hermetic builds](https://bazel.build/basics/hermeticity). That is, you want full control of the build environment, libraries, and so on, and then you want to deploy with those same pieces as well. In our case, since we want to target Wolfi, it would be nice if the sysroot we compile against came from the same base as the image we plan to build with apko, right?

That's what we're going to tackle here.

From wolfi-base to sysroot
--------------------------

Chainguard's wolfi-base image can be pulled from their Docker registry:

    docker pull cgr.dev/chainguard/wolfi-base

which is great if you want to run it, and not as helpful if your goal is to build a sysroot for cross-compilation. What we really want to do is run `apko` to create a sysroot, and then expose it to the LLVM toolchain. Thankfully, with a `repository_rule` in Bazel you can do just that, even downloading the apko binary you need on-the-fly.

We start with `sysroot.yaml` -- an apko configuration file that pulls from the Wolfi package repository and installs `build-base` plus the development packages for `glibc` and `libstdc++` and the Linux kernel headers.

_sysroot.yaml_

    contents:
      repositories:
        - https://packages.wolfi.dev/os
      keyring:
        - https://packages.wolfi.dev/os/wolfi-signing.rsa.pub
      packages:
        - build-base
        - glibc-dev
        - libstdc++-dev
        - linux-headers
    
    archs:
      - x86_64
      - aarch64

This creates larger-than-usual images: about 500 MB per architecture, almost entirely due to gcc's inclusion in `build-base`. However, if we subsequently build a more minimal runtime image from the same Wolfi packages, we know that the hermetic build's sysroot and the container image share identical foundations -- making the image more reliable and secure.

One key finding in getting this working is that `toolchains_llvm` requires a sysroot to be a package, which means you have to generate it with a `repository_rule`. However, this imposes a constraint: repository rules are evaluated during Bazel's loading phase, so they cannot depend on artifacts generated by Bazel itself. That means if we want to call `apko`, unfortunately we cannot rely on `rules_apko` to load the binary for us: we have to download a specific version ourselves. This is why the attributes include details like `apko_version` and `apko_sha256`.

_repos.bzl_

    
    sysroot = repository_rule(
        implementation = _sysroot_impl,
        attrs = {
            "apko_config": attr.label(
                doc = "Label pointing to the apko config YAML file.",
                default = "//build-support/sysroot:sysroot.yaml"),
            "architecture": attr.string(
                mandatory = True,
                values = ["amd64", "arm64"]
            ),
            "apko_version": attr.string(
                doc = "Version of apko to use for building the sysroot.",
                default = "0.30.26",
            ),
            "apko_sha256": attr.string_dict(
                doc = "SHA256 checksums of the apko binary, keyed by host platform (os_arch).",
                default = {
                    "darwin_arm64": "347bd6c...",
                    "linux_amd64": "12c227b...",
                    "linux_arm64": "f46bc84...",
                },
            ),
            "strip_components": attr.int(
                doc = "Number of components to strip when extracting (similar to strip_prefix).",
            ),
            "include_patterns": attr.string_list(),
            "exclude_patterns": attr.string_list(),
        },
    )

With the rule defined, we need an implementation. The first part just uses the `repository_ctx` object and our input attributes to download the apko binary that matches our host platform:

_repos.bzl_

    load("@aspect_bazel_lib//lib:repo_utils.bzl", "repo_utils")
    
    def _sysroot_impl(rctx):
        apko_version = rctx.attr.apko_version
        host_platform = repo_utils.platform(rctx)
    
        url = "https://github.com/chainguard-dev/apko/releases/download/v{apko_version}/apko_{apko_version}_{host_platform}.tar.gz".format(
            apko_version = apko_version,
            host_platform = host_platform,
        )
        strip_prefix = "apko_{}_{}".format(
            apko_version,
            host_platform,
        )
    
        apko_sha256 = rctx.attr.apko_sha256.get(host_platform)
        if apko_sha256 == None:
            fail("No apko SHA256 checksum provided for platform: %s" % host_platform)
    
        rctx.download_and_extract(
            url = url,
            output = "apko",
            sha256 = apko_sha256,
            strip_prefix = strip_prefix,
        )
    

We can then use the context to run `apko build-minirootfs` and turn the `apko_config` YAML into an extract of the Linux `usr`, `lib` and other top-level directories:

_repos.bzl_

    
        archive = rctx.path("sysroot.tar")
        result = rctx.execute([
            rctx.path("apko/apko"),
            "build-minirootfs",
            rctx.path(rctx.attr.apko_config),
            archive,
            "--build-arch",
            rctx.attr.architecture,
        ])
        if result.return_code != 0:
            fail(result.stdout + result.stderr)

The next step is critical for `toolchains_llvm`, and ultimately clang, to work. We write a `BUILD.bazel` into a `sysroot` directory in the new repo that declares a `filegroup` covering that whole directory. The toolchain then consumes this target as the sysroot:

_repos.bzl_

        rctx.file(
            "sysroot/BUILD.bazel",
            """filegroup(
        name = "sysroot",
        srcs = ["."],
        visibility = ["//visibility:public"],
    )""",
        )

This approach is necessary but Bazel does not particularly like it: if you run this without any overrides you will get warnings about directories as inputs not being supported. So far the only workaround I found was to add this to `.bazelrc`:

    startup --host_jvm_args=-DBAZEL_TRACK_SOURCE_DIRECTORIES=1

This startup flag makes the Bazel server track source directories for changes, which it does not do by default for performance reasons. Still, we now have a sysroot directory, empty apart from its `BUILD.bazel`, in a format that `toolchains_llvm` can ingest.

The output of the apko execution is a tar file, so we'll follow `toolchains_llvm`'s own custom `sysroot.bzl` and use the embedded `tar` toolchain for the host platform:

_repos.bzl_

        host_bsdtar = Label("@bsd_tar_toolchains_%s//:tar" % repo_utils.platform(rctx))
        cmd = [
            rctx.path(host_bsdtar),
            "--extract",
            "--no-same-owner",
            "--no-same-permissions",
            "--file",
            archive,
            "--directory",
            "sysroot",
            "--strip-components",
            str(rctx.attr.strip_components),
        ]
    
        for include in rctx.attr.include_patterns:
            cmd.extend(["--include", include])
    
        for exclude in rctx.attr.exclude_patterns:
            cmd.extend(["--exclude", exclude])
    
        result = rctx.execute(cmd)
        if result.return_code != 0:
            fail(result.stdout + result.stderr)
    
        rctx.delete(archive)
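
Because the include and exclude patterns and `strip_components` all affect what lands in the directory, a cheap check at this point can catch an unusable sysroot before clang does. The following is a minimal sketch and not part of the original rule; it assumes the default `strip_components` of 0, so that the headers end up under `sysroot/usr/include`:

```starlark
    # Hypothetical sanity check (an assumption, not from the original rule):
    # fail fast when the extracted tree has no C/C++ header directory.
    # "usr/include" assumes strip_components = 0 and the usual Linux layout.
    if not rctx.path("sysroot/usr/include").exists:
        fail("sysroot has no usr/include; check include/exclude patterns and strip_components")
```

If you set `strip_components` to something else, the path being checked would need to change to match.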

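Finally, we return `repo_metadata` marked as reproducible so that Bazel can cache the result. The `hasattr` guard keeps the rule working on Bazel versions that do not provide `repo_metadata`: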

_repos.bzl_

        if hasattr(rctx, "repo_metadata"):
            return rctx.repo_metadata(reproducible = True)
        else:
            return None

We can now use our rule and declare sysroots for each architecture:

_MODULE.bazel_

    sysroot = use_repo_rule("//build-support/sysroot:repos.bzl", "sysroot")
    
    sysroot(
        name = "sysroot_amd64",
        architecture = "amd64",
        include_patterns = ["**"],
        exclude_patterns = ["dev/*", "etc/shadow", "etc/gshadow"],
    )
    
    sysroot(
        name = "sysroot_arm64",
        architecture = "arm64",
        include_patterns = ["**"],
        exclude_patterns = ["dev/*", "etc/shadow", "etc/gshadow"],
    )

These sysroots can then back our LLVM toolchain:

_MODULE.bazel_

    
    # Configure LLVM toolchain for cross-compilation
    llvm = use_extension("@toolchains_llvm//toolchain/extensions:llvm.bzl", "llvm")
    
    # Configure toolchain with Linux targets
    llvm.toolchain(
        name = "llvm_toolchain",
        llvm_version = "21.1.6",
        extra_llvm_distributions = {
            "LLVM-21.1.6-Linux-ARM64.tar.xz": "1d8a9e...",
            "LLVM-21.1.6-Linux-X64.tar.xz": "38bd99...",
            "LLVM-21.1.6-macOS-ARM64.tar.xz": "bdf036...",
            "clang+llvm-21.1.6-x86_64-pc-windows-msvc.tar.xz": "6fd57e...",
        },
        stdlib = {
            "linux-x86_64": "stdc++",
            "linux-aarch64": "stdc++",
        }
    )
    # Register sysroots for cross-compilation to Linux
    llvm.sysroot(
        name = "llvm_toolchain",
        label = "@sysroot_amd64//sysroot",
        targets = ["linux-x86_64"],
    )
    
    llvm.sysroot(
        name = "llvm_toolchain",
        label = "@sysroot_arm64//sysroot",
        targets = ["linux-aarch64"],
    )

These last two `llvm.sysroot` blocks bind our sysroot repositories to the toolchain, one per target. Note that the label has to point at the `sysroot` package inside each repo, since that is where we wrote the `BUILD.bazel`.
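
The configuration above stops short of making the generated toolchain visible to the build. As a sketch of what usually follows in `MODULE.bazel`, based on the `toolchains_llvm` README rather than on the original setup (so treat it as an assumption), the repo is imported and its toolchains registered:

```starlark
# Assumed continuation of MODULE.bazel, following the toolchains_llvm README.
# "llvm_toolchain" must match the name passed to llvm.toolchain(...).
use_repo(llvm, "llvm_toolchain")

register_toolchains("@llvm_toolchain//:all")
```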

With all this in place, you can easily cross-compile Linux x86-64 and Linux ARM64 binaries from different hosts, including Apple Silicon.
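
To pick a target at build time, Bazel needs platform definitions whose constraints match the toolchain's targets. The following is a minimal sketch; the `platforms` package name and the target names are hypothetical, and it assumes the `platforms` module is declared as a `bazel_dep` in `MODULE.bazel`:

```starlark
# platforms/BUILD.bazel (hypothetical package; names are illustrative)
platform(
    name = "linux_x86_64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
)

platform(
    name = "linux_aarch64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:aarch64",
    ],
)
```

A build for Linux x86-64 from an Apple Silicon host would then look something like `bazel build --platforms=//platforms:linux_x86_64 //hello`, where `//hello` stands in for your own `cc_binary` target.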

Conclusion
----------

Hermetic builds are an important principle for Bazel, and essential when you are dealing with a compiled language like C++ that is very sensitive to its environment. From a cybersecurity perspective as well, the more you can control the build and runtime environment, the lower the probability that something unexpected sneaks in. A fully realized [Shift Left](https://paragraph.com/@kyle-downey/left-of-launch-questioning-the-speed-quality-tradeoff-in-software-engineering) setup for Modern C++ will require extending this to more advanced features like building OCI containers and leveraging remote caching and execution, but Bazel gives you all the pieces needed to get there.

---

*Originally published on [Kyle Downey](https://paragraph.com/@kyle-downey/cross-compiling-c-with-bazel-and-wolfi)*
