SilverMetal/linux/build/scripts/build.sh
SysAdmin 38ac4f8a96
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 15m34s
fix(linux/build): systemd-in-container build host (M1.1)
Run #4258 cleared the systemctl shim only to die two seconds later on
the *next* expectation derivative-maker has of a real systemd host:
its sources.list points at http://127.0.0.1:9977/debian (the approx
package-cache socket-activated by systemd) and apt-get update could
not reach the daemon because nothing was actually started by the
no-op shim:

    Err:1 http://127.0.0.1:9977/debian trixie InRelease
      Could not connect to 127.0.0.1:9977 (127.0.0.1).
      - connect (111: Connection refused)
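
A preflight probe would surface this failure before apt-get does. The sketch below is illustrative and not part of this commit: `port_is_open` and `require_approx` are hypothetical names, and it assumes a bash built with /dev/tcp redirection support.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if a TCP connect to host:port succeeds.
# Relies on bash's /dev/tcp redirection, so no extra tools are needed.
port_is_open() {
    local host="$1" port="$2"
    # The subshell opens fd 3 on the socket; the fd closes when it exits.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Hypothetical preflight: fail early with a readable message instead of
# waiting for apt-get update to report "Connection refused".
require_approx() {
    local host="${1:-127.0.0.1}" port="${2:-9977}"
    if ! port_is_open "${host}" "${port}"; then
        echo "preflight: nothing listening on ${host}:${port}" >&2
        return 1
    fi
}
```

Calling `require_approx` right before the first apt-get update would turn the error above into a one-line diagnosis.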

Whack-a-mole'ing each service derivative-maker tries to start (approx
today, then journald, then systemd-logind, then who-knows-what
tomorrow) is going to keep failing for a while — derivative-maker is
fundamentally designed for a real systemd-managed Debian host. The
container pattern upstream itself ships
(linux/build/derivative-maker/docker/) runs systemd as PID 1 inside
the container; this commit adopts that approach.

Architecture:

  - PID 1 in the build container is now systemd. Upstream's vendored
    entrypoint.sh records the user-supplied command into
    /etc/docker-entrypoint-cmd, captures env into
    /etc/docker-entrypoint-env, masks irrelevant units, and execs
    systemd. systemd boots, docker-entrypoint.service runs the
    command, docker-entrypoint-stop.sh propagates the exit code via
    `systemctl exit <code>` so the container exits with the right
    status.

  - The four entrypoint files (entrypoint.sh,
    docker-entrypoint.service / .target, docker-entrypoint-stop.sh)
    are vendored at linux/build/docker/systemd-entrypoint/ rather
    than COPY'd from the submodule path — Docker build context can
    only reach below itself, and bumping is tracked in that dir's
    README.

  - Container runtime now requires --cgroupns=host, --tmpfs /run,
    --tmpfs /run/lock, and -v /sys/fs/cgroup:/sys/fs/cgroup:rw so
    systemd can manage cgroups properly. -t allocates a TTY,
    satisfying entrypoint.sh's `[ ! -t 0 ] && exit 1` check in CI
    where stdin is otherwise /dev/null.

  - User renamed builder → user (uid 1000, passwordless sudo) to
    match upstream's USER=user / HOME=/home/user convention. chown
    in build.sh now uses uid 1000:1000 so it's name-agnostic.

  - Image package list grew to match upstream's
    derivative-maker-docker-setup (sq stack + dbus + approx + the
    rest) plus our ISO toolchain (live-build / debootstrap / xorriso
    / squashfs-tools / etc.). Snapshot.debian.org pinning is
    preserved (same APT_SNAPSHOT_URL, two-phase install pattern).
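
The runtime flags from the list above can be kept in one array so that local runs and CI cannot drift apart. This is a minimal sketch, not the code in build.sh: `SYSTEMD_RUN_ARGS` and `print_docker_cmd` are hypothetical names, and `--privileged` is left out because the list above does not name it.

```shell
#!/usr/bin/env bash
# Flags the container needs so systemd can run as PID 1 (from the list above).
SYSTEMD_RUN_ARGS=(
    --cgroupns=host
    --tmpfs /run
    --tmpfs /run/lock
    -v /sys/fs/cgroup:/sys/fs/cgroup:rw
    -t
)

# Hypothetical dry-run helper: prints the docker command instead of running it.
print_docker_cmd() {
    local image="$1"
    shift
    printf '%q ' docker run --rm "${SYSTEMD_RUN_ARGS[@]}" "${image}" "$@"
    printf '\n'
}
```

For example, `print_docker_cmd example/image:tag true` prints the full command line for review without touching the docker daemon; the image name there is a placeholder.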

Verified:

  Smoke test on 10.0.0.51 — `docker run --rm --privileged
  --cgroupns=host --tmpfs /run --tmpfs /run/lock -v /sys/fs/cgroup:...:rw
  -t <image> /bin/bash -c 'echo OK'` — booted systemd, ran the
  command via docker-entrypoint.service, captured the output, shut
  down filesystems and exited cleanly.
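
The smoke test could be scripted along these lines. This is a sketch under assumptions: `smoke_test` and the `DOCKER` override are hypothetical, and the cgroup mount is spelled the way build.sh passes it. Because `-t` makes the output end in CRLF, the check matches a substring rather than the exact string.

```shell
#!/usr/bin/env bash
# Hypothetical smoke test: boot the image under systemd, run a trivial
# command, and confirm both its output and its exit status come back.
# DOCKER may point at a stub so the function can be exercised offline.
smoke_test() {
    local image="$1" out
    out="$("${DOCKER:-docker}" run --rm --privileged \
        --cgroupns=host --tmpfs /run --tmpfs /run/lock \
        -v /sys/fs/cgroup:/sys/fs/cgroup:rw \
        -t "${image}" /bin/bash -c 'echo OK')" || return 1
    [[ "${out}" == *OK* ]]
}
```

A non-zero return means either the container exited with an error or the expected output never appeared.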

build.sh BUILDER_IMAGE pin → sha256:dc9dd29d…8811. Image rebuilt
natively on 10.0.0.51, pushed to docker-registry.silverlabs.uk.
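
build.sh only checks that the reference contains `@sha256:`. A stricter check is sketched below for illustration; `is_digest_pinned` is a hypothetical name, and the 64-character lowercase-hex rule is an assumption about what a well-formed sha256 digest reference looks like.

```shell
#!/usr/bin/env bash
# Hypothetical stricter pin check: accept only <name>@sha256:<64 lowercase hex>.
is_digest_pinned() {
    local ref="$1"
    local re='^[^@[:space:]]+@sha256:[0-9a-f]{64}$'
    [[ "${ref}" =~ ${re} ]]
}
```

Unlike the substring test, this rejects a truncated digest such as `image@sha256:abc`, which would otherwise only fail later at pull time.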

The systemctl shim is removed by virtue of the Dockerfile rewrite —
real systemd makes it unnecessary. The previous "iter6 / iter7"
intermediate digests stay in the registry until we GC; the live one
is m1.1-iter8-systemd.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:06:47 +01:00

#!/usr/bin/env bash
# SilverMetal Linux — ISO build wrapper.
#
# Runs the Kicksecure derivative-maker inside the pinned builder container
# with the reproducibility levers locked down. This script is the single
# entry point for both local developer builds and CI — there is no separate
# CI-only path. If you need to debug, run *this*, not lb directly.
#
# Usage:
#   linux/build/scripts/build.sh                          # writes to linux/build/output/<commit>
#   BUILD_DIR=/tmp/build-a linux/build/scripts/build.sh   # override output root
#
# Exit codes:
#   0  ISO produced and SHA256SUMS written
#   1  argument / environment error
#   2  derivative-maker submodule missing
#   3  build failed
#   4  post-build hash/manifest step failed
set -euo pipefail
# --- Locate repo root -------------------------------------------------------
SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd -- "${SCRIPT_DIR}/../../.." && pwd)"
cd "${REPO_ROOT}"
# --- Pinned builder image ---------------------------------------------------
# Bumped together with linux/build/docker/Dockerfile.builder. The digest form
# is required; refusing the tag-only form is what stops a silent host drift.
#
# docker-registry.silverlabs.uk is the canonical hostname both inside and
# outside the LAN — it's the entry that fleet-wide /etc/docker/daemon.json
# registers as an insecure-registry. The host-style "docker-registry:5000"
# is *not* DNS-resolvable; do not use it.
BUILDER_IMAGE="${BUILDER_IMAGE:-docker-registry.silverlabs.uk/silvermetal-builder@sha256:dc9dd29df4bee54807aee5bb2605b400754cba86db5343b4947a81a7ecea8811}"
if [[ "${BUILDER_IMAGE}" != *"@sha256:"* ]]; then
    echo "build.sh: BUILDER_IMAGE must be pinned by digest, got: ${BUILDER_IMAGE}" >&2
    exit 1
fi
# --- Sanity: submodule present ---------------------------------------------
if [[ ! -f "linux/build/derivative-maker/.git" && ! -d "linux/build/derivative-maker/.git" ]]; then
    echo "build.sh: linux/build/derivative-maker submodule is not initialised." >&2
    echo "          Run: git submodule update --init --recursive" >&2
    exit 2
fi
# --- Compute SOURCE_DATE_EPOCH ---------------------------------------------
# Order of preference:
#   1. Explicit env var passed in (CI may set it for cross-runner consistency)
#   2. config/source-date-epoch.env override (offline rebuilds)
#   3. git commit timestamp of HEAD (default)
# shellcheck disable=SC1091
source linux/build/config/source-date-epoch.env || true
if [[ -z "${SOURCE_DATE_EPOCH:-}" ]]; then
    if [[ -n "${SOURCE_DATE_EPOCH_OVERRIDE:-}" ]]; then
        SOURCE_DATE_EPOCH="${SOURCE_DATE_EPOCH_OVERRIDE}"
        echo "build.sh: using SOURCE_DATE_EPOCH override = ${SOURCE_DATE_EPOCH}"
    else
        SOURCE_DATE_EPOCH="$(git log -1 --pretty=%ct)"
    fi
fi
export SOURCE_DATE_EPOCH
# --- Pinned snapshot timestamp ---------------------------------------------
# shellcheck disable=SC1091
source linux/build/config/snapshot-pin.env
export SNAPSHOT_TIMESTAMP
# --- Resolve commit & output dir -------------------------------------------
COMMIT_SHA="$(git rev-parse --short=12 HEAD)"
BUILD_DIR="${BUILD_DIR:-${REPO_ROOT}/linux/build/output/${COMMIT_SHA}}"
mkdir -p "${BUILD_DIR}"
echo "build.sh: commit=${COMMIT_SHA} epoch=${SOURCE_DATE_EPOCH} snapshot=${SNAPSHOT_TIMESTAMP}"
echo "build.sh: output -> ${BUILD_DIR}"
# --- Mount strategy: local vs CI -------------------------------------------
# Locally we bind-mount the repo into the build container at the *same*
# path (self-referential), so internal references work transparently and
# the inner script doesn't need to care which host it's on.
#
# In CI we can't do that. build.sh runs inside a Gitea Actions job
# container which talks to the *host's* docker daemon via /var/run/docker.sock.
# Bind-mounting REPO_ROOT (= /workspace/<owner>/<repo>) would resolve
# against the host filesystem where that path doesn't exist; docker
# silently creates an empty dir on the host and mounts that, leaving the
# build container with an empty /work and a confusing "No such file or
# directory" error on the first config source.
#
# The standard fix for that DooD topology is --volumes-from of the parent
# job container, which inherits its /workspace mount intact. That keeps
# paths identical inside and outside, so the inner heredoc below is the
# same in both environments.
if [[ -n "${GITHUB_ACTIONS:-}" ]]; then
    BIND_ARGS=(--volumes-from "$(hostname)")
else
    BIND_ARGS=(-v "${REPO_ROOT}:${REPO_ROOT}:rw")
    # If BUILD_DIR lives outside REPO_ROOT (uncommon, but the env-var
    # override allows it), mount it explicitly too.
    if [[ "${BUILD_DIR}" != "${REPO_ROOT}/"* && "${BUILD_DIR}" != "${REPO_ROOT}" ]]; then
        BIND_ARGS+=(-v "${BUILD_DIR}:${BUILD_DIR}:rw")
    fi
fi
# --- Run the build inside the container ------------------------------------
# This is a systemd-in-container build host. Upstream Kicksecure's
# derivative-maker assumes a real systemd-managed Debian — its build steps
# call `systemctl restart approx-derivative-maker.socket`,
# `systemctl daemon-reload`, etc. and depend on those services *actually*
# running. Without systemd as PID 1 we'd be playing whack-a-mole with
# every service derivative-maker starts.
#
# Required runtime flags for systemd-in-container:
#   --privileged             live-build needs loop devices + chroot mounts
#   --cgroupns=host          systemd needs to manage cgroups; with its own
#                            namespace it can't see the host hierarchy
#   --tmpfs /run, /run/lock  systemd writes runtime state here
#   -v /sys/fs/cgroup:rw     the cgroup tree systemd manages
#   -t                       entrypoint.sh requires a TTY (it `exit 1`s on
#                            stdin not a tty); allocating one keeps that
#                            path happy in CI too where stdin is otherwise
#                            /dev/null
#
# `tail -f /dev/null` is NOT used — control flow goes through systemd:
# entrypoint.sh writes the user command to /etc/docker-entrypoint-cmd,
# execs systemd, systemd boots docker-entrypoint.service which runs the
# command, and docker-entrypoint-stop.sh propagates exit code via
# `systemctl exit <code>` so the container exits with the right status.
docker run --rm --privileged \
    --cgroupns=host \
    --tmpfs /run \
    --tmpfs /run/lock \
    -v /sys/fs/cgroup:/sys/fs/cgroup:rw \
    --network=host \
    -t \
    "${BIND_ARGS[@]}" \
    -e SOURCE_DATE_EPOCH \
    -e SNAPSHOT_TIMESTAMP \
    -e LC_ALL=C.UTF-8 \
    -e LANG=C.UTF-8 \
    -e TZ=UTC \
    -e REPO_ROOT="${REPO_ROOT}" \
    -e BUILD_DIR="${BUILD_DIR}" \
    "${BUILDER_IMAGE}" \
    bash -c '
        # docker-entrypoint.service runs this as root via systemd, with
        # the env vars captured by entrypoint.sh into
        # /etc/docker-entrypoint-env. We hand workspace ownership to the
        # unprivileged user (uid 1000), then sudo into it for the
        # derivative-maker invocation. derivative-maker uses sudo
        # internally for the bits that need root.
        set -e
        chown -R 1000:1000 "${REPO_ROOT}" "${BUILD_DIR}"
        exec sudo --non-interactive --preserve-env -u user -- \
            "${REPO_ROOT}/linux/build/scripts/build-inner.sh"
    ' \
    || { echo "build.sh: derivative-maker failed" >&2; exit 3; }
# --- Hash artefacts ---------------------------------------------------------
# Run hashing on the host (not in the container) so a broken or compromised
# container image can't tamper with the digests we publish.
shopt -s nullglob
ISO_FILES=("${BUILD_DIR}"/*.iso)
shopt -u nullglob
if (( ${#ISO_FILES[@]} == 0 )); then
    echo "build.sh: no ISO produced in ${BUILD_DIR}" >&2
    exit 4
fi
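# Any failure here must map to exit code 4, as documented in the header.
# The explicit && chain matters: a subshell on the left of || runs with
# errexit suppressed, so each step has to gate the next one by hand.
(
    cd "${BUILD_DIR}" &&
    sha256sum -- *.iso > SHA256SUMS &&
    cp -- "${REPO_ROOT}/linux/build/config/snapshot-pin.env" snapshot-pin.env &&
    {
        echo "commit=${COMMIT_SHA}"
        echo "source_date_epoch=${SOURCE_DATE_EPOCH}"
        echo "snapshot_timestamp=${SNAPSHOT_TIMESTAMP}"
        echo "builder_image=${BUILDER_IMAGE}"
        echo "host_uname=$(uname -srm)"
    } > BUILD_INFO
) || { echo "build.sh: post-build hash/manifest step failed" >&2; exit 4; }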
echo "build.sh: SHA256SUMS:"
cat "${BUILD_DIR}/SHA256SUMS"