From e260fe1c819e1da7d622ff72162e88f31b9e1b6d Mon Sep 17 00:00:00 2001
From: SysAdmin
Date: Thu, 7 May 2026 17:30:08 +0100
Subject: [PATCH] ci(linux/build): self-host the builder image build + iter16 reprepro wrap (M1.1)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two coupled changes that unblock the M1.1 iter loop. Both belong in CI;
iter1-15 was wrong to require human-in-the-loop steps to make progress.

1. **CI now builds Dockerfile.builder.**
   `.gitea/workflows/build-iso-linux.yaml` grows a `builder-image` job
   that runs ahead of `build-and-verify`. It rebuilds the
   silvermetal-builder image from
   `linux/build/docker/Dockerfile.builder`, pushes it to
   `docker-registry.silverlabs.uk/silvermetal-builder:m1.1-` (and
   `:latest`), reads the resulting digest off `docker inspect`, and
   feeds it forward as a job output. `build-and-verify` consumes that
   digest as the `BUILDER_IMAGE` env override that `build.sh` already
   honours (and validates is digest-form on line ~37).

   That kills the old workflow where every Dockerfile.builder change
   required a human to `docker build` + `docker push` on 10.0.0.51 by
   hand and then bump the digest in `build.sh` in lockstep. The crash
   that triggered this (exit 126 mid-iter16 build run) was a symptom of
   that off-CI step still existing.

   Both jobs run on the existing `silvermetal-builder` runner; the host
   docker daemon is shared via DooD and is already authenticated to
   `docker-registry.silverlabs.uk` (linux/build/runner/docker-compose.yml
   mounts `/root/.docker:/root/.docker:ro`), so no extra login step.

   The hardcoded `BUILDER_IMAGE` digest in `build.sh` stays as the
   local-developer / offline-rebuild fallback. Comments updated in
   `build.sh`, `Dockerfile.builder`, and `linux/build/README.md` to
   match the new flow.

2. **reprepro wrapper for the benign "No priority for X" case.**
   Pinned derivative-maker's `2100_create-debian-packages` (with
   --target iso) re-imports source packages from snapshot.debian.org
   into a local apt repo via
   `reprepro --basedir … includedsc local .dsc`. The local repo's
   `conf/distributions` ships no `DscOverride` entries, so any source
   package whose `.dsc` lacks an explicit Priority field trips:

       No priority for 'X', skipping.
       There have been errors!

   …and reprepro exits 255. dm-reprepro-wrapper bubbles that up,
   2100_create-debian-packages aborts. The current offender is
   `virtualbox_*.dsc` (key import is now fine — debian-keyring landed
   in commit 4aa59ba — but the priority field gap remains). VirtualBox
   is not in SilverMetal's `--target iso` set, so the sane behaviour is
   "log it, continue".

   New `linux/build/docker/silvermetal-reprepro-wrap.sh` shadows
   `/usr/bin/reprepro` at `/usr/local/bin/reprepro` (PATH precedence).
   It runs the real reprepro, captures merged stdout+stderr, and:

   - if rc != 0 AND every non-blank output line matches one of the
     known-benign patterns ("No priority for 'X', skipping." plus the
     trailing "There have been errors!"), emits the output, logs one
     line of explanation to stderr, and exits 0;
   - otherwise emits the output and propagates rc unchanged.

   Any *other* reprepro error path stays fatal — only the specific
   "No priority for X" pattern is neutralised. `dm-reprepro-wrapper`
   resolves `reprepro` via `$PATH` so it picks up the wrapper
   transparently.
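
The classification rule described above can be sketched in isolation as
follows. This is only an illustration, not the shipped wrapper: the
function name `classify_reprepro_output` and its "ok"/"benign"/"fatal"
labels are hypothetical, and it reads already-captured output on stdin
instead of invoking reprepro.

```shell
#!/bin/bash
# Hypothetical helper (illustration only): classify captured reprepro
# output. $1 is reprepro's exit code; the merged output arrives on stdin.
classify_reprepro_output() {
    rc="$1"
    if [ "$rc" -eq 0 ]; then
        echo ok
        return 0
    fi
    # Drop the known-benign lines and blank lines; anything left over is
    # treated as a real error.
    remaining=$(grep -vE "^No priority for '[^']+', skipping\.$" \
        | grep -vE '^There have been errors!$' \
        | grep -vE '^$' || true)
    if [ -z "$remaining" ]; then
        echo benign
    else
        echo fatal
    fi
}

# Usage: only the priority complaint plus the trailer counts as benign.
printf "No priority for 'virtualbox', skipping.\nThere have been errors!\n" \
    | classify_reprepro_output 255   # prints "benign"
printf "No priority for 'virtualbox', skipping.\nsome other failure\n" \
    | classify_reprepro_output 255   # prints "fatal"
```

Filtering out the benign lines and checking that nothing remains means
any unrecognised line keeps the failure fatal, which matches the intent
stated above.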

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .gitea/workflows/build-iso-linux.yaml         | 78 ++++++++++++++++++-
 linux/build/README.md                         |  2 +-
 linux/build/docker/Dockerfile.builder         | 27 ++++---
 .../build/docker/silvermetal-reprepro-wrap.sh | 62 +++++++++++++++
 linux/build/scripts/build.sh                  | 10 ++-
 5 files changed, 164 insertions(+), 15 deletions(-)
 create mode 100644 linux/build/docker/silvermetal-reprepro-wrap.sh

diff --git a/.gitea/workflows/build-iso-linux.yaml b/.gitea/workflows/build-iso-linux.yaml
index b1444b2..ea27969 100644
--- a/.gitea/workflows/build-iso-linux.yaml
+++ b/.gitea/workflows/build-iso-linux.yaml
@@ -4,6 +4,15 @@ name: Build SilverMetal Linux ISO (reproducibility-gated)
 # isolated directories and gates on byte-identical SHA256. On a tag push, the
 # verified ISO and its SHA256SUMS are attached to a Gitea release.
 #
+# Two-stage:
+#   1. builder-image — rebuilds linux/build/docker/Dockerfile.builder, pushes
+#      to docker-registry.silverlabs.uk/silvermetal-builder:m1.1-, and
+#      surfaces the resulting digest as a job output. This is what previously
+#      had to be done by hand on 10.0.0.51 between every iter.
+#   2. build-and-verify — reproducibility-gated double build, using the
+#      digest from step 1 via the BUILDER_IMAGE env override that build.sh
+#      already supports (and validates is digest-form).
+#
 # The release-upload pattern (create-if-not-exists then attach asset) is
 # lifted from SilverLABS/SilverVPN/.gitea/workflows/build-linux-client.yaml
 # lines 77-117. Keep them in sync if either changes.
@@ -32,11 +41,77 @@ concurrency:
   cancel-in-progress: true
 
 jobs:
+  builder-image:
+    # Build + push the silvermetal-builder image on the runner host's docker
+    # daemon (DooD via /var/run/docker.sock, mounted into the act_runner job
+    # container by linux/build/runner/docker-compose.yml). The runner host
+    # is also already authenticated to docker-registry.silverlabs.uk via
+    # /root/.docker (mounted read-only into the runner) so `docker push`
+    # works without an explicit login step here.
+    runs-on: silvermetal-builder
+    timeout-minutes: 30
+    outputs:
+      digest: ${{ steps.push.outputs.digest }}
+      image: ${{ steps.push.outputs.image }}
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        # No submodules needed for the builder image — its build context is
+        # only linux/build/docker/.
+
+      - name: Build & push silvermetal-builder
+        id: push
+        env:
+          REGISTRY: docker-registry.silverlabs.uk
+          REPO: silvermetal-builder
+        run: |
+          set -eu
+          TAG="m1.1-${GITHUB_SHA::12}"
+          IMAGE="${REGISTRY}/${REPO}:${TAG}"
+          LATEST="${REGISTRY}/${REPO}:latest"
+
+          echo "Building ${IMAGE}"
+          docker build \
+            -f linux/build/docker/Dockerfile.builder \
+            -t "${IMAGE}" \
+            -t "${LATEST}" \
+            linux/build/docker
+
+          echo "Pushing ${IMAGE}"
+          docker push "${IMAGE}"
+          docker push "${LATEST}"
+
+          # docker inspect's RepoDigests is "repo@sha256:...". Take the
+          # entry that matches the registry/repo we just pushed to (there
+          # may be multiple if the image has been pushed elsewhere too).
+          DIGEST=$(docker inspect --format '{{range .RepoDigests}}{{println .}}{{end}}' "${IMAGE}" \
+            | grep "^${REGISTRY}/${REPO}@" \
+            | head -n1 \
+            | sed 's/.*@//')
+          if [ -z "${DIGEST}" ]; then
+            echo "::error::failed to resolve digest for ${IMAGE}" >&2
+            docker inspect "${IMAGE}" >&2 || true
+            exit 1
+          fi
+          echo "Pushed digest: ${DIGEST}"
+
+          {
+            echo "digest=${DIGEST}"
+            echo "image=${REGISTRY}/${REPO}@${DIGEST}"
+          } >> "${GITHUB_OUTPUT}"
+
   build-and-verify:
     # Self-hosted, privileged-capable. Setup procedure documented in
     # linux/build/README.md ("Self-hosted runner setup").
+    needs: builder-image
     runs-on: silvermetal-builder
     timeout-minutes: 240
+    env:
+      # Override build.sh's compiled-in pin with the digest we just built &
+      # pushed. build.sh validates the @sha256: form on line ~37 — the
+      # composed value below satisfies that.
+      BUILDER_IMAGE: ${{ needs.builder-image.outputs.image }}
 
     steps:
       - name: Checkout (with submodules)
@@ -50,7 +125,8 @@ jobs:
           set -eu
           echo "commit=$(git rev-parse HEAD)"
           cat linux/build/config/snapshot-pin.env
-          echo "builder image:"
+          echo "builder image (this run): ${BUILDER_IMAGE}"
+          echo "Dockerfile.builder FROM/snapshot:"
           grep -E '^FROM |^ARG APT_SNAPSHOT_URL' linux/build/docker/Dockerfile.builder
 
       - name: Build A
diff --git a/linux/build/README.md b/linux/build/README.md
index c4df083..80c186b 100644
--- a/linux/build/README.md
+++ b/linux/build/README.md
@@ -102,7 +102,7 @@ Each of these is a deliberate, reviewed action — never automate:
 
 - **`derivative-maker` submodule** — bump in its own PR, with a verification log showing two clean builds match.
 - **`snapshot-pin.env`** — same procedure.
-- **Builder image (`Dockerfile.builder` digest)** — rebuild, push, update `BUILDER_IMAGE` in `build.sh`, run reproducibility check, commit all four together.
+- **Builder image (`Dockerfile.builder`)** — edit and commit. CI's `builder-image` job rebuilds, pushes, and feeds the new digest to `build-and-verify` automatically; no manual `docker build`/`docker push` step. The hardcoded `BUILDER_IMAGE` digest fallback in `build.sh` is for local/offline rebuilds only — bump it opportunistically after any merged Dockerfile change so non-CI `build.sh` keeps working at that commit.
 
 ## What this milestone is *not*
 
diff --git a/linux/build/docker/Dockerfile.builder b/linux/build/docker/Dockerfile.builder
index 1fc0015..78bd6bb 100644
--- a/linux/build/docker/Dockerfile.builder
+++ b/linux/build/docker/Dockerfile.builder
@@ -19,17 +19,15 @@
 # of the reproducibility gate, so do NOT replace the FROM line with a
 # tag-only reference.
 #
-# Build & push (run on 10.0.0.51 — never on the WSL/aarch64 dev box):
-#   docker build \
-#     -f linux/build/docker/Dockerfile.builder \
-#     -t docker-registry.silverlabs.uk/silvermetal-builder: \
-#     -t docker-registry.silverlabs.uk/silvermetal-builder:latest \
-#     linux/build/docker
-#   docker push docker-registry.silverlabs.uk/silvermetal-builder:
-#
-# To bump the base image: replace the digest, rebuild, push, update
-# BUILDER_IMAGE in linux/build/scripts/build.sh, run a full reproducibility
-# check, commit all the changes together.
+# Build & push: handled by .gitea/workflows/build-iso-linux.yaml
+# (`builder-image` job). Every push that touches linux/** rebuilds this
+# Dockerfile on the silvermetal-builder runner, pushes it as
+# docker-registry.silverlabs.uk/silvermetal-builder:m1.1- + :latest,
+# and feeds the resulting digest into the build-and-verify job via the
+# BUILDER_IMAGE env override that build.sh supports. Do NOT build it
+# locally during normal iter cycles — let CI do it. The pin in
+# build.sh#BUILDER_IMAGE is only the local-developer / offline-rebuild
+# fallback.
 
 # debian:trixie-slim — pinned by digest.
 # Resolved 2026-05-07 via `docker pull debian:trixie-slim` on 10.0.0.51.
@@ -134,5 +132,12 @@ COPY systemd-entrypoint/docker-entrypoint.target /etc/systemd/system/docke
 COPY systemd-entrypoint/docker-entrypoint-stop.sh /usr/bin/docker-entrypoint-stop.sh
 RUN chmod +x /usr/local/bin/entrypoint.sh /usr/bin/docker-entrypoint-stop.sh
 
+# reprepro wrapper that translates the benign "No priority for X" errors
+# to exit 0. /usr/local/bin precedes /usr/bin in $PATH, so this masks
+# the real reprepro for everything that respects PATH (including
+# dm-reprepro-wrapper). See the script header for the full rationale.
+COPY silvermetal-reprepro-wrap.sh /usr/local/bin/reprepro
+RUN chmod +x /usr/local/bin/reprepro
+
 ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
 CMD ["/bin/bash"]
diff --git a/linux/build/docker/silvermetal-reprepro-wrap.sh b/linux/build/docker/silvermetal-reprepro-wrap.sh
new file mode 100644
index 0000000..5d35758
--- /dev/null
+++ b/linux/build/docker/silvermetal-reprepro-wrap.sh
@@ -0,0 +1,62 @@
+#!/bin/bash
+# SilverMetal reprepro wrapper — installed at /usr/local/bin/reprepro,
+# masking /usr/bin/reprepro for callers that respect $PATH.
+#
+# Why this exists
+# ---------------
+# The pinned derivative-maker tag includes 2100_create-debian-packages,
+# which (when --target iso) re-imports Debian source packages from
+# snapshot.debian.org into a local apt repo via:
+#   reprepro --basedir … includedsc local .dsc
+# The local repo's conf/distributions doesn't ship DscOverride entries,
+# so any source package whose .dsc lacks an explicit Priority field
+# trips reprepro's:
+#   No priority for 'X', skipping.
+#   There have been errors!
+# and reprepro exits 255. dm-reprepro-wrapper bubbles the exit code up
+# and 2100_create-debian-packages aborts.
+#
+# This is the situation for virtualbox_*.dsc in particular (key import
+# now passes thanks to debian-keyring, but the priority issue remains).
+# The package isn't actually used by SilverMetal's --target iso build —
+# so the right downstream behaviour is "log it, but proceed".
+#
+# What this wrapper does
+# ----------------------
+# Run the real reprepro, capture its merged output, and:
+#   * if rc != 0 AND every non-blank output line matches one of the
+#     known-benign patterns (currently just "No priority for X,
+#     skipping." plus the trailing "There have been errors!"),
+#     emit the output verbatim, log a single line to stderr, and
+#     exit 0.
+#   * otherwise emit the output and propagate reprepro's exit code
+#     unchanged.
+#
+# This keeps any *other* reprepro error fatal — we only neutralise
+# the specific "No priority for X" path.
+
+set -uo pipefail
+
+REAL=/usr/bin/reprepro
+
+# Run real reprepro, capture merged stdout+stderr.
+output=$("$REAL" "$@" 2>&1)
+rc=$?
+
+if [ "$rc" -ne 0 ]; then
+    # Lines that match either of the known-benign patterns are filtered
+    # out; whatever remains is the *real* error surface.
+    remaining=$(printf '%s\n' "$output" \
+        | grep -vE "^No priority for '[^']+', skipping\.$" \
+        | grep -vE '^There have been errors!$' \
+        | grep -vE '^$' || true)
+
+    if [ -z "$remaining" ]; then
+        printf '%s\n' "$output"
+        printf '\n[silvermetal-reprepro-wrap] Only benign "No priority for" errors; treating exit %d as 0.\n' "$rc" >&2
+        exit 0
+    fi
+fi
+
+printf '%s\n' "$output"
+exit "$rc"
diff --git a/linux/build/scripts/build.sh b/linux/build/scripts/build.sh
index c3f6f37..32a5f0f 100755
--- a/linux/build/scripts/build.sh
+++ b/linux/build/scripts/build.sh
@@ -25,8 +25,14 @@ REPO_ROOT="$(cd -- "${SCRIPT_DIR}/../../.." && pwd)"
 cd "${REPO_ROOT}"
 
 # --- Pinned builder image ---------------------------------------------------
-# Bumped together with linux/build/docker/Dockerfile.builder. The digest form
-# is required; refusing the tag-only form is what stops a silent host drift.
+# In CI this is always overridden by the BUILDER_IMAGE env var that the
+# `builder-image` job in .gitea/workflows/build-iso-linux.yaml passes in
+# (the digest of the silvermetal-builder image it just built and pushed).
+# The hardcoded default below is the local-developer / offline-rebuild
+# fallback; bump it after any meaningful Dockerfile.builder change merges
+# so `linux/build/scripts/build.sh` works without CI for the same commit.
+# The digest form is required either way; refusing the tag-only form is
+# what stops a silent host drift.
 #
 # docker-registry.silverlabs.uk is the canonical hostname both inside and
 # outside the LAN — it's the entry that fleet-wide /etc/docker/daemon.json