SilverMetal/.gitea/workflows/build-iso-linux.yaml
SysAdmin c9e67d8b47
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m36s
fix(linux/build): staged divergence diagnostic, avoid OOM (M1.1 iter26)
Run #4273 confirmed two things:

1. The reproducibility gate works end-to-end. Both builds produced
   ISOs (1077194752 vs 1077202944 bytes — 8 KiB delta, exactly one
   squashfs block worth of compressed-payload drift) and the compare
   step caught it.

2. diffoscope, run on the whole 1 GB ISO inside the silvermetal-builder
   container, gets OOM-killed before producing any output:

       diagnose-divergence.sh: line 44:    13 Killed
         diffoscope --max-report-size 100000000 --html ... --text ... A.iso B.iso

   The host has 19 GiB free, but diffoscope's full recursion through
   ISO -> squashfs -> thousands of inner files needs more memory than
   that for a 1 GB image. Setting --max-report-size only caps the
   output, not the working set.
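
Both observations can be checked mechanically. A minimal sketch, assuming a POSIX shell: the sizes are the ones from run #4273, `describe_exit` is a hypothetical helper (not part of diagnose-divergence.sh), and a child that SIGKILLs itself stands in for the OOM-killed diffoscope. A status above 128 only says "killed by a signal"; 137 (128 + 9) is what an OOM kill looks like, but any other SIGKILL looks the same.

```shell
# ISO sizes reported by run #4273.
size_a=1077194752
size_b=1077202944
delta=$(( size_b - size_a ))
echo "delta: ${delta} bytes ($(( delta / 1024 )) KiB)"

# Hypothetical helper: turn an exit status into a readable cause.
# Shells report "128 + signal number" for a child killed by a signal.
describe_exit() {
    status="$1"
    if [ "${status}" -gt 128 ]; then
        echo "killed by signal $(( status - 128 ))"
    else
        echo "exited with status ${status}"
    fi
}

# Stand-in for the OOM-killed diffoscope: a child shell that SIGKILLs itself.
# The "&& ... ||" form records the status without tripping `set -e`.
sh -c 'kill -9 $$' && rc=0 || rc=$?
describe_exit "${rc}"
```

In the real script the same check could follow the diffoscope call, so a kill gets recorded in the report instead of only showing up as "Killed" in the job log.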

Rewrite diagnose-divergence.sh to do staged, cheap-to-expensive
analysis:
  1. sha256 + sizes (always)
  2. xorriso TOC of both ISOs (every node: mode/size/mtime/path) -> diff
  3. Pull just live/filesystem.squashfs out of each ISO,
     sha256 it + `unsquashfs -ll` it, diff the listings — this is
     where the per-file-size signal lives.
  4. Targeted diffoscope on the squashfs payload only, with
     --max-container-depth 2 + --max-text-report-size 5MB + --no-html
     + a 10-minute timeout. Bounded enough to finish without the OOM.
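
Stage 3 is the one expected to name the drifting files. A rough sketch of the listing comparison, assuming each `unsquashfs -ll` row has the shape "perms owner/group size date time path" and that paths contain no spaces; `changed_sizes` is a hypothetical helper, and the real script may simply `diff` the two listings instead.

```shell
# Hypothetical helper: print "path size_a size_b" for every path present in
# both listings whose size column differs.
# Assumed row shape: perms owner/group size date time path
changed_sizes() {
    awk '
        NR == FNR { size[$6] = $3; next }      # first listing: remember size per path
        ($6 in size) && size[$6] != $3 {       # second listing: same path, other size
            print $6, size[$6], $3
        }
    ' "$1" "$2"
}

# Usage with two saved listings (file names are placeholders):
#   unsquashfs -ll a.squashfs > a-ls.txt
#   unsquashfs -ll b.squashfs > b-ls.txt
#   changed_sizes a-ls.txt b-ls.txt
```

Files that exist on only one side are not reported by this sketch; a plain `diff` of the sorted listings covers that case.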

Drops `set -e` — every step `|| true`s itself so we get partial output
even when one stage fails.
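
A minimal sketch of the "every step survives its own failure" pattern, assuming a POSIX shell. `run_stage` is a hypothetical wrapper, not the code in diagnose-divergence.sh: it captures a stage's output in its report file and always returns 0, which has the same effect as appending `|| true` to each stage.

```shell
# Hypothetical wrapper: run one diagnostic stage, capture stdout and stderr
# in a report file, and never propagate the stage's failure.
run_stage() {
    stage_name="$1"
    out_file="$2"
    shift 2
    echo "--- stage: ${stage_name}"
    if "$@" > "${out_file}" 2>&1; then
        echo "    ok"
    else
        status=$?
        echo "    failed (exit ${status}), continuing"
    fi
    return 0
}

# Usage sketch (paths are placeholders). The expensive stage is wrapped in
# coreutils `timeout`, which exits 124 if the 600-second limit is hit:
#   run_stage "sizes" sizes.txt ls -l A.iso B.iso
#   run_stage "diffoscope" sqfs-diff.txt \
#       timeout 600 diffoscope --max-container-depth 2 a.squashfs b.squashfs
```

Because the wrapper reports the exit status, a stage that was killed or timed out still leaves a trace in the log next to whatever partial output it wrote.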

Workflow tail-into-log step now prints the new staged outputs:
  * toc-diff.txt   — what changed at the ISO level
  * sqfs-ls-diff.txt — which inner files have different sizes/mtimes
  * sqfs-diff.txt   — diffoscope on the squashfs only
  * squashfs-sha256.txt
  * iso-header-cmp.txt — first-8KB cmp -l for header-level drift
  * sizes.txt / sha256.txt / checklist.md as before

Should land us a focused list of "these N files inside the squashfs
have different bytes" — that's what we need to find what's leaking
non-determinism into the build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 19:54:35 +01:00


name: Build SilverMetal Linux ISO (reproducibility-gated)

# M1.1 exit-criterion check. Builds the ISO twice from a clean checkout in
# isolated directories and gates on byte-identical SHA256. On a tag push, the
# verified ISO and its SHA256SUMS are attached to a Gitea release.
#
# Two-stage:
#   1. builder-image — rebuilds linux/build/docker/Dockerfile.builder, pushes
#      to docker-registry.silverlabs.uk/silvermetal-builder:m1.1-<sha>, and
#      surfaces the resulting digest as a job output. This is what previously
#      had to be done by hand on 10.0.0.51 between every iter.
#   2. build-and-verify — reproducibility-gated double build, using the
#      digest from step 1 via the BUILDER_IMAGE env override that build.sh
#      already supports (and validates is digest-form).
#
# The release-upload pattern (create-if-not-exists then attach asset) is
# lifted from SilverLABS/SilverVPN/.gitea/workflows/build-linux-client.yaml
# lines 77-117. Keep them in sync if either changes.

on:
  push:
    branches: [main]
    paths:
      - 'linux/**'
      - 'shared/branding/linux-iso-meta.yaml'
      - '.gitea/workflows/build-iso-linux.yaml'
    tags:
      - 'v*'
  pull_request:
    branches: [main]
    paths:
      - 'linux/**'
      - 'shared/branding/linux-iso-meta.yaml'
      - '.gitea/workflows/build-iso-linux.yaml'
  workflow_dispatch:

# Two reproducibility-gated builds in flight at once would just compete for
# loop devices on the privileged runner. Serialise per ref.
concurrency:
  group: build-iso-linux-${{ github.ref }}
  cancel-in-progress: true

jobs:
  builder-image:
    # Build + push the silvermetal-builder image on the runner host's docker
    # daemon (DooD via /var/run/docker.sock, mounted into the act_runner job
    # container by linux/build/runner/docker-compose.yml). The runner host
    # is also already authenticated to docker-registry.silverlabs.uk via
    # /root/.docker (mounted read-only into the runner) so `docker push`
    # works without an explicit login step here.
    runs-on: silvermetal-builder
    timeout-minutes: 30
    outputs:
      digest: ${{ steps.push.outputs.digest }}
      image: ${{ steps.push.outputs.image }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        # No submodules needed for the builder image — its build context is
        # only linux/build/docker/.

      - name: Build & push silvermetal-builder
        id: push
        env:
          REGISTRY: docker-registry.silverlabs.uk
          REPO: silvermetal-builder
        run: |
          set -eu
          TAG="m1.1-${GITHUB_SHA::12}"
          IMAGE="${REGISTRY}/${REPO}:${TAG}"
          LATEST="${REGISTRY}/${REPO}:latest"
          echo "Building ${IMAGE}"
          docker build \
            -f linux/build/docker/Dockerfile.builder \
            -t "${IMAGE}" \
            -t "${LATEST}" \
            linux/build/docker
          echo "Pushing ${IMAGE}"
          docker push "${IMAGE}"
          docker push "${LATEST}"
          # docker inspect's RepoDigests is "repo@sha256:...". Take the
          # entry that matches the registry/repo we just pushed to (there
          # may be multiple if the image has been pushed elsewhere too).
          DIGEST=$(docker inspect --format '{{range .RepoDigests}}{{println .}}{{end}}' "${IMAGE}" \
            | grep "^${REGISTRY}/${REPO}@" \
            | head -n1 \
            | sed 's/.*@//')
          if [ -z "${DIGEST}" ]; then
            echo "::error::failed to resolve digest for ${IMAGE}" >&2
            docker inspect "${IMAGE}" >&2 || true
            exit 1
          fi
          echo "Pushed digest: ${DIGEST}"
          {
            echo "digest=${DIGEST}"
            echo "image=${REGISTRY}/${REPO}@${DIGEST}"
          } >> "${GITHUB_OUTPUT}"

  build-and-verify:
    # Self-hosted, privileged-capable. Setup procedure documented in
    # linux/build/README.md ("Self-hosted runner setup").
    needs: builder-image
    runs-on: silvermetal-builder
    timeout-minutes: 240
    env:
      # Override build.sh's compiled-in pin with the digest we just built &
      # pushed. build.sh validates the @sha256: form on line ~37 — the
      # composed value below satisfies that.
      BUILDER_IMAGE: ${{ needs.builder-image.outputs.image }}
    steps:
      - name: Checkout (with submodules)
        uses: actions/checkout@v4
        with:
          submodules: recursive
          fetch-depth: 0  # need history so SOURCE_DATE_EPOCH = HEAD commit time

      - name: Show pinned inputs
        run: |
          set -eu
          echo "commit=$(git rev-parse HEAD)"
          cat linux/build/config/snapshot-pin.env
          echo "builder image (this run): ${BUILDER_IMAGE}"
          echo "Dockerfile.builder FROM/snapshot:"
          grep -E '^FROM |^ARG APT_SNAPSHOT_URL' linux/build/docker/Dockerfile.builder

      - name: Build A
        env:
          BUILD_DIR: ${{ github.workspace }}/build-a
        run: linux/build/scripts/build.sh

      - name: Build B (clean second build)
        env:
          BUILD_DIR: ${{ github.workspace }}/build-b
        run: linux/build/scripts/build.sh

      - name: Compare SHA256
        id: compare
        run: |
          set -eu
          A=$(sha256sum "${{ github.workspace }}/build-a"/*.iso | cut -d' ' -f1)
          B=$(sha256sum "${{ github.workspace }}/build-b"/*.iso | cut -d' ' -f1)
          echo "A=${A}"
          echo "B=${B}"
          echo "iso_sha256=${A}" >> "${GITHUB_OUTPUT}"
          if [ "${A}" != "${B}" ]; then
            echo "::error::ISO SHA256 mismatch — A=${A} B=${B}"
            # The catthehacker job container has cmp but not diffoscope.
            # We do have the silvermetal-builder image (with
            # diffoscope-minimal baked in via Dockerfile.builder) on the
            # host docker daemon — built fresh at the top of this run by
            # the builder-image job. Run diagnose-divergence inside that
            # image so we get the rich, package-aware diff. Mount the
            # workspace at the same path so REPO_ROOT-relative resolution
            # in diagnose-divergence.sh works.
            ISO_A="$(ls "${{ github.workspace }}/build-a"/*.iso | head -n1)"
            ISO_B="$(ls "${{ github.workspace }}/build-b"/*.iso | head -n1)"
            mkdir -p "${{ github.workspace }}/divergence"
            SELF_CID=""
            for cid in $(docker ps -q --no-trunc 2>/dev/null); do
              if docker inspect "$cid" --format \
                  '{{range .Mounts}}{{if eq .Destination "/workspace/SilverLABS/SilverMetal"}}match{{end}}{{end}}' \
                  2>/dev/null | grep -q match; then
                SELF_CID="$cid"; break
              fi
            done
            if [ -n "${SELF_CID}" ]; then
              docker run --rm \
                --volumes-from "${SELF_CID}" \
                -e ISO_A="${ISO_A}" \
                -e ISO_B="${ISO_B}" \
                -e REPORT_DIR="${{ github.workspace }}/divergence" \
                --entrypoint /bin/bash \
                "${BUILDER_IMAGE}" \
                "${{ github.workspace }}/linux/build/scripts/diagnose-divergence.sh" \
                || true
            else
              echo "::warning::Could not find self container, falling back to host cmp"
              ISO_A="${ISO_A}" ISO_B="${ISO_B}" \
                REPORT_DIR="${{ github.workspace }}/divergence" \
                linux/build/scripts/diagnose-divergence.sh || true
            fi
            # Tail key signal directly into the workflow log so we see it
            # without needing artifact download (Gitea 1.25's API doesn't
            # surface upload-artifact@v3 payloads through any v1 endpoint
            # we've found). Print sizes, sha, checklist, and the new
            # staged outputs from diagnose-divergence.sh: ISO TOC diff
            # and squashfs file listing diff first (small, high signal),
            # then the targeted diffoscope output on the squashfs payload.
            DIVDIR="${{ github.workspace }}/divergence"
            print_section() {
              local title="$1" path="$2" head_lines="${3:-0}"
              [ -e "${path}" ] || return 0
              echo ""
              echo "=== ${title} ==="
              if [ "${head_lines}" -gt 0 ]; then
                head -n "${head_lines}" "${path}" 2>/dev/null || true
              else
                cat "${path}" 2>/dev/null || true
              fi
            }
            print_section "ISO sizes" "${DIVDIR}/sizes.txt"
            print_section "SHA256 (ISO)" "${DIVDIR}/sha256.txt"
            print_section "SHA256 (squashfs payload)" "${DIVDIR}/squashfs-sha256.txt"
            print_section "checklist" "${DIVDIR}/checklist.md"
            print_section "ISO TOC diff (xorriso lsdl)" "${DIVDIR}/toc-diff.txt" 400
            print_section "squashfs file listing diff" "${DIVDIR}/sqfs-ls-diff.txt" 600
            print_section "diffoscope (squashfs)" "${DIVDIR}/sqfs-diff.txt" 600
            print_section "ISO header cmp -l (first 8KB)" "${DIVDIR}/iso-header-cmp.txt" 100
            echo ""
            echo "(Full report uploaded as divergence-report-${{ github.run_id }})"
            exit 1
          fi
          echo "Reproducibility gate PASSED at ${A}"

      - name: Upload divergence report on failure
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: divergence-report-${{ github.run_id }}
          path: ${{ github.workspace }}/divergence
          if-no-files-found: ignore
          retention-days: 14

      - name: Stage release artefacts
        if: startsWith(github.ref, 'refs/tags/')
        run: |
          set -eu
          mkdir -p release
          cp "${{ github.workspace }}/build-a"/*.iso release/
          cp "${{ github.workspace }}/build-a"/SHA256SUMS release/
          cp "${{ github.workspace }}/build-a"/BUILD_INFO release/
          cp "${{ github.workspace }}/build-a"/snapshot-pin.env release/snapshot-pin.env
          ls -la release/

      - name: Upload to Gitea release (tag only)
        if: startsWith(github.ref, 'refs/tags/')
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          set -eu
          TAG="${{ github.ref_name }}"
          API="${{ github.server_url }}/api/v1"
          REPO="${{ github.repository }}"
          # Create-if-not-exists, then attach assets. Pattern lifted from
          # SilverLABS/SilverVPN build-linux-client.yaml:89-115.
          RELEASE_ID=$(curl -s "${API}/repos/${REPO}/releases/tags/${TAG}" \
            -H "Authorization: token ${GITHUB_TOKEN}" \
            | jq -r '.id // empty' 2>/dev/null || true)
          if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "0" ]; then
            RESPONSE=$(curl -s -w "\n%{http_code}" -X POST "${API}/repos/${REPO}/releases" \
              -H "Authorization: token ${GITHUB_TOKEN}" \
              -H "Content-Type: application/json" \
              -d "{\"tag_name\":\"${TAG}\",\"name\":\"SilverMetal Linux ${TAG}\",\"body\":\"Reproducibility-verified ISO. SHA256 in SHA256SUMS asset.\",\"draft\":false,\"prerelease\":true}")
            HTTP_CODE=$(echo "${RESPONSE}" | tail -1)
            BODY=$(echo "${RESPONSE}" | sed '$d')
            if [ "${HTTP_CODE}" -ge 400 ]; then
              echo "ERROR: Failed to create release (${HTTP_CODE}): ${BODY}"
              exit 1
            fi
            RELEASE_ID=$(echo "${BODY}" | jq -r '.id')
            echo "Created release ID: ${RELEASE_ID}"
          else
            echo "Found existing release ID: ${RELEASE_ID}"
          fi
          for asset in release/*; do
            name=$(basename "${asset}")
            echo "Attaching ${name}"
            curl -sf -X POST \
              "${API}/repos/${REPO}/releases/${RELEASE_ID}/assets?name=${name}" \
              -H "Authorization: token ${GITHUB_TOKEN}" \
              -H "Content-Type: application/octet-stream" \
              --data-binary "@${asset}"
          done