SilverMetal/linux/build/scripts/build-inner.sh
SysAdmin 34bc442dd8
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m40s
fix(linux/build): cover all ISO9660 dates + locate residual byte drift (M1.1 iter34)
Run #4281 cleared every layer above the ISO9660 wrapper:

    SHA256 (squashfs payload)
    caed117ca72c6c1d9204c49dd749d5f7b372f3a19cac1b2a7e66bee452a8d501  /tmp/.../a.squashfs
    caed117ca72c6c1d9204c49dd749d5f7b372f3a19cac1b2a7e66bee452a8d501  /tmp/.../b.squashfs

…squashfs is now byte-identical, ISO TOC is identical, file listing
diff is empty, but ISO SHA still differs. The remaining drift is in
the ISO9660 metadata region between the system area (first 32 KiB)
and the file payload start.
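
A minimal sketch of the payload check quoted above, assuming the two
squashfs files have already been extracted from ISO A and ISO B. The
function name `payloads_identical` is hypothetical and is not part of
the repository's scripts:

```shell
# payloads_identical A B: print both SHA-256 sums and succeed only when
# they match. Hypothetical helper, for illustration only.
payloads_identical() {
    local a="$1" b="$2" sum_a sum_b
    sum_a=$(sha256sum -- "$a" | cut -d' ' -f1)
    sum_b=$(sha256sum -- "$b" | cut -d' ' -f1)
    printf '%s  %s\n%s  %s\n' "$sum_a" "$a" "$sum_b" "$b"
    [ "$sum_a" = "$sum_b" ]
}
```

Called as `payloads_identical a.squashfs b.squashfs`, it returns
non-zero on any mismatch, so it can gate a later step directly.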

Two complementary changes:

1. xorriso post-process now sets *every* date field xorriso writes,
   not just the obvious two:

     -alter_date_r all     — atime + mtime + btime on all nodes,
                             not just mtime. ISO9660 directory
                             records carry creation+modification
                             timestamps.
     -volume_date c m x f u s — every volume-descriptor date:
       c=creation  m=modification  x=expiration  f=effective
       u=system area  s=path table
     Default for any unset volume_date is "now", which is what was
     leaking through despite us setting c+m.

2. diagnose-divergence.sh now does whole-file cmp -l (capped at 200
   lines so 1 GiB of all-different doesn't drown the report) and on
   any divergence, dumps a 128-byte xxd window from each ISO around
   the first differing byte plus a unified diff between the two
   windows. This tells us in the next failure log "first byte differs
   at offset N (LBA M), bytes around it look like X" — pinpoints the
   ISO9660 region without needing artifact download.
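
The diagnostic in item 2 could look roughly like the sketch below. It
is an illustration, not the actual diagnose-divergence.sh: the function
names are hypothetical, and it assumes `cmp` (diffutils) and `xxd` are
installed.

```shell
# first_diff_offset A B: print the 0-based offset of the first differing
# byte, or nothing when no byte differs. cmp -l numbers bytes from 1.
first_diff_offset() {
    { cmp -l -- "$1" "$2" 2>/dev/null || true; } \
        | head -n 1 | awk '{ print $1 - 1 }'
}

# window_start OFFSET: start of a 128-byte window centred on OFFSET,
# clamped at the beginning of the file.
window_start() {
    echo $(( $1 > 64 ? $1 - 64 : 0 ))
}

# dump_window FILE OFFSET: hex-dump the 128 bytes around OFFSET.
dump_window() {
    xxd -s "$(window_start "$2")" -l 128 "$1"
}

# capped_cmp A B: whole-file byte comparison, limited to 200 lines so a
# fully divergent image does not flood the report.
capped_cmp() {
    { cmp -l -- "$1" "$2" 2>/dev/null || true; } | head -n 200
}
```

Writing `dump_window a.iso "$off"` and `dump_window b.iso "$off"` to two
files and running `diff -u` over them gives the unified diff of the two
windows.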

The workflow's tail-into-log step now also prints the two new files
(iso-cmp-first-200.txt, iso-around-first-diff.diff).

If iter34 still fails the gate, the new diagnostic tells us exactly
which structure (volume descriptor, path table, directory record,
boot catalog…) is still drifting.
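
As a rough illustration of turning "offset N" into "LBA M": ISO9660
uses 2048-byte logical sectors, the system area occupies LBA 0-15 (the
first 32 KiB), and the volume descriptor set starts at LBA 16. The
helper names below are hypothetical. The classification is deliberately
coarse, because the path table and directory record locations are
recorded in the primary volume descriptor and cannot be derived from
the offset alone.

```shell
ISO_SECTOR_SIZE=2048        # bytes per ISO9660 logical sector
ISO_SYSTEM_AREA_SECTORS=16  # LBA 0-15 = first 32 KiB

# offset_to_lba OFFSET: print the sector containing a 0-based byte offset.
offset_to_lba() {
    echo $(( $1 / ISO_SECTOR_SIZE ))
}

# coarse_region OFFSET: say only what the offset alone can tell us.
coarse_region() {
    local lba
    lba=$(offset_to_lba "$1")
    if [ "$lba" -lt "$ISO_SYSTEM_AREA_SECTORS" ]; then
        echo "system-area"
    else
        echo "iso9660-structures-or-payload"
    fi
}
```

For a finer answer, the reported LBA would be compared against the
extent locations stored in the image's own volume descriptors.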

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 00:29:37 +01:00


#!/usr/bin/env bash
# SilverMetal Linux — inner build step.
#
# Runs *inside* the silvermetal-builder container, as the unprivileged
# `user` (uid 1000). build.sh's docker-run command chowns the workspace
# and re-enters here via sudo. The container's PID 1 is systemd (upstream's
# systemd-in-container pattern), so any `systemctl` calls derivative-
# maker makes — to start approx, daemon-reload, etc. — actually do
# what they're supposed to. derivative-maker uses sudo internally for
# its privileged ops.
#
# Why this is its own file:
# The previous incarnation lived as a heredoc inside build.sh's docker
# run command. Once we needed to drop privileges from root to user,
# the nested-heredoc / nested-quoting situation became unreadable; a
# plain script with normal quoting is far easier to maintain.
#
# Required env vars (set by build.sh and forwarded into the container):
#   REPO_ROOT          - absolute path to the SilverMetal repo root
#   BUILD_DIR          - where to drop the resulting *.iso and manifests
#   SOURCE_DATE_EPOCH  - reproducibility timestamp (forwarded to live-build
#                        and used directly by the post-processing below)
#   SNAPSHOT_TIMESTAMP - apt snapshot pin (forwarded to live-build)
set -euo pipefail
: "${REPO_ROOT:?REPO_ROOT must be set}"
: "${BUILD_DIR:?BUILD_DIR must be set}"
# Fail before the long build rather than at the mksquashfs step.
: "${SOURCE_DATE_EPOCH:?SOURCE_DATE_EPOCH must be set}"
# Explicit user_name pin.
# derivative-maker/help-steps/variables (lines 80-93) computes user_name
# from $SUDO_USER as its first non-empty fallback. We enter this script
# via `sudo --preserve-env -u user --` from root, which makes sudo set
# SUDO_USER=root (the *calling* user). variables.sh then picks
# user_name="root" and computes HOMEVAR=/home/root, which doesn't exist
# (root's home is /root). The first thing that breaks under that path
# is the aptgetopt config tee in 1100_sanity-tests:
#     tee: /home/root/derivative-binary/30_derivative-maker.conf:
#     No such file or directory
# Setting user_name explicitly satisfies the first-priority check in
# variables.sh and short-circuits the SUDO_USER fallback.
export user_name=user
# Create the binary output directory derivative-maker writes into.
# variables.sh sets binary_build_folder_dist=$HOMEVAR/derivative-binary
# (= /home/user/derivative-binary), and 1100_sanity-tests / later steps
# expect it to exist. Upstream's docker-start does the equivalent
# `mkdir --parents -- "${HOME}/derivative-binary"`; we replicate that
# here so we don't depend on upstream's wrapper.
mkdir -p "${HOME}/derivative-binary"
# Import Debian developer keys into the user's GPG keyring.
# 2100_create-debian-packages calls `dm-reprepro-wrapper includedsc`
# on Debian source packages it pulls in (e.g. virtualbox_*.dsc, even
# for --target iso — see 2100_create-debian-packages line 114), and
# reprepro verifies each .dsc's signature against the user's keyring.
# Without this, every dsc with a Debian-uploader signature fails:
#     Could not check validity of signature with '<fingerprint>' in
#     '...virtualbox_7.2.8-dfsg-1.dsc' as public key missing!
#     There have been errors!
# debian-keyring (~40 MB, snapshot-pinned) provides the developer
# keys; importing it once at the start of the build seeds the keyring
# reprepro will consult.
if [ -d /usr/share/keyrings ]; then
  for f in /usr/share/keyrings/debian-keyring.gpg \
           /usr/share/keyrings/debian-maintainers.gpg \
           /usr/share/keyrings/debian-nonupload.gpg; do
    [ -f "$f" ] || continue
    gpg --quiet --no-tty --import "$f" 2>/dev/null || true
  done
fi
# Bridge the upstream-blessed checkout path.
# 1500_local-deps installs the in-repo `developer-meta-files.deb` system-
# wide. That deb ships /usr/bin/dm-reprepro-wrapper, which begins with
#     source "${0%/*}/../../help-steps/pre"
# resolved against the install location (/usr/bin) and a user_name-relative
# layout that lands at /home/user/derivative-maker/help-steps/pre.
# When 2100_create-debian-packages calls `dm-reprepro-wrapper` via PATH
# (e.g. through `genmkfile reprepro-remove`), the system copy wins over
# the in-repo one, and the source fails:
#     /usr/bin/dm-reprepro-wrapper: line 28:
#     /home/user/derivative-maker/help-steps/pre: No such file or directory
# Make /home/user/derivative-maker resolve to our actual checkout so both
# the in-repo and system-installed wrappers find the same support files.
ln -sfn "${REPO_ROOT}/linux/build/derivative-maker" /home/user/derivative-maker
# shellcheck disable=SC1091
source "${REPO_ROOT}/linux/build/config/silvermetal-base.conf"
cd "${REPO_ROOT}/linux/build/derivative-maker"
# CLI grammar comes from derivative-maker/help-steps/parse-cmd. The
# valid options are a closed set; passing anything else (including
# --build, --dist, or --config) trips the "unknown option" guard at
# parse-cmd line 725. Spelling matters too: upstream uses --flavor
# (American), not --flavour. --freedom is mandatory for amd64/i386.
# Dist is implicit from --flavor (kicksecure-cli => trixie), and
# the silvermetal-base.conf is sourced into the env above rather than
# passed as a flag because derivative-maker has no --config option.
#
# --allow-untagged true / --allow-uncommitted true: the pinned upstream
# tag (18.1.7.4-developers-only — name says it all) deliberately ships
# with some submodules at intermediate / merge commits. sq-git still
# verifies every signature in the chain — these flags only relax the
# additional "must be at a release tag" check. Appropriate for a
# downstream consumer pinned to a developer tag.
./derivative-maker \
  --flavor "${DERIVATIVE_FLAVOUR}" \
  --target "${DERIVATIVE_BUILD_TARGET}" \
  --arch "${DERIVATIVE_TARGET_ARCH}" \
  --freedom "${DERIVATIVE_FREEDOM}" \
  --allow-untagged true \
  --allow-uncommitted true
# --- Reproducibility post-processing ---------------------------------------
# Run #4276's diffoscope pinned the divergence to exactly two files in the
# rootfs squashfs:
#
#   /etc/nvme/hostid
#     Random UUID written by nvme-cli's postinst at install time. Two
#     independent CI runs of the same commit produce different UUIDs.
#     At runtime nvme-cli regenerates this on first boot if it's
#     missing, so dropping it from the ISO is safe and standard
#     practice for reproducible Debian rebuilders.
#
#   /var/lib/dkms/<module>/<version>/<kver>/<arch>/log/make.log
#     Build log captured during DKMS module compilation (currently
#     only `tirdad`). Embeds wall-clock start/end times and elapsed
#     seconds. Not needed at runtime: DKMS only consults make.log
#     when troubleshooting a failed build, which is a development
#     activity. The entire /var/lib/dkms/<…>/log directory is
#     dropped.
#
# We can't fix this in derivative-maker without forking it (the
# offending postinsts are part of upstream Debian packages, not
# Kicksecure's own scripts), so the surgical place is here, between
# the chroot being assembled and the squashfs being sealed. We
# rebuild the squashfs from the (cleaned) chroot and patch it back
# into the ISO.
#
# This adds ~5-7 minutes per build (mksquashfs of ~1 GiB, then
# xorriso replace) but guarantees byte-equality between A and B.
post_process_for_reproducibility() {
  local chroot_dir iso_file new_sqfs
  chroot_dir=$(find "${HOME}/derivative-binary" -maxdepth 6 -type d \
    -path '*/live-build/chroot' -print -quit 2>/dev/null || true)
  iso_file=$(find "${HOME}/derivative-binary" -maxdepth 6 -type f \
    -name '*.iso' -print -quit 2>/dev/null || true)
  if [[ -z "${chroot_dir}" || -z "${iso_file}" ]]; then
    # All three diagnostic lines go to stderr so they stay together.
    echo "post-process: chroot or ISO not found, skipping reproducibility scrub" >&2
    echo " chroot=${chroot_dir:-<missing>}" >&2
    echo " iso=${iso_file:-<missing>}" >&2
    return 0
  fi
  echo "post-process: chroot=${chroot_dir}"
  echo "post-process: iso=${iso_file}"
  # Files we know to be non-deterministic. sudo because the chroot
  # is owned by root.
  #
  # Why each one:
  #   /etc/nvme/host{id,nqn}
  #       Random UUIDs (nvme-cli postinst). nvme-cli regenerates
  #       them on first boot.
  #   /var/lib/dkms/<…>/log
  #       Wall-clock build timestamps in DKMS make.log; not
  #       consulted at runtime.
  #   /var/cache/apt/{,src}pkgcache.bin
  #       apt's compiled package index, with internal
  #       pointers/timestamps that vary run-to-run. Regenerated on
  #       first `apt-get update`, or transparently by anything
  #       that needs it.
  #   /var/cache/ldconfig/aux-cache
  #       ldconfig auxiliary cache, also with internal
  #       non-deterministic state. Regenerated by ldconfig.
  #   /var/log/{dpkg,alternatives}.log
  #       Package-install logs; every line is stamped with the
  #       wall-clock time of the build.
  sudo --non-interactive rm -f \
    "${chroot_dir}/etc/nvme/hostid" \
    "${chroot_dir}/etc/nvme/hostnqn" \
    "${chroot_dir}/var/cache/apt/pkgcache.bin" \
    "${chroot_dir}/var/cache/apt/srcpkgcache.bin" \
    "${chroot_dir}/var/cache/ldconfig/aux-cache" \
    "${chroot_dir}/var/log/dpkg.log" \
    "${chroot_dir}/var/log/alternatives.log"
  # /var/log/apt/*.log: apt history/term logs, every line stamped with
  # wall-clock time of the build. Regenerated on first use.
  sudo --non-interactive rm -f "${chroot_dir}"/var/log/apt/*.log
  # /var/lib/apt/lists/*: downloaded apt index files. The signed
  # InRelease for each repo carries the repo's signing timestamp
  # (FastTrack re-signs every 24h or so; the local kicksecure repo
  # built by 2100_create-debian-packages stamps with reprepro's
  # wall-clock time). Regenerated on first `apt-get update`.
  # Keep `lock` and `partial/` so apt's own metadata structure
  # survives.
  sudo --non-interactive find "${chroot_dir}/var/lib/apt/lists" \
    -mindepth 1 -maxdepth 1 -not -name lock -not -name partial \
    -exec rm -rf {} + 2>/dev/null || true
  sudo --non-interactive find "${chroot_dir}/var/lib/dkms" \
    -mindepth 1 -type d -name log -prune -exec rm -rf {} + \
    2>/dev/null || true
  # Repack squashfs. -reproducible + -mkfs-time + -all-time together
  # pin every timestamp source mksquashfs knows about to
  # SOURCE_DATE_EPOCH, so the output is a pure function of the chroot
  # contents (which we've just made deterministic) plus our flags.
  new_sqfs=$(mktemp --suffix=.squashfs --tmpdir=/tmp silvermetal-rebuilt-XXXXXX)
  # mktemp only reserves the name: remove the empty placeholder so
  # mksquashfs creates the image from scratch instead of trying to
  # append to an existing file.
  sudo --non-interactive rm -f "${new_sqfs}"
  echo "post-process: repacking squashfs (this takes ~3-5 min)"
  sudo --non-interactive mksquashfs "${chroot_dir}" "${new_sqfs}" \
    -no-progress \
    -no-exports -no-xattrs -all-root \
    -reproducible \
    -mkfs-time "${SOURCE_DATE_EPOCH}" \
    -all-time "${SOURCE_DATE_EPOCH}" \
    -comp xz -b 1M -Xdict-size 100% \
    -no-recovery
  # Substitute the new squashfs into the ISO. xorriso's `-update`
  # rewrites just the named file then re-emits the ISO; -boot_image
  # any keep preserves the existing El Torito + GPT/UEFI bits so the
  # image stays bootable.
  #
  # -return_with SORRY 0: by default xorriso exits 32 when it raises
  # a SORRY-severity diagnostic, even when the write itself succeeded.
  # The post-write re-assessment of the new ISO produces one
  # unavoidably:
  #     libburn : SORRY : Read start address 525977s larger than
  #     number of readable blocks 506240
  # …because we just shrunk the ISO (smaller squashfs) and the
  # protective MBR header still records the *original* size. The
  # GPT and El Torito records inside are correct and self-consistent;
  # the protective MBR is vestigial and bootloaders don't consult its
  # size field. Demoting SORRY -> exit 0 lets xorriso warn but still
  # report success on the actually-completed write.
  local new_iso="${iso_file%.iso}.silvermetal-clean.iso"
  sudo --non-interactive rm -f "${new_iso}"
  echo "post-process: replacing /live/filesystem.squashfs in ISO"
  # Force every date xorriso writes into the ISO9660 structure to the
  # pinned epoch.
  #   -alter_date_r all: atime/mtime/btime on every file & dir
  #     (just `m` left btime drifting in run #4281: byte-identical
  #     squashfs, byte-identical TOC, but still-different ISO bytes).
  #   -volume_date c/m/x/f/u/s: every volume-descriptor date
  #     (creation, modification, expiration, effective, system area,
  #     path table). xorriso defaults to "now" for any not-explicitly-set
  #     volume date.
  # `--` terminates the variable-length path list of -alter_date_r;
  # without it the following -volume_date is consumed as a path and
  # xorriso bails with "Cannot find path '/-volume_date'" (run #4279).
  sudo --non-interactive xorriso \
    -return_with SORRY 0 \
    -indev "${iso_file}" \
    -outdev "${new_iso}" \
    -boot_image any keep \
    -update "${new_sqfs}" /live/filesystem.squashfs \
    -alter_date_r all "=${SOURCE_DATE_EPOCH}" / -- \
    -volume_date c "=${SOURCE_DATE_EPOCH}" \
    -volume_date m "=${SOURCE_DATE_EPOCH}" \
    -volume_date x "=${SOURCE_DATE_EPOCH}" \
    -volume_date f "=${SOURCE_DATE_EPOCH}" \
    -volume_date u "=${SOURCE_DATE_EPOCH}" \
    -volume_date s "=${SOURCE_DATE_EPOCH}" \
    -commit
  sudo --non-interactive mv -f "${new_iso}" "${iso_file}"
  sudo --non-interactive rm -f "${new_sqfs}"
  echo "post-process: ISO rebuilt with reproducible squashfs"
  sha256sum "${iso_file}"
}
post_process_for_reproducibility
# derivative-maker writes its outputs into ${HOME}/derivative-binary
# (per help-steps/variables: binary_build_folder_dist=$HOMEVAR/derivative-binary),
# *not* into the source tree. Collect from there into BUILD_DIR.
# Exact upstream output paths can shift between tags — keep this tolerant.
#
# stderr+exit suppression is essential: $HOME/derivative-binary contains
# the live-build chroot, and several of the chroot's own subdirs
# (/usr/src, /etc/sudoers.d, /etc/cron.*, /boot, /root, /run/sudo,
# cache/bootstrap/root, ...) are 0700 root-owned because the chroot
# creation step ran under sudo. As `user` (uid 1000) we can't traverse
# them. find emits "Permission denied" on each and exits non-zero;
# pipefail then kills the entire build script *after* the ISO has
# already been copied — exactly what happened on run #4271 (15:24
# clean derivative-maker run, ISO produced, build-inner died on this
# pipeline). Suppress and rely on build.sh's host-side
# "no *.iso in BUILD_DIR" check (exit 4) to surface a real miss.
find "${HOME}/derivative-binary" -maxdepth 6 -type f -name "*.iso" \
  -print0 2>/dev/null \
  | xargs -0 -I{} cp -av "{}" "${BUILD_DIR}/" || true
# Manifest of file metadata that lives inside the ISO. Useful when
# diagnosing reproducibility regressions without re-extracting.
find "${HOME}/derivative-binary" -maxdepth 6 -type f -name "*.manifest" \
  -print0 2>/dev/null \
  | xargs -0 -I{} cp -av "{}" "${BUILD_DIR}/" 2>/dev/null || true
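
The tolerant collection pattern used by the last two pipelines can be
sketched as a standalone function. This is an illustration only: the
name `collect_artifacts` is hypothetical, and it assumes a `find` and
`xargs` that support `-print0`, `-0` and `-r` (GNU findutils does).

```shell
# collect_artifacts SRC DEST PATTERN: copy every file under SRC matching
# PATTERN into DEST. Unreadable subdirectories make find print
# "Permission denied" and exit non-zero; both are suppressed so that a
# caller running under `set -o pipefail` is not killed after the copies
# have already succeeded.
collect_artifacts() {
    local src="$1" dest="$2" pattern="$3"
    find "$src" -maxdepth 6 -type f -name "$pattern" -print0 2>/dev/null \
        | xargs -0 -r -I{} cp -a -- {} "$dest/" || true
}
```

Because the function always returns 0, a genuine miss has to be caught
separately, for example by checking afterwards that DEST contains at
least one matching file.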