Commit Graph

9 Commits

Author SHA1 Message Date
38ac4f8a96 fix(linux/build): systemd-in-container build host (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 15m34s
Run #4258 cleared the systemctl shim only to die two seconds later on
the *next* expectation derivative-maker has of a real systemd host:
its sources.list points at http://127.0.0.1:9977/debian (the approx
package-cache socket-activated by systemd) and apt-get update could
not reach the daemon because nothing was actually started by the
no-op shim:

    Err:1 http://127.0.0.1:9977/debian trixie InRelease
      Could not connect to 127.0.0.1:9977 (127.0.0.1).
      - connect (111: Connection refused)

Whack-a-mole'ing each service derivative-maker tries to start (approx
today, then journald, then systemd-logind, then who-knows-what
tomorrow) is going to keep failing for a while — derivative-maker is
fundamentally designed for a real systemd-managed Debian host. The
container pattern upstream itself ships
(linux/build/derivative-maker/docker/) runs systemd as PID 1 inside
the container; this commit adopts that approach.

Architecture:

  - PID 1 in the build container is now systemd. Upstream's vendored
    entrypoint.sh records the user-supplied command into
    /etc/docker-entrypoint-cmd, captures env into
    /etc/docker-entrypoint-env, masks irrelevant units, and execs
    systemd. systemd boots, docker-entrypoint.service runs the
    command, docker-entrypoint-stop.sh propagates the exit code via
    `systemctl exit <code>` so the container exits with the right
    status.

  - The four entrypoint files (entrypoint.sh,
    docker-entrypoint.service / .target, docker-entrypoint-stop.sh)
    are vendored at linux/build/docker/systemd-entrypoint/ rather
    than COPY'd from the submodule path — Docker build context can
    only reach below itself, and bumping is tracked in that dir's
    README.

  - Container runtime now requires --cgroupns=host, --tmpfs /run,
    --tmpfs /run/lock, and -v /sys/fs/cgroup:/sys/fs/cgroup:rw so
    systemd can manage cgroups properly. -t allocates a TTY,
    satisfying entrypoint.sh's `[ ! -t 0 ] && exit 1` check in CI
    where stdin is otherwise /dev/null.

  - User renamed builder → user (uid 1000, passwordless sudo) to
    match upstream's USER=user / HOME=/home/user convention. chown
    in build.sh now uses uid 1000:1000 so it's name-agnostic.

  - Image package list grew to match upstream's
    derivative-maker-docker-setup (sq stack + dbus + approx + the
    rest) plus our ISO toolchain (live-build / debootstrap / xorriso
    / squashfs-tools / etc.). Snapshot.debian.org pinning is
    preserved (same APT_SNAPSHOT_URL, two-phase install pattern).

Verified:

  Smoke test on 10.0.0.51 — `docker run --rm --privileged
  --cgroupns=host --tmpfs /run --tmpfs /run/lock -v /sys/fs/cgroup:...:rw
  -t <image> /bin/bash -c 'echo OK'` — booted systemd, ran the
  command via docker-entrypoint.service, captured the output, shut
  down filesystems and exited cleanly.

build.sh BUILDER_IMAGE pin → sha256:dc9dd29d…8811. Image rebuilt
natively on 10.0.0.51, pushed to docker-registry.silverlabs.uk.

The systemctl shim is removed by virtue of the Dockerfile rewrite —
real systemd makes it unnecessary. The previous "iter6 / iter7"
intermediate digests stay in the registry until we GC; the live one
is m1.1-iter8-systemd.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:06:47 +01:00
7058fb775c fix(linux/build): add systemctl no-op shim for the build container (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 2m20s
Run #4257 cleared sanity-tests entirely (sq-git verification of every
submodule signature: ; tag/uncommitted relaxation: ) and reached
1200_prepare-build-machine, where it died:

    + sudo systemctl daemon-reload
    sudo: systemctl: command not found
    ERROR detected in script!: ././build-steps.d/1200_prepare-build-machine

derivative-maker assumes systemd is PID 1 on the build host. Upstream's
own container (linux/build/derivative-maker/docker/) runs
systemd-as-init via an entrypoint that masks irrelevant units and
declares its own. We don't want that surgery for M1.1 — it pulls in
cgroup mounts, --cgroupns=host, and a much bigger debugging surface.

Shim approach instead: install /usr/local/bin/systemctl that logs the
attempt to stderr and exits 0. /usr/local/bin precedes /usr/bin in
both default $PATH and sudo's secure_path, so it satisfies any
systemctl call regardless of whether the real binary later gets pulled
in by a package install. Standard pattern for systemd-aware Debian
build scripts in transient containers.

Risk if it doesn't suffice: the shim makes daemon-reload / restart /
mask calls succeed, but doesn't actually run any service. If a later
build step depends on (say) approx actually being up to serve cached
debs, we'll see the next failure and decide whether to escalate to
real systemd-in-container or skip the relevant build step.

Changes:
- Dockerfile.builder: add the shim with a brief log line to stderr;
  comment block documents the trade-off.
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:70f160ab…5460
  (built natively on 10.0.0.51, shim verified working with
  `docker run … systemctl daemon-reload` returning 0).

Verified: shim emits "systemctl-shim: daemon-reload" to stderr and
exits 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:45:13 +01:00
2a163bb9e7 fix(linux/build): install sq-git/Sequoia stack for derivative-maker (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m21s
Run #4255 reached deeper into 1100_sanity-tests, finished its apt-get
phase, and then died at the supply-chain verification step:

    /workspace/.../help-steps/git_sanity_test: line 184: sq-git: command not found
    ERROR: sq-git verification failed: main repo
    INFO: If this is intentional, configure your own sq-git policy file.
          See 'buildconfig.d/30_signing_key.conf'.

derivative-maker uses sq-git (sequoia-git) to authenticate the commit
chain against an OpenPGP policy file before building. The policy file
itself ships in the upstream repo (./openpgp-policy.toml) and the
trust-root defaults are correctly configured by help-steps/variables
(line 232 + 290) for non-redistributable builds — i.e. the verification
machinery is fully wired and just needs the binary.

Aligns with the upstream container's package list at
linux/build/derivative-maker/docker/derivative-maker-docker-setup.

Changes:
- Dockerfile.builder: add sq, sqv, sqop, sequoia-git,
  sequoia-chameleon-gnupg, gpg-agent. All available in trixie main.
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:c1490bab…5c97
  (rebuilt on 10.0.0.51, sq-git binary verified present at /usr/bin/sq-git).

No reproducibility implications — image rebuilds against the same
pinned snapshot timestamp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:31:03 +01:00
433eb18947 fix(linux/build): bump builder base bookworm → trixie (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m19s
Run #4254 finally got past every harness issue and into derivative-
maker's actual sanity-tests, where it died with:

    You are attempting to build on an unsupported operating system or version.
    detected operating system codename: 'bookworm'
    expected operating system codename: 'trixie'

The pinned derivative-maker tag (18.1.7.4-developers-only) requires
Debian 13 (trixie) as the build host. Upstream's own
linux/build/derivative-maker/docker/Dockerfile uses
`FROM debian:trixie-slim`. We picked bookworm originally and the tag
mismatch wasn't caught until the build actually ran.

Changes:

- Dockerfile.builder: FROM debian:bookworm-slim →
  debian:trixie-slim @ sha256:cedb1ef4…2c5a (resolved 2026-05-07 on
  the runner host). sources.list suite names follow:
  `bookworm` → `trixie`, `bookworm-security` → `trixie-security`.
  snapshot.debian.org pin (20260415T000000Z) is unchanged — snapshots
  are date-keyed, so the same timestamp resolves trixie's dists/.
- silvermetal-base.conf: DERIVATIVE_DIST `bookworm` → `trixie` for
  consistency (the value isn't passed to derivative-maker — there's
  no --dist option — but it's referenced by the build.sh prologue
  and we shouldn't have a stale codename floating around).
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:7d893178…1890
  (rebuilt natively on 10.0.0.51 against the new base, pushed).

The reproducibility guarantee is unchanged in shape — same snapshot
timestamp, same source-date-epoch derivation, just a different stable
host OS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:25:40 +01:00
b20e568b19 fix(linux/build): run derivative-maker as unprivileged builder user (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m14s
Run #4251 advanced past checkout and into derivative-maker, then died
immediately:

    ERROR: This must NOT be run as root (sudo)!
    ERROR: Exiting ./derivative-maker with non-zero exit code 1.
           Errors Detected: 0. Execution Time: 00:00:00.

Kicksecure's derivative-maker explicitly refuses to run as root — it
expects a regular user with passwordless sudo and uses sudo internally
for the privileged operations (debootstrap, mksquashfs, chroot mounts).
Our minimal debian-slim builder image had a `builder` user (uid 1000)
but no sudo, no sudoers entry, and the container ran as root.

Aligns with the upstream Kicksecure container pattern at
linux/build/derivative-maker/docker/derivative-maker-docker-setup
(uses USER=user with `${USER} ALL=(ALL) NOPASSWD:ALL`).

Changes:
- Dockerfile.builder: install `sudo` (and `fakeroot` while we're here —
  upstream sanity-tests pulls this in via apt at build time, but having
  it baked avoids a snapshot.debian.org round-trip every run); add
  passwordless sudoers entry for builder; correct the misleading
  comment that claimed root was needed.
- New scripts/build-inner.sh: the inner derivative-maker invocation
  pulled out of build.sh's heredoc. Once we needed to drop privileges
  via runuser, the nested-heredoc / nested-quoting situation became
  unmaintainable; a regular script with normal quoting is far cleaner.
- build.sh: inner heredoc now just chowns the workspace to builder and
  runuser's into build-inner.sh. ${REPO_ROOT} and ${BUILD_DIR} continue
  to be forwarded into the container via -e.
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:f8f0db37…1bedc
  (rebuilt and pushed natively on 10.0.0.51 — never on the WSL/aarch64
  dev box, see reference_silvermetal_runner.md memory).

Verified: bash -n on both scripts; image builds and pushes cleanly.
Pushing this commit triggers a fresh CI run that will exercise it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:09:42 +01:00
1d0e58739c fix(linux/build): handle DooD bind-mount in CI (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m18s
build.sh ran fine locally but failed in Gitea Actions on the first
reproducibility-gated run (#4250) with:

    bash: line 3: /work/linux/build/config/silvermetal-base.conf:
    No such file or directory

Root cause: classic Docker-out-of-Docker confusion. build.sh runs
inside the act_runner job container, which talks to the host's docker
daemon via the mounted /var/run/docker.sock. The "-v ${REPO_ROOT}:/work"
flag was being interpreted by the host daemon against the host
filesystem, where /workspace/SilverLABS/SilverMetal does not exist;
docker silently auto-created an empty dir there and mounted that as
/work, so the config source target was missing.

Fix: detect GITHUB_ACTIONS and use --volumes-from "$(hostname)" in CI
to inherit the parent job container's /workspace mount intact. Locally
we keep a bind mount, but use the same path inside and outside
(${REPO_ROOT}:${REPO_ROOT}) so the inner heredoc is identical in both
modes. Inner script now references "${REPO_ROOT}/..." and
"${BUILD_DIR}/..." instead of the synthetic /work and /out paths.

No reproducibility implications — bind topology doesn't affect bytes
inside the ISO.

Verified locally: bash -n passes; structural change only, behaviour
preserved for the non-CI path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:01:06 +01:00
eae2b98906 fix(linux/build): re-pin BUILDER_IMAGE to amd64 registry digest
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 11s
Two corrections to f9e606d:

1. Registry hostname: docker-registry:5000 isn't DNS-resolvable on the
   SLAB docker host (verified). The fleet-wide convention is the canonical
   docker-registry.silverlabs.uk URL, registered as an insecure-registry
   in /etc/docker/daemon.json on every docker host.

2. Architecture: the original push from WSL2-on-aarch64 produced an arm64
   image that won't run on the amd64 runner. Rebuilt natively on the docker
   host. New manifest digest (amd64-only):
     sha256:9e7161f9f180483f434074d7f32c27c907955232bd0c44efe6dc0ee1d9e56ae0

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 11:59:52 +01:00
f9e606d22d fix(linux/build): pin BUILDER_IMAGE to pushed registry digest (M1.1)
Image built from Dockerfile.builder@36f7672 was pushed to both
docker-registry:5000 (internal) and docker-registry.silverlabs.uk
(external) under tags m1.1-bootstrap + latest. Both URLs serve the
same registry, so the manifest digest is identical:

  sha256:cedef039425e0b0f5901c1023eda820c7aa38ab4b81c2bb1e12d64cadb3d6c85

Default points at the internal hostname for CI; external dev overrides
via BUILDER_IMAGE env var.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 11:48:48 +01:00
4444dc11f3 feat(linux/build): scaffold reproducible ISO build pipeline (M1.1)
Vendors Kicksecure derivative-maker as a pinned submodule (18.1.7.4),
adds the wrapper + verify + diagnose scripts, the pinned builder image,
and the reproducibility-gated Gitea Actions workflow. Base flavour only —
no hardening overlay (that's M1.2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 04:25:48 +01:00