ea5adacac330420bb16bb260dd6bf5c6765dadb6
13 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
| 5918305fd7 |
fix(linux/build): find self via docker inspect, cgroupns hides cgroup path (M1.1 iter22)
iter21's /proc/self/cgroup approach hit:
build.sh: cgroup contents:
0::/
Empty path — act_runner runs job containers with cgroupns enabled, so
the in-container view of cgroup paths is rooted at the namespace, with
no trace of the host-side container ID. Same blocker as `hostname`.
The host docker daemon does know who we are, and we have its socket.
We're the only running container with /workspace/SilverLABS/SilverMetal
as a mount destination (concurrency: 1 in the workflow), so iterate
docker ps and match by mount destination. Found CID becomes the
--volumes-from argument; if no match, dump docker ps to the log and
fail loud.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
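The mount-destination lookup described in 5918305fd7 could look roughly like the following. This is a minimal sketch, not the actual `build.sh` code: the function name `find_cid_by_mount` is hypothetical, and it assumes the `docker` CLI can reach the host daemon through the mounted socket.

```shell
# Hypothetical helper: print the ID of the running container that has
# the given path as a mount destination; return non-zero if none does.
find_cid_by_mount() (
    dest="$1"
    for cid in $(docker ps -q --no-trunc); do
        # List every mount destination of this container, one per line,
        # and look for an exact whole-line match.
        if docker inspect --format '{{range .Mounts}}{{println .Destination}}{{end}}' "$cid" \
            | grep -Fxq -- "$dest"; then
            printf '%s\n' "$cid"
            exit 0
        fi
    done
    # No match: dump the container list to the log and fail loudly.
    docker ps >&2
    exit 1
)
```

A caller would pass the result on, for example `--volumes-from "$(find_cid_by_mount /workspace/SilverLABS/SilverMetal)"`. The match is only unambiguous while a single job container mounts that path, which the commit ties to `concurrency: 1` in the workflow.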
| 4a837e07ed |
fix(linux/build): discover job container ID from cgroup, not hostname (M1.1 iter21)
Run #4268's build-and-verify died <1s into Build A:
docker: Error response from daemon: No such container: docker
Cause: build.sh's CI path uses `--volumes-from "$(hostname)"` to inherit
the parent job container's /workspace mount, but in the new runner config
(network: host applied via the now-actually-loaded config.yaml) `hostname`
returns the literal string "docker" inside catthehacker/ubuntu:act-latest:
the image bakes that into /etc/hostname and act_runner doesn't override
it. So `--volumes-from docker` looks for a container literally named
"docker", finds nothing, exits.
This worked in earlier runs (#4260) only because config.yaml *wasn't being
loaded* (see iter18 commit), so the runner ran on its built-in defaults,
which kept the container's hostname as the auto-generated container ID.
Fixing config.yaml exposed this latent bug.
Right way to learn your own container ID inside a Linux container is
/proc/self/cgroup, which contains the 64-char hex ID on every cgroup
driver:
cgroup v1: 12:devices:/docker/<64-hex>
cgroup v2: 0::/system.slice/docker-<64-hex>.scope
awk extracts the first 64-hex run; that becomes the --volumes-from
argument. If extraction fails (would only happen on a non-docker runtime),
fail loud rather than silent.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
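The cgroup-based extraction from 4a837e07ed can be sketched as follows. The commit says `awk` does the extraction; this sketch swaps in `grep -oE` for brevity, and the function name `cid_from_cgroup` and its optional file argument are assumptions added for illustration. As the later iter22 commit notes, the lookup finds nothing when cgroupns hides the path (`0::/`), which is why the function returns non-zero in that case.

```shell
# Hypothetical helper: print the first run of 64 lowercase hex digits
# found in a cgroup file (default: /proc/self/cgroup); return non-zero
# when there is none.
cid_from_cgroup() (
    file="${1:-/proc/self/cgroup}"
    cid="$(grep -oE '[0-9a-f]{64}' "$file" | head -n 1)"
    # Fail loudly instead of handing an empty ID to --volumes-from.
    [ -n "$cid" ] || exit 1
    printf '%s\n' "$cid"
)
```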
| e260fe1c81 |
ci(linux/build): self-host the builder image build + iter16 reprepro wrap (M1.1)
Two coupled changes that unblock the M1.1 iter loop. Both belong in CI;
iter1-15 was wrong to require human-in-the-loop steps to make progress.
1. **CI now builds Dockerfile.builder.**
`.gitea/workflows/build-iso-linux.yaml` grows a `builder-image` job
that runs ahead of `build-and-verify`. It rebuilds the silvermetal-
builder image from `linux/build/docker/Dockerfile.builder`, pushes it
to `docker-registry.silverlabs.uk/silvermetal-builder:m1.1-<sha>` (and
`:latest`), reads the resulting digest off `docker inspect`, and
feeds it forward as a job output. `build-and-verify` consumes that
digest as the `BUILDER_IMAGE` env override that `build.sh` already
honours (and validates is digest-form on line ~37).
That kills the old workflow where every Dockerfile.builder change
required a human to `docker build` + `docker push` on 10.0.0.51 by
hand and then bump the digest in `build.sh` in lockstep. The crash
that triggered this (exit 126 mid-iter16 build run) was a symptom of
that off-CI step still existing.
Both jobs run on the existing `silvermetal-builder` runner; the host
docker daemon is shared via DooD and is already authenticated to
`docker-registry.silverlabs.uk` (linux/build/runner/docker-compose.yml
mounts `/root/.docker:/root/.docker:ro`), so no extra login step.
The hardcoded `BUILDER_IMAGE` digest in `build.sh` stays as the
local-developer / offline-rebuild fallback. Comments updated in
`build.sh`, `Dockerfile.builder`, and `linux/build/README.md` to
match the new flow.
2. **reprepro wrapper for the benign "No priority for X" case.**
Pinned derivative-maker's `2100_create-debian-packages` (with
--target iso) re-imports source packages from snapshot.debian.org
into a local apt repo via `reprepro --basedir … includedsc local
<foo>.dsc`. The local repo's `conf/distributions` ships no
`DscOverride` entries, so any source package whose `.dsc` lacks an
explicit Priority field trips:
No priority for 'X', skipping.
There have been errors!
…and reprepro exits 255. dm-reprepro-wrapper bubbles that up,
2100_create-debian-packages aborts. The current offender is
`virtualbox_*.dsc` (key import is now fine — debian-keyring landed in
commit
|
|||
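e260fe1c81 mentions that `build.sh` validates that the `BUILDER_IMAGE` override is in digest form. The real check is not shown in this log; a minimal sketch of such a validation, with the hypothetical function name `is_digest_form`, might be:

```shell
# Hypothetical check: succeed only for image references pinned by
# digest, i.e. <name>@sha256:<64 lowercase hex digits>.
is_digest_form() {
    printf '%s\n' "$1" | grep -Eq '^[^@[:space:]]+@sha256:[0-9a-f]{64}$'
}
```

A tag-only reference such as `silvermetal-builder:latest` fails this check, so a mutable tag cannot be used where a pinned digest is expected.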
| 4aa59ba633 |
fix(linux/build): non-interactive mode + visible output + key import (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 11m33s
Run #4260 cleared every harness layer and ran for 18 minutes (past
sanity-tests, prepare-build-machine, cowbuilder-setup, local-deps) into
2100_create-debian-packages, where it died on:
Could not check validity of signature with '92978A6E195E4921825F7FF0F34F09744E9F5DD9' in '/home/user/derivative-binary/temp_packages_debian_sid/virtualbox_7.2.8-dfsg-1.dsc' as public key missing!
…and then *also* hung the runner indefinitely because, on any error,
derivative-maker's exception_handler_general detected a TTY (we passed
`docker run -t`) and dropped into an interactive `read -p 'Answer? '`
prompt that nothing was ever going to answer. The orphan docker run in
turn orphaned the act_runner job container, blocking the runner until
manual cleanup.
Three coordinated fixes, validated end-to-end with docker-side smoke tests
on 10.0.0.51:
1. **Non-interactive mode without losing output visibility.**
The original architectural goal: keep derivative-maker out of interactive
mode (`[ -t 0 ]` must be false) AND keep the build log visible to docker
run / Gitea Actions (PTY needed somewhere). Resolution:
- `docker run -t` is kept (required for /dev/console to be a real PTY back
to docker), but no `-i`, so fd 0 stays /dev/null.
- docker-entrypoint.service: `StandardInput=tty-force` →
`StandardInput=null` so the service's fd 0 is /dev/null too. Verified
inside the container: `[ -t 0 ]` returns false.
- entrypoint.sh now wraps the user command with an explicit
`> /dev/console 2>&1` redirect before writing it to
/etc/docker-entrypoint-cmd. systemd's `StandardOutput=inherit` does NOT
propagate PID-1's stdout to services in this PID-1-systemd-in-container
topology; the service log was going nowhere visible. /dev/console under
`docker run -t` IS the allocated PTY back to docker, so the redirect
surfaces the log to the act_runner / Gitea Actions log.
- entrypoint.sh's `[ ! -t 0 ] && exit 1` guard removed (it would now
always trigger).
2. **debian-keyring for reprepro source-package signature checks.**
2100_create-debian-packages calls dm-reprepro-wrapper includedsc on every
.dsc in temp_packages_debian_sid (including virtualbox_*.dsc, even for
`--target iso`; see line 114 of that build step). reprepro verifies the
dsc signature against the user's GPG keyring; without the maintainer keys
it fails. Adds `debian-keyring` to Dockerfile.builder. build-inner.sh now
imports debian-keyring.gpg / debian-maintainers.gpg / debian-nonupload.gpg
into the user's keyring before running derivative-maker.
3. **BUILDER_IMAGE digest re-pinned.**
Built natively on 10.0.0.51 (per memory: never on WSL/aarch64). New
digest: sha256:2f680c96…f0db.
Smoke-test results (against this exact image):
==> START ← user output reaches docker stdout
(keyring present) ← debian-keyring imported successfully
STDIN_NOT_TTY ← derivative-maker WILL stay non-interactive
==> END ← clean shutdown
docker run exit: 42 ← exit code propagates correctly on failure
Files: Dockerfile.builder, systemd-entrypoint/entrypoint.sh,
systemd-entrypoint/docker-entrypoint.service, scripts/build.sh,
scripts/build-inner.sh.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
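The `[ -t 0 ]` condition that 4aa59ba633 relies on can be reproduced outside the container. A minimal sketch: the function name `stdin_mode` and the `STDIN_IS_TTY` label are hypothetical, while `STDIN_NOT_TTY` mirrors the smoke-test output quoted in the commit.

```shell
# Report whether file descriptor 0 (stdin) is attached to a terminal.
stdin_mode() {
    if [ -t 0 ]; then
        echo STDIN_IS_TTY
    else
        echo STDIN_NOT_TTY
    fi
}

# With stdin redirected from /dev/null the test is false, which is the
# state the commit wants derivative-maker to see.
stdin_mode </dev/null
```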
| 38ac4f8a96 |
fix(linux/build): systemd-in-container build host (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 15m34s
Run #4258 cleared the systemctl shim only to die two seconds later on the
*next* expectation derivative-maker has of a real systemd host: its
sources.list points at http://127.0.0.1:9977/debian (the approx
package-cache socket-activated by systemd) and apt-get update could not
reach the daemon because nothing was actually started by the no-op shim:
Err:1 http://127.0.0.1:9977/debian trixie InRelease
Could not connect to 127.0.0.1:9977 (127.0.0.1). - connect (111: Connection refused)
Whack-a-mole'ing each service derivative-maker tries to start (approx
today, then journald, then systemd-logind, then who-knows-what tomorrow)
is going to keep failing for a while: derivative-maker is fundamentally
designed for a real systemd-managed Debian host. The container pattern
upstream itself ships (linux/build/derivative-maker/docker/) runs systemd
as PID 1 inside the container; this commit adopts that approach.
Architecture:
- PID 1 in the build container is now systemd. Upstream's vendored
entrypoint.sh records the user-supplied command into
/etc/docker-entrypoint-cmd, captures env into /etc/docker-entrypoint-env,
masks irrelevant units, and execs systemd. systemd boots,
docker-entrypoint.service runs the command, docker-entrypoint-stop.sh
propagates the exit code via `systemctl exit <code>` so the container
exits with the right status.
- The four entrypoint files (entrypoint.sh, docker-entrypoint.service /
.target, docker-entrypoint-stop.sh) are vendored at
linux/build/docker/systemd-entrypoint/ rather than COPY'd from the
submodule path; Docker build context can only reach below itself, and
bumping is tracked in that dir's README.
- Container runtime now requires --cgroupns=host, --tmpfs /run,
--tmpfs /run/lock, and -v /sys/fs/cgroup:/sys/fs/cgroup:rw so systemd can
manage cgroups properly. -t allocates a TTY, satisfying entrypoint.sh's
`[ ! -t 0 ] && exit 1` check in CI where stdin is otherwise /dev/null.
- User renamed builder → user (uid 1000, passwordless sudo) to match
upstream's USER=user / HOME=/home/user convention. chown in build.sh now
uses uid 1000:1000 so it's name-agnostic.
- Image package list grew to match upstream's
derivative-maker-docker-setup (sq stack + dbus + approx + the rest) plus
our ISO toolchain (live-build / debootstrap / xorriso / squashfs-tools /
etc.). Snapshot.debian.org pinning is preserved (same APT_SNAPSHOT_URL,
two-phase install pattern).
Verified:
Smoke test on 10.0.0.51 (`docker run --rm --privileged --cgroupns=host --tmpfs /run --tmpfs /run/lock -v /sys/fs/cgroup:...:rw -t <image> /bin/bash -c 'echo OK'`)
booted systemd, ran the command via docker-entrypoint.service, captured
the output, shut down filesystems and exited cleanly.
build.sh BUILDER_IMAGE pin → sha256:dc9dd29d…8811. Image rebuilt natively
on 10.0.0.51, pushed to docker-registry.silverlabs.uk.
The systemctl shim is removed by virtue of the Dockerfile rewrite; real
systemd makes it unnecessary. The previous "iter6 / iter7" intermediate
digests stay in the registry until we GC; the live one is
m1.1-iter8-systemd.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
| 7058fb775c |
fix(linux/build): add systemctl no-op shim for the build container (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 2m20s
Run #4257 cleared sanity-tests entirely (sq-git verification of every
submodule signature: ✅; tag/uncommitted relaxation: ✅) and reached
1200_prepare-build-machine, where it died:
+ sudo systemctl daemon-reload
sudo: systemctl: command not found
ERROR detected in script!: ././build-steps.d/1200_prepare-build-machine
derivative-maker assumes systemd is PID 1 on the build host. Upstream's
own container (linux/build/derivative-maker/docker/) runs systemd-as-init
via an entrypoint that masks irrelevant units and declares its own. We
don't want that surgery for M1.1: it pulls in cgroup mounts,
--cgroupns=host, and a much bigger debugging surface.
Shim approach instead: install /usr/local/bin/systemctl that logs the
attempt to stderr and exits 0. /usr/local/bin precedes /usr/bin in both
default $PATH and sudo's secure_path, so it satisfies any systemctl call
regardless of whether the real binary later gets pulled in by a package
install. Standard pattern for systemd-aware Debian build scripts in
transient containers.
Risk if it doesn't suffice: the shim makes daemon-reload / restart / mask
calls succeed, but doesn't actually run any service. If a later build step
depends on (say) approx actually being up to serve cached debs, we'll see
the next failure and decide whether to escalate to real
systemd-in-container or skip the relevant build step.
Changes:
- Dockerfile.builder: add the shim with a brief log line to stderr;
comment block documents the trade-off.
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:70f160ab…5460 (built
natively on 10.0.0.51, shim verified working with
`docker run … systemctl daemon-reload` returning 0).
Verified: shim emits "systemctl-shim: daemon-reload" to stderr and exits
0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
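The no-op shim from 7058fb775c is described only in prose. A minimal sketch of what such a shim could look like follows; the installer function `install_systemctl_shim` and its directory argument are hypothetical (the commit installs to /usr/local/bin from Dockerfile.builder), while the `systemctl-shim:` log prefix matches the verification line in the commit message.

```shell
# Hypothetical installer: write a no-op systemctl stand-in into the
# given directory. The shim logs the attempted call to stderr and
# always reports success; it never starts or stops any service.
install_systemctl_shim() (
    dir="$1"
    mkdir -p "$dir"
    cat > "$dir/systemctl" <<'EOF'
#!/bin/sh
echo "systemctl-shim: $*" >&2
exit 0
EOF
    chmod +x "$dir/systemctl"
)
```

Because the shim only pretends to succeed, anything that needs a service to be running still fails later, which is the limitation the follow-up commit 38ac4f8a96 ran into.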
| 2a163bb9e7 |
fix(linux/build): install sq-git/Sequoia stack for derivative-maker (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m21s
Run #4255 reached deeper into 1100_sanity-tests, finished its apt-get
phase, and then died at the supply-chain verification step:
/workspace/.../help-steps/git_sanity_test: line 184: sq-git: command not found
ERROR: sq-git verification failed: main repo
INFO: If this is intentional, configure your own sq-git policy file. See 'buildconfig.d/30_signing_key.conf'.
derivative-maker uses sq-git (sequoia-git) to authenticate the commit
chain against an OpenPGP policy file before building. The policy file
itself ships in the upstream repo (./openpgp-policy.toml) and the
trust-root defaults are correctly configured by help-steps/variables (line
232 + 290) for non-redistributable builds, i.e. the verification machinery
is fully wired and just needs the binary.
Aligns with the upstream container's package list at
linux/build/derivative-maker/docker/derivative-maker-docker-setup.
Changes:
- Dockerfile.builder: add sq, sqv, sqop, sequoia-git,
sequoia-chameleon-gnupg, gpg-agent. All available in trixie main.
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:c1490bab…5c97
(rebuilt on 10.0.0.51, sq-git binary verified present at
/usr/bin/sq-git).
No reproducibility implications: image rebuilds against the same pinned
snapshot timestamp.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
| 433eb18947 |
fix(linux/build): bump builder base bookworm → trixie (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m19s
Run #4254 finally got past every harness issue and into
derivative-maker's actual sanity-tests, where it died with:
You are attempting to build on an unsupported operating system or version.
detected operating system codename: 'bookworm'
expected operating system codename: 'trixie'
The pinned derivative-maker tag (18.1.7.4-developers-only) requires Debian
13 (trixie) as the build host. Upstream's own
linux/build/derivative-maker/docker/Dockerfile uses
`FROM debian:trixie-slim`. We picked bookworm originally and the tag
mismatch wasn't caught until the build actually ran.
Changes:
- Dockerfile.builder: FROM debian:bookworm-slim → debian:trixie-slim @
sha256:cedb1ef4…2c5a (resolved 2026-05-07 on the runner host).
sources.list suite names follow: `bookworm` → `trixie`,
`bookworm-security` → `trixie-security`. snapshot.debian.org pin
(20260415T000000Z) is unchanged; snapshots are date-keyed, so the same
timestamp resolves trixie's dists/.
- silvermetal-base.conf: DERIVATIVE_DIST `bookworm` → `trixie` for
consistency (the value isn't passed to derivative-maker, as there's no
--dist option, but it's referenced by the build.sh prologue and we
shouldn't have a stale codename floating around).
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:7d893178…1890
(rebuilt natively on 10.0.0.51 against the new base, pushed).
The reproducibility guarantee is unchanged in shape: same snapshot
timestamp, same source-date-epoch derivation, just a different stable host
OS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
| b20e568b19 |
fix(linux/build): run derivative-maker as unprivileged builder user (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m14s
Run #4251 advanced past checkout and into derivative-maker, then died
immediately:
ERROR: This must NOT be run as root (sudo)!
ERROR: Exiting ./derivative-maker with non-zero exit code 1. Errors Detected: 0. Execution Time: 00:00:00.
Kicksecure's derivative-maker explicitly refuses to run as root; it
expects a regular user with passwordless sudo and uses sudo internally for
the privileged operations (debootstrap, mksquashfs, chroot mounts). Our
minimal debian-slim builder image had a `builder` user (uid 1000) but no
sudo, no sudoers entry, and the container ran as root.
Aligns with the upstream Kicksecure container pattern at
linux/build/derivative-maker/docker/derivative-maker-docker-setup (uses
USER=user with `${USER} ALL=(ALL) NOPASSWD:ALL`).
Changes:
- Dockerfile.builder: install `sudo` (and `fakeroot` while we're here:
upstream sanity-tests pulls this in via apt at build time, but having it
baked avoids a snapshot.debian.org round-trip every run); add passwordless
sudoers entry for builder; correct the misleading comment that claimed
root was needed.
- New scripts/build-inner.sh: the inner derivative-maker invocation pulled
out of build.sh's heredoc. Once we needed to drop privileges via runuser,
the nested-heredoc / nested-quoting situation became unmaintainable; a
regular script with normal quoting is far cleaner.
- build.sh: inner heredoc now just chowns the workspace to builder and
runuser's into build-inner.sh. ${REPO_ROOT} and ${BUILD_DIR} continue to
be forwarded into the container via -e.
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:f8f0db37…1bedc
(rebuilt and pushed natively on 10.0.0.51, never on the WSL/aarch64 dev
box, see reference_silvermetal_runner.md memory).
Verified: bash -n on both scripts; image builds and pushes cleanly.
Pushing this commit triggers a fresh CI run that will exercise it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
| 1d0e58739c |
fix(linux/build): handle DooD bind-mount in CI (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m18s
build.sh ran fine locally but failed in Gitea Actions on the first
reproducibility-gated run (#4250) with:
bash: line 3: /work/linux/build/config/silvermetal-base.conf: No such file or directory
Root cause: classic Docker-out-of-Docker confusion. build.sh runs inside
the act_runner job container, which talks to the host's docker daemon via
the mounted /var/run/docker.sock. The "-v ${REPO_ROOT}:/work" flag was
being interpreted by the host daemon against the host filesystem, where
/workspace/SilverLABS/SilverMetal does not exist; docker silently
auto-created an empty dir there and mounted that as /work, so the config
source target was missing.
Fix: detect GITHUB_ACTIONS and use --volumes-from "$(hostname)" in CI to
inherit the parent job container's /workspace mount intact. Locally we
keep a bind mount, but use the same path inside and outside
(${REPO_ROOT}:${REPO_ROOT}) so the inner heredoc is identical in both
modes. Inner script now references "${REPO_ROOT}/..." and
"${BUILD_DIR}/..." instead of the synthetic /work and /out paths.
No reproducibility implications: bind topology doesn't affect bytes inside
the ISO.
Verified locally: bash -n passes; structural change only, behaviour
preserved for the non-CI path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
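The CI-versus-local mount selection described in 1d0e58739c can be sketched like this. It is an illustration under assumptions, not the code from `build.sh`: the function name `workspace_mount_args` is hypothetical, and the job container ID is passed in as a parameter because the later iter21 and iter22 commits replace the `hostname`-based lookup used at this point in the history.

```shell
# Hypothetical helper: print the docker-run mount arguments.
#   $1: repository root
#   $2: ID or name of the parent job container (only used in CI)
workspace_mount_args() {
    if [ -n "${GITHUB_ACTIONS:-}" ]; then
        # CI (Docker-out-of-Docker): the host daemon cannot see the job
        # container's paths, so inherit its mounts instead.
        printf '%s\n' "--volumes-from $2"
    else
        # Local: bind-mount with the same path inside and outside so
        # the inner script is identical in both modes.
        printf '%s\n' "-v $1:$1"
    fi
}
```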
| eae2b98906 |
fix(linux/build): re-pin BUILDER_IMAGE to amd64 registry digest
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 11s
Two corrections to
|
|||
| f9e606d22d |
fix(linux/build): pin BUILDER_IMAGE to pushed registry digest (M1.1)
Image built from Dockerfile.builder@36f7672 was pushed to both
docker-registry:5000 (internal) and docker-registry.silverlabs.uk
(external) under tags m1.1-bootstrap + latest. Both URLs serve the same
registry, so the manifest digest is identical:
sha256:cedef039425e0b0f5901c1023eda820c7aa38ab4b81c2bb1e12d64cadb3d6c85
Default points at the internal hostname for CI; external dev overrides via
BUILDER_IMAGE env var.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
| 4444dc11f3 |
feat(linux/build): scaffold reproducible ISO build pipeline (M1.1)
Vendors Kicksecure derivative-maker as a pinned submodule (18.1.7.4), adds the wrapper + verify + diagnose scripts, the pinned builder image, and the reproducibility-gated Gitea Actions workflow. Base flavour only — no hardening overlay (that's M1.2). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |