Commit Graph

35 Commits

Author SHA1 Message Date
5e5026088d fix(linux/build): terminate xorriso -alter_date_r path list with -- (M1.1 iter32)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 37m14s
Run #4279 hit:

    xorriso : FAILURE : Cannot find path '/-volume_date' in loaded ISO image
    xorriso : aborting : -abort_on 'FAILURE' encountered 'FAILURE'

`-alter_date_r type timestring iso_rr_path [***]` takes a
variable-length path list. xorriso terminates that list either at the
end of the command line or at a literal `--`. Without the terminator,
xorriso consumes the next intended option (`-volume_date`) as another
path whose mtime it should set. The lookup fails because the image has
no node called `/-volume_date`, and the FAILURE-severity event
propagates to a hard exit.

Add `--` after the `/` argument to close the path list. The
`-volume_date c` and `-volume_date m` settings that follow are then
parsed as commands rather than paths and take effect as expected.
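
A minimal sketch of the corrected argument order, assuming the fixups are assembled by a helper (the function name `xorriso_fixup_args` is hypothetical and not taken from build-inner.sh). It only prints the arguments, one per line, so the position of the `--` terminator is easy to inspect:

```shell
# Hypothetical helper: print the post-update fixup arguments, one per line.
# $1 is the pinned epoch (SOURCE_DATE_EPOCH in the real build).
xorriso_fixup_args() {
  epoch=$1
  # The literal "--" closes -alter_date_r's variable-length path list,
  # so the following -volume_date words are parsed as commands again.
  printf '%s\n' \
    -alter_date_r m "=${epoch}" / -- \
    -volume_date c "=${epoch}" \
    -volume_date m "=${epoch}"
}
```

The `--` sits directly after `/`, which is the only path in the list.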

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 23:10:02 +01:00
d354040bd6 fix(linux/build): scrub apt/ldconfig caches + force xorriso mtimes (M1.1 iter31)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 17m44s
Run #4278 with iter30's chroot scrub still produced different ISOs.
The diagnostic was clean and pointed at a tight set of remaining
divergences:

* Inside the squashfs, three files differed:
    /var/cache/apt/pkgcache.bin
    /var/cache/apt/srcpkgcache.bin
    /var/cache/ldconfig/aux-cache
  — all post-install binary caches with internal pointers/timestamps
  that vary across runs. Standard reproducible-Debian practice is to
  drop them; `apt` regenerates pkgcache on first `apt-get update` (and
  implicitly when anything else needs it), and ldconfig regenerates
  aux-cache on its next run.

* In the outer ISO TOC:
    /boot.catalog        mtime  May  7 21:27   vs   May  7 21:44
    /live/filesystem.squashfs   May  7 21:27   vs   May  7 21:44
  — xorriso's `-update` and the boot-catalog rewrite were stamping
  files with wall-clock time, not SOURCE_DATE_EPOCH.

Two additions to post_process_for_reproducibility:

1. Three more entries in the chroot rm list (apt's two pkgcaches
   and ldconfig aux-cache).

2. xorriso post-update fixups:
     -alter_date_r m "=${SOURCE_DATE_EPOCH}" /
     -volume_date c "=${SOURCE_DATE_EPOCH}"
     -volume_date m "=${SOURCE_DATE_EPOCH}"
   set every file's mtime in the ISO and both volume-descriptor
   dates to the pinned epoch. (`=N` is xorriso's syntax for a
   literal decimal epoch.)
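
The first addition could look roughly like the sketch below. The function name and the way the chroot path is passed in are assumptions for illustration; only the three file paths come from the diff listed above.

```shell
# Hypothetical helper: drop caches that apt and ldconfig regenerate themselves.
# $1 is the root of the unpacked chroot.
scrub_regenerable_caches() {
  chroot_dir=$1
  for rel in \
    var/cache/apt/pkgcache.bin \
    var/cache/apt/srcpkgcache.bin \
    var/cache/ldconfig/aux-cache
  do
    # -f keeps the scrub idempotent when a cache was never created.
    rm -f -- "${chroot_dir}/${rel}"
  done
}
```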

Assuming diffoscope's report for run #4278 was complete (three file
diffs in the squashfs plus the squashfs metadata size delta, with the
TOC diff reduced to just the two mtime lines), this should clear M1.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 22:50:28 +01:00
84179b3642 fix(linux/build): xorriso -return_with SORRY 0 to tolerate MBR size warning (M1.1 iter30)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 37m27s
iter29 wired up the chroot scrub + squashfs rebuild + ISO patch.
Run #4277 confirmed every actual operation succeeded:

    Updating '/tmp/silvermetal-rebuilt-MFqm7S.squashfs' to '/live/filesystem.squashfs'
    xorriso : UPDATE : Added/overwrote '/live/filesystem.squashfs'  (899m)
    Differences detected and updated. (runtime 0.5 s)
    xorriso : NOTE : Keeping boot image unchanged
    ISO image produced: 506049 sectors
    Writing to '...silvermetal-clean.iso' completed successfully.

…then xorriso re-assessed the freshly-written ISO and raised:

    libburn : SORRY : Read start address 525977s larger than
      number of readable blocks 506240
    libisofs: NOTE : Found Protective MBR with size range larger
      than the medium capacity
    xorriso : NOTE : Tolerated problem event of severity 'SORRY'
    xorriso : NOTE : -return_with SORRY 32 triggered by problem
      severity SORRY

The protective MBR still records the *original* ISO size (525977
sectors), but our replacement squashfs is smaller, so the new image
has only 506240 readable blocks. The protective MBR is purely a
compatibility shim for tools that don't understand GPT. Bootloaders
consult the GPT and El Torito tables, both of which are
self-consistent in the new ISO, so the diagnostic is benign.

xorriso's default `-return_with SORRY 32` made it exit 32, which `set
-e` in build-inner.sh propagated up, killing the build. Add
`-return_with SORRY 0` to the post-process xorriso invocation: keep
the warning visible in the log but accept a SORRY as exit-zero given
the operation reported `completed successfully` for the write itself.

Note: the relaxation is scoped to the post-process xorriso call
*only*. Any other xorriso invocation upstream in derivative-maker
keeps xorriso's default strictness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 22:09:55 +01:00
10e099fcf9 fix(linux/build): scrub nvme/hostid + dkms logs, rebuild squashfs (M1.1 iter29)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 17m56s
Run #4276's diffoscope (now actually working — see iter28) pinned the
M1.1 reproducibility failure to exactly two files inside the rootfs
squashfs:

    /etc/nvme/hostid
        - c5867514-b138-4bfc-a2ae-f801d05a3606
        + 62e3fae3-692d-4451-ab04-353e27547806
    /var/lib/dkms/tirdad/0.1/<kver>/x86_64/log/make.log
        - Thu May  7 20:23:04 UTC 2026
        + Thu May  7 20:39:14 UTC 2026
        - # elapsed time: 00:00:01
        + # elapsed time: 00:00:00

Inner squashfs file sizes differed by 4 bytes (983547059 vs 983547063);
the outer ISO size matched because squashfs pads to block boundaries.
Both files come from upstream Debian package postinsts that run inside
the live-build chroot:

  * nvme-cli's postinst calls `nvme gen-hostnqn` and writes a fresh
    random UUID to /etc/nvme/hostid the first time it's installed.
    Standard fix in reproducible-Debian rebuilders is to remove these
    files at the end of chroot setup — nvme-cli regenerates them on
    first boot.
  * DKMS captures wall-clock build times in its module make.log. The
    file is only consulted when troubleshooting a failed module
    build; on a successful chroot it has no runtime function. Drop
    /var/lib/dkms/<…>/log/ entirely.

Both fixes have to land *inside* the chroot before mksquashfs seals
it. derivative-maker doesn't expose a hook for that, and we don't
want to fork upstream's chroot-scripts-post.d, so build-inner.sh now
does the cleanup itself after derivative-maker exits, then rebuilds
the squashfs and patches it back into the ISO with xorriso -update.
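
A sketch of what that cleanup might look like, assuming it runs against the unpacked chroot directory after derivative-maker exits. The function name is hypothetical; the paths are the two offenders from the diffoscope output above.

```shell
# Hypothetical helper: remove per-build residue left by package postinsts.
# $1 is the root of the unpacked chroot.
scrub_postinst_residue() {
  chroot_dir=$1
  # Random UUID written by nvme-cli's postinst.
  rm -f -- "${chroot_dir}/etc/nvme/hostid"
  # DKMS build logs carry wall-clock timestamps; drop every log/ directory.
  if [ -d "${chroot_dir}/var/lib/dkms" ]; then
    find "${chroot_dir}/var/lib/dkms" -depth -type d -name log \
      -exec rm -rf -- {} +
  fi
}
```

`-depth` makes find visit a directory's contents before the directory itself, so it does not try to descend into a `log/` directory that has already been removed.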

mksquashfs flags chosen for max determinism:
  -reproducible -mkfs-time $SOURCE_DATE_EPOCH -all-time $SOURCE_DATE_EPOCH
  -no-exports -no-xattrs -all-root -no-recovery
  -comp xz -b 1M -Xdict-size 100%
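
For reference, the flag set above can be kept in one place with a small helper like the following. This is a sketch only: the function name is hypothetical, and it prints the flags rather than running mksquashfs.

```shell
# Hypothetical helper: print the determinism-oriented mksquashfs flags,
# one per line. $1 is the pinned epoch (SOURCE_DATE_EPOCH in the real build).
mksquashfs_repro_args() {
  epoch=$1
  printf '%s\n' \
    -reproducible -mkfs-time "$epoch" -all-time "$epoch" \
    -no-exports -no-xattrs -all-root -no-recovery \
    -comp xz -b 1M -Xdict-size 100%
}
```

None of the flags contain whitespace, so the output can be word-split directly onto a mksquashfs command line after the source and destination arguments.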

xorriso -update swaps just /live/filesystem.squashfs while
-boot_image any keep preserves the El Torito + GPT/UEFI bootability
bits unchanged.

Adds ~5-7 minutes per build (mksquashfs of ~1 GiB chroot + xorriso
ISO rewrite) but should be the final blocker between us and the M1.1
reproducibility gate passing. With both files scrubbed, two
independent runs from the same commit should produce byte-identical
squashfs payloads, byte-identical ISOs, and byte-identical SHA256SUMS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 21:49:25 +01:00
c8eac79afc fix(linux/build): xorriso -extract needs -osirrox on (M1.1 iter28)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 36m19s
Run #4275's TOC parser worked perfectly — found
/live/filesystem.squashfs as the largest file (983,547,904 bytes,
right where it should be) — but extraction still bailed:

    diagnose: largest file in ... is /live/filesystem.squashfs; extracting
    diagnose: could not extract rootfs from A

xorriso's -extract action requires -osirrox to be turned on earlier in
the command line; without it, -extract is rejected with a message on
stderr ("OSIRROX is not enabled by default. -osirrox on permits it.").
Our script swallowed stderr, so the only signal was the empty output
file.

Two changes:
  * Add `-osirrox on` to every -extract invocation.
  * On extraction failure, surface the captured stderr (last 30
    lines) into the workflow log instead of dropping it. Saves us
    one round-trip if the next thing breaks.
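
Both changes can be sketched together as below. The function name, the `XORRISO` override variable and the exact log wording are assumptions made for illustration; the xorriso commands (`-osirrox on`, `-indev`, `-extract`) are the ones named in this message.

```shell
# Overridable so the sketch can be exercised without xorriso installed.
XORRISO=${XORRISO:-xorriso}

# Hypothetical helper: extract one file from an ISO and surface stderr on failure.
# $1 = ISO image, $2 = path inside the ISO, $3 = output file on disk.
extract_from_iso() {
  iso=$1; iso_path=$2; out=$3
  errlog=$(mktemp)
  # -osirrox on must come before -extract, otherwise extraction is refused.
  if "$XORRISO" -osirrox on -indev "$iso" -extract "$iso_path" "$out" \
       >/dev/null 2>"$errlog" && [ -s "$out" ]; then
    rm -f -- "$errlog"
    return 0
  fi
  echo "diagnose: could not extract ${iso_path} from ${iso}; stderr tail:" >&2
  tail -n 30 "$errlog" >&2
  rm -f -- "$errlog"
  return 1
}
```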

ISO layout from the iter27 dump for the record:
    /live/filesystem.squashfs   983547904 bytes  ← rootfs
    /live/initrd.img-...         62929840 bytes
    /live/vmlinuz-...            12113856 bytes
    /boot/grub/efi.img            3342336 bytes
    /EFI/boot/{boot,grub}x64.efi
    + grub modules under /boot/grub/{i386-pc,x86_64-efi}/

The named-path probe for /live/filesystem.squashfs was already first
in the list — it'll succeed cleanly now and we skip the largest-file
fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 21:07:39 +01:00
a2bee4b5dc fix(linux/build): better squashfs extraction + dump TOC sample (M1.1 iter27)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m47s
Run #4274 made progress: identical ISO sizes, identical TOC, identical
first 8 KiB — divergence is fully in file payload bytes. But the
diagnostic stalled because extract_squashfs() couldn't find the rootfs:

    diagnose: could not extract squashfs from A
    diagnose: could not extract squashfs from B

Two causes to address:

1. The named-path probes only checked /live/filesystem.squashfs,
   /casper/filesystem.squashfs and /filesystem.squashfs. Some live-build
   configs use /install/... or no canonical name at all.

2. The fallback that used `xorriso -find / -name '*.squashfs'` then
   piped to `xorriso -extract` didn't work because xorriso's -find
   output quotes paths, and -extract chokes on quotes.

This iteration:
  * Adds /install/filesystem.squashfs and /boot/filesystem.squashfs
    to the named-path probes.
  * Replaces the -find/-name/tail fallback with a generic "biggest
    file in the ISO" picker. In a live-build ISO the rootfs payload
    is reliably the largest file regardless of what it's called.
    Parses lsdl output (with awk, handling spaces in paths and
    stripping single-quote framing).
  * On extraction failure, dumps the top 20 files by size to stderr
    so the workflow log shows what's actually in the ISO — answers
    "what should the named-path probe match" for the next iter.
  * Always echoes the first 30 lines of toc-a.txt (and the line
    count) so we can sanity-check the ISO layout in every run.
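
The "biggest file" picker might be sketched as follows. It assumes the listing is `ls -l`-shaped: eight metadata columns (mode, links, uid, gid, size, month, day, time or year) followed by a single-quoted path. That column layout and the function name are assumptions here, not taken from the script.

```shell
# Hypothetical helper: read an ls -l style listing on stdin and print the
# path of the largest regular file, with the single-quote framing removed.
largest_file_in_listing() {
  awk '
    $1 ~ /^-/ && $5 ~ /^[0-9]+$/ && ($5 + 0) > max {
      max = $5 + 0
      path = $0
      # Drop the 8 metadata columns; work on $0 so spaces in paths survive.
      for (i = 0; i < 8; i++) sub(/^ *[^ ]+ +/, "", path)
      # \047 is a single quote; strip it from both ends.
      gsub(/^\047|\047$/, "", path)
      best = path
    }
    END { if (best != "") print best }
  '
}
```

Directories and other non-regular entries are skipped by the `^-` test on the mode column.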

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 20:32:01 +01:00
c9e67d8b47 fix(linux/build): staged divergence diagnostic, avoid OOM (M1.1 iter26)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m36s
Run #4273 confirmed two things:

1. The reproducibility gate works end-to-end. Both builds produced
   ISOs (1077194752 vs 1077202944 bytes — 8 KiB delta, exactly one
   squashfs block worth of compressed-payload drift) and the compare
   step caught it.

2. diffoscope, run on the whole 1 GB ISO inside the silvermetal-builder
   container, gets OOM-killed before producing any output:

       diagnose-divergence.sh: line 44:    13 Killed
         diffoscope --max-report-size 100000000 --html ... --text ... A.iso B.iso

   The host has 19 GiB free, but diffoscope's full recursion through
   ISO -> squashfs -> ~thousands of inner files needs more memory than
   that for a 1 GB image. Setting --max-report-size only caps the
   output, not the working-set.

Rewrite diagnose-divergence.sh to do staged, cheap-to-expensive
analysis:
  1. sha256 + sizes (always)
  2. xorriso TOC of both ISOs (every node: mode/size/mtime/path) -> diff
  3. Pull just live/filesystem.squashfs out of each ISO,
     sha256 it + `unsquashfs -ll` it, diff the listings — this is
     where the per-file-size signal lives.
  4. Targeted diffoscope on the squashfs payload only, with
     --max-container-depth 2 + --max-text-report-size 5MB + --no-html
     + a 10-minute timeout. Bounded enough to finish without the OOM.

Drops `set -e` — every step `|| true`s itself so we get partial output
even when one stage fails.
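
The "every step tolerates its own failure" pattern, together with stage 1, could be sketched like this. `run_stage` and `stage_hashes` are hypothetical names; the real script may structure its stages differently.

```shell
# Hypothetical wrapper: run one diagnostic stage, log a failure, never abort.
# $1 = stage label, remaining args = command to run.
run_stage() {
  stage=$1; shift
  "$@" || echo "diagnose: stage '${stage}' failed (rc=$?), continuing" >&2
  return 0
}

# Stage 1 sketch: print "<sha256> <size-in-bytes> <path>" for each file.
stage_hashes() {
  for f in "$@"; do
    printf '%s %s %s\n' \
      "$(sha256sum < "$f" | cut -d' ' -f1)" \
      "$(wc -c < "$f" | tr -d ' ')" \
      "$f"
  done
}
```

Because `run_stage` always returns 0, later and more expensive stages still run when a cheaper one fails, which is the partial-output behaviour described above.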

Workflow tail-into-log step now prints the new staged outputs:
  * toc-diff.txt   — what changed at the ISO level
  * sqfs-ls-diff.txt — which inner files have different sizes/mtimes
  * sqfs-diff.txt   — diffoscope on the squashfs only
  * squashfs-sha256.txt
  * iso-header-cmp.txt — first-8KB cmp -l for header-level drift
  * sizes.txt / sha256.txt / checklist.md as before

Should land us a focused list of "these N files inside the squashfs
have different bytes" — that's what we need to find what's leaking
non-determinism into the build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 19:54:35 +01:00
5bb24235bd fix(linux/build): tolerate find perm-denied in chroot scan (M1.1 iter24)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 2s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m43s
🎉 Run #4271's Build A actually produced the ISO. derivative-maker ran
clean for 15:24:

    INFO: Script ./derivative-maker completed.
          Exit Code: 0. Errors Detected: 0. Execution Time: 00:15:24
    '/home/user/derivative-binary/.../Kicksecure-CLI-18.1.7.4-developers-only.Intel_AMD64.iso'
      -> '/workspace/SilverLABS/SilverMetal/build-a/Kicksecure-CLI-18.1.7.4-developers-only.Intel_AMD64.iso'

…but build-inner.sh then died on its own post-build collection step:

    find: '.../live-build/chroot/usr/src': Permission denied
    find: '.../live-build/chroot/etc/sudoers.d': Permission denied
    find: '.../live-build/chroot/boot': Permission denied
    …

The chroot's standard hardened subdirs (/usr/src, /etc/sudoers.d,
/etc/cron.*, /boot, /root, /run/{sudo,lvm,cryptsetup,openvpn-{client,
server}}, cache/bootstrap/root) are 0700 root-owned because the
live-build chroot was assembled under sudo. As `user` (uid 1000) we
can't descend them. find emits Permission denied on each, exits with
status 1, and `set -euo pipefail` in build-inner.sh propagates that
through `xargs cp` and aborts — even though the ISO copy itself had
already succeeded a few lines earlier in the same xargs stream.

Fix: redirect find's stderr to /dev/null and tolerate non-zero exit on
both the *.iso and *.manifest scans. build.sh already verifies an ISO
landed in BUILD_DIR (exit 4 with "no ISO produced" if not), so a real
miss is still caught — we just stop killing the script for the benign
unreadable-chroot-subdirs case.
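
A sketch of the tolerant scan, using find's `-exec` instead of the `xargs cp` pipeline the script actually uses (the function name and argument order are likewise illustrative):

```shell
# Hypothetical helper: copy every file matching a glob out of a tree that
# may contain unreadable subdirectories.
# $1 = tree to scan, $2 = destination directory, $3 = name pattern.
collect_artifacts() {
  src=$1; dst=$2; pattern=$3
  # find exits non-zero when it cannot descend a directory; discard that
  # noise and status. The caller still verifies that an ISO actually landed.
  find "$src" -type f -name "$pattern" -exec cp -- {} "$dst"/ \; \
    2>/dev/null || true
}
```

Note that the redirect also hides errors from `cp`, which is why the later "no ISO produced" check in build.sh remains the real gate.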

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:32:00 +01:00
b0f1ab30f4 fix(linux/build): symlink /home/user/derivative-maker to checkout (M1.1 iter23)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 17m24s
Run #4270's Build A made it 2:40 deep — past sanity-tests, prepare-
build-machine, local-deps, into 2100_create-debian-packages — then
died on:

    + /workspace/.../genmkfile/usr/bin/genmkfile reprepro-remove
    running: dm-reprepro-wrapper remove local age-api
    + /usr/bin/dm-reprepro-wrapper: line 28:
        /home/user/derivative-maker/help-steps/pre: No such file or directory

Earlier `dm-reprepro-wrapper includedsc/includedeb` calls succeeded
because 2100_create-debian-packages invokes them by absolute path
(`$source_code_folder_dist/packages/.../developer-meta-files/usr/bin/
dm-reprepro-wrapper`) — the in-repo copy resolves help-steps/pre
relative to its own location.

`genmkfile reprepro-remove` calls `dm-reprepro-wrapper` via PATH
instead, so the system copy at /usr/bin/dm-reprepro-wrapper wins. That
copy was installed by 1500_local-deps `apt install`-ing the in-repo
developer-meta-files.deb into the silvermetal-builder image at runtime.
The .deb's intended layout assumes the matching derivative-maker
checkout lives at /home/user/derivative-maker — the upstream-blessed
path. Ours is at /workspace/SilverLABS/SilverMetal/linux/build/
derivative-maker, so the relative source() at line 28 walks off into
nowhere.

Bridge the gap with a symlink at the start of build-inner.sh:

    ln -sfn "${REPO_ROOT}/linux/build/derivative-maker" \
            /home/user/derivative-maker

That keeps our self-referential CI bind-mount topology (we still cd
into REPO_ROOT/.../derivative-maker, derivative-maker still computes
paths relative to itself), but also makes the system copy of
dm-reprepro-wrapper find help-steps/pre and friends.

Both reprepro wrappers (in-repo and system-installed) now resolve to
the same files via the symlink, so the silvermetal-reprepro-wrap.sh
PATH precedence shadow at /usr/local/bin/reprepro keeps applying to
both code paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:11:58 +01:00
5918305fd7 fix(linux/build): find self via docker inspect, cgroupns hides cgroup path (M1.1 iter22)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 2s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 4m37s
iter21's /proc/self/cgroup approach hit:

    build.sh: cgroup contents:
    0::/

Empty path — act_runner runs job containers with cgroupns enabled, so
the in-container view of cgroup paths is rooted at the namespace, with
no trace of the host-side container ID. Same blocker as `hostname`.

The host docker daemon does know who we are, and we have its socket.
We're the only running container with /workspace/SilverLABS/SilverMetal
as a mount destination (concurrency: 1 in the workflow), so iterate
docker ps and match by mount destination. Found CID becomes the
--volumes-from argument; if no match, dump docker ps to the log and
fail loud.
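
The lookup might be sketched as below, assuming the docker CLI is on PATH and can reach the host daemon through the mounted socket. The function name is hypothetical.

```shell
# Hypothetical helper: print the ID of the running container that has the
# given path as a mount destination. $1 = mount destination to look for.
find_self_cid() {
  dest=$1
  for cid in $(docker ps -q --no-trunc); do
    # Print each mount destination on its own line and look for an exact match.
    if docker inspect \
         --format '{{range .Mounts}}{{println .Destination}}{{end}}' "$cid" \
       | grep -qxF -- "$dest"; then
      printf '%s\n' "$cid"
      return 0
    fi
  done
  # No match: dump the container list for the log and fail loudly.
  docker ps >&2
  return 1
}
```

This relies on only one running container having that mount destination, which the workflow's `concurrency: 1` setting is meant to guarantee.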

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:04:41 +01:00
4a837e07ed fix(linux/build): discover job container ID from cgroup, not hostname (M1.1 iter21)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 2s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m17s
Run #4268's build-and-verify died <1s into Build A:

    docker: Error response from daemon: No such container: docker

Cause: build.sh's CI path uses `--volumes-from "$(hostname)"` to
inherit the parent job container's /workspace mount, but in the new
runner config (network: host applied via the now-actually-loaded
config.yaml) `hostname` returns the literal string "docker" inside
catthehacker/ubuntu:act-latest — the image bakes that into /etc/hostname
and act_runner doesn't override it. So `--volumes-from docker` looks for
a container literally named "docker", finds nothing, exits.

This worked in earlier runs (#4260) only because config.yaml *wasn't
being loaded* (see iter18 commit), so the runner ran on its built-in
defaults — which kept the container's hostname as the auto-generated
container ID. Fixing config.yaml exposed this latent bug.

The usual way to learn your own container ID inside a Linux container
is /proc/self/cgroup, which carries the 64-char hex ID under either
cgroup version (as long as the container can see the host-side cgroup
path):
  cgroup v1: 12:devices:/docker/<64-hex>
  cgroup v2: 0::/system.slice/docker-<64-hex>.scope

awk extracts the first 64-hex run; that becomes the --volumes-from
argument. If extraction fails (would only happen on a non-docker
runtime), fail loud rather than silent.
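
The extraction can be sketched as follows, using `grep -o` in place of the awk the script uses. The function name and the optional file argument are assumptions added so the logic can be tried against sample input.

```shell
# Hypothetical helper: print the first 64-character hex run in a cgroup file.
# $1 = file to read (defaults to /proc/self/cgroup).
cid_from_cgroup() {
  cgroup_file=${1:-/proc/self/cgroup}
  cid=$(grep -oE '[0-9a-f]{64}' "$cgroup_file" | head -n 1) || true
  if [ -z "$cid" ]; then
    echo "build.sh: no container ID found in ${cgroup_file}" >&2
    return 1
  fi
  printf '%s\n' "$cid"
}
```

When the file contains only `0::/`, as it does under a private cgroup namespace, the function takes the failure branch.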

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:59:48 +01:00
ec942b7698 fix(linux/build): bind only config.json, not whole /root/.docker (M1.1 iter20)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m24s
Run #4267 finally got the bind mount through (Merged Binds includes
/root/.docker:/root/.docker:ro), but docker build then died:

    failed to update builder last activity time:
    open /root/.docker/buildx/activity/.tmp-...: read-only file system

The catthehacker job container uses buildx, which writes activity
tracking to /root/.docker/buildx/. Mounting the whole host /root/.docker
read-only made that path read-only too.

Right scope is the file, not the dir:
    -v /root/.docker/config.json:/root/.docker/config.json:ro

That gives the cli the registry auth it needs while leaving the rest
of /root/.docker on the container's writable overlay so buildx can
populate its own activity dir without colliding with the host's. Also
matches the principle of mounting the minimum the secret requires.

valid_volumes entry updated to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:52:35 +01:00
ced77e305f fix(linux/build): valid_volumes takes source paths, not bind specs (M1.1 iter19)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped
Run #4266 dropped the /root/.docker bind silently:

    Custom container.HostConfig from options ==> &{Binds:[/root/.docker:/root/.docker:ro]…}
    [/root/.docker] is not a valid volume, will be ignored
    Merged container.HostConfig ==> &{Binds:[/var/run/docker.sock:/var/run/docker.sock /root/.docker:/root/.docker:ro]…}
    no basic auth credentials

The merged binds list does appear to include /root/.docker, but the
line between the two dumps, "[/root/.docker] is not a valid volume,
will be ignored", fires *during* the merge step's allowlist check, and
the bind ends up absent in the actual container start (the `Binds:`
list shown is pre-filter). Net result: the registry creds are not in
the job container, and the push fails.

Root cause: container.valid_volumes is an allowlist of source-path
globs, not full bind specs. The entry
`/root/.docker:/root/.docker:ro` was being treated as a literal pattern
and never matched the bind's source `/root/.docker`. Same for the
other two entries — they were just no-ops because the auto-mount /
explicit options were the things actually creating the binds.

Fix: rewrite valid_volumes entries as bare source paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:51:17 +01:00
c205139e86 fix(linux/build): drop duplicate docker.sock mount from runner options (M1.1 iter18)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 6s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped
Run #4265 (the first run after the config.yaml wiring fix actually took
effect) failed with:

    failed to create container: 'Error response from daemon:
      Duplicate mount point: /var/run/docker.sock'

act_runner v0.4.1 already auto-mounts /var/run/docker.sock into every
job container; listing it a second time in container.options is a
hard error on container create. Same likely applies to /cache, which
the workflow doesn't actually use anyway (the inner build.sh bind-
mounts via REPO_ROOT/BUILD_DIR, not /cache).

Trim container.options down to *only* the bind act_runner doesn't
provide: -v /root/.docker:/root/.docker:ro for registry credentials.
valid_volumes stays as the broader allowlist for workflow-requested
mounts but doesn't force the mounts itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:49:51 +01:00
f66585e0b1 fix(linux/build): wire config.yaml into act_runner via CONFIG_FILE env
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 0s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped
The runner config.yaml on disk was decorative: it was never read. The
upstream gitea/act_runner image's run.sh only adds `--config <file>`
when the CONFIG_FILE env var is set, and our compose file neither set
CONFIG_FILE nor mounted config.yaml into the container. So
`timeout: 240m`, `container.options`, `valid_volumes` etc. were
silently ignored and the runner ran on built-in defaults.

This is also why iter17's `-v /root/.docker:/root/.docker:ro` addition
to config.yaml had no effect on run #4264: the runner never read it.
The push still failed with "no basic auth credentials".

Fix: bind-mount ./config.yaml into the runner container at
/etc/act_runner/config.yaml and set CONFIG_FILE to that path. After a
`docker compose up -d --force-recreate`, the runner picks up everything
in config.yaml — including the per-job-container /root/.docker bind.

Per-job timeouts in build-iso-linux.yaml are set via `timeout-minutes:
240` at the job level, which overrides the daemon default anyway, so
nothing was visibly broken before. But silently-ignored config is a
trap for the next thing we add to config.yaml, so wire it correctly now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:48:07 +01:00
e7a5fdd629 fix(linux/build): mount /root/.docker into job containers (M1.1 iter17)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 2s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped
Run #4263 cleared the new builder-image job's `docker build` step
cleanly but `docker push` died with:

    no basic auth credentials

The runner host (10.0.0.51) is logged in to docker-registry.silverlabs.uk —
that's how iter1-15 builder images got pushed by hand. But the
silvermetal-builder act_runner only mounts /root/.docker into its own
container, not into the job containers it spawns. catthehacker/ubuntu:
act-latest runs as root and reads /root/.docker/config.json for auth;
without that file mounted in, docker-cli has no creds to send via the
DooD socket and the registry returns 401 Basic-realm.

Fix: extend the act_runner `container.options` to mount
/root/.docker:/root/.docker:ro into each job container, and add the same
entry to valid_volumes. Update the runner README so first-time deploys
know the host-side `docker login` is what makes the in-CI push work.

This requires a one-time runner redeploy on 10.0.0.51:

    cd /opt/silvermetal-builder-runner
    git pull
    docker compose up -d --build

After that, the builder-image job pushes cleanly and feeds its digest
to build-and-verify as designed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:33:35 +01:00
e260fe1c81 ci(linux/build): self-host the builder image build + iter16 reprepro wrap (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 2s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped
Two coupled changes that unblock the M1.1 iter loop. Both belong in CI;
iter1-15 was wrong to require human-in-the-loop steps to make progress.

1. **CI now builds Dockerfile.builder.**

   `.gitea/workflows/build-iso-linux.yaml` grows a `builder-image` job
   that runs ahead of `build-and-verify`. It rebuilds the silvermetal-
   builder image from `linux/build/docker/Dockerfile.builder`, pushes it
   to `docker-registry.silverlabs.uk/silvermetal-builder:m1.1-<sha>` (and
   `:latest`), reads the resulting digest off `docker inspect`, and
   feeds it forward as a job output. `build-and-verify` consumes that
   digest as the `BUILDER_IMAGE` env override that `build.sh` already
   honours (and validates is digest-form on line ~37).

   That kills the old workflow where every Dockerfile.builder change
   required a human to `docker build` + `docker push` on 10.0.0.51 by
   hand and then bump the digest in `build.sh` in lockstep. The crash
   that triggered this (exit 126 mid-iter16 build run) was a symptom of
   that off-CI step still existing.

   Both jobs run on the existing `silvermetal-builder` runner; the host
   docker daemon is shared via DooD and is already authenticated to
   `docker-registry.silverlabs.uk` (linux/build/runner/docker-compose.yml
   mounts `/root/.docker:/root/.docker:ro`), so no extra login step.

   The hardcoded `BUILDER_IMAGE` digest in `build.sh` stays as the
   local-developer / offline-rebuild fallback. Comments updated in
   `build.sh`, `Dockerfile.builder`, and `linux/build/README.md` to
   match the new flow.

2. **reprepro wrapper for the benign "No priority for X" case.**

   Pinned derivative-maker's `2100_create-debian-packages` (with
   --target iso) re-imports source packages from snapshot.debian.org
   into a local apt repo via `reprepro --basedir … includedsc local
   <foo>.dsc`. The local repo's `conf/distributions` ships no
   `DscOverride` entries, so any source package whose `.dsc` lacks an
   explicit Priority field trips:

       No priority for 'X', skipping.
       There have been errors!

   …and reprepro exits 255. dm-reprepro-wrapper bubbles that up,
   2100_create-debian-packages aborts. The current offender is
   `virtualbox_*.dsc` (key import is now fine — debian-keyring landed in
   commit 4aa59ba — but the priority field gap remains). VirtualBox is
   not in SilverMetal's `--target iso` set, so the sane behaviour is
   "log it, continue".

   New `linux/build/docker/silvermetal-reprepro-wrap.sh` shadows
   `/usr/bin/reprepro` at `/usr/local/bin/reprepro` (PATH precedence).
   It runs the real reprepro, captures merged stdout+stderr, and:
   - if rc != 0 AND every non-blank output line matches one of the
     known-benign patterns ("No priority for 'X', skipping." plus the
     trailing "There have been errors!"), emits the output, logs one
     line of explanation to stderr, and exits 0;
   - otherwise emits the output and propagates rc unchanged.

   Any *other* reprepro error path stays fatal; only the specific
   "No priority for X" pattern is neutralised. `dm-reprepro-wrapper`
   resolves `reprepro` via `$PATH` so it picks up the wrapper
   transparently.
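
The wrapper's decision logic could be sketched roughly as below. This is not the contents of silvermetal-reprepro-wrap.sh: the function names, the `REAL_REPREPRO` override and the log wording are assumptions, and the pattern match is slightly looser than the literal message (it does not insist on the quotes around the package name).

```shell
# Path to the real binary; overridable so the sketch can be tried with a stub.
REAL_REPREPRO=${REAL_REPREPRO:-/usr/bin/reprepro}

# Succeeds only if the output has at least one "No priority" skip and every
# other non-blank line is the trailing "There have been errors!".
only_benign_lines() {
  printf '%s\n' "$1" | awk '
    /^[ \t]*$/ { next }
    /^No priority for .*, skipping\.$/ { skips++; next }
    /^There have been errors!$/ { next }
    { other++ }
    END { exit (skips > 0 && other == 0) ? 0 : 1 }
  '
}

# Run the real reprepro, replay its merged output, and downgrade the exit
# status only for the known-benign case.
reprepro_wrap() {
  out=$("$REAL_REPREPRO" "$@" 2>&1) && rc=0 || rc=$?
  if [ -n "$out" ]; then printf '%s\n' "$out"; fi
  if [ "$rc" -ne 0 ] && only_benign_lines "$out"; then
    echo "reprepro-wrap: ignoring rc=${rc} (only 'No priority' skips)" >&2
    return 0
  fi
  return "$rc"
}
```

Requiring at least one "No priority" line means a failure with empty output is still propagated rather than swallowed.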

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:30:08 +01:00
4aa59ba633 fix(linux/build): non-interactive mode + visible output + key import (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 11m33s
Run #4260 cleared every harness layer and ran for 18 minutes — past
sanity-tests, prepare-build-machine, cowbuilder-setup, local-deps —
into 2100_create-debian-packages, where it died on:

    Could not check validity of signature with
    '92978A6E195E4921825F7FF0F34F09744E9F5DD9' in
    '/home/user/derivative-binary/temp_packages_debian_sid/virtualbox_7.2.8-dfsg-1.dsc'
    as public key missing!

…and then *also* hung the runner indefinitely because, on any error,
derivative-maker's exception_handler_general detected a TTY (we passed
`docker run -t`) and dropped into an interactive `read -p 'Answer? '`
prompt that nothing was ever going to answer. The orphan docker run
in turn orphaned the act_runner job container, blocking the runner
until manual cleanup.

Three coordinated fixes, validated end-to-end with docker-side smoke
tests on 10.0.0.51:

1. **Non-interactive mode without losing output visibility.**

   The original architectural goal: keep derivative-maker out of
   interactive mode (`[ -t 0 ]` must be false) AND keep the build log
   visible to docker run / Gitea Actions (PTY needed somewhere).

   Resolution:
   - `docker run -t` is kept (required for /dev/console to be a real
     PTY back to docker), but no `-i`, so fd 0 stays /dev/null.
   - docker-entrypoint.service: `StandardInput=tty-force` →
     `StandardInput=null` so the service's fd 0 is /dev/null too.
     Verified inside the container: `[ -t 0 ]` returns false.
   - entrypoint.sh now wraps the user command with an explicit
     `> /dev/console 2>&1` redirect before writing it to
     /etc/docker-entrypoint-cmd. systemd's `StandardOutput=inherit`
     does NOT propagate PID-1's stdout to services in this PID-1-
     systemd-in-container topology — the service log was going
     nowhere visible. /dev/console under `docker run -t` IS the
     allocated PTY back to docker, so the redirect surfaces the
     log to the act_runner / Gitea Actions log.
   - entrypoint.sh's `[ ! -t 0 ] && exit 1` guard removed (it
     would now always trigger).

2. **debian-keyring for reprepro source-package signature checks.**

   2100_create-debian-packages calls dm-reprepro-wrapper includedsc
   on every .dsc in temp_packages_debian_sid (including
   virtualbox_*.dsc, even for `--target iso` — see line 114 of that
   build step). reprepro verifies the dsc signature against the
   user's GPG keyring; without the maintainer keys it fails.

   Adds `debian-keyring` to Dockerfile.builder. build-inner.sh now
   imports debian-keyring.gpg / debian-maintainers.gpg /
   debian-nonupload.gpg into the user's keyring before running
   derivative-maker.

3. **BUILDER_IMAGE digest re-pinned.**

   Built natively on 10.0.0.51 (per memory: never on WSL/aarch64).
   New digest: sha256:2f680c96…f0db.

Smoke-test results (against this exact image):

    ==> START                  ← user output reaches docker stdout
    (keyring present)          ← debian-keyring imported successfully
    STDIN_NOT_TTY              ← derivative-maker WILL stay non-interactive
    ==> END                    ← clean shutdown
    docker run exit: 42        ← exit code propagates correctly on failure

Files: Dockerfile.builder, systemd-entrypoint/entrypoint.sh,
       systemd-entrypoint/docker-entrypoint.service, scripts/build.sh,
       scripts/build-inner.sh.
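
For readers unfamiliar with the `[ -t 0 ]` test that fix 1 relies on, here is a minimal sketch. The helper name `stdin_state` and the `STDIN_IS_TTY` label are hypothetical; `STDIN_NOT_TTY` mirrors the smoke-test line above.

```shell
# Prints which branch a `[ -t 0 ]`-style interactivity check would take.
stdin_state() {
    if [ -t 0 ]; then
        echo STDIN_IS_TTY
    else
        echo STDIN_NOT_TTY
    fi
}

# With fd 0 redirected from /dev/null, which is what StandardInput=null
# arranges for the service, the check reports a non-TTY stdin.
stdin_state < /dev/null    # prints STDIN_NOT_TTY
```

The test inspects only file descriptor 0, so a PTY on stdout or /dev/console does not make it true.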

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 14:05:49 +01:00
9c406598e2 fix(linux/build): pin user_name=user, mkdir derivative-binary (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 18m13s
Run #4259 (the systemd-in-container debut) cleared every prior failure
class, ran for 15 minutes, then died inside 1100_sanity-tests'
aptgetopt_conf_add at:

    tee: /home/root/derivative-binary/30_derivative-maker.conf:
    No such file or directory
    last_failed_bash_command: tee --append -- "$dist_aptgetopt_file" > /dev/null

Two compounding bugs, plus a latent output-path issue:

1. **user_name resolves to "root" via $SUDO_USER**

   derivative-maker/help-steps/variables (lines 80-93) computes
   user_name with these fallbacks, in order:

       [ -n "$user_name" ]   || user_name="$SUDO_USER"
       [ -n "$user_name" ]   || user_name="$(logname 2>/dev/null)"
       if [ -z "$user_name" ] && [ "$(id -u)" != "0" ]; then
          user_name="$(whoami)"
          [ -n "$user_name" ] || user_name="$USER"
       fi

   build.sh enters the container as root (systemd's
   docker-entrypoint.service runs as root), then sudoes to user via
   `sudo --preserve-env -u user --`. sudo always sets SUDO_USER to the
   *calling* user (= root), regardless of --preserve-env. So
   help-steps/variables hits the first fallback and computes user_name="root",
   then HOMEVAR=/home/root, then binary_build_folder_dist=
   /home/root/derivative-binary — a directory that does not exist
   because root's home is /root (not /home/root).

   Fix: build-inner.sh now exports user_name=user before sourcing the
   config, satisfying the first-priority check in help-steps/variables and
   short-circuiting the SUDO_USER fallback. The comment in the script
   notes the failure mode for the next reader.

2. **Missing mkdir of derivative-binary**

   Upstream's derivative-maker-docker-start does:
       mkdir --parents -- "${HOME}/derivative-binary"
   before invoking derivative-maker. Our build-inner.sh skipped that
   because previous iterations didn't reach the point where it
   mattered. Now that we do, we replicate it.

3. **Output collection path correction**

   derivative-maker writes its ISO/manifest output into
   ${HOME}/derivative-binary (per help-steps/variables, line 109), not into the
   source tree under linux/build/derivative-maker. The previous
   `find . -maxdepth 6 -type f -name "*.iso"` would have missed
   everything once we got that far. Updated to
   `find "${HOME}/derivative-binary" ...`.

No image rebuild needed — this is a pure script-and-env change.
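
The fallback chain from bug 1 can be tried in isolation. This sketch wraps the quoted lines in a function (`resolve_user_name` is a hypothetical name, not something in derivative-maker) and shows both the failure mode and the effect of pre-setting `user_name`.

```shell
# Re-statement of the fallback chain quoted above, wrapped in a function
# (hypothetical name) so it can be exercised without derivative-maker.
resolve_user_name() {
    [ -n "$user_name" ] || user_name="$SUDO_USER"
    [ -n "$user_name" ] || user_name="$(logname 2>/dev/null)"
    if [ -z "$user_name" ] && [ "$(id -u)" != "0" ]; then
        user_name="$(whoami)"
        [ -n "$user_name" ] || user_name="$USER"
    fi
    echo "$user_name"
}

# After `sudo -u user`, SUDO_USER holds the calling user, so with
# user_name empty the chain settles on "root":
( user_name=; SUDO_USER=root; resolve_user_name )       # prints root

# Setting user_name first satisfies the first check, and SUDO_USER is
# never consulted:
( user_name=user; SUDO_USER=root; resolve_user_name )   # prints user
```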

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:47:47 +01:00
38ac4f8a96 fix(linux/build): systemd-in-container build host (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 15m34s
Run #4258 cleared the systemctl shim only to die two seconds later on
the *next* expectation derivative-maker has of a real systemd host:
its sources.list points at http://127.0.0.1:9977/debian (the approx
package-cache socket-activated by systemd) and apt-get update could
not reach the daemon because nothing was actually started by the
no-op shim:

    Err:1 http://127.0.0.1:9977/debian trixie InRelease
      Could not connect to 127.0.0.1:9977 (127.0.0.1).
      - connect (111: Connection refused)

Whack-a-mole'ing each service derivative-maker tries to start (approx
today, then journald, then systemd-logind, then who-knows-what
tomorrow) is going to keep failing for a while — derivative-maker is
fundamentally designed for a real systemd-managed Debian host. The
container pattern upstream itself ships
(linux/build/derivative-maker/docker/) runs systemd as PID 1 inside
the container; this commit adopts that approach.

Architecture:

  - PID 1 in the build container is now systemd. Upstream's vendored
    entrypoint.sh records the user-supplied command into
    /etc/docker-entrypoint-cmd, captures env into
    /etc/docker-entrypoint-env, masks irrelevant units, and execs
    systemd. systemd boots, docker-entrypoint.service runs the
    command, docker-entrypoint-stop.sh propagates the exit code via
    `systemctl exit <code>` so the container exits with the right
    status.

  - The four entrypoint files (entrypoint.sh,
    docker-entrypoint.service / .target, docker-entrypoint-stop.sh)
    are vendored at linux/build/docker/systemd-entrypoint/ rather
    than COPY'd from the submodule path — Docker build context can
    only reach below itself, and bumping is tracked in that dir's
    README.

  - Container runtime now requires --cgroupns=host, --tmpfs /run,
    --tmpfs /run/lock, and -v /sys/fs/cgroup:/sys/fs/cgroup:rw so
    systemd can manage cgroups properly. -t allocates a TTY,
    satisfying entrypoint.sh's `[ ! -t 0 ] && exit 1` check in CI
    where stdin is otherwise /dev/null.

  - User renamed builder → user (uid 1000, passwordless sudo) to
    match upstream's USER=user / HOME=/home/user convention. chown
    in build.sh now uses uid 1000:1000 so it's name-agnostic.

  - Image package list grew to match upstream's
    derivative-maker-docker-setup (sq stack + dbus + approx + the
    rest) plus our ISO toolchain (live-build / debootstrap / xorriso
    / squashfs-tools / etc.). Snapshot.debian.org pinning is
    preserved (same APT_SNAPSHOT_URL, two-phase install pattern).

Verified:

  Smoke test on 10.0.0.51 — `docker run --rm --privileged
  --cgroupns=host --tmpfs /run --tmpfs /run/lock -v /sys/fs/cgroup:...:rw
  -t <image> /bin/bash -c 'echo OK'` — booted systemd, ran the
  command via docker-entrypoint.service, captured the output, shut
  down filesystems and exited cleanly.

build.sh BUILDER_IMAGE pin → sha256:dc9dd29d…8811. Image rebuilt
natively on 10.0.0.51, pushed to docker-registry.silverlabs.uk.

The systemctl shim is removed by virtue of the Dockerfile rewrite —
real systemd makes it unnecessary. The previous "iter6 / iter7"
intermediate digests stay in the registry until we GC; the live one
is m1.1-iter8-systemd.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:06:47 +01:00
7058fb775c fix(linux/build): add systemctl no-op shim for the build container (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 2m20s
Run #4257 cleared sanity-tests entirely (sq-git verification of every
submodule signature: passed; tag/uncommitted relaxation: passed) and reached
1200_prepare-build-machine, where it died:

    + sudo systemctl daemon-reload
    sudo: systemctl: command not found
    ERROR detected in script!: ././build-steps.d/1200_prepare-build-machine

derivative-maker assumes systemd is PID 1 on the build host. Upstream's
own container (linux/build/derivative-maker/docker/) runs
systemd-as-init via an entrypoint that masks irrelevant units and
declares its own. We don't want that surgery for M1.1 — it pulls in
cgroup mounts, --cgroupns=host, and a much bigger debugging surface.

Shim approach instead: install /usr/local/bin/systemctl that logs the
attempt to stderr and exits 0. /usr/local/bin precedes /usr/bin in
both default $PATH and sudo's secure_path, so it satisfies any
systemctl call regardless of whether the real binary later gets pulled
in by a package install. Standard pattern for systemd-aware Debian
build scripts in transient containers.

Risk if it doesn't suffice: the shim makes daemon-reload / restart /
mask calls succeed, but doesn't actually run any service. If a later
build step depends on (say) approx actually being up to serve cached
debs, we'll see the next failure and decide whether to escalate to
real systemd-in-container or skip the relevant build step.

Changes:
- Dockerfile.builder: add the shim with a brief log line to stderr;
  comment block documents the trade-off.
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:70f160ab…5460
  (built natively on 10.0.0.51, shim verified working with
  `docker run … systemctl daemon-reload` returning 0).

Verified: shim emits "systemctl-shim: daemon-reload" to stderr and
exits 0.
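
A minimal sketch of what such a shim does, written as a function so it can be tried without installing anything. The function name is hypothetical, and the log format for multi-argument calls is an assumption extrapolated from the single verified line above.

```shell
# Body of a no-op shim. The real one is installed as
# /usr/local/bin/systemctl; this function form is for illustration only.
systemctl_shim() {
    # Record what was asked for, on stderr, then report success
    # without starting, stopping, or reloading anything.
    echo "systemctl-shim: $*" >&2
    return 0
}

systemctl_shim daemon-reload    # logs "systemctl-shim: daemon-reload" to stderr
```

Because every call succeeds, callers cannot tell that no service was actually started, which is exactly the risk noted above.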

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:45:13 +01:00
8a3cd0ba22 fix(linux/build): allow untagged / uncommitted submodule commits (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m24s
Run #4256 finally cleared every preceding obstacle and reached
git_sanity_test's per-submodule verification phase. sq-git authenticated
every commit signature in the chain — that part is working perfectly —
but failed at:

    ERROR: Untagged commit in: qubes/qubes-template-kicksecure
    INFO: As a developer or advanced user you might want to use:
    WARNING: This can be insecure if you cannot audit the changes.
    --allow-untagged true --allow-uncommitted true

git_sanity_test runs two orthogonal checks:
  1. signatures (sq-git, verified: passed)
  2. tagged-commit-only mode (verified: failed for one submodule)

The pinned upstream tag (18.1.7.4-developers-only — the name itself
flags the intent) deliberately ships with some submodule pointers at
intermediate / merge commits rather than release tags. parse-cmd
documents `--allow-untagged true` and `--allow-uncommitted true` for
exactly this case. Signatures remain verified; we're only relaxing the
release-tag check, which is appropriate when we've deliberately pinned
to a developer tag.

If/when we move to a redistributable upstream tag in M1.10+ (signing
ceremony milestone), these flags should come back out.

No image rebuild needed — script-only change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:35:27 +01:00
2a163bb9e7 fix(linux/build): install sq-git/Sequoia stack for derivative-maker (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m21s
Run #4255 reached deeper into 1100_sanity-tests, finished its apt-get
phase, and then died at the supply-chain verification step:

    /workspace/.../help-steps/git_sanity_test: line 184: sq-git: command not found
    ERROR: sq-git verification failed: main repo
    INFO: If this is intentional, configure your own sq-git policy file.
          See 'buildconfig.d/30_signing_key.conf'.

derivative-maker uses sq-git (sequoia-git) to authenticate the commit
chain against an OpenPGP policy file before building. The policy file
itself ships in the upstream repo (./openpgp-policy.toml) and the
trust-root defaults are correctly configured by help-steps/variables
(line 232 + 290) for non-redistributable builds — i.e. the verification
machinery is fully wired and just needs the binary.

Aligns with the upstream container's package list at
linux/build/derivative-maker/docker/derivative-maker-docker-setup.

Changes:
- Dockerfile.builder: add sq, sqv, sqop, sequoia-git,
  sequoia-chameleon-gnupg, gpg-agent. All available in trixie main.
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:c1490bab…5c97
  (rebuilt on 10.0.0.51, sq-git binary verified present at /usr/bin/sq-git).

No reproducibility implications — image rebuilds against the same
pinned snapshot timestamp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:31:03 +01:00
433eb18947 fix(linux/build): bump builder base bookworm → trixie (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m19s
Run #4254 finally got past every harness issue and into derivative-
maker's actual sanity-tests, where it died with:

    You are attempting to build on an unsupported operating system or version.
    detected operating system codename: 'bookworm'
    expected operating system codename: 'trixie'

The pinned derivative-maker tag (18.1.7.4-developers-only) requires
Debian 13 (trixie) as the build host. Upstream's own
linux/build/derivative-maker/docker/Dockerfile uses
`FROM debian:trixie-slim`. We picked bookworm originally and the tag
mismatch wasn't caught until the build actually ran.

Changes:

- Dockerfile.builder: FROM debian:bookworm-slim →
  debian:trixie-slim @ sha256:cedb1ef4…2c5a (resolved 2026-05-07 on
  the runner host). sources.list suite names follow:
  `bookworm` → `trixie`, `bookworm-security` → `trixie-security`.
  snapshot.debian.org pin (20260415T000000Z) is unchanged — snapshots
  are date-keyed, so the same timestamp resolves trixie's dists/.
- silvermetal-base.conf: DERIVATIVE_DIST `bookworm` → `trixie` for
  consistency (the value isn't passed to derivative-maker — there's
  no --dist option — but it's referenced by the build.sh prologue
  and we shouldn't have a stale codename floating around).
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:7d893178…1890
  (rebuilt natively on 10.0.0.51 against the new base, pushed).

The reproducibility guarantee is unchanged in shape — same snapshot
timestamp, same source-date-epoch derivation, just a different stable
host OS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:25:40 +01:00
4a3971cb06 fix(linux/build): correct derivative-maker CLI invocation (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m13s
Run #4253 finally got past all the harness failures and into
derivative-maker's actual build steps, where 1100_sanity-tests
rejected our invocation with:

    unknown option (1): '--build'

The CLI we'd been passing was built from invented flag names rather
than the real grammar in derivative-maker/help-steps/parse-cmd.
Concretely:

  - `--build`  is not a real option (just wrong)
  - `--flavour` should be `--flavor` (upstream uses American spelling)
  - `--dist`   is not a real option; dist is implicit from `--flavor`
                (kicksecure-cli ⇒ bookworm)
  - `--config` is not a real option; the silvermetal-base.conf is
                sourced into env above the invocation, no flag needed
  - `--freedom true|false` was missing entirely; parse-cmd requires it
                for `--arch amd64` (line 70 in parse-cmd) — the script
                exits if neither is set

Fix: build-inner.sh now invokes
    ./derivative-maker --flavor … --target … --arch … --freedom …
which is the minimal valid form per parse-cmd's case-branches.

Set DERIVATIVE_FREEDOM=false in silvermetal-base.conf, matching
Kicksecure's own public-ISO choice — `--freedom true` would omit
firmware-nonfreedom and the resulting ISO wouldn't initialise wifi /
many GPUs / Intel microcode on most hardware. Privacy/functionality
trade-off documented inline; the hardening overlay in M1.2+ can
revisit if that conversation becomes useful.

Verified: bash -n on both scripts. No image rebuild needed — pure
script and config changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:18:38 +01:00
bf55a3f81c fix(linux/build): mark build-inner.sh executable (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m14s
Run #4252 died at:

    runuser: failed to execute /workspace/SilverLABS/SilverMetal/linux/build/scripts/build-inner.sh:
    Permission denied

The script was created on the WSL/Windows side (/mnt/c) where every
file appears world-rwx regardless of git's index, so the local
`chmod +x` was a no-op as far as git was concerned and the file got
committed at mode 100644 like any other regular file. Sibling scripts
(build.sh, verify-reproducibility.sh, diagnose-divergence.sh) all
correctly carry 100755 in the index.

Fix: `git update-index --chmod=+x` to set the bit in the index
explicitly, independent of the working-tree perms.
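
The effect can be reproduced in a throwaway repository. This sketch assumes `git` is installed and a filesystem that honours executable bits; the file here is a stand-in with the same name, not the real script.

```shell
# Throwaway repository; nothing here touches the real tree.
demo_repo="$(mktemp -d)"
cd "$demo_repo" || exit 1
git init --quiet .

# Stage a non-executable file: the index records it as a regular file.
printf '#!/bin/sh\n' > build-inner.sh
git add build-inner.sh
git ls-files --stage build-inner.sh    # mode column reads 100644

# Flip the executable bit in the index only, regardless of what the
# working-tree permissions say.
git update-index --chmod=+x build-inner.sh
git ls-files --stage build-inner.sh    # mode column now reads 100755
```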

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:13:02 +01:00
b20e568b19 fix(linux/build): run derivative-maker as unprivileged builder user (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m14s
Run #4251 advanced past checkout and into derivative-maker, then died
immediately:

    ERROR: This must NOT be run as root (sudo)!
    ERROR: Exiting ./derivative-maker with non-zero exit code 1.
           Errors Detected: 0. Execution Time: 00:00:00.

Kicksecure's derivative-maker explicitly refuses to run as root — it
expects a regular user with passwordless sudo and uses sudo internally
for the privileged operations (debootstrap, mksquashfs, chroot mounts).
Our minimal debian-slim builder image had a `builder` user (uid 1000)
but no sudo, no sudoers entry, and the container ran as root.

Aligns with the upstream Kicksecure container pattern at
linux/build/derivative-maker/docker/derivative-maker-docker-setup
(uses USER=user with `${USER} ALL=(ALL) NOPASSWD:ALL`).

Changes:
- Dockerfile.builder: install `sudo` (and `fakeroot` while we're here —
  upstream sanity-tests pulls this in via apt at build time, but having
  it baked avoids a snapshot.debian.org round-trip every run); add
  passwordless sudoers entry for builder; correct the misleading
  comment that claimed root was needed.
- New scripts/build-inner.sh: the inner derivative-maker invocation
  pulled out of build.sh's heredoc. Once we needed to drop privileges
  via runuser, the nested-heredoc / nested-quoting situation became
  unmaintainable; a regular script with normal quoting is far cleaner.
- build.sh: inner heredoc now just chowns the workspace to builder and
  uses runuser to run build-inner.sh. ${REPO_ROOT} and ${BUILD_DIR} continue
  to be forwarded into the container via -e.
- build.sh: BUILDER_IMAGE digest re-pinned to sha256:f8f0db37…1bedc
  (rebuilt and pushed natively on 10.0.0.51 — never on the WSL/aarch64
  dev box, see reference_silvermetal_runner.md memory).

Verified: bash -n on both scripts; image builds and pushes cleanly.
Pushing this commit triggers a fresh CI run that will exercise it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:09:42 +01:00
1d0e58739c fix(linux/build): handle DooD bind-mount in CI (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m18s
build.sh ran fine locally but failed in Gitea Actions on the first
reproducibility-gated run (#4250) with:

    bash: line 3: /work/linux/build/config/silvermetal-base.conf:
    No such file or directory

Root cause: classic Docker-out-of-Docker confusion. build.sh runs
inside the act_runner job container, which talks to the host's docker
daemon via the mounted /var/run/docker.sock. The "-v ${REPO_ROOT}:/work"
flag was being interpreted by the host daemon against the host
filesystem, where /workspace/SilverLABS/SilverMetal does not exist;
docker silently auto-created an empty dir there and mounted that as
/work, so the config source target was missing.

Fix: detect GITHUB_ACTIONS and use --volumes-from "$(hostname)" in CI
to inherit the parent job container's /workspace mount intact. Locally
we keep a bind mount, but use the same path inside and outside
(${REPO_ROOT}:${REPO_ROOT}) so the inner heredoc is identical in both
modes. Inner script now references "${REPO_ROOT}/..." and
"${BUILD_DIR}/..." instead of the synthetic /work and /out paths.

No reproducibility implications — bind topology doesn't affect bytes
inside the ISO.
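
The mount selection described in the fix can be sketched as follows. `mount_args` is a hypothetical helper, and comparing `GITHUB_ACTIONS` against the literal `true` is an assumption about how the detection is written; `REPO_ROOT` is the variable named above.

```shell
# Prints the docker-run mount arguments for the current environment.
mount_args() {
    if [ "${GITHUB_ACTIONS:-}" = "true" ]; then
        # In CI the job container already has the workspace mounted;
        # inherit its volumes instead of asking the host daemon to bind
        # a path that only exists inside the job container.
        printf '%s\n' "--volumes-from $(hostname)"
    else
        # Locally, bind the repo at the same path inside and outside so
        # the inner script can use ${REPO_ROOT} in both modes.
        printf '%s\n' "-v ${REPO_ROOT}:${REPO_ROOT}"
    fi
}
```

Printing the arguments as one string keeps the sketch short but would split paths containing spaces; a real script would more likely build an argument list.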

Verified locally: bash -n passes; structural change only, behaviour
preserved for the non-CI path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:01:06 +01:00
eae2b98906 fix(linux/build): re-pin BUILDER_IMAGE to amd64 registry digest
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 11s
Two corrections to f9e606d:

1. Registry hostname: docker-registry:5000 isn't DNS-resolvable on the
   SLAB docker host (verified). The fleet-wide convention is the canonical
   docker-registry.silverlabs.uk URL, registered as an insecure-registry
   in /etc/docker/daemon.json on every docker host.

2. Architecture: the original push from WSL2-on-aarch64 produced an arm64
   image that won't run on the amd64 runner. Rebuilt natively on the docker
   host. New manifest digest (amd64-only):
     sha256:9e7161f9f180483f434074d7f32c27c907955232bd0c44efe6dc0ee1d9e56ae0

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 11:59:52 +01:00
7b99516232 feat(linux/build): silvermetal-builder Gitea Actions runner deployment
act_runner-based deployment that handles `runs-on: silvermetal-builder` jobs.
Adapted from the stinky-roger-tv flutter-builder pattern with three changes:

- privileged: true (live-build needs loop devices + chroot)
- 4h job timeout (covers two reproducibility-gated ISO builds + diffoscope)
- silvermetal-builder label maps to catthehacker/ubuntu:act-latest, not the
  silvermetal-builder image — the builder image stays minimal (no docker-cli),
  and build.sh invokes it via `docker run` from the catthehacker job shell

Deployed at /opt/silvermetal-builder-runner/ on the SLAB docker host
(10.0.0.51); registered with git.silverlabs.uk and reporting healthy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 11:59:44 +01:00
f9e606d22d fix(linux/build): pin BUILDER_IMAGE to pushed registry digest (M1.1)
Image built from Dockerfile.builder@36f7672 was pushed to both
docker-registry:5000 (internal) and docker-registry.silverlabs.uk
(external) under tags m1.1-bootstrap + latest. Both URLs serve the
same registry, so the manifest digest is identical:

  sha256:cedef039425e0b0f5901c1023eda820c7aa38ab4b81c2bb1e12d64cadb3d6c85

Default points at the internal hostname for CI; external dev overrides
via BUILDER_IMAGE env var.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 11:48:48 +01:00
36f7672c6f fix(linux/build): make builder image actually build (M1.1)
- Pin debian:bookworm-slim by real digest (resolved 2026-04-26).
- Two-phase install: seed ca-certificates from the default mirror first
  so HTTPS to snapshot.debian.org works, then swap to the pinned snapshot
  for the toolchain itself. Slim images don't ship the CA bundle, so the
  one-shot pinned-source-only install would deadlock on cert verification.

Validated locally: image builds clean, 302MB, all live-build / debootstrap /
mksquashfs / xorriso / diffoscope-minimal present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 04:49:34 +01:00
4444dc11f3 feat(linux/build): scaffold reproducible ISO build pipeline (M1.1)
Vendors Kicksecure derivative-maker as a pinned submodule (18.1.7.4),
adds the wrapper + verify + diagnose scripts, the pinned builder image,
and the reproducibility-gated Gitea Actions workflow. Base flavour only —
no hardening overlay (that's M1.2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 04:25:48 +01:00
0a0075ce66 docs(naming): adopt OS / Enhanced product-line framing + align with existing repos
Two product lines, named to make scope obvious to buyers:
- 🔒 SilverMetal OS — we ship the operating system or ROM
  (Linux, Pixel, Samsung-unlocked, Motorola-unlocked)
- 🛡️ SilverMetal Enhanced — we harden the OS the device already runs
  (Windows, macOS, iOS, generic Android)

Repo alignment:
- SilverVPN already exists as a SilverLABS product (server + MAUI client +
  Linux client + tunnel service). stack/vpn/ is now an integration pointer
  rather than a re-scaffold; per-platform READMEs reference it.
- SilverApple is deprecated; SilverMetal Enhanced — iOS supersedes it.
  Migration step added as roadmap milestone 3I.1.
- SilverDROID name clash explicitly noted as unrelated (it's the SilverSHELL
  AppStore Android client, not an Android ROM).
- SilverChat may overlap with SilverVPN.Client.Chat; alignment decision
  added as roadmap milestone 1.1.1.

Roadmap restructured: phases now track the OS/Enhanced split.
Platform matrix re-sectioned and decision flowchart updated.
README rewritten around the two-product-line framing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 03:30:45 +01:00
7d5f9cc246 chore(scaffold): initial SilverMetal program scaffold
Cross-platform privacy-hardening program. Two-layer product:
- SilverLABS Application Stack (cross-platform spine)
- Platform Hardening Profiles (per-OS, tier-honest)

Platforms: Linux (Debian/Kicksecure), Android (Pixel/Samsung/Moto/generic),
Windows (LTSC IoT), macOS (profile), iOS (MDM profile). Each flavour has
both a preflashed hardware SKU path and a self-apply "harden your existing
device" path.

Includes umbrella docs (README + threat-model, design-principles,
platform-matrix, roadmap, trust-model), per-platform and per-stack-
component README stubs, .gitignore, LICENSE.

Linux v1 ships first; Stack v1 = Browser + VPN + Sync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 03:11:48 +01:00