21 Commits

Author SHA1 Message Date
sysadmin
9fa613b8c1 ci(windows): free build working set before persist copy (oscdimg OK, persist OOM)
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 45s
Build got through the ISO repack but failed copying the 5GB ISO to C:\silvermetal\out
('not enough space'): the build's working set (extracted ISO tree + expanded install.wim
+ 5GB base ISO) fills the single-volume runner, leaving <5GB for the persist copy. The
image grew again with the injected driver. Delete RUNNER_TEMP\smbuild + base.iso (no
longer needed post-build/validate) right before the copy to reclaim ~10GB.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 16:35:01 +01:00
sysadmin
7eec584a66 ci(windows): free disk space before build (clear prior ISO output)
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 21s
The boot.wim now carries WinPE-NetFx/PowerShell (collector), growing the image ~0.4GB,
and each build persists a ~5GB ISO to C:\silvermetal\out. On the single-volume runner
that accumulation starved oscdimg ('Insufficient disk space'). Clear prior output +
stale smbuild work dirs at job start so free space self-heals each run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 09:50:00 +01:00
sysadmin
e6c292da25 ci(windows): install ADK WinPE add-on so boot.wim collector can be staged
All checks were successful
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Successful in 7m26s
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 09:38:13 +01:00
sysadmin
bc847ea6d9 fix(build): discard stale image mounts at startup + ephemeral CI WorkDir
All checks were successful
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Successful in 4m54s
A prior aborted build left a DISM image mounted in the fixed WorkDir,
locking install.wim and breaking the Stage 2 extract clean-up. Add a
Stage 0 that discards any orphaned SilverMetal mounts + loaded hives
before recreating the work dirs, and run CI in an ephemeral per-job
RUNNER_TEMP WorkDir so concurrent/aborted runs can't collide.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 15:14:48 +01:00
sysadmin
500e21f186 ci(branding): force Pester 5 + use v5 config object (fix -Output ambiguity vs Pester 3)
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 46s
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 15:10:52 +01:00
sysadmin
6aa963f024 docs(tests): document branding test suite + elevation requirement
ci(branding): run branding Pester suite before Build packed ISO step
2026-06-09 14:14:13 +01:00
sysadmin
f39823339f ci(welcome): pin .NET 9 SDK via setup-dotnet so MAUI workload band matches 2026-06-09 03:54:18 +01:00
sysadmin
0b1057d0fa ci(welcome): build + test the Welcome solution before the ISO build 2026-06-09 03:50:35 +01:00
sysadmin
d26595d26f ci(windows): persist validated ISO to stable runner path
All checks were successful
Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 3m23s
RUNNER_TEMP is ephemeral; copy the validated build output to C:\silvermetal\out\
so it can be retrieved out of band (e.g. for VM boot-testing).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 21:42:16 +01:00
sysadmin
6d23a892b9 ci: remove throwaway runner-probe/runner-prep diagnostics
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 1m51s
Their job is done (runner topology mapped, C: extended, ISO staged). The build
+ offline-validation pipeline is green on the runner.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 21:13:06 +01:00
sysadmin
3effd5e338 ci(windows): pin base-ISO SHA + verify; ISO staged locally on runner
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 1m55s
Base eval ISO staged at C:\silvermetal\base.iso on GITEA-RUN-WIN (SHA256
2CEE70BD...CB29 pinned in inputs.manifest.json). Repo var now points at that
local path, so the build reads locally - no NAS share auth / no CI creds.
Dropped -SkipInputVerify so the build verifies the pinned hash.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 20:58:07 +01:00
sysadmin
ee34b8e373 ci: probe credential-less net use as SYSTEM (stored cmdkey)
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 4s
2026-06-08 20:54:33 +01:00
sysadmin
78d4d84f88 ci: runner-prep workflow (extend C: only); drop in-CI ISO staging
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 4s
Master creds must not live in this public repo's Actions, so ISO staging is
handled out-of-band. runner-prep now only extends C: into the resized virtual
disk. Quoted the step name (trailing-colon YAML fix).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 19:47:56 +01:00
sysadmin
cc01675056 ci: add throwaway runner-probe workflow to discover runner topology
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 4s
Temporary diagnostic to see the silverlabs-runner-win host identity, drives,
share mounts/stored creds, and ISO reachability before wiring the base-ISO
source. Removed once the source is settled.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 19:33:37 +01:00
sysadmin
5e42da619e ci(windows): make base-ISO acquire step path-aware (UNC/local + optional SMB creds)
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 4s
SILVERMETAL_BASE_ISO_URL now accepts an HTTP(S) URL or a UNC/local path. For a
UNC share that the SYSTEM-context runner can't read anonymously, optional repo
secrets SILVERMETAL_ISO_SHARE_USER/_PASS map the share root via net use first.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 19:19:40 +01:00
sysadmin
1c886deca3 ci(windows): implement M2 ISO build + Gitea Windows-runner workflow
Some checks failed
Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 34s
Implement build.ps1 (M2): mount/extract the base ISO, offline-service
install.wim (inject GPD drivers if staged, debloat appx, bake SetupComplete.cmd
+ hardening modules into \Windows\Setup\Scripts), inject autounattend.xml,
oscdimg UEFI repack, emit SHA-256 + SBOM. Elevation + oscdimg guarded.

Add .gitea/workflows/build-iso-windows.yaml: runs on the self-hosted
silverlabs-runner-win (windows-latest), ensures ADK Deployment Tools, acquires
the base ISO from repo var SILVERMETAL_BASE_ISO_URL or a pre-staged path, builds,
validates the baked payload offline, uploads SBOM/SHA (+ISO on dispatch/tag),
attaches to a Gitea release on win-v* tags. Mirrors build-iso-linux.yaml.

Add tests/Assert-IsoStructure.ps1: the no-nested-virt CI gate - mounts the built
ISO + install.wim read-only and asserts autounattend.xml, SetupComplete.cmd, and
the hardening modules are correctly baked. Full QEMU boot+Verify is a follow-on.

Switch autounattend to Windows' native SetupComplete.cmd auto-run (SYSTEM, end
of setup) instead of a duplicate FirstLogonCommands call.

Untested until first runner execution (dev box is ARM64). All PS parse-clean;
autounattend XML + workflow YAML valid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 18:11:05 +01:00
34bc442dd8 fix(linux/build): cover all ISO9660 dates + locate residual byte drift (M1.1 iter34)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m40s
Run #4281 cleared every layer above the ISO9660 wrapper:

    SHA256 (squashfs payload)
    caed117ca72c6c1d9204c49dd749d5f7b372f3a19cac1b2a7e66bee452a8d501  /tmp/.../a.squashfs
    caed117ca72c6c1d9204c49dd749d5f7b372f3a19cac1b2a7e66bee452a8d501  /tmp/.../b.squashfs

…squashfs is now byte-identical, ISO TOC is identical, file listing
diff is empty, but ISO SHA still differs. The remaining drift is in
the ISO9660 metadata region between the system area (first 32 KiB)
and the file payload start.

Two complementary changes:

1. xorriso post-process now sets *every* date field xorriso writes,
   not just the obvious two:

     -alter_date_r all     — atime + mtime + btime on all nodes,
                             not just mtime. ISO9660 directory
                             records carry creation+modification
                             timestamps.
     -volume_date c m x f u s — every volume-descriptor date:
       c=creation  m=modification  x=expiration  f=effective
       u=system area  s=path table
     Default for any unset volume_date is "now", which is what was
     leaking through despite us setting c+m.

2. diagnose-divergence.sh now does whole-file cmp -l (capped at 200
   lines so 1 GiB of all-different doesn't drown the report) and on
   any divergence, dumps a 128-byte xxd window from each ISO around
   the first differing byte plus a unified diff between the two
   windows. This tells us in the next failure log "first byte differs
   at offset N (LBA M), bytes around it look like X" — pinpoints the
   ISO9660 region without needing artifact download.

Workflow tail-into-log step wired up the two new files
(iso-cmp-first-200.txt, iso-around-first-diff.diff).

If iter34 still fails the gate, the new diagnostic tells us exactly
which structure (volume descriptor, path table, directory record,
boot catalog…) is still drifting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 00:29:37 +01:00
c9e67d8b47 fix(linux/build): staged divergence diagnostic, avoid OOM (M1.1 iter26)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m36s
Run #4273 confirmed two things:

1. The reproducibility gate works end-to-end. Both builds produced
   ISOs (1077194752 vs 1077202944 bytes — 8 KiB delta, exactly one
   squashfs block worth of compressed-payload drift) and the compare
   step caught it.

2. diffoscope, run on the whole 1 GB ISO inside the silvermetal-builder
   container, gets OOM-killed before producing any output:

       diagnose-divergence.sh: line 44:    13 Killed
         diffoscope --max-report-size 100000000 --html ... --text ... A.iso B.iso

   The host has 19 GiB free, but diffoscope's full recursion through
   ISO -> squashfs -> ~thousands of inner files needs more memory than
   that for a 1 GB image. Setting --max-report-size only caps the
   output, not the working-set.

Rewrite diagnose-divergence.sh to do staged, cheap-to-expensive
analysis:
  1. sha256 + sizes (always)
  2. xorriso TOC of both ISOs (every node: mode/size/mtime/path) -> diff
  3. Pull just live/filesystem.squashfs out of each ISO,
     sha256 it + `unsquashfs -ll` it, diff the listings — this is
     where the per-file-size signal lives.
  4. Targeted diffoscope on the squashfs payload only, with
     --max-container-depth 2 + --max-text-report-size 5MB + --no-html
     + a 10-minute timeout. Bounded enough to finish without the OOM.

Drops `set -e` — every step `|| true`s itself so we get partial output
even when one stage fails.

Workflow tail-into-log step now prints the new staged outputs:
  * toc-diff.txt   — what changed at the ISO level
  * sqfs-ls-diff.txt — which inner files have different sizes/mtimes
  * sqfs-diff.txt   — diffoscope on the squashfs only
  * squashfs-sha256.txt
  * iso-header-cmp.txt — first-8KB cmp -l for header-level drift
  * sizes.txt / sha256.txt / checklist.md as before

Should land us a focused list of "these N files inside the squashfs
have different bytes" — that's what we need to find what's leaking
non-determinism into the build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 19:54:35 +01:00
3f51b2fd7f feat(linux/build): run diffoscope inside silvermetal-builder + tail diff to log (M1.1 iter25)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 35m22s
Run #4272 hit the M1.1 reproducibility gate as designed — both builds
completed, ISOs differed (A=ff2e7444…, B=9ec7f3da…), diagnose-divergence
fired. Two things stopped that diagnostic from being useful:

1. **diffoscope wasn't available.** diagnose-divergence.sh runs in the
   catthehacker job container, which has cmp but no diffoscope. The
   silvermetal-builder image we built two minutes earlier *does* have
   diffoscope-minimal (Dockerfile.builder line 109). Run the diagnostic
   inside that image: docker run --volumes-from $self_cid + the digest
   the builder-image job passed in via BUILDER_IMAGE. Mounts the same
   /workspace path so REPO_ROOT-relative resolution in
   diagnose-divergence.sh works unchanged.

2. **The artifact was unreachable.** actions/upload-artifact@v3 against
   Gitea 1.25.2 reports "successfully uploaded" but the
   /api/v1/repos/.../actions/runs/{id}/artifacts list comes back empty,
   and every download path probed returns 404. Known v3 incompatibility
   — v3 uses the legacy GitHub Services API endpoint that Gitea
   doesn't expose for retrieval.

   Workaround: tail the divergence content into the workflow log
   directly, so it shows up in `gitea actions logs` regardless of
   upload-artifact's behaviour. Specifically: sizes.txt, sha256.txt,
   checklist.md, head -n 400 of diff.txt (or cmp.txt as fallback).
   That's enough to see what's diverging without needing the artifact.
   Upload-artifact step kept in place for whenever Gitea's API gets
   sorted (fix-once-then-forget).

The self-discovery loop (docker ps + inspect filtering by
/workspace/SilverLABS/SilverMetal mount destination) is the same one
build.sh uses; concurrency: 1 in this workflow guarantees a single
match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 19:14:44 +01:00
e260fe1c81 ci(linux/build): self-host the builder image build + iter16 reprepro wrap (M1.1)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 2s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped
Two coupled changes that unblock the M1.1 iter loop. Both belong in CI;
iter1-15 was wrong to require human-in-the-loop steps to make progress.

1. **CI now builds Dockerfile.builder.**

   `.gitea/workflows/build-iso-linux.yaml` grows a `builder-image` job
   that runs ahead of `build-and-verify`. It rebuilds the silvermetal-
   builder image from `linux/build/docker/Dockerfile.builder`, pushes it
   to `docker-registry.silverlabs.uk/silvermetal-builder:m1.1-<sha>` (and
   `:latest`), reads the resulting digest off `docker inspect`, and
   feeds it forward as a job output. `build-and-verify` consumes that
   digest as the `BUILDER_IMAGE` env override that `build.sh` already
   honours (and validates is digest-form on line ~37).

   That kills the old workflow where every Dockerfile.builder change
   required a human to `docker build` + `docker push` on 10.0.0.51 by
   hand and then bump the digest in `build.sh` in lockstep. The crash
   that triggered this (exit 126 mid-iter16 build run) was a symptom of
   that off-CI step still existing.

   Both jobs run on the existing `silvermetal-builder` runner; the host
   docker daemon is shared via DooD and is already authenticated to
   `docker-registry.silverlabs.uk` (linux/build/runner/docker-compose.yml
   mounts `/root/.docker:/root/.docker:ro`), so no extra login step.

   The hardcoded `BUILDER_IMAGE` digest in `build.sh` stays as the
   local-developer / offline-rebuild fallback. Comments updated in
   `build.sh`, `Dockerfile.builder`, and `linux/build/README.md` to
   match the new flow.

2. **reprepro wrapper for the benign "No priority for X" case.**

   Pinned derivative-maker's `2100_create-debian-packages` (with
   --target iso) re-imports source packages from snapshot.debian.org
   into a local apt repo via `reprepro --basedir … includedsc local
   <foo>.dsc`. The local repo's `conf/distributions` ships no
   `DscOverride` entries, so any source package whose `.dsc` lacks an
   explicit Priority field trips:

       No priority for 'X', skipping.
       There have been errors!

   …and reprepro exits 255. dm-reprepro-wrapper bubbles that up,
   2100_create-debian-packages aborts. The current offender is
   `virtualbox_*.dsc` (key import is now fine — debian-keyring landed in
   commit 4aa59ba — but the priority field gap remains). VirtualBox is
   not in SilverMetal's `--target iso` set, so the sane behaviour is
   "log it, continue".

   New `linux/build/docker/silvermetal-reprepro-wrap.sh` shadows
   `/usr/bin/reprepro` at `/usr/local/bin/reprepro` (PATH precedence).
   It runs the real reprepro, captures merged stdout+stderr, and:
   - if rc != 0 AND every non-blank output line matches one of the
     known-benign patterns ("No priority for 'X', skipping." plus the
     trailing "There have been errors!"), emits the output, logs one
     line of explanation to stderr, and exits 0;
   - otherwise emits the output and propagates rc unchanged.

   Any *other* reprepro error path stays fatal — only the specific
   "No priority for X" pattern is neutralised. `dm-reprepro-wrapper`
   resolves `reprepro` via `\$PATH` so it picks up the wrapper
   transparently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:30:08 +01:00
4444dc11f3 feat(linux/build): scaffold reproducible ISO build pipeline (M1.1)
Vendors Kicksecure derivative-maker as a pinned submodule (18.1.7.4),
adds the wrapper + verify + diagnose scripts, the pinned builder image,
and the reproducibility-gated Gitea Actions workflow. Base flavour only —
no hardening overlay (that's M1.2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 04:25:48 +01:00