SilverMetal

Author	SHA1	Message	Date
sysadmin	9c65c1c3a0	docs(windows): Welcome spec revisions per review All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 3m50s Details - Daily account defaults to Standard User (least-privilege) + separate SilverOS Admin elevation account; single-admin model demoted to an option. - Hardened baseline applies to ALL flavours (none unhardened); Daily-Driver is the default/recommended (balanced middle), Privacy-Max is opt-in strictest. - Name confirmed: SilverOS Welcome. Stack installs remain gated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 01:47:45 +01:00
sysadmin	b5cfd26f5f	docs(windows): SilverOS Welcome app spec (v1) All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 3m42s Details First-logon Blazor Hybrid (MAUI) onboarding app: bootstrap auto-login -> wizard (persona->flavour, account + BitLocker PIN, prefs) -> apply via the existing §A-H PowerShell modules per a JSON flavour manifest -> create real account, enrol BitLocker, self-destruct bootstrap. Resolves the repo-throwaway-password and interactive-PIN gaps. v1 = interactive auto-launch only; silent pre-baked mode + fleet enrolment + Linux-shared model deferred. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 01:41:48 +01:00
sysadmin	638d08696d	feat(windows): set local-account creds + UK keyboard/region All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 4m33s Details - Local admin password -> "open sesame" (still a placeholder for the public repo; SKU pipeline must replace per-device). - UK keyboard (InputLocale 0809) + UK region/formats (SystemLocale/UserLocale en-GB). Display UILanguage stays en-US because the eval media is en-US and lacks the en-GB display pack -- true en-GB display needs en-GB LTSC media or an injected language pack (future build step). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 01:14:08 +01:00
sysadmin	a0b9c2c989	fix(windows/hardening): tolerate missing hibernation (module G) All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 3m51s Details VM run: `powercfg /hibernate on` writes to stderr where hibernation is unsupported (VMs), which under ErrorActionPreference=Stop aborted module G after its earlier lock-screen settings applied. Wrap it so the module completes cleanly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 00:46:13 +01:00
sysadmin	ba3ef0d45a	fix(windows): hardening modules never ran (SetupComplete quoting bug) All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 4m12s Details VM runtime test (offline disk mount) revealed SetupComplete.cmd ran but its inline multi-line `powershell -Command` (cmd ^-continuation + nested escaped quotes) failed to parse ("string is missing the terminator") -> the §A-H modules never executed. Offline CI assertions only proved the files were BAKED, not that they RUN. Fix: move the module runner into hardening/Invoke-Hardening.ps1 and call it with -File (no cmd quoting). Runner runs 00..08 in order then Verify (writes verify-report.json in-line as SYSTEM; reboot/PIN-dependent gates show pending). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 00:34:05 +01:00
sysadmin	d690b14fc4	feat(windows): automate OOBE region/keyboard (oobeSystem International-Core) All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 4m33s Details VM run reached OOBE but the region/keyboard pages were still interactive: the oobeSystem pass lacked Microsoft-Windows-International-Core, so 24H2 OOBE (CloudExperienceHost) prompted for them even under legacy Setup. Add it + HideOEMRegistrationScreen + HideLocalAccountScreen so OOBE is fully hands-off to the local account / desktop. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 00:16:49 +01:00
sysadmin	448de1c570	fix(windows/build): revert to prompt boot image (no-prompt caused reinstall loop) All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 4m29s Details The no-prompt efisys + media-first boot order reboot-loops: every post-copy reboot re-boots the media before the disk install completes, so it never finishes (symptom: "no bootable device" after ejecting). Standard efisys.bin (press-any-key) lets reboots fall through to the installed disk. Legacy-Setup boot.wim patch + /unattend retained (the real fix). Documented VM-verified result + the residual one-click WinPE language page in iso-builder.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 23:58:12 +01:00
sysadmin	17b2ec2be7	fix(windows/build): launch legacy Setup with explicit /unattend All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 4m47s Details Legacy Setup (forced via boot.wim CmdLine) still showed the language page because implicit answer-file search is unreliable when setup is launched via CmdLine. Inject autounattend.xml into boot.wim (X:\autounattend.xml) and set CmdLine to "X:\sources\setup.exe /unattend:X:\autounattend.xml" so all passes are consumed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 23:31:37 +01:00
sysadmin	5e6303d48e	feat(windows): force legacy Setup on 24H2 to fix hands-off install All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 4m15s Details VM test proved Win11 24H2 redesigned "ConX" Setup ignores the windowsPE pass of autounattend.xml (manual language/keyboard/region prompts). Deep-research-verified fix: patch sources\boot.wim index 2 to launch the legacy installer. build.ps1 stage 2b: mount boot.wim idx2, load offline SYSTEM hive, set HKLM\SYSTEM\Setup\CmdLine=X:\sources\setup.exe, unload, commit. Also place autounattend.xml in \sources as well as ISO root. Legacy engine consumes all four passes -> fully hands-off. Documented in iso-builder.md §3a (incl. rejected winpeshl.ini / RunSynchronous alternatives + ConX-may-change caveat). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 23:20:37 +01:00
sysadmin	b4d303cbaa	feat(windows): unattended install — noprompt boot + disk config (M2) All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 3m25s Details VM boot test proved the ISO boots under UEFI+SecureBoot+TPM2 but stopped at the "press any key" prompt and (post-boot) the disk screen. Enable hands-off install: - build.ps1: use efisys_noprompt.bin (fall back to efisys.bin) so the ISO boots without a keypress. - autounattend.xml: add GPT/UEFI DiskConfiguration (wipe disk 0 -> EFI/MSR/Win), ImageInstall index 1, AcceptEula (eval = no key). Bootstrap local-admin pw is a PLACEHOLDER the SKU pipeline must replace. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 21:55:47 +01:00
sysadmin	d26595d26f	ci(windows): persist validated ISO to stable runner path All checks were successful Build SilverMetal Enhanced - Windows ISO / build (push) Successful in 3m23s Details RUNNER_TEMP is ephemeral; copy the validated build output to C:\silvermetal\out\ so it can be retrieved out of band (e.g. for VM boot-testing). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 21:42:16 +01:00
SilverLABS	a6afc604c5	Merge pull request 'ci(windows): M2 ISO build + Gitea Windows-runner workflow' (#3) from ci/build-iso-windows into main Some checks failed Build SilverMetal Enhanced - Windows ISO / build (push) Failing after 19s Details	2026-06-08 20:13:11 +00:00
sysadmin	6d23a892b9	ci: remove throwaway runner-probe/runner-prep diagnostics Some checks failed Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 1m51s Details Their job is done (runner topology mapped, C: extended, ISO staged). The build + offline-validation pipeline is green on the runner. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 21:13:06 +01:00
sysadmin	5dbbaaf22c	fix(windows/build): drop oscdimg -bootdata inner quotes (PS arg mangling) All checks were successful Build SilverMetal Enhanced - Windows ISO / build (pull_request) Successful in 3m24s Details Stages 1-5 pass; oscdimg failed with Error 123 because PowerShell doubled the embedded quotes in -bootdata. Work paths have no spaces, so omit the inner quotes around etfsboot.com/efisys.bin entirely. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 21:08:33 +01:00
sysadmin	3effd5e338	ci(windows): pin base-ISO SHA + verify; ISO staged locally on runner Some checks failed Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 1m55s Details Base eval ISO staged at C:\silvermetal\base.iso on GITEA-RUN-WIN (SHA256 2CEE70BD...CB29 pinned in inputs.manifest.json). Repo var now points at that local path, so the build reads locally - no NAS share auth / no CI creds. Dropped -SkipInputVerify so the build verifies the pinned hash. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 20:58:07 +01:00
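The pin-and-verify step in 3effd5e338 can be illustrated with a minimal Python sketch. The actual build is PowerShell and the schema of inputs.manifest.json is not shown in this log, so the `{"base_iso": {"sha256": ...}}` layout and the function names below are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a multi-GiB ISO never has to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned_input(iso_path: Path, manifest_path: Path) -> None:
    """Raise ValueError unless the ISO matches the hash pinned in the manifest.

    The {"base_iso": {"sha256": ...}} layout is an assumed, illustrative schema.
    """
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    expected = manifest["base_iso"]["sha256"].lower()
    actual = sha256_of(iso_path)
    if actual != expected:
        raise ValueError(f"base ISO hash mismatch: expected {expected}, got {actual}")
```

Failing hard on a mismatch mirrors the effect of dropping -SkipInputVerify: the build stops rather than servicing an unpinned image.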
sysadmin	ee34b8e373	ci: probe credential-less net use as SYSTEM (stored cmdkey) Some checks failed Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 4s Details	2026-06-08 20:54:33 +01:00
sysadmin	78d4d84f88	ci: runner-prep workflow (extend C: only); drop in-CI ISO staging Some checks failed Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 4s Details Master creds must not live in this public repo's Actions, so ISO staging is handled out-of-band. runner-prep now only extends C: into the resized virtual disk. Quoted the step name (trailing-colon YAML fix). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 19:47:56 +01:00
sysadmin	cc01675056	ci: add throwaway runner-probe workflow to discover runner topology Some checks failed Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 4s Details Temporary diagnostic to see the silverlabs-runner-win host identity, drives, share mounts/stored creds, and ISO reachability before wiring the base-ISO source. Removed once the source is settled. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 19:33:37 +01:00
sysadmin	5e42da619e	ci(windows): make base-ISO acquire step path-aware (UNC/local + optional SMB creds) Some checks failed Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 4s Details SILVERMETAL_BASE_ISO_URL now accepts an HTTP(S) URL or a UNC/local path. For a UNC share that the SYSTEM-context runner can't read anonymously, optional repo secrets SILVERMETAL_ISO_SHARE_USER/_PASS map the share root via net use first. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 19:19:40 +01:00
sysadmin	1c886deca3	ci(windows): implement M2 ISO build + Gitea Windows-runner workflow Some checks failed Build SilverMetal Enhanced - Windows ISO / build (pull_request) Failing after 34s Details Implement build.ps1 (M2): mount/extract the base ISO, offline-service install.wim (inject GPD drivers if staged, debloat appx, bake SetupComplete.cmd + hardening modules into \Windows\Setup\Scripts), inject autounattend.xml, oscdimg UEFI repack, emit SHA-256 + SBOM. Elevation + oscdimg guarded. Add .gitea/workflows/build-iso-windows.yaml: runs on the self-hosted silverlabs-runner-win (windows-latest), ensures ADK Deployment Tools, acquires the base ISO from repo var SILVERMETAL_BASE_ISO_URL or a pre-staged path, builds, validates the baked payload offline, uploads SBOM/SHA (+ISO on dispatch/tag), attaches to a Gitea release on win-v* tags. Mirrors build-iso-linux.yaml. Add tests/Assert-IsoStructure.ps1: the no-nested-virt CI gate - mounts the built ISO + install.wim read-only and asserts autounattend.xml, SetupComplete.cmd, and the hardening modules are correctly baked. Full QEMU boot+Verify is a follow-on. Switch autounattend to Windows' native SetupComplete.cmd auto-run (SYSTEM, end of setup) instead of a duplicate FirstLogonCommands call. Untested until first runner execution (dev box is ARM64). All PS parse-clean; autounattend XML + workflow YAML valid. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 18:11:05 +01:00
SilverLABS	d58aa3ec17	Merge pull request 'docs(windows): Enhanced-Windows hardening spec (GPD Pocket 4 reference)' (#2) from docs/enhanced-windows-hardening-spec into main	2026-06-08 14:45:15 +00:00
sysadmin	3a30a0421e	docs(windows): add ISO-builder design + scaffold the windows/ tree Add windows/iso-builder.md: reproducible custom-packed-ISO pipeline design for SilverMetal Enhanced - Windows on IoT Enterprise LTSC. Covers the licensing frame (IoT = blessed channel for preinstalled custom images; self-apply stays a builder), 7 build stages (verify/extract/DISM-service/inject-unattend/brand/ oscdimg-repack/attest), the offline-vs-first-boot-vs-firmware control split, an honest reproducibility scope (pinned inputs + SBOM + attestation, NOT bit- identical on Windows), and M0-M4 milestones. Scaffold windows/ per the planned layout: - installer/ build.ps1 (7-stage orchestrator, stages stubbed to M2), inputs.manifest.json (pinned-input schema), autounattend.xml (local-account OOBE), oem/SetupComplete.cmd (first-boot runner) - hardening/ shared §A-H PowerShell modules + Verify-SilverMetalWindows.ps1 (used by BOTH the ISO first-boot path and the self-apply track). BitLocker module enforces TPM+PIN and blocks TPM-only. - policies/ wdac/ debloat/ stack-installer/ drivers/ tests/ scaffolded with READMEs; wdac/ documents audit->enforce; debloat/ flags Tiny11/NTLite as an anti-pattern; rename applocker/ -> wdac/ realised. All 11 PowerShell scripts parse clean; manifest JSON + autounattend XML valid. Module bodies are M1 scaffold (safe: log + policy-set; interactive/firmware steps documented, not faked). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 15:35:13 +01:00
sysadmin	ea2de4339d	docs(windows): add Enhanced-Windows hardening spec (Pocket 4 reference) Add windows/hardening-spec.md: the detailed config-layer hardening spec for SilverMetal Enhanced - Windows, with the GPD Pocket 4 (AMD Strix Point) as reference device. Eight control domains (provisioning, boot/firmware trust, data-at-rest, kernel/credential isolation, app control, network/radios, physical/lock-screen, privacy/update) each with verification commands, a buyer-facing residual-risk statement, and one-off -> SKU productization notes. Refine the windows/README.md v1 scope to match, grounded in the 2026-06-08 deep-research assessment: - BitLocker TPM+PIN (never TPM-only) - PIN defeats the faulTPM-class offline fTPM attack that is literally a BitLocker VMK extraction - WDAC (App Control), kernel-enforced, audit-first then enforce, as primary; AppLocker demoted to fallback (rename planned applocker/ -> wdac/) - Telemetry at GP+service+firewall layers, NOT hosts-file blocking of MS domains (that breaks Windows Update; violates "update or die") - Add VBS/HVCI/Credential Guard/Kernel DMA Protection to scope + verify gates - Note Enterprise (prototype) vs IoT Enterprise LTSC (SKU target) equivalence Bound by docs/threat-model.md and docs/design-principles.md; nation-state / firmware tier explicitly NOT claimed on consumer UMPC silicon. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 15:19:37 +01:00
SysAdmin	303f602d38	fix(linux/build): keep file handle open through TF patch loop (M1.1 iter38) All checks were successful Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Successful in 35m9s Details Run #4285 hit: Traceback (most recent call last): File "<stdin>", line 26, in <module> ValueError: seek of closed file iter37's Python heredoc had the search/seek/write loop OUTSIDE the `with open(...) as f:` block — the file closes when the `with` body finishes, and `data = f.read()` was the only statement inside it. Indent the loop inside the with-suite. No semantic changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:06:45 +01:00
SysAdmin	6bafa85231	fix(linux/build): byte-patch Rock Ridge TF dates after xorriso (M1.1 iter37) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 17m27s Details Run #4284's diagnostic (iter36) confirmed xorriso ignores every date-setting command we throw at it for the node it just -updated: flag=0x0e → CREATION + MODIFICATION + ACCESS (short form) CREATION ✅ (set from source file btime via touch -d): 7e 05 08 00 2c 3a 00 (= SOURCE_DATE_EPOCH) MODIFICATION ❌ (still wall-clock): A=7e 05 08 01 02 2c 00 B=7e 05 08 01 12 33 00 ACCESS ❌ (still wall-clock): A=7e 05 08 01 02 2c 00 B=7e 05 08 01 12 32 00 Tested across iters 34-36: * `-alter_date_r all "=N" /` — only fixed CREATION (b) * `-alter_date all "=N" path` after -update — same * `-volume_date c m x f u s "=N"` — volume-level only * `touch -d "@N" "${new_sqfs}"` before — fixed CREATION via btime * various orderings, with/without `--` terminators None override xorriso's wall-clock stamping of MOD/ACCESS at -commit. Concede that fight and just patch the bytes after xorriso writes the ISO. We KNOW exactly what's wrong — the TF entry for /live/filesystem.squashfs has its CREATION slot correct (= 7-byte ISO9660 short-form encoding of SOURCE_DATE_EPOCH) but MODIFICATION and ACCESS still hold the post-process commit time. So copy the 7 CREATION bytes over the 7 MODIFICATION bytes and 7 ACCESS bytes. The patcher (embedded Python, since silvermetal-builder ships python3): * Finds every TF entry header (`54 46 1a 01 0e`) near the "filesystem.squashfs" NM tag (96-byte window — anchors both ends so we don't touch some other file's TF entry). * Copies CREATION (offset +5..+12) onto MODIFICATION (+12..+19) and ACCESS (+19..+26). * Skips entries already correct (so re-running is a no-op). * Reports how many entries were patched. 
This is surgical: only the entry we know is broken, and only when its MOD/ACCESS actually differ from the (known-correct) CREATION. If the next run still drifts, the diagnostic byte-offset will tell us where the residual leak is (almost certainly in some volume descriptor field we haven't covered yet — at which point we extend the patcher). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 02:22:56 +01:00
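The patcher described in 6bafa85231 can be sketched roughly as below. This is not the embedded script from the commit: it works on an in-memory `bytearray` rather than seeking in the ISO file, and it reads the "96-byte window" as "the name tag appears within 96 bytes either side of the TF header", which is an assumption. The slot names follow the commit message; the code only relies on there being three consecutive 7-byte slots after the 5-byte header.

```python
TF_HEADER = bytes.fromhex("54461a010e")  # 'T' 'F', length 0x1a, version 1, flags 0x0e
NAME_TAG = b"filesystem.squashfs"
WINDOW = 96   # assumed meaning: search radius around the TF header
SLOT = 7      # one short-form timestamp


def patch_tf_dates(image: bytearray) -> int:
    """Copy the first timestamp slot over the next two; return entries patched."""
    patched = 0
    pos = image.find(TF_HEADER)
    while pos != -1:
        first = pos + len(TF_HEADER)      # offset +5
        end = first + 3 * SLOT            # offset +26
        near_name = image.find(NAME_TAG, max(0, pos - WINDOW), pos + WINDOW) != -1
        if near_name and end <= len(image):
            creation = bytes(image[first:first + SLOT])
            second = image[first + SLOT:first + 2 * SLOT]
            third = image[first + 2 * SLOT:end]
            # Skip entries that are already consistent so re-running is a no-op.
            if second != creation or third != creation:
                image[first + SLOT:first + 2 * SLOT] = creation
                image[first + 2 * SLOT:end] = creation
                patched += 1
        pos = image.find(TF_HEADER, pos + 1)
    return patched
```

For a real ISO of around 1 GiB, the same logic would be applied with seek and write on the open file (or an `mmap`) instead of loading the whole image.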
SysAdmin	60384e70c8	fix(linux/build): explicit -alter_date all on updated squashfs node (M1.1 iter36) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m27s Details Run #4283's enriched diagnostic gave us a precise, low-level reading of what's still drifting: Hex around first ISO divergence: flag=0x0e → CREATION + MODIFICATION + ACCESS (Rock Ridge TF, short form) CREATION: `7e 05 08 00 06 2d 00` (=SOURCE_DATE_EPOCH, both A and B ✅) MODIFICATION: A=`7e 05 08 00 18 10 00` → 2026-05-08 00:24:16 B=`7e 05 08 00 28 14 00` → 2026-05-08 00:40:20 ACCESS: A=`7e 05 08 00 18 0f 00` → 2026-05-08 00:24:15 B=`7e 05 08 00 28 13 00` → 2026-05-08 00:40:19 The MODIFICATION/ACCESS times match the wall-clock minute when each build's xorriso -commit fired. So: * iter35's `touch -d "@${SDE}" "${new_sqfs}"` did nothing for mtime — xorriso doesn't propagate the source file's mtime through -update. * iter34's `-alter_date_r all "=N" /` updated creation (btime → Rock Ridge TF CREATION) but not mtime/atime — possibly because -update runs at -commit time and re-stamps the node's a/m timestamps with the actual write time, after `-alter_date_r`'s in-memory update. Fix: add an explicit, narrowly-scoped `-alter_date all "=N" /live/filesystem.squashfs --` AFTER `-update` and BEFORE the global `-alter_date_r`. Per-file alter_date appears to be the last word xorriso processes against that specific node. Keep -alter_date_r all and the full -volume_date c/m/x/f/u/s as belt-and-suspenders. If this clears, M1.1 reproducibility gate passes. If not, we'll know xorriso's `-update` is genuinely stamping at commit time independent of any in-memory date setting, and the move is to skip -update and do an mkisofs-style full rewrite from the chroot directly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 01:44:58 +01:00
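The hex-to-timestamp readings quoted in 60384e70c8 follow the ISO9660 7-byte short-form date layout (years since 1900, month, day, hour, minute, second, then a signed GMT offset in 15-minute units). A small decoder, offered as a sketch with a hypothetical function name, reproduces them:

```python
from datetime import datetime, timedelta, timezone


def decode_iso9660_short_date(raw: bytes) -> datetime:
    """Decode a 7-byte ISO9660 directory-record style timestamp."""
    if len(raw) != 7:
        raise ValueError("short-form ISO9660 dates are exactly 7 bytes")
    years_since_1900, month, day, hour, minute, second = raw[:6]
    # Final byte: signed offset from GMT in 15-minute intervals.
    quarter_hours = int.from_bytes(raw[6:7], "big", signed=True)
    tz = timezone(timedelta(minutes=15 * quarter_hours))
    return datetime(1900 + years_since_1900, month, day, hour, minute, second, tzinfo=tz)


# Build A's MODIFICATION bytes from the diagnostic above.
print(decode_iso9660_short_date(bytes.fromhex("7e050800181000")))
```

Decoding both builds' bytes this way makes it easy to see that only the minute and second fields drift between runs.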
SysAdmin	1b1a1eabed	fix(linux/build): touch squashfs to SOURCE_DATE_EPOCH before xorriso (M1.1 iter35) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m25s Details Run #4282's enriched diagnostic pinpointed the exact remaining drift: diagnose: first ISO byte difference at offset 205152 (LBA 100) 205153 7 10 205154 27 0 205155 57 3 205156 52 55 Decoded as decimal, those are the day/hour/minute/second fields of an ISO9660 7-byte directory record date: A: dd=7 hh=23 mm=47 ss=42 (May 7 23:47:42 UTC) B: dd=8 hh=0 mm=3 ss=45 (May 8 00:03:45 UTC) Match the wall-clock mtime of /live/filesystem.squashfs that the TOC diff also still showed: -/live/filesystem.squashfs ... May 7 23:47 +/live/filesystem.squashfs ... May 8 00:03 Why iter34's `-alter_date_r all "=N" /` didn't catch it: xorriso applies `-alter_date_r` to the in-memory ISO node table, but `-update <src> <iso_path>` writes the directory record's mtime at `-commit` time using the SOURCE FILE's mtime — overriding whatever was in the node table. So the relevant mtime is on `/tmp/silvermetal-rebuilt- XXXXXX.squashfs` (the freshly-`mksquashfs`d file), and that has wall-clock mtime. Fix: touch the source file to SOURCE_DATE_EPOCH right before xorriso reads it. sudo touch -d "@${SOURCE_DATE_EPOCH}" "${new_sqfs}" Bonus: diagnose-divergence.sh now falls back to `od -t x1z` when xxd isn't available — silvermetal-builder ships coreutils but not vim-common, so the iter34 xxd window was silently empty. The new od-based dump is what landed the actual byte values in run #4282. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 01:06:45 +01:00
SysAdmin	34bc442dd8	fix(linux/build): cover all ISO9660 dates + locate residual byte drift (M1.1 iter34) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m40s Details Run #4281 cleared every layer above the ISO9660 wrapper: SHA256 (squashfs payload) caed117ca72c6c1d9204c49dd749d5f7b372f3a19cac1b2a7e66bee452a8d501 /tmp/.../a.squashfs caed117ca72c6c1d9204c49dd749d5f7b372f3a19cac1b2a7e66bee452a8d501 /tmp/.../b.squashfs …squashfs is now byte-identical, ISO TOC is identical, file listing diff is empty, but ISO SHA still differs. The remaining drift is in the ISO9660 metadata region between the system area (first 32 KiB) and the file payload start. Two complementary changes: 1. xorriso post-process now sets every date field xorriso writes, not just the obvious two: -alter_date_r all — atime + mtime + btime on all nodes, not just mtime. ISO9660 directory records carry creation+modification timestamps. -volume_date c m x f u s — every volume-descriptor date: c=creation m=modification x=expiration f=effective u=system area s=path table Default for any unset volume_date is "now", which is what was leaking through despite us setting c+m. 2. diagnose-divergence.sh now does whole-file cmp -l (capped at 200 lines so 1 GiB of all-different doesn't drown the report) and on any divergence, dumps a 128-byte xxd window from each ISO around the first differing byte plus a unified diff between the two windows. This tells us in the next failure log "first byte differs at offset N (LBA M), bytes around it look like X" — pinpoints the ISO9660 region without needing artifact download. Workflow tail-into-log step wired up the two new files (iso-cmp-first-200.txt, iso-around-first-diff.diff). 
If iter34 still fails the gate, the new diagnostic tells us exactly which structure (volume descriptor, path table, directory record, boot catalog…) is still drifting. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 00:29:37 +01:00
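The "first byte differs at offset N (LBA M)" report that 34bc442dd8 adds to diagnose-divergence.sh is built on `cmp -l`. A rough Python equivalent, with hypothetical function names and assuming the usual 2048-byte ISO9660 logical sector, might look like this:

```python
SECTOR_SIZE = 2048  # ISO9660 logical block size assumed for the LBA conversion


def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the zero-based offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # One file is a strict prefix of the other.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None
            offset += len(chunk_a)


def lba_of(offset: int) -> int:
    """Map a byte offset to the logical block that contains it."""
    return offset // SECTOR_SIZE
```

With these definitions, offset 205152 maps to LBA 100, matching the figure reported for run #4282.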
SysAdmin	33e1501611	fix(linux/build): scrub apt lists + apt/dpkg logs from chroot (M1.1 iter33) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 35m47s Details Run #4280 cleared every previously-seen non-determinism (post-process ran end-to-end, all five iter32-fixed flag paths worked). The next diffoscope-flagged set is the runtime state that every Debian build captures and reproducible-rebuilders strip: /var/lib/apt/lists/127.0.0.1:9977_debian-fasttrack_…/InRelease The InRelease file from the FastTrack repo carries `Date:` and a fresh PGP signature with a 30-minute drift between Build A's fetch (22:00 UTC) and Build B's fetch (22:30 UTC). FastTrack re-signs roughly daily, so apt pickup lands on different signed files when the two builds bracket a re-sign. snapshot.debian.org doesn't cover FastTrack so we can't pin upstream — strip the file instead. apt-get update regenerates it on first boot. /var/lib/apt/lists/_home_user_derivative-binary_aptrepo_local_…/Release The locally-built kicksecure apt repo's Release file. reprepro stamps this with wall-clock time when it generates the repo. SOURCE_DATE_EPOCH is honoured for the underlying package metadata but reprepro writes Release with the current time regardless. /var/log/apt/history.log /var/log/apt/term.log /var/log/dpkg.log Wall-clock-stamped logs from package installation. Every apt/dpkg invocation prepends a timestamp. Cleanup added: * /var/log/apt/.log /var/log/{dpkg,alternatives}.log * /var/lib/apt/lists/{everything except lock and partial/} The live system regenerates all of these on first use. Standard reproducible-Debian rebuilder behaviour (Tails, Whonix-public-iso, debian-cd all do the equivalent). If the diffoscope output for run #4280 is honest about the full delta — and grep ├── shows exactly five entries — this should be the last divergence. 
Crossing fingers for run #4281. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 23:50:14 +01:00
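The apt-lists part of the cleanup in 33e1501611 ("everything except lock and partial/") can be sketched in Python as below. The real cleanup lives in the shell build scripts; the function name and the returned list of removed names are illustrative assumptions.

```python
import shutil
from pathlib import Path

KEEP = {"lock", "partial"}  # entries the commit says to leave in place


def scrub_apt_lists(chroot: Path) -> list:
    """Remove everything under var/lib/apt/lists except KEEP; return removed names."""
    lists_dir = chroot / "var" / "lib" / "apt" / "lists"
    removed = []
    if not lists_dir.is_dir():
        return removed
    # Sort so the removal order (and any log output) is deterministic.
    for entry in sorted(lists_dir.iterdir()):
        if entry.name in KEEP:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return removed
```

Keeping `lock` and `partial/` leaves the directory in the shape apt expects, so the first `apt-get update` on the live system can regenerate the lists.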
SysAdmin	5e5026088d	fix(linux/build): terminate xorriso -alter_date_r path list with -- (M1.1 iter32) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 37m14s Details Run #4279 hit: xorriso : FAILURE : Cannot find path '/-volume_date' in loaded ISO image xorriso : aborting : -abort_on 'FAILURE' encountered 'FAILURE' `-alter_date_r type timestring iso_rr_path [***]` takes a variable-length path list. xorriso terminates that list either at the end of the command line or at a literal `--`. Without the terminator, the next intended option (`-volume_date`) is consumed as another path to set mtime on, blows up because there's no node called `/-volume_date`, and FAILURE-severity propagates to a hard exit. Add `--` after the `/` argument to close the path list. -volume_date c/m then take effect as expected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 23:10:02 +01:00
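The fix in 5e5026088d comes down to where `--` sits in the argument list. A sketch of assembling those arguments in Python (the build itself is shell; the function name is hypothetical, and the option set is the one quoted for iter31 plus the terminator):

```python
def xorriso_date_args(source_date_epoch: int) -> list:
    """Date-fixup arguments for xorriso, with the path list explicitly closed."""
    stamp = f"={source_date_epoch}"  # "=N" is the literal-epoch form used in this log
    return [
        "-alter_date_r", "m", stamp, "/",
        "--",  # terminates the variable-length path list of -alter_date_r
        "-volume_date", "c", stamp,
        "-volume_date", "m", stamp,
    ]
```

Without the `--` element, `-volume_date` would be read as another path for `-alter_date_r`, which is the failure run #4279 reported.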
SysAdmin	d354040bd6	fix(linux/build): scrub apt/ldconfig caches + force xorriso mtimes (M1.1 iter31) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 17m44s Details Run #4278 with iter30's chroot scrub still produced different ISOs. The diagnostic was clean and pointed at a tight set of remaining divergences: * Inside the squashfs, three files differed: /var/cache/apt/pkgcache.bin /var/cache/apt/srcpkgcache.bin /var/cache/ldconfig/aux-cache — all post-install binary caches with internal pointers/timestamps that vary across runs. Standard reproducible-Debian practice is to drop them; `apt` regenerates pkgcache on first `apt-get update` (and implicitly when anything else needs it), and ldconfig regenerates aux-cache on its next run. * In the outer ISO TOC: /boot.catalog mtime May 7 21:27 vs May 7 21:44 /live/filesystem.squashfs May 7 21:27 vs May 7 21:44 — xorriso's `-update` and the boot-catalog rewrite were stamping files with wall-clock time, not SOURCE_DATE_EPOCH. Two additions to post_process_for_reproducibility: 1. Three more entries in the chroot rm list (apt's two pkgcaches and ldconfig aux-cache). 2. xorriso post-update fixups: -alter_date_r m "=${SOURCE_DATE_EPOCH}" / -volume_date c "=${SOURCE_DATE_EPOCH}" -volume_date m "=${SOURCE_DATE_EPOCH}" set every file's mtime in the ISO and both volume-descriptor dates to the pinned epoch. (`=N` is xorriso's syntax for a literal decimal epoch.) If diffoscope flagged everything in run #4278 honestly (its full output was 3 file diffs in the squashfs + the squashfs metadata size delta, then nothing — TOC was reduced to just the two mtime lines), this should clear M1.1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 22:50:28 +01:00
SysAdmin	84179b3642	fix(linux/build): xorriso -return_with SORRY 0 to tolerate MBR size warning (M1.1 iter30) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 37m27s Details iter29 wired up the chroot scrub + squashfs rebuild + ISO patch. Run #4277 confirmed every actual operation succeeded: Updating '/tmp/silvermetal-rebuilt-MFqm7S.squashfs' to '/live/filesystem.squashfs' xorriso : UPDATE : Added/overwrote '/live/filesystem.squashfs' (899m) Differences detected and updated. (runtime 0.5 s) xorriso : NOTE : Keeping boot image unchanged ISO image produced: 506049 sectors Writing to '...silvermetal-clean.iso' completed successfully. …then xorriso re-assessed the freshly-written ISO and raised: libburn : SORRY : Read start address 525977s larger than number of readable blocks 506240 libisofs: NOTE : Found Protective MBR with size range larger than the medium capacity xorriso : NOTE : Tolerated problem event of severity 'SORRY' xorriso : NOTE : -return_with SORRY 32 triggered by problem severity SORRY That's the protective MBR header recording the original ISO size (525977 sectors) but our replaced squashfs is smaller, so the new ISO totals 506240 sectors. The protective MBR is purely a compatibility shim for tools that don't understand GPT — bootloaders consult the GPT and El Torito tables, both of which are self-consistent in the new ISO. The diagnostic is genuinely benign. xorriso's default `-return_with SORRY 32` made it exit 32, which `set -e` in build-inner.sh propagated up, killing the build. Add `-return_with SORRY 0` to the post-process xorriso invocation: keep the warning visible in the log but accept a SORRY as exit-zero given the operation reported `completed successfully` for the write itself. Note: this scoping is only on the post-process xorriso. 
Anywhere else upstream in derivative-maker can still use xorriso's default strictness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 22:09:55 +01:00
SysAdmin	10e099fcf9	fix(linux/build): scrub nvme/hostid + dkms logs, rebuild squashfs (M1.1 iter29) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 17m56s Details Run #4276's diffoscope (now actually working — see iter28) pinned the M1.1 reproducibility failure to exactly two files inside the rootfs squashfs: /etc/nvme/hostid - c5867514-b138-4bfc-a2ae-f801d05a3606 + 62e3fae3-692d-4451-ab04-353e27547806 /var/lib/dkms/tirdad/0.1/<kver>/x86_64/log/make.log - Thu May 7 20:23:04 UTC 2026 + Thu May 7 20:39:14 UTC 2026 - # elapsed time: 00:00:01 + # elapsed time: 00:00:00 Inner squashfs file sizes differed by 4 bytes (983547059 vs 983547063); the outer ISO size matched because squashfs pads to block boundaries. Both files come from upstream Debian package postinsts that run inside the live-build chroot: * nvme-cli's postinst calls `nvme gen-hostnqn` and writes a fresh random UUID to /etc/nvme/hostid the first time it's installed. Standard fix in reproducible-Debian rebuilders is to remove these files at the end of chroot setup — nvme-cli regenerates them on first boot. * DKMS captures wall-clock build times in its module make.log. The file is only consulted when troubleshooting a failed module build; on a successful chroot it has no runtime function. Drop /var/lib/dkms/<…>/log/ entirely. Both fixes have to land inside the chroot before mksquashfs seals it. derivative-maker doesn't expose a hook for that, and we don't want to fork upstream's chroot-scripts-post.d, so build-inner.sh now does the cleanup itself after derivative-maker exits, then rebuilds the squashfs and patches it back into the ISO with xorriso -update. 
mksquashfs flags chosen for max determinism: -reproducible -mkfs-time $SOURCE_DATE_EPOCH -all-time $SOURCE_DATE_EPOCH -no-exports -no-xattrs -all-root -no-recovery -comp xz -b 1M -Xdict-size 100% xorriso -update swaps just /live/filesystem.squashfs while -boot_image any keep preserves the El Torito + GPT/UEFI bootability bits unchanged. Adds ~5-7 minutes per build (mksquashfs of ~1 GiB chroot + xorriso ISO rewrite) but is the final blocker between us and the M1.1 reproducibility gate passing. Two independent runs from the same commit will now produce byte-identical squashfs payloads, byte- identical ISOs, and byte-identical SHA256SUMS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 21:49:25 +01:00
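The chroot scrub from 10e099fcf9 could look roughly like the sketch below. The function name `scrub_chroot` and its single root-directory parameter are assumptions for illustration; only the two paths named in the commit are touched, and the squashfs rebuild is left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove the two sources of non-determinism named in the
# commit from an unpacked chroot rooted at $1.
scrub_chroot() {
  local root=$1
  # Random UUID written by nvme-cli's postinst; per the commit it is
  # regenerated on first boot.
  rm -f -- "$root/etc/nvme/hostid"
  # DKMS log directories record wall-clock build times; drop them entirely.
  if [ -d "$root/var/lib/dkms" ]; then
    find "$root/var/lib/dkms" -type d -name log -prune -exec rm -rf -- {} +
  fi
}
```

`-prune` stops find from descending into a directory it is about to delete, which avoids spurious "No such file or directory" noise.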
SysAdmin	c8eac79afc	fix(linux/build): xorriso -extract needs -osirrox on (M1.1 iter28) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 36m19s Details Run #4275's TOC parser worked perfectly — found /live/filesystem.squashfs as the largest file (983,547,904 bytes, right where it should be) — but extraction still bailed: diagnose: largest file in ... is /live/filesystem.squashfs; extracting diagnose: could not extract rootfs from A xorriso's -extract action requires -osirrox to be turned on at the start of the command line; without it, -extract is silently rejected ("OSIRROX is not enabled by default. -osirrox on permits it."). Our script swallowed stderr and the only signal was the empty output file. Two changes: * Add `-osirrox on` to every -extract invocation. * On extraction failure, surface the captured stderr (last 30 lines) into the workflow log instead of dropping it. Saves us one round-trip if the next thing breaks. ISO layout from the iter27 dump for the record: /live/filesystem.squashfs 983547904 bytes ← rootfs /live/initrd.img-... 62929840 bytes /live/vmlinuz-... 12113856 bytes /boot/grub/efi.img 3342336 bytes /EFI/boot/{boot,grub}x64.efi + grub modules under /boot/grub/{i386-pc,x86_64-efi}/ The named-path probe for /live/filesystem.squashfs was already first in the list — it'll succeed cleanly now and we skip the largest-file fallback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 21:07:39 +01:00
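The second change in c8eac79afc (surface the last 30 lines of captured stderr on failure) can be sketched as a small wrapper. The name `run_surfacing_stderr` is hypothetical, and this is one way to do it under the assumption that stderr is captured to a temporary file; the real script may capture it differently.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command with stderr captured; on failure, print
# the last 30 captured lines instead of dropping them. Returns the command's
# own exit status.
run_surfacing_stderr() {
  local errlog rc=0
  errlog=$(mktemp)
  "$@" 2>"$errlog" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "command failed (rc=$rc): $*" >&2
    tail -n 30 -- "$errlog" >&2
  fi
  rm -f -- "$errlog"
  return "$rc"
}

# Example call shape (not executed here); -osirrox on precedes -extract:
#   run_surfacing_stderr xorriso -osirrox on -indev A.iso \
#     -extract /live/filesystem.squashfs ./a.squashfs
```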
SysAdmin	a2bee4b5dc	fix(linux/build): better squashfs extraction + dump TOC sample (M1.1 iter27) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m47s Details Run #4274 made progress: identical ISO sizes, identical TOC, identical first 8 KiB; divergence is fully in file payload bytes. But the diagnostic stalled because extract_squashfs() couldn't find the rootfs: diagnose: could not extract squashfs from A diagnose: could not extract squashfs from B Two reasons to address: 1. The named-path probes only checked /live/filesystem.squashfs, /casper/filesystem.squashfs and /filesystem.squashfs. Some live-build configs use /install/... or no canonical name at all. 2. The fallback that used `xorriso -find / -name '*.squashfs'` then piped to `xorriso -extract` didn't work because xorriso's -find output quotes paths, and -extract chokes on quotes. This iteration: * Adds /install/filesystem.squashfs and /boot/filesystem.squashfs to the named-path probes. * Replaces the -find/-name/tail fallback with a generic "biggest file in the ISO" picker. In a live-build ISO the rootfs payload is reliably the largest file regardless of what it's called. Parses lsdl output (with awk, handling spaces in paths and stripping single-quote framing). * On extraction failure, dumps the top 20 files by size to stderr so the workflow log shows what's actually in the ISO, which answers "what should the named-path probe match" for the next iter. * Always echoes the first 30 lines of toc-a.txt (and the line count) so we can sanity-check the ISO layout in every run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 20:32:01 +01:00
SysAdmin	c9e67d8b47	fix(linux/build): staged divergence diagnostic, avoid OOM (M1.1 iter26) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m36s Details Run #4273 confirmed two things: 1. The reproducibility gate works end-to-end. Both builds produced ISOs (1077194752 vs 1077202944 bytes, an 8 KiB delta, exactly one squashfs block worth of compressed-payload drift) and the compare step caught it. 2. diffoscope, run on the whole 1 GB ISO inside the silvermetal-builder container, gets OOM-killed before producing any output: diagnose-divergence.sh: line 44: 13 Killed diffoscope --max-report-size 100000000 --html ... --text ... A.iso B.iso The host has 19 GiB free, but diffoscope's full recursion through ISO -> squashfs -> ~thousands of inner files needs more memory than that for a 1 GB image. Setting --max-report-size only caps the output, not the working-set. Rewrite diagnose-divergence.sh to do staged, cheap-to-expensive analysis: 1. sha256 + sizes (always) 2. xorriso TOC of both ISOs (every node: mode/size/mtime/path) -> diff 3. Pull just live/filesystem.squashfs out of each ISO, sha256 it + `unsquashfs -ll` it, diff the listings; this is where the per-file-size signal lives. 4. Targeted diffoscope on the squashfs payload only, with --max-container-depth 2 + --max-text-report-size 5MB + --no-html + a 10-minute timeout. Bounded enough to finish without the OOM. Drops `set -e`: every step `|| true`s itself so we get partial output even when one stage fails. 
Workflow tail-into-log step now prints the new staged outputs: * toc-diff.txt — what changed at the ISO level * sqfs-ls-diff.txt — which inner files have different sizes/mtimes * sqfs-diff.txt — diffoscope on the squashfs only * squashfs-sha256.txt * iso-header-cmp.txt — first-8KB cmp -l for header-level drift * sizes.txt / sha256.txt / checklist.md as before Should land us a focused list of "these N files inside the squashfs have different bytes" — that's what we need to find what's leaking non-determinism into the build. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 19:54:35 +01:00
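The "every step tolerates its own failure" pattern from c9e67d8b47 can be sketched as a tiny stage runner. The name `run_stage` and the one-file-per-stage layout are assumptions made for illustration; diagnose-divergence.sh itself is not shown in this log.

```shell
#!/usr/bin/env bash
# Hypothetical stage runner: run one diagnostic stage, write its combined
# output to OUT_DIR/NAME.txt, and always return 0 so later stages still run.
run_stage() {
  local name=$1 out_dir=$2
  shift 2
  if "$@" >"$out_dir/$name.txt" 2>&1; then
    echo "stage $name: ok"
  else
    echo "stage $name: failed (rc=$?), continuing"
  fi
  return 0
}

# Example ordering, cheapest first (not executed here):
#   run_stage sha256 "$out" sha256sum A.iso B.iso
#   run_stage iso-header-cmp "$out" cmp -l -n 8192 A.iso B.iso
```

Because the runner never returns non-zero, a failing or killed stage leaves its partial output on disk and the remaining stages still execute.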
SysAdmin	3f51b2fd7f	feat(linux/build): run diffoscope inside silvermetal-builder + tail diff to log (M1.1 iter25) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 35m22s Details Run #4272 hit the M1.1 reproducibility gate as designed — both builds completed, ISOs differed (A=ff2e7444…, B=9ec7f3da…), diagnose-divergence fired. Two things stopped that diagnostic from being useful: 1. diffoscope wasn't available. diagnose-divergence.sh runs in the catthehacker job container, which has cmp but no diffoscope. The silvermetal-builder image we built two minutes earlier does have diffoscope-minimal (Dockerfile.builder line 109). Run the diagnostic inside that image: docker run --volumes-from $self_cid + the digest the builder-image job passed in via BUILDER_IMAGE. Mounts the same /workspace path so REPO_ROOT-relative resolution in diagnose-divergence.sh works unchanged. 2. The artifact was unreachable. actions/upload-artifact@v3 against Gitea 1.25.2 reports "successfully uploaded" but the /api/v1/repos/.../actions/runs/{id}/artifacts list comes back empty, and every download path probed returns 404. Known v3 incompatibility — v3 uses the legacy GitHub Services API endpoint that Gitea doesn't expose for retrieval. Workaround: tail the divergence content into the workflow log directly, so it shows up in `gitea actions logs` regardless of upload-artifact's behaviour. Specifically: sizes.txt, sha256.txt, checklist.md, head -n 400 of diff.txt (or cmp.txt as fallback). That's enough to see what's diverging without needing the artifact. Upload-artifact step kept in place for whenever Gitea's API gets sorted (fix-once-then-forget). 
The self-discovery loop (docker ps + inspect filtering by /workspace/SilverLABS/SilverMetal mount destination) is the same one build.sh uses; concurrency: 1 in this workflow guarantees a single match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 19:14:44 +01:00
SysAdmin	5bb24235bd	fix(linux/build): tolerate find perm-denied in chroot scan (M1.1 iter24) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 2s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m43s Details 🎉 Run #4271's Build A actually produced the ISO. derivative-maker ran clean for 15:24: INFO: Script ./derivative-maker completed. Exit Code: 0. Errors Detected: 0. Execution Time: 00:15:24 '/home/user/derivative-binary/.../Kicksecure-CLI-18.1.7.4-developers-only.Intel_AMD64.iso' -> '/workspace/SilverLABS/SilverMetal/build-a/Kicksecure-CLI-18.1.7.4-developers-only.Intel_AMD64.iso' …but build-inner.sh then died on its own post-build collection step: find: '.../live-build/chroot/usr/src': Permission denied find: '.../live-build/chroot/etc/sudoers.d': Permission denied find: '.../live-build/chroot/boot': Permission denied … The chroot's standard hardened subdirs (/usr/src, /etc/sudoers.d, /etc/cron.*, /boot, /root, /run/{sudo,lvm,cryptsetup,openvpn-{client,server}}, cache/bootstrap/root) are 0700 root-owned because the live-build chroot was assembled under sudo. As `user` (uid 1000) we can't descend them. find emits Permission denied on each, exits with status 1, and `set -euo pipefail` in build-inner.sh propagates that through `xargs cp` and aborts, even though the ISO copy itself had already succeeded a few lines earlier in the same xargs stream. Fix: redirect find's stderr to /dev/null and tolerate non-zero exit on both the *.iso and *.manifest scans. build.sh already verifies an ISO landed in BUILD_DIR (exit 4 with "no ISO produced" if not), so a real miss is still caught; we just stop killing the script for the benign unreadable-chroot-subdirs case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 18:32:00 +01:00
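A sketch of the tolerant collection step from 5bb24235bd, with the "no ISO" check folded into the same hypothetical function `collect_isos` for brevity (in the commit that check lives in build.sh). It uses `find -exec cp` rather than the `xargs cp` pipeline the commit mentions, which is a substitution made to keep the example short.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy every *.iso under SRC into DEST, ignoring
# "Permission denied" noise from unreadable subdirectories.
collect_isos() {
  local src=$1 dest=$2
  mkdir -p -- "$dest"
  # Unreadable 0700 subdirs make find exit non-zero; discard the messages
  # and the status so a `set -e` caller is not killed by them.
  find "$src" -type f -name '*.iso' -exec cp -- {} "$dest"/ \; 2>/dev/null || true
  # A genuine miss is still caught here.
  if ! ls "$dest"/*.iso >/dev/null 2>&1; then
    echo "no ISO produced" >&2
    return 4
  fi
}
```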
SysAdmin	b0f1ab30f4	fix(linux/build): symlink /home/user/derivative-maker to checkout (M1.1 iter23) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 17m24s Details Run #4270's Build A made it 2:40 deep — past sanity-tests, prepare- build-machine, local-deps, into 2100_create-debian-packages — then died on: + /workspace/.../genmkfile/usr/bin/genmkfile reprepro-remove running: dm-reprepro-wrapper remove local age-api + /usr/bin/dm-reprepro-wrapper: line 28: /home/user/derivative-maker/help-steps/pre: No such file or directory Earlier `dm-reprepro-wrapper includedsc/includedeb` calls succeeded because 2100_create-debian-packages invokes them by absolute path (`$source_code_folder_dist/packages/.../developer-meta-files/usr/bin/ dm-reprepro-wrapper`) — the in-repo copy resolves help-steps/pre relative to its own location. `genmkfile reprepro-remove` calls `dm-reprepro-wrapper` via PATH instead, so the system copy at /usr/bin/dm-reprepro-wrapper wins. That copy was installed by 1500_local-deps `apt install`-ing the in-repo developer-meta-files.deb into the silvermetal-builder image at runtime. The .deb's intended layout assumes the matching derivative-maker checkout lives at /home/user/derivative-maker — the upstream-blessed path. Ours is at /workspace/SilverLABS/SilverMetal/linux/build/ derivative-maker, so the relative source() at line 28 walks off into nowhere. Bridge the gap with a symlink at the start of build-inner.sh: ln -sfn "${REPO_ROOT}/linux/build/derivative-maker" \ /home/user/derivative-maker That keeps our self-referential CI bind-mount topology (we still cd into REPO_ROOT/.../derivative-maker, derivative-maker still computes paths relative to itself), but also makes the system copy of dm-reprepro-wrapper find help-steps/pre and friends. 
Both reprepro wrappers (in-repo and system-installed) now resolve to the same files via the symlink, so the silvermetal-reprepro-wrap.sh PATH precedence shadow at /usr/local/bin/reprepro keeps applying to both code paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 18:11:58 +01:00
SysAdmin	5918305fd7	fix(linux/build): find self via docker inspect, cgroupns hides cgroup path (M1.1 iter22) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 2s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 4m37s Details iter21's /proc/self/cgroup approach hit: build.sh: cgroup contents: 0::/ Empty path — act_runner runs job containers with cgroupns enabled, so the in-container view of cgroup paths is rooted at the namespace, with no trace of the host-side container ID. Same blocker as `hostname`. The host docker daemon does know who we are, and we have its socket. We're the only running container with /workspace/SilverLABS/SilverMetal as a mount destination (concurrency: 1 in the workflow), so iterate docker ps and match by mount destination. Found CID becomes the --volumes-from argument; if no match, dump docker ps to the log and fail loud. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 18:04:41 +01:00
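The self-discovery loop from 5918305fd7 might be sketched as below. The function name `find_self_cid` is hypothetical; the sketch assumes the docker CLI is on PATH and that exactly one running container mounts the given destination, as the commit's `concurrency: 1` note implies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ID of the first running container that has
# WANT as a mount destination; on no match, dump `docker ps` and fail.
find_self_cid() {
  local want=$1 cid
  for cid in $(docker ps -q); do
    if docker inspect \
         --format '{{range .Mounts}}{{println .Destination}}{{end}}' "$cid" \
       | grep -Fxq -- "$want"; then
      printf '%s\n' "$cid"
      return 0
    fi
  done
  docker ps >&2   # fail loud: show what was running
  return 1
}

# Example call shape (not executed here):
#   self_cid=$(find_self_cid /workspace/SilverLABS/SilverMetal)
#   docker run --volumes-from "$self_cid" ...
```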
SysAdmin	4a837e07ed	fix(linux/build): discover job container ID from cgroup, not hostname (M1.1 iter21) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 2s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m17s Details Run #4268's build-and-verify died <1s into Build A: docker: Error response from daemon: No such container: docker Cause: build.sh's CI path uses `--volumes-from "$(hostname)"` to inherit the parent job container's /workspace mount, but in the new runner config (network: host applied via the now-actually-loaded config.yaml) `hostname` returns the literal string "docker" inside catthehacker/ubuntu:act-latest — the image bakes that into /etc/hostname and act_runner doesn't override it. So `--volumes-from docker` looks for a container literally named "docker", finds nothing, exits. This worked in earlier runs (#4260) only because config.yaml wasn't being loaded (see iter18 commit), so the runner ran on its built-in defaults — which kept the container's hostname as the auto-generated container ID. Fixing config.yaml exposed this latent bug. Right way to learn your own container ID inside a Linux container is /proc/self/cgroup, which contains the 64-char hex ID on every cgroup driver: cgroup v1: 12:devices:/docker/<64-hex> cgroup v2: 0::/system.slice/docker-<64-hex>.scope awk extracts the first 64-hex run; that becomes the --volumes-from argument. If extraction fails (would only happen on a non-docker runtime), fail loud rather than silent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:59:48 +01:00
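The extraction step from 4a837e07ed, sketched with `grep -o` instead of the awk the commit describes (a substitution, not the repo's code). The function name `cid_from_cgroup` is hypothetical and it reads cgroup lines on stdin so it can be exercised without a container.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read /proc/self/cgroup-style lines on stdin and print
# the first run of 64 lowercase hex characters; fail loudly if none is found.
cid_from_cgroup() {
  local cid
  cid=$(grep -oE '[0-9a-f]{64}' | head -n 1)
  if [ -z "$cid" ]; then
    echo "no container ID found in cgroup data" >&2
    return 1
  fi
  printf '%s\n' "$cid"
}

# Example call shape (not executed here):
#   self_cid=$(cid_from_cgroup </proc/self/cgroup)
```

Feeding it the bare `0::/` line returns 1, which is the failure iter22 later ran into when cgroupns hid the host-side path.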
SysAdmin	ec942b7698	fix(linux/build): bind only config.json, not whole /root/.docker (M1.1 iter20) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 1m24s Details Run #4267 finally got the bind mount through (Merged Binds includes /root/.docker:/root/.docker:ro), but docker build then died: failed to update builder last activity time: open /root/.docker/buildx/activity/.tmp-...: read-only file system The catthehacker job container uses buildx, which writes activity tracking to /root/.docker/buildx/. Mounting the whole host /root/.docker read-only made that path read-only too. Right scope is the file, not the dir: -v /root/.docker/config.json:/root/.docker/config.json:ro That gives the cli the registry auth it needs while leaving the rest of /root/.docker on the container's writable overlay so buildx can populate its own activity dir without colliding with the host's. Also matches the principle of mounting the minimum the secret requires. valid_volumes entry updated to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:52:35 +01:00
SysAdmin	ced77e305f	fix(linux/build): valid_volumes takes source paths, not bind specs (M1.1 iter19) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 1s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped Details Run #4266 dropped the /root/.docker bind silently: Custom container.HostConfig from options ==> &{Binds:[/root/.docker:/root/.docker:ro]…} [/root/.docker] is not a valid volume, will be ignored Merged container.HostConfig ==> &{Binds:[/var/run/docker.sock:/var/run/docker.sock /root/.docker:/root/.docker:ro]…} no basic auth credentials Wait, the merged binds list does include /root/.docker — but the line between them, "[/root/.docker] is not a valid volume, will be ignored", fires during the merge step's allowlist check, and the bind ends up absent in the actual container start (the `Binds:` list shown is pre-filter). Net result: the registry creds are not in the job container, push fails. Root cause: container.valid_volumes is an allowlist of source-path globs, not full bind specs. The entry `/root/.docker:/root/.docker:ro` was being treated as a literal pattern and never matched the bind's source `/root/.docker`. Same for the other two entries — they were just no-ops because the auto-mount / explicit options were the things actually creating the binds. Fix: rewrite valid_volumes entries as bare source paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:51:17 +01:00
SysAdmin	c205139e86	fix(linux/build): drop duplicate docker.sock mount from runner options (M1.1 iter18) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 6s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped Details Run #4265 (the first run after the config.yaml wiring fix actually took effect) failed with: failed to create container: 'Error response from daemon: Duplicate mount point: /var/run/docker.sock' act_runner v0.4.1 already auto-mounts /var/run/docker.sock into every job container; listing it a second time in container.options is a hard error on container create. Same likely applies to /cache, which the workflow doesn't actually use anyway (the inner build.sh bind- mounts via REPO_ROOT/BUILD_DIR, not /cache). Trim container.options down to only the bind act_runner doesn't provide: -v /root/.docker:/root/.docker:ro for registry credentials. valid_volumes stays as the broader allowlist for workflow-requested mounts but doesn't force the mounts itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:49:51 +01:00
SysAdmin	f66585e0b1	fix(linux/build): wire config.yaml into act_runner via CONFIG_FILE env Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 0s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped Details The runner config.yaml on disk was decorative — never read. The upstream gitea/act_runner image's run.sh only adds `--config <file>` when the CONFIG_FILE env var is set, and our compose set neither CONFIG_FILE nor mounted config.yaml into the container. So `timeout: 240m`, `container.options`, `valid_volumes` etc. were silently ignored and the runner ran on built-in defaults. This is also why iter17's `-v /root/.docker:/root/.docker:ro` addition to config.yaml had no effect on run #4264: the runner never read it. The push still failed with "no basic auth credentials". Fix: bind-mount ./config.yaml into the runner container at /etc/act_runner/config.yaml and set CONFIG_FILE to that path. After a `docker compose up -d --force-recreate`, the runner picks up everything in config.yaml — including the per-job-container /root/.docker bind. Per-job timeouts in build-iso-linux.yaml are set via `timeout-minutes: 240` at the job level, which overrides the daemon default anyway, so nothing was visibly broken before. But silently-ignored config is a trap for the next thing we add to config.yaml, so wire it correctly now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:48:07 +01:00
SysAdmin	e7a5fdd629	fix(linux/build): mount /root/.docker into job containers (M1.1 iter17) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 2s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped Details Run #4263 cleared the new builder-image job's `docker build` step cleanly but `docker push` died with: no basic auth credentials The runner host (10.0.0.51) is logged in to docker-registry.silverlabs.uk — that's how iter1-15 builder images got pushed by hand. But the silvermetal-builder act_runner only mounts /root/.docker into its own container, not into the job containers it spawns. catthehacker/ubuntu: act-latest runs as root and reads /root/.docker/config.json for auth; without that file mounted in, docker-cli has no creds to send via the DooD socket and the registry returns 401 Basic-realm. Fix: extend the act_runner `container.options` to mount /root/.docker:/root/.docker:ro into each job container, and add the same entry to valid_volumes. Update the runner README so first-time deploys know the host-side `docker login` is what makes the in-CI push work. This requires a one-time runner redeploy on 10.0.0.51: cd /opt/silvermetal-builder-runner git pull docker compose up -d --build After that, the builder-image job pushes cleanly and feeds its digest to build-and-verify as designed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:33:35 +01:00
SysAdmin	e260fe1c81	ci(linux/build): self-host the builder image build + iter16 reprepro wrap (M1.1) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Failing after 2s Details Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Has been skipped Details Two coupled changes that unblock the M1.1 iter loop. Both belong in CI; iter1-15 was wrong to require human-in-the-loop steps to make progress. 1. CI now builds Dockerfile.builder. `.gitea/workflows/build-iso-linux.yaml` grows a `builder-image` job that runs ahead of `build-and-verify`. It rebuilds the silvermetal- builder image from `linux/build/docker/Dockerfile.builder`, pushes it to `docker-registry.silverlabs.uk/silvermetal-builder:m1.1-<sha>` (and `:latest`), reads the resulting digest off `docker inspect`, and feeds it forward as a job output. `build-and-verify` consumes that digest as the `BUILDER_IMAGE` env override that `build.sh` already honours (and validates is digest-form on line ~37). That kills the old workflow where every Dockerfile.builder change required a human to `docker build` + `docker push` on 10.0.0.51 by hand and then bump the digest in `build.sh` in lockstep. The crash that triggered this (exit 126 mid-iter16 build run) was a symptom of that off-CI step still existing. Both jobs run on the existing `silvermetal-builder` runner; the host docker daemon is shared via DooD and is already authenticated to `docker-registry.silverlabs.uk` (linux/build/runner/docker-compose.yml mounts `/root/.docker:/root/.docker:ro`), so no extra login step. The hardcoded `BUILDER_IMAGE` digest in `build.sh` stays as the local-developer / offline-rebuild fallback. Comments updated in `build.sh`, `Dockerfile.builder`, and `linux/build/README.md` to match the new flow. 2. reprepro wrapper for the benign "No priority for X" case. 
Pinned derivative-maker's `2100_create-debian-packages` (with --target iso) re-imports source packages from snapshot.debian.org into a local apt repo via `reprepro --basedir … includedsc local <foo>.dsc`. The local repo's `conf/distributions` ships no `DscOverride` entries, so any source package whose `.dsc` lacks an explicit Priority field trips: No priority for 'X', skipping. There have been errors! …and reprepro exits 255. dm-reprepro-wrapper bubbles that up, 2100_create-debian-packages aborts. The current offender is `virtualbox_*.dsc` (key import is now fine, since debian-keyring landed in commit `4aa59ba`, but the priority field gap remains). VirtualBox is not in SilverMetal's `--target iso` set, so the sane behaviour is "log it, continue". New `linux/build/docker/silvermetal-reprepro-wrap.sh` shadows `/usr/bin/reprepro` at `/usr/local/bin/reprepro` (PATH precedence). It runs the real reprepro, captures merged stdout+stderr, and: - if rc != 0 AND every non-blank output line matches one of the known-benign patterns ("No priority for 'X', skipping." plus the trailing "There have been errors!"), emits the output, logs one line of explanation to stderr, and exits 0; - otherwise emits the output and propagates rc unchanged. Any other reprepro error path stays fatal; only the specific "No priority for X" pattern is neutralised. `dm-reprepro-wrapper` resolves `reprepro` via `$PATH` so it picks up the wrapper transparently. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:30:08 +01:00
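The wrapper logic described in e260fe1c81 can be sketched as a generic function. The name `run_tolerating_benign` is hypothetical and this is not the content of silvermetal-reprepro-wrap.sh; as an extra assumption, the sketch also requires at least one "No priority" line before treating a failure as benign, so an empty failing output stays fatal.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command, emit its merged output, and turn a
# non-zero exit into 0 only when every non-blank line is a known-benign one.
run_tolerating_benign() {
  local out rc=0 line seen=0
  out=$("$@" 2>&1) || rc=$?
  [ -z "$out" ] || printf '%s\n' "$out"
  [ "$rc" -ne 0 ] || return 0
  while IFS= read -r line; do
    [[ -n ${line//[[:space:]]/} ]] || continue   # skip blank lines
    case $line in
      "No priority for '"*"', skipping.") seen=1 ;;
      "There have been errors!") ;;
      *) return "$rc" ;;                          # anything else stays fatal
    esac
  done <<<"$out"
  [ "$seen" -eq 1 ] || return "$rc"
  echo "tolerated benign 'No priority' failure (rc=$rc)" >&2
  return 0
}
```

Matching whole lines keeps the allowance narrow: a single unexpected line anywhere in the output makes the original exit status propagate unchanged.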
SysAdmin	4aa59ba633	fix(linux/build): non-interactive mode + visible output + key import (M1.1) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 11m33s Details Run #4260 cleared every harness layer and ran for 18 minutes — past sanity-tests, prepare-build-machine, cowbuilder-setup, local-deps — into 2100_create-debian-packages, where it died on: Could not check validity of signature with '92978A6E195E4921825F7FF0F34F09744E9F5DD9' in '/home/user/derivative-binary/temp_packages_debian_sid/virtualbox_7.2.8-dfsg-1.dsc' as public key missing! …and then also hung the runner indefinitely because, on any error, derivative-maker's exception_handler_general detected a TTY (we passed `docker run -t`) and dropped into an interactive `read -p 'Answer? '` prompt that nothing was ever going to answer. The orphan docker run in turn orphaned the act_runner job container, blocking the runner until manual cleanup. Three coordinated fixes, validated end-to-end with docker-side smoke tests on 10.0.0.51: 1. Non-interactive mode without losing output visibility. The original architectural goal: keep derivative-maker out of interactive mode (`[ -t 0 ]` must be false) AND keep the build log visible to docker run / Gitea Actions (PTY needed somewhere). Resolution: - `docker run -t` is kept (required for /dev/console to be a real PTY back to docker), but no `-i`, so fd 0 stays /dev/null. - docker-entrypoint.service: `StandardInput=tty-force` → `StandardInput=null` so the service's fd 0 is /dev/null too. Verified inside the container: `[ -t 0 ]` returns false. - entrypoint.sh now wraps the user command with an explicit `> /dev/console 2>&1` redirect before writing it to /etc/docker-entrypoint-cmd. systemd's `StandardOutput=inherit` does NOT propagate PID-1's stdout to services in this PID-1- systemd-in-container topology — the service log was going nowhere visible. 
/dev/console under `docker run -t` IS the allocated PTY back to docker, so the redirect surfaces the log to the act_runner / Gitea Actions log. - entrypoint.sh's `[ ! -t 0 ] && exit 1` guard removed (it would now always trigger). 2. debian-keyring for reprepro source-package signature checks. 2100_create-debian-packages calls dm-reprepro-wrapper includedsc on every .dsc in temp_packages_debian_sid (including virtualbox_*.dsc, even for `--target iso`; see line 114 of that build step). reprepro verifies the dsc signature against the user's GPG keyring; without the maintainer keys it fails. Adds `debian-keyring` to Dockerfile.builder. build-inner.sh now imports debian-keyring.gpg / debian-maintainers.gpg / debian-nonupload.gpg into the user's keyring before running derivative-maker. 3. BUILDER_IMAGE digest re-pinned. Built natively on 10.0.0.51 (per memory: never on WSL/aarch64). New digest: sha256:2f680c96…f0db. Smoke-test results (against this exact image): ==> START ← user output reaches docker stdout (keyring present) ← debian-keyring imported successfully STDIN_NOT_TTY ← derivative-maker WILL stay non-interactive ==> END ← clean shutdown docker run exit: 42 ← exit code propagates correctly on failure Files: Dockerfile.builder, systemd-entrypoint/entrypoint.sh, systemd-entrypoint/docker-entrypoint.service, scripts/build.sh, scripts/build-inner.sh. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 14:05:49 +01:00
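The `[ -t 0 ]` condition that 4aa59ba633 relies on can be probed with a one-line check. The function name `stdin_state` and the `STDIN_IS_TTY` label are hypothetical; `STDIN_NOT_TTY` is the string shown in the smoke-test output above.

```shell
#!/usr/bin/env bash
# Hypothetical probe: report whether file descriptor 0 is a terminal.
stdin_state() {
  if [ -t 0 ]; then
    echo STDIN_IS_TTY
  else
    echo STDIN_NOT_TTY
  fi
}

# With fd 0 pointed at /dev/null the probe reports a non-terminal stdin:
#   stdin_state </dev/null
```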
SysAdmin	9c406598e2	fix(linux/build): pin user_name=user, mkdir derivative-binary (M1.1) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 18m13s Details Run #4259 (the systemd-in-container debut) cleared every prior failure class, ran for 15 minutes, then died inside 1100_sanity-tests' aptgetopt_conf_add at: tee: /home/root/derivative-binary/30_derivative-maker.conf: No such file or directory last_failed_bash_command: tee --append -- "$dist_aptgetopt_file" > /dev/null Two compounding bugs: 1. user_name resolves to "root" via $SUDO_USER derivative-maker/help-steps/variables (lines 80-93) computes user_name with these fallbacks, in order: [ -n "$user_name" ] || user_name="$SUDO_USER" [ -n "$user_name" ] || user_name="$(logname 2>/dev/null)" if [ -z "$user_name" ] && [ "$(id -u)" != "0" ]; then user_name="$(whoami)" [ -n "$user_name" ] || user_name="$USER" fi build.sh enters the container as root (systemd's docker-entrypoint.service runs as root), then sudoes to user via `sudo --preserve-env -u user --`. sudo always sets SUDO_USER to the calling user (= root), regardless of --preserve-env. So variables.sh hits the first fallback and computes user_name="root", then HOMEVAR=/home/root, then binary_build_folder_dist= /home/root/derivative-binary, a directory that does not exist because root's home is /root (not /home/root). Fix: build-inner.sh now exports user_name=user before sourcing the config, satisfying the first-priority check in variables.sh and short-circuiting the SUDO_USER fallback. The comment in the script notes the failure mode for the next reader. 2. Missing mkdir of derivative-binary Upstream's derivative-maker-docker-start does: mkdir --parents -- "${HOME}/derivative-binary" before invoking derivative-maker. Our build-inner.sh skipped that because previous iterations didn't reach the point where it mattered. Now that we do, we replicate it. 3. 
Output collection path correction derivative-maker writes its ISO/manifest output into ${HOME}/derivative-binary (per variables.sh:109) — not into the source tree under linux/build/derivative-maker. The previous `find . -maxdepth 6 -type f -name "*.iso"` would have missed everything once we got that far. Updated to `find "${HOME}/derivative-binary" ...`. No image rebuild needed — this is a pure script-and-env change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 12:47:47 +01:00
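The fallback chain quoted in 9c406598e2 can be replayed in isolation to show why exporting `user_name=user` short-circuits the `SUDO_USER` path. The wrapper function `resolve_user_name` is hypothetical; the body mirrors the lines quoted in the commit, with `${var-}` defaults added so it also runs under `set -u`.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the fallback chain quoted in the commit.
resolve_user_name() {
  local user_name=${user_name-}
  [ -n "$user_name" ] || user_name="${SUDO_USER-}"
  [ -n "$user_name" ] || user_name="$(logname 2>/dev/null)"
  if [ -z "$user_name" ] && [ "$(id -u)" != "0" ]; then
    user_name="$(whoami)"
    [ -n "$user_name" ] || user_name="${USER-}"
  fi
  printf '%s\n' "$user_name"
}

# Under sudo from root, SUDO_USER=root wins unless user_name is preset:
#   user_name= SUDO_USER=root resolve_user_name      # takes the SUDO_USER fallback
#   user_name=user SUDO_USER=root resolve_user_name  # preset value short-circuits it
```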
SysAdmin	38ac4f8a96	fix(linux/build): systemd-in-container build host (M1.1) Some checks failed Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 15m34s Details Run #4258 cleared the systemctl shim only to die two seconds later on the next expectation derivative-maker has of a real systemd host: its sources.list points at http://127.0.0.1:9977/debian (the approx package-cache socket-activated by systemd) and apt-get update could not reach the daemon because nothing was actually started by the no-op shim: Err:1 http://127.0.0.1:9977/debian trixie InRelease Could not connect to 127.0.0.1:9977 (127.0.0.1). - connect (111: Connection refused) Whack-a-mole'ing each service derivative-maker tries to start (approx today, then journald, then systemd-logind, then who-knows-what tomorrow) is going to keep failing for a while — derivative-maker is fundamentally designed for a real systemd-managed Debian host. The container pattern upstream itself ships (linux/build/derivative-maker/docker/) runs systemd as PID 1 inside the container; this commit adopts that approach. Architecture: - PID 1 in the build container is now systemd. Upstream's vendored entrypoint.sh records the user-supplied command into /etc/docker-entrypoint-cmd, captures env into /etc/docker-entrypoint-env, masks irrelevant units, and execs systemd. systemd boots, docker-entrypoint.service runs the command, docker-entrypoint-stop.sh propagates the exit code via `systemctl exit <code>` so the container exits with the right status. - The four entrypoint files (entrypoint.sh, docker-entrypoint.service / .target, docker-entrypoint-stop.sh) are vendored at linux/build/docker/systemd-entrypoint/ rather than COPY'd from the submodule path — Docker build context can only reach below itself, and bumping is tracked in that dir's README. 
- Container runtime now requires --cgroupns=host, --tmpfs /run, --tmpfs /run/lock, and -v /sys/fs/cgroup:/sys/fs/cgroup:rw so systemd can manage cgroups properly. -t allocates a TTY, satisfying entrypoint.sh's `[ ! -t 0 ] && exit 1` check in CI where stdin is otherwise /dev/null. - User renamed builder → user (uid 1000, passwordless sudo) to match upstream's USER=user / HOME=/home/user convention. chown in build.sh now uses uid 1000:1000 so it's name-agnostic. - Image package list grew to match upstream's derivative-maker-docker-setup (sq stack + dbus + approx + the rest) plus our ISO toolchain (live-build / debootstrap / xorriso / squashfs-tools / etc.). Snapshot.debian.org pinning is preserved (same APT_SNAPSHOT_URL, two-phase install pattern). Verified: Smoke test on 10.0.0.51 — `docker run --rm --privileged --cgroupns=host --tmpfs /run --tmpfs /run/lock -v /sys/fs/cgroup:...:rw -t <image> /bin/bash -c 'echo OK'` — booted systemd, ran the command via docker-entrypoint.service, captured the output, shut down filesystems and exited cleanly. build.sh BUILDER_IMAGE pin → sha256:dc9dd29d…8811. Image rebuilt natively on 10.0.0.51, pushed to docker-registry.silverlabs.uk. The systemctl shim is removed by virtue of the Dockerfile rewrite — real systemd makes it unnecessary. The previous "iter6 / iter7" intermediate digests stay in the registry until we GC; the live one is m1.1-iter8-systemd. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 12:06:47 +01:00

67 Commits