6 Commits

Author SHA1 Message Date
1b1a1eabed fix(linux/build): touch squashfs to SOURCE_DATE_EPOCH before xorriso (M1.1 iter35)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m25s
Run #4282's enriched diagnostic pinpointed the exact remaining drift:

    diagnose: first ISO byte difference at offset 205152 (LBA 100)
    205153   7  10
    205154  27   0
    205155  57   3
    205156  52  55

`cmp -l` prints byte values in octal. Converted to decimal, those are
the day/hour/minute/second fields of an ISO9660 7-byte directory
record date:
    A: dd=7  hh=23  mm=47  ss=42  (May 7 23:47:42 UTC)
    B: dd=8  hh=0   mm=3   ss=45  (May 8 00:03:45 UTC)

These match the wall-clock mtime of /live/filesystem.squashfs that the
TOC diff also still showed:
    -/live/filesystem.squashfs ... May  7 23:47
    +/live/filesystem.squashfs ... May  8 00:03
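
The octal-to-decimal decoding above can be reproduced with a short
shell sketch. The helper name `decode_dhms` is hypothetical (it is not
part of diagnose-divergence.sh). It assumes only that POSIX printf
parses a leading 0 as octal, and that ISO9660 logical blocks are 2048
bytes.

```shell
# Hypothetical helper, not from the repository: decode four octal byte
# values (as printed by `cmp -l`) into day/hour/minute/second.
decode_dhms() {
  # A leading 0 makes POSIX printf parse each argument as octal.
  printf 'dd=%d hh=%d mm=%d ss=%d\n' "0$1" "0$2" "0$3" "0$4"
}

decode_dhms 7 27 57 52    # column A from the cmp -l excerpt
decode_dhms 10 0 3 55     # column B from the cmp -l excerpt

# cmp -l offsets are 1-based; ISO9660 logical blocks are 2048 bytes.
echo "LBA $(( (205153 - 1) / 2048 ))"
```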

Why iter34's `-alter_date_r all "=N" /` didn't catch it: xorriso
applies `-alter_date_r` to the in-memory ISO node table, but `-update
<src> <iso_path>` writes the directory record's mtime at `-commit`
time using the SOURCE FILE's mtime — overriding whatever was in the
node table. So the relevant mtime is on
`/tmp/silvermetal-rebuilt-XXXXXX.squashfs` (the freshly-`mksquashfs`d
file), and that has wall-clock mtime.

Fix: touch the source file to SOURCE_DATE_EPOCH right before xorriso
reads it.

    sudo touch -d "@${SOURCE_DATE_EPOCH}" "${new_sqfs}"
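
A quick way to confirm the touch took effect, sketched on a scratch
file rather than the real squashfs. The fallback epoch value is an
arbitrary example, and `stat -c %Y` assumes GNU coreutils (which the
builder image ships).

```shell
# Sketch only: a scratch file stands in for the rebuilt squashfs.
# The fallback epoch below is an arbitrary example value.
SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH:-1700000000}
scratch=$(mktemp)

touch -d "@${SOURCE_DATE_EPOCH}" "${scratch}"

# GNU stat: %Y prints the mtime as seconds since the epoch.
mtime=$(stat -c %Y "${scratch}")
if [ "${mtime}" = "${SOURCE_DATE_EPOCH}" ]; then
  echo "mtime pinned to ${SOURCE_DATE_EPOCH}"
else
  echo "mtime is ${mtime}, expected ${SOURCE_DATE_EPOCH}" >&2
fi
rm -f "${scratch}"
```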

Bonus: diagnose-divergence.sh now falls back to `od -t x1z` when xxd
isn't available — silvermetal-builder ships coreutils but not
vim-common, so the iter34 xxd window was silently empty. The new
od-based dump is what landed the actual byte values in run #4282.
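
The xxd-or-od fallback could look roughly like the sketch below. The
function name `dump_window` and its argument order are hypothetical,
not the actual code in diagnose-divergence.sh; the od options used
(`-A d`, `-j`, `-N`, `-t x1z`) are GNU od.

```shell
# Hypothetical helper: hex-dump LEN bytes of FILE starting at byte
# offset START, preferring xxd and falling back to GNU od.
dump_window() {
  file=$1
  start=$2
  len=${3:-128}
  if command -v xxd >/dev/null 2>&1; then
    xxd -s "${start}" -l "${len}" "${file}"
  else
    # -A d: decimal offsets, -j: skip bytes, -N: byte count,
    # -t x1z: one hex byte per column plus a printable-text column.
    od -A d -t x1z -j "${start}" -N "${len}" "${file}"
  fi
}

sample=$(mktemp)
printf 'ABCDEFGH' > "${sample}"
dump_window "${sample}" 2 4
rm -f "${sample}"
```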

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 01:06:45 +01:00
34bc442dd8 fix(linux/build): cover all ISO9660 dates + locate residual byte drift (M1.1 iter34)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m40s
Run #4281 cleared every layer above the ISO9660 wrapper:

    SHA256 (squashfs payload)
    caed117ca72c6c1d9204c49dd749d5f7b372f3a19cac1b2a7e66bee452a8d501  /tmp/.../a.squashfs
    caed117ca72c6c1d9204c49dd749d5f7b372f3a19cac1b2a7e66bee452a8d501  /tmp/.../b.squashfs

…squashfs is now byte-identical, ISO TOC is identical, file listing
diff is empty, but ISO SHA still differs. The remaining drift is in
the ISO9660 metadata region between the system area (first 32 KiB)
and the file payload start.

Two complementary changes:

1. xorriso post-process now sets *every* date field xorriso writes,
   not just the obvious two:

     -alter_date_r all     — atime + mtime + btime on all nodes,
                             not just mtime. ISO9660 directory
                             records carry creation+modification
                             timestamps.
     -volume_date c m x f u s — every volume-descriptor date:
       c=creation  m=modification  x=expiration  f=effective
       u=system area  s=path table
     Default for any unset volume_date is "now", which is what was
     leaking through despite us setting c+m.

2. diagnose-divergence.sh now does whole-file cmp -l (capped at 200
   lines so 1 GiB of all-different doesn't drown the report) and on
   any divergence, dumps a 128-byte xxd window from each ISO around
   the first differing byte plus a unified diff between the two
   windows. This tells us in the next failure log "first byte differs
   at offset N (LBA M), bytes around it look like X" — pinpoints the
   ISO9660 region without needing artifact download.
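
A minimal sketch of the capped `cmp -l` pass and the offset-to-LBA
arithmetic from item 2. The function name `first_diff_report` is
hypothetical and the message format is only modelled on the log line
quoted in the iter35 commit; the real script may differ.

```shell
# Hypothetical helper: report the first differing byte of two files
# as a 0-based offset plus its 2048-byte logical block address.
first_diff_report() {
  a=$1
  b=$2
  # cmp -l prints "<1-based offset> <octal A> <octal B>" per differing
  # byte; head caps the listing at 200 lines.
  first=$(cmp -l "${a}" "${b}" 2>/dev/null | head -n 200 | awk 'NR == 1 { print $1 }')
  if [ -z "${first}" ]; then
    echo "no differing bytes"
    return 0
  fi
  off=$(( first - 1 ))
  echo "first byte differs at offset ${off} (LBA $(( off / 2048 )))"
}

# Demo on two scratch files that differ at byte offset 4100.
a=$(mktemp)
b=$(mktemp)
head -c 5000 /dev/zero > "${a}"
cp "${a}" "${b}"
printf 'X' | dd of="${b}" bs=1 seek=4100 conv=notrunc 2>/dev/null
first_diff_report "${a}" "${b}"
rm -f "${a}" "${b}"
```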

Workflow tail-into-log step wired up the two new files
(iso-cmp-first-200.txt, iso-around-first-diff.diff).

If iter34 still fails the gate, the new diagnostic tells us exactly
which structure (volume descriptor, path table, directory record,
boot catalog…) is still drifting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 00:29:37 +01:00
c8eac79afc fix(linux/build): xorriso -extract needs -osirrox on (M1.1 iter28)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 36m19s
Run #4275's TOC parser worked perfectly — found
/live/filesystem.squashfs as the largest file (983,547,904 bytes,
right where it should be) — but extraction still bailed:

    diagnose: largest file in ... is /live/filesystem.squashfs; extracting
    diagnose: could not extract rootfs from A

xorriso's -extract action requires -osirrox to be turned on at the
start of the command line; without it, -extract is silently rejected
("OSIRROX is not enabled by default. -osirrox on permits it."). Our
script swallowed stderr and the only signal was the empty output
file.

Two changes:
  * Add `-osirrox on` to every -extract invocation.
  * On extraction failure, surface the captured stderr (last 30
    lines) into the workflow log instead of dropping it. Saves us
    one round-trip if the next thing breaks.
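
The stderr-surfacing change can be sketched as a generic wrapper. The
name `run_or_show_stderr` is hypothetical and the demo uses a stand-in
failing command; in the diagnostic it would wrap the xorriso
`-osirrox on ... -extract` call.

```shell
# Hypothetical wrapper: run a command, and if it fails print the last
# 30 lines of its captured stderr instead of dropping them.
run_or_show_stderr() {
  errlog=$(mktemp)
  rc=0
  "$@" 2>"${errlog}" || rc=$?
  if [ "${rc}" -ne 0 ]; then
    echo "diagnose: command failed (exit ${rc}); last 30 stderr lines:" >&2
    tail -n 30 "${errlog}" >&2
  fi
  rm -f "${errlog}"
  return "${rc}"
}

# Stand-in for a failing extraction.
run_or_show_stderr sh -c 'echo "example error text" >&2; exit 3' \
  || echo "extraction failed, see stderr above"
```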

ISO layout from the iter27 dump for the record:
    /live/filesystem.squashfs   983547904 bytes  ← rootfs
    /live/initrd.img-...         62929840 bytes
    /live/vmlinuz-...            12113856 bytes
    /boot/grub/efi.img            3342336 bytes
    /EFI/boot/{boot,grub}x64.efi
    + grub modules under /boot/grub/{i386-pc,x86_64-efi}/

The named-path probe for /live/filesystem.squashfs was already first
in the list — it'll succeed cleanly now and we skip the largest-file
fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 21:07:39 +01:00
a2bee4b5dc fix(linux/build): better squashfs extraction + dump TOC sample (M1.1 iter27)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m47s
Run #4274 made progress: identical ISO sizes, identical TOC, identical
first 8 KiB — divergence is fully in file payload bytes. But the
diagnostic stalled because extract_squashfs() couldn't find the rootfs:

    diagnose: could not extract squashfs from A
    diagnose: could not extract squashfs from B

Two causes to address:

1. The named-path probes only checked /live/filesystem.squashfs,
   /casper/filesystem.squashfs and /filesystem.squashfs. Some live-build
   configs use /install/... or no canonical name at all.

2. The fallback that used `xorriso -find / -name '*.squashfs'` then
   piped to `xorriso -extract` didn't work because xorriso's -find
   output quotes paths, and -extract chokes on quotes.

This iteration:
  * Adds /install/filesystem.squashfs and /boot/filesystem.squashfs
    to the named-path probes.
  * Replaces the -find/-name/tail fallback with a generic "biggest
    file in the ISO" picker. In a live-build ISO the rootfs payload
    is reliably the largest file regardless of what it's called.
    Parses lsdl output (with awk, handling spaces in paths and
    stripping single-quote framing).
  * On extraction failure, dumps the top 20 files by size to stderr
    so the workflow log shows what's actually in the ISO — answers
    "what should the named-path probe match" for the next iter.
  * Always echoes the first 30 lines of toc-a.txt (and the line
    count) so we can sanity-check the ISO layout in every run.
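
A rough sketch of the "biggest file" picker. It assumes the listing is
`ls -l`-style (mode, links, uid, gid, size, three date fields, then a
path that may be wrapped in single quotes); that column layout, the
function name `largest_file`, and the second sample line are
assumptions for illustration, not the script's actual code.

```shell
# Hypothetical helper: read an ls -l style listing on stdin and print
# the path of the largest regular file.
largest_file() {
  awk '
    $1 ~ /^-/ && $5 ~ /^[0-9]+$/ {
      size = $5 + 0
      path = $0
      # Drop the first eight fields; the remainder is the path, which
      # keeps any embedded spaces intact.
      for (i = 1; i <= 8; i++) sub(/^[ \t]*[^ \t]+/, "", path)
      sub(/^[ \t]+/, "", path)
      # Strip single-quote framing (\047 is the quote character).
      gsub(/^\047|\047$/, "", path)
      if (size > max) { max = size; best = path }
    }
    END { if (best != "") print best }
  '
}

# Sample listing: the first path and size are from this log, the
# second line is made up to exercise a path containing a space.
largest_file <<'EOF'
-rw-r--r--    1 0        0        983547904 May  7 23:47 '/live/filesystem.squashfs'
-rw-r--r--    1 0        0             1024 May  7 23:47 '/docs/read me.txt'
drwxr-xr-x    1 0        0                0 May  7 23:47 '/live'
EOF
```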

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 20:32:01 +01:00
c9e67d8b47 fix(linux/build): staged divergence diagnostic, avoid OOM (M1.1 iter26)
Some checks failed
Build SilverMetal Linux ISO (reproducibility-gated) / builder-image (push) Successful in 1s
Build SilverMetal Linux ISO (reproducibility-gated) / build-and-verify (push) Failing after 33m36s
Run #4273 confirmed two things:

1. The reproducibility gate works end-to-end. Both builds produced
   ISOs (1077194752 vs 1077202944 bytes — 8 KiB delta, exactly one
   squashfs block worth of compressed-payload drift) and the compare
   step caught it.

2. diffoscope, run on the whole 1 GB ISO inside the silvermetal-builder
   container, gets OOM-killed before producing any output:

       diagnose-divergence.sh: line 44:    13 Killed
         diffoscope --max-report-size 100000000 --html ... --text ... A.iso B.iso

   The host has 19 GiB free, but diffoscope's full recursion through
   ISO -> squashfs -> ~thousands of inner files needs more memory than
   that for a 1 GB image. Setting --max-report-size only caps the
   output, not the working-set.

Rewrite diagnose-divergence.sh to do staged, cheap-to-expensive
analysis:
  1. sha256 + sizes (always)
  2. xorriso TOC of both ISOs (every node: mode/size/mtime/path) -> diff
  3. Pull just live/filesystem.squashfs out of each ISO,
     sha256 it + `unsquashfs -ll` it, diff the listings — this is
     where the per-file-size signal lives.
  4. Targeted diffoscope on the squashfs payload only, with
     --max-container-depth 2 + --max-text-report-size 5MB + --no-html
     + a 10-minute timeout. Bounded enough to finish without the OOM.

Drops `set -e` — every step `|| true`s itself so we get partial output
even when one stage fails.
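
The "every stage tolerates its own failure" pattern might be sketched
as follows. The `stage` helper and the scratch files are hypothetical
stand-ins; the real script runs xorriso, unsquashfs and diffoscope in
these slots.

```shell
# Hypothetical helper: run one named stage and never abort the script,
# so later stages still produce output after an earlier failure.
stage() {
  name=$1
  shift
  echo "== stage: ${name}"
  "$@" || echo "stage '${name}' failed (exit $?), continuing" >&2
}

workdir=$(mktemp -d)
printf 'example' > "${workdir}/a.bin"   # stand-ins for the two ISOs
printf 'exemple' > "${workdir}/b.bin"

stage "sha256"  sha256sum "${workdir}/a.bin" "${workdir}/b.bin"
stage "sizes"   wc -c "${workdir}/a.bin" "${workdir}/b.bin"
stage "compare" cmp "${workdir}/a.bin" "${workdir}/b.bin"   # exits 1: files differ
stage "after"   echo "later stages still run"
rm -rf "${workdir}"
```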

Workflow tail-into-log step now prints the new staged outputs:
  * toc-diff.txt   — what changed at the ISO level
  * sqfs-ls-diff.txt — which inner files have different sizes/mtimes
  * sqfs-diff.txt   — diffoscope on the squashfs only
  * squashfs-sha256.txt
  * iso-header-cmp.txt — first-8KB cmp -l for header-level drift
  * sizes.txt / sha256.txt / checklist.md as before

Should land us a focused list of "these N files inside the squashfs
have different bytes" — that's what we need to find what's leaking
non-determinism into the build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 19:54:35 +01:00
4444dc11f3 feat(linux/build): scaffold reproducible ISO build pipeline (M1.1)
Vendors Kicksecure derivative-maker as a pinned submodule (18.1.7.4),
adds the wrapper + verify + diagnose scripts, the pinned builder image,
and the reproducibility-gated Gitea Actions workflow. Base flavour only —
no hardening overlay (that's M1.2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 04:25:48 +01:00