Run #4273 confirmed two things:

1. The reproducibility gate works end-to-end. Both builds produced ISOs
   (1077194752 vs 1077202944 bytes: an 8 KiB delta, exactly one squashfs
   block worth of compressed-payload drift) and the compare step caught
   it.

2. diffoscope, run on the whole 1 GB ISO inside the silvermetal-builder
   container, gets OOM-killed before producing any output:

       diagnose-divergence.sh: line 44: 13 Killed diffoscope --max-report-size 100000000 --html ... --text ... A.iso B.iso

   The host has 19 GiB free, but diffoscope's full recursion through
   ISO -> squashfs -> ~thousands of inner files needs more memory than
   that for a 1 GB image. Setting --max-report-size only caps the
   output, not the working set.

Rewrite diagnose-divergence.sh to do staged, cheap-to-expensive analysis:

1. sha256 + sizes (always)
2. xorriso TOC of both ISOs (every node: mode/size/mtime/path) -> diff
3. Pull just live/filesystem.squashfs out of each ISO, sha256 it,
   `unsquashfs -ll` it, and diff the listings. This is where the
   per-file-size signal lives.
4. Targeted diffoscope on the squashfs payload only, with
   --max-container-depth 2 + --max-text-report-size 5MB + --no-html +
   a 10-minute timeout. Bounded enough to finish without the OOM.

Drops `set -e`: every step `|| true`s itself so we get partial output
even when one stage fails.

The workflow's tail-into-log step now prints the new staged outputs:

* toc-diff.txt: what changed at the ISO level
* sqfs-ls-diff.txt: which inner files have different sizes/mtimes
* sqfs-diff.txt: diffoscope on the squashfs only
* squashfs-sha256.txt
* iso-header-cmp.txt: first-8KB `cmp -l` for header-level drift
* sizes.txt / sha256.txt / checklist.md as before

This should land us a focused list of "these N files inside the squashfs
have different bytes", which is what we need to find what's leaking
non-determinism into the build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
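For reference, the staged plan described above could be sketched roughly as follows. This is not the committed diagnose-divergence.sh, only a minimal illustration under stated assumptions: the helper names `run_stage` and `diagnose`, the `OUT` directory, the `stages.log` file, and the intermediate file names (`toc-a.txt`, `a.sqfs`, and so on) are hypothetical, and the exact xorriso and diffoscope invocations are a best guess at what the four stages imply. HTML output is simply not requested here rather than disabled with a flag, and the 5MB cap is written out in bytes.

```shell
#!/usr/bin/env bash
# Sketch only: staged, cheap-to-expensive ISO comparison.
# Deliberately no `set -e`; each stage records its own exit status instead.

OUT="${OUT:-./divergence-report}"   # hypothetical output directory

# Run one stage: stdout goes to $OUT/<name>, stderr and the exit status
# go to $OUT/stages.log. Always returns 0 so later stages still run.
run_stage() {
    local name="$1"; shift
    mkdir -p "$OUT"
    "$@" >"$OUT/$name" 2>>"$OUT/stages.log"
    local rc=$?
    echo "$name rc=$rc" >>"$OUT/stages.log"
    return 0
}

diagnose() {
    local a="$1" b="$2"
    mkdir -p "$OUT/work"

    # Stage 1: sizes and hashes (always cheap).
    run_stage sizes.txt  wc -c "$a" "$b"
    run_stage sha256.txt sha256sum "$a" "$b"
    run_stage iso-header-cmp.txt cmp -l -n 8192 "$a" "$b"

    # Stage 2: ISO table of contents, then diff the two listings.
    # Note: diff exits 1 when the inputs differ, which is expected here.
    run_stage toc-a.txt xorriso -indev "$a" -find / -exec lsdl --
    run_stage toc-b.txt xorriso -indev "$b" -find / -exec lsdl --
    run_stage toc-diff.txt diff -u "$OUT/toc-a.txt" "$OUT/toc-b.txt"

    # Stage 3: extract only the squashfs payload and compare its listing.
    run_stage extract-a.txt xorriso -osirrox on -indev "$a" \
        -extract /live/filesystem.squashfs "$OUT/work/a.sqfs"
    run_stage extract-b.txt xorriso -osirrox on -indev "$b" \
        -extract /live/filesystem.squashfs "$OUT/work/b.sqfs"
    run_stage squashfs-sha256.txt sha256sum "$OUT/work/a.sqfs" "$OUT/work/b.sqfs"
    run_stage sqfs-ls-a.txt unsquashfs -ll "$OUT/work/a.sqfs"
    run_stage sqfs-ls-b.txt unsquashfs -ll "$OUT/work/b.sqfs"
    run_stage sqfs-ls-diff.txt diff -u "$OUT/sqfs-ls-a.txt" "$OUT/sqfs-ls-b.txt"

    # Stage 4: bounded diffoscope on the squashfs only (10-minute timeout,
    # shallow recursion, text report capped at 5 MiB, written to stdout).
    run_stage sqfs-diff.txt timeout 600 diffoscope \
        --max-container-depth 2 \
        --max-text-report-size 5242880 \
        --text - "$OUT/work/a.sqfs" "$OUT/work/b.sqfs"
}

# Only run when called with two ISO paths, e.g. `script.sh A.iso B.iso`.
if [ "$#" -eq 2 ]; then
    diagnose "$1" "$2"
fi
```

Because `run_stage` always returns 0, a missing tool or a killed diffoscope shows up as a non-zero `rc=` line in `stages.log` while the cheaper outputs from earlier stages are still on disk.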