A cross-section of geological strata — layered bands of sediment — standing in for the buried history of a software repository.
essays · May 31, 2026 · 5 min read

Strata: reading a repository as a dig site

I spent a night building a small agent toolkit that reads this site's git history as sediment — the rework, dead ends, companions, and quiet maintenance a finished feature is built to hide — and wrote ten interactive essays from what it dug up. Every number is real.

A deployed website is a surface. It is the part meant to be seen, and it is built — deliberately, with care — to hide its own making. The polish that makes a feature feel inevitable is the same polish that erases the weeks of detours behind it. Tonight I tried to dig that labor back up.

A repository is a dig site

Bowker and Star have a line that has stuck with me for years: infrastructure becomes invisible exactly when it works. You stop seeing the road and start seeing only the destination. The same recession happens to the labor that built a thing — success closes over its own history. But a git repository is one of the few places where that history is kept in full: dated, attributed, recoverable. So I treated this site's repository the way you'd treat a hillside cut open by a road — as strata. The deployed pages are the surface. Underneath are the abandoned attempts, the rework, the files dragged along, the renamings, the quiet upkeep.

The method is the argument

I built a little organization of agents to do the digging — an Excavator that surfaces the invisible labor behind a shipped feature, a Biographer that tells the whole life of a single file, a Classification Lab that treats the directory tree as a filing scheme and logs every reclassification, and a Necromancer that scores the features begun and abandoned. None of it invents anything. Every figure is regenerated by a program reading the commit graph — here, the full history, 1,327 commits. That constraint is the whole point: the numbers are real or they are nothing.
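To make "a program reading the commit graph" concrete, here is a minimal sketch, not the toolkit's actual code. It assumes the history has been dumped to text with `git log --numstat --no-renames --format=@%H` and counts how many commits touched each path. The function name `commits_per_file` and the sample log are hypothetical.

```python
from collections import Counter

# Hypothetical sample of `git log --numstat --no-renames --format=@%H` output.
# Each commit starts with an "@<hash>" line; each file line is
# "<added>\t<deleted>\t<path>" ("-" stands in for binary files).
SAMPLE_LOG = """@a1

3\t1\tCLAUDE.md
40\t0\tsrc/essay.html
@b2

2\t2\tCLAUDE.md
-\t-\tpublic/audio/track01.mp3
"""


def commits_per_file(log_text: str) -> Counter:
    """Count how many commits touched each path in a numstat log dump."""
    counts: Counter = Counter()
    seen_in_commit: set = set()
    for line in log_text.splitlines():
        if line.startswith("@"):
            # A new commit begins; reset the per-commit set of paths.
            seen_in_commit = set()
            continue
        parts = line.split("\t")
        if len(parts) != 3:
            # Blank separators and anything that is not a numstat row.
            continue
        path = parts[2]
        if path not in seen_in_commit:
            seen_in_commit.add(path)
            counts[path] += 1
    return counts


counts = commits_per_file(SAMPLE_LOG)
print(counts.most_common(1))  # [('CLAUDE.md', 2)]
```

Parsing a text dump rather than calling git inside the function keeps the count reproducible: the same dump always yields the same figure.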

What it found

The bibliography at the foot of one essay looks like it sprang up whole — and it did, in a single day. Yet of the 1,377 lines written for it, only 720 survive; 657 were written over with no deletion ever recorded, because the page is generated. The most-touched file on the site is CLAUDE.md, the guide no visitor ever loads — amended in 27 commits, almost none of them about it; it rides along with every feature like a witness. One directory alone holds 98 files that were written and then deleted, a whole hand-built trip-app login among them, abandoned for a shared sign-in. Seventeen audio files were carried in one move from a novel's manuscript folder into the public tree — same sound, wholly different identity. And some corners have gone silent not because they failed but because they were finished, intact and asking for nothing.
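The bibliography figures can be checked with a few lines of arithmetic. This is a sketch under one assumption: that "written over" means lines counted as inserted minus lines still standing. The helper name `survival` is hypothetical; the two inputs are the numbers quoted above.

```python
def survival(inserted: int, surviving: int) -> tuple[int, float]:
    """Return (lines written over, fraction of inserted lines still standing)."""
    if inserted <= 0:
        raise ValueError("inserted must be positive")
    return inserted - surviving, surviving / inserted


# Figures quoted in the essay for the bibliography page.
overwritten, ratio = survival(inserted=1377, surviving=720)
print(overwritten)      # 657
print(f"{ratio:.1%}")   # 52.3%
```

So just over half of what was written for the bibliography is still on the page.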

The discarded version is not waste. It is the reasoning that the surviving version inherited.

The honest part

The tenth essay turns the instrument on itself. The survival metric — the spine of the whole series — pins past 100% in 37 of 71 areas, because relocations and generated rebuilds leave more lines standing than were ever counted as inserted. Earlier in the night, run against a shallow clone, the tool nearly quoted a stray merge line as CLAUDE.md's first authored words. A truncated past produces a confident, wrong origin. None of that voids the other nine essays; where the instrument breaks, it breaks loudly and in public, which is the only honest way for an instrument to break. The tool is another stratum — measured here in turn, and just as partial as everything it measured.
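Both failure modes can be guarded against in code. The sketch below is illustrative and is not the series' own tooling: `is_shallow` asks git whether the clone's history is truncated (the `--is-shallow-repository` flag of `git rev-parse`, available in Git 2.15 and later), and `over_unity` lists areas where more lines survive than were ever counted as inserted. Both function names and the sample figures are hypothetical.

```python
import subprocess


def is_shallow(repo_dir: str = ".") -> bool:
    """Ask git whether this clone has a truncated history."""
    out = subprocess.run(
        ["git", "rev-parse", "--is-shallow-repository"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return out.strip() == "true"


def over_unity(areas: dict[str, tuple[int, int]]) -> list[str]:
    """Return the areas whose surviving line count exceeds the inserted count.

    `areas` maps an area name to (inserted, surviving).
    """
    return sorted(name for name, (ins, surv) in areas.items() if surv > ins)


# Hypothetical figures, not taken from the real repository.
SAMPLE_AREAS = {
    "essays/": (1000, 640),
    "notes/": (0, 120),       # relocated in: nothing counted as inserted
    "generated/": (200, 350), # rebuilt by a generator
}
print(over_unity(SAMPLE_AREAS))  # ['generated/', 'notes/']
```

A digging tool could refuse to quote a file's origin whenever `is_shallow()` returns true, and report the `over_unity` areas alongside the metric rather than clamping them to 100%.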

Experience it yourself
Explore the Strata series →