RESEARCH PAPER

Classification is Infrastructure

Planning Layers, Taxonomic Commitment, and the Architecture of Intelligent Systems

Jake Lawrence · Infrastructure Theory · 3 Case Studies · March 2026

THE IDEA

The Categories Are Not Scaffolding — They’re Load-Bearing Walls

Why the most consequential design decisions in intelligent systems are taxonomic, not algorithmic.

Every system that acts under uncertainty depends on a classification layer — a set of categories that determines what the system can represent and therefore what it can do. A doctor can only diagnose conditions their manual contains. A compiler can only select strategies its lattice enumerates. An agent can only reason about distinctions its architecture represents.

This paper argues that these classification layers function as infrastructure in the technical sense developed by Bowker and Star: embedded in other structures, transparent when working, constitutive rather than descriptive, and resistant to change once installed.

The implication is stark: improving the algorithm only searches within the space the taxonomy defines. Only taxonomic revision can expand it. And the taxonomy becomes invisible precisely when it is most consequential.
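The claim can be stated as a pair of inequalities. This is a sketch in hypothetical notation that does not appear in the paper: T is the finite set of options the taxonomy can represent, c is a cost over options, A is any search procedure confined to T, and T' is a revised taxonomy that extends T.

```latex
% Hypothetical notation (not from the paper): T = representable options,
% c = cost function, A = any search procedure that returns an element of T,
% T' = a revised taxonomy with T \subseteq T'. Minima exist because T, T' are finite.
\[
  A(T) \in T \;\Longrightarrow\; c\bigl(A(T)\bigr) \;\ge\; \min_{x \in T} c(x),
  \qquad
  T \subseteq T' \;\Longrightarrow\; \min_{x \in T'} c(x) \;\le\; \min_{x \in T} c(x).
\]
```

The first inequality says a better algorithm can at most reach the best option in T. The second says only enlarging T can lower that floor.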

The better a taxonomy works, the harder it is to see as a taxonomy. Transparency produces reification: the tool becomes invisible, and the world it constructs is mistaken for the world as it is.

3 CASE STUDIES

One Pattern, Three Domains

The same structural pattern — taxonomy defines possibility, optimization searches within it, taxonomy becomes invisible — appears in radically different systems.

1. 🧠 Psychiatric Classification · The DSM · Hard infrastructure

The DSM’s diagnostic categories have become institutional infrastructure — persisting for decades despite known scientific inadequacy. They define what conditions are diagnosable, what insurance covers, how research is designed, how patients understand their own suffering. Multiple superior alternatives exist; none can displace the DSM because the switching costs are societal.

2. 🧮 Inference Execution Planning · LLM-QP · Soft infrastructure

The plan lattice in a constrained decoding optimizer determines what strategies are discoverable. No cost model, however sophisticated, can find an operator the lattice doesn’t contain. A purely computational instance of the same structural pattern — no human institutions involved, but the constitutive property is identical.
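A minimal sketch of that constitutive property, assuming a toy lattice. The operator names and costs are hypothetical illustrations, not LLM-QP's actual operators or cost model. Even a cost model that knows every true cost can only return an operator the lattice enumerates.

```python
from typing import Callable, Dict, Iterable

# Hypothetical operators and costs, for illustration only (not LLM-QP's real lattice).
TRUE_COSTS: Dict[str, float] = {
    "greedy_decode": 9.0,
    "beam_search": 7.0,
    "grammar_masked_decode": 2.0,
}


def perfect_cost_model(operator: str) -> float:
    """An idealized cost model that knows the true cost of every operator."""
    return TRUE_COSTS[operator]


def best_plan(lattice: Iterable[str], cost: Callable[[str], float]) -> str:
    """Return the cheapest operator among those the lattice enumerates."""
    # sorted() makes tie-breaking deterministic; min() never looks outside the lattice.
    return min(sorted(lattice), key=cost)


narrow_lattice = {"greedy_decode", "beam_search"}
wide_lattice = narrow_lattice | {"grammar_masked_decode"}

narrow_choice = best_plan(narrow_lattice, perfect_cost_model)
wide_choice = best_plan(wide_lattice, perfect_cost_model)

print(narrow_choice)  # beam_search
print(wide_choice)    # grammar_masked_decode
```

The cost model is identical in both calls. The cheaper plan appears only after the lattice itself is extended.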

3. 🪞 Agent Cognitive Architecture · SAGEN · Medium infrastructure

A six-module cognitive architecture determines the behavioral ceiling of the agent. Of 20 information dimensions evaluated, 16 required explicit architectural support — no improvement in the underlying LLM produces them if the architecture doesn’t represent them. The adapter pattern provides a deliberate anti-reification mechanism.

THE HARDNESS SPECTRUM

Soft · LLM-QP: Extendable by design. No looping effects. Technical switching costs.

Medium · SAGEN: Modular at the domain layer, constraining at the module layer.

Hard · DSM: Looping effects. Societal switching costs. No anti-reification mechanism.

THE INFRASTRUCTURE PARADOX

Best When Invisible, Most Dangerous When Invisible

Classification works best when you don’t notice it — but that’s exactly when it’s hardest to question.

1. Design

Categories are chosen as practical tools — provisional, purpose-specific, explicitly acknowledged as decisions. Everyone knows they’re making choices.

2. Installation

Categories get embedded in practice. They’re learned as part of professional socialization, linked to conventions, connected to standards and other systems.

3. Transparency

Categories become invisible. Users look through them, not at them. They cease to feel like choices and start to feel like features of the world.

4. Reification

Categories are treated as discoveries rather than decisions. Provisional conventions harden into what feel like natural kinds. The taxonomy is now load-bearing — and nobody remembers building the walls.

The cycle is not a failure of vigilance. It is a consequence of infrastructure’s defining property: to function, it must be transparent; to be transparent, it must become invisible; to become invisible, it must cease to feel like a choice.

RUN IT

See the Pattern Move

The thesis is abstract until you can break it. Below, the load-bearing taxonomy claim is made literal in the SAGEN engine. After that, the two computational cases, SAGEN and LLM-QP, are available as runnable systems you can drive yourself.

Interactive · the taxonomy is load-bearing

Collapse the categories, lose the thought

The same SAGEN conversation, the same engine, the same code path. The only thing that changes is the entity taxonomy the World Model is allowed to use. Watch what becomes unthinkable when the categories collapse.

Trajectory the agent can represent

progress · Initial topics: Python, web scraping

pivot · Pivoted: web scraping -> cooking, pasta carbonara

progress · Continuing: Python, web scraping

pivot · Pivoted: cooking, web scraping -> BeautifulSoup, Python, debugging

2 topic pivots detected

With the real taxonomy the engine detects 2 pivots: it can tell that the user abandoned Python for pasta and later lurched into a debugging panic. “The subject changed” is a representable thought.
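The effect can be reproduced in a few lines. This is a simplified sketch, not SAGEN's actual World Model: the function `count_pivots`, the category labels, the three-turn conversation, and the rule "a pivot is a turn that shares no category with the previous turn" are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def count_pivots(turns: List[List[str]], taxonomy: Dict[str, str]) -> int:
    """Count turns whose categories share nothing with the previous turn's categories."""
    pivots = 0
    previous: Set[str] = set()
    for entities in turns:
        current = {taxonomy[entity] for entity in entities}
        # The first turn has no predecessor, so it can never be a pivot.
        if previous and previous.isdisjoint(current):
            pivots += 1
        previous = current
    return pivots


# Hypothetical conversation, using entity names that appear in the demo above.
turns = [
    ["Python", "web scraping"],
    ["cooking", "pasta carbonara"],
    ["BeautifulSoup", "Python", "debugging"],
]

# Hypothetical category labels; the real SAGEN taxonomy is not shown in this page.
real_taxonomy = {
    "Python": "programming",
    "web scraping": "programming",
    "BeautifulSoup": "programming",
    "debugging": "programming",
    "cooking": "food",
    "pasta carbonara": "food",
}

# Collapse every entity into a single category.
collapsed_taxonomy = {entity: "topic" for entity in real_taxonomy}

real_pivots = count_pivots(turns, real_taxonomy)
collapsed_pivots = count_pivots(turns, collapsed_taxonomy)

print(real_pivots)       # 2
print(collapsed_pivots)  # 0
```

The conversation and the detection code are the same in both calls. With one category, two consecutive turns can never be disjoint, so a pivot cannot be represented, let alone detected.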

SAGEN · Medium infrastructure

Watch the six-module taxonomy decide what the agent can represent.

LLM-QP · Soft infrastructure

Drive the plan lattice: the router can only choose operators the lattice contains.

WHY IT MATTERS

Taxonomic Commitment as First-Class Design

The paper doesn’t argue for taxonomic nihilism (all categories are arbitrary) or taxonomic perfectionism (there’s one right answer). It argues for taxonomic awareness: treating the classification layer as a first-class design decision with structural consequences that persist long after the decision is made.

The practical implications include building anti-reification mechanisms into systems (like SAGEN’s adapter pattern or LLM-QP’s formal equivalence proofs), designing layered taxonomic architectures that separate deep cognitive infrastructure from shallow domain taxonomy, and budgeting for taxonomic debt the same way we budget for technical debt.
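One plausible reading of a layered design with an adapter is sketched below. The page does not describe SAGEN's adapter interface, so `DomainAdapter`, `DictAdapter`, and `WorldModel.swap_adapter` are hypothetical names. The deep layer stores raw observations, and the shallow layer is a named, swappable taxonomy.

```python
from typing import Dict, List, Protocol


class DomainAdapter(Protocol):
    """Shallow layer: an explicitly named, replaceable domain taxonomy (hypothetical)."""

    name: str

    def categorize(self, entity: str) -> str: ...


class DictAdapter:
    """A simple adapter backed by a lookup table."""

    def __init__(self, name: str, mapping: Dict[str, str], default: str = "unknown") -> None:
        self.name = name
        self._mapping = mapping
        self._default = default

    def categorize(self, entity: str) -> str:
        return self._mapping.get(entity, self._default)


class WorldModel:
    """Deep layer: keeps raw entities and never hard-codes domain categories."""

    def __init__(self, adapter: DomainAdapter) -> None:
        self.adapter = adapter
        self.entities: List[str] = []

    def observe(self, entity: str) -> None:
        self.entities.append(entity)

    def swap_adapter(self, adapter: DomainAdapter) -> None:
        # Raw entities are retained, so categories can be recomputed under a new taxonomy.
        self.adapter = adapter

    def view(self) -> Dict[str, str]:
        return {entity: self.adapter.categorize(entity) for entity in self.entities}


model = WorldModel(DictAdapter("coarse", {"Python": "tech", "BeautifulSoup": "tech"}))
model.observe("Python")
model.observe("BeautifulSoup")
coarse_view = model.view()

model.swap_adapter(DictAdapter("fine", {"Python": "language", "BeautifulSoup": "library"}))
fine_view = model.view()

print(coarse_view)  # {'Python': 'tech', 'BeautifulSoup': 'tech'}
print(fine_view)    # {'Python': 'language', 'BeautifulSoup': 'library'}
```

Because the categories live in a named object that can be replaced, the taxonomy stays visible as a decision, which is the sense in which an adapter can work against reification. Storing raw entities, not categorized ones, is what keeps the swap cheap.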

This paper provides the shared theoretical framework that connects the author’s work on constrained decoding (LLM-QP) and cognitive architecture (SAGEN), showing they are instances of a single underlying pattern.

Read the Full Paper

Cross-case analysis, infrastructure properties, taxonomic debt, design implications


References & lineage

1. Bowker, G. C., & Star, S. L. (1999). Sorting Things Out: Classification and Its Consequences. MIT Press.

2. Star, S. L., & Ruhleder, K. (1996). Steps toward an ecology of infrastructure. Information Systems Research, 7(1).

3. Hacking, I. (1999). The Social Construction of What? (the looping effects of human kinds). Harvard University Press.

4. Lampland, M., & Star, S. L. (eds.) (2009). Standards and Their Stories. Cornell University Press.