Every system that acts under uncertainty depends on a classification layer — a set of categories that determines what the system can represent and therefore what it can do. We argue that classification layers function as infrastructure in the technical sense developed by Bowker and Star (1999): embedded in other structures, transparent when working, constitutive rather than descriptive, and resistant to change once installed. It follows that the most consequential design decisions in such systems are taxonomic, not algorithmic. We develop this thesis through three case studies spanning radically different domains: psychiatric nosology (the DSM), LLM inference execution planning (LLM-QP), and LLM agent cognition (SAGEN). Across all three cases, the same structural pattern holds: the taxonomy defines the space of possibility, optimization searches only within that space, and the taxonomy becomes invisible precisely when it is most consequential.
Introduction — The Taxonomic Blind Spot
Across AI, medicine, and systems design, enormous effort goes into optimizing algorithms, architectures, training data, and governance. The categories those systems operate over receive comparatively little deliberate attention — yet the classification layer is where the behavioral ceiling gets set.
Central claim. In any system where an agent must act under uncertainty, the classification layer functions as infrastructure. It is embedded, transparent, constitutive, and resistant to change. The most consequential design decisions are taxonomic, not algorithmic.
The paper previews the three case studies through the question each one answers:
- Can a classification system become so embedded that it persists for decades despite known scientific inadequacy? Psychiatric classification shows that it can.
- Does taxonomic commitment constrain optimization even in purely computational systems with no human institutions? LLM-QP’s plan lattice shows that it does.
- Can a system be deliberately designed to resist the infrastructural hardening that makes classification invisible? SAGEN’s adapter pattern attempts this.
Scope. The paper argues for taxonomic awareness — not taxonomic nihilism (all categories are arbitrary) or taxonomic perfectionism (one right taxonomy exists). It treats the classification layer as a first-class design decision with structural consequences that persist long after the decision is made.
Classification as Infrastructure
Bowker and Star’s Sorting Things Out (1999) identified eight properties that characterize infrastructure. These are not a checklist; they form a mechanism.
| # | Property | Description |
|---|---|---|
| 1 | Embeddedness | Sunk into other structures, social arrangements, and technologies |
| 2 | Transparency | Invisible in routine use — users look through the system |
| 3 | Reach or scope | Extends beyond a single event or one-site practice |
| 4 | Learned as membership | Acquired as part of socialization into a community of practice |
| 5 | Links with practice | Shapes and is shaped by the practices it organizes |
| 6 | Embodiment of standards | Plugs into other standards and is determined by their reach |
| 7 | Built on installed base | Inherits strengths and limitations of predecessors (path dependency) |
| 8 | Visible upon breakdown | Disappears when working; surfaces only when something fails |
Why these properties interact. Properties 1–2 (embeddedness + transparency) explain invisibility: because the taxonomy is sunk into everything and invisible in use, it ceases to feel like a design decision. Properties 7–8 (installed base + breakdown visibility) explain lock-in: because the taxonomy inherits from its predecessors and is only noticed when it fails, revision is both constrained by path dependency and triggered only by crisis. The combination — invisibility plus lock-in — is the mechanism that produces the infrastructure paradox.
Key Concepts from Bowker and Star
- Boundary object. A shared artifact used differently by different communities while maintaining enough structural identity to coordinate across them.
- Torque. The biographical tension experienced by people whose lives don’t fit the available categories.
- Infrastructural inversion. The methodological move of foregrounding what is normally in the background.
Taxonomic Commitment
Definition. A taxonomic commitment is the set of categories a system operates with — the distinctions it can draw, the groupings it can represent, the boundaries it enforces. Every system that classifies makes a taxonomic commitment, whether or not it acknowledges doing so.
The key property. A taxonomic commitment defines the space of representable states, and therefore the space of achievable behaviors. An optimizer can only search within the space its taxonomy defines. A clinician can only diagnose conditions their manual contains. An agent can only reason about distinctions its architecture represents. Only taxonomic revision can expand this space.
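The constraint is mechanical, and a toy sketch makes it visible. All names and scores below are illustrative, not drawn from any of the systems discussed: a minimal optimizer can only return the best option among those its category set enumerates, so an option outside the taxonomy is unreachable no matter how good the scoring function is.

```python
def best_within(taxonomy, score):
    """Return the highest-scoring option the taxonomy can represent."""
    return max(taxonomy, key=score)

# Suppose the true optimum is "dimensional", but the installed taxonomy
# enumerates only categorical options: the optimizer cannot find it.
score = {"cat_a": 0.4, "cat_b": 0.7, "dimensional": 0.95}.get
installed = ["cat_a", "cat_b"]            # the taxonomic commitment
print(best_within(installed, score))      # "cat_b" -- the ceiling, not the optimum

# Only taxonomic revision expands the space:
revised = installed + ["dimensional"]
print(best_within(revised, score))        # "dimensional"
```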
Philosophical Grounding
- Dupré’s promiscuous realism. Multiple equally legitimate ways to classify the same reality exist, each optimized for different purposes.
- Zachar’s practical kinds. Categories are tools, not mirrors. Validity is measured by how well they serve their intended purposes.
- Hacking’s interactive kinds. In domains involving human subjects, the categories and the classified co-constitute each other. The taxonomy is not just a lens on reality but an intervention in it.
Three Predictions
- Constitutiveness. The taxonomy defines the representable space, not merely describes it.
- Lock-in. The taxonomy becomes harder to revise the more successfully it is installed.
- Invisibility through transparency. The taxonomy becomes harder to see the more fluently practitioners use it.
Psychiatric Classification — The DSM
The most fully developed case. All eight infrastructure properties instantiated in an existing system with decades of history and global reach. Demonstrates all three predictions at maximum intensity — including looping effects unavailable in the computational cases.
The System
The DSM’s categories determine what insurance covers, how research is designed, which drugs are developed, how courts evaluate competency, how schools provide accommodations, and how patients understand their own suffering. It instantiates all eight infrastructure properties: embedded in insurance billing, legal standards, pharmaceutical regulation, EHRs, educational accommodation systems, and military benefits. Transparent to trained clinicians who have internalized its categories as perceptual habits.
The Taxonomic Commitment
The DSM’s core commitments were design decisions, not discoveries:
- Categorical rather than dimensional
- Symptom-based rather than etiological
- Atheoretical with respect to causation
- Individual rather than relational as the unit of analysis
Spitzer’s DSM-III optimized for reliability (inter-rater agreement) over validity, and for institutional utility over phenomenological accuracy. Once installed, these choices became invisible — the water clinicians swim in. Consequences include categories that group together people with wildly different experiences (1,030 unique symptom profiles under one depression diagnosis) and arbitrary thresholds (five of nine symptoms for two weeks).
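The heterogeneity is easy to quantify. A polythetic rule of "any five of nine symptoms" already admits many distinct qualifying presentations, even before compound symptoms are disaggregated — a back-of-envelope count:

```python
from math import comb

# Polythetic diagnosis: at least 5 of 9 criterion symptoms must be present.
# Count the distinct symptom subsets that satisfy the rule.
qualifying = sum(comb(9, k) for k in range(5, 10))
print(qualifying)  # 256
```

The 256 subsets are a structural lower bound; the 1,030 observed profiles cited above indicate even finer-grained heterogeneity in practice.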
The Switching Cost Problem
Multiple technically superior alternatives exist: RDoC (dimensional, biologically grounded), HiTOP (hierarchical dimensional), network theory (no latent-variable assumption), and process-based therapy (transdiagnostic mechanisms). None has displaced the DSM — not because the DSM is better, but because it is installed.
Replacement requires simultaneous overhaul of insurance billing systems, retraining of hundreds of thousands of clinicians, revision of legal standards across all jurisdictions, rebuilding of EHRs, reanalysis of decades of research, and managing disruption to millions of people whose identity is organized around current categories. The installed base wins.
What This Case Uniquely Reveals
Looping. DSM categories do not merely describe mental disorders; they partially constitute them. The classified become aware of the classification, change their behavior, and thereby change the phenomenon. This is Hacking’s interactive kinds operating at institutional scale.
Boundary object persistence. The DSM endures not because any single community finds it optimal but because it is adequate enough for coordination across all communities — clinicians, insurers, researchers, lawyers, pharmaceutical companies, patients — while being optimal for none.
Inference Execution Planning — LLM-QP
A compressed, computational instance of the same structural pattern. No looping, lower switching costs, but identical constitutive property. Demonstrates that taxonomic commitment is structural, not sociological.
The System
Constrained LLM decoding requires checking every token against a validity set. LLM-QP formalizes the observation that multiple execution strategies are semantically equivalent — they produce identical token sequences — but differ in runtime cost. Five physical plans implement the single logical operation DecodeStep(query, constraint_state): dense projection head, sparse adjacency scoring, amortized score update, amortized update with rerank, and full recomputation.
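The structure can be sketched as a planner that selects the cheapest physical plan for a logical DecodeStep. The plan names follow the paper; the cost functions, signatures, and numbers are illustrative assumptions, not LLM-QP’s actual API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class PhysicalPlan:
    name: str
    cost: Callable[[int], float]  # illustrative: cost as a function of valid-set size

# The plan lattice IS the taxonomic commitment: the planner can only
# choose among the plans enumerated here. Cost constants are made up.
LATTICE = [
    PhysicalPlan("dense_projection", lambda n: 1000.0),         # full-vocab head
    PhysicalPlan("sparse_adjacency", lambda n: 2.0 * n),        # score only valid tokens
    PhysicalPlan("amortized_update", lambda n: 0.5 * n + 50.0), # reuse prior scores
    PhysicalPlan("amortized_rerank", lambda n: 0.5 * n + 120.0),# amortize, then rerank
    PhysicalPlan("full_recompute",   lambda n: 5.0 * n + 10.0), # recompute from scratch
]

def plan_decode_step(valid_set_size: int) -> PhysicalPlan:
    """Cost-based selection -- searches only within the lattice."""
    return min(LATTICE, key=lambda p: p.cost(valid_set_size))

print(plan_decode_step(10).name)    # sparse_adjacency: small valid set
print(plan_decode_step(5000).name)  # dense_projection: large valid set
```

The point of the sketch: `plan_decode_step` can be arbitrarily clever about cost, but it can never return a plan absent from `LATTICE`.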
The Taxonomic Commitment
The plan lattice — the set of five physical implementations the planner considers — is LLM-QP’s taxonomic commitment. A planner whose lattice lacks an amortized operator cannot discover amortized savings, regardless of cost-model sophistication or bandit algorithm. This mirrors database query optimizers: the enumerated physical operators define the search space, and the cost model merely ranks the options the lattice provides.
Infrastructure Properties
- Embeddedness. The lattice is embedded in the MLIR/StableHLO compiler pipeline.
- Transparency. When working well, invisible — the system just runs fast.
- Built on installed base. Strategies constrained by existing kernel implementations and hardware capabilities.
- Links with practice. The bandit achieves sub-linear regret relative to oracle plan selection — but convergence to the best plan within the lattice, not the best plan conceivable.
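The "links with practice" point can be illustrated with a minimal epsilon-greedy bandit over the lattice. The bandit scheme and reward numbers here are illustrative assumptions — the paper claims only sub-linear regret, not this particular algorithm. Note what the sketch shows: the bandit converges to the best *enumerated* plan; a cheaper plan outside the lattice is structurally unreachable.

```python
import random

random.seed(0)

PLANS = ["dense", "sparse", "amortized", "amortized_rerank", "recompute"]
# Hidden true mean rewards (e.g. negative normalized latency); made-up numbers.
TRUE_REWARD = {"dense": 0.2, "sparse": 0.6, "amortized": 0.8,
               "amortized_rerank": 0.7, "recompute": 0.3}

counts = {p: 0 for p in PLANS}
means = {p: 0.0 for p in PLANS}

def select(eps=0.1):
    if random.random() < eps:
        return random.choice(PLANS)            # explore -- within the lattice
    return max(PLANS, key=lambda p: means[p])  # exploit best-so-far

for _ in range(5000):
    p = select()
    r = TRUE_REWARD[p] + random.gauss(0, 0.1)  # noisy observed reward
    counts[p] += 1
    means[p] += (r - means[p]) / counts[p]     # incremental mean update

print(max(PLANS, key=lambda p: means[p]))  # converges to "amortized"
```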
What This Case Uniquely Reveals
LLM-QP isolates the constitutive property in a domain stripped of human institutions, identity, and looping. The plan lattice is a deliberate engineering artifact, yet it exhibits the same constitutive property as the DSM. The case also contributes an anti-reification device: the formal plan equivalence proof, which makes the lattice’s contingency visible — the plans are choices among equivalent alternatives, not the unique correct implementation.
Agent Cognitive Architecture — SAGEN
Bridges the gap between the computational and institutional cases. Constitutive like LLM-QP; shapes perception and action like the DSM; but includes a deliberate anti-reification mechanism.
The System
SAGEN provides LLM agents with structured, persistent situational awareness through six cognitive modules on a shared blackboard: Goal Graph, Trajectory, World Model, Self Model, Attention Priorities, and Interaction Protocol. Coordination occurs through an Observe–Update–Inject loop.
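A minimal sketch of the coordination loop. The module names follow SAGEN; the blackboard representation and method signatures are illustrative assumptions, not the actual interface.

```python
# One pass: each module updates its slice of the shared blackboard,
# then contributes ("injects") context for the agent's next LLM call.
class Module:
    name = "base"
    def update(self, observation, blackboard):
        blackboard[self.name] = observation            # write module state
    def inject(self, blackboard):
        return f"{self.name}: {blackboard.get(self.name)}"

class GoalGraph(Module):   name = "goal_graph"
class Trajectory(Module):  name = "trajectory"
class WorldModel(Module):  name = "world_model"
class SelfModel(Module):   name = "self_model"
class Attention(Module):   name = "attention"
class Interaction(Module): name = "interaction_protocol"

MODULES = [GoalGraph(), Trajectory(), WorldModel(),
           SelfModel(), Attention(), Interaction()]

def observe_update_inject(observation, blackboard):
    """One iteration of the Observe-Update-Inject loop."""
    for m in MODULES:
        m.update(observation, blackboard)               # Update
    return [m.inject(blackboard) for m in MODULES]      # Inject

bb = {}
context = observe_update_inject("user pivoted to a new topic", bb)
print(len(context))  # 6 -- one context fragment per module
```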
The Taxonomic Commitment
Module-level. The six-module decomposition determines what the agent can represent and act on. An agent lacking a Trajectory module cannot distinguish a topic pivot from a topic abandonment. An agent without typed Attention cannot allocate urgency differentially.
Finer-grain. The Trajectory’s seven transition types (progress, reversal, pivot, discovery, external event, failure, branch) define recognizable episodic patterns. The Attention module’s four categories (threat, opportunity, anomaly, transition) determine representable salience. Domain adapter scan patterns are the agent’s perceptual categories.
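The finer-grained vocabularies above can be written as closed enums, which makes the taxonomic commitment explicit: an episode that fits none of these types is simply not representable. The type names follow the paper; casting them as Python enums is an illustrative choice.

```python
from enum import Enum

class Transition(Enum):      # the Trajectory module's episodic vocabulary
    PROGRESS = "progress"
    REVERSAL = "reversal"
    PIVOT = "pivot"
    DISCOVERY = "discovery"
    EXTERNAL_EVENT = "external_event"
    FAILURE = "failure"
    BRANCH = "branch"

class Salience(Enum):        # the Attention module's representable salience
    THREAT = "threat"
    OPPORTUNITY = "opportunity"
    ANOMALY = "anomaly"
    TRANSITION = "transition"

print(len(Transition), len(Salience))  # 7 4
```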
The Adapter as Anti-Reification Mechanism
The domain adapter pattern is a deliberate architectural response to the reification problem — requiring a developer to explicitly choose entity types, relationship types, and scan patterns for each domain. This keeps categories visible as design decisions rather than allowing them to calcify into invisible assumptions.
The adapter pattern does not solve the reification problem. The six modules themselves are not adapter-replaceable — they are the installed base. The adapter resists reification at the domain-content level while the module-level taxonomy remains vulnerable. This illustrates that anti-reification is a gradient, not a binary.
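The layering can be sketched as a contract: the adapter is the shallow, swappable layer that holds the domain's perceptual categories, while the module set stays fixed. The attribute names and the example domain below are illustrative assumptions, not SAGEN's actual interface.

```python
from typing import Protocol

class DomainAdapter(Protocol):
    """The shallow, replaceable taxonomic layer."""
    entity_types: list[str]
    relationship_types: list[str]
    def scan_patterns(self) -> list[str]: ...

class DevOpsAdapter:
    """One possible adapter: its categories are explicit, inspectable choices."""
    entity_types = ["service", "deployment", "incident"]
    relationship_types = ["depends_on", "caused_by"]
    def scan_patterns(self) -> list[str]:
        return ["error-rate spike", "failed rollout", "dependency drift"]

# Swapping the adapter swaps the agent's perceptual categories;
# the six modules (the deep layer) are untouched by the swap.
adapter: DomainAdapter = DevOpsAdapter()
print(adapter.scan_patterns())
```

Forcing every deployment through this explicit declaration is what keeps the shallow layer visible as a decision; nothing analogous guards the deep layer.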
What This Case Uniquely Reveals
SAGEN demonstrates that taxonomic commitment determines the behavioral ceiling of a cognitive architecture. Of 20 information dimensions evaluated, 16 were captured by none of the flat-memory baselines. These require explicit architectural support — no improvement in the underlying LLM produces them.
Uniquely, SAGEN demonstrates layered taxonomic commitment — the six modules are a deep commitment (hard to revise), while adapter content is a shallow commitment (designed to be revised per domain). The DSM conflates both layers into one monolithic artifact, which is part of why revision is so costly.
Cross-Case Analysis — The Structural Pattern
| Dimension | DSM | LLM-QP Plan Lattice | SAGEN Modules |
|---|---|---|---|
| Defines representable space | Diagnosable conditions | Selectable execution strategies | Performable cognitive operations |
| Constrains optimization | Treatment limited to recognized categories | Cost selection limited to enumerated plans | Agent reasoning limited to represented distinctions |
| Transparent when working | Clinicians see through categories | Planner selects transparently | Agent reasons through modules |
| Resists revision | Institutional switching costs | Compiler/kernel dependencies | Architectural assumptions |
| Anti-reification mechanism | None | Formal plan equivalence proofs | Domain adapter pattern |
| Unit of revision | Entire manual (15–20 yr cycle) | Individual plan (add new kernel + pass) | Adapter (shallow) or module (deep) |
Key Differences
Looping. DSM categories loop — they change the people they classify. LLM-QP’s lattice does not loop. SAGEN occupies a middle position — its categories shape agent perception but the agent doesn’t reflexively modify its own categories.
Switching costs. DSM: societal (legal, financial, institutional, identity). LLM-QP: technical (compiler passes, kernels). SAGEN: architectural (module redesign, adapter revision). The mechanism differs; the structural effect is the same.
The Hardness Spectrum
| Level | System | Looping | Switching Costs | Reification Risk |
|---|---|---|---|---|
| Soft | LLM-QP | None | Technical and bounded | Low |
| Medium | SAGEN | None (Hacking sense) | Architectural | Moderate |
| Hard | DSM | Yes — institutional scale | Societal | Maximal (largely realized) |
The spectrum is not a ranking of quality. Hard infrastructure is not worse than soft; it is harder to change. The design implication: systems should be built as soft as the domain permits.
Testing the Three Predictions
- Constitutiveness. Confirmed in all three cases.
- Lock-in. Confirmed with varying intensity — maximal for DSM, minimal for LLM-QP.
- Invisibility through transparency. Confirmed with a gradient — SAGEN’s adapter pattern deliberately forces visibility.
The Infrastructure Paradox
The paradox. Classification works best when invisible but becomes most dangerous when invisible. The four-phase cycle:
- Design. Categories chosen as practical tools — provisional, purpose-specific, explicitly acknowledged as decisions.
- Installation. Categories embedded in practice. Learned, linked to conventions, connected to standards.
- Transparency. Categories become invisible. Users look through them. They cease to feel like choices.
- Reification. Categories treated as discoveries rather than decisions. Provisional conventions hardened into natural kinds.
The cycle operates at different speeds across the hardness spectrum. Computational taxonomies (LLM-QP) cycle fast and reify weakly. Cognitive architectures (SAGEN) cycle at medium speed. Institutional taxonomies (DSM) cycle slowly and reify completely.
The structural insight. The cycle is not a failure of vigilance. It is a consequence of infrastructure’s defining property: to function, it must be transparent; to be transparent, it must become invisible; to become invisible, it must cease to feel like a choice. The only defense is architectural — mechanisms that structurally resist transparency’s slide into reification.
Implications for System Design
Taxonomic Commitment as First-Class Design Decision
The categories a system operates with should be documented, evaluated, and revisited with the same rigor applied to algorithmic choices: explicit enumeration of what the taxonomy can and cannot represent, versioned category definitions, and periodic review.
Anti-Reification Mechanisms
- Modular adapter patterns (SAGEN) — encapsulate domain-specific categories in replaceable modules
- Formal equivalence proofs (LLM-QP) — demonstrate multiple plans produce identical outputs
- Explicit confidence metadata — annotate categories with epistemic status (provisional, validated, contested, convenience-only)
- Versioned categories with changelogs and sunset dates on provisional categories
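The last two mechanisms can be combined in a single record: every category carries a version, an epistemic status, and, when provisional, a sunset date. The schema below is an illustrative sketch, not a standard; field names are invented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Status(Enum):          # epistemic status, per the bullet above
    PROVISIONAL = "provisional"
    VALIDATED = "validated"
    CONTESTED = "contested"
    CONVENIENCE = "convenience-only"

@dataclass
class Category:
    name: str
    version: str
    status: Status
    sunset: Optional[str] = None  # review-by date for provisional categories
    rationale: str = ""           # why this boundary was drawn

topic_pivot = Category(
    name="topic_pivot",
    version="1.2.0",
    status=Status.PROVISIONAL,
    sunset="2026-06-30",
    rationale="Distinguishes pivot from abandonment; boundary still contested.",
)
print(topic_pivot.status.value)  # "provisional"
```

The point is not the schema but the discipline: a category without a version and a status is a category drifting toward reification.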
Layered Taxonomic Architecture
Systems should distinguish between deep categories that define fundamental representational capacity (hard to change, chosen with commensurate care) and shallow categories that specialize for a domain (designed to be replaceable). The DSM’s monolithic structure — where deep ontological commitments are fused with shallow clinical content — is an anti-pattern that maximizes revision cost at every level.
Taxonomic Debt
By analogy with technical debt: taxonomic debt accumulates when classification decisions are made expediently and left unexamined as the system scales. Symptoms include categories that no longer match operational reality, distinctions that practitioners routinely work around, and switching costs that grow faster than the system’s value proposition. Like technical debt, the remedy is not to avoid classification decisions but to make them deliberately, document them, and budget for revision.
Taxonomic Evaluation Criteria
- Coverage. What can the taxonomy represent?
- Blind spots. What can it not represent, and what are the consequences?
- Switching costs. How embedded is it? What would revision disrupt?
- Reification risk. How likely are provisional categories to be mistaken for natural kinds?
- Hardness. Where on the soft–medium–hard spectrum, and is that the right position?
Conclusion — The Categories Are Not Scaffolding
In any system where an agent must act under uncertainty, the classification layer functions as infrastructure. It is the most consequential and least examined design decision. It determines the space of possibility. Optimization improves performance within that space. Only taxonomic revision can expand it.
Three case studies demonstrate this across domains sharing almost nothing except the structural pattern: a taxonomy that defines the representable space, becomes invisible when working, and resists revision once installed. The three predictions — constitutiveness, lock-in, invisibility through transparency — hold in all three cases, modulated by a hardness spectrum that tracks the domain’s coupling to human institutions, identity, and reflexive awareness.
The practical upshot is not that better taxonomies will solve hard problems. It is that failing to recognize taxonomies as taxonomies — failing to see them as design decisions with structural consequences, treating them as neutral descriptions rather than constitutive commitments — produces systems trapped inside spaces they cannot see the edges of.
The categories are not scaffolding. They are load-bearing walls. The first step toward building better systems is seeing the walls for what they are.