FULL PAPER

SAGEN: Situational Awareness for Generative Agents

A Modular Cognitive Architecture for Persistent State in Language Model Agent Systems

Jake Lawrence  ·  Independent Researcher  ·  March 2026

ABSTRACT

Large language model (LLM) agents operate under a fundamental constraint: each inference call is stateless. Existing approaches to agent memory treat persistence as a storage problem. We reframe it as a cognitive architecture problem. We introduce SAGEN (Situational Awareness for Generative Agents), a modular framework that maintains structured, evolving awareness state across agent interactions through six interoperating cognitive modules: Goal Graph, Trajectory, World Model, Self Model, Attention Priorities, and Interaction Protocol. SAGEN operates via an Observe–Update–Inject loop coordinated by a domain-agnostic engine and domain-specific adapters. We present the architecture, formal specification, a reference implementation, and a proof-of-concept evaluation in the conversational domain. We release SAGEN as open-source software.

SECTION 1

Introduction

The deployment of large language models as interactive agents has exposed a structural gap between model capability and operational coherence. A model that can reason about complex plans within a single context window routinely loses the thread of those same plans across sequential interactions. This is not a failure of intelligence. It is a failure of architecture.

The dominant paradigm treats agent memory as information retrieval: store facts, embed them, retrieve relevant chunks at query time. RAG and its derivatives address the question “What does the agent know?” They do not address: What is the agent trying to do? What just changed? What should it pay attention to? What has it tried before?

These are questions of situational awareness — a concept well-studied in human factors research but underexplored in LLM agents.

The core design principles are:

  1. Modularity. Six cognitive modules that can be composed, extended, or replaced independently.
  2. Domain agnosticism. A plugin-based adapter pattern that separates domain-specific perception from domain-general state management.
  3. Compression over accumulation. Inspired by ACT-R’s memory decay, prioritize recent, salient, and failure-associated state.
  4. Injection-native. State rendered as compact, structured text designed for LLM context windows.
SECTION 2

Architecture

SAGEN’s awareness state is organized as a blackboard: a shared data structure that multiple cognitive modules read from and write to. The unified state object, AwarenessState, contains six modules and two coordination fields.

[Figure: SAGEN architecture diagram]

SAGEN architecture. Six cognitive modules reside on a shared blackboard. The Update Engine coordinates the Observe–Update–Inject loop, using a Domain Adapter to translate between raw observations and structured state updates.

Module 1: Goal Graph

The Goal Graph maintains a directed acyclic graph of agent objectives. Each goal has a lifecycle (active, completed, abandoned, blocked, deferred) and a provenance (explicit, inferred, emergent, or spawned).

Field                 Type           Description
id                    string         Unique 8-character identifier
description           string         Natural language goal statement
status                GoalStatus     Lifecycle state (5 values)
source                GoalSource     Provenance (4 values)
priority              float [0,1]    Urgency/importance score
parent_id             string?        Parent goal for hierarchy
depends_on            list[string]   Blocking dependency IDs
completion_criteria   string?        Verifiable exit condition
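To make the schema concrete, here is one way the goal record might be expressed as a Python dataclass. This is an illustrative sketch, not the reference implementation; the enum values follow the lifecycle and provenance lists above, and all other names are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class GoalStatus(Enum):
    ACTIVE = "active"
    COMPLETED = "completed"
    ABANDONED = "abandoned"
    BLOCKED = "blocked"
    DEFERRED = "deferred"

class GoalSource(Enum):
    EXPLICIT = "explicit"
    INFERRED = "inferred"
    EMERGENT = "emergent"
    SPAWNED = "spawned"

@dataclass
class Goal:
    id: str                                  # unique 8-character identifier
    description: str                         # natural-language goal statement
    status: GoalStatus = GoalStatus.ACTIVE   # lifecycle state
    source: GoalSource = GoalSource.EXPLICIT # provenance
    priority: float = 0.5                    # urgency/importance in [0, 1]
    parent_id: Optional[str] = None          # parent goal for hierarchy
    depends_on: list = field(default_factory=list)  # blocking dependency IDs
    completion_criteria: Optional[str] = None       # verifiable exit condition
```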

Module 2: Trajectory

Records state transitions as a temporal sequence — the agent’s episodic memory. Each transition is typed: progress, reversal, pivot, discovery, external event, failure, or branch. The module implements ACT-R-inspired compression: recent transitions are detailed, routine progress fades, and failures/reversals/discoveries are sticky.
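The compression policy can be sketched as a retention score: an exponential recency decay with a high floor for sticky transition types. The decay rate, floor, and threshold below are illustrative assumptions, not values from the reference implementation.

```python
import math

# Transition types the architecture treats as "sticky" (decay-resistant).
STICKY = {"failure", "reversal", "discovery"}

def retention_score(transition_type: str, age_steps: int,
                    decay: float = 0.5, sticky_floor: float = 0.8) -> float:
    """ACT-R-inspired retention: recency decays exponentially with age,
    but sticky transitions never fall below a high floor."""
    recency = math.exp(-decay * age_steps)
    if transition_type in STICKY:
        return max(recency, sticky_floor)
    return recency

def compress(transitions, keep_threshold: float = 0.3):
    """Drop routine transitions whose retention falls below the threshold.
    Each transition is a (type, age_in_steps, summary) tuple."""
    return [t for t in transitions
            if retention_score(t[0], t[1]) >= keep_threshold]
```

Under this policy a ten-step-old routine progress entry is pruned, while an equally old failure survives.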

Module 3: World Model

An entity-relationship graph of the agent’s environment. Entities have types, mutable state, and affordances. The World Model also tracks assumptions (unverified beliefs) and unknowns (identified but unanswered questions) — giving the agent explicit access to its own epistemic boundaries.

Module 4: Self Model

The agent’s understanding of its own capabilities, limitations, resources, authority boundaries, and failure history. The can_i(action) method provides a pre-flight check across three dimensions: capability, authorization, and resourcing.
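A minimal sketch of such a pre-flight check, assuming set-valued capabilities and authorizations and a simple resource ledger; all field names here are illustrative, not the reference API.

```python
from dataclasses import dataclass, field

@dataclass
class SelfModel:
    capabilities: set = field(default_factory=set)   # actions the agent can do
    authorized: set = field(default_factory=set)     # actions it may do
    resources: dict = field(default_factory=dict)    # resource -> units left
    costs: dict = field(default_factory=dict)        # action -> {resource: units}

    def can_i(self, action: str) -> tuple:
        """Pre-flight check across capability, authorization, and resourcing.
        Returns (ok, reasons), where reasons names each failing dimension."""
        reasons = []
        if action not in self.capabilities:
            reasons.append("capability")
        if action not in self.authorized:
            reasons.append("authorization")
        for resource, needed in self.costs.get(action, {}).items():
            if self.resources.get(resource, 0) < needed:
                reasons.append(f"resource:{resource}")
        return (not reasons, reasons)
```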

Module 5: Attention Priorities

A priority queue of threats, opportunities, anomalies, and transitions. Each item has an urgency score and optional TTL. The module also stores persistent scan patterns — watchlist templates defined by the domain adapter.
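One plausible realization is a max-heap keyed on urgency, with TTL expressed in steps. The class below is an illustrative sketch, not the reference implementation.

```python
import heapq

class AttentionQueue:
    """Max-priority queue of attention items with optional TTL in steps."""

    def __init__(self):
        self._heap = []      # entries: (-urgency, insertion order, item dict)
        self._counter = 0    # insertion order breaks urgency ties

    def push(self, kind: str, note: str, urgency: float,
             ttl: int = None, step: int = 0):
        """Add a threat/opportunity/anomaly/transition item."""
        item = {"kind": kind, "note": note, "urgency": urgency,
                "expires": None if ttl is None else step + ttl}
        heapq.heappush(self._heap, (-urgency, self._counter, item))
        self._counter += 1

    def expire(self, step: int):
        """Drop items whose TTL has elapsed by the given step."""
        self._heap = [e for e in self._heap
                      if e[2]["expires"] is None or e[2]["expires"] > step]
        heapq.heapify(self._heap)

    def top(self):
        """Most urgent live item, or None if the queue is empty."""
        return self._heap[0][2] if self._heap else None
```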

Module 6: Interaction Protocol

The operational contract: communication style, output format, collaboration mode (autonomous, supervised, collaborative, advisory), escalation rules, and hard constraints.

Unified State and the Tick

All six modules reside on the AwarenessState blackboard alongside a global step counter and timestamp. Formally, the awareness state at step t is:

S_t = ⟨G_t, T_t, W_t, M_t, A_t, P_t, t⟩

where G is the Goal Graph, T the Trajectory, W the World Model, M the Self Model, A the Attention Priorities, and P the Interaction Protocol.
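The tuple maps naturally onto a single container object. Below is a minimal sketch of such a blackboard in Python; the field names and placeholder module types are illustrative assumptions, not the reference implementation's API.

```python
import time
from dataclasses import dataclass, field

@dataclass
class AwarenessState:
    """Blackboard of the six modules plus two coordination fields.
    Module types are placeholder containers for illustration."""
    goal_graph: dict = field(default_factory=dict)    # G_t
    trajectory: list = field(default_factory=list)    # T_t
    world_model: dict = field(default_factory=dict)   # W_t
    self_model: dict = field(default_factory=dict)    # M_t
    attention: list = field(default_factory=list)     # A_t
    protocol: dict = field(default_factory=dict)      # P_t
    step: int = 0                                     # global tick t
    timestamp: float = field(default_factory=time.time)

    def tick(self) -> None:
        """Advance the global step counter and refresh the timestamp."""
        self.step += 1
        self.timestamp = time.time()
```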

SECTION 3

The Update Engine

The Update Engine coordinates the three-phase Observe–Update–Inject (OUI) loop that transforms raw observations into structured state and then into LLM-consumable context.

Phase 1: Observe

The adapter’s parse_observation method receives raw input and the current state, returning a structured dictionary of entities, goals, attention items, assumptions, unknowns, and trajectory events. This phase is entirely domain-specific.

Phase 2: Update

The engine applies parsed observations using upsert semantics for entities and append semantics for goals, attention items, and other list-valued fields. Expired attention items are removed based on TTL. This phase is domain-agnostic.
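The update semantics can be sketched as a single function over plain dictionaries (real module types omitted). The structure of `parsed` mirrors the Observe phase output described above, but the exact field names are assumptions.

```python
def apply_update(state: dict, parsed: dict, step: int) -> dict:
    """Domain-agnostic Update phase: upsert entities, append goals and
    attention items, then expire attention items whose TTL has elapsed."""
    # Upsert: merge entity fields by id, creating the entity if new.
    for eid, fields in parsed.get("entities", {}).items():
        state.setdefault("entities", {}).setdefault(eid, {}).update(fields)
    # Append: goals and attention items accumulate.
    state.setdefault("goals", []).extend(parsed.get("goals", []))
    state.setdefault("attention", []).extend(parsed.get("attention", []))
    # Expire: drop attention items past their TTL.
    state["attention"] = [a for a in state["attention"]
                          if a.get("expires") is None or a["expires"] > step]
    return state
```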

Phase 3: Inject

The adapter renders the current state as a compact text block sized to a token budget. The injection string is structured but natural-language-readable, wrapped in <sagen> tags.

<sagen>
ACTIVE GOALS:
  [explicit] Learn Python (p=0.7)
  [explicit] Build a web scraper (p=0.7)
  [inferred] Answer: What library for scraping? (p=0.6)

ATTENTION:
  [opportunity] Callback to earlier topic
  [transition] Topic shift: {'cooking'} -> {'Python'}

ACTIVE TOPICS: Python, web scraping

TRAJECTORY:
  [progress] Continuing: {'Python', 'web scraping'}
  [pivot] Pivoted from {'cooking'} to {'Python'}
</sagen>
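A renderer for this format might walk sections in priority order and stop when a rough token budget is exhausted. The sketch below uses a crude four-characters-per-token estimate and covers only two sections; it is illustrative, not the reference implementation.

```python
def format_for_injection(state: dict, token_budget: int = 200) -> str:
    """Render state sections in priority order inside <sagen> tags,
    stopping once a rough character-based token budget is spent."""
    lines = ["<sagen>"]
    sections = [
        ("ACTIVE GOALS", [f"  [{g['source']}] {g['description']} (p={g['priority']})"
                          for g in state.get("goals", [])]),
        ("ATTENTION", [f"  [{a['kind']}] {a['note']}"
                       for a in state.get("attention", [])]),
    ]
    budget_chars = token_budget * 4  # crude ~4 chars/token estimate
    used = 0
    for title, rows in sections:
        block = "\n".join([f"{title}:"] + rows)
        if used + len(block) > budget_chars:
            break  # budget exhausted; lower-priority sections are dropped
        lines.append(block)
        used += len(block)
    lines.append("</sagen>")
    return "\n".join(lines)
```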
SECTION 4

Domain Adapters

Each adapter implements seven methods that encapsulate all domain-specific logic:

Method                        Responsibility
domain_name                   Returns the domain identifier string
define_entity_types()         Declares entity types the domain recognizes
define_relationship_types()   Declares valid relationship types
define_scan_patterns()        Provides persistent attention scan templates
parse_observation()           Transforms raw input into structured updates
format_for_injection()        Renders state as LLM-consumable text
define_protocol_defaults()    Sets default interaction protocol values
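The adapter contract can be captured as an abstract base class. The sketch below, including the toy conversation subclass, is illustrative; beyond the method names listed above, all signatures and return values are assumptions.

```python
from abc import ABC, abstractmethod

class DomainAdapter(ABC):
    """All domain-specific logic lives behind this interface; the
    Update Engine and state schema stay domain-agnostic."""

    @property
    @abstractmethod
    def domain_name(self) -> str:
        """Domain identifier string."""

    @abstractmethod
    def define_entity_types(self) -> list:
        """Entity types the domain recognizes."""

    @abstractmethod
    def define_relationship_types(self) -> list:
        """Valid relationship types."""

    @abstractmethod
    def define_scan_patterns(self) -> list:
        """Persistent attention scan templates."""

    @abstractmethod
    def define_protocol_defaults(self) -> dict:
        """Default interaction protocol values."""

    @abstractmethod
    def parse_observation(self, raw, state) -> dict:
        """Transform raw input into structured updates (Observe phase)."""

    @abstractmethod
    def format_for_injection(self, state, token_budget: int = 512) -> str:
        """Render state as LLM-consumable text (Inject phase)."""

class ConversationAdapterSketch(DomainAdapter):
    """Toy subclass showing the contract satisfied end to end."""
    @property
    def domain_name(self) -> str:
        return "conversation"
    def define_entity_types(self) -> list:
        return ["person", "topic", "concept", "reference", "emotion", "preference"]
    def define_relationship_types(self) -> list:
        return ["mentions", "prefers"]   # placeholder relationship types
    def define_scan_patterns(self) -> list:
        return ["topic_shift", "emotional_escalation", "callback", "implicit_goal"]
    def define_protocol_defaults(self) -> dict:
        return {"collaboration_mode": "collaborative"}
    def parse_observation(self, raw, state) -> dict:
        return {"entities": {}, "goals": [], "attention": []}
    def format_for_injection(self, state, token_budget: int = 512) -> str:
        return "<sagen>\nACTIVE GOALS:\n</sagen>"
```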

Cross-Domain Comparison

Dimension             Conversation Adapter                                          Coding Adapter
Entity types          6 (person, topic, concept, reference, emotion, preference)    8 (file, library, error, endpoint, service, concept, function, config)
Scan patterns         Topic shift, emotional escalation, callback, implicit goal    Error recurrence, scope creep, dependency conflict, solution regression
Primary threats       Frustration, confusion                                        Blocking errors, regressions
Trajectory emphasis   Pivots, callbacks                                             Failures, discoveries, branches

The two adapters share only one entity type (“concept”) and no scan patterns, and each produces a domain-appropriate injection format — yet both run on the identical engine and schema without modification.

SECTION 5

Evaluation

State Evolution (Conversation Domain)

Step   Event           Goals   Entities   Attn               Trajectory
1      Goal creation   3       2          0                  1 (progress)
2      Topic pivot     5       4          1 (transition)     2 (+pivot)
3      Callback        6       4          2 (+opportunity)   3 (+progress)
4      Frustration     7       5          3 (+threat)        4 (+progress)

Baseline Comparison

Metric                        Raw Buffer   Rolling Summary   SAGEN
Approx. tokens                92           43                176
Coverage score (/20)          2.1          4.1               19.7
Coverage (%)                  10.5         20.5              98.5
Info density (cov/100 tok)    2.28         9.59              11.23
Fully captured                1            4                 20
Not captured                  16           13                0

SAGEN captures 98.5% of evaluated information dimensions vs. 20.5% for rolling summaries and 10.5% for raw buffers. The key finding is not SAGEN’s absolute score but the structural gap: 16 of 20 dimensions are captured by none of the baselines. These are capabilities — goal hierarchy, typed transitions, epistemic boundary tracking, urgency scoring — that require explicit architectural support.

Qualitative Response Comparison

Turn 3 (Callback): Under rolling summary context, the LLM answers the factual question competently but treats it as standalone. Under SAGEN injection, it recognizes the callback, connects it to the original goal, and calibrates its response to the user’s stated skill level.

Turn 4 (Frustration): The summary-context response jumps directly to troubleshooting. The SAGEN-context response first acknowledges the user’s emotional state, calibrates its tone, and adopts a patient, step-by-step approach.

Closed-Loop Evaluation

LLM-generated perception achieves 0.94 alignment with hand-crafted analysis across all four turns. All key cognitive dynamics (pivot detection, callback recognition, frustration monitoring) are preserved regardless of whether analysis is hand-crafted or LLM-generated.

State Serialization

Split-session experiment: Session 1 processes turns 1–2, state is serialized to JSON (4.7 KB), and Session 2 processes turns 3–4 from the restored state. The restored engine produces state and injection output identical to those of continuous execution. The blackboard is the state, and the state is the JSON.
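Because the blackboard is plain data, the round trip reduces to JSON serialization of the state object. A minimal sketch with a stand-in state class (the real blackboard carries all six modules):

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class MiniState:
    """Stand-in for AwarenessState; illustrative only."""
    goals: list = field(default_factory=list)
    step: int = 0

def save(state: MiniState) -> str:
    """Serialize the blackboard to a JSON string."""
    return json.dumps(asdict(state))

def restore(blob: str) -> MiniState:
    """Rebuild an equivalent blackboard from the JSON string."""
    return MiniState(**json.loads(blob))
```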

SECTION 6

Discussion

Classification as Infrastructure

SAGEN’s six-module decomposition is not a neutral description of cognition. It is a taxonomic commitment that determines what the agent can represent and therefore what it can reason about. An agent without a Trajectory module cannot distinguish a topic pivot from a topic abandonment. An agent without typed Attention cannot allocate urgency differentially. The taxonomy is not scaffolding; it is load-bearing structure.

Limitations

  1. No learning. SAGEN maintains state but does not learn from it in the ML sense.
  2. Perception cost. Each observation requires an additional inference call.
  3. Single-domain assumption. One adapter at a time; multi-domain composition is not yet specified.
  4. Limited evaluation scope. Single 4-turn scenario; systematic end-to-end evaluation planned.