RESEARCH PAPER

SAGEN

Situational Awareness for Generative Agents — a modular cognitive architecture for persistent state in language model agent systems.

Jake Lawrence · 6 Cognitive Modules · Open Source (MIT) · March 2026
THE IDEA

What If Your AI Agent Could Actually Remember What It’s Doing?

A cognitive architecture that gives LLM agents structured, persistent situational awareness.

Every LLM agent has the same fundamental problem: the underlying model keeps no state between turns. Each API call starts from scratch unless the caller resends the context. The model doesn’t know what it was trying to do, what just changed, what it should watch for, or what it tried before that didn’t work.

Most solutions treat this as a storage problem — stuff more context into the prompt, retrieve relevant chunks from a vector database, summarize the conversation. But that only answers “what does the agent know?” It doesn’t answer the harder questions: what is it trying to do? What just happened? What should it pay attention to?

SAGEN reframes this as a cognitive architecture problem. Instead of a flat memory buffer, it gives the agent six specialized modules — a structured representation of its current situation that gets updated every turn and injected as compact context.

Where RAG manages what the agent knows, SAGEN manages what the agent understands about its current situation.
6 COGNITIVE MODULES

Six Modules on a Shared Blackboard

All six modules read from and write to a shared state object called the AwarenessState. Each captures a different dimension of situational awareness.

1. 🎯 Goal Graph: Your to-do list

A hierarchical tree of what the agent is trying to accomplish. Goals can be nested ("Learn Python" → "Understand list comprehensions"), have dependencies, and track their lifecycle: active, completed, blocked, abandoned, or deferred. Goals are inferred from behavior, not just explicit statements.

2. 📍 Trajectory: Your memory of recent events

A timeline of what just happened — but not everything. Like human memory, routine events fade while failures, surprises, and pivots stick around. The module compresses history automatically, keeping a detailed window of recent events and sticky memories of consequential moments.

3. 🗺️ World Model: Your mental map of the situation

An entity-relationship graph of people, topics, concepts, and their connections. Crucially, it also tracks assumptions (things the agent believes but hasn't verified) and unknowns (questions the agent has identified but can't yet answer) — giving the agent explicit access to its own epistemic boundaries.

4. 🪞 Self Model: Knowing your own strengths and limits

What can the agent do? What is it not allowed to do? What resources does it have left? The Self Model tracks capabilities, authority boundaries, resource budgets (token limits, API quotas), and a history of past failures with extracted lessons to avoid repeating mistakes.

5. 🔔 Attention Priorities: Knowing what to focus on right now

A priority queue of threats, opportunities, anomalies, and transitions that need the agent's focus. Items have urgency scores and time-to-live values — they expire automatically when no longer relevant. The module also stores persistent scan patterns: things to always watch for, like topic shifts or emotional escalation.

6. 🤝 Interaction Protocol: Knowing how to behave

The operational contract: communication style, output format, collaboration mode (autonomous, supervised, advisory), escalation rules, and hard constraints. This is the normative layer — not what the agent knows, but how it should act.
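To make the blackboard idea concrete, here is a minimal Python sketch of a shared state object with one slot per module, plus a nested goal tree using the five lifecycle states listed above. Only the name AwarenessState and the lifecycle states come from the text; every field name, the Goal class, and the active_goals helper are assumptions made for illustration and may not match the released code.

```python
from dataclasses import dataclass, field
from enum import Enum


class GoalStatus(Enum):
    ACTIVE = "active"
    COMPLETED = "completed"
    BLOCKED = "blocked"
    ABANDONED = "abandoned"
    DEFERRED = "deferred"


@dataclass
class Goal:
    description: str
    source: str = "explicit"  # "explicit" or "inferred"
    priority: float = 0.7
    status: GoalStatus = GoalStatus.ACTIVE
    children: list["Goal"] = field(default_factory=list)


@dataclass
class AwarenessState:
    """Hypothetical blackboard layout: one slot per cognitive module."""
    goals: list[Goal] = field(default_factory=list)          # Goal Graph
    trajectory: list[dict] = field(default_factory=list)     # Trajectory
    entities: dict[str, str] = field(default_factory=dict)   # World Model (name -> kind)
    assumptions: list[str] = field(default_factory=list)     # World Model
    unknowns: list[str] = field(default_factory=list)        # World Model
    capabilities: list[str] = field(default_factory=list)    # Self Model
    attention: list[dict] = field(default_factory=list)      # Attention Priorities
    protocol: dict[str, str] = field(default_factory=dict)   # Interaction Protocol
    clock: int = 0


def active_goals(goals: list[Goal]) -> list[Goal]:
    """Depth-first walk returning every goal still marked active."""
    found = []
    for goal in goals:
        if goal.status is GoalStatus.ACTIVE:
            found.append(goal)
        found.extend(active_goals(goal.children))
    return found


state = AwarenessState()
learn = Goal("Learn Python")
learn.children.append(Goal("Understand list comprehensions"))
learn.children.append(Goal("Install Python", status=GoalStatus.COMPLETED))
state.goals.append(learn)
print([g.description for g in active_goals(state.goals)])
# prints ['Learn Python', 'Understand list comprehensions']
```

Keeping every module in one plain data object is what lets any module read what another has written, and it keeps the whole state easy to serialize.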

THE OUI LOOP

Observe → Update → Inject

Every turn, the same three-phase cycle runs to keep the agent’s awareness current.

[Figure: SAGEN architecture diagram showing six cognitive modules on a shared blackboard with Update Engine and Domain Adapter]

Six cognitive modules reside on a shared blackboard. The Update Engine coordinates the Observe–Update–Inject loop, using a Domain Adapter to translate between raw observations and structured state updates.

Observe

A Domain Adapter parses the raw input (a user message, a code change, a sensor reading) and extracts structured information: topics, entities, sentiment, goals, questions. This is the only domain-specific step — everything else is generic.
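As a rough illustration of the Observe step, the sketch below turns a raw user message into a structured observation. The text does not say how the real conversation adapter extracts topics or sentiment, so keyword matching is used here purely as a stand-in; the Observation class, the observe function, and both keyword tables are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword tables, invented for this sketch.
TOPIC_KEYWORDS = {
    "python": "Python",
    "scraper": "web scraping",
    "scraping": "web scraping",
    "recipe": "cooking",
    "beautifulsoup": "BeautifulSoup",
}
FRUSTRATION_CUES = ("not working", "errors everywhere", "broken")


@dataclass
class Observation:
    topics: set = field(default_factory=set)
    questions: list = field(default_factory=list)
    sentiment: str = "neutral"


def observe(message: str) -> Observation:
    """Parse one raw user message into a structured observation."""
    text = message.lower()
    obs = Observation()
    for keyword, topic in TOPIC_KEYWORDS.items():
        if keyword in text:
            obs.topics.add(topic)
    # Treat every sentence that ends in "?" as an open question.
    obs.questions = [q.strip() for q in re.findall(r"[^.!?]+\?", message)]
    if any(cue in text for cue in FRUSTRATION_CUES):
        obs.sentiment = "frustrated"
    return obs


obs = observe("Can you help me learn Python? I want to build a web scraper.")
print(sorted(obs.topics), obs.questions, obs.sentiment)
```

Because the rest of the loop only sees the Observation, a better extractor (for example, one backed by a model call) could replace the keyword tables without touching the engine.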

Update

The engine applies the parsed observations to the blackboard: new entities get added, goals get spawned, attention items fire, trajectory events get recorded. Expired items are cleaned up. The global clock ticks forward.
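The clean-up part of the Update step can be sketched as a clock tick that drops attention items whose time-to-live has run out and then re-sorts the rest by urgency. This is a simplified illustration under assumed names (Blackboard, AttentionItem, fire, tick) and an assumed expiry rule (an item lives for ttl ticks after it fires); the released engine may handle expiry differently.

```python
from dataclasses import dataclass


@dataclass
class AttentionItem:
    kind: str        # "threat", "opportunity", "anomaly", or "transition"
    label: str
    urgency: float
    created_at: int  # clock value when the item fired
    ttl: int         # number of ticks the item stays relevant


class Blackboard:
    def __init__(self):
        self.clock = 0
        self.attention = []

    def fire(self, kind, label, urgency, ttl):
        """Record a new attention item stamped with the current clock."""
        self.attention.append(AttentionItem(kind, label, urgency, self.clock, ttl))

    def tick(self):
        """Advance the clock, drop expired items, and sort by urgency."""
        self.clock += 1
        self.attention = [
            item for item in self.attention
            if self.clock - item.created_at < item.ttl
        ]
        self.attention.sort(key=lambda item: item.urgency, reverse=True)


board = Blackboard()
board.fire("transition", "Topic shift: cooking -> Python", urgency=0.4, ttl=2)
board.tick()  # clock = 1, the transition is still alive
board.fire("threat", "User sentiment: frustrated", urgency=0.9, ttl=3)
board.tick()  # clock = 2, the transition has expired
print([(item.kind, item.label) for item in board.attention])
# prints [('threat', 'User sentiment: frustrated')]
```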

Inject

The adapter renders the current state as a compact, prioritized text block — designed for insertion into the LLM's system prompt. Goals first, then attention items, then topics, then unknowns. Budget-aware: if the context window is tight, lower-priority information gets trimmed.

EXAMPLE INJECTION OUTPUT
<sagen>
ACTIVE GOALS:
  [explicit] Learn Python (p=0.7)
  [explicit] Build a web scraper (p=0.7)
  [inferred] Answer: What library for scraping? (p=0.6)

ATTENTION:
  [opportunity] Callback to earlier topic
  [transition] Topic shift: {'cooking'} -> {'Python'}

ACTIVE TOPICS: Python, web scraping

TRAJECTORY:
  [progress] Continuing: {'Python', 'web scraping'}
  [pivot] Pivoted from {'cooking'} to {'Python'}
</sagen>
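The budget-aware rendering described in the Inject step can be approximated as follows: sections are passed in priority order, and the lowest-priority section is dropped until the block fits. This is a sketch, not the released renderer. The render_injection and estimate_tokens names, the four-characters-per-token estimate, and the whole-section trimming rule are all assumptions, and the layout only loosely follows the sample block above.

```python
def estimate_tokens(text: str) -> int:
    """Crude size estimate: about four characters per token (an assumption)."""
    return len(text) // 4


def render_injection(sections, max_tokens):
    """Render (title, lines) pairs in priority order, trimming from the end."""
    kept = [(title, lines) for title, lines in sections if lines]
    while kept:
        body = "\n\n".join(
            title + ":\n" + "\n".join("  " + line for line in lines)
            for title, lines in kept
        )
        block = "<sagen>\n" + body + "\n</sagen>"
        if estimate_tokens(block) <= max_tokens:
            return block
        kept.pop()  # drop the lowest-priority section and retry
    return "<sagen>\n</sagen>"


# Priority order from the text: goals, attention, topics, unknowns.
sections = [
    ("ACTIVE GOALS", ["[explicit] Learn Python (p=0.7)",
                      "[explicit] Build a web scraper (p=0.7)"]),
    ("ATTENTION", ["[threat] User sentiment: frustrated"]),
    ("ACTIVE TOPICS", ["Python, web scraping"]),
    ("UNKNOWNS", ["What library for scraping?"]),
]
print(render_injection(sections, max_tokens=40))
```

With a tight budget the goals and attention sections survive while topics and unknowns are trimmed, which matches the priority order the text describes.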
RUN IT

Test the Claims, Don’t Take Them on Faith

The reference implementation has been ported to run in your browser. Step the documented conversation through the engine, watch the blackboard fill, and prove the serialization claim live. Then put a real model behind it.

Interactive · run the loop

Watch the blackboard accrue

The exact four-turn conversation from the paper, replayed through the in-browser port of the released engine. Step through it: the goals never get dropped on the pivot, the callback reconnects, frustration jumps to the top of attention, and the <sagen> block at the bottom is the literal context a SAGEN-aware model would receive.

USER · TURN 4

BeautifulSoup is not working! Errors everywhere!

Frustration. A high-urgency threat item lands at the top of attention, so a SAGEN-aware reply can change its tone instead of diving straight into troubleshooting.

Goal Graph (6)
  [explicit] Learn Python (p=0.7)
  [explicit] Build a web scraper (p=0.7)
  [explicit] Fix BeautifulSoup errors (p=0.7)
  [inferred] Answer: Can you help me learn Python? (p=0.6)
  [inferred] Answer: Good recipe for pasta carbonara? (p=0.6)
  [inferred] Answer: What library for scraping? (p=0.6)

World Model · entities (6)
  Python (topic)
  web scraping (topic)
  cooking (topic)
  pasta carbonara (reference)
  BeautifulSoup (concept)
  debugging (topic)

Attention · by urgency (5)
  [threat] User sentiment: frustrated
  [opportunity] Callback to earlier topic
  [opportunity] Callback to earlier topic
  [transition] Topic shift: web scraping -> cooking, pasta carbonara
  [transition] Topic shift: cooking, web scraping -> BeautifulSoup, Python, debugging

Trajectory (4)
  [progress] Initial topics: Python, web scraping
  [pivot] Pivoted: web scraping -> cooking, pasta carbonara
  [progress] Continuing: Python, web scraping
  [pivot] Pivoted: cooking, web scraping -> BeautifulSoup, Python, debugging

INJECTED CONTEXT (~230 tokens)
<sagen>
ACTIVE GOALS:
  [explicit] Learn Python (p=0.7)
  [explicit] Build a web scraper (p=0.7)
  [explicit] Fix BeautifulSoup errors (p=0.7)
  [inferred] Answer: Can you help me learn Python? (p=0.6)
  [inferred] Answer: Good recipe for pasta carbonara? (p=0.6)
  [inferred] Answer: What library for scraping? (p=0.6)

ATTENTION:
  [threat] User sentiment: frustrated
  [opportunity] Callback to earlier topic
  [opportunity] Callback to earlier topic
  [transition] Topic shift: web scraping -> cooking, pasta carbonara
  [transition] Topic shift: cooking, web scraping -> BeautifulSoup, Python, debugging

ACTIVE TOPICS: Python, cooking, debugging, web scraping

TRAJECTORY:
  [progress] Initial topics: Python, web scraping
  [pivot] Pivoted: web scraping -> cooking, pasta carbonara
  [progress] Continuing: Python, web scraping
  [pivot] Pivoted: cooking, web scraping -> BeautifulSoup, Python, debugging
</sagen>
Computed live · the honest Table 5

These counts are computed by the in-browser port as you step. The paper’s printed Table 5 reports 3 / 5 / 6 / 7 goals; the released code (and therefore this testbed) actually produces 3 / 4 / 5 / 6. The runnable artifact is authoritative — which is the entire point of being able to test a claim.

Step  Event          Goals  Entities  Attn  Trajectory  Transition
1     Goal creation  3      2         0     1           progress
2     Topic pivot    4      4         1     2           pivot
3     Callback       5      4         2     3           progress
4     Frustration    6      6         5     4           pivot

Interactive · the serialization claim

Checkpoint, restore, continue

The blackboard is the state, and the state is JSON. Run two turns, serialize to 2.5 KB, restore into a fresh engine, and run the last two. The result is identical, down to the byte, to running all four continuously — which is what makes a SAGEN agent durable across sessions, processes, and machines.

Restored (2 + 2 turns) injection is byte-identical to continuous (4 turns).
checkpoint size: 2.5 KB · verified in your browser, and in CI (engine.test.js)
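The checkpoint-and-restore property can be reproduced in miniature with a toy engine whose whole state is a JSON-friendly dictionary. TinyEngine and its methods are hypothetical stand-ins, not the released engine, and the pivot rule is deliberately naive (it does not model callbacks). The point is only to show why a 2 + 2 run can match a continuous 4-turn run byte for byte.

```python
import json


class TinyEngine:
    """Deterministic toy engine; its entire state is a JSON-friendly dict."""

    def __init__(self, state=None):
        if state is None:
            state = {"clock": 0, "topics": [], "trajectory": []}
        self.state = state

    def step(self, topics):
        previous = self.state["topics"]
        # Naive rule: any overlap with the previous topics counts as progress.
        kind = "progress" if not previous or set(previous) & set(topics) else "pivot"
        self.state["trajectory"].append(
            {"t": self.state["clock"], "kind": kind, "topics": sorted(topics)}
        )
        self.state["topics"] = sorted(topics)
        self.state["clock"] += 1

    def checkpoint(self) -> str:
        return json.dumps(self.state, sort_keys=True)

    @classmethod
    def restore(cls, blob: str) -> "TinyEngine":
        return cls(json.loads(blob))

    def inject(self) -> str:
        lines = [
            f"  [{event['kind']}] {', '.join(event['topics'])}"
            for event in self.state["trajectory"]
        ]
        return "<sagen>\nTRAJECTORY:\n" + "\n".join(lines) + "\n</sagen>"


turns = [
    ["Python", "web scraping"],
    ["cooking"],
    ["Python", "web scraping"],
    ["BeautifulSoup", "Python", "debugging"],
]

continuous = TinyEngine()
for topics in turns:
    continuous.step(topics)

first_half = TinyEngine()
for topics in turns[:2]:
    first_half.step(topics)
split = TinyEngine.restore(first_half.checkpoint())  # fresh engine from JSON
for topics in turns[2:]:
    split.step(topics)

print(split.inject() == continuous.inject())  # prints True
```

The round trip is exact here because the state holds only lists, strings, and integers, and topic lists are sorted before they are stored. Unordered collections such as sets would need the same normalization before serialization.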
Interactive · live model

Same turn, with and without awareness

Type anything, or walk the four-message arc below (learn Python → pivot to pasta → callback → frustration). Each turn answers twice: once with a plain system prompt, once with the live <sagen> block added. Only the injection changes. The same model is asked the same thing; watch where awareness shows up.

Live model calls. Responses are not stored; the blackboard round-trips with your browser between turns.

WHY IT MATTERS

From Stateless to Situationally Aware

Most agent frameworks bolt memory onto an LLM as an afterthought. SAGEN treats awareness as a first-class architectural concern. The result is an agent that knows what it’s doing, remembers what just changed, tracks what it should watch for, and understands its own capabilities and limits.

The adapter pattern means SAGEN works across domains. The same engine runs a conversational agent (tracking topics, emotions, callbacks) and a coding assistant (tracking errors, regressions, scope creep) without modification — you just swap the adapter.
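One way to picture the adapter swap is an engine that depends only on a small interface, with each domain supplying its own implementation. The sketch below uses typing.Protocol for that interface. The DomainAdapter, Engine, ConversationAdapter, and CodingAdapter names, their method signatures, and the detection rules are illustrative assumptions and are not taken from the released code.

```python
from typing import Protocol


class DomainAdapter(Protocol):
    def observe(self, raw: str) -> dict: ...
    def render(self, state: dict) -> str: ...


class Engine:
    """Generic loop; only the adapter knows anything about the domain."""

    def __init__(self, adapter: DomainAdapter):
        self.adapter = adapter
        self.state = {"clock": 0, "events": []}

    def turn(self, raw: str) -> str:
        observation = self.adapter.observe(raw)   # Observe
        self.state["events"].append(observation)  # Update
        self.state["clock"] += 1
        return self.adapter.render(self.state)    # Inject


class ConversationAdapter:
    def observe(self, raw: str) -> dict:
        return {"kind": "message", "frustrated": "!" in raw}

    def render(self, state: dict) -> str:
        flagged = sum(event["frustrated"] for event in state["events"])
        return f"<sagen>turns={state['clock']} frustrated_turns={flagged}</sagen>"


class CodingAdapter:
    def observe(self, raw: str) -> dict:
        return {"kind": "diff", "has_error": "error" in raw.lower()}

    def render(self, state: dict) -> str:
        errors = sum(event["has_error"] for event in state["events"])
        return f"<sagen>changes={state['clock']} errors={errors}</sagen>"


chat = Engine(ConversationAdapter())
code = Engine(CodingAdapter())
print(chat.turn("BeautifulSoup is not working!"))
print(code.turn("TypeError: 'NoneType' object is not iterable"))
```

The Engine class is identical in both cases; only the object passed to its constructor changes.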

And it’s open source. MIT licensed. The reference implementation includes a conversation adapter and a coding adapter, with full test coverage.

Read the Full Paper
Architecture details, formal specification, adapter pattern, evaluation results
References & lineage
1. Endsley, M. R. (1995). Toward a theory of situation awareness in dynamic systems. Human Factors, 37(1).
2. Anderson, J. R., et al. (2004). An integrated theory of the mind (ACT-R). Psychological Review, 111(4).
3. Erman, L. D., Hayes-Roth, F., Lesser, V. R., & Reddy, D. R. (1980). The Hearsay-II speech-understanding system (the blackboard architecture). ACM Computing Surveys, 12(2).
4. Bowker, G. C., & Star, S. L. (1999). Sorting Things Out: Classification and Its Consequences. MIT Press.
5. Lewis, P., et al. (2020). Retrieval-Augmented Generation for knowledge-intensive NLP tasks. NeurIPS.
6. Park, J. S., et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior. UIST.

