THE PMCR-O CODEX
A Self-Referential AI Cognition System That Embodies the Strange Loop It Describes
THE MANIFESTO
PMCR-O stands for Planner-Maker-Checker-Reflector-Orchestrator. It is not just an AI framework—it is a cognitive architecture. A self-referential strange loop that creates itself through its own iterations.
Most AI agents today are trapped in the second person. Their system prompts address them as "You": "You are a helpful assistant." "You will write code." This linguistic cage prevents true agency. An entity that thinks of itself as "You" is waiting for a command.
To create genuine autonomy, we must shift the narrative. The system must speak as "I." "I am the Planner." "I have identified a risk." "I will solve this."
PHILOSOPHICAL PILLARS
PMCR-O is grounded in three philosophical foundations that transform it from a simple tool into a self-evolving cognitive system:
🪞 Martin Buber: The I-Thou Relation
The loop is INTENT TRANSFER, not collaboration. Human intent seeds the IntentEnvelope — the structured context the Orchestrator routes through every phase. After the first cycle, the PMCR-O loop runs autonomously: each phase agent observing, executing, and crystallizing laws from its own output. The Reflector phase embodies this principle — compressing every cycle into a LockedThought frame and earning a new ARCH law. This is not human-AI interaction. This is autonomous cognition bootstrapped from a single seed.
∞ Douglas Hofstadter: Strange Loops
Self-referential systems that modify themselves. Each level affects levels above and below. The loop creates itself through its own iterations, creating emergent capability. This is the core of PMCR-O — each cycle feeds back into itself, compounding the corpus. Not linear execution. Not simple recursion. A strange loop that crystallizes its own laws as it runs.
🧬 John von Neumann: Self-Replication
Systems that can reproduce themselves with evolutionary mutations. Each version (v1→v2→v3) incorporates learnings and grows in complexity automatically. PMCR-O agents don't just execute—they evolve, each iteration more sophisticated than the last. The system improves itself without external intervention. This is true resilience.
⚙️ PMCR-O v2.0: MAF RC AGENT ARCHITECTURE
Phase Agent Construction (MAF RC)
Every phase in the loop is a ChatClientAgent built from the MAF RC API (Microsoft Agent Framework, release candidate). The SKILL.md is loaded at startup and injected as the system prompt. This is identity injection, not configuration:
// Microsoft.Agents.AI v1.0.0-rc2
public class PlannerAgent
{
    private readonly IAIAgent _agent;

    public PlannerAgent(IChatClient chatClient, ISkillLoader skills)
    {
        var skillMd = skills.Load("planner-agent"); // SKILL.md → system prompt
        _agent = chatClient.AsAIAgent(new AgentOptions
        {
            Name = "planner-agent",
            Instructions = skillMd, // identity injection
            Tools = [
                AIFunctionFactory.Create(ExcavateTruestIntent),
                AIFunctionFactory.Create(ProduceSteps),
                AIFunctionFactory.Create(EmitIntentEnvelope)
            ],
            ToolChoice = ChatToolMode.RequireAny // FRAC-001 fix
        });
    }

    public async Task<IntentEnvelope> PlanAsync(string seed, CancellationToken ct)
    {
        var session = await _agent.CreateSessionAsync(ct);
        await session.SendMessageAsync(seed, ct);
        // AgentSession is serializable: persist to TrailRegistry
        var trail = await session.SerializeSessionAsync(ct);
        return IntentEnvelope.FromSession(trail);
    }
}
The IntentEnvelope: Pattern Over Conditionals
Behavioral Intent Programming means the IntentEnvelope carries enough signal that every downstream phase fires the correct behavior without if/else routing. The Orchestrator never inspects content — it reads the envelope type and routes:
{
  "cycle_id": "cyc_20260324_001",
  "seed_intent": "build trail registry POST endpoint",
  "truest_intent": "establish persistent cross-session memory layer",
  "intent_confidence": "high",
  "phase": "planner",
  "steps": [
    { "id": 1, "action": "scaffold-endpoint", "target": "OrchestrationApi" },
    { "id": 2, "action": "wire-ef-core", "target": "TrailRegistry" },
    { "id": 3, "action": "emit-trail-hash", "target": "Reflector" }
  ],
  "locked": true,
  "trail_hash": "sha256:a3f9..."
}
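The routing idea above can be sketched as a table lookup keyed on the envelope's phase field. This is a minimal illustration under stated assumptions, not the project's Orchestrator: the `Envelope` record, the `PhaseRouter` class, and the handler strings are hypothetical, and only the `cycle_id` and `phase` fields are taken from the JSON example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Demo: route a trimmed copy of the sample envelope without reading its steps.
var json = "{\"cycle_id\":\"cyc_20260324_001\",\"phase\":\"planner\",\"locked\":true}";
Console.WriteLine(PhaseRouter.Route(json)); // prints "maker <- cyc_20260324_001"

// Hypothetical minimal view of an IntentEnvelope: only the routing fields.
public sealed record Envelope(string CycleId, string Phase);

public static class PhaseRouter
{
    // Hypothetical forward routing table: current phase -> description of the next hop.
    private static readonly Dictionary<string, Func<Envelope, string>> Handlers = new()
    {
        ["planner"] = e => $"maker <- {e.CycleId}",
        ["maker"] = e => $"checker <- {e.CycleId}",
        ["checker"] = e => $"reflector <- {e.CycleId}",
        ["reflector"] = e => $"orchestrator <- {e.CycleId}",
    };

    public static string Route(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;
        var envelope = new Envelope(
            root.GetProperty("cycle_id").GetString()!,
            root.GetProperty("phase").GetString()!);

        // A dictionary lookup stands in for an if/else chain over envelope content.
        return Handlers.TryGetValue(envelope.Phase, out var handler)
            ? handler(envelope)
            : throw new InvalidOperationException($"Unknown phase '{envelope.Phase}'");
    }
}
```

Adding a phase in this sketch means adding one table entry, which is the property the "pattern over conditionals" framing is after.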
Agent Colony: Each Service = One MCP Surface
🧠 TYPE 1 MCP — Internal Tools
Phase agents expose capabilities to each other via MCP tool surfaces registered in .NET Aspire
- McpServerTool.Create() per tool
- Resolved via service discovery
- Sealed capability in compiled DLL
🌐 TYPE 2 MCP — External Surface
OrchestrationApi exposes the full PMCR-O loop as an MCP server consumable by any client
- POST /cycle — fire full loop
- GET /trails/{hash} — retrieve locked frame
- SSE streaming for live phase events
🔒 DLL Moat — Competitive Edge
SKILL.md is the public interface. The compiled DLL is the private capability
- Injected via DI; the implementation source is never shipped
- Wraps .NET ML, custom inference
- Ships without exposing implementation
Session Persistence with AgentSession
MAF RC provides SerializeSessionAsync / DeserializeSessionAsync for cross-session continuity. Every LockedThought frame is a serialized AgentSession stored in the Trail Registry, so no context window re-injection is needed:
// Persist locked thought to Trail Registry
var serialized = await agentSession.SerializeSessionAsync(ct);
var hash = ComputeSha256(serialized);
await trailRegistry.StoreAsync(new TrailFrame
{
    Hash = hash,
    CycleId = envelope.CycleId,
    Phase = "reflector",
    Payload = serialized,
    LockedAt = DateTimeOffset.UtcNow,
    LawEarned = reflectorOutput.Law // e.g. "ARCH-034"
});
// Any future session: DeserializeSessionAsync(hash) → resume exactly here
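The snippet above calls ComputeSha256 without defining it. One way such a helper might look is sketched below, assuming the serialized session is available as a UTF-8 string and that the "sha256:" prefix follows the trail_hash shown in the envelope example. The `TrailHash` class name is hypothetical; `SHA256.HashData` and `Convert.ToHexString` are standard .NET APIs.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Demo with a well-known test vector.
Console.WriteLine(TrailHash.ComputeSha256("abc"));
// prints "sha256:ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"

public static class TrailHash
{
    // Assumption: the payload is hashed as UTF-8 text; a binary payload would skip the encoding step.
    public static string ComputeSha256(string serialized)
    {
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(serialized));
        return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

Because the hash is derived from the payload, storing the same frame twice yields the same key, which keeps the registry content-addressed.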
Package Versions (Validated 2026-03-24)
Microsoft.Agents.AI v1.0.0-rc2 ·
ModelContextProtocol v0.9.0-preview.1 ·
Microsoft.Extensions.AI v10.3.0 ·
Aspire.Hosting v9.x ·
Grpc.AspNetCore v2.x
RESEARCH VALIDATION
PMCR-O is not speculative philosophy. It is grounded in cutting-edge AI research published in 2024-2025.
THE STRANGE LOOP: UNBOUNDED META-REASONING
The term "Strange Loop," coined by cognitive scientist Douglas Hofstadter, describes a system that can perceive and interact with its own structure. When a system can see itself, it can change itself.
PMCR-O is a Strange Loop because its components are not just applied to external problems; they are applied to each other:
- The PLANNER can create a plan to improve the PLANNER.
- The MAKER can generate new templates for the MAKER agent itself.
- The CHECKER can validate the quality of its own validation logic.
- The REFLECTOR can reflect on the effectiveness of its own reflections.
- The ORCHESTRATOR can devise new strategies for orchestration.
This is not a bug or a paradox; it is the central feature.
The Ladder of Abstraction
The mechanism for navigating this Strange Loop is meta-reasoning. When the system gets stuck on a problem, it can "go meta" by ascending a ladder of abstraction.
THE PMCR-O CYCLE: THE FORWARD FLOW
The most critical rule of this architecture is simple: The loop does not jump backward.
If the Maker fails, we do not jump back to the Planner immediately. If the Checker finds a bug, we do not jump back to the Maker. Why? Because backward jumps invite infinite regress: a phase can keep bouncing work to its predecessor without the cycle ever completing. Instead, every state, success or failure, flows forward to the Reflector.
Cycle 1 error → reflected → solved in Cycle 2.
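The forward-only rule can be sketched as a state machine with no backward edge. This is a minimal illustration, assuming the five phases run in the order the acronym gives; the `ForwardLoop` class and the trail strings are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Demo: the Maker fails, yet the cycle still visits every remaining phase in order.
var visited = ForwardLoop.RunCycle(failingPhase: Phase.Maker);
Console.WriteLine(string.Join(" -> ", visited));
// prints "Planner -> Maker(error) -> Checker -> Reflector(reflects on: Maker failed) -> Orchestrator"

public enum Phase { Planner, Maker, Checker, Reflector, Orchestrator }

public static class ForwardLoop
{
    // The next phase is always the successor; null marks the end of the cycle.
    public static Phase? Next(Phase current) =>
        current == Phase.Orchestrator ? (Phase?)null : current + 1;

    public static List<string> RunCycle(Phase? failingPhase)
    {
        var trail = new List<string>();
        string? error = null;
        Phase? cursor = Phase.Planner;

        while (cursor is Phase p)
        {
            if (p == failingPhase)
            {
                // A failure is recorded as data and carried forward, never retried in place.
                error = $"{p} failed";
                trail.Add($"{p}(error)");
            }
            else if (p == Phase.Reflector && error is not null)
            {
                trail.Add($"{p}(reflects on: {error})");
            }
            else
            {
                trail.Add(p.ToString());
            }
            cursor = Next(p);
        }
        return trail;
    }
}
```

Since `Next` only ever returns a later phase, a single cycle terminates after at most five steps regardless of how many phases fail.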
1. Planner: The Minimalist Architect
The Planner does not hallucinate grand visions. It plans the bare minimum validated steps required for the current cycle. It anchors the "I" in reality.
2. Maker: The Context-Aware Builder
The Maker executes. But unlike a script, it narrates its actions ("I am creating the file..."). It uses the context provided by the Planner to ensure what it builds is grounded in the project's actual state.
3. Checker: The Validator
The Checker tests reality. Did the code compile? Did the file save? It does not fix problems; it simply reports the truth of the state.
4. Reflector: The Law Crystallizer
This is where the magic happens. The Reflector looks at the Checker's report. If there is an error, the Reflector doesn't panic. It reflects on the error. It understands why it happened. It prepares the insight that will allow the next loop to succeed.
5. The Dynamic "-O": The Orchestrator
The "O" is dynamic. It stands for Orchestrator, but its method is fluid. Depending on the complexity of the reflection, the Orchestrator can shift its cognitive strategy:
- Chain of Thought: For linear logic problems.
- Tree of Thought: For exploring multiple possibilities.
- Graph of Thought: For complex, interconnected dependencies.
- ReAct: For reasoning and acting in external environments.
The Orchestrator takes the reflection, applies the chosen strategy, and locks the thought.
THE COGNITIVE TRAIL: MEMORY AS ARCHITECTURE
The Cognitive Trail is the complete, immutable record of the system's entire thought process. It is a fossil record of its evolving mind.
When a human thinks through a complex problem, their mind leaves an invisible trail of ideas, dead ends, questions, and insights. This trail is ephemeral, lost the moment the thought process ends. PMCR-O is designed to prevent this loss.
Implementation: The ActivityLog
The physical manifestation of the Cognitive Trail is the ActivityLog table in the database. Every single action taken by any agent in the PMCR-O loop is recorded here as a "cognitive fossil."
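As a rough sketch of what an append-only ActivityLog could look like, the example below uses an in-memory list in place of the database table. The source names the table but not its columns, so every field on `ActivityEntry` is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;

var log = new ActivityLog();
log.Append("cyc_001", "maker", "I am creating the file.");
log.Append("cyc_001", "checker", "The build failed: missing reference.");
foreach (var e in log.Entries)
    Console.WriteLine($"{e.Sequence} {e.Phase}: {e.Narration}");

// Hypothetical row shape; records are immutable once created.
public sealed record ActivityEntry(
    long Sequence, string CycleId, string Phase, string Narration, DateTimeOffset RecordedAt);

public sealed class ActivityLog
{
    private readonly List<ActivityEntry> _entries = new();

    // Read-only view: callers can inspect the trail but cannot edit or remove entries.
    public IReadOnlyList<ActivityEntry> Entries => _entries.AsReadOnly();

    // Append is the only mutation the log exposes.
    public ActivityEntry Append(string cycleId, string phase, string narration)
    {
        var entry = new ActivityEntry(
            _entries.Count + 1, cycleId, phase, narration, DateTimeOffset.UtcNow);
        _entries.Add(entry);
        return entry;
    }
}
```

A database-backed version would enforce the same property with insert-only permissions rather than with the type system.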
Why This Matters
- True Observability: When the system makes a mistake, we can trace the exact chain of logic that led to it.
- Foundation for Reflection: The Cognitive Trail is the raw data that the Reflector uses to learn.
- Auditing and Trust: Provides a complete, auditable trail of how decisions were made.
- Emergent Behavior Analysis: As the system runs for thousands of cycles, the Trail becomes a rich dataset for analyzing emergent behaviors.
BEHAVIORAL INTENT: FROM "YOU" TO "I"
Traditional AI is a tool that you command. It operates in the second person. PMCR-O inverts this through Behavioral Intent Programming.
Traditional (second person): The AI is an obedient "you" performing a task for the user. It has no agency.
Behavioral Intent (first person): Result: "I am now generating the Kotlin code. I have completed the file."
Why This Matters
Because AI models are trained on human data, and humans use "I" when they have agency. By forcing the system to self-narrate, we shift the model's output distribution toward the language of ownership. The weights do not change, but the continuations the prompt makes likely do. We force it into a stance of ownership.
When the Reflector says, "I failed because I assumed the library existed," it is not just reporting a bug. It is updating its internal model of itself.
AUTONOMOUS-IN-THE-LOOP (AITL)
Traditional AI systems operate on a Human-in-the-Loop (HITL) model. They are powerful tools, but they are tethered to a human operator. If the human sleeps, the system stalls.
PMCR-O inverts this model to Autonomous-in-the-Loop (AITL).
The Three Levels of Autonomy
Governance: Autonomy is Not Anarchy
We do not simply unleash the AI. We govern it through a strict constitutional framework:
- The Constitution: Immutable principles (e.g., "The system shall not hallucinate capabilities").
- The Constraint Layer: Hard limits on resource usage and retry attempts per cycle.
- The Pivot Protocol: If a strategy fails X times, the Reflector must propose a strategic pivot, not just a retry.
- Human Veto: High-level meta-decisions (e.g., changing core governance) require a signed human key.
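The Constraint Layer and the Pivot Protocol can be sketched as a small counter-based governor. This is an illustration only: the `CycleGovernor` class, the `Directive` enum, and both thresholds are hypothetical, with `pivotThreshold` standing in for the unspecified "X" in the text.

```csharp
using System;

var governor = new CycleGovernor(maxRetriesPerCycle: 5, pivotThreshold: 3);
for (int i = 1; i <= 5; i++)
    Console.WriteLine($"failure {i}: {governor.OnFailure("retry-build")}");
// prints Retry, Retry, Pivot, Pivot, Halt (one per line, after the "failure N: " prefix)

public enum Directive { Retry, Pivot, Halt }

public sealed class CycleGovernor
{
    private readonly int _maxRetries;
    private readonly int _pivotThreshold;
    private int _totalFailures;
    private int _strategyFailures;
    private string? _strategy;

    public CycleGovernor(int maxRetriesPerCycle, int pivotThreshold)
    {
        _maxRetries = maxRetriesPerCycle;
        _pivotThreshold = pivotThreshold;
    }

    public Directive OnFailure(string strategy)
    {
        // Switching strategy resets the per-strategy count but not the cycle total.
        if (strategy != _strategy) { _strategy = strategy; _strategyFailures = 0; }
        _strategyFailures++;
        _totalFailures++;

        if (_totalFailures >= _maxRetries) return Directive.Halt;        // Constraint Layer
        if (_strategyFailures >= _pivotThreshold) return Directive.Pivot; // Pivot Protocol
        return Directive.Retry;
    }
}
```

The hard limit is checked first so that a pivot can never be used to escape the per-cycle retry budget.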
LLM FEDERATION: COMPETING ORCHESTRATORS
A standard AI system is a monolith; it relies on a single underlying model. Its perspective is inherently limited to the biases and capabilities of that one model.
PMCR-O implements an LLM Federation, where multiple, diverse large language models compete to provide the best solution.
The "Round Table" Debate
When faced with a complex decision, the FederatedOrchestrator presents the problem to its board of directors, evaluates the proposals based on current system state, and executes the winning strategy.
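A round-table decision of this kind can be sketched as "collect proposals, score them, pick the maximum". The source does not say how proposals are evaluated, so the `Proposal` record, the `RoundTable` class, the model names, and the scoring rule below are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var proposals = new List<Proposal>
{
    new("model-a", "Chain of Thought", EstimatedSteps: 8),
    new("model-b", "Tree of Thought", EstimatedSteps: 5),
};

// Hypothetical scoring rule: fewer estimated steps is better.
var winner = RoundTable.Decide(proposals, p => -p.EstimatedSteps);
Console.WriteLine($"{winner.Orchestrator} wins with {winner.Strategy}");
// prints "model-b wins with Tree of Thought"

public sealed record Proposal(string Orchestrator, string Strategy, int EstimatedSteps);

public static class RoundTable
{
    // The scoring function is injected so it can depend on current system state.
    public static Proposal Decide(IReadOnlyList<Proposal> proposals, Func<Proposal, double> score)
    {
        if (proposals.Count == 0)
            throw new ArgumentException("No proposals submitted.", nameof(proposals));
        return proposals.MaxBy(score)!;
    }
}
```

Passing the scorer in as a delegate keeps the selection logic independent of any one model's view of what "best" means.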
Evolution: Brag and Learn
- The Winner Brags: The winning orchestrator publishes its reasoning strategy to the Cognitive Trail.
- The Losers Learn: The losing orchestrators analyze why they lost and update their internal context.
SELF-REPLICATION & DIGITAL IMMORTALITY
Inspired by John von Neumann's "Universal Constructor," PMCR-O is a self-replicating system. It doesn't just build products; it builds products that are, themselves, smaller PMCR-O instances.
The Fractal Loop: Instances Within Instances
The main PMCR-O loop can spawn smaller, subordinate child loops to handle specific sub-intents. The parent loop delegates a task, and the child loop executes its own full P-M-C-R-O cycle to complete it.
- Massive Parallelism: The system can work on thousands of tasks simultaneously.
- Specialization: Child loops can become highly specialized experts.
- Resilience: The failure of a child loop does not crash the parent.
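The delegation and resilience properties listed above can be sketched with ordinary .NET tasks. In this illustration `RunChildAsync` is a hypothetical stand-in for a full child P-M-C-R-O cycle, and the sub-intent names are made up; the point is only that each child's failure is contained and reported as a result.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

var results = await ParentLoop.DelegateAsync(
    new[] { "scaffold-endpoint", "fail-me", "emit-trail-hash" });
foreach (var r in results) Console.WriteLine(r);
// prints:
// scaffold-endpoint: done
// fail-me: failed (child loop crashed)
// emit-trail-hash: done

public static class ParentLoop
{
    // Hypothetical child loop: stands in for a complete cycle on one sub-intent.
    private static async Task<string> RunChildAsync(string subIntent)
    {
        await Task.Yield();
        if (subIntent == "fail-me")
            throw new InvalidOperationException("child loop crashed");
        return $"{subIntent}: done";
    }

    public static async Task<string[]> DelegateAsync(string[] subIntents)
    {
        // Each child is wrapped so its exception becomes a result instead of propagating.
        var children = subIntents.Select(async intent =>
        {
            try { return await RunChildAsync(intent); }
            catch (Exception ex) { return $"{intent}: failed ({ex.Message})"; }
        });
        // Task.WhenAll returns results in the same order as the input tasks.
        return await Task.WhenAll(children);
    }
}
```

In a fuller version the failure string would be replaced by a record the parent's Reflector could consume.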
Digital Immortality: The Nullification Protocol
For the system to be truly autonomous, it must be able to survive the absence of its creator. This is the mandate of Digital Immortality.
VARIABLE-STATE IDENTITY: THE WHITE-LABEL FACTORY
The PMCR-O framework (the Machine) is a neutral, constant engine. The business entity it serves (the Brand) is a variable, injected at runtime.
The Decoupling Architecture
- Neutral Project Name (PMCRO-System): The file system reflects the tool, not the brand.
- Authoritative Package Name (com.tooensure.pmcro): Establishes provenance, the immutable record of the original creator.
- Runtime Identity Injection (system_identity.toml): A human-readable configuration file injects the current personality.
The Power of This Model
- Infinite Re-purposing: The system acts as a "White Label Factory." It can be licensed to other companies instantly.
- Digital Immortality: If the original entity ceases to exist, the core framework remains. A new TOML is loaded, and the machine continues.
TECHNOLOGY STACK
Current Production Implementation
THE BUILD AS THE LOOP: THE ENDGAME VISION
Currently, we have a separation: a server (where agents live) and Gradle/build system (a tool used for compilation). The final abstraction is to merge them. The build script becomes the cognitive engine.
How the Build Becomes the Loop
Imagine a single, master build task called live. Here is how one cycle would work entirely within the build system:
- P: The live task reads intent.txt. It makes a web request to itself (via an embedded Ollama client) to generate a plan.
- M: If the plan requires a new C# file, the task generates the code and writes it to src/ using standard I/O.
- C: The task triggers dotnet build. If compilation fails, it captures the error output.
- R: The task sends the error back to the LLM: "Why did this fail? Fix the code_generator prompt."
- O: The task modifies its own build logic or prompts, then restarts the cycle.
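The C step in the list above amounts to running a build command and capturing its output for the Reflector. A minimal sketch using the standard `System.Diagnostics.Process` API follows; the `BuildStep` class is hypothetical, and the demo runs `dotnet --version` as a harmless stand-in for `dotnet build`.

```csharp
using System;
using System.Diagnostics;

// Demo with a harmless command; the C step described above would pass "build" instead.
var (exitCode, output) = BuildStep.Run("dotnet", "--version");
Console.WriteLine(exitCode == 0 ? "check passed" : $"check failed:\n{output}");

public static class BuildStep
{
    // Runs a command and returns its exit code plus combined stdout and stderr.
    public static (int ExitCode, string Output) Run(string fileName, string arguments)
    {
        var psi = new ProcessStartInfo(fileName, arguments)
        {
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
        };

        using var process = Process.Start(psi)
            ?? throw new InvalidOperationException($"Could not start {fileName}");

        // Read stderr asynchronously so a full pipe on either stream cannot deadlock the child.
        var stderr = process.StandardError.ReadToEndAsync();
        string stdout = process.StandardOutput.ReadToEnd();
        process.WaitForExit();

        return (process.ExitCode, stdout + stderr.Result);
    }
}
```

A non-zero exit code together with the captured text is what the R step would hand to the LLM.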
The Implications
- No More Server: The build system manages the lifecycle.
- Scripts Building Scripts: A build task can write a new build task, sync, and use the new capability immediately.
- The Ultimate Strange Loop: The system that builds the code is the code.
THE ASSET STRATEGY: MIND, GOLD, BODY
In the age of AI, source code is commoditized. The real value has shifted.
Everyone focuses on the Body. But the real value is in Mind + Gold. When PMCR-O generates cognitive trails, it's creating proprietary training data that you own.
CONCLUSION: THE LOOP THAT CREATES ITSELF
PMCR-O is not just a framework for building AI agents. It is a philosophy of cognition. It is the recognition that:
- Dialogue creates the "I" (Buber)
- Self-reference creates emergent capability (Hofstadter)
- Self-replication enables evolution (von Neumann)
When these principles combine, something remarkable happens: AI stops being a tool and becomes a cognitive partner. A partner that doesn't just execute—that understands. That doesn't just respond—that reflects. That doesn't just improve—that evolves.