> ## Documentation Index
> Fetch the complete documentation index at: https://docs.decepticon.red/llms.txt
> Use this file to discover all available pages before exploring further.

# Middleware Stack

> Cross-cutting concerns — RoE enforcement, skills, OPPLAN, caching — applied uniformly across every Decepticon agent.

Every Decepticon agent inherits a middleware stack that handles concerns common to all agents — RoE enforcement, skill loading, prompt caching, model fallback, conversation summarization. This page documents what each layer does and why it exists.

## Why Middleware

A monolithic agent prompt that tries to handle everything ends up doing nothing well. Decepticon factors cross-cutting concerns into composable middleware:

* The agent prompt focuses on its specialty.
* Behaviors common to all agents (RoE checks, skill loading) live in one place.
* Cross-provider compatibility (Anthropic, OpenAI, Google, MiniMax) is handled below the prompt, not inside it.

## The Layers

### Skills

`DecepticonSkillsMiddleware` (subclass of LangChain's `SkillsMiddleware`).

* Loads only skill *frontmatter* at startup (\~100 tokens per skill).
* Filters skills by ATT\&CK overlap with the active threat profile.
* Loads full skill body on-demand when the agent invokes `read_file()`.
* Enforces the **SKILL-FIRST RULE**: agents must load the relevant skill before acting on a matching trigger.

### Filesystem

`FilesystemMiddlewareNoExecute` — sandboxed file access scoped to the engagement workspace. Provides `read_file`, `write_file`, `edit_file`, `ls`, `glob`, and `grep` against `/workspace/{engagement_id}/`. Execute is intentionally disabled; specialists use the dedicated `bash` tool for command execution.

### SubAgent

Exposes the `task()` tool to the orchestrator. Lets Decepticon dispatch work to specialist agents (Recon, Exploit, Analyst, ...) without inlining their prompts. SubAgent outputs are wrapped in `StreamingRunnable` so they stream through the CLI and HTTP API.

### OPPLAN

The structural backbone of orchestration:

* Exposes five CRUD tools — `add_objective`, `get_objective`, `list_objectives`, `update_objective`, `objective_expand`.
* Injects current OPPLAN state (pending, in-progress, complete) into every LLM call.
* Resolves dependencies — refuses to dispatch objectives whose prerequisites are not complete.
* Supports the Pentesting Task Tree (PTT) via `objective_expand`.

### EngagementContext

The RoE/ConOps guard rail. On every iteration, the middleware injects:

* The active Rules of Engagement.
* The Concept of Operations.
* The current threat profile.

The agent's reasoning is therefore continuously evaluated against scope and adversary identity, not just at the start of the session.

### ModelFallback

Provider failover. Each model profile defines a fallback chain — for example: `claude-opus-4-7 → gpt-5.5 → gemini-2.5-pro → MiniMax-M2.5`. When a provider rate-limits or errors, the next provider in the chain is invoked transparently.

### Summarization

Conversation-window compression. When a specialist agent's context approaches the model's window, prior turns are summarized into a structured digest. Findings, tool outputs, and decisions are preserved; verbatim chat history is collapsed.

### PromptCaching

Anthropic's prompt-cache boundary markers. The middleware places the `CACHE_BOUNDARY` marker between static prompt sections (system prompt, skill catalog) and dynamic sections (live OPPLAN state, current iteration). This lets repeated calls hit the cache, dropping per-call cost dramatically on long engagements.

### PatchToolCalls

The cross-provider compatibility shim. Different providers represent tool calls differently — the middleware normalizes them before they hit the agent prompt and re-shapes them on the way out, so a single agent definition runs against any supported provider.

### OpsControlNotifications

Subscribes to the `opscontrol` daemon's workload-state events (ADR-0006). When a workload transitions (`starting → running → exited`), the middleware injects a `<system-reminder>` `HumanMessage` carrying the new state onto the orchestrator's next inference. The orchestrator never polls after `ops_start("c2-sliver")` — completion comes to it. Mounted on the Decepticon orchestrator only.

### SandboxNotifications

The bash-side counterpart. When a `bash` command launched with `run_in_background=True` finishes, the middleware fetches the captured output diff from the sandbox and injects it as a `<system-reminder>` on the agent's next turn. The specialist never has to remember to call `bash_output()`. Same shape as `OpsControlNotifications`; mounted on every agent that holds the `bash` tool.

### KG (KGMiddleware)

Owns the engagement-scoped Neo4j `KGStore` and exposes the two agent-facing graph tools — `kg_record(observations)` for atomic batch writes and `kg_ingest(scanner_kind, path)` for scanner-adapter dispatch. Enforces the engagement-label validator on every call, so two parallel engagements never bleed into each other's graph state.

### Skillogy

A thin REST client of the standalone Skillogy service. Exposes `find_skill`, `load_skill`, and `traverse` against the Neo4j-backed skill graph (port `9100`). Carries an `allowed_path_prefixes` parameter set by the agent factory; the service enforces the hard ACL ([ADR-0008](https://github.com/PurpleAILAB/Decepticon/blob/main/docs/adr/0008-skillogy-hard-acl-phase1a.md)) so a specialist can't be talked into browsing another role's skills via prompt injection.

### HITL

Wraps LangGraph's native `interrupt()` pattern so the agent can pause for operator approval on consequential actions — production-targeting commands, exfiltration of sensitive data classes, anything flagged by the RoE as high blast radius. The operator sees the proposed action, can approve/redirect/abort, and the agent resumes from the same node.

## Stack Composition by Role

The exact order of middleware matters — outer layers wrap inner ones. The orchestrator and planner inject `EngagementContext` first so RoE/ConOps guardrails apply to every subsequent layer; specialists inherit those guardrails through the OPPLAN slice they receive.

```
┌─ Decepticon Orchestrator ──────────────────┐
│  EngagementContext                          │
│   └─ Skills                                 │
│       └─ Filesystem                         │
│           └─ SubAgent                       │
│               └─ OPPLAN                     │
│                   └─ ModelFallback          │
│                       └─ Summarization      │
│                           └─ PromptCache    │
│                               └─ Patch      │
└─────────────────────────────────────────────┘

┌─ Vulnresearch Orchestrator ────────────────┐
│  Skills                                     │
│   └─ Filesystem                             │
│       └─ SubAgent                           │
│           └─ OPPLAN                         │
│               └─ ModelFallback              │
│                   └─ Summarization          │
│                       └─ PromptCache        │
│                           └─ Patch          │
└─────────────────────────────────────────────┘

┌─ Soundwave Planner ────────────────────────┐
│  EngagementContext                          │
│   └─ Skills                                 │
│       └─ Filesystem                         │
│           └─ ModelFallback                  │
│               └─ Summarization              │
│                   └─ PromptCache            │
│                       └─ Patch              │
└─────────────────────────────────────────────┘

┌─ Specialist (Recon, Exploit, ...) ─────────┐
│  Skills                                     │
│   └─ Filesystem                             │
│       └─ ModelFallback                      │
│           └─ Summarization                  │
│               └─ PromptCache                │
│                   └─ Patch                  │
└─────────────────────────────────────────────┘
```

Specialists also bind the `bash` tool for sandbox command execution; Soundwave does not.

## Streaming Through the Stack

Sub-agent outputs do not collect-and-return — they stream. `StreamingRunnable` wraps every sub-agent so its tokens arrive at:

1. The Python CLI's `UIRenderer` (Ink terminal UI).
2. The LangGraph HTTP custom-event channel (web dashboard).

This is how the operator sees Recon enumerate ports in real time, not after the fact.

<Card title="Agents" icon="brain" href="/en/architecture/agents">
  The sixteen specialist agents that this middleware stack wraps.
</Card>

<Card title="OPPLAN System" icon="clipboard-list" href="/en/features/opplan-system">
  The structured plan that the OPPLAN middleware exposes and injects.
</Card>