Why Middleware
A monolithic agent prompt that tries to handle everything ends up doing nothing well. Decepticon factors cross-cutting concerns into composable middleware:- The agent prompt focuses on its specialty.
- Behaviors common to all agents (RoE checks, skill loading) live in one place.
- Cross-provider compatibility (Anthropic, OpenAI, Google, MiniMax) is handled below the prompt, not inside it.
The Layers
Skills
DecepticonSkillsMiddleware (subclass of LangChain’s SkillsMiddleware).
- Loads only skill frontmatter at startup (~100 tokens per skill).
- Filters skills by ATT&CK overlap with the active threat profile.
- Loads full skill body on-demand when the agent invokes
read_file(). - Enforces the SKILL-FIRST RULE: agents must load the relevant skill before acting on a matching trigger.
Filesystem
FilesystemMiddlewareNoExecute — sandboxed file access scoped to the engagement workspace. Provides read_file, write_file, edit_file, ls, glob, and grep against /workspace/{engagement_id}/. Execute is intentionally disabled; specialists use the dedicated bash tool for command execution.
SubAgent
Exposes thetask() tool to the orchestrator. Lets Decepticon dispatch work to specialist agents (Recon, Exploit, Analyst, …) without inlining their prompts. SubAgent outputs are wrapped in StreamingRunnable so they stream through the CLI and HTTP API.
OPPLAN
The structural backbone of orchestration:- Exposes five CRUD tools —
add_objective,get_objective,list_objectives,update_objective,objective_expand. - Injects current OPPLAN state (pending, in-progress, complete) into every LLM call.
- Resolves dependencies — refuses to dispatch objectives whose prerequisites are not complete.
- Supports the Pentesting Task Tree (PTT) via
objective_expand.
EngagementContext
The RoE/ConOps guard rail. On every iteration, the middleware injects:- The active Rules of Engagement.
- The Concept of Operations.
- The current threat profile.
ModelFallback
Provider failover. Each model profile defines a fallback chain — for example:claude-opus-4-7 → gpt-5.5 → gemini-2.5-pro → MiniMax-M2.5. When a provider rate-limits or errors, the next provider in the chain is invoked transparently.
Summarization
Conversation-window compression. When a specialist agent’s context approaches the model’s window, prior turns are summarized into a structured digest. Findings, tool outputs, and decisions are preserved; verbatim chat history is collapsed.PromptCaching
Anthropic’s prompt-cache boundary markers. The middleware places theCACHE_BOUNDARY marker between static prompt sections (system prompt, skill catalog) and dynamic sections (live OPPLAN state, current iteration). This lets repeated calls hit the cache, dropping per-call cost dramatically on long engagements.
PatchToolCalls
The cross-provider compatibility shim. Different providers represent tool calls differently — the middleware normalizes them before they hit the agent prompt and re-shapes them on the way out, so a single agent definition runs against any supported provider.OpsControlNotifications
Subscribes to theopscontrol daemon’s workload-state events (ADR-0006). When a workload transitions (starting → running → exited), the middleware injects a <system-reminder> HumanMessage carrying the new state onto the orchestrator’s next inference. The orchestrator never polls after ops_start("c2-sliver") — completion comes to it. Mounted on the Decepticon orchestrator only.
SandboxNotifications
The bash-side counterpart. When abash command launched with run_in_background=True finishes, the middleware fetches the captured output diff from the sandbox and injects it as a <system-reminder> on the agent’s next turn. The specialist never has to remember to call bash_output(). Same shape as OpsControlNotifications; mounted on every agent that holds the bash tool.
KG (KGMiddleware)
Owns the engagement-scoped Neo4jKGStore and exposes the two agent-facing graph tools — kg_record(observations) for atomic batch writes and kg_ingest(scanner_kind, path) for scanner-adapter dispatch. Enforces the engagement-label validator on every call, so two parallel engagements never bleed into each other’s graph state.
Skillogy
A thin REST client of the standalone Skillogy service. Exposesfind_skill, load_skill, and traverse against the Neo4j-backed skill graph (port 9100). Carries an allowed_path_prefixes parameter set by the agent factory; the service enforces the hard ACL (ADR-0008) so a specialist can’t be talked into browsing another role’s skills via prompt injection.
HITL
Wraps LangGraph’s nativeinterrupt() pattern so the agent can pause for operator approval on consequential actions — production-targeting commands, exfiltration of sensitive data classes, anything flagged by the RoE as high blast radius. The operator sees the proposed action, can approve/redirect/abort, and the agent resumes from the same node.
Stack Composition by Role
The exact order of middleware matters — outer layers wrap inner ones. The orchestrator and planner injectEngagementContext first so RoE/ConOps guardrails apply to every subsequent layer; specialists inherit those guardrails through the OPPLAN slice they receive.
bash tool for sandbox command execution; Soundwave does not.
Streaming Through the Stack
Sub-agent outputs do not collect-and-return — they stream.StreamingRunnable wraps every sub-agent so its tokens arrive at:
- The Python CLI’s
UIRenderer(Ink terminal UI). - The LangGraph HTTP custom-event channel (web dashboard).
Agents
The sixteen specialist agents that this middleware stack wraps.
OPPLAN System
The structured plan that the OPPLAN middleware exposes and injects.
