# Multi-Model Routing

> LiteLLM-powered model routing with role-based profiles and automatic fallback.

## Overview

Not every agent task requires the most powerful model. Decepticon uses **LiteLLM** as a proxy layer to route different agent roles to different model tiers — balancing cost, speed, and capability.

## Model Profiles

Three built-in profiles optimize for different use cases:

| Profile  | Orchestrator | Exploit    | Recon      | Use Case                                           |
| -------- | ------------ | ---------- | ---------- | -------------------------------------------------- |
| **eco**  | Opus 4.7     | Sonnet 4.6 | Haiku 4.5  | Production — best cost/performance ratio (default) |
| **max**  | Opus 4.7     | Opus 4.7   | Sonnet 4.6 | High-value targets — maximum capability            |
| **test** | Haiku 4.5    | Haiku 4.5  | Haiku 4.5  | Development/CI — fast iteration                    |

The full per-tier model matrix across providers is:

| Tier     | Anthropic API     | Anthropic OAuth        | OpenAI     | Google                | MiniMax                |
| -------- | ----------------- | ---------------------- | ---------- | --------------------- | ---------------------- |
| **HIGH** | claude-opus-4-7   | auth/claude-opus-4-7   | gpt-5.5    | gemini-2.5-pro        | MiniMax-M2.5           |
| **MID**  | claude-sonnet-4-6 | auth/claude-sonnet-4-6 | gpt-5.4    | gemini-2.5-flash      | MiniMax-M2.5-lightning |
| **LOW**  | claude-haiku-4-5  | auth/claude-haiku-4-5  | gpt-5-nano | gemini-2.5-flash-lite | —                      |
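
As a rough illustration, the matrix can be read as a nested lookup from tier to provider to model ID. The sketch below is not Decepticon's actual data structure. The names `MODEL_MATRIX` and `model_for` are hypothetical, and the provider keys borrow the identifiers shown in the `DECEPTICON_AUTH_PRIORITY` example later on this page.

```python
from typing import Optional

# Tier -> provider -> model ID, copied from the matrix above.
# Provider keys are an assumption borrowed from the DECEPTICON_AUTH_PRIORITY example.
MODEL_MATRIX: dict[str, dict[str, Optional[str]]] = {
    "HIGH": {
        "anthropic_api": "claude-opus-4-7",
        "anthropic_oauth": "auth/claude-opus-4-7",
        "openai_api": "gpt-5.5",
        "google_api": "gemini-2.5-pro",
        "minimax_api": "MiniMax-M2.5",
    },
    "MID": {
        "anthropic_api": "claude-sonnet-4-6",
        "anthropic_oauth": "auth/claude-sonnet-4-6",
        "openai_api": "gpt-5.4",
        "google_api": "gemini-2.5-flash",
        "minimax_api": "MiniMax-M2.5-lightning",
    },
    "LOW": {
        "anthropic_api": "claude-haiku-4-5",
        "anthropic_oauth": "auth/claude-haiku-4-5",
        "openai_api": "gpt-5-nano",
        "google_api": "gemini-2.5-flash-lite",
        "minimax_api": None,  # the matrix lists no LOW-tier MiniMax model
    },
}


def model_for(tier: str, provider: str) -> Optional[str]:
    """Return the model ID for a tier/provider pair.

    Returns None when the provider has no model at that tier or is unknown.
    Raises KeyError for an unknown tier.
    """
    return MODEL_MATRIX[tier.upper()].get(provider)
```

For example, `model_for("MID", "anthropic_oauth")` returns `"auth/claude-sonnet-4-6"`, while `model_for("LOW", "minimax_api")` returns `None`.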

<Info>
  The **eco** profile is recommended for most engagements. The orchestrator needs deep reasoning (Opus), but reconnaissance tasks work well with faster, cheaper models.
</Info>

## Role-Based Routing

Each agent role has different cognitive demands:

<CardGroup cols={2}>
  <Card title="Orchestrator" icon="brain">
    Needs deep reasoning for objective sequencing, dependency resolution, and attack path adaptation. Routes to the most capable model.
  </Card>

  <Card title="Exploit Agent" icon="bug">
    Requires strong technical reasoning for vulnerability exploitation and tool operation. Routes to the MID tier in **eco** and the HIGH tier in **max**.
  </Card>

  <Card title="Recon Agent" icon="magnifying-glass">
    Handles structured tasks — port scanning, service enumeration, output parsing. Works well with faster models.
  </Card>

  <Card title="Post-Exploit Agent" icon="key">
    Handles complex post-exploitation through C2 sessions. Routes similarly to the exploit agent.
  </Card>
</CardGroup>
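
The role-to-tier mapping described above could be sketched as a small lookup table. This is a minimal illustration and not Decepticon's implementation: the names `PROFILES`, `ROLE_ALIASES`, and `tier_for` are hypothetical, the tier labels are inferred from the profile table (Opus is HIGH, Sonnet is MID, Haiku is LOW), and treating the post-exploit agent as an alias of the exploit agent is an assumption based on the card text.

```python
# Profile -> role -> tier, inferred from the Model Profiles table
# (Opus = HIGH, Sonnet = MID, Haiku = LOW). Illustrative only.
PROFILES = {
    "eco": {"orchestrator": "HIGH", "exploit": "MID", "recon": "LOW"},
    "max": {"orchestrator": "HIGH", "exploit": "HIGH", "recon": "MID"},
    "test": {"orchestrator": "LOW", "exploit": "LOW", "recon": "LOW"},
}

# Assumption: the post-exploit agent is described as routing like the
# exploit agent, so this sketch resolves it to the same tier.
ROLE_ALIASES = {"post_exploit": "exploit"}


def tier_for(role: str, profile: str = "eco") -> str:
    """Resolve an agent role to a model tier under the given profile."""
    role = ROLE_ALIASES.get(role, role)
    try:
        return PROFILES[profile][role]
    except KeyError as exc:
        raise ValueError(f"unknown profile or role: {profile!r}/{role!r}") from exc
```

Under these assumptions, `tier_for("recon", "max")` resolves to `"MID"`, and `tier_for("post_exploit")` resolves to the same tier as the exploit agent in the default **eco** profile.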

## Automatic Fallback

Each tier has its own fallback chain across providers. If the primary provider hits a rate limit or has an outage, Decepticon switches to the next provider in the chain:

```
HIGH:  Opus 4.7   → GPT-5.5    → Gemini 2.5 Pro        → MiniMax-M2.5
MID:   Sonnet 4.6 → GPT-5.4    → Gemini 2.5 Flash      → MiniMax-M2.5-lightning
LOW:   Haiku 4.5  → GPT-5-nano → Gemini 2.5 Flash-Lite
```

Set the provider order with the `DECEPTICON_AUTH_PRIORITY` environment variable, a comma-separated list (for example, `anthropic_oauth,anthropic_api,openai_api,google_api,minimax_api`). The LOW chain has only three entries because the matrix lists no LOW-tier MiniMax model. Failover needs no manual intervention and does not interrupt running operations.
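
The fallback behavior can be pictured as walking an ordered list of providers until one succeeds. The sketch below is a simplified model of that idea, not the routing code Decepticon or LiteLLM actually runs. `ProviderUnavailable`, `parse_priority`, `build_chain`, and `call_with_fallback` are hypothetical names, and the example priority string from the text stands in for a default, since the real default is not stated here.

```python
import os
from typing import Callable, Optional

# Stand-in default: the example value from the text, not a documented default.
EXAMPLE_PRIORITY = "anthropic_oauth,anthropic_api,openai_api,google_api,minimax_api"

# LOW-tier models per provider, taken from the matrix (no MiniMax entry).
LOW_TIER: dict[str, Optional[str]] = {
    "anthropic_oauth": "auth/claude-haiku-4-5",
    "anthropic_api": "claude-haiku-4-5",
    "openai_api": "gpt-5-nano",
    "google_api": "gemini-2.5-flash-lite",
    "minimax_api": None,
}


class ProviderUnavailable(Exception):
    """Hypothetical error standing in for a rate limit or provider outage."""


def parse_priority(raw: Optional[str] = None) -> list[str]:
    """Split a comma-separated priority string into provider names."""
    if raw is None:
        raw = os.environ.get("DECEPTICON_AUTH_PRIORITY", EXAMPLE_PRIORITY)
    return [name.strip() for name in raw.split(",") if name.strip()]


def build_chain(models: dict[str, Optional[str]], priority: list[str]) -> list[tuple[str, str]]:
    """Ordered (provider, model) pairs, skipping providers with no model at this tier."""
    return [(p, models[p]) for p in priority if models.get(p)]


def call_with_fallback(chain: list[tuple[str, str]], send: Callable[[str, str], str]) -> str:
    """Try each provider in order; move on only when it reports unavailability."""
    last_error: Optional[Exception] = None
    for provider, model in chain:
        try:
            return send(provider, model)
        except ProviderUnavailable as exc:
            last_error = exc  # remember the failure and try the next provider
    raise RuntimeError("all providers in the chain failed") from last_error
```

Here `send` is a caller-supplied function that performs the real request. Only the hypothetical `ProviderUnavailable` error triggers a switch in this sketch, so other exceptions still surface to the caller.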

## Provider Support

Any LiteLLM-compatible backend works:

* **Anthropic** — Claude Opus 4.7, Sonnet 4.6, Haiku 4.5 (API key or Claude OAuth subscription)
* **OpenAI** — GPT-5.5, GPT-5.4, GPT-5-nano
* **Google** — Gemini 2.5 Pro / Flash / Flash-Lite
* **MiniMax** — MiniMax-M2.5 and M2.5-lightning
* **Self-hosted** — vLLM, Ollama, or any OpenAI-compatible endpoint

Configure providers with `decepticon onboard` or through environment variables. See [Configuration](/en/getting-started/configuration) for details.
