

Long-running agents accumulate transcripts that eventually exceed the model’s context window. Swarms ships a ContextCompressor that summarizes the active memory near the limit and archives the raw transcript — the agent keeps running without manual pruning.

When to use it

  • max_loops="auto" or any long-running iterative task.
  • Agents that produce or consume large tool outputs.
  • Multi-session agents whose MEMORY.md would otherwise grow unbounded.

How it fires

Compression runs at the top of a loop iteration when all of the following hold:
  • context_compression=True on the agent
  • Token usage of the active prompt ≥ threshold * context_length
  • The agent is at the start of an iteration (not mid tool-call)
The default threshold is 0.9 — compression fires at ~90% of the context window.
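
The three conditions can be pictured as a single predicate evaluated at the top of each iteration. This is an illustrative sketch; the function and parameter names are assumptions, not the Swarms internals:

```python
# Illustrative trigger check — names are assumptions, not the Swarms API.
def should_compress(active_tokens: int, context_length: int,
                    threshold: float = 0.9, mid_tool_call: bool = False) -> bool:
    """True when compression should fire at the top of a loop iteration."""
    if mid_tool_call:  # never compress in the middle of a tool call
        return False
    return active_tokens >= threshold * context_length
```

With a 200k-token window and the default 0.9 threshold, this fires once the active prompt reaches 180k tokens.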

Default behavior

Compression is enabled by default. Just construct the agent normally:
```python
from swarms import Agent

agent = Agent(
    agent_name="ResearchAgent",
    model_name="claude-sonnet-4-6",
    max_loops=5,
    context_compression=True,  # default
)

agent.run("Research low-latency cloud data warehouses, then dive deep on GCP.")
```
When the prompt approaches the limit, Swarms:
  1. Summarizes the current transcript with an LLM call.
  2. Copies MEMORY.md to archive/history_<timestamp>.md.
  3. Wipes MEMORY.md and re-seeds it with the summary as a single System message.
  4. Rebuilds conversation_history (system prompt + rules + summary).
The agent keeps running with a small active context; the full pre-compaction transcript stays in archive/.
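
The archive and re-seed steps (2 and 3 above) can be sketched roughly as follows. This is an illustrative reconstruction, not the actual Swarms implementation; the summary is assumed to come from the summarizer LLM call in step 1:

```python
# Hedged sketch of the archive -> wipe -> re-seed flow (steps 2-3 above).
# Not the real Swarms internals.
import shutil
from datetime import datetime
from pathlib import Path

def compact_memory(agent_dir: Path, summary: str) -> Path:
    memory = agent_dir / "MEMORY.md"
    archive_dir = agent_dir / "archive"
    archive_dir.mkdir(exist_ok=True)

    # Step 2: copy the raw transcript to archive/history_<timestamp>.md
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    archived = archive_dir / f"history_{stamp}.md"
    shutil.copy(memory, archived)

    # Step 3: wipe MEMORY.md and re-seed it with the summary
    # as a single System message
    memory.write_text(f"## System\n\n{summary}\n")
    return archived
```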

Tune the compressor

Swap the default ContextCompressor after construction to change the threshold, summarizer model, or summary length:
```python
from swarms import Agent
from swarms.agents.context_compressor import ContextCompressor

agent = Agent(
    agent_name="ResearchAgent",
    model_name="claude-sonnet-4-6",
    max_loops=5,
    context_compression=True,
)

agent._context_compressor = ContextCompressor(
    threshold=0.75,                       # compress earlier
    summarizer_model="claude-haiku-4-5",  # cheaper summary model
    summarizer_temperature=0.1,
    summarizer_max_tokens=3000,
)
```
Lower the threshold for agents with large tool outputs so compression fires before any single iteration can overflow the window.
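
As a quick sanity check on threshold choice, here is the headroom arithmetic for a hypothetical 200k-token context window (the numbers are illustrative, not a Swarms default):

```python
# Headroom left after compression fires, for a hypothetical 200k window.
context_length = 200_000

for threshold in (0.9, 0.75, 0.6):
    fire_at = int(threshold * context_length)
    headroom = context_length - fire_at
    print(f"threshold={threshold}: fires at {fire_at:,} tokens, "
          f"leaving {headroom:,} tokens of headroom")
```

An agent whose tool calls can return tens of thousands of tokens in one iteration needs that headroom to exceed its largest single-iteration output.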

Manual compaction

You can compact memory yourself at any time — useful after a clear milestone (research phase done, plan finalized):
```python
agent.short_memory.compact(
    summary=(
        "Researched cloud data warehouses. "
        "User prefers GCP. Shortlist: BigQuery, AlloyDB, ClickHouse Cloud."
    )
)
```
Manual compaction follows the same archive → wipe → re-seed flow as automatic compression.

Disable compression

When you want the active MEMORY.md to keep the raw transcript intact:
```python
from swarms import Agent

agent = Agent(
    agent_name="StaticAgent",
    model_name="claude-sonnet-4-6",
    max_loops="auto",
    context_compression=False,
)
```
Use this for short tasks, or when downstream tooling parses the unmodified transcript.

What ends up on disk

After compaction:
```text
$WORKSPACE_DIR/agents/ResearchAgent/
|-- MEMORY.md                                  # header + compressed summary
`-- archive/
    `-- history_2026-04-20_18-44-12.md         # full pre-compaction transcript
```
On the next run, Swarms preloads the compact summary from MEMORY.md — the archive is preserved for forensics but does not enter the active context.
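
That preload behavior can be pictured as follows. The paths match the layout above, but the loader function itself is an assumption for illustration, not the Swarms API:

```python
# Illustrative: only MEMORY.md enters the active context on the next run;
# archive/history_*.md is deliberately never read here.
from pathlib import Path

def preload_memory(agent_dir: Path) -> str:
    memory = agent_dir / "MEMORY.md"
    return memory.read_text() if memory.exists() else ""
```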

Tips

  • Keep compression on for autonomous loops; the cost of one summary call is small versus a context-overflow failure.
  • Lower threshold (0.6–0.75) for agents that emit long structured outputs.
  • Use a cheaper summarizer_model (Haiku) to keep compaction lightweight.
  • Compact manually at major milestones to lock in important state with a hand-written summary.

See also