Overview
The Agent class is the backbone of the Swarms framework, connecting LLMs with tools, long-term memory, and advanced autonomous capabilities. It provides a production-ready interface for building intelligent agents that can reason, use tools, handle multimodal inputs, and execute complex tasks.
Import
Key Features
- Tool Integration: Native support for function calling and tool execution
- Long-term Memory: RAG-based memory system for context retention
- Autonomous Loops: Dynamic execution with configurable stopping conditions
- Multi-modal Support: Process text, images, and other media
- MCP Support: Integration with Model Context Protocol servers
- Agent Handoffs: Delegate tasks to specialized agents
- Streaming: Real-time token streaming with callbacks
- Fallback Models: Automatic failover to backup models
- State Management: Autosave and state persistence
Initialization
Unique identifier for the agent instance
The name of the agent, used for identification and logging
A description of the agent’s purpose and capabilities. Shown to orchestrators when routing tasks.
The system prompt that defines the agent’s behavior and personality
The language model instance to use. If None, a LiteLLM instance will be created
The LiteLLM-compatible model identifier (e.g. "gpt-4.1", "claude-sonnet-4-6", "groq/llama-3.3-70b-versatile").
Extra keyword arguments forwarded to the underlying LiteLLM client.
Base URL for OpenAI-compatible providers (Ollama, LM Studio, vLLM, etc.).
Override API key for the LLM provider. Falls back to environment variables when unset.
Single fallback model used when the primary model fails.
Maximum number of reasoning loops. Use "auto" for autonomous mode with dynamic planning
List of callable functions that the agent can use as tools
Temperature for LLM sampling (0.0 to 1.0)
Maximum number of tokens in the LLM response
Effective context window in tokens. When context_compression=True, the agent compresses memory once usage crosses 90% of this limit.
Nucleus-sampling parameter. Stripped automatically for Anthropic models when extended thinking is enabled.
Allow the framework to grow/shrink the per-call context budget based on token usage signals.
When True, the agent runs a ContextCompressor that summarises long histories at 90% of context_length so long sessions never hit the context wall.
When True, read/write MEMORY.md under the workspace so agent state survives process restarts. Set False for stateless tasks.
Optional pre/post-processing transforms applied to the conversation history.
Enable basic streaming with formatted panels
Enable detailed token-by-token streaming with metadata (citations, tokens used, etc.)
Callback function to receive streaming tokens in real-time. Use with agent.run_stream / agent.arun_stream for generator-style consumption.
Enable interactive mode (REPL-style): prompt the user for input between loops.
Enable verbose logging for debugging.
When False, suppress the agent’s printed output (Rich panels, thinking panel, etc.). Token streams via arun_stream / streaming_callback are unaffected.
Output format: "str", "string", "list", "json", "dict", "yaml", "xml"
Automatically save agent state during execution
Display agent dashboard on initialization
Long-term memory backend (e.g. vector database) for RAG.
List of fallback models to try in order if the primary model fails.
Number of retry attempts for LLM calls
Interval in seconds between retry attempts
Token that signals the agent to stop execution
Function that returns True when the agent should stop
Alternative stopping function
Enable dynamic temperature adjustment during execution
Enable dynamic loop count adjustment (sets max_loops="auto")
Seconds to wait between consecutive loop iterations.
Token the user can type in interactive mode to exit the loop.
When True, append the framework’s preset stopping marker to the system prompt.
Auto-generate a system prompt from the task description when one is not provided.
Name of the user in conversation history
Path to save agent state
Standard operating procedure for the agent
List of standard operating procedures
Rules that govern agent behavior
Prompt for planning phase
Enable planning phase before execution
Enable multi-modal processing (images, etc.).
When multiple images are provided, summarise them into a single context entry before invoking the LLM.
After every tool call, run a brief LLM summary of the tool result and add it to the conversation.
Number of times to retry a failing tool call before giving up.
Display tool inputs/outputs in the agent’s printed output.
Pre-built OpenAI function-calling tool schemas. Use when you want to bypass the auto-generated schema.
Override tool schema used at runtime.
Optional post-processor applied to the agent’s output before returning.
Pydantic models registered for structured-output prompting.
URL or connection object for a single MCP server.
List of MCP server URLs for connecting to multiple servers.
Single MCP connection configuration object.
List of agents to enable task handoffs/delegation
Free-form list of agent capabilities used for routing and documentation.
The agent’s role within a swarm (e.g. "worker", "director").
Tags used to filter or categorise the agent.
Structured list of intended use cases for documentation/marketplace listings.
Execution mode: interactive (REPL), fast (minimal logging/decoration), or standard.
UUID of a prompt from the Swarms marketplace to use as the system prompt.
When True, publish this agent to the Swarms marketplace on initialization.
Path to a directory of Agent Skills (Anthropic SKILL.md format).
Tools to enable for the autonomous looper when max_loops="auto". Use "all" or a list of tool names.
Enable ReAct-style reasoning prompting.
Whether to prepend the framework’s reasoning preamble to the system prompt.
Enable reasoning mode for supported models (e.g. o1, o3, Claude with extended thinking).
Effort level for reasoning models: "low", "medium", or "high".
Maximum extended-thinking budget for Claude reasoning models.
Prepend the framework’s safety preamble to the system prompt.
Randomly select from a pool of models on each call (load-balancing/experimentation).
Optional tokenizer instance used for local token counting.
Path from which to load saved agent state on init.
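The stopping-related parameters above (a stop token and a function that returns True when the agent should stop) can be pictured as a small predicate. The sketch below assumes the stopping function is handed the latest response text; the `<DONE>` marker, the name `should_stop`, and the length cap are hypothetical choices for illustration, not framework defaults.

```python
STOP_TOKEN = "<DONE>"  # hypothetical marker, not a framework default


def should_stop(response: str, max_chars: int = 2000) -> bool:
    """Return True when the agent should stop looping.

    Assumes the callable receives the latest response text.
    """
    # Stop as soon as the model emits the agreed-upon marker.
    if STOP_TOKEN in response:
        return True
    # Also stop if the response grows past a safety cap.
    return len(response) > max_chars


print(should_stop("Still working on step 2"))   # False
print(should_stop("Final answer: 42 <DONE>"))   # True
```

Keeping the predicate pure (no I/O, no shared state) makes it easy to unit-test independently of any model call.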
Methods
run
Execute the agent’s main loop for a given task.
The task or prompt for the agent to process
Optional image path or data for vision-enabled models
Agent output formatted according to output_type configuration
call
Alternative syntax for running the agent (calls run internally).
arun
Async version of run.
run_concurrent
Run a single task concurrently using the agent’s internal executor. Returns the awaited result.
run_concurrent_tasks
Run a batch of tasks concurrently via a thread pool.
bulk_run
Generate responses for multiple input sets. Each input is a dict of kwargs forwarded to run.
save
Save the agent’s current state to disk.
load
Load agent state from a saved file.
to_dict
Convert agent configuration to dictionary.
to_json
Convert agent configuration to JSON string.
to_yaml
Convert agent configuration to YAML string.
add_tool / add_tools
Dynamically add a tool (or list of tools) to the agent at runtime.
remove_tool / remove_tools
Remove a previously-registered tool (or list of tools).
reset
Reset the agent’s memory and state.
print_dashboard
Display the agent’s configuration dashboard.
update_system_prompt / update_max_loops / update_loop_interval / update_retry_attempts / update_retry_interval
In-place setters for runtime reconfiguration.
plan
Run only the planning phase for a task without executing.
add_memory
Append a message to the agent’s short-term memory.
enable_autosave / disable_autosave / cleanup
Control the agent’s background autosave loop.
run_stream
Run the agent and yield response tokens one-by-one as a sync generator. The full agent loop (multi-step reasoning, tool calls, MCP, autonomous plan/execute/summary) runs in a background daemon thread; tokens are forwarded to the caller the moment the LLM emits them.
arun_stream
Async generator version of run_stream. The agent loop runs in a thread executor while tokens are forwarded through an asyncio.Queue, so the caller’s event loop is never blocked.
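The background-thread pattern described for run_stream can be sketched in plain Python. This is a minimal illustration of the general technique (a daemon thread producing tokens into a queue that a generator drains), not the framework's implementation; `stream_tokens` and `fake_agent_loop` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Iterator

_DONE = object()  # sentinel marking the end of the stream


def stream_tokens(produce: Callable[[Callable[[str], None]], None]) -> Iterator[str]:
    """Run `produce` in a daemon thread and yield each token it emits."""
    q: queue.Queue = queue.Queue()

    def worker() -> None:
        try:
            produce(q.put)   # the producer calls q.put(token) once per token
        finally:
            q.put(_DONE)     # always signal completion, even on error

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()       # blocks until the next token arrives
        if item is _DONE:
            return
        yield item


def fake_agent_loop(emit: Callable[[str], None]) -> None:
    """Stand-in for the agent loop: emits a few tokens."""
    for token in ["Hello", ", ", "world"]:
        emit(token)


print("".join(stream_tokens(fake_agent_loop)))  # Hello, world
```

The sentinel in the `finally` clause is what lets the consumer terminate cleanly when the producer raises instead of blocking forever on `q.get()`.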
Both run_stream and arun_stream work for any max_loops value (1, integer > 1 with tools, or "auto"). They stream tokens through every internal loop, including tool-call turns, synthesis turns after a tool returns, and the autonomous plan/execute/summary cycle.
Examples
Basic Usage
Agent with Tools
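Because the tools parameter takes a list of callable functions, a tool is an ordinary Python function. The sketch below shows the shape this page recommends (clear signature, type hints, docstring); `get_word_count` is a hypothetical tool, and how the framework turns it into a schema is not shown here.

```python
def get_word_count(text: str) -> int:
    """Count the whitespace-separated words in a piece of text.

    Args:
        text: The text to analyse.

    Returns:
        The number of words found.
    """
    return len(text.split())


# The kind of list a tools parameter expects: plain callables.
tools = [get_word_count]

print(get_word_count("agents call plain Python functions"))  # 5
```

A descriptive docstring matters because it is typically the only description of the tool that the model sees.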
Multi-modal Agent
Autonomous Agent with Auto Loops
Agent with Streaming
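A streaming callback is a function the framework calls as tokens arrive. The sketch below assumes the callback receives one string per call; `on_token` and the simulated token feed are hypothetical and stand in for a real agent run.

```python
from typing import List

received: List[str] = []


def on_token(token: str) -> None:
    """Collect each streamed token and echo it without a newline."""
    received.append(token)
    print(token, end="", flush=True)


# Stand-in for the framework invoking the callback once per token.
for tok in ["Stream", "ing ", "works"]:
    on_token(tok)
print()
```

Collecting tokens in a list as well as printing them keeps the full response available once the stream ends.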
Agent with Fallback Models
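The failover idea can be illustrated without any provider. The sketch below assumes each model gets a fixed number of attempts before the next one in the list is tried; the framework's actual ordering and retry policy may differ. `call_with_fallbacks`, `flaky_client`, and the model names are hypothetical.

```python
from typing import Callable, List, Optional


def call_with_fallbacks(
    task: str,
    models: List[str],
    call_model: Callable[[str, str], str],
    retry_attempts: int = 2,
) -> str:
    """Try each model in order, retrying before moving to the next one."""
    last_error: Optional[Exception] = None
    for model in models:
        for _ in range(retry_attempts):
            try:
                return call_model(model, task)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
    raise RuntimeError(f"all models failed: {models}") from last_error


def flaky_client(model: str, task: str) -> str:
    """Hypothetical client in which the primary model always fails."""
    if model == "primary-model":
        raise ConnectionError("primary unavailable")
    return f"[{model}] {task}"


print(call_with_fallbacks("ping", ["primary-model", "backup-model"], flaky_client))
# [backup-model] ping
```

Chaining the final error with `from last_error` preserves the underlying cause for debugging when every model fails.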
Agent with MCP Integration
Agent Handoffs
Marketplace Prompt Loading
Output Types
The agent supports multiple output formats via the output_type parameter:
- "str" or "string": Returns the last response as a string
- "str-all-except-first": Returns all responses except system prompt as string (default)
- "list": Returns conversation as a list of messages
- "json": Returns conversation as JSON string
- "dict": Returns conversation as dictionary
- "yaml": Returns conversation as YAML string
- "xml": Returns conversation as XML string
Error Handling
The Agent class includes comprehensive error handling:
- AgentInitializationError: Raised when agent fails to initialize
- AgentRunError: Raised when execution fails
- AgentLLMError: Raised when LLM encounters issues
- AgentToolError: Raised when tool execution fails
- AgentMemoryError: Raised for memory-related issues
Best Practices
- Set appropriate max_loops: Use 1 for simple tasks, higher numbers for complex reasoning, or "auto" for autonomous planning
- Use tools wisely: Provide well-documented tools with clear function signatures and docstrings
- Enable autosave for long-running tasks: Prevents data loss on interruption
- Configure fallback models: Ensures reliability in production
- Use streaming for real-time feedback: Better user experience for long-running tasks
- Set context_length appropriately: Prevents token limit errors
- Enable verbose mode during development: Helps debug issues
- Use agent handoffs for complex workflows: Delegate subtasks to specialized agents
Related
- BaseSwarm - Base class for multi-agent systems
- BaseStructure - Foundation for swarm structures
- Tools - Creating and using agent tools
- Memory - Long-term memory systems