

Overview

SkillOrchestra is a skill-aware agent orchestration system based on the paper “SkillOrchestra: Learning to Route Agents via Skill Transfer”. Instead of end-to-end RL routing, it maintains a Skill Handbook that profiles each agent on fine-grained skills, uses an LLM to infer which skills a task requires, and matches agents to tasks via explicit competence-cost scoring.

Installation

pip install -U swarms

How It Works

SkillOrchestra routes tasks through a 5-step pipeline:
Task -> Skill Inference -> Agent Scoring -> Agent Selection -> Execution -> Learning
  1. Skill Inference — An LLM analyzes the incoming task and identifies which fine-grained skills are required (e.g., python_coding, data_analysis, technical_writing), each with an importance weight.
  2. Agent Scoring — Each agent is scored using a weighted competence-cost formula against the required skills. This step is pure math — no LLM calls.
  3. Agent Selection — The top-k agents with the highest scores are selected.
  4. Execution — Selected agents execute the task. Multiple agents run concurrently via ThreadPoolExecutor.
  5. Learning (optional) — An LLM evaluates the output quality, and agent skill profiles are updated via exponential moving average (EMA).
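The five steps above can be sketched in plain Python. This is a schematic, not the library's internals: the stub functions stand in for the real LLM calls and agents, and all names here are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def infer_skills(task):
    # Step 1 (stub): the real system asks an LLM to map the task
    # to required skills with importance weights.
    return {"python_coding": 0.8, "technical_writing": 0.2}

def score_agent(agent, required):
    # Step 2: weighted competence score over required skills.
    # Pure math, no LLM calls (cost term omitted for brevity).
    return sum(agent["skills"].get(s, 0.0) * w for s, w in required.items()) / sum(required.values())

def route(task, agents, top_k=1):
    required = infer_skills(task)
    ranked = sorted(agents, key=lambda a: score_agent(a, required), reverse=True)
    selected = ranked[:top_k]                  # Step 3: top-k selection
    with ThreadPoolExecutor() as pool:         # Step 4: concurrent execution
        results = list(pool.map(lambda a: a["fn"](task), selected))
    return results                             # Step 5 (learning) would update profiles here

agents = [
    {"name": "coder", "skills": {"python_coding": 0.9}, "fn": lambda t: f"code for: {t}"},
    {"name": "writer", "skills": {"technical_writing": 0.9}, "fn": lambda t: f"doc for: {t}"},
]
print(route("Write a Python script", agents))  # the coder agent wins the routing
```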

Scoring Formula

For each agent, the score is computed as:
score = sum(competence_weight * competence_i * importance_i + cost_weight * normalized_cost_i * importance_i) / total_importance
Where:
  • competence_i is the agent’s estimated probability of success on skill i
  • normalized_cost_i is 1 - (cost - min_cost) / (max_cost - min_cost) (lower cost = higher score)
  • importance_i is how important the skill is for the task
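The formula translates to a few lines of Python. This is a sketch, not the library's code; it assumes min_cost and max_cost are taken over the candidate agents' costs, as the normalization above implies.

```python
def agent_score(competence, cost, importance, all_costs,
                competence_weight=0.7, cost_weight=0.3):
    """Score one agent against the skills a task requires.

    competence, cost, importance: dicts keyed by skill name.
    all_costs: every candidate agent's cost, used to normalize into [0, 1].
    """
    min_c, max_c = min(all_costs), max(all_costs)
    span = (max_c - min_c) or 1.0  # avoid division by zero when all costs are equal
    total = 0.0
    for skill, imp in importance.items():
        norm_cost = 1.0 - (cost[skill] - min_c) / span  # lower cost -> higher score
        total += competence_weight * competence[skill] * imp + cost_weight * norm_cost * imp
    return total / sum(importance.values())

# A competent agent that is also the cheapest candidate scores near the maximum:
score = agent_score({"python_coding": 0.9}, {"python_coding": 1.0},
                    {"python_coding": 1.0}, all_costs=[1.0, 5.0])
# 0.7 * 0.9 + 0.3 * 1.0 = 0.93
```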

Key Components

Data Models

SkillDefinition: A fine-grained skill with name, description, and optional category
AgentSkillProfile: An agent’s competence (0-1) and cost on a specific skill, with execution statistics
AgentProfile: Complete skill profile for a single agent
SkillHandbook: Central data structure mapping all skills to all agent profiles
TaskSkillInference: Skills an LLM infers a given task requires, with importance weights
AgentSelectionResult: Result of agent scoring with name, score, and reasoning
ExecutionFeedback: Post-execution quality assessment for updating skill profiles
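The shape of these models can be approximated with dataclasses. The library's actual classes appear to be Pydantic models (SkillHandbook.model_validate is used later on this page), and any field beyond those described above is a guess; treat this as an orientation sketch only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class SkillDefinition:
    name: str
    description: str
    category: Optional[str] = None  # optional grouping, e.g. "coding"

@dataclass
class AgentSkillProfile:
    skill_name: str
    competence: float     # estimated probability of success, 0-1
    cost: float           # relative cost of using this agent for the skill
    executions: int = 0   # execution statistics (illustrative field name)

@dataclass
class AgentProfile:
    agent_name: str
    skills: Dict[str, AgentSkillProfile] = field(default_factory=dict)

@dataclass
class SkillHandbook:
    skills: List[SkillDefinition] = field(default_factory=list)
    agents: Dict[str, AgentProfile] = field(default_factory=dict)

profile = AgentSkillProfile("python_coding", competence=0.85, cost=2.0)
```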

Attributes

name (str, default "SkillOrchestra"): Name identifier for the orchestrator.
description (str, default "Skill-aware agent orchestration..."): Description of the orchestrator’s purpose.
agents (List[Union[Agent, Callable]], required): List of agents to orchestrate (at least 1 required).
max_loops (int, default 1): Maximum execution-feedback loops per task.
output_type (OutputType, default "dict"): Output format: "dict", "str", "json", "final", etc.
model (str, default "gpt-5.4"): LLM model for skill inference and evaluation.
temperature (float, default 0.1): LLM temperature for inference calls.
skill_handbook (Optional[SkillHandbook], default None): Pre-built skill handbook. If None, auto-generated from agent descriptions.
auto_generate_skills (bool, default True): Whether to auto-generate a handbook when none is provided.
cost_weight (float, default 0.3): Weight for the cost component in scoring (0-1).
competence_weight (float, default 0.7): Weight for the competence component in scoring (0-1).
top_k_agents (int, default 1): Number of agents to select per task.
learning_enabled (bool, default True): Whether to update skill profiles after execution via EMA.
learning_rate (float, default 0.1): EMA learning rate for profile updates.
autosave (bool, default True): Whether to save conversation history and the handbook to disk.
verbose (bool, default False): Whether to log detailed information.
print_on (bool, default True): Whether to print panels to the console.
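Putting the main attributes together, a constructor call looks like the fragment below. The agent variables are placeholders for your own pre-built agents; every keyword matches the reference above.

```python
from swarms import Agent, SkillOrchestra

orchestra = SkillOrchestra(
    agents=[coder_agent, writer_agent],  # placeholders for your Agent instances
    top_k_agents=1,           # route each task to the single best agent
    competence_weight=0.7,    # favor quality...
    cost_weight=0.3,          # ...over cost
    learning_enabled=True,    # refine skill profiles after each run
    learning_rate=0.1,
    verbose=False,
)
```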

Methods

run()

Run the full pipeline on a single task.
def run(self, task: str, img: Optional[str] = None, imgs: Optional[List[str]] = None) -> Any
Parameters:
  • task (str): The task to execute
  • img (Optional[str]): Optional image input
  • imgs (Optional[List[str]]): Optional list of image inputs
Returns: Result in the specified output_type format

__call__()

Callable interface that delegates to run().
def __call__(self, task: str, *args, **kwargs) -> Any
Parameters:
  • task (str): The task to execute
Returns: Result from run()

batch_run()

Run multiple tasks sequentially.
def batch_run(self, tasks: List[str]) -> List[Any]
Parameters:
  • tasks (List[str]): List of tasks to execute
Returns: List[Any] - List of results, one per task

concurrent_batch_run()

Run multiple tasks concurrently.
def concurrent_batch_run(self, tasks: List[str]) -> List[Any]
Parameters:
  • tasks (List[str]): List of tasks to execute
Returns: List[Any] - List of results from concurrent execution

get_handbook()

Return the current skill handbook as a dictionary.
def get_handbook(self) -> dict
Returns: dict - The skill handbook data

update_handbook()

Replace the skill handbook.
def update_handbook(self, handbook: SkillHandbook) -> None
Parameters:
  • handbook (SkillHandbook): The new skill handbook

Architecture

[Diagrams omitted: Pipeline Flow, Scoring and Selection, Execution Modes.]

Best Practices

Agent Design

  • Write descriptive agent descriptions — The auto-generated skill handbook is only as good as your agent descriptions. Be specific about what each agent can do.
  • Use distinct specializations — Agents with overlapping skills reduce the effectiveness of skill-based routing. Make each agent clearly specialized.
  • Keep system prompts focused — System prompts should reinforce the agent’s specialization, not try to make the agent a generalist.

Tuning Weights

  • Default (0.7 competence / 0.3 cost) — Good for most use cases where quality matters more than cost.
  • High competence weight (0.9 / 0.1) — Use when quality is critical and cost is not a concern.
  • Balanced (0.5 / 0.5) — Use when you want a balance between quality and cost efficiency.
  • High cost weight (0.3 / 0.7) — Use for high-volume, cost-sensitive workloads where “good enough” is acceptable.
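The effect of the weights shows up in a tiny numeric example. This is hand-rolled arithmetic following the scoring formula above, not library code: with one required skill, an expensive expert and a cheap generalist trade places as the cost weight rises.

```python
def score(competence, norm_cost, competence_weight, cost_weight):
    # Single-skill case of the scoring formula (importance cancels out).
    return competence_weight * competence + cost_weight * norm_cost

# An expert (competence 0.9, moderately expensive -> normalized cost 0.4)
# versus a cheap generalist (competence 0.6, cheapest -> normalized cost 1.0).
default_expert = score(0.9, 0.4, 0.7, 0.3)  # 0.75 -> expert is selected
default_cheap  = score(0.6, 1.0, 0.7, 0.3)  # 0.72
costly_expert  = score(0.9, 0.4, 0.3, 0.7)  # 0.55
costly_cheap   = score(0.6, 1.0, 0.3, 0.7)  # 0.88 -> cheap agent is selected
```

Swapping the weights from 0.7/0.3 to 0.3/0.7 flips the ranking without touching the agents themselves, which is why weight tuning alone can retarget a deployment from quality-first to cost-first.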

Learning Configuration

  • learning_rate=0.1 (default) — Slow adaptation, stable profiles. Good for production.
  • learning_rate=0.3 — Faster adaptation. Good for initial calibration of a new team.
  • max_loops=1 — Single pass, no refinement. Best for simple tasks.
  • max_loops=2-3 — Execute, evaluate, refine. Good for complex tasks that benefit from iterative improvement.
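The EMA update itself is a one-liner, which makes the effect of learning_rate easy to see. This is a sketch of the update rule as described above, not the library's code.

```python
def ema_update(old_competence, observed_quality, learning_rate=0.1):
    # Exponential moving average: small rates keep profiles stable,
    # large rates adapt quickly to the latest quality evaluation.
    return (1 - learning_rate) * old_competence + learning_rate * observed_quality

# After one perfect-quality run, starting from competence 0.50:
slow = ema_update(0.50, 1.0, learning_rate=0.1)  # 0.55: nudged slightly upward
fast = ema_update(0.50, 1.0, learning_rate=0.3)  # 0.65: moves much faster
```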

Error Handling

from swarms import SkillOrchestra

orchestra = SkillOrchestra(agents=agents)  # agents: your list of Agent instances

try:
    result = orchestra.run(task)
except ValueError as e:
    # Configuration errors (no agents, invalid weights)
    print(f"Configuration error: {e}")
except Exception as e:
    # Execution errors (LLM failures, agent errors)
    print(f"Execution error: {e}")

Inspecting Routing Decisions

Enable verbose=True and print_on=True to see detailed routing information:
from swarms import SkillOrchestra

orchestra = SkillOrchestra(
    agents=agents,
    verbose=True,    # Logs skill inference and scoring details
    print_on=True,   # Prints formatted panels to console
)

Saving and Loading Handbooks

import json
from swarms import SkillOrchestra, SkillHandbook

# Save a tuned handbook
handbook_dict = orchestra.get_handbook()
with open("my_handbook.json", "w") as f:
    json.dump(handbook_dict, f, indent=2)

# Load and reuse later
with open("my_handbook.json") as f:
    data = json.load(f)

handbook = SkillHandbook.model_validate(data)
orchestra = SkillOrchestra(
    agents=agents,
    skill_handbook=handbook,
    auto_generate_skills=False,
)

Source Code

View the source code on GitHub