What you can build
The voice-agents package plugs directly into any Swarms agent through the standard streaming_callback parameter. Tokens stream straight from the LLM into a streaming text-to-speech (TTS) pipeline, so the agent’s response starts playing as soon as the first sentence is generated, instead of waiting for the agent to finish before speaking.
| Pattern | Example | What it shows |
|---|---|---|
| Basic post-run TTS | Basic Speech Agent | Run the agent normally, then narrate the final result. |
| Streaming TTS callback | Streaming Voice Agent | Speak each sentence as the LLM produces it. |
| Autonomous loop + bash + voice | Autonomous Voice Agent | max_loops="auto" agent with terminal access narrating its work. |
| Multi-agent debate | Voice Debate | Two agents alternate, each with a distinct voice. Optional STT input. |
| Hierarchical swarm | Hierarchical Speech Swarm | Director and workers, each with their own voice. |
Prerequisites
Install
API keys
Set the API keys for the LLM that drives the agent and for the TTS provider (OpenAI’s tts-1 is the default).
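A quick pre-flight check can catch a missing key before the agent starts. The sketch below is only an illustration: missing_keys is a hypothetical helper, and OPENAI_API_KEY is assumed here because it is the variable the OpenAI SDK reads by default. Substitute whatever variable your LLM provider expects.

```python
import os
from typing import Iterable, List, Mapping


def missing_keys(
    required: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`.

    Hypothetical helper for illustration; not part of voice-agents or Swarms.
    """
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # Assumption: OpenAI is used for TTS, so OPENAI_API_KEY is required.
    absent = missing_keys(["OPENAI_API_KEY"])
    if absent:
        print("Missing environment variables: " + ", ".join(absent))
```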
How the integration works
StreamingTTSCallback is a callable that accepts one token at a time, buffers it sentence-by-sentence, and dispatches each sentence to the configured TTS engine. Because it implements the streaming_callback contract, it works anywhere Swarms exposes per-token callbacks — single agents, autonomous loops, hierarchical swarms, debates, etc.
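To make the buffering behaviour concrete, here is a minimal, self-contained sketch of a sentence-buffering token callback. It is not the voice-agents implementation: SentenceBufferingCallback and the speak function are hypothetical stand-ins, and the sentence-boundary rule (a ., ! or ? followed by whitespace) is an assumption chosen for simplicity.

```python
import re
from typing import Callable, List


class SentenceBufferingCallback:
    """Illustrative stand-in for a streaming TTS callback (hypothetical)."""

    # Assumed boundary rule: sentence punctuation followed by whitespace.
    _BOUNDARY = re.compile(r"(?<=[.!?])\s+")

    def __init__(self, speak: Callable[[str], None]) -> None:
        self._speak = speak  # would hand text to a TTS engine in practice
        self._buffer = ""

    def __call__(self, token: str) -> None:
        """Accept one token; dispatch any sentences it completes."""
        self._buffer += token
        parts = self._BOUNDARY.split(self._buffer)
        # Every part except the last is a complete sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                self._speak(sentence.strip())
        # The last part may still be growing, so keep it buffered.
        self._buffer = parts[-1]

    def flush(self) -> None:
        """Dispatch whatever is left in the buffer at the end of a run."""
        remainder = self._buffer.strip()
        if remainder:
            self._speak(remainder)
        self._buffer = ""


if __name__ == "__main__":
    spoken: List[str] = []
    callback = SentenceBufferingCallback(speak=spoken.append)
    for token in ["Hello", " world", ".", " How", " are", " you", "?", " Fine"]:
        callback(token)
    callback.flush()  # without this, the trailing "Fine" is never spoken
    print(spoken)
```

Because the object is just a callable that takes one token, anything shaped like this can be passed wherever a per-token callback is accepted. The final flush() call is what releases text that never reached a sentence boundary.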
Available voices
OpenAI’s TTS engine supports six voices out of the box: alloy, echo, fable, onyx, nova, shimmer. When several agents speak, give each one a distinct voice so listeners can tell them apart.
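One simple way to keep voices distinct is to assign them up front and fail fast if there are more agents than voices. The sketch below assumes only the six voice names listed above; assign_voices and the agent names are hypothetical and not part of voice-agents.

```python
from typing import Dict, Iterable

# The six voices named in this section.
OPENAI_TTS_VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")


def assign_voices(agent_names: Iterable[str]) -> Dict[str, str]:
    """Map each agent name to its own voice, in listed order.

    Hypothetical helper for illustration only.
    """
    names = list(agent_names)
    if len(set(names)) != len(names):
        raise ValueError("agent names must be unique")
    if len(names) > len(OPENAI_TTS_VOICES):
        raise ValueError("more agents than distinct voices available")
    return dict(zip(names, OPENAI_TTS_VOICES))


if __name__ == "__main__":
    print(assign_voices(["Director", "Researcher", "Writer"]))
```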
Always call tts_callback.flush() at the end of every run. The streaming callback buffers the last sentence until the agent emits a sentence terminator; flush() forces it out.
Related
- Agent Streaming — the underlying token-streaming mechanism the voice callback uses.
- voice-agents on PyPI — package source and TTS/STT API reference.