Skip to content

ImageAgentBatchProcessor Documentation

Overview

The ImageAgentBatchProcessor is a high-performance parallel image processing system designed for running AI agents on multiple images concurrently. It provides robust error handling, logging, and flexible configuration options.

Installation

pip install swarms

Class Arguments

Parameter Type Default Description
agents Union[Agent, List[Agent], Callable, List[Callable]] Required Single agent or list of agents to process images
max_workers int None Maximum number of parallel workers (defaults to 95% of CPU cores)
supported_formats List[str] ['.jpg', '.jpeg', '.png'] List of supported image file extensions

Methods

run()

Description: Main method for processing multiple images in parallel with configured agents. Can handle single images, multiple images, or entire directories.

Arguments:

Parameter Type Required Description
image_paths Union[str, List[str], Path] Yes Single image path, list of paths, or directory path
tasks Union[str, List[str]] Yes Single task or list of tasks to perform on each image

Returns: List[Dict[str, Any]] - List of processing results for each image

Example:

from swarms import Agent
from swarms.structs import ImageAgentBatchProcessor
from pathlib import Path

# Initialize agent and processor
agent = Agent(api_key="your-api-key", model="gpt-4-vision")
processor = ImageAgentBatchProcessor(agents=agent)

# Example 1: Process single image
results = processor.run(
    image_paths="path/to/image.jpg",
    tasks="Describe this image"
)

# Example 2: Process multiple images
results = processor.run(
    image_paths=["image1.jpg", "image2.jpg"],
    tasks=["Describe objects", "Identify colors"]
)

# Example 3: Process directory
results = processor.run(
    image_paths=Path("./images"),
    tasks="Analyze image content"
)

_validate_image_path()

Description: Internal method that validates if an image path exists and has a supported format.

Arguments:

Parameter Type Required Description
image_path Union[str, Path] Yes Path to the image file to validate

Returns: Path - Validated Path object

Example:

from swarms.structs import ImageAgentBatchProcessor, ImageProcessingError
from pathlib import Path

processor = ImageAgentBatchProcessor(agents=agent)

try:
    validated_path = processor._validate_image_path("image.jpg")
    print(f"Valid image path: {validated_path}")
except ImageProcessingError as e:
    print(f"Invalid image path: {e}")

_process_single_image()

Description: Internal method that processes a single image with one agent and one or more tasks.

Arguments:

Parameter Type Required Description
image_path Path Yes Path to the image to process
tasks Union[str, List[str]] Yes Tasks to perform on the image
agent Agent Yes Agent to use for processing

Returns: Dict[str, Any] - Processing results for the image

Example:

from swarms import Agent
from swarms.structs import ImageAgentBatchProcessor
from pathlib import Path

agent = Agent(api_key="your-api-key", model="gpt-4-vision")
processor = ImageAgentBatchProcessor(agents=agent)

try:
    result = processor._process_single_image(
        image_path=Path("image.jpg"),
        tasks=["Describe image", "Identify objects"],
        agent=agent
    )
    print(f"Processing results: {result}")
except Exception as e:
    print(f"Processing failed: {e}")

call()

Description: Makes the ImageAgentBatchProcessor callable like a function. Redirects to the run() method.

Arguments:

Parameter Type Required Description
*args Any No Variable length argument list passed to run()
**kwargs Any No Keyword arguments passed to run()

Returns: List[Dict[str, Any]] - Same as run() method

Example:

from swarms import Agent
from swarms.structs import ImageAgentBatchProcessor

# Initialize
agent = Agent(api_key="your-api-key", model="gpt-4-vision")
processor = ImageAgentBatchProcessor(agents=agent)

# Using __call__
results = processor(
    image_paths=["image1.jpg", "image2.jpg"],
    tasks="Describe the image"
)

# This is equivalent to:
results = processor.run(
    image_paths=["image1.jpg", "image2.jpg"],
    tasks="Describe the image"
)

Return Format

The processor returns a list of dictionaries with the following structure:

{
    "image_path": str,          # Path to the processed image
    "results": {               # Results for each task
        "task_name": result,   # Task-specific results
    },
    "processing_time": float   # Processing time in seconds
}

Complete Usage Examples

1. Basic Usage with Single Agent

from swarms import Agent
from swarms.structs import ImageAgentBatchProcessor

# Initialize an agent
agent = Agent(
    api_key="your-api-key",
    model="gpt-4-vision"
)

# Create processor
processor = ImageAgentBatchProcessor(agents=agent)

# Process single image
results = processor.run(
    image_paths="path/to/image.jpg",
    tasks="Describe this image in detail"
)

2. Processing Multiple Images with Multiple Tasks

# Initialize with multiple agents
agent1 = Agent(api_key="key1", model="gpt-4-vision")
agent2 = Agent(api_key="key2", model="claude-3")

processor = ImageAgentBatchProcessor(
    agents=[agent1, agent2],
    supported_formats=['.jpg', '.png', '.webp']
)

# Define multiple tasks
tasks = [
    "Describe the main objects in the image",
    "What is the dominant color?",
    "Identify any text in the image"
]

# Process a directory of images
results = processor.run(
    image_paths="path/to/image/directory",
    tasks=tasks
)

# Process results
for result in results:
    print(f"Image: {result['image_path']}")
    for task, output in result['results'].items():
        print(f"Task: {task}")
        print(f"Result: {output}")
    print(f"Processing time: {result['processing_time']:.2f} seconds")

3. Custom Error Handling

from swarms.structs import ImageAgentBatchProcessor, ImageProcessingError

try:
    processor = ImageAgentBatchProcessor(agents=agent)
    results = processor.run(
        image_paths=["image1.jpg", "image2.png", "invalid.txt"],
        tasks="Analyze the image"
    )
except ImageProcessingError as e:
    print(f"Image processing failed: {e}")
except InvalidAgentError as e:
    print(f"Agent configuration error: {e}")

Best Practices

Best Practice Description
Resource Management • The processor automatically uses 95% of available CPU cores
• For memory-intensive operations, consider reducing max_workers
Error Handling • Always wrap processor calls in try-except blocks
• Check the results for any error keys
Task Design • Keep tasks focused and specific
• Group related tasks together for efficiency
Performance Optimization • Process images in batches for better throughput
• Use multiple agents for different types of analysis

Limitations

Limitation Description
File Format Support Only supports image file formats specified in supported_formats
Agent Requirements Requires valid agent configurations
Resource Scaling Memory usage scales with number of concurrent processes

This documentation provides a comprehensive guide to using the ImageAgentBatchProcessor. The class is designed to be both powerful and flexible, allowing for various use cases from simple image analysis to complex multi-agent processing pipelines.