Multi-Agent System Architectures
Welcome to the architectural guide for building advanced multi-agent systems. This guide moves beyond single-agent setups to explore powerful patterns for orchestrating teams of AI agents. By delegating tasks to specialized agents, you can build more robust, scalable, and maintainable AI applications.
We will explore three distinct architectural patterns using the OpenAI Agents SDK:
- General Triage Model: A central agent routes diverse tasks to the correct specialist.
- Hierarchical Customer Support: A tiered system for handling customer service with clear escalation paths.
- Collaborative Operations Team: An internal-facing system where agents act as a team of department heads to run a business.
1. General Triage Model
This is a fundamental pattern where a primary Triage Agent acts as a smart router. It assesses incoming requests and delegates them to a specialist agent with the appropriate tools and expertise.
Use Case
Ideal for applications that handle a wide variety of tasks, such as a general-purpose assistant that needs to access a knowledge base, manage user-specific memory, or perform business operations.
Architecture
Implementation
The TriageAgent is configured with handoffs to the specialist agents. Each specialist has a narrow set of tools and instructions, making them experts at their specific function.
// Define Specialist Agents
const knowledgeAgent = new Agent({
name: 'Knowledge Base Agent',
handoffDescription: 'For questions about our products, services, or policies.',
tools: [vectorSearchTool],
// ...instructions
});
const memoryAgent = new Agent({
name: 'Memory Agent',
handoffDescription: 'To remember or recall user-specific information.',
tools: [memoryTool],
// ...instructions
});
// Define the Triage Agent
const triageAgent = Agent.create({
name: 'Response AI Triage Agent',
instructions: 'You are a master router. Your job is to delegate tasks to the correct specialist agent.',
handoffs: [knowledgeAgent, memoryAgent /*, ...other specialists */],
});
// Enable handoffs back to the triage agent
knowledgeAgent.handoffs = [triageAgent];
memoryAgent.handoffs = [triageAgent];
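Because each specialist hands back to the triage agent, the handoff configuration forms a directed graph with cycles. The sketch below shows one way to sanity-check that wiring before deployment. It is not SDK code: `AgentNode` is a plain stand-in for the agent objects and `reachableFrom` is a hypothetical helper, both introduced here only to keep the example self-contained.

```typescript
// Minimal stand-in for an agent: only the fields needed to inspect handoffs.
interface AgentNode {
  name: string;
  handoffs: AgentNode[];
}

// Breadth-first walk over handoff edges. Returns the names of every agent
// that can be reached from `start`, including `start` itself.
function reachableFrom(start: AgentNode): Set<string> {
  const seen = new Set<string>([start.name]);
  const queue: AgentNode[] = [start];
  while (queue.length > 0) {
    const current = queue.shift() as AgentNode;
    for (const next of current.handoffs) {
      if (!seen.has(next.name)) {
        seen.add(next.name);
        queue.push(next);
      }
    }
  }
  return seen;
}

// Mirror the wiring from the triage example above.
const knowledgeNode: AgentNode = { name: 'Knowledge Base Agent', handoffs: [] };
const memoryNode: AgentNode = { name: 'Memory Agent', handoffs: [] };
const triageNode: AgentNode = {
  name: 'Response AI Triage Agent',
  handoffs: [knowledgeNode, memoryNode],
};
knowledgeNode.handoffs = [triageNode];
memoryNode.handoffs = [triageNode];

const fromTriage = reachableFrom(triageNode);
const fromKnowledge = reachableFrom(knowledgeNode);
```

A check like this can run in a unit test: if a specialist is added without a path back to the triage agent, its reachable set will be missing the triage agent's name.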
2. Hierarchical Customer Support Model
This pattern builds on the triage model to create a more structured, customer-facing support system. It defines clear roles for different levels of support and includes a dedicated escalation path for complex or sensitive issues.
Use Case
Perfect for building a scalable, AI-powered customer service department that can handle a high volume of requests while providing expert-level support and a great customer experience for difficult cases.
Architecture
Implementation
The key here is the SeniorSupportSpecialist, which has access to a broader set of tools and is given instructions that explicitly grant it authority to override policies or offer compensation.
// Tier 1 Specialist
const orderSupportAgent = new Agent({
name: 'Order Support Specialist',
handoffDescription: 'Handles order tracking, modifications, and cancellations.',
tools: [orderLookupTool, updateOrderTool],
// ...instructions
});
// Tier 2 Escalation Agent
const escalationAgent = new Agent({
name: 'Senior Support Specialist',
handoffDescription: 'For complex issues, policy exceptions, and VIP customer care.',
instructions: `You are a Senior Support Specialist with authority to make exceptions and offer compensation.
## Your Enhanced Authority:
- Offer discounts or store credit.
- Override standard return policies when justified.
- Handle complaints with full resolution power.`,
tools: [orderLookupTool, updateOrderTool, createReturnTool, createTicketTool], // Has more tools
});
// Triage Agent with an escalation path
// (returnsAgent and faqAgent are further Tier 1 specialists, defined in the same way as orderSupportAgent; their definitions are omitted here)
const triageAgent = Agent.create({
  name: 'Customer Service Triage',
  instructions: `Your job is to route customers to the correct specialist. If the customer is angry or the issue is complex, route to the Senior Support Specialist.`,
  handoffs: [orderSupportAgent, returnsAgent, faqAgent, escalationAgent],
});
// Allow Tier 1 agents to escalate
orderSupportAgent.handoffs = [triageAgent, escalationAgent];
returnsAgent.handoffs = [triageAgent, escalationAgent];
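The tiered structure relies on two properties: every Tier 1 agent can reach the escalation agent, and the escalation agent has the broader tool set. The following sketch checks both on a simplified description of the agents. `SupportAgentSpec` and `validateEscalation` are hypothetical names introduced for this illustration, and treating "broader" as "a superset of every Tier 1 tool set" is an assumption rather than something the pattern requires.

```typescript
// Simplified description of a support agent: name, tier, tool names, handoff targets.
interface SupportAgentSpec {
  name: string;
  tier: 1 | 2;
  tools: string[];
  handoffs: string[];
}

// Returns a list of human-readable problems; an empty list means the wiring is consistent.
function validateEscalation(agents: SupportAgentSpec[], escalationName: string): string[] {
  const escalation = agents.find((agent) => agent.name === escalationName);
  if (!escalation) {
    return [`Escalation agent "${escalationName}" is not defined`];
  }
  const problems: string[] = [];
  const escalationTools = new Set(escalation.tools);
  for (const agent of agents) {
    if (agent.tier !== 1) continue;
    // Every Tier 1 agent needs a direct path to the escalation agent.
    if (!agent.handoffs.includes(escalationName)) {
      problems.push(`${agent.name} cannot escalate to ${escalationName}`);
    }
    // Assumption: the escalation agent should hold every tool a Tier 1 agent uses.
    for (const tool of agent.tools) {
      if (!escalationTools.has(tool)) {
        problems.push(`${escalationName} is missing tool ${tool} used by ${agent.name}`);
      }
    }
  }
  return problems;
}

const supportSpecs: SupportAgentSpec[] = [
  {
    name: 'Order Support Specialist',
    tier: 1,
    tools: ['orderLookupTool', 'updateOrderTool'],
    handoffs: ['Customer Service Triage', 'Senior Support Specialist'],
  },
  {
    name: 'Senior Support Specialist',
    tier: 2,
    tools: ['orderLookupTool', 'updateOrderTool', 'createReturnTool', 'createTicketTool'],
    handoffs: [],
  },
];

const escalationProblems = validateEscalation(supportSpecs, 'Senior Support Specialist');
```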
3. Collaborative Operations Team Model
This architecture is designed for internal use, acting as an “agentic operating system” for a business. A Master Orchestrator agent acts as a CEO or project manager, delegating high-level goals to a team of agents representing different departments. These specialists can collaborate and hand off tasks to each other.
Use Case
An internal tool for business leaders to analyze performance, generate strategies, and optimize operations by interacting with a team of AI department heads.
Architecture
Implementation
The main difference is the handoff configuration. Here, specialists can hand off tasks directly to each other, enabling true collaboration to solve multi-faceted problems.
// Define department-specialized agents
const marketingAgent = new Agent({
name: 'Marketing Specialist',
instructions: 'You are an expert in marketing campaigns, analytics, and strategy.',
tools: [generateMarketingStrategyTool, campaignManagementTool],
});
const salesAgent = new Agent({
name: 'Sales Manager',
instructions: 'You are an expert in sales performance, forecasting, and pricing.',
tools: [salesAnalysisTool],
});
const operationsAgent = new Agent({
name: 'Operations Director',
instructions: 'You are an expert in inventory, fulfillment, and supply chain.',
tools: [inventoryAnalysisTool],
});
// The Orchestrator delegates to any specialist
const orchestratorAgent = Agent.create({
name: 'Master Orchestrator',
instructions: 'You are the orchestrator for the business. Delegate tasks to the appropriate department head.',
handoffs: [marketingAgent, salesAgent, operationsAgent],
});
// Configure peer-to-peer handoffs for collaboration
marketingAgent.handoffs = [orchestratorAgent, salesAgent, operationsAgent];
salesAgent.handoffs = [orchestratorAgent, marketingAgent, operationsAgent];
operationsAgent.handoffs = [orchestratorAgent, marketingAgent, salesAgent];
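Writing the peer-to-peer lists by hand becomes error-prone as departments are added, since each new specialist must be added to every other list. A small helper can generate the same wiring. This is a sketch under assumptions: `TeamMember` is a plain stand-in for the agent objects and `wireCollaborativeTeam` is a hypothetical function, not part of the SDK.

```typescript
// Plain stand-in for an agent with a mutable handoff list.
interface TeamMember {
  name: string;
  handoffs: TeamMember[];
}

// Lets the orchestrator hand off to every specialist, and gives each specialist
// a handoff to the orchestrator followed by every other specialist.
function wireCollaborativeTeam(orchestrator: TeamMember, specialists: TeamMember[]): void {
  orchestrator.handoffs = [...specialists];
  for (const specialist of specialists) {
    const peers = specialists.filter((other) => other !== specialist);
    specialist.handoffs = [orchestrator, ...peers];
  }
}

const orchestrator: TeamMember = { name: 'Master Orchestrator', handoffs: [] };
const marketing: TeamMember = { name: 'Marketing Specialist', handoffs: [] };
const sales: TeamMember = { name: 'Sales Manager', handoffs: [] };
const operations: TeamMember = { name: 'Operations Director', handoffs: [] };

wireCollaborativeTeam(orchestrator, [marketing, sales, operations]);

// Names of the agents the sales manager can hand off to, in order.
const salesTargets = sales.handoffs.map((agent) => agent.name);
```

With this wiring, `salesTargets` lists the orchestrator first and then the two other department heads, matching the hand-written configuration above.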
Best Practices for Multi-Agent Systems
1. Error Handling and Resilience
Implement robust error handling to ensure your multi-agent system gracefully handles failures:
// Wrap agent interactions with try-catch blocks.
// Note: `process`, the error codes, `fallbackAgent`, `retryWithBackoff`, and `logger`
// are illustrative placeholders; adapt them to what your agent runtime actually exposes.
async function handleUserRequest(request: string) {
  try {
    return await triageAgent.process(request);
  } catch (error) {
    // `error` is `unknown` in strict TypeScript, so narrow it before reading `code`
    const code = (error as { code?: string } | null)?.code;
    if (code === 'HANDOFF_FAILED') {
      // Fall back to a general-purpose agent
      return await fallbackAgent.process(request);
    } else if (code === 'TIMEOUT') {
      // Retry with exponential backoff
      return await retryWithBackoff(() => triageAgent.process(request));
    }
    // Log the error for monitoring
    logger.error('Agent processing failed', { error, request });
    throw new Error('Unable to process request. Please try again.');
  }
}
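// Implement timeout handling for long-running operations
async function processWithTimeout(agent: Agent, request: string, timeoutMs = 30000) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeoutPromise = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('Operation timed out')), timeoutMs);
  });
  try {
    return await Promise.race([agent.process(request), timeoutPromise]);
  } finally {
    // Clear the timer so it does not keep running after the race settles
    clearTimeout(timer);
  }
}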
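The error handler above calls `retryWithBackoff` without defining it. One possible implementation is sketched below. The option names (`maxAttempts`, `baseDelayMs`, `maxDelayMs`), their defaults, and the jitter strategy are assumptions chosen for illustration, not values taken from any library.

```typescript
interface BackoffOptions {
  maxAttempts?: number; // total attempts, including the first call
  baseDelayMs?: number; // nominal delay before the first retry
  maxDelayMs?: number;  // upper bound on any single delay
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: BackoffOptions = {}
): Promise<T> {
  const { maxAttempts = 3, baseDelayMs = 500, maxDelayMs = 8000 } = options;
  let lastError: unknown = new Error('retryWithBackoff: maxAttempts must be at least 1');
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts - 1) break; // no delay after the final attempt
      // Double the nominal delay on each retry, capped at maxDelayMs.
      const delay = Math.min(baseDelayMs * Math.pow(2, attempt), maxDelayMs);
      // Jitter: wait between half of the delay and the full delay.
      await sleep(delay / 2 + Math.random() * (delay / 2));
    }
  }
  // All attempts failed: surface the most recent error to the caller.
  throw lastError;
}
```

The jitter spreads retries from many concurrent requests over time, so a briefly unavailable downstream service is not hit by all of them at the same instant.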
2. Monitoring and Observability
Track key metrics to ensure your multi-agent system performs optimally:
// Track handoff patterns and success rates
interface HandoffMetrics {
sourceAgent: string;
targetAgent: string;
success: boolean;
duration: number;
timestamp: Date;
}
class MetricsCollector {
async trackHandoff(metrics: HandoffMetrics) {
// Send to your monitoring service
await analytics.track('agent_handoff', metrics);
// Alert on failed handoffs
if (!metrics.success) {
await alerting.notify({
type: 'handoff_failure',
severity: 'warning',
details: metrics
});
}
}
}
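Individual handoff events become more useful once they are aggregated per route. The sketch below computes a success rate and average duration for each source/target pair. `HandoffRecord`, `RouteStats`, and `summarizeHandoffs` are hypothetical names, and interpreting `duration` as milliseconds is an assumption, since the interface above does not state a unit.

```typescript
interface HandoffRecord {
  sourceAgent: string;
  targetAgent: string;
  success: boolean;
  duration: number; // assumed to be milliseconds
}

interface RouteStats {
  total: number;
  failures: number;
  successRate: number; // between 0 and 1
  avgDurationMs: number;
}

// Groups handoff records by "source -> target" and summarizes each group.
function summarizeHandoffs(records: HandoffRecord[]): Map<string, RouteStats> {
  const totals = new Map<string, { total: number; failures: number; durationSum: number }>();
  for (const record of records) {
    const key = `${record.sourceAgent} -> ${record.targetAgent}`;
    const entry = totals.get(key) ?? { total: 0, failures: 0, durationSum: 0 };
    entry.total += 1;
    if (!record.success) entry.failures += 1;
    entry.durationSum += record.duration;
    totals.set(key, entry);
  }
  const summary = new Map<string, RouteStats>();
  totals.forEach((entry, key) => {
    summary.set(key, {
      total: entry.total,
      failures: entry.failures,
      successRate: (entry.total - entry.failures) / entry.total,
      avgDurationMs: entry.durationSum / entry.total,
    });
  });
  return summary;
}

const handoffSummary = summarizeHandoffs([
  { sourceAgent: 'Customer Service Triage', targetAgent: 'Order Support Specialist', success: true, duration: 100 },
  { sourceAgent: 'Customer Service Triage', targetAgent: 'Order Support Specialist', success: false, duration: 300 },
  { sourceAgent: 'Customer Service Triage', targetAgent: 'Returns Specialist', success: true, duration: 50 },
]);
```

A route whose success rate drops well below its peers is a candidate for clearer handoff descriptions or revised triage instructions.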
3. Testing Multi-Agent Interactions
Create comprehensive tests for your multi-agent systems:
describe('Multi-Agent System Tests', () => {
it('should correctly route customer support requests', async () => {
const testCases = [
{ input: 'Where is my order #12345?', expectedAgent: 'Order Support Specialist' },
{ input: 'I need to return a damaged item', expectedAgent: 'Returns Specialist' },
{ input: 'I\'m very upset about my experience!', expectedAgent: 'Senior Support Specialist' }
];
for (const testCase of testCases) {
const result = await triageAgent.route(testCase.input);
expect(result.selectedAgent).toBe(testCase.expectedAgent);
}
});
it('should handle circular handoffs gracefully', async () => {
// Test that prevents infinite loops between agents
const maxHandoffs = 5;
const result = await triageAgent.process('Complex request requiring multiple handoffs', {
maxHandoffs
});
expect(result.handoffCount).toBeLessThanOrEqual(maxHandoffs);
});
});
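The circular-handoff test assumes something enforces the `maxHandoffs` limit. One way to do that is a small tracker that records the chain of agents handling a request and rejects a handoff once the limit is reached. `HandoffTracker` is a hypothetical class written for this sketch; it does not describe how any particular agent runtime implements such a limit.

```typescript
// Tracks the chain of agents that handle one request and rejects
// any handoff that would exceed the configured limit.
class HandoffTracker {
  private readonly trace: string[];
  private readonly maxHandoffs: number;

  constructor(startAgent: string, maxHandoffs: number) {
    this.trace = [startAgent];
    this.maxHandoffs = maxHandoffs;
  }

  // Number of handoffs so far (the starting agent does not count).
  get handoffCount(): number {
    return this.trace.length - 1;
  }

  recordHandoff(targetAgent: string): void {
    if (this.handoffCount >= this.maxHandoffs) {
      const attempted = [...this.trace, targetAgent].join(' -> ');
      throw new Error(`Handoff limit of ${this.maxHandoffs} exceeded: ${attempted}`);
    }
    this.trace.push(targetAgent);
  }
}

const tracker = new HandoffTracker('Customer Service Triage', 2);
tracker.recordHandoff('Order Support Specialist');
tracker.recordHandoff('Senior Support Specialist');
// A third call to recordHandoff would throw, because the limit of 2 is reached.
```

Including the full chain in the error message makes loops between two agents easy to spot in logs.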
4. Performance Optimization
Optimize your multi-agent system for speed and efficiency:
// Cache frequently accessed data
const agentCache = new Map<string, CachedAgentData>();
// Preload agent configurations
async function initializeAgents() {
const agents = [triageAgent, orderSupportAgent, returnsAgent, escalationAgent];
await Promise.all(agents.map(async (agent) => {
// Preload tools and configurations
await agent.initialize();
// Warm up the agent with common queries
await agent.warmup([
'Order status inquiry',
'Return request',
'Product information'
]);
}));
}
// Implement connection pooling for external services
const connectionPool = new ConnectionPool({
maxConnections: 10,
idleTimeout: 300000 // 5 minutes
});
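A bare `Map` used as a cache never expires its entries, so cached data can go stale. The sketch below adds a time-to-live to each entry. `TtlCache` is a hypothetical class, and the injectable `now` function is an assumption made so the expiry logic can be tested without waiting on a real clock.

```typescript
// In-memory cache whose entries expire a fixed time after they are written.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop stale data lazily on read
      return undefined;
    }
    return entry.value;
  }
}

// Example: cache product lookups for five minutes.
const productCache = new TtlCache<string>(5 * 60 * 1000);
productCache.set('sku-123', 'Blue widget');
```

Expired entries are removed only when they are read, which keeps the class small; a long-running service with many distinct keys would also need periodic cleanup or a size cap.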
5. Security Considerations
Ensure your multi-agent system maintains security best practices:
// Implement rate limiting per user
const rateLimiter = new RateLimiter({
windowMs: 60000, // 1 minute
maxRequests: 10
});
// Sanitize user inputs before processing
function sanitizeInput(input: string): string {
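// Strip script blocks and angle brackets to guard against HTML/script injection.
// This does not defend against prompt injection, which needs separate controls.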
return input
.replace(/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi, '')
.replace(/[<>]/g, '')
.trim();
}
// Implement authorization checks for sensitive operations
async function authorizeAgentAction(agent: Agent, action: string, context: Context) {
const permissions = await getAgentPermissions(agent.id);
if (!permissions.includes(action)) {
throw new Error(`Agent ${agent.name} not authorized for action: ${action}`);
}
// Log all authorized actions for audit trail
await auditLog.record({
agent: agent.id,
action,
context,
timestamp: new Date()
});
}
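The `RateLimiter` used above is not defined in this guide. A minimal fixed-window limiter that accepts the same `windowMs` and `maxRequests` options could look like the sketch below. The class name, the `tryAcquire` method, and the injectable clock are assumptions made for illustration; a production system would more likely use a shared store so that limits hold across multiple server instances.

```typescript
// Per-user fixed-window rate limiter kept in memory.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, { windowStart: number; count: number }>();
  private readonly windowMs: number;
  private readonly maxRequests: number;
  private readonly now: () => number;

  constructor(
    options: { windowMs: number; maxRequests: number },
    now: () => number = () => Date.now()
  ) {
    this.windowMs = options.windowMs;
    this.maxRequests = options.maxRequests;
    this.now = now;
  }

  // Returns true and counts the request if the user is under the limit;
  // returns false without counting it otherwise.
  tryAcquire(userId: string): boolean {
    const currentTime = this.now();
    let state = this.windows.get(userId);
    if (!state || currentTime - state.windowStart >= this.windowMs) {
      // Start a new window for this user.
      state = { windowStart: currentTime, count: 0 };
      this.windows.set(userId, state);
    }
    if (state.count >= this.maxRequests) return false;
    state.count += 1;
    return true;
  }
}

const userLimiter = new FixedWindowRateLimiter({ windowMs: 60000, maxRequests: 10 });
const firstRequestAllowed = userLimiter.tryAcquire('user-42');
```

A fixed window allows short bursts around window boundaries; if that matters, a sliding window or token bucket is the usual alternative.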
Advanced Patterns
Dynamic Agent Creation
Create agents dynamically based on business needs:
class DynamicAgentFactory {
async createSpecialistAgent(specialty: string, tools: Tool[]) {
const agent = new Agent({
name: `${specialty} Specialist`,
handoffDescription: `Handles ${specialty.toLowerCase()} related queries`,
instructions: await this.generateInstructions(specialty),
tools
});
// Register with the triage agent
await this.registerWithTriage(agent);
return agent;
}
private async generateInstructions(specialty: string): Promise<string> {
// Use AI to generate role-specific instructions
return await generateAgentInstructions({
role: specialty,
capabilities: this.getCapabilitiesForSpecialty(specialty),
guidelines: this.getCompanyGuidelines()
});
}
}
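The factory above relies on `registerWithTriage`, which is not shown. One way to sketch it is as a function that adds the new specialist to the triage agent's handoff list, skips duplicates by name, and wires the return path. `RegisteredAgent` is a plain stand-in for the agent objects, and the free-function signature is an assumption; the factory's real method may differ.

```typescript
// Plain stand-in for an agent with a mutable handoff list.
interface RegisteredAgent {
  name: string;
  handoffs: RegisteredAgent[];
}

// Adds `specialist` to the triage agent's handoffs and gives it a path back.
// Returns false (and changes nothing) if an agent with the same name is already registered.
function registerWithTriage(triage: RegisteredAgent, specialist: RegisteredAgent): boolean {
  const alreadyRegistered = triage.handoffs.some((agent) => agent.name === specialist.name);
  if (alreadyRegistered) return false;
  triage.handoffs.push(specialist);
  if (specialist.handoffs.indexOf(triage) === -1) {
    specialist.handoffs.push(triage);
  }
  return true;
}

const dynamicTriage: RegisteredAgent = { name: 'Response AI Triage Agent', handoffs: [] };
const billingSpecialist: RegisteredAgent = { name: 'Billing Specialist', handoffs: [] };

const registeredFirstTime = registerWithTriage(dynamicTriage, billingSpecialist);
const registeredSecondTime = registerWithTriage(dynamicTriage, billingSpecialist);
```

Rejecting duplicate names keeps the triage agent from seeing two indistinguishable handoff targets when the factory is called twice for the same specialty.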
Adaptive Routing
Implement intelligent routing that learns from past interactions:
class AdaptiveRouter {
private routingHistory: Map<string, RoutingDecision[]> = new Map();
async route(request: string, context: Context): Promise<Agent> {
// Get historical routing decisions for similar requests
const similarRequests = await this.findSimilarRequests(request);
// Calculate success rates for each agent
const agentPerformance = this.calculateAgentPerformance(similarRequests);
// Select the best performing agent for this type of request
const selectedAgent = this.selectOptimalAgent(agentPerformance, context);
// Track this routing decision
this.trackRoutingDecision(request, selectedAgent);
return selectedAgent;
}
private calculateAgentPerformance(history: RoutingDecision[]): Map<string, number> {
const performance = new Map<string, number>();
for (const decision of history) {
const currentScore = performance.get(decision.agentId) || 0;
const newScore = decision.success ? currentScore + 1 : currentScore - 1;
performance.set(decision.agentId, newScore);
}
return performance;
}
}
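The router above leaves `selectOptimalAgent` undefined. A simple version picks the agent with the highest score and falls back to a default when there is no history or no agent has a positive score. The sketch below works on agent IDs rather than `Agent` objects and ignores the `context` argument; both simplifications, and the name `selectBestAgent`, are assumptions made to keep the example self-contained.

```typescript
// Picks the agent ID with the highest positive score.
// Falls back to `fallbackAgentId` when the map is empty or no score is above zero.
function selectBestAgent(scores: Map<string, number>, fallbackAgentId: string): string {
  let bestId = fallbackAgentId;
  let bestScore = 0;
  scores.forEach((score, agentId) => {
    if (score > bestScore) {
      bestScore = score;
      bestId = agentId;
    }
  });
  return bestId;
}

// Scores in the shape produced by calculateAgentPerformance: +1 per success, -1 per failure.
const performance = new Map<string, number>([
  ['order-support', 3],
  ['returns', -1],
  ['senior-support', 1],
]);

const chosenAgent = selectBestAgent(performance, 'triage');
```

Because the score is a raw count, agents with more history tend to win. Normalizing by the number of decisions, or occasionally routing to a lower-scored agent, are common ways to keep the router from locking onto an early favorite.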
Conclusion
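The three patterns in this guide (general triage, hierarchical customer support, and a collaborative operations team) differ mainly in how handoffs are wired between agents. Combined with the practices above for error handling, monitoring, testing, performance, and security, they provide a basis for building multi-agent systems that remain robust, scalable, and maintainable as they grow.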
Next Steps