StateSet Computer Use Agent - Architecture Overview

Executive Summary

StateSet Computer Use Agent is a production-grade AI automation platform powered by Claude Opus 4.5. The system deploys multiple specialized AI agents that can see, understand, and interact with desktop environments to complete complex, long-running tasks autonomously. Built with Python using async/await patterns throughout, the platform implements Anthropic’s context engineering research, achieving 95% cost savings compared to naive approaches.

Key Metrics:

Average tokens/task: 7,500 (95% reduction from 150k baseline)
Average cost/task: $0.11 (95% savings from $2.25 baseline)
Average task duration: 30 seconds (33% faster with parallel execution)
Parallel speedup: 30-50% on multi-tool tasks

System Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                              User Interface                                  │
│           ┌──────────────┐  ┌──────────────┐  ┌──────────────┐             │
│           │  CLI/Shell   │  │  Dashboard   │  │    APIs      │             │
│           │   Scripts    │  │   (Next.js)  │  │   (REST)     │             │
│           └──────┬───────┘  └──────┬───────┘  └──────┬───────┘             │
└──────────────────┼─────────────────┼─────────────────┼───────────────────────┘
                   │                 │                 │
┌──────────────────▼─────────────────▼─────────────────▼───────────────────────┐
│                            ORCHESTRATION LAYER                               │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │                           main.py                                       │ │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────┐│ │
│  │  │ Agent Selector  │  │  GlobalState    │  │  Multi-Agent Runner     ││ │
│  │  │ (keyword-based) │  │  (thread-safe)  │  │  (asyncio.gather)       ││ │
│  │  └─────────────────┘  └─────────────────┘  └─────────────────────────┘│ │
│  └────────────────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
┌───────────────────────────────────▼─────────────────────────────────────────┐
│                              AGENT ENGINE                                    │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │                         agent/loop.py                                   │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌────────────┐ │ │
│  │  │  Sampling    │  │    API       │  │   System     │  │  Message   │ │ │
│  │  │    Loop      │  │  Providers   │  │   Prompt     │  │  Manager   │ │ │
│  │  │              │  │ (3 backends) │  │   Init       │  │  (cache)   │ │ │
│  │  └──────────────┘  └──────────────┘  └──────────────┘  └────────────┘ │ │
│  └────────────────────────────────────────────────────────────────────────┘ │
│                                                                              │
│  ┌────────────────────┐  ┌────────────────────┐  ┌────────────────────────┐ │
│  │   SubagentManager  │  │     MCPManager     │  │  StructuredOutput      │ │
│  │  (task delegation) │  │  (external tools)  │  │    Parser              │ │
│  └────────────────────┘  └────────────────────┘  └────────────────────────┘ │
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
┌───────────────────────────────────▼─────────────────────────────────────────┐
│                             TOOL LAYER                                       │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                        ToolCollection                                 │   │
│  │  ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────────────┐│   │
│  │  │ Computer   │ │   Bash     │ │   Edit     │ │      Memory        ││   │
│  │  │ Tool       │ │   Tool     │ │   Tool     │ │      Tool          ││   │
│  │  │ (GUI ops)  │ │ (commands) │ │ (files)    │ │ (persistence)      ││   │
│  │  └────────────┘ └────────────┘ └────────────┘ └────────────────────┘│   │
│  │  ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────────────┐│   │
│  │  │   AGI      │ │  Subagent  │ │  StateSet  │ │     AskUser        ││   │
│  │  │   Tool     │ │   Tool     │ │  CLI Tool  │ │      Tool          ││   │
│  │  └────────────┘ └────────────┘ └────────────┘ └────────────────────┘│   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
┌───────────────────────────────────▼─────────────────────────────────────────┐
│                          OPTIMIZATION LAYER                                  │
│  ┌───────────────────┐  ┌───────────────────┐  ┌───────────────────────────┐│
│  │ ParallelExecutor  │  │ ContextOptimizer  │  │    ToolExecutionGuard     ││
│  │ (30-50% speedup)  │  │ (5 patterns)      │  │ (safety + verification)   ││
│  └───────────────────┘  └───────────────────┘  └───────────────────────────┘│
│  ┌───────────────────┐  ┌───────────────────┐  ┌───────────────────────────┐│
│  │ StuckDetection    │  │   Verification    │  │      Checkpoint           ││
│  │ (loop prevention) │  │ (visual confirm)  │  │   (state persistence)     ││
│  └───────────────────┘  └───────────────────┘  └───────────────────────────┘│
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
┌───────────────────────────────────▼─────────────────────────────────────────┐
│                         OBSERVABILITY LAYER                                  │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                    UnifiedObservability                               │   │
│  │  ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────────────┐│   │
│  │  │ Structured │ │OpenTelemetry│ │ Prometheus │ │  Real-time Event  ││   │
│  │  │  Logging   │ │  Tracing   │ │  Metrics   │ │    Streaming      ││   │
│  │  └────────────┘ └────────────┘ └────────────┘ └────────────────────┘│   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
┌───────────────────────────────────▼─────────────────────────────────────────┐
│                          EXTERNAL SERVICES                                   │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────────────────┐ │
│  │ Anthropic  │  │  StateSet  │  │   Stripe   │  │     MCP Servers        │ │
│  │    API     │  │   APIs     │  │  Billing   │  │ (Slack, GitHub, etc.)  │ │
│  └────────────┘  └────────────┘  └────────────┘  └────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘

Core Components

1. Main Orchestrator (`main.py`)

The entry point for all agent execution, responsible for:

Environment Validation:

def validate_environment(*, require_display: bool = True) -> Dict[str, str]:
    """Validates ANTHROPIC_API_KEY, DISPLAY, STRIPE_API_KEY, WORKSPACE_PATH"""

Agent Selection:

def get_active_agents(instruction: str) -> List[AgentType]:
    """Keyword-based agent selection from instruction text"""
    # Matches: "auto-close" → AUTO_CLOSE, "social media" → SOCIAL_MEDIA, etc.
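
As a rough illustration of keyword-based selection, the sketch below matches lowercase keywords against the instruction text. The `AGENT_KEYWORDS` table and the two-member `AgentType` enum are assumptions made for the example; the actual mapping in `main.py` may be structured differently.

```python
from enum import Enum
from typing import Dict, List, Tuple


class AgentType(Enum):
    AUTO_CLOSE = "auto_close"
    SOCIAL_MEDIA = "social_media"


# Hypothetical keyword table; the real mapping lives in main.py.
AGENT_KEYWORDS: Dict[AgentType, Tuple[str, ...]] = {
    AgentType.AUTO_CLOSE: ("auto-close", "ticket"),
    AgentType.SOCIAL_MEDIA: ("social media", "moderate"),
}


def get_active_agents(instruction: str) -> List[AgentType]:
    """Return every agent type whose keywords appear in the instruction."""
    text = instruction.lower()
    return [
        agent
        for agent, keywords in AGENT_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]
```

Because matching is plain substring search, an instruction that mentions several keywords activates several agents, which is what allows one command to start agents in parallel.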

Global State Management:

class GlobalState:
    running: bool              # System-wide running flag
    tasks: Set[asyncio.Task]   # Active agent tasks
    shutdown_event: Event      # Graceful shutdown coordination
    _lock: threading.Lock      # Thread-safe state management
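
One way the fields above could fit together is sketched here. The method names (`add_task`, `remove_task`, `request_shutdown`) are hypothetical, and the sketch assumes `shutdown_event` is a `threading.Event`; the original does not say which `Event` class is used.

```python
import asyncio
import threading
from typing import Set


class GlobalState:
    """Lock-protected container for the running flag and active tasks."""

    def __init__(self) -> None:
        self.running = True
        self.tasks: Set[asyncio.Task] = set()
        self.shutdown_event = threading.Event()  # assumption: threading, not asyncio
        self._lock = threading.Lock()

    def add_task(self, task: asyncio.Task) -> None:
        with self._lock:
            self.tasks.add(task)

    def remove_task(self, task: asyncio.Task) -> None:
        with self._lock:
            self.tasks.discard(task)

    def request_shutdown(self) -> None:
        """Flip the running flag, then wake anything waiting on the event."""
        with self._lock:
            self.running = False
        self.shutdown_event.set()
```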

Multi-Agent Execution:

async def continuous_loop(agents: List[AgentConfig], instruction: str):
    """Spawns agents in parallel using asyncio.gather()"""
    tasks = [asyncio.create_task(run_agent(agent, instruction)) for agent in agents]
    results = await asyncio.gather(*tasks, return_exceptions=True)
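
The effect of `return_exceptions=True` is easiest to see with a stand-in agent. In the sketch below, `run_agent` is a dummy that fails for one agent name, and agents are plain strings rather than `AgentConfig` objects; both are simplifications for illustration.

```python
import asyncio
from typing import Dict, List


async def run_agent(name: str, instruction: str) -> str:
    """Stand-in for the real agent run; fails for one agent on purpose."""
    await asyncio.sleep(0)
    if name == "broken":
        raise RuntimeError(f"{name} failed")
    return f"{name}: done ({instruction})"


async def continuous_loop(agents: List[str], instruction: str) -> Dict[str, object]:
    tasks = [asyncio.create_task(run_agent(a, instruction)) for a in agents]
    # With return_exceptions=True a failure is returned as a value instead of
    # being raised out of gather(); results keep the order of the tasks.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return dict(zip(agents, results))


if __name__ == "__main__":
    print(asyncio.run(continuous_loop(["auto_close", "broken"], "demo")))
```

The caller can then inspect each result with `isinstance(result, Exception)` and decide per agent whether to log, retry, or give up.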

Task Completion Analysis:

async def analyze_task_completion(messages, agent_type) -> TaskStatus:
    """Agent-specific completion detection with indicator patterns"""
    # AUTO_CLOSE: "ticket closed", "successfully closed", "task finished"
    # SOCIAL_MEDIA: "comment hidden", "content removed", "moderation complete"
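
Indicator-based completion detection can be sketched as a phrase lookup on the latest message. The version below is synchronous, takes messages as plain strings, and uses a hypothetical two-value `TaskStatus`; the real function is async and works on API message objects.

```python
from enum import Enum
from typing import Dict, List, Tuple


class TaskStatus(Enum):
    COMPLETED = "completed"      # hypothetical member names
    IN_PROGRESS = "in_progress"


# Indicator phrases taken from the comments above.
COMPLETION_INDICATORS: Dict[str, Tuple[str, ...]] = {
    "AUTO_CLOSE": ("ticket closed", "successfully closed", "task finished"),
    "SOCIAL_MEDIA": ("comment hidden", "content removed", "moderation complete"),
}


def analyze_task_completion(messages: List[str], agent_type: str) -> TaskStatus:
    """Mark the task complete if the latest message contains an indicator."""
    if not messages:
        return TaskStatus.IN_PROGRESS
    last = messages[-1].lower()
    indicators = COMPLETION_INDICATORS.get(agent_type, ())
    if any(phrase in last for phrase in indicators):
        return TaskStatus.COMPLETED
    return TaskStatus.IN_PROGRESS
```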

2. Agent Loop (`agent/loop.py`)

The core conversation engine with Claude API:

Sampling Loop:

async def sampling_loop(
    model: str,                    # claude-opus-4-5-20251101
    provider: APIProvider,         # ANTHROPIC | BEDROCK | VERTEX
    system_prompt_suffix: str,     # Agent-specific rules
    messages: List[BetaMessageParam],
    tool_collection: ToolCollection,
    # New capabilities
    enable_subagents: bool = True,
    mcp_servers: Optional[Dict] = None,
    output_schema: Optional[Dict] = None,
) -> SamplingLoopResult:

API Provider Support:

Provider	Model ID	Use Case
ANTHROPIC	claude-opus-4-5-20251101	Direct API access
BEDROCK	anthropic.claude-opus-4-5-20251101-v1:0	AWS infrastructure
VERTEX	claude-opus-4-5-20251101	Google Cloud

Beta Flags:

prompt-caching-2024-07-31 - 90% cost reduction on cached tokens
advanced-tool-use-2025-11-20 - Tool search (regex/bm25)
effort-2025-11-24 - Effort parameter (low/medium/high)
computer-use-2025-11-24 - Latest tool version with zoom action

System Prompt Initialization:

async def initialize_system_prompt(agent_config: AgentConfig) -> str:
    """Fetches rules/attributes from StateSet APIs:
       - /api/rules/get-agent-rules
       - /api/attributes/get-agent-attributes
       - /api/agents/get-agent
    """

3. Tool System (`agent/tools/`)

Tool Hierarchy:

BaseAnthropicTool (Abstract)
├── ComputerTool (3 versions)
│   ├── Actions: screenshot, click, type, scroll, drag, zoom
│   ├── Resolution scaling (XGA, WXGA, FWXGA)
│   └── Performance: 8ms typing delay, 100-char groups
├── BashTool
│   ├── Persistent session with sentinel pattern
│   ├── Async subprocess management
│   └── 60-second timeout
├── EditTool
│   ├── File creation/modification
│   └── Directory traversal prevention
├── MemoryTool
│   ├── Commands: view, create, str_replace, insert, delete, rename
│   ├── Prompt injection sanitization
│   └── Per-agent memory isolation
├── AGITool
│   └── Extended AGI capabilities
├── SubagentTool (lazy-loaded)
│   └── Spawn specialized subagents
└── AskUserTool
    └── Human-in-the-loop requests

Tool Versions:

Version	Release	Features
computer_use_20251124	Current	Zoom action, deferred tool loading
computer_use_20250124	Previous	Stable production version
computer_use_20241022	Legacy	Backward compatibility

ToolCollection API:

class ToolCollection:
    tool_map: Dict[str, BaseAnthropicTool]  # name → tool

    def to_params(self) -> List[Dict]: ...          # Convert to API format
    async def run(self, name, input) -> ToolResult: ...
    def set_deferred_tools(self, tools: List[str]): ...  # For tool search
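
A minimal sketch of how such a collection might dispatch calls by name follows. `EchoTool`, the `ToolResult` fields, and the choice to return unknown-tool errors as results instead of raising are all assumptions for the example; deferred tool loading is left out.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ToolResult:
    output: Optional[str] = None
    error: Optional[str] = None


class EchoTool:
    """Hypothetical tool used only to exercise the collection."""

    name = "echo"

    def to_params(self) -> Dict[str, Any]:
        return {"name": self.name, "input_schema": {"type": "object"}}

    async def __call__(self, **kwargs: Any) -> ToolResult:
        return ToolResult(output=str(kwargs.get("text", "")))


class ToolCollection:
    def __init__(self, *tools: Any) -> None:
        self.tool_map = {tool.name: tool for tool in tools}

    def to_params(self) -> List[Dict[str, Any]]:
        return [tool.to_params() for tool in self.tool_map.values()]

    async def run(self, name: str, tool_input: Dict[str, Any]) -> ToolResult:
        tool = self.tool_map.get(name)
        if tool is None:
            # Report unknown tools as a result so the model can recover.
            return ToolResult(error=f"Tool {name} is invalid")
        return await tool(**tool_input)


if __name__ == "__main__":
    collection = ToolCollection(EchoTool())
    print(asyncio.run(collection.run("echo", {"text": "hello"})))
```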

Advanced Capabilities

4. Subagent System (`agent/subagent.py`)

Implements Anthropic’s sub-agent compression pattern for 95% context savings:

Subagent Types:

Type	Model	Max Tokens	Use Case
EXPLORE	Haiku	4096	Fast codebase exploration
ANALYZE	Sonnet	8192	Deep analysis with thinking
EXECUTE	Sonnet	4096	Task execution with verification
RESEARCH	Haiku	4096	Web search and synthesis
CODE	Sonnet	8192	Code generation/modification

Architecture:

MainAgent (Opus 4.5)
    │
    ├── spawn_subagent("explore", "Find auth files")
    │   └── Returns: 2k summary (not 50k raw output)
    │
    ├── spawn_subagent("analyze", "Review patterns")
    │   └── Returns: Structured insights
    │
    └── spawn_subagent("execute", "Refactor code")
        └── Returns: Confirmation + diff

Usage:

from agent.subagent import SubagentManager, SubagentType

manager = SubagentManager(api_key)
result = await manager.spawn(
    task="Analyze the authentication flow",
    subagent_type=SubagentType.ANALYZE,
)
# result.result contains compressed summary

5. MCP Client Integration (`agent/mcp_client.py`)

Connect to external Model Context Protocol servers:

Supported Transports:

STDIO (subprocess)
SSE (Server-Sent Events)
HTTP (direct HTTP)

Preset Servers:

PRESET_SERVERS = {
    "slack": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-slack"]},
    "github": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"]},
    "postgres": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-postgres"]},
    "filesystem": {...},
    "memory": {...},
    "brave-search": {...},
    "puppeteer": {...},
    "sqlite": {...},
}

Usage in sampling_loop:

result = await sampling_loop(
    mcp_servers={
        "slack": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-slack"],
            "env": {"SLACK_BOT_TOKEN": os.environ["SLACK_BOT_TOKEN"]}
        }
    },
    # Agent now has access to mcp__slack__send_message, etc.
)
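
The `mcp__slack__send_message` name in the comment above suggests a `mcp__<server>__<tool>` naming scheme. The helpers below are a sketch of that convention only; the function names are hypothetical and `agent/mcp_client.py` may build names differently.

```python
from typing import Tuple

MCP_PREFIX = "mcp"
SEPARATOR = "__"


def qualify_tool_name(server: str, tool: str) -> str:
    """Build the namespaced name, e.g. mcp__slack__send_message."""
    return SEPARATOR.join((MCP_PREFIX, server, tool))


def split_tool_name(qualified: str) -> Tuple[str, str]:
    """Recover (server, tool); the tool part may itself contain '__'."""
    parts = qualified.split(SEPARATOR, 2)
    if len(parts) != 3 or parts[0] != MCP_PREFIX:
        raise ValueError(f"not an MCP tool name: {qualified}")
    return parts[1], parts[2]
```

Namespacing by server keeps tools from different MCP servers from colliding when two servers expose a tool with the same name.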

6. Structured Output (`agent/structured_output.py`)

Force Claude to return valid JSON matching specified schemas:

Pre-defined Schemas:

TICKET_ANALYSIS_SCHEMA - Support ticket analysis
TASK_RESULT_SCHEMA - Task completion results
CODE_ANALYSIS_SCHEMA - Code review findings
ENTITY_EXTRACTION_SCHEMA - Entity extraction

Usage:

from agent.structured_output import OutputSchema, StructuredOutputParser

schema = OutputSchema(
    name="TicketAnalysis",
    schema={
        "type": "object",
        "properties": {
            "tickets_to_close": {"type": "array", "items": {"type": "string"}},
            "summary": {"type": "string"},
            "confidence": {"type": "number", "minimum": 0, "maximum": 1}
        },
        "required": ["tickets_to_close", "summary"]
    }
)

result = await sampling_loop(..., output_schema=schema.schema)  # other arguments elided
parser = StructuredOutputParser(schema)
data = parser.parse(response_text)  # Validates against schema
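
To make the validation step concrete, here is a standard-library sketch that parses the response and checks required keys and top-level types. It is not the project's `StructuredOutputParser`; `parse_structured` is a hypothetical helper, and constraints such as `minimum`/`maximum` or nested `items` are deliberately ignored.

```python
import json
from typing import Any, Dict

# Maps JSON Schema type names to Python types (only what this sketch needs).
_TYPES = {"object": dict, "array": list, "string": str, "number": (int, float)}


def parse_structured(response_text: str, schema: Dict[str, Any]) -> Dict[str, Any]:
    """Parse JSON and check required keys and top-level property types."""
    data = json.loads(response_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in schema.get("required", []):
        if key not in data:
            raise ValueError(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        expected = _TYPES.get(spec.get("type"))
        if key in data and expected and not isinstance(data[key], expected):
            raise ValueError(f"field {key} should be {spec['type']}")
    return data
```

For full JSON Schema support a dedicated validator library would be the more robust choice; the sketch only shows where validation sits in the flow.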

Optimization Systems

7. Parallel Executor (`agent/parallel_executor.py`)

Automatic parallel execution for independent tool calls:

Dependency Analysis:

class DependencyAnalyzer:
    def analyze(self, tool_calls: List[ToolCall]) -> ExecutionPlan:
        """
        Rules:
        - Computer tool calls: Always sequential (visual state dependency)
        - Same path parameter: Sequential (file system dependency)
        - Read-only tools: Can parallelize
        - Write operations: Sequential
        """

Execution Strategy:

Tool Calls: [screenshot, bash(ls), bash(pwd), click]
                ↓
Dependency Analysis:
- screenshot → click (computer tool dependency)
- bash(ls), bash(pwd) (independent, read-only)
                ↓
Execution Plan:
1. [screenshot]           # Sequential
2. [bash(ls), bash(pwd)]  # Parallel
3. [click]                # Sequential
                ↓
Result: 30-50% speedup
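
The plan above can be reproduced with a simple batching pass: adjacent read-only, non-computer calls share a batch, and everything else runs alone. This is a simplified sketch, not the project's `DependencyAnalyzer`; the `ToolCall` fields are assumptions and the same-path rule is omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ToolCall:
    tool: str              # e.g. "computer" or "bash"
    action: str            # e.g. "screenshot", "ls"
    read_only: bool = False


def plan_batches(calls: List[ToolCall]) -> List[List[ToolCall]]:
    """Group adjacent read-only, non-computer calls; isolate everything else."""
    batches: List[List[ToolCall]] = []
    parallel: List[ToolCall] = []
    for call in calls:
        if call.tool != "computer" and call.read_only:
            parallel.append(call)
            continue
        if parallel:
            batches.append(parallel)  # flush the pending parallel group
            parallel = []
        batches.append([call])        # sequential step runs alone
    if parallel:
        batches.append(parallel)
    return batches
```

Each inner list could then be awaited with `asyncio.gather`, while the outer list is walked in order so that computer actions never overlap.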

8. Context Optimizer (`agent/context_optimizer.py`)

Implements 5 Anthropic context engineering patterns:

Pattern 1: Just-in-Time Retrieval

# Instead of: read_file("large_file.py")
# Use: grep("pattern", "large_file.py") | head -50

Pattern 2: Dynamic Compaction

class ContextBudget:
    OPTIMAL = 50_000           # EXCELLENT attention quality
    ATTENTION_DEGRADATION = 100_000  # GOOD → DEGRADED
    WARNING = 150_000          # DEGRADED → WARNING
    CRITICAL = 200_000         # WARNING → CRITICAL

Pattern 3: Structured Note-Taking

# Persistent memory outside context window
memory_tool.create("auth_findings", "OAuth2 flow uses refresh tokens...")

Pattern 4: Sub-Agent Compression

# 50k raw exploration → 2k structured summary
subagent = await manager.spawn(task="Find all API endpoints", subagent_type=SubagentType.EXPLORE)

Pattern 5: Attention Budget Monitoring

class AttentionQuality(Enum):
    EXCELLENT = "excellent"  # < 50k tokens
    GOOD = "good"           # < 100k tokens
    DEGRADED = "degraded"   # < 150k tokens
    WARNING = "warning"     # < 200k tokens
    CRITICAL = "critical"   # >= 200k tokens
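
Mapping a token count to one of these levels is a threshold lookup. The `classify` helper below is a hypothetical sketch; it treats each bound as exclusive, so exactly 200k tokens counts as critical, which is an assumption about the boundary.

```python
from enum import Enum


class AttentionQuality(Enum):
    EXCELLENT = "excellent"
    GOOD = "good"
    DEGRADED = "degraded"
    WARNING = "warning"
    CRITICAL = "critical"


# Exclusive upper bounds in tokens, mirroring the thresholds above.
_THRESHOLDS = (
    (50_000, AttentionQuality.EXCELLENT),
    (100_000, AttentionQuality.GOOD),
    (150_000, AttentionQuality.DEGRADED),
    (200_000, AttentionQuality.WARNING),
)


def classify(tokens: int) -> AttentionQuality:
    """Return the first level whose upper bound exceeds the token count."""
    for limit, quality in _THRESHOLDS:
        if tokens < limit:
            return quality
    return AttentionQuality.CRITICAL
```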

9. Tool Execution Guard (`agent/tool_guard.py`)

Safety and verification layer:

Features:

Pre-execution Validation: Safety checks before tool execution
Visual Verification: Confirms actions took effect (optional)
Stuck Detection: Monitors for infinite loops
Result Caching: 120-second TTL for cacheable operations
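
Result caching with a 120-second TTL could look like the sketch below. `TTLCache` is a hypothetical class, not the guard's actual cache; the injectable `clock` parameter exists only to make expiry easy to test.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """Tiny time-to-live cache; entries expire ttl seconds after being set."""

    def __init__(self, ttl: float = 120.0, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._store[key]  # expired: drop and report a miss
            return None
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._store[key] = (self._clock(), value)
```

A key such as `("bash", "pwd")` would let a repeated read-only command return its earlier result without running again, as long as the entry is under two minutes old.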

Speed Modes:

# Normal mode: Verification enabled (~0.5-1.0s per action)
python main.py "task"

# Fast mode: Skip verification (2-3x faster)
AGENT_FAST_MODE=1 python main.py "task"

10. Stuck Detection (`agent/stuck_detection.py`)

Prevents infinite loops and stuck patterns:

Detection Methods:

Repeating same action consecutively
Cycling between 2-3 actions
No visual progress (identical screenshots)
Slow progress (too few actions per time)

Recovery Strategies:

class StuckDetector:
    def check(self, action: ActionRecord) -> Optional[RecoverySuggestion]:
        """
        Returns suggestions like:
        - "Try a different approach"
        - "Scroll to see more content"
        - "Check if element exists"
        """

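The first two detection methods, consecutive repeats and short cycles, can be sketched with a bounded history of action names. The thresholds (`max_repeats=3`, `window=12`) and the use of plain strings instead of `ActionRecord` are assumptions; screenshot comparison and slow-progress checks are not covered.

```python
from collections import deque
from typing import Deque, Optional


class StuckDetector:
    """Flags consecutive repeats and short cycles in recent actions."""

    def __init__(self, max_repeats: int = 3, window: int = 12) -> None:
        self.max_repeats = max_repeats
        self.history: Deque[str] = deque(maxlen=window)

    def check(self, action: str) -> Optional[str]:
        self.history.append(action)
        recent = list(self.history)
        n = self.max_repeats
        # Same action n times in a row.
        if len(recent) >= n and len(set(recent[-n:])) == 1:
            return "Try a different approach"
        # A cycle of length 2 or 3 repeated n times back to back.
        for size in (2, 3):
            span = size * n
            if len(recent) >= span:
                tail = recent[-span:]
                if all(tail[i] == tail[i % size] for i in range(span)):
                    return "Check if element exists"
        return None
```

Returning a suggestion string instead of raising lets the loop feed the hint back to the model and continue.
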
Observability System

11. Unified Observability (`agent/observability/`)

Single interface for all observability concerns:

Configuration:

from agent.observability import get_observability, configure_observability

configure_observability(
    enable_metrics=True,
    enable_tracing=True,
    enable_streaming=True,
    metrics_port=9090,
    otlp_endpoint="localhost:4317",  # OpenTelemetry
)

Usage:

obs = get_observability()

async with obs.task_context("AUTO_CLOSE", "agent-123", "Close tickets"):
    obs.log_info("Starting task", tickets_count=10)

    with obs.tool_execution("computer", action="click"):
        # Automatically tracked
        pass

    obs.record_api_call(
        provider="anthropic",
        model="claude-opus-4-5-20251101",
        latency=2.5,
        input_tokens=1500,
        output_tokens=500,
    )

Components:

Component	Purpose	Backend
Structured Logging	JSON logs with context	Python logging
Distributed Tracing	Request correlation	OpenTelemetry
Metrics	Performance tracking	Prometheus
Event Streaming	Real-time updates	SSE/WebSocket
Health Monitoring	System health	Circuit breakers

Environment Variables:

METRICS_PORT=9090           # Prometheus metrics
OTLP_ENDPOINT=localhost:4317  # OpenTelemetry collector
LOG_FORMAT=json             # json | human | compact
LOG_LEVEL=INFO              # DEBUG | INFO | WARNING | ERROR

Infrastructure

12. Configuration Management (`agent/config.py`)

Centralized configuration with documented rationale:

Configuration Classes:

@dataclass
class ContextSettings:
    optimal_budget: int = 50_000           # From Anthropic research
    degradation_threshold: int = 100_000   # Attention starts degrading
    warning_threshold: int = 150_000       # Significant degradation
    max_context: int = 200_000             # Model limit

@dataclass
class ToolSettings:
    bash_timeout: int = 60                 # Optimized from 120s
    typing_delay_ms: int = 8               # Milliseconds per character
    screenshot_retention: int = 5          # Most recent screenshots

@dataclass
class BudgetSettings:
    input_price_per_million: float = 3.0   # Claude Opus 4.5
    output_price_per_million: float = 15.0
    cached_input_price: float = 0.30       # 90% savings

13. Exception Hierarchy (`agent/exceptions.py`)

Comprehensive error handling:

AgentError (base)
├── RetryableError
│   ├── NetworkError
│   ├── RateLimitError
│   ├── TimeoutError
│   └── ServiceUnavailableError
├── NonRetryableError
│   ├── ConfigurationError
│   ├── ValidationError
│   ├── SecurityError
│   └── AuthenticationError
├── BudgetError
│   ├── DailyBudgetExceededError
│   └── TaskBudgetExceededError
└── ToolError
    ├── ToolExecutionError
    ├── ToolTimeoutError
    └── ToolValidationError
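
Splitting errors into retryable and non-retryable branches lets a single wrapper decide when to try again. The sketch below declares a few classes from the tree and a hypothetical `with_retries` helper with exponential backoff; the attempt count and delays are illustrative, not the project's settings.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class AgentError(Exception):
    """Base class for all agent errors."""


class RetryableError(AgentError):
    """Transient failure; safe to try again."""


class NonRetryableError(AgentError):
    """Permanent failure; retrying will not help."""


class RateLimitError(RetryableError):
    pass


class ConfigurationError(NonRetryableError):
    pass


async def with_retries(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Retry only RetryableError, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return await operation()
        except RetryableError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

Anything outside the `RetryableError` branch, such as `ConfigurationError`, propagates on the first failure.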

14. Health Monitoring (`agent/health.py`)

Production health checks:

class HealthChecker:
    async def check_anthropic_api(self, test_connectivity: bool = True) -> HealthCheck: ...
    async def check_system_resources(self) -> HealthCheck: ...
    async def check_disk_space(self) -> HealthCheck: ...

    # Circuit breaker for failing services
    circuit_breaker: CircuitBreaker

Health States:

HEALTHY - All checks passing
DEGRADED - Some checks failing, system operational
UNHEALTHY - Critical failures
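
One plausible way to derive the overall state from individual checks is sketched here. The `passed` and `critical` fields on `HealthCheck` and the `overall_state` function are assumptions; the source does not describe how checks are weighted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class HealthState(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNHEALTHY = "unhealthy"


@dataclass
class HealthCheck:
    name: str
    passed: bool
    critical: bool = False  # hypothetical flag: does failure stop the system?


def overall_state(checks: Iterable[HealthCheck]) -> HealthState:
    """Any critical failure is UNHEALTHY; other failures are DEGRADED."""
    failed = [check for check in checks if not check.passed]
    if any(check.critical for check in failed):
        return HealthState.UNHEALTHY
    if failed:
        return HealthState.DEGRADED
    return HealthState.HEALTHY
```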

Dashboard Architecture

15. Backend (`dashboard/backend/`)

FastAPI REST API with async operations:

dashboard/backend/
├── app/
│   ├── main.py          # FastAPI app factory
│   ├── api/             # REST API routes
│   │   ├── jobs.py      # Job CRUD
│   │   ├── templates.py # Workflow templates
│   │   ├── artifacts.py # Screenshot/output storage
│   │   └── metrics.py   # Performance tracking
│   ├── models/          # SQLAlchemy ORM models
│   ├── schemas/         # Pydantic schemas
│   ├── services/        # Business logic
│   ├── tasks/           # Celery workers
│   │   └── worker.py    # Async agent execution
│   └── core/            # Configuration, database
└── migrations/          # Alembic schema versioning

Key Technologies:

FastAPI with CORS
SQLAlchemy async ORM
PostgreSQL database
Celery task queue
Server-Sent Events (SSE)
S3-compatible artifact storage (boto3)

16. Frontend (`dashboard/frontend/`)

Next.js 14 application:

dashboard/frontend/
├── app/                 # App router pages
├── components/          # React components
├── hooks/               # Custom React hooks
└── lib/                 # Utilities

Key Technologies:

Next.js 14 with app router
React Query for data fetching
Tailwind CSS styling
EventSource for real-time updates

Execution Flow

Complete Request Flow

1. User Command
   │
   ▼
2. validate_environment()
   ├── Check ANTHROPIC_API_KEY
   ├── Check DISPLAY
   └── Validate optional keys
   │
   ▼
3. get_active_agents(instruction)
   ├── Parse keywords: "auto-close" → AUTO_CLOSE
   └── Return: List[AgentConfig]
   │
   ▼
4. continuous_loop(agents, instruction)
   │
   ├──────────────────────────────────────┐
   │                                      │
   ▼                                      ▼
5a. run_agent(AUTO_CLOSE)           5b. run_agent(SOCIAL_MEDIA)
   │                                      │
   ▼                                      ▼
6. initialize_system_prompt()        6. initialize_system_prompt()
   ├── Fetch rules from StateSet         (parallel)
   └── Build system prompt
   │
   ▼
7. sampling_loop()
   │
   ├─── Send to Claude API ──────────────────────────┐
   │         │                                       │
   │         ▼                                       │
   │    Claude Response                              │
   │    ├── Text content                             │
   │    └── Tool calls                               │
   │         │                                       │
   │         ▼                                       │
   ├─── ToolExecutionGuard.execute()                 │
   │    ├── DependencyAnalyzer                       │
   │    ├── ParallelToolExecutor                     │
   │    ├── StuckDetection                           │
   │    └── Verification (optional)                  │
   │         │                                       │
   │         ▼                                       │
   │    Tool Results                                 │
   │         │                                       │
   └─────────┴───────────────────────────────────────┘
             │
             ▼ (loop until done)
   │
   ▼
8. analyze_task_completion()
   ├── Check completion indicators
   └── Return TaskStatus
   │
   ▼
9. send_stripe_meter_event()
   ├── Token usage
   └── Cost calculation
   │
   ▼
10. shutdown_gracefully()
    ├── Cancel all tasks
    └── Cleanup resources

Agent Types

Supported Agents

Agent Type	Keywords	Purpose
AUTO_CLOSE	"auto-close", "ticket"	Close resolved support tickets
SOCIAL_MEDIA	"social media", "moderate"	Content moderation
LINKEDIN_MESSENGER	"linkedin", "outreach"	LinkedIn automation
SLACK_SUPPORT	"slack", "support"	Slack support automation
SHOPIFY	"shopify", "e-commerce"	E-commerce management
ONBOARDING	"onboarding", "setup"	User onboarding
STATESET_AGENTIC	"stateset", "custom"	Custom tasks

Agent Configuration

@dataclass
class AgentConfig:
    org_id: str               # Organization identifier
    agent_id: str             # Unique agent identifier
    description: str          # Agent purpose
    capabilities: List[str]   # What the agent can do
    stripe_customer_id: str   # Billing identifier

Security Architecture

API Key Management

All keys via environment variables
Validation on startup
No key transmission to external services

Tool Safety

Directory traversal prevention in EditTool
Prompt injection protection in MemoryTool
Pre-execution validation via ToolExecutionGuard
Agent memory isolation (per agent_id)
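
Directory traversal prevention usually comes down to resolving the requested path and confirming it still sits inside an allowed root. The sketch below shows that check with `pathlib`; `resolve_in_workspace` is a hypothetical helper, not the EditTool implementation.

```python
from pathlib import Path


def resolve_in_workspace(workspace: str, requested: str) -> Path:
    """Resolve requested relative to workspace, rejecting escapes."""
    root = Path(workspace).resolve()
    candidate = (root / requested).resolve()
    # After resolving ".." and symlinks, the path must still sit under root.
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {requested}")
    return candidate
```

Resolving before comparing matters: a plain string prefix check would accept `workspace/../secrets`, and would also accept a sibling directory whose name merely starts with the workspace name.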

Sandbox Execution

Tools run in controlled environment
File system access limited by permissions
Network access controlled by system

Performance Characteristics

Benchmarks

Metric	Value	Notes
Tokens/task	7,500	95% reduction from 150k
Cost/task	$0.11	95% savings from $2.25
Task duration	30s	33% faster with parallel
Parallel speedup	30-50%	On multi-tool tasks
Typing speed	8ms/char	Optimized from 50ms
Bash timeout	60s	Optimized from 120s

Cost Breakdown

Operation	Price
Input tokens	$3.00/1M
Output tokens	$15.00/1M
Cached input	$0.30/1M (90% savings)
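
The per-token prices translate into a per-call cost as sketched below. `task_cost` is a hypothetical helper; it assumes cached input tokens are reported separately from uncached input tokens and ignores any surcharge for writing to the cache.

```python
# Prices in USD per million tokens, copied from the table above.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00
CACHED_INPUT_PRICE = 0.30


def task_cost(input_tokens: int, output_tokens: int, cached_input_tokens: int = 0) -> float:
    """Cost in USD for one call, given uncached, cached, and output tokens."""
    total = (
        input_tokens * INPUT_PRICE
        + cached_input_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000
```

With 1,500 uncached input tokens and 500 output tokens the cost is (1,500 × 3 + 500 × 15) / 1,000,000 = $0.012; serving 1,000 of those input tokens from cache lowers it to $0.0093.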

File Organization

stateset-computer-use-agent/
├── main.py                      # Entry point, orchestration
├── agent/
│   ├── loop.py                  # Core sampling loop
│   ├── parallel_executor.py     # Parallel tool execution
│   ├── context_optimizer.py     # Context engineering
│   ├── tool_guard.py            # Safety checks
│   ├── stuck_detection.py       # Loop prevention
│   ├── verification.py          # Visual verification
│   ├── subagent.py              # Subagent spawning
│   ├── mcp_client.py            # MCP integration
│   ├── structured_output.py     # JSON schema validation
│   ├── checkpoint.py            # State persistence
│   ├── metrics.py               # Performance tracking
│   ├── skill_manager.py         # Skill system
│   ├── config.py                # Configuration
│   ├── exceptions.py            # Error hierarchy
│   ├── logging_config.py        # Structured logging
│   ├── health.py                # Health monitoring
│   ├── observability/           # Unified observability
│   │   ├── unified.py
│   │   ├── tracing.py
│   │   └── metrics.py
│   └── tools/                   # Tool implementations
│       ├── base.py
│       ├── collection.py
│       ├── computer.py
│       ├── bash.py
│       ├── edit.py
│       ├── memory.py
│       ├── agi.py
│       └── groups.py
├── dashboard/
│   ├── backend/                 # FastAPI + Celery
│   └── frontend/                # Next.js 14
├── start-*.sh                   # Launch scripts
└── test_*.py                    # Test suites

Extension Points

Adding New Agents

Define AgentConfig in AGENT_CONFIGS
Add keyword detection in get_active_agents()
Create completion indicators in analyze_task_completion()

Adding New Tools

Inherit from BaseAnthropicTool
Implement __call__ returning ToolResult
Add to version groups in agent/tools/groups.py
Update tool traits if cacheable/read-only
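
As a sketch of the first two steps, the example below defines a stand-in abstract base and a small read-only tool. `WordCountTool` is hypothetical, and the stand-in `BaseAnthropicTool` and `ToolResult` only approximate the project's classes, whose exact abstract methods and fields may differ.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ToolResult:
    output: Optional[str] = None
    error: Optional[str] = None


class BaseAnthropicTool(ABC):
    """Stand-in for the project's abstract base; the real one may differ."""

    @abstractmethod
    async def __call__(self, **kwargs: Any) -> ToolResult: ...

    @abstractmethod
    def to_params(self) -> Dict[str, Any]: ...


class WordCountTool(BaseAnthropicTool):
    """Hypothetical read-only tool that counts words in a string."""

    name = "word_count"

    async def __call__(self, **kwargs: Any) -> ToolResult:
        text = kwargs.get("text")
        if not isinstance(text, str):
            return ToolResult(error="text must be a string")
        return ToolResult(output=str(len(text.split())))

    def to_params(self) -> Dict[str, Any]:
        return {
            "name": self.name,
            "description": "Count the words in a piece of text.",
            "input_schema": {
                "type": "object",
                "properties": {"text": {"type": "string"}},
                "required": ["text"],
            },
        }


if __name__ == "__main__":
    print(asyncio.run(WordCountTool()(text="close resolved tickets")))
```

A tool like this has no side effects, so it would be a natural candidate for the cacheable and read-only traits mentioned in the last step.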

Adding MCP Servers

await mcp_manager.add_server("custom-server", {
    "command": "npx",
    "args": ["-y", "@my/custom-mcp-server"],
    "env": {"API_KEY": "..."}
})

Quick Reference

Environment Variables

# Required
ANTHROPIC_API_KEY=sk-ant-api03-...
DISPLAY=:1

# Optional
STRIPE_API_KEY=sk_live_...
WORKSPACE_PATH=/path/to/workspace
AGENT_FAST_MODE=1              # Skip verification
METRICS_PORT=9090              # Prometheus
OTLP_ENDPOINT=localhost:4317   # OpenTelemetry
LOG_FORMAT=json                # json | human | compact
LOG_LEVEL=INFO                 # DEBUG | INFO | WARNING | ERROR

Common Commands

# Run agents
python main.py "auto-close tickets"
python main.py "auto-close and social media"  # Parallel

# With options
python main.py --effort medium "task"
python main.py --tool-search regex --defer-tool agi_agent "task"

# Dashboard
cd dashboard && docker compose up -d

This architecture provides a scalable, maintainable foundation for computer use automation with AI agents, implementing production-grade patterns for reliability, observability, and cost optimization.
