Production patterns for multi-agent LLM systems — circuit breakers, token tracking, session memory, and observability metrics. Drop-in FastAPI router included.
Why this exists. Most “agent framework” tutorials stop at the happy path. Production multi-agent systems fail in five predictable ways: cascading failures when one agent degrades, runaway token cost, race conditions between parallel agents, context overflow in long conversations, and zero observability when something goes wrong. This library ships opinionated, minimal-dependency primitives for all five, extracted from a working autonomous agent system.
from ralph_orchestrator import get_circuit_breaker_registry, get_token_tracker

registry = get_circuit_breaker_registry()
breaker = registry.get_breaker("inventory_agent")

# `agent` is your own agent object; run this inside an async function.
if breaker.is_available():
    try:
        result = await agent.execute()
        breaker.record_success()
        get_token_tracker().record_usage(
            agent_id="inventory_agent",
            input_tokens=result.usage.input_tokens,
            output_tokens=result.usage.output_tokens,
            model="claude-sonnet-4-5",
        )
    except Exception:
        breaker.record_failure()
        raise
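The quick start above relies on three breaker calls: is_available(), record_success(), and record_failure(). To show how a CLOSED / OPEN / HALF_OPEN state machine can sit behind that surface, here is a minimal sketch. It is not the library's implementation: the class name SketchBreaker, the failure_threshold and recovery_timeout parameters, and the injectable clock are all assumptions made for illustration.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"        # calls flow normally
    OPEN = "open"            # calls are rejected
    HALF_OPEN = "half_open"  # probe calls are allowed after the cool-down


class SketchBreaker:
    """Illustrative breaker; names and defaults are hypothetical."""

    def __init__(self, failure_threshold=3, recovery_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock  # injectable so tests do not need to sleep
        self.state = State.CLOSED
        self._failures = 0
        self._opened_at = 0.0

    def is_available(self):
        # Once the cool-down has elapsed, move OPEN -> HALF_OPEN and allow probes.
        if self.state is State.OPEN and self._clock() - self._opened_at >= self.recovery_timeout:
            self.state = State.HALF_OPEN
        return self.state is not State.OPEN

    def record_success(self):
        # Any success resets the failure count and closes the circuit.
        self._failures = 0
        self.state = State.CLOSED

    def record_failure(self):
        self._failures += 1
        # A failed probe, or too many consecutive failures, opens the circuit.
        if self.state is State.HALF_OPEN or self._failures >= self.failure_threshold:
            self.state = State.OPEN
            self._opened_at = self._clock()
```

Injecting the clock keeps the recovery path testable without real waiting; a production breaker would also need locking if agents share it across threads.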
| Module | Pattern | What it solves |
|---|---|---|
| circuit_breaker | Hystrix-style circuit breaker with CLOSED / OPEN / HALF_OPEN states, per-agent | One flaky agent takes down the whole orchestrator |
| token_tracker | SQLite-backed LLM cost accounting with per-model pricing + daily/weekly/monthly rollups | No idea which agent is burning your Anthropic budget |
| agent_memory | MemoryManager (session-isolated shared state with 4 conflict-resolution strategies) + WindowMemory (sliding window with auto-compression) | Parallel agents corrupting shared state; long sessions hitting context limits |
| agent_metrics | Call-level observability: success rate, latency, error taxonomy, agent-to-agent interaction graph, 0–100 health score | Black-box agent failures with no leading indicators |
| fastapi_router | create_orchestration_router(), which mounts /api/agents/* endpoints exposing everything above | Dashboard/Grafana needs a REST surface, not SDK imports |
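To make the "sliding window with auto-compression" idea from the agent_memory row concrete, here is a small sketch of the general technique. It does not reproduce WindowMemory's API: SketchWindow, max_messages, and the summarize callback are hypothetical names, and the default summarizer only counts evicted messages where a real system would more likely call an LLM.

```python
from collections import deque


class SketchWindow:
    """Keep the newest messages verbatim and stand in a summary for older ones."""

    def __init__(self, max_messages=4, summarize=None):
        self.max_messages = max_messages
        # Hypothetical default: a placeholder line instead of a real summary.
        self._summarize = summarize or (
            lambda msgs: f"[{len(msgs)} earlier messages omitted]"
        )
        # Evicted messages are kept here only to keep the sketch simple;
        # a real implementation would store just the running summary.
        self._evicted = []
        self._recent = deque()

    def add(self, message: str) -> None:
        self._recent.append(message)
        # Slide the window: move the oldest messages out once the cap is exceeded.
        while len(self._recent) > self.max_messages:
            self._evicted.append(self._recent.popleft())

    def context(self) -> list[str]:
        # The prompt context is the summary (if any) followed by the recent window.
        head = [self._summarize(self._evicted)] if self._evicted else []
        return head + list(self._recent)
```

The context size stays bounded by max_messages plus one summary entry, which is the property that keeps long sessions under the model's context limit.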
No third-party runtime dependencies: the core uses only the standard library. FastAPI is only required if you mount fastapi_router.
pip install -e . # core only
pip install -e ".[fastapi]" # + fastapi / pydantic
pip install -e ".[dev]" # + pytest + coverage
Python 3.10+. All persistence is SQLite (./database/metrics.db, created on first write).
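As an illustration of SQLite-backed cost accounting with per-model pricing, the sketch below records usage rows and rolls them up per agent for one day. The table layout, the function names, and the EXAMPLE_PRICING values are assumptions made for this example; they are not the library's schema and not real vendor prices.

```python
import sqlite3

# Hypothetical prices in USD per million tokens; NOT real vendor pricing.
EXAMPLE_PRICING = {"example-model": {"input": 1.00, "output": 5.00}}


def usage_cost(model, input_tokens, output_tokens):
    """Cost of one call in USD, given per-million-token prices."""
    price = EXAMPLE_PRICING[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) usage table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS usage ("
        "day TEXT, agent_id TEXT, model TEXT, "
        "input_tokens INTEGER, output_tokens INTEGER, cost_usd REAL)"
    )
    return conn


def record_usage(conn, day, agent_id, model, input_tokens, output_tokens):
    """Insert one usage row with its cost computed at write time."""
    cost = usage_cost(model, input_tokens, output_tokens)
    conn.execute(
        "INSERT INTO usage VALUES (?, ?, ?, ?, ?, ?)",
        (day, agent_id, model, input_tokens, output_tokens, cost),
    )
    conn.commit()


def daily_cost_by_agent(conn, day):
    """Return {agent_id: total cost in USD} for the given ISO date string."""
    rows = conn.execute(
        "SELECT agent_id, SUM(cost_usd) FROM usage "
        "WHERE day = ? GROUP BY agent_id ORDER BY agent_id",
        (day,),
    )
    return dict(rows.fetchall())
```

Storing the computed cost alongside the token counts means later price changes do not silently rewrite historical totals; weekly and monthly rollups would follow the same GROUP BY pattern over a date range.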
pytest
Three test files ship with the library (tests/test_circuit_breaker.py, tests/test_token_tracker.py, tests/test_agent_memory.py) covering state transitions, cost rollups, conflict resolution, and compression. No external services required.
Mount create_orchestration_router() on any FastAPI app and you get:
GET /api/agents/token-stats?period=daily|weekly|monthly
GET /api/agents/token-stats/trend?days=7
GET /api/agents/metrics?period=daily
GET /api/agents/metrics/trend?days=7
GET /api/agents/metrics/errors
GET /api/agents/metrics/interactions # nodes + edges graph
GET /api/agents/health # system-wide summary
GET /api/agents/circuit-breakers
POST /api/agents/circuit-breakers/{agent_id}/reset
GET /api/agents/memory/stats
Point Grafana, Metabase, or a custom dashboard at those endpoints and you’re done.
from fastapi import FastAPI
from ralph_orchestrator.fastapi_router import create_orchestration_router
app = FastAPI()
app.include_router(create_orchestration_router())
- get_circuit_breaker_registry(), get_token_tracker(), and get_agent_metrics() return process-local singletons. Simple to reason about; swap for DI if you need to.
- MemoryManager.create_session() deep-copies shared state at session start; the session commits atomically with configurable conflict handling (LAST_WRITE_WINS, FIRST_WRITE_WINS, MERGE_ARRAYS, ERROR_ON_CONFLICT).
- Adjust AgentMetrics._calculate_health_score for your workload.
- See test_circuit_breaker.py::TestCircuitBreakerIntegration::test_typical_failure_scenario for the full CLOSED → OPEN → HALF_OPEN → CLOSED recovery path.
- Pricing is defined in MODEL_PRICING. Update it as Anthropic / OpenAI ship new models.

src/ralph_orchestrator/
  circuit_breaker.py   # fault tolerance
  token_tracker.py     # cost accounting
  agent_memory.py      # MemoryManager + WindowMemory
  agent_metrics.py     # health scoring + interaction graph
  fastapi_router.py    # REST surface
tests/
  test_circuit_breaker.py
  test_token_tracker.py
  test_agent_memory.py
examples/
  quickstart.py
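The four conflict-handling strategies named earlier (LAST_WRITE_WINS, FIRST_WRITE_WINS, MERGE_ARRAYS, ERROR_ON_CONFLICT) are easier to compare side by side. The sketch below shows one plausible reading of each for dictionary state. The commit() function, the ConflictError class, and the exact merge semantics are assumptions for illustration; MemoryManager may define them differently, and key deletions are ignored here.

```python
import copy
from enum import Enum

_MISSING = object()  # sentinel so "key absent" differs from "key is None"


class Strategy(Enum):
    LAST_WRITE_WINS = "last_write_wins"
    FIRST_WRITE_WINS = "first_write_wins"
    MERGE_ARRAYS = "merge_arrays"
    ERROR_ON_CONFLICT = "error_on_conflict"


class ConflictError(Exception):
    """Raised under ERROR_ON_CONFLICT (hypothetical name)."""


def commit(shared, snapshot, session, strategy):
    """Return the new shared state after merging one session's changes.

    shared:   current shared state (other sessions may have committed to it)
    snapshot: deep copy of the shared state taken when this session started
    session:  this session's working copy
    """
    result = copy.deepcopy(shared)
    for key, value in session.items():
        if key in snapshot and snapshot[key] == value:
            continue  # this session never changed the key
        # Conflict: the key changed in shared state after our snapshot was taken.
        conflict = shared.get(key, _MISSING) != snapshot.get(key, _MISSING)
        if not conflict or strategy is Strategy.LAST_WRITE_WINS:
            result[key] = copy.deepcopy(value)
        elif strategy is Strategy.FIRST_WRITE_WINS:
            continue  # keep the value that was committed first
        elif strategy is Strategy.MERGE_ARRAYS:
            current = shared.get(key)
            if isinstance(current, list) and isinstance(value, list):
                # Append the session's new items, skipping duplicates.
                result[key] = current + [v for v in value if v not in current]
            else:
                result[key] = copy.deepcopy(value)  # non-lists fall back to last write
        else:
            raise ConflictError(key)
    return result
```

Because commit() builds and returns a new dictionary, the caller can swap it in under a lock in a single step, which is one simple way to make the commit appear atomic to other sessions.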
MIT — see LICENSE.