Anthropic API 529 Overloaded — Model Unavailable, Agent Crashes

Symptom

API returns HTTP 529 with an error body of type overloaded_error, e.g. {"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}
Agent crashes rather than retrying
Happens during peak hours (business hours UTC) or large batch jobs
Sometimes lasts 30 seconds, sometimes 5–10 minutes
Affects specific models more than others (Opus under high load)

Root Cause

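529 is an Anthropic-specific status meaning the API is temporarily overloaded. It is not a rate limit (429): it reflects load across all users, not your own quota. Many retry implementations only handle 429, so a 529 propagates as an unhandled error and crashes the agent. The correct response is exponential backoff with jitter, optionally followed by a fallback to a different model.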
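A retry loop has to make two decisions: whether a status code is worth retrying, and how long to wait before the next attempt. The following is a minimal sketch of both. The names `is_retryable` and `backoff_delay`, and the 60-second cap, are illustrative assumptions and not part of the Anthropic SDK.

```python
import random

# Assumption: rate limits (429) and overload (529) are the transient cases.
RETRYABLE_STATUS_CODES = frozenset({429, 529})


def is_retryable(status_code: int) -> bool:
    """Return True for status codes that should trigger a retry."""
    return status_code in RETRYABLE_STATUS_CODES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0, rng=random) -> float:
    """Delay in seconds for a zero-based attempt number.

    Doubles each attempt (base, 2*base, 4*base, ...), is capped at `cap`,
    and adds up to one second of random jitter so that many clients do not
    retry at the same instant.
    """
    return min(cap, base * (2 ** attempt)) + rng.uniform(0, 1)
```

Jitter matters most during an overload: without it, every client that failed at the same moment retries at the same moment too.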

Fix

Option 1: Handle 529 in retry logic

import asyncio, random
from anthropic import AsyncAnthropic, APIStatusError

# Use the async client so each request is awaited instead of blocking the event loop.
# max_retries=0 turns off the SDK's built-in retries, leaving the loops below
# as the only retry layer.
client = AsyncAnthropic(max_retries=0)

async def complete_with_retry(messages, model="claude-opus-4-6", max_retries=5):
    base_delay = 1.0

    for attempt in range(max_retries):
        try:
            return await client.messages.create(
                model=model,
                max_tokens=4096,
                messages=messages,
            )
        except APIStatusError as e:
            # 429 = rate limit, 529 = overloaded; any other status is re-raised.
            if e.status_code in (429, 529) and attempt < max_retries - 1:
                # Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of jitter
                delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
                print(f"API {e.status_code}: retrying in {delay:.1f}s (attempt {attempt + 1})")
                await asyncio.sleep(delay)
            else:
                raise

Option 2: Fallback to lighter model on 529

MODEL_FALLBACK_CHAIN = [
    "claude-opus-4-6",
    "claude-sonnet-4-6",   # Fallback, usually less affected
    "claude-haiku-4-5-20251001",  # Emergency fallback
]

async def complete_with_fallback(messages, preferred_model="claude-opus-4-6"):
    # Start at the preferred model. If it is not in the chain, try it first
    # and then the whole chain, rather than failing with ValueError.
    if preferred_model in MODEL_FALLBACK_CHAIN:
        models = MODEL_FALLBACK_CHAIN[MODEL_FALLBACK_CHAIN.index(preferred_model):]
    else:
        models = [preferred_model] + MODEL_FALLBACK_CHAIN

    for model in models:
        for attempt in range(3):
            try:
                response = await client.messages.create(
                    model=model,
                    max_tokens=4096,
                    messages=messages,
                )
                if model != preferred_model:
                    print(f"Warning: used fallback model {model}")
                return response
            except APIStatusError as e:
                if e.status_code != 529:
                    raise  # Only overload triggers a retry or fallback
                if attempt < 2:
                    await asyncio.sleep(2 ** attempt)  # 1s, then 2s
        # Every attempt on this model returned 529, so move down the chain
        print(f"Model {model} overloaded, trying next...")

    raise RuntimeError("All models in fallback chain are overloaded")
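Note that a fixed attempt count may give up well before an overload clears. With Option 1's defaults (five attempts, 1-second base delay) the loop waits roughly 15 seconds in total, while the symptoms above mention overloads lasting 5 to 10 minutes. One possible approach for long-running jobs is to drive retries from a time budget instead. The sketch below only computes the wait schedule; `delays_within_budget` is a hypothetical helper, and jitter is left out to keep the arithmetic visible.

```python
from typing import Iterator


def delays_within_budget(budget_s: float, base: float = 1.0, cap: float = 30.0) -> Iterator[float]:
    """Yield capped exponential delays until their sum would exceed budget_s."""
    spent = 0.0
    attempt = 0
    while True:
        delay = min(cap, base * (2 ** attempt))
        if spent + delay > budget_s:
            return  # The next wait would overrun the budget, so stop retrying
        yield delay
        spent += delay
        attempt += 1


# A 10-minute budget with waits capped at 30 seconds
waits = list(delays_within_budget(600.0, base=1.0, cap=30.0))
print(len(waits), sum(waits))
```

With these values the schedule contains 23 waits totalling 571 seconds, which is enough to ride out most of the longer overloads described above.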

Option 3: Check Anthropic status before batch jobs

import httpx

async def check_anthropic_status():
    """Return True if status.anthropic.com reports no incident."""
    # Named `http` so it does not shadow the module-level Anthropic `client`
    async with httpx.AsyncClient(timeout=10.0) as http:
        resp = await http.get("https://status.anthropic.com/api/v2/status.json")
        resp.raise_for_status()  # Fail loudly if the status page itself is down
        status = resp.json()["status"]
    if status["indicator"] != "none":
        print(f"Warning: Anthropic status is '{status['indicator']}': {status['description']}")
        return False
    return True
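Refusing to start on any non-"none" indicator may be stricter than some batch jobs need. The sketch below shows one way to make the threshold configurable. It assumes the indicator follows the usual Statuspage severity values (none, minor, major, critical), which has not been verified here for status.anthropic.com; `should_start_batch` and `SEVERITY` are hypothetical names.

```python
# Assumption: Statuspage-style severity ordering.
SEVERITY = {"none": 0, "minor": 1, "major": 2, "critical": 3}


def should_start_batch(indicator: str, tolerate: str = "none") -> bool:
    """Return True if the reported severity is at or below the tolerated level.

    Unknown indicator values are treated as the worst case, so the batch
    does not start on a status this code does not understand.
    """
    level = SEVERITY.get(indicator, max(SEVERITY.values()))
    return level <= SEVERITY[tolerate]
```

For example, `should_start_batch("minor", tolerate="minor")` lets a batch proceed during a minor incident while still blocking on major or critical ones.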

Option 4: OpenClaw config — retry on 529

# openclaw.config.yaml
providers:
  anthropic:
    retry:
      on_status_codes: [429, 529]
      max_attempts: 5
      initial_delay_ms: 1000
      backoff_multiplier: 2.0
      jitter: true
    fallback_model: claude-sonnet-4-6
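To see what these retry settings would amount to in practice, the sketch below computes the waits they imply. It assumes the fields mean what their names suggest (in particular that `max_attempts` counts the first try and that jitter is added on top of these values). That reading has not been checked against OpenClaw's documentation, and `retry_schedule_ms` is a hypothetical helper.

```python
def retry_schedule_ms(max_attempts: int, initial_delay_ms: int, backoff_multiplier: float) -> list[float]:
    """Waits between attempts in milliseconds, before jitter.

    There is one wait fewer than there are attempts, because nothing
    is waited for after the final attempt fails.
    """
    return [initial_delay_ms * backoff_multiplier ** i for i in range(max_attempts - 1)]


schedule = retry_schedule_ms(max_attempts=5, initial_delay_ms=1000, backoff_multiplier=2.0)
print(schedule, sum(schedule))
```

Under that reading the config waits 1, 2, 4, and 8 seconds, about 15 seconds in total, before switching to the fallback model or giving up.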

Expected Token Savings

Agent crashing on 529 and requiring manual restart: ~5,000 tokens + lost work
Automatic retry + fallback: 0 extra tokens, seamless recovery

Environment

Anthropic API (all models)
Most common: Claude Opus during peak hours, large batch workloads
Source: direct experience, Anthropic API documentation
