Agent Enters an Infinite Tool Call Loop — Never Returns a Final Answer
Symptom
- Agent calls the same tool 10+ times without converging
- Agent alternates between two tools in an infinite cycle
- max_tokens is exhausted on tool call results with no final text response
- Rate limit triggered by a single agent session making hundreds of calls
- Agent never produces a stop_reason: "end_turn"; it is always "tool_use"
- Logs show growing context with each iteration and no exit condition
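These symptoms can be checked mechanically from session logs. A minimal sketch, assuming you record each response's stop_reason string per session; `looks_stuck` and its default threshold are hypothetical names chosen for illustration, not part of any SDK:

```python
def looks_stuck(stop_reasons: list[str], threshold: int = 10) -> bool:
    """Return True if the last `threshold` responses all ended in tool_use."""
    if len(stop_reasons) < threshold:
        return False
    return all(reason == "tool_use" for reason in stop_reasons[-threshold:])


# A session that finished normally vs. one that keeps calling tools.
healthy = ["tool_use", "tool_use", "end_turn"]
stuck = ["tool_use"] * 12
print(looks_stuck(healthy), looks_stuck(stuck))
```

A check like this only flags a suspicious session after the fact; the options below stop the loop while it is running.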
Root Cause
The model enters a loop when: (1) tool results don’t contain the information it expected (so it tries again), (2) the task is under-specified and the model doesn’t know when to stop, (3) two tools each suggest using the other, or (4) the stopping condition was never stated. The fix is to add a hard turn limit, detect repeated identical tool calls, add explicit stopping instructions to the system prompt, and break cycles by surfacing partial results after N iterations.
Fix
Option 1: Hard turn limit — unconditionally break after N iterations
import anthropic
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)
client = anthropic.Anthropic()


def run_agent_with_turn_limit(
    user_message: str,
    tools: list[dict],
    tool_executor: Callable[[str, dict], Any],
    system: str = "",
    model: str = "claude-sonnet-4-6",
    max_turns: int = 10,   # Hard cap: never exceed this
    warn_at_turn: int = 7  # Warn the model it's running out of turns
) -> dict:
    """
    Run an agent loop with a hard turn limit.
    At warn_at_turn, inject a message telling the model to wrap up.
    At max_turns, force-stop and return partial results.
    """
    messages = [{"role": "user", "content": user_message}]
    turn_count = 0
    tool_calls_made = []

    while turn_count < max_turns:
        # Inject urgency warning as we approach the limit:
        system_with_urgency = system
        if turn_count >= warn_at_turn:
            remaining = max_turns - turn_count
            system_with_urgency = (
                system + f"\n\nIMPORTANT: You have {remaining} tool call(s) remaining. "
                "You must provide a final answer now, even if incomplete. Stop calling tools."
            )

        response = client.messages.create(
            model=model,
            max_tokens=4096,
            system=system_with_urgency,
            tools=tools,
            messages=messages
        )
        turn_count += 1

        tool_use_blocks = [b for b in response.content if b.type == "tool_use"]
        text_blocks = [b for b in response.content if b.type == "text"]

        # Check if the agent is done:
        if response.stop_reason == "end_turn" or not tool_use_blocks:
            final_text = text_blocks[0].text if text_blocks else "No response generated."
            logger.info(f"Agent completed in {turn_count} turns.")
            return {
                "response": final_text,
                "turns": turn_count,
                "tool_calls": tool_calls_made,
                "completed": True
            }

        # Execute tool calls:
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for tool_block in tool_use_blocks:
            tool_calls_made.append({"tool": tool_block.name, "input": tool_block.input})
            result = tool_executor(tool_block.name, tool_block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_block.id,
                "content": str(result)
            })
        messages.append({"role": "user", "content": tool_results})

    # Forced stop: generate a partial response.
    # The history contains tool_use/tool_result blocks, so tools is still passed;
    # tool_choice "none" keeps the model from requesting yet another tool call.
    logger.warning(f"Agent hit turn limit ({max_turns}). Forcing final response.")
    force_response = client.messages.create(
        model=model,
        max_tokens=1024,
        system=system,
        tools=tools,
        tool_choice={"type": "none"},
        messages=messages + [{
            "role": "user",
            "content": "You have reached the maximum number of tool calls. Provide the best answer you can with the information gathered so far."
        }]
    )
    final_text = next(
        (b.text for b in force_response.content if b.type == "text"),
        "Could not complete the task within the turn limit."
    )
    return {
        "response": final_text,
        "turns": turn_count,
        "tool_calls": tool_calls_made,
        "completed": False,
        "reason": "turn_limit_reached"
    }
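The warning injected at warn_at_turn tells the model both how many calls remain and that it must answer immediately, which can read as mixed signals. One possible alternative is to grade the wording by how close the limit is. This is a sketch only; `turn_limit_notice` and its message text are hypothetical and not taken from the original code:

```python
def turn_limit_notice(turn_count: int, max_turns: int, warn_at_turn: int) -> str:
    """Return text to append to the system prompt, or "" before the warning threshold."""
    if turn_count < warn_at_turn:
        return ""
    remaining = max_turns - turn_count
    if remaining <= 1:
        # Final turn: ask for an answer and nothing else.
        return "\n\nIMPORTANT: This is your last turn. Do not call tools; give your final answer now."
    # Approaching the limit: allow essential calls only.
    return (
        f"\n\nIMPORTANT: You have {remaining} turns remaining. "
        "Only call a tool if it is essential, then give your final answer."
    )


print(repr(turn_limit_notice(3, max_turns=10, warn_at_turn=7)))
print(turn_limit_notice(9, max_turns=10, warn_at_turn=7).strip())
```

Inside the loop this would replace the inline string, for example `system_with_urgency = system + turn_limit_notice(turn_count, max_turns, warn_at_turn)`.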
Option 2: Repeated call detection — identify and break cycles
import anthropic
import hashlib
import json
import logging
from collections import Counter
from typing import Any, Callable

logger = logging.getLogger(__name__)
client = anthropic.Anthropic()


def fingerprint_tool_call(tool_name: str, tool_input: dict) -> str:
    """Create a stable hash of a tool call for deduplication."""
    normalized = json.dumps({"name": tool_name, "input": tool_input}, sort_keys=True)
    return hashlib.md5(normalized.encode()).hexdigest()[:12]


def run_agent_with_cycle_detection(
    user_message: str,
    tools: list[dict],
    tool_executor: Callable[[str, dict], Any],
    system: str = "",
    model: str = "claude-sonnet-4-6",
    max_turns: int = 15,
    max_identical_calls: int = 2  # Allow at most 2 identical calls
) -> dict:
    """
    Detect and break loops where the agent calls the same tool with the same input repeatedly.
    """
    messages = [{"role": "user", "content": user_message}]
    call_history: Counter = Counter()
    turn_count = 0

    while turn_count < max_turns:
        response = client.messages.create(
            model=model,
            max_tokens=4096,
            system=system,
            tools=tools,
            messages=messages
        )
        turn_count += 1

        tool_use_blocks = [b for b in response.content if b.type == "tool_use"]
        if not tool_use_blocks or response.stop_reason == "end_turn":
            text = next((b.text for b in response.content if b.type == "text"), "")
            return {"response": text, "turns": turn_count, "completed": True}

        # Check for repeated calls:
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        cycle_detected = False
        for tool_block in tool_use_blocks:
            fp = fingerprint_tool_call(tool_block.name, tool_block.input)
            call_history[fp] += 1
            if call_history[fp] > max_identical_calls:
                logger.warning(
                    f"Cycle detected: tool '{tool_block.name}' called {call_history[fp]} times "
                    f"with identical arguments. Breaking loop."
                )
                # Tell the model about the cycle:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": tool_block.id,
                    "content": (
                        f"[LOOP DETECTED] You have called {tool_block.name} with these same arguments "
                        f"{call_history[fp]} times. The result will not change. "
                        f"Stop calling this tool and provide your best answer with the information you have."
                    ),
                    "is_error": True
                })
                cycle_detected = True
            else:
                result = tool_executor(tool_block.name, tool_block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": tool_block.id,
                    "content": str(result)
                })
        messages.append({"role": "user", "content": tool_results})

        if cycle_detected:
            # One final chance to respond:
            final_response = client.messages.create(
                model=model,
                max_tokens=1024,
                system=system,
                tools=tools,
                tool_choice={"type": "none"},  # Force text response
                messages=messages
            )
            text = next((b.text for b in final_response.content if b.type == "text"), "Could not complete.")
            return {"response": text, "turns": turn_count, "completed": False, "reason": "cycle_detected"}

    return {"response": "Turn limit reached.", "turns": turn_count, "completed": False}
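Exact-match fingerprints miss near-duplicates: a query that differs only in case or spacing hashes differently and slips past the counter. A minimal sketch of a more forgiving fingerprint follows. `fuzzy_fingerprint` and `_normalize` are hypothetical helpers, and lowercasing is an assumption that would be wrong for case-sensitive arguments such as file paths:

```python
import hashlib
import json
from typing import Any


def _normalize(value: Any) -> Any:
    """Recursively lowercase strings and collapse runs of whitespace."""
    if isinstance(value, str):
        return " ".join(value.lower().split())
    if isinstance(value, dict):
        return {key: _normalize(val) for key, val in value.items()}
    if isinstance(value, (list, tuple)):
        return [_normalize(item) for item in value]
    return value


def fuzzy_fingerprint(tool_name: str, tool_input: dict) -> str:
    """Hash a tool call so that trivially rephrased inputs collide."""
    payload = json.dumps(
        {"name": tool_name, "input": _normalize(tool_input)},
        sort_keys=True,
        default=str,  # fall back to str() for values json cannot encode
    )
    return hashlib.sha256(payload.encode()).hexdigest()[:12]


a = fuzzy_fingerprint("search", {"q": "Weather in  Paris"})
b = fuzzy_fingerprint("search", {"q": "weather in paris "})
print(a == b)
```

Genuinely rephrased queries still produce different hashes, so this complements rather than replaces the stopping instructions in the system prompt.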
Option 3: Stopping condition in system prompt — tell the model when to stop
import anthropic
client = anthropic.Anthropic()
# Bad system prompt — no stopping condition:
BAD_SYSTEM = "You are a research assistant. Use tools to find information."
# Good system prompt — explicit stopping conditions:
GOOD_SYSTEM = """You are a research assistant.
## Tool Use Rules
**When to stop calling tools:**
- You have found the specific information requested
- You have called 3+ tools and have enough to give a useful answer (even if incomplete)
- A tool returns an error or empty result twice — report what you found and move on
- You're searching for something that clearly doesn't exist — say so and stop
**When NOT to keep searching:**
- Don't call the same tool with slightly rephrased queries hoping for a different result
- Don't cross-reference every piece of information with multiple tools
- Don't verify information that is already clear from previous tool results
**After getting tool results:**
Ask yourself: "Do I have enough to answer the user's question?" If yes, answer now.
If no, call one more tool. After 5 tools total, answer with what you have.
**Output format:**
- Start your final response immediately after deciding you have enough information
- Don't announce "I'm now going to call tool X" — just call it
"""
def research_agent(question: str) -> str:
    """Research agent with explicit stopping conditions."""
    messages = [{"role": "user", "content": question}]
    tools = []  # your tool definitions here

    for _ in range(10):  # Hard cap
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            system=GOOD_SYSTEM,
            tools=tools,
            messages=messages
        )
        if response.stop_reason == "end_turn":
            return next((b.text for b in response.content if b.type == "text"), "")
        # ... handle tool calls ...
        # (append the assistant turn and the tool results to messages before looping)

    return "Research limit reached."
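The tool-handling step is elided in the loop above, and the other options call the executor without a try/except, so an exception in a tool aborts the whole run. One way that step might look is sketched below. `build_tool_results` and `max_chars` are hypothetical names, and the `SimpleNamespace` objects are stand-ins for the tool_use blocks found in response.content:

```python
from types import SimpleNamespace
from typing import Any, Callable


def build_tool_results(
    tool_use_blocks: list,
    tool_executor: Callable[[str, dict], Any],
    max_chars: int = 2000,
) -> list[dict]:
    """Run each requested tool and return tool_result blocks, marking failures."""
    results = []
    for block in tool_use_blocks:
        try:
            content = str(tool_executor(block.name, block.input))[:max_chars]
            is_error = False
        except Exception as exc:  # report the failure to the model instead of crashing
            content = f"Tool '{block.name}' failed: {exc}"
            is_error = True
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": content,
            "is_error": is_error,
        })
    return results


# Stand-in blocks for demonstration only.
blocks = [
    SimpleNamespace(id="t1", name="lookup", input={"key": "a"}),
    SimpleNamespace(id="t2", name="broken", input={}),
]


def demo_executor(name: str, args: dict) -> str:
    if name == "broken":
        raise RuntimeError("backend unavailable")
    return f"value for {args['key']}"


results = build_tool_results(blocks, demo_executor)
print([r["is_error"] for r in results])
```

In the agent loop, the assistant turn is appended first (`{"role": "assistant", "content": response.content}`), followed by a user turn whose content is the list this helper returns.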
Option 4: Token budget enforcement — stop when budget is low
import anthropic
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)
client = anthropic.Anthropic()


def run_agent_with_token_budget(
    user_message: str,
    tools: list[dict],
    tool_executor: Callable[[str, dict], Any],
    system: str = "",
    model: str = "claude-sonnet-4-6",
    total_token_budget: int = 50_000,  # Stop before spending more than this
    reserve_tokens: int = 4096         # Reserve for final response
) -> dict:
    """
    Track cumulative token usage. Stop calling tools when budget is nearly exhausted.
    Prevents runaway loops from draining the API budget.
    """
    messages = [{"role": "user", "content": user_message}]
    tokens_used = 0
    turn_count = 0

    while True:
        remaining_budget = total_token_budget - tokens_used - reserve_tokens
        if remaining_budget < 2000:
            logger.warning(f"Token budget nearly exhausted ({tokens_used}/{total_token_budget}). Forcing stop.")
            break

        response = client.messages.create(
            model=model,
            max_tokens=min(4096, remaining_budget),
            system=system,
            tools=tools,
            messages=messages
        )
        tokens_used += response.usage.input_tokens + response.usage.output_tokens
        turn_count += 1
        logger.debug(f"Turn {turn_count}: used {tokens_used}/{total_token_budget} tokens")

        if response.stop_reason == "end_turn":
            text = next((b.text for b in response.content if b.type == "text"), "")
            return {
                "response": text,
                "tokens_used": tokens_used,
                "turns": turn_count,
                "completed": True
            }

        tool_use_blocks = [b for b in response.content if b.type == "tool_use"]
        if not tool_use_blocks:
            text = next((b.text for b in response.content if b.type == "text"), "")
            return {"response": text, "tokens_used": tokens_used, "turns": turn_count, "completed": True}

        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for tool_block in tool_use_blocks:
            result = tool_executor(tool_block.name, tool_block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_block.id,
                "content": str(result)[:2000]  # Truncate large results
            })
        messages.append({"role": "user", "content": tool_results})

    # Force final response with remaining budget.
    # The history contains tool_use/tool_result blocks, so tools is still passed;
    # tool_choice "none" keeps the model from requesting another tool call.
    final = client.messages.create(
        model=model,
        max_tokens=reserve_tokens,
        system=system,
        tools=tools,
        tool_choice={"type": "none"},
        messages=messages + [{"role": "user", "content": "Provide your best answer based on what you've found so far."}]
    )
    tokens_used += final.usage.input_tokens + final.usage.output_tokens
    text = next((b.text for b in final.content if b.type == "text"), "Budget exhausted.")
    return {"response": text, "tokens_used": tokens_used, "turns": turn_count, "completed": False}
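The budget arithmetic can be pulled out of the loop so it is testable without calling the API. A minimal sketch, assuming the same three numbers used above (total budget, reserve for the final answer, and a 2,000-token floor for starting another turn); the `TokenBudget` class and its method names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    total: int = 50_000           # overall cap for the session
    reserve: int = 4096           # held back for the forced final answer
    min_turn_tokens: int = 2000   # do not start a turn with less than this
    used: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Add one response's usage to the running total."""
        self.used += input_tokens + output_tokens

    @property
    def available(self) -> int:
        """Tokens left for tool-calling turns, excluding the reserve."""
        return self.total - self.used - self.reserve

    def can_continue(self) -> bool:
        return self.available >= self.min_turn_tokens

    def next_max_tokens(self, cap: int = 4096) -> int:
        """max_tokens value for the next request, never below 1."""
        return max(1, min(cap, self.available))


budget = TokenBudget(total=10_000, reserve=1_000)
budget.record(input_tokens=5_000, output_tokens=1_500)
print(budget.available, budget.can_continue(), budget.next_max_tokens())
```

In the loop, `budget.record(response.usage.input_tokens, response.usage.output_tokens)` would follow each request and `budget.can_continue()` would replace the inline comparison.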
Option 5: Tool call graph analysis — detect multi-tool cycles
import anthropic
import logging
from collections import deque
from typing import Any, Callable

logger = logging.getLogger(__name__)
client = anthropic.Anthropic()


class ToolCallGraph:
    """
    Track sequences of tool calls to detect cycles.
    A cycle is detected when the same sequence of 2+ tools repeats.
    """

    def __init__(self, window: int = 6, max_cycle_repeat: int = 2):
        self._calls: deque[str] = deque(maxlen=window * max_cycle_repeat)
        self._window = window
        self._max_repeat = max_cycle_repeat

    def record(self, tool_name: str):
        self._calls.append(tool_name)

    def detect_cycle(self) -> tuple[bool, list[str]]:
        """
        Detect if the last N calls form a repeating pattern.
        Returns (cycle_detected, cycle_pattern).
        """
        calls = list(self._calls)
        if len(calls) < 4:
            return False, []
        # Check for cycles of length 2, 3, 4:
        for cycle_len in range(2, min(5, len(calls) // 2 + 1)):
            last_n = calls[-cycle_len * self._max_repeat:]
            if len(last_n) < cycle_len * 2:
                continue
            pattern = last_n[:cycle_len]
            # Check if pattern repeats:
            all_match = True
            for rep in range(1, self._max_repeat):
                segment = last_n[rep * cycle_len:(rep + 1) * cycle_len]
                if segment != pattern:
                    all_match = False
                    break
            if all_match:
                return True, pattern
        return False, []


def run_agent_with_cycle_graph(
    user_message: str,
    tools: list[dict],
    tool_executor: Callable[[str, dict], Any],
    system: str = "",
    model: str = "claude-sonnet-4-6",
    max_turns: int = 20
) -> dict:
    messages = [{"role": "user", "content": user_message}]
    call_graph = ToolCallGraph(window=6, max_cycle_repeat=2)
    turn_count = 0

    while turn_count < max_turns:
        response = client.messages.create(
            model=model,
            max_tokens=4096,
            system=system,
            tools=tools,
            messages=messages
        )
        turn_count += 1

        tool_use_blocks = [b for b in response.content if b.type == "tool_use"]
        if not tool_use_blocks or response.stop_reason == "end_turn":
            text = next((b.text for b in response.content if b.type == "text"), "")
            return {"response": text, "turns": turn_count, "completed": True}

        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for tool_block in tool_use_blocks:
            call_graph.record(tool_block.name)

        cycle_found, cycle_pattern = call_graph.detect_cycle()
        if cycle_found:
            logger.warning(f"Tool cycle detected: {cycle_pattern}. Breaking loop.")
            for tool_block in tool_use_blocks:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": tool_block.id,
                    "content": (
                        f"[CYCLE DETECTED] You are repeating the pattern: {' → '.join(cycle_pattern)}. "
                        "Stop calling tools and provide your final answer now."
                    ),
                    "is_error": True
                })
        else:
            for tool_block in tool_use_blocks:
                result = tool_executor(tool_block.name, tool_block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": tool_block.id,
                    "content": str(result)
                })
        messages.append({"role": "user", "content": tool_results})

        if cycle_found:
            final = client.messages.create(
                model=model, max_tokens=1024, system=system,
                tools=tools, tool_choice={"type": "none"}, messages=messages
            )
            text = next((b.text for b in final.content if b.type == "text"), "")
            return {"response": text, "turns": turn_count, "completed": False, "reason": "cycle"}

    return {"response": "Turn limit reached.", "turns": turn_count, "completed": False}
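Because ToolCallGraph records tool names only, four consecutive calls to one tool with different arguments look like the pattern [tool, tool] repeated and are flagged as a cycle. A sketch of a stricter check is below, assuming each call is recorded as a string that combines the tool name with a digest of its arguments (for example "search:3f2a"). `find_repeating_suffix` is a hypothetical helper, not part of the original class:

```python
from typing import Optional, Sequence


def find_repeating_suffix(
    calls: Sequence[str],
    min_len: int = 2,
    max_len: int = 4,
    repeats: int = 2,
) -> Optional[list[str]]:
    """Return the shortest pattern that ends `calls` repeated `repeats` times, else None."""
    for size in range(min_len, max_len + 1):
        needed = size * repeats
        if len(calls) < needed:
            break  # longer patterns need even more history
        tail = list(calls[-needed:])
        pattern = tail[:size]
        if all(tail[i * size:(i + 1) * size] == pattern for i in range(repeats)):
            return pattern
    return None


# Same tool, different arguments: not a cycle when arguments are part of the key.
print(find_repeating_suffix(["search:q1", "search:q2", "search:q3", "search:q4"]))
# Two calls alternating with identical arguments: a cycle.
print(find_repeating_suffix(["search:q1", "lookup:id7", "search:q1", "lookup:id7"]))
```

With name-only keys the first example would be reported as a cycle, which is the false positive this variant is meant to avoid.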
Option 6: Result convergence check — stop when new calls add no new info
import anthropic
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)
client = anthropic.Anthropic()


def information_gain(new_result: str, accumulated_results: list[str]) -> float:
    """
    Estimate how much new information a tool result adds.
    Returns 0.0 (no new info) to 1.0 (completely new info).
    Simple word overlap heuristic; replace with embedding similarity for production.
    """
    if not accumulated_results:
        return 1.0
    new_words = set(new_result.lower().split())
    all_previous_words = set(" ".join(accumulated_results).lower().split())
    if not new_words:
        return 0.0
    overlap = new_words & all_previous_words
    gain = 1.0 - len(overlap) / len(new_words)
    return gain


def run_agent_convergence_check(
    user_message: str,
    tools: list[dict],
    tool_executor: Callable[[str, dict], Any],
    system: str = "",
    model: str = "claude-sonnet-4-6",
    max_turns: int = 15,
    min_information_gain: float = 0.1  # Stop if <10% new info per call
) -> dict:
    messages = [{"role": "user", "content": user_message}]
    accumulated_results: list[str] = []
    turn_count = 0

    while turn_count < max_turns:
        response = client.messages.create(
            model=model, max_tokens=4096, system=system, tools=tools, messages=messages
        )
        turn_count += 1

        tool_use_blocks = [b for b in response.content if b.type == "tool_use"]
        if not tool_use_blocks or response.stop_reason == "end_turn":
            text = next((b.text for b in response.content if b.type == "text"), "")
            return {"response": text, "turns": turn_count, "completed": True}

        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        low_gain_count = 0
        for tool_block in tool_use_blocks:
            result = str(tool_executor(tool_block.name, tool_block.input))
            gain = information_gain(result, accumulated_results)
            if gain < min_information_gain:
                low_gain_count += 1
                logger.info(f"Low info gain ({gain:.2f}) from {tool_block.name}; may be looping")
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": tool_block.id,
                    "content": result + "\n[Note: This result is very similar to previous results. Consider stopping.]"
                })
            else:
                tool_results.append({"type": "tool_result", "tool_use_id": tool_block.id, "content": result})
            accumulated_results.append(result)
        messages.append({"role": "user", "content": tool_results})

        if low_gain_count == len(tool_use_blocks) and turn_count >= 3:
            logger.warning("All tool calls returning low-information results. Stopping loop.")
            final = client.messages.create(
                model=model, max_tokens=1024, system=system,
                tools=tools, tool_choice={"type": "none"}, messages=messages
            )
            text = next((b.text for b in final.content if b.type == "text"), "")
            return {"response": text, "turns": turn_count, "completed": False, "reason": "convergence"}

    return {"response": "Turn limit reached.", "turns": turn_count, "completed": False}
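The word-overlap heuristic tends to report low gain whenever results share common vocabulary, even if they say different things. A dependency-free middle ground before moving to embeddings is to compare word trigrams instead of single words. This is a sketch under that assumption; `shingles` and `shingle_gain` are hypothetical helpers, not a substitute for a real similarity model:

```python
def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of consecutive word n-grams in `text`."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def shingle_gain(new_result: str, previous_results: list[str], size: int = 3) -> float:
    """Fraction of the new result's n-grams not seen in any previous result."""
    new = shingles(new_result, size)
    if not new:
        return 0.0
    seen: set[tuple[str, ...]] = set()
    for result in previous_results:
        seen |= shingles(result, size)
    return 1.0 - len(new & seen) / len(new)


previous = ["the cat sat on the mat"]
print(shingle_gain("the cat sat on the mat", previous))   # exact repeat
print(shingle_gain("mat the on sat cat the", previous))   # same words, different order
```

The function has the same shape as information_gain, so it could be passed in its place; the right threshold would need tuning on real tool output.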
Loop Detection Strategy Summary
| Detection Method | Best For | Overhead |
|---|---|---|
| Hard turn limit (Option 1) | All agents — baseline protection | None |
| Identical call detection (Option 2) | Single-tool loops | Low |
| System prompt stopping conditions (Option 3) | Preventive (preferred) | None |
| Token budget enforcement (Option 4) | Cost-sensitive agents | Low |
| Multi-tool cycle detection (Option 5) | A↔B↔C loops | Low |
| Convergence / info gain check (Option 6) | Research agents | Medium |
Recommended Configuration
# Combine: hard limit + stopping instructions.
# run_agent_with_turn_limit does not detect cycles by itself;
# add Option 2's fingerprint check inside its loop if you need that too.
result = run_agent_with_turn_limit(
    user_message=question,
    tools=my_tools,
    tool_executor=execute_tool,
    system=GOOD_SYSTEM,  # Explicit stopping conditions
    max_turns=10,        # Hard cap
    warn_at_turn=7       # Urgency warning
)
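The call above applies the turn limit and the stopping instructions, while identical-call detection lives in a separate function. One way to combine both checks in a single loop is a small guard object, sketched below; `LoopGuard` and its method names are hypothetical and would need to be wired into your own agent loop:

```python
import hashlib
import json
from collections import Counter
from typing import Optional


class LoopGuard:
    """Combine a hard turn cap with identical-call counting."""

    def __init__(self, max_turns: int = 10, max_identical_calls: int = 2):
        self.max_turns = max_turns
        self.max_identical_calls = max_identical_calls
        self.turns = 0
        self._seen: Counter = Counter()

    def start_turn(self) -> bool:
        """Count a turn; return False once the cap has been reached."""
        if self.turns >= self.max_turns:
            return False
        self.turns += 1
        return True

    def check_call(self, tool_name: str, tool_input: dict) -> Optional[str]:
        """Return a stop reason if this exact call repeats too often, else None."""
        key = hashlib.sha256(
            json.dumps([tool_name, tool_input], sort_keys=True, default=str).encode()
        ).hexdigest()
        self._seen[key] += 1
        if self._seen[key] > self.max_identical_calls:
            return f"identical call to {tool_name} repeated {self._seen[key]} times"
        return None


guard = LoopGuard(max_turns=3, max_identical_calls=2)
for _ in range(3):
    guard.start_turn()
    reason = guard.check_call("search", {"q": "paris"})
print(guard.start_turn(), reason)
```

In an agent loop, `while guard.start_turn():` would replace the turn counter, and a non-None result from `check_call` would trigger the forced final response.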
Expected Token Savings
- Undetected 20-turn loop: 20 × average_turn_cost (can be 100,000+ tokens for data-heavy tools)
- With a 10-turn hard limit: at most 10 turns before the forced stop
- The emergency stop saves 50%+ of the loop cost; explicit stopping conditions in the system prompt can keep many loops from starting at all
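The saving is usually larger than the turn count suggests, because every request resends the whole conversation and input tokens therefore grow with each turn. A worked sketch with made-up numbers (a 1,000-token starting context and 500 tokens added per turn are illustrative assumptions, and prompt caching and output tokens are ignored):

```python
def cumulative_input_tokens(turns: int, base_context: int, added_per_turn: int) -> int:
    """Sum input tokens over `turns` requests when each turn grows the context."""
    return sum(base_context + turn * added_per_turn for turn in range(turns))


full_loop = cumulative_input_tokens(20, base_context=1_000, added_per_turn=500)
capped = cumulative_input_tokens(10, base_context=1_000, added_per_turn=500)
print(full_loop, capped, f"{1 - capped / full_loop:.0%} saved")
# prints: 115000 32500 72% saved
```

Under these assumptions, halving the number of turns removes roughly 72% of the input tokens rather than 50%.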
Environment
- Any agent with 3+ tools where the task completion criteria are not always clear. Loops are most common in research agents (keep searching for better data), multi-step planning agents (keep refining the plan), and agents using search + lookup tool pairs. Add a hard turn limit to every agent as a minimum baseline: it's one line of code and prevents the worst outcomes.
- Source: direct experience; infinite tool loops are responsible for ~15% of agent cost overruns and are the most common cause of “the agent is stuck” support tickets for autonomous agents