Retry Semantics
This document explains how retries work in deepagent-temporal and how to configure them to avoid double-retry costs.
The Double-Retry Problem
Two retry layers exist when running LLM agents on Temporal:
- LLM SDK retries: The LLM client SDK (e.g., langchain-anthropic, openai) retries on rate limits (HTTP 429) and transient errors (5xx) internally, with exponential backoff.
- Temporal Activity retries: Temporal retries failed Activities according to the Activity's RetryPolicy. If an LLM call fails after the SDK exhausts its retry budget, Temporal retries the entire Activity, re-invoking the model from scratch.
Without configuration, both layers retry independently:
LLM SDK retry 1 → fail
LLM SDK retry 2 → fail
LLM SDK retry 3 → fail (SDK budget exhausted)
Activity fails → Temporal retry 1:
    LLM SDK retry 1 → fail
    LLM SDK retry 2 → fail
    LLM SDK retry 3 → fail
Activity fails → Temporal retry 2:
    ...
Each Temporal retry re-invokes the model and gives the SDK a fresh retry budget, so the two layers multiply rather than add. This can cause unexpected API costs, especially with expensive models.
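A minimal sketch of the worst-case arithmetic behind these costs, assuming every attempt fails and the SDK makes a fixed number of attempts per Activity execution. The function name and the example numbers are illustrative and are not part of deepagent-temporal:

```python
def worst_case_model_calls(sdk_attempts: int, temporal_attempts: int) -> int:
    """Upper bound on model invocations for one logical LLM call.

    sdk_attempts: attempts the LLM SDK makes per Activity execution
                  (the first try plus its internal retries).
    temporal_attempts: Activity executions Temporal allows
                       (the RetryPolicy's max_attempts).
    """
    if sdk_attempts < 1 or temporal_attempts < 1:
        raise ValueError("both attempt counts must be at least 1")
    return sdk_attempts * temporal_attempts


# 3 SDK attempts under 5 Temporal attempts: up to 15 model calls.
print(worst_case_model_calls(3, 5))
# With max_attempts=1 on the Activity, only the SDK's 3 attempts remain.
print(worst_case_model_calls(3, 1))
```

The bound is only reached when every attempt fails, but it is the number to budget for when choosing a retry configuration.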
Recommended Configuration
Option A: Disable Temporal Retries for LLM Activities (Recommended)
Let the LLM SDK handle its own retries. Set max_attempts=1 on LLM-calling Activities so Temporal does not retry them:
from deepagent_temporal import TemporalDeepAgent

# `agent` and `client` are assumed to have been created elsewhere.
temporal_agent = TemporalDeepAgent(
    agent, client,
    task_queue="my-agents",
    node_retry_policies=TemporalDeepAgent.recommended_retry_policies(),
)
recommended_retry_policies() returns:
| Node | max_attempts | Rationale |
|---|---|---|
| call_model | 1 | LLM SDK handles rate-limit/transient retries internally |
| tools | 1 | Tool side-effects (file writes, shell commands) are not idempotent |
Option B: Disable SDK Retries, Use Temporal Retries
Alternatively, disable retries in the LLM SDK and let Temporal handle all retries. This gives you a single retry layer with Temporal's observability (retry attempts visible in Event History).
from deepagent_temporal import TemporalDeepAgent
from langchain_anthropic import ChatAnthropic
from langgraph.temporal.config import RetryPolicyConfig

# Disable SDK retries so Temporal is the only retry layer
model = ChatAnthropic(
    model="claude-sonnet-4-20250514",
    max_retries=0,
)

# `agent` must be built with this `model` for the setting to take effect
# (agent construction not shown).

# Let Temporal handle retries
temporal_agent = TemporalDeepAgent(
    agent, client,
    node_retry_policies={
        "call_model": RetryPolicyConfig(
            max_attempts=5,
            initial_interval_seconds=2.0,
            backoff_coefficient=2.0,
            max_interval_seconds=60.0,
            non_retryable_error_types=["ContextOverflowError"],
        ),
        # Tools have side effects, so they are never retried
        "tools": RetryPolicyConfig(max_attempts=1),
    },
)
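To see what the call_model policy above implies in wall-clock terms, the waits between attempts can be computed by hand. This is a minimal sketch that assumes the RetryPolicyConfig fields map directly onto Temporal's exponential backoff (each wait is the initial interval multiplied by the coefficient once per previous retry, capped at the maximum interval) and ignores scheduling delay. The helper `retry_waits` is hypothetical and not part of any library:

```python
def retry_waits(
    max_attempts: int,
    initial_interval_seconds: float,
    backoff_coefficient: float,
    max_interval_seconds: float,
) -> list[float]:
    """Waits in seconds before each retry, i.e. before attempts 2..max_attempts."""
    waits = []
    for retry in range(max_attempts - 1):
        # Exponential growth, capped at the configured maximum interval.
        wait = initial_interval_seconds * backoff_coefficient ** retry
        waits.append(min(wait, max_interval_seconds))
    return waits


waits = retry_waits(5, 2.0, 2.0, 60.0)
print(waits)       # [2.0, 4.0, 8.0, 16.0]
print(sum(waits))  # 30.0
```

Under these assumptions, five attempts spend at most 30 seconds waiting, on top of the time each model call takes. The 60-second cap would only come into play with a higher max_attempts.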
Option C: Custom Per-Node Policies
For advanced use cases, configure different retry policies per node:
from deepagent_temporal import RetryPolicyConfig, TemporalDeepAgent

temporal_agent = TemporalDeepAgent(
    agent, client,
    node_retry_policies={
        # LLM calls: no Temporal retry (SDK handles it)
        "call_model": RetryPolicyConfig(max_attempts=1),
        # Tools: retry only if every tool on this node is read-only
        # (or otherwise idempotent), since the policy covers the whole node
        "tools": RetryPolicyConfig(
            max_attempts=3,
            initial_interval_seconds=1.0,
            backoff_coefficient=2.0,
        ),
    },
)
Activity Timeouts
In addition to retry policies, configure Activity timeouts to bound execution time:
from datetime import timedelta

from deepagent_temporal import TemporalDeepAgent
from langgraph.temporal.config import ActivityOptions

temporal_agent = TemporalDeepAgent(
    agent, client,
    node_activity_options={
        # LLM calls: bound a single model invocation
        "call_model": ActivityOptions(
            start_to_close_timeout=timedelta(minutes=5),
            heartbeat_timeout=timedelta(seconds=60),
        ),
        # Tools: allow longer-running work such as shell commands
        "tools": ActivityOptions(
            start_to_close_timeout=timedelta(minutes=30),
            heartbeat_timeout=timedelta(seconds=60),
        ),
    },
)
| Timeout | call_model | tools | Purpose |
|---|---|---|---|
| start_to_close_timeout | 5 min | 30 min | Max time for a single execution |
| heartbeat_timeout | 60 s | 60 s | Detect stuck activities |
| schedule_to_close_timeout | 10 min | 60 min | Max time including retries |
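Because schedule_to_close_timeout covers all attempts, it also caps how many slow attempts can ever run, regardless of max_attempts. A rough sketch of that interaction using the values from the table, assuming each attempt runs for the full start_to_close_timeout and ignoring backoff waits between attempts. The helper name is illustrative only:

```python
from datetime import timedelta


def full_length_attempts(start_to_close: timedelta, schedule_to_close: timedelta) -> int:
    """Number of attempts that fit if each one runs until start_to_close_timeout."""
    if start_to_close <= timedelta(0):
        raise ValueError("start_to_close must be positive")
    # Dividing two timedeltas with // yields a whole number of attempts.
    return schedule_to_close // start_to_close


# call_model: 10 min total budget, 5 min per attempt
print(full_length_attempts(timedelta(minutes=5), timedelta(minutes=10)))   # 2
# tools: 60 min total budget, 30 min per attempt
print(full_length_attempts(timedelta(minutes=30), timedelta(minutes=60)))  # 2
```

With these values, at most two attempts that each hit their timeout fit inside the overall budget. Attempts that fail quickly leave room for more, up to max_attempts.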
Cost Awareness
Each Temporal retry of an LLM Activity re-invokes the model. With the recommended max_attempts=1 configuration:
- A failed LLM call costs only the SDK's internal retries (typically 3 attempts with backoff).
- Without this setting, Temporal's default unlimited retry policy could cause runaway costs.
Monitor your LLM API usage dashboards when experimenting with retry configurations.