Sandbox Tradeoffs
This document explains why deepagent-temporal requires UnsandboxedWorkflowRunner and what that implies for workflow replay determinism.
Why Unsandboxed?
Temporal's Python SDK includes a workflow sandbox that prevents non-deterministic operations (I/O, random, system calls) inside workflow code. The sandbox works by restricting imports: any module that could introduce non-determinism is blocked.
deepagent-temporal requires UnsandboxedWorkflowRunner() because the LangGraph runtime imports modules that trigger sandbox violations:
- langchain-core: imports json, hashlib, uuid, and other modules that the sandbox flags as non-deterministic.
- pydantic: used extensively by LangChain for data validation; triggers sandbox restrictions on datetime and typing internals.
- langgraph checkpoint machinery: accesses threading, asyncio, and collections internals.
These imports happen at module load time, not at workflow execution time. The sandbox cannot distinguish between "this module is imported but only used in Activities" and "this module does non-deterministic things in workflow code."
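The load-time point can be illustrated with a toy import guard. This is a minimal sketch, not Temporal's sandbox: `load_with_import_guard`, `FLAGGED`, and `TOY_DEPENDENCY` are hypothetical names invented for the illustration, and the mechanism (a swapped-in `__import__`) is only a stand-in for whatever the real sandbox does.

```python
import builtins
import types

# Hypothetical stand-ins for modules a sandbox might flag.
FLAGGED = {"uuid", "hashlib"}


def load_with_import_guard(source, name):
    """Execute module source and record flagged imports made at load time."""
    seen = []
    real_import = builtins.__import__

    def guarded_import(modname, globals=None, locals=None, fromlist=(), level=0):
        # Record the import, then defer to the real import machinery.
        if modname.split(".")[0] in FLAGGED:
            seen.append(modname)
        return real_import(modname, globals, locals, fromlist, level)

    module = types.ModuleType(name)
    # Give the module its own builtins so only its import statements are observed.
    module.__dict__["__builtins__"] = {**vars(builtins), "__import__": guarded_import}
    exec(compile(source, name, "exec"), module.__dict__)
    return module, seen


# A toy dependency: uuid is imported at load time but only used by a helper
# that would run inside an Activity, never in workflow code.
TOY_DEPENDENCY = '''
import uuid

def activity_only_helper():
    return str(uuid.uuid4())
'''

module, flagged = load_with_import_guard(TOY_DEPENDENCY, "toy_dependency")
print(flagged)
```

The guard records the import even though `activity_only_helper` is never called, which is the same blind spot described above: an import-time check has no way to know where the imported module will actually be used.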
Risk Assessment
What the Sandbox Protects Against
Temporal workflows must be deterministic: replaying the same workflow from its Event History must produce the same sequence of Activity/timer/signal commands. A mismatch during replay raises NondeterminismError (the Python SDK's spelling), and the workflow gets stuck.
Common sources of non-determinism:
- random.random() in workflow code
- datetime.now() in workflow code
- uuid.uuid4() in workflow code
- Network I/O in workflow code
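A toy model of replay makes the failure mode concrete. The sketch below uses plain Python and no Temporal APIs; `thin_workflow`, `leaky_workflow`, and `replays_cleanly` are hypothetical names, and "replay" is simplified to running the function a second time and comparing the commands it emits.

```python
import uuid


def thin_workflow(nodes):
    """Deterministic: the emitted commands depend only on the input."""
    return [("schedule_activity", "execute_node", node) for node in nodes]


def leaky_workflow(nodes):
    """Non-deterministic: uuid4() in workflow code changes on every run."""
    return [
        ("schedule_activity", "execute_node", f"{node}-{uuid.uuid4()}")
        for node in nodes
    ]


def replays_cleanly(workflow_fn, nodes):
    """Compare the commands from the original run with those from a re-run."""
    recorded = workflow_fn(nodes)  # stands in for the Event History
    replayed = workflow_fn(nodes)  # stands in for replay on a worker
    return recorded == replayed


nodes = ["call_model", "tools"]
print(replays_cleanly(thin_workflow, nodes))
print(replays_cleanly(leaky_workflow, nodes))
```

The second workflow diverges because each run embeds a fresh UUID in its commands, so the re-run cannot match what was recorded.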
Why the Risk is Low in This Architecture
In deepagent-temporal, the workflow function (LangGraphWorkflow.run) is a thin dispatcher:
- It reads the graph topology (node names, edges, conditional edges).
- It dispatches Activities for each node (execute_node).
- It applies state updates from Activity results.
- It handles signals, interrupts, and continue-as-new.
All LangGraph execution happens inside Activities, not in the workflow function. This includes:
- LLM invocations (call_model Activity)
- Tool execution (tools Activity)
- Middleware processing (summarization, prompt caching, tool call patching)
- All non-deterministic operations (UUID generation, timestamp access, I/O)
The workflow function itself only performs deterministic operations:
- Reading from WorkflowInput (deterministic)
- Conditional edge evaluation (dispatched as Activities)
- State channel updates (deterministic dict merges)
- Timer/signal waits (Temporal primitives, replay-safe)
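The reason this split is replay-safe can be sketched in a few lines. The following is an illustrative model, not the actual LangGraphWorkflow.run: `run_thin_dispatcher`, `live_activity`, `replayed_activity`, and the `history` dict are hypothetical, and the topology is assumed to be a simple linear list of nodes. The point it shows is that when Activity results are recorded and fed back on replay, a dispatcher that only merges dicts reproduces the same commands and state.

```python
import uuid


def run_thin_dispatcher(topology, state, execute_node):
    """Dispatch one Activity per node and merge each result into the state."""
    commands = []
    for node in topology:
        commands.append(("execute_node", node))
        update = execute_node(node, dict(state))
        state = {**state, **update}  # deterministic dict merge
    return state, commands


history = {}  # stands in for Activity results stored in the Event History


def live_activity(node, state):
    """Non-deterministic work happens here, and the result is recorded."""
    result = {node: uuid.uuid4().hex}
    history[node] = result
    return result


def replayed_activity(node, state):
    """On replay the recorded result is returned instead of re-running."""
    return history[node]


topology = ["call_model", "tools"]
first_state, first_commands = run_thin_dispatcher(topology, {}, live_activity)
replay_state, replay_commands = run_thin_dispatcher(topology, {}, replayed_activity)
print(first_commands == replay_commands, first_state == replay_state)
```

Even though `live_activity` generates a fresh UUID, the dispatcher itself never does, so the replayed run matches the original.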
Residual Risks
Despite the safe architecture, disabling the sandbox means Temporal cannot automatically catch mistakes:
- Future code changes: if someone adds a uuid.uuid4() call to the workflow function (not an Activity), it would cause replay non-determinism. The sandbox would catch this; without it, the bug only surfaces on replay.
- Import side effects: if a LangGraph upgrade introduces module-level side effects that run during workflow replay, they could cause non-determinism.
- Conditional edge evaluation: if a conditional edge function (evaluated in the workflow) uses non-deterministic values, replays may diverge. Currently, conditional edges are evaluated as Activities in langgraph-temporal, avoiding this risk.
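For the first risk, one possible lightweight guard is a static scan of workflow source for obviously non-deterministic calls. The sketch below is an assumption-laden illustration rather than part of deepagent-temporal: `BANNED_CALLS` and `find_banned_calls` are hypothetical, and the check only matches direct `module.function()` calls, so it would miss aliased imports or indirect calls.

```python
import ast

# Hypothetical deny-list of (object name, attribute) call patterns.
BANNED_CALLS = {("uuid", "uuid4"), ("random", "random"), ("datetime", "now")}


def find_banned_calls(source):
    """Return (line number, call name) for each banned call found in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            target = node.func.value
            if (
                isinstance(target, ast.Name)
                and (target.id, node.func.attr) in BANNED_CALLS
            ):
                hits.append((node.lineno, f"{target.id}.{node.func.attr}"))
    return sorted(hits)


# A toy workflow body with an accidental uuid4() call.
WORKFLOW_SOURCE = """
def run(state):
    run_id = uuid.uuid4()
    return state
"""

print(find_banned_calls(WORKFLOW_SOURCE))
```

A check like this could run in CI against the module that defines the workflow function. It is far weaker than the sandbox and complements replay testing rather than replacing it.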
Mitigations
1. Architecture-Level Safety
The langgraph-temporal architecture already provides the strongest mitigation: all non-deterministic code runs in Activities, not in the workflow. This is the correct pattern regardless of sandbox status.
2. Replay Determinism Testing
Test that workflows replay correctly by running a workflow, extracting its Event History, and replaying it on a new worker. If the replay produces different Activity dispatch commands, you have a non-determinism bug.
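# Conceptual test (requires Temporal test utilities)
from langchain_core.messages import HumanMessage
from temporalio.worker import Replayer, UnsandboxedWorkflowRunner
# LangGraphWorkflow comes from the integration package; its import path is omitted here.

async def test_replay_determinism(temporal_client, temporal_agent):
    # Run a workflow to completion
    result = await temporal_agent.ainvoke(
        {"messages": [HumanMessage(content="hello")]},
        config={"configurable": {"thread_id": "replay-test"}},
    )
    # Get the workflow's event history (assumes the workflow ID equals the thread_id)
    handle = temporal_client.get_workflow_handle("replay-test")
    history = await handle.fetch_history()
    # Replay with the same unsandboxed runner; should not raise NondeterminismError
    replayer = Replayer(
        workflows=[LangGraphWorkflow],
        workflow_runner=UnsandboxedWorkflowRunner(),
    )
    await replayer.replay_workflow(history)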
3. Keep Workflow Code Thin
Do not add custom logic to the workflow function. If you need custom behavior, implement it as middleware (runs in Activities) or as a custom tool (runs in Activities). Never add:
- Direct LLM calls in workflow code
- File I/O in workflow code
- Network calls in workflow code
- Random/UUID generation in workflow code
Summary
| Aspect | Status |
|---|---|
| Sandbox mode | Disabled (UnsandboxedWorkflowRunner) |
| Why | LangGraph/LangChain imports trigger sandbox restrictions |
| Risk level | Low — all non-deterministic code runs in Activities |
| Mitigation | Architecture (thin workflow), replay testing |
| Future | May revisit if Temporal Python SDK adds import-allowlisting |