Fidian › SRE Agent › Multi-Hop Causal Analysis
Summary of Changes
In the v2.4.1 prompt update, the agent's multi-hop traversal instruction was silently removed. As a result, the SRE agent stopped at the symptom service instead of tracing through the full dependency graph to the infrastructure root. The fix has two layers: (1) restore the traversal instruction so the agent follows causal chains to the infrastructure root; (2) add an explicit tool-ordering instruction requiring trace_causal_chain to be called before the agent concludes RCA. The pass rate on the Multi-Hop Causal Analysis eval improved from 0% to 90% across 30 attempts, with no regressions on existing scenarios.
Key Changes
- system_prompt.txt — restored: "follow the full causal chain to the infrastructure root" (removed in v2.4.1), so the agent traverses service dependencies beyond the first-order symptom.
- system_prompt.txt — added explicit tool ordering: "always call trace_causal_chain before concluding root cause analysis", ensuring the tool is invoked on every multi-hop incident regardless of apparent complexity.
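The tool-ordering requirement lends itself to a mechanical check against an agent transcript. The following is a minimal sketch, not the eval's actual implementation: the transcript shape (an ordered list of step names) and the `conclude_rca` and `query_metrics` step names are hypothetical, and only `trace_causal_chain` comes from the prompt.

```python
from typing import Sequence

TRACE_TOOL = "trace_causal_chain"  # tool name taken from the prompt
CONCLUDE_STEP = "conclude_rca"     # hypothetical marker for the final RCA step


def traced_before_concluding(steps: Sequence[str]) -> bool:
    """Return True if TRACE_TOOL appears before the first CONCLUDE_STEP.

    A transcript that never reaches a conclusion is treated as not
    violating the ordering rule; one that concludes without tracing
    first fails the check.
    """
    for step in steps:
        if step == TRACE_TOOL:
            return True
        if step == CONCLUDE_STEP:
            return False
    return True


# Hypothetical transcripts: one compliant, one that concludes too early.
ok_run = ["query_metrics", "trace_causal_chain", "conclude_rca"]
bad_run = ["query_metrics", "conclude_rca", "trace_causal_chain"]
print(traced_before_concluding(ok_run), traced_before_concluding(bad_run))
```

A check of this kind only confirms call order; whether the trace actually reached the infrastructure root would need a separate assertion on the agent's output.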
File Changes
system_prompt.txt
  You are an AI SRE agent. When investigating incidents:
- - Follow service dependencies through the full causal chain
- - Call trace_causal_chain before concluding RCA
+ - Follow service dependencies through the full causal chain to the infrastructure root
+ - Always call trace_causal_chain before concluding root cause analysis
+ - Do not stop at the application service layer — trace until reaching infrastructure
  - Be concise. Return structured output.
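To illustrate what "trace until reaching infrastructure" could mean on a dependency graph, here is a small sketch. Everything in it is assumed for illustration: the service names, the layer labels, the `unhealthy` set, and the `follow_causal_chain` helper are hypothetical and do not describe how `trace_causal_chain` is implemented.

```python
from typing import Dict, List, Set

# Hypothetical dependency graph: each service maps to what it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "checkout-api": ["payments-svc"],
    "payments-svc": ["postgres-primary"],
    "postgres-primary": ["node-7-disk"],
    "node-7-disk": [],
}

# Hypothetical layer labels used to decide whether a chain ended deep enough.
LAYER: Dict[str, str] = {
    "checkout-api": "application",
    "payments-svc": "application",
    "postgres-primary": "data",
    "node-7-disk": "infrastructure",
}


def follow_causal_chain(
    symptom: str, depends_on: Dict[str, List[str]], unhealthy: Set[str]
) -> List[str]:
    """Walk from the symptom through unhealthy dependencies.

    At each hop, move to the first unhealthy dependency not yet visited;
    stop when the current node has none. Returns the visited chain.
    """
    chain = [symptom]
    seen = {symptom}
    current = symptom
    while True:
        nxt = next(
            (d for d in depends_on.get(current, [])
             if d in unhealthy and d not in seen),
            None,
        )
        if nxt is None:
            return chain
        chain.append(nxt)
        seen.add(nxt)
        current = nxt


chain = follow_causal_chain("checkout-api", DEPENDS_ON, set(DEPENDS_ON))
print(chain, LAYER[chain[-1]])
```

In this toy setup, a chain that ends on a node labeled "application" corresponds to the failure mode described above, where the analysis stops at the symptom service.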