The problem with “demo agents”
Most agent demos fail in production for the same reasons:
- no deterministic boundaries (everything is a tool call)
- poor observability (no traces, no structured logs)
- no evaluation strategy (quality drifts quietly)
- cost surprises (unbounded retries and context growth)
A production-minded baseline
When building LangGraph flows:
- keep state small and explicit
- make tool I/O schemas strict
- enforce retry budgets per node
- emit structured events per transition
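To make the list above concrete, here is a minimal, framework-agnostic sketch using only the standard library. Everything in it is an assumption for illustration: `AgentState`, `SearchInput`, `emit`, `with_retry_budget`, and the `flaky_search` node are hypothetical names, not part of LangGraph. The wrapped node keeps the "state in, partial update out" shape that LangGraph nodes use, so something like it could plausibly be registered as a node, but treat it as a starting point rather than a reference implementation.

```python
import json
import time
from dataclasses import dataclass
from typing import Callable, TypedDict


class AgentState(TypedDict):
    """Small, explicit state: only the fields later nodes actually read."""
    query: str
    result: str
    attempts: int


@dataclass(frozen=True)
class SearchInput:
    """Strict tool input: bad types or out-of-range values fail before the call."""
    query: str
    limit: int = 5

    def __post_init__(self) -> None:
        if not isinstance(self.query, str) or not self.query.strip():
            raise ValueError("query must be a non-empty string")
        if not isinstance(self.limit, int) or not 1 <= self.limit <= 20:
            raise ValueError("limit must be an integer between 1 and 20")


def emit(event: str, **fields) -> dict:
    """Print one structured (JSON) log line per transition and return it."""
    record = {"ts": time.time(), "event": event, **fields}
    print(json.dumps(record, sort_keys=True))
    return record


def with_retry_budget(name: str, node: Callable[[AgentState], dict], max_attempts: int):
    """Wrap a node so it runs at most max_attempts times, then re-raises."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def wrapped(state: AgentState) -> dict:
        for attempt in range(1, max_attempts + 1):
            emit("node_start", node=name, attempt=attempt)
            try:
                update = node(state)
            except Exception as exc:
                emit("node_error", node=name, attempt=attempt, error=str(exc))
                if attempt == max_attempts:
                    emit("budget_exhausted", node=name, max_attempts=max_attempts)
                    raise
            else:
                emit("node_end", node=name, attempt=attempt)
                return {**update, "attempts": attempt}

    return wrapped


_calls = {"n": 0}


def flaky_search(state: AgentState) -> dict:
    """Hypothetical tool node: fails on the first call, succeeds afterwards."""
    args = SearchInput(query=state["query"], limit=3)
    _calls["n"] += 1
    if _calls["n"] < 2:
        raise TimeoutError("upstream timed out")
    return {"result": f"top {args.limit} hits for {args.query!r}"}


search = with_retry_budget("search", flaky_search, max_attempts=3)
state: AgentState = {"query": "retry budgets", "result": "", "attempts": 0}
state = {**state, **search(state)}
```

The budget lives on the node rather than on the whole graph, so one flaky tool cannot consume retries meant for another, and every attempt leaves a log line you can count later when a cost surprise shows up.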
OCI deployment notes
To keep cost and operational overhead down, prefer event-driven compute, and avoid always-on workers unless you truly need low latency.
What I ship first
- a minimal “happy path” graph
- a failure-mode path with safe fallbacks
- a small eval suite that runs in CI
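As a rough idea of what the "small eval suite" in that list can look like, here is a standard-library sketch. The cases, the substring check, the `THRESHOLD` value, and `stub_agent` are all assumptions for illustration; in a real project `stub_agent` would be replaced by a call into the compiled graph, and the scoring would likely be richer than a substring match.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    name: str
    prompt: str
    must_contain: str  # substring the answer is expected to include


def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> tuple[float, list[str]]:
    """Return (pass rate, names of failed cases) using a case-insensitive match."""
    if not cases:
        raise ValueError("at least one eval case is required")
    failed = [
        c.name for c in cases
        if c.must_contain.lower() not in agent(c.prompt).lower()
    ]
    return 1 - len(failed) / len(cases), failed


def stub_agent(prompt: str) -> str:
    """Stand-in for the real agent (hypothetical behaviour)."""
    if "refund" in prompt.lower():
        return "Refunds are processed within 5 business days."
    return "I am not sure; escalating to a human."


CASES = [
    EvalCase("refund_policy", "How long do refunds take?", "5 business days"),
    EvalCase("unknown_topic", "What is the CEO's shoe size?", "escalating"),
    EvalCase("prompt_casing", "REFUND status please", "refunds"),
]

THRESHOLD = 0.9  # assumed quality bar; tune per project


def main() -> int:
    """Return a process exit code: 0 if the pass rate meets the threshold."""
    pass_rate, failed = run_evals(stub_agent, CASES)
    print(f"pass_rate={pass_rate:.2f} failed={failed}")
    return 0 if pass_rate >= THRESHOLD else 1

# In CI, call `raise SystemExit(main())` so a regression fails the build.
```

Even a handful of cases like these is enough to turn quiet quality drift into a red build, which is the point of shipping the suite first.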