Agentic AI · Retail Banking

Agentic Workflow

How the system behaves at runtime. Read the plain line, then open Go deeper for the mechanism and Internals for the code-level detail. Every trace is captured from the running code.

L1 · Intuition L2 · Mechanism L3 · Internals

What makes it agentic

L1 · The system doesn't follow one fixed script. It decides what to do, calls tools to get data, changes its plan based on what it finds, and remembers the last answer.

L2 · Go deeper: the four properties & their proof

Property | Where it shows | Proof
Autonomous reasoning | Planner classifies intent → builds its own plan | plan_trace[0] names the plan
Tool / function calling | Agents call typed tools, never the store | trace logs every get_*() + row count
Data-dependent flow | Plan pruned / extended on intermediate results | Alerting runs even when unplanned
Memory / context | Follow-ups answered from cache | trace: reused cached insight

The plan_trace — a list[str] every agent appends to — is the single artifact proving all four. It renders in the UI's "Agent trace" panel.
L3 · Internals: why a trace at all

A rule engine has no visible "thoughts" — so reasoning is emitted, not inferred. Each agent receives the shared trace: list[str] by reference and appends to it; the Planner owns the list and returns it as FinalResponse.plan_trace.

This is deliberately the same shape a real LLM tool-calling loop produces. Swapping the rule-based reason() for a Claude loop later changes the origin of the lines, not the contract — the UI and tests are unaffected.
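The contract described above can be sketched as follows. This is a minimal illustration, not the project's code: the `SpendingAgent` class, the `answer` field and the trimmed `FinalResponse` are assumptions; only the shared `trace: list[str]` and `FinalResponse.plan_trace` come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class FinalResponse:
    # Trimmed, hypothetical stand-in: only plan_trace is named in the text.
    answer: str
    plan_trace: list[str] = field(default_factory=list)


class SpendingAgent:
    """Hypothetical agent that writes into the caller's list."""

    def run(self, trace: list[str]) -> str:
        # Appends in place; the agent never creates or returns its own trace.
        trace.append("SpendingAnalysis → get_transactions(month=2026-05) → 14 rows")
        return "insight"


class Planner:
    def handle(self, message: str) -> FinalResponse:
        trace: list[str] = []        # the Planner owns the list
        SpendingAgent().run(trace)   # same object, passed by reference
        return FinalResponse(answer="...", plan_trace=trace)


response = Planner().handle("Summarize my spending")
```

Because consumers only read a list of strings, replacing the body of `run()` with an LLM tool-calling loop would change where the lines come from, not what the UI or tests receive.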

Turn lifecycle

L1 · Each message runs through one function: look for a follow-up → figure out intent → run the right agents → merge one answer.

L2 · Go deeper: the 9 steps of Planner.handle()

1 record the user turn in memory
2 follow-up? → answer from cached insight, RETURN (no agents)
3 classify_intent(message) → intent
4 plan = ROUTING[intent] → candidate step list
5 plan empty (greeting/unknown) → direct reply, RETURN
6 insight = spending.run(...) → always first; cache it
7 branch: run_alert = ("alert" in plan) OR insight.anomalies
          run_rec = ("recommend" in plan)
8 recs/alerts = run agents conditionally
9 merge → FinalResponse, persist memory

Step 7 is where the autonomy lives: the list of steps actually executed is decided from data the agent itself produced, not fixed up front.
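Steps 4 to 8 can be condensed into a small sketch that returns which agents would run. The `ROUTING` entries mirror the routing table in this document; the `Insight` dataclass, the `executed_steps` helper and the recommend-before-alert ordering (taken from the Example 01 trace) are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Candidate plans per intent, as listed in the routing reference.
ROUTING = {
    "IMPROVE_SAVINGS": ["spending", "recommend", "alert"],
    "SPENDING_SUMMARY": ["spending"],
    "BUDGET_STATUS": ["spending", "alert"],
    "WHY_OVERSPENT": ["spending", "alert", "recommend"],
    "GENERAL_ADVICE": ["spending", "recommend"],
    "GREETING": [],
    "UNKNOWN": [],
}


@dataclass
class Insight:
    # Hypothetical stand-in for the Spending agent's output.
    anomalies: list[str] = field(default_factory=list)


def executed_steps(intent: str, insight: Insight) -> list[str]:
    """Return the steps that actually run for one turn (steps 4 to 8)."""
    plan = ROUTING.get(intent, [])
    if not plan:                      # step 5: direct reply, no agents
        return []
    steps = ["spending"]              # step 6: Spending always runs first
    # Step 7: the branch depends on data produced in step 6.
    run_alert = ("alert" in plan) or bool(insight.anomalies)
    run_rec = "recommend" in plan
    if run_rec:                       # step 8 (ordering here is an assumption)
        steps.append("recommend")
    if run_alert:
        steps.append("alert")
    return steps
```

For a `SPENDING_SUMMARY` turn the result is `["spending"]` when the insight has no anomalies and `["spending", "alert"]` when it has one, which is the data-dependent flow in miniature.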
L3 · Internals: trace ordering & the insert(0) trick

Spending runs (step 6) and appends its tool-call lines before the Planner knows it succeeded. Only afterward does the Planner prepend the header:

    insight = spending.run(message, profile, trace)
    self.store.set_last_insight(session_id, insight)
    trace.insert(0, f"Planner → intent={intent}, plan={plan}")

trace.insert(0, ...) guarantees the human-readable narrative reads top-down (Planner → … then SpendingAnalysis → …) even though the spending lines were appended first.

The proactive line is appended only in the override case: if run_alert and "alert" not in plan. So a planned alert stays silent about the branch; an unplanned one announces itself.
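The ordering rules above can be reproduced in isolation. In this sketch the `build_trace` helper and its parameters are hypothetical; the trace strings follow the captured examples, and the agents themselves are replaced by plain `append` calls.

```python
def build_trace(intent: str, plan: list[str], anomalies: list[str]) -> list[str]:
    """Rebuild the trace ordering for one turn (agents are simulated)."""
    trace: list[str] = []

    # Step 6: Spending runs first and appends its own line.
    trace.append(f"SpendingAnalysis → {len(anomalies)} anomaly(ies)")

    # Only afterwards does the Planner prepend its header,
    # so the narrative still reads top-down.
    trace.insert(0, f"Planner → intent={intent}, plan={plan}")

    run_alert = ("alert" in plan) or bool(anomalies)
    if run_alert:
        trace.append("Alerting → get_budgets(), get_balance()")
        # The proactive line appears only when Alerting was not planned.
        if "alert" not in plan:
            trace.append("Planner → proactively ran Alerting (anomaly detected)")
    return trace


unplanned = build_trace("SPENDING_SUMMARY", ["spending"], ["food"])
planned = build_trace("IMPROVE_SAVINGS", ["spending", "recommend", "alert"], ["food"])
```

`list.insert(0, ...)` shifts every existing element, which is negligible for a trace of a few lines; a longer-lived log might prefer building the header first or using `collections.deque`.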

Example 01 — full pipeline

L1 · "How can I improve my monthly savings?" → the system reads the month's spending, spots that food jumped, suggests cuts, and warns that entertainment is near its budget.

L2 · Go deeper: captured trace & outputs

intent IMPROVE_SAVINGS · profile balanced · reduction 15%

    Planner → intent=IMPROVE_SAVINGS, plan=['spending', 'recommend', 'alert']
    SpendingAnalysis → get_transactions(month=2026-05) → 14 rows
    SpendingAnalysis → get_transactions(month=2026-04) → 10 rows
    SpendingAnalysis → categorised 6 categories, 1 anomaly(ies)
    Recommendation → get_budgets() → evaluating 4 discretionary categories
    Alerting → get_budgets(), get_balance() → scanning 6 categories vs budget

Totals: food ₹12000 · entertainment ₹4500 · shopping ₹8000 · travel ₹3500 · groceries ₹6200 · utilities ₹3000
Anomaly: food +25.0% (₹9600 → ₹12000)
Recs: food 15% → ₹1800/mo · shopping 15% → ₹1200/mo
Alert: warning, entertainment ₹4500 / ₹5000 (90%)
Final answer: May spending ₹37200 across 6 categories. Top: food ₹12000. 1 anomaly: food +25.0%. Top tip: Reduce food spending by 15% (~₹1800/mo). Alert: Nearing budget: entertainment ₹4500 / ₹5000 (90%).

L3 · Internals: debits-only, anomaly-first ranking, the 15%

Debits only. Spending sums only rows with type == "debit", grouped by category; the salary credit is excluded. That is how the 14 May rows reduce to 6 spend categories.
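A minimal sketch of the debit-only aggregation, assuming each transaction is a dict with `type`, `category` and `amount` keys. Only `type == "debit"` is taken from the text; the other field names, the `spend_by_category` helper and the sample rows are illustrative.

```python
from collections import defaultdict


def spend_by_category(rows: list[dict]) -> dict[str, int]:
    """Sum debit amounts per category; credits such as salary are skipped."""
    totals: dict[str, int] = defaultdict(int)
    for row in rows:
        if row["type"] == "debit":
            totals[row["category"]] += row["amount"]
    return dict(totals)


# Illustrative rows only, not the demo dataset.
sample_rows = [
    {"type": "credit", "category": "salary", "amount": 60000},
    {"type": "debit", "category": "food", "amount": 7000},
    {"type": "debit", "category": "food", "amount": 5000},
    {"type": "debit", "category": "utilities", "amount": 3000},
]
totals = spend_by_category(sample_rows)
```

Four rows reduce to two spend categories here, the same way 14 rows reduce to 6 in the captured trace.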

Why 20% fixed, not z-score. Two months gives no distribution to estimate variance from. pct > 20.0 is deterministic and demo-stable. food: (12000−9600)/9600 = 25.0% → flagged.
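The fixed-threshold rule can be sketched as below. The strict `pct > 20.0` comparison and the food figures come from the text; the `find_anomalies` helper, its signature and the choice to skip categories with no prior-month spend are assumptions.

```python
ANOMALY_THRESHOLD_PCT = 20.0


def pct_change(prev: float, curr: float) -> float:
    """Month-over-month change in percent."""
    return (curr - prev) * 100.0 / prev


def find_anomalies(prev_totals: dict[str, float],
                   curr_totals: dict[str, float]) -> dict[str, float]:
    """Flag categories whose spend rose by strictly more than the threshold."""
    flagged = {}
    for category, curr in curr_totals.items():
        prev = prev_totals.get(category)
        if not prev:
            continue  # assumption: no baseline means no comparison
        pct = pct_change(prev, curr)
        if pct > ANOMALY_THRESHOLD_PCT:
            flagged[category] = round(pct, 1)
    return flagged


anomalies = find_anomalies(
    {"food": 9600, "shopping": 8000},
    {"food": 12000, "shopping": 8000},
)
```

With the food figures from Example 01 this flags food at 25.0 and leaves shopping alone; a rise of exactly 20% would not be flagged because the comparison is strict.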

Anomaly ranked first. Recommendation sorts candidates by (c not in anomaly_cats, -spend). False < True, so flagged categories sort ahead even if another category spent more. Essentials (groceries/utilities) are never in DISCRETIONARY, so never reduced.
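The sort key is easy to check on the Example 01 totals. In this sketch the membership of `DISCRETIONARY` is inferred from the text (four discretionary categories, with groceries and utilities as essentials), and `rank_candidates` is a hypothetical helper name.

```python
# Inferred membership: the six categories minus the two essentials.
DISCRETIONARY = {"food", "entertainment", "shopping", "travel"}

TOTALS = {
    "food": 12000, "entertainment": 4500, "shopping": 8000,
    "travel": 3500, "groceries": 6200, "utilities": 3000,
}


def rank_candidates(totals: dict[str, int], anomaly_cats: set[str]) -> list[str]:
    """Flagged categories first, then by descending spend."""
    candidates = [c for c in totals if c in DISCRETIONARY]
    # (False, ...) sorts before (True, ...), so anomalies lead the list.
    return sorted(candidates, key=lambda c: (c not in anomaly_cats, -totals[c]))


ranked = rank_candidates(TOTALS, {"food"})
ranked_if_travel_flagged = rank_candidates(TOTALS, {"travel"})
```

If travel were the flagged category instead of food, it would rank first despite being the smallest discretionary spend, which is the point of putting the boolean ahead of the amount in the key.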

The 15%. REDUCTION_BY_RISK = {conservative:10, balanced:15, aggressive:20}. Profile balanced → 15%. round(12000 × 0.15) = 1800. A different profile yields a different number on identical spending.
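The arithmetic can be verified directly. `REDUCTION_BY_RISK` and its values are quoted from the text; the `monthly_saving` helper is a hypothetical wrapper around the `round(spend × pct)` calculation.

```python
REDUCTION_BY_RISK = {"conservative": 10, "balanced": 15, "aggressive": 20}


def monthly_saving(spend: int, profile: str) -> int:
    """Suggested monthly reduction in rupees for a category."""
    pct = REDUCTION_BY_RISK[profile]
    return round(spend * pct / 100)


food_balanced = monthly_saving(12000, "balanced")
shopping_balanced = monthly_saving(8000, "balanced")
food_aggressive = monthly_saving(12000, "aggressive")
```

The balanced results match the Recs line of Example 01 (₹1800 and ₹1200), while the aggressive profile yields ₹2400 on the same food spend.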

Example 02 — proactive branching

L1 · "Summarize my spending last month." The user asked only for a summary, but the system noticed the food anomaly and raised an alert on its own.

L2 · Go deeper: the unplanned step in the trace

intent SPENDING_SUMMARY · planned ['spending']

    Planner → intent=SPENDING_SUMMARY, plan=['spending']
    SpendingAnalysis → get_transactions(month=2026-05) → 14 rows
    SpendingAnalysis → get_transactions(month=2026-04) → 10 rows
    SpendingAnalysis → categorised 6 categories, 1 anomaly(ies)
    Alerting → get_budgets(), get_balance() → scanning 6 categories vs budget
    Planner → proactively ran Alerting (anomaly detected)

Alerting was not in the plan, yet it ran. This is the difference between scripted and reasoned.
L3 · Internals: the one boolean that does it

The whole behaviour is one expression evaluated after Spending returns:

    run_alert = ("alert" in plan) or bool(insight.anomalies)
    ...
    if run_alert and "alert" not in plan:
        trace.append("Planner → proactively ran Alerting (anomaly detected)")

Because plan == ['spending'], "alert" in plan is False — the alert fires purely on insight.anomalies being non-empty. The trace line is conditional on the override, so it appears here but not in Example 01 (where alert was planned).

Example 03 — memory

L1 · "What about food specifically?" right after → the system answers instantly from what it already computed. No agents, no tool calls.

L2 · Go deeper: zero-recompute trace

agents run 0 · tool calls 0

    Planner → referential follow-up detected
    Planner → reused cached insight for 'food' (no recompute)

Final answer: You spent ₹12000 on food this month (up 25% vs last month). Flagged anomaly: food up 25.0% vs prior month. Tip: Reduce food spending by 15% (~₹1800/mo).

The answer adds value (the +25% delta and a targeted tip), which shows the cached insight is actually used rather than echoed back.
L3 · Internals: detection rule, object identity, a real nuance

Detection. _is_followup() returns a category only if a prior insight is cached AND the query is referential — startswith(("what about","and ","just ","how about")) or len(split) ≤ 3 — AND a known category word appears.
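The three conditions can be sketched as a standalone function. The prefix tuple and the three-word limit come from the text; the category list, the lowercasing, the substring match and the `has_cached_insight` flag (standing in for the store lookup) are assumptions.

```python
from typing import Optional

REFERENTIAL_PREFIXES = ("what about", "and ", "just ", "how about")
# Assumed category vocabulary, taken from the example totals.
KNOWN_CATEGORIES = ("food", "entertainment", "shopping",
                    "travel", "groceries", "utilities")


def is_followup(message: str, has_cached_insight: bool) -> Optional[str]:
    """Return the referenced category, or None if this is not a follow-up."""
    if not has_cached_insight:            # condition 1: something to reuse
        return None
    text = message.lower().strip()
    referential = (text.startswith(REFERENTIAL_PREFIXES)
                   or len(text.split()) <= 3)
    if not referential:                   # condition 2: referential phrasing
        return None
    for category in KNOWN_CATEGORIES:     # condition 3: a known category word
        if category in text:
            return category
    return None
```

`str.startswith` accepts a tuple of prefixes, so the four referential openers are checked in a single call.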

Identity preserved. The returned FinalResponse.insights is the same object pulled from the cache (get_last_insight), not a copy — verifiable by is. The follow-up only slices it.
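The identity claim can be checked with a toy store. `set_last_insight` and `get_last_insight` are named in the text; the `SessionStore` class, its dict-backed storage and the insight shape are assumptions for illustration.

```python
class SessionStore:
    """Hypothetical in-memory store keyed by session_id."""

    def __init__(self) -> None:
        self._last_insight: dict[str, object] = {}

    def set_last_insight(self, session_id: str, insight: object) -> None:
        # Stores a reference, not a copy.
        self._last_insight[session_id] = insight

    def get_last_insight(self, session_id: str) -> object:
        return self._last_insight.get(session_id)


store = SessionStore()
insight = {"totals": {"food": 12000}, "anomalies": ["food"]}
store.set_last_insight("session-1", insight)
cached = store.get_last_insight("session-1")
```

Here `cached is insight` holds, so a follow-up that only reads from the cached object cannot drift from what the first turn computed. The flip side is that any code mutating the cached object would change it for later turns as well.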

Honest nuance. The follow-up path uses a flat _FOLLOWUP_PCT = 15, not the risk-driven percentage the Recommendation agent uses. With a balanced profile both are 15%, so they agree here — but an aggressive profile would get 20% from the agent and 15% from a follow-up. A known, documented simplification, not a bug.
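The divergence is easy to make concrete. `_FOLLOWUP_PCT` and `REDUCTION_BY_RISK` are quoted from the text; the two helper functions are hypothetical names for the two code paths.

```python
REDUCTION_BY_RISK = {"conservative": 10, "balanced": 15, "aggressive": 20}
_FOLLOWUP_PCT = 15  # flat value used by the follow-up path


def agent_tip_pct(profile: str) -> int:
    """Percentage the Recommendation agent would use."""
    return REDUCTION_BY_RISK[profile]


def followup_tip_pct(profile: str) -> int:
    """Percentage the follow-up path uses; the profile is ignored."""
    return _FOLLOWUP_PCT


mismatches = [p for p in REDUCTION_BY_RISK
              if agent_tip_pct(p) != followup_tip_pct(p)]
```

Only the balanced profile agrees on both paths; conservative and aggressive users would see a different percentage in a follow-up than in the original recommendation.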

Routing reference

L1 · The words in your message decide which agents run.

L2 · Go deeper: intent table & data-dependent overrides

Intent | Candidate plan
IMPROVE_SAVINGS | spending → recommend → alert
SPENDING_SUMMARY | spending
BUDGET_STATUS | spending → alert
WHY_OVERSPENT | spending → alert → recommend
GENERAL_ADVICE | spending → recommend
GREETING / UNKNOWN | direct reply, no agents

Condition (after Spending) | Override
insight.anomalies non-empty | run Alerting even if not planned
no discretionary category | Recommendation returns []
follow-up + cached insight | bypass all agents

L3 · Internals: first-match precedence & concurrency

Precedence is order, not score. classify_intent checks an ordered keyword list and returns on first hit. Greeting is matched by exact word / prefix first (so "hi there" ≠ a finance query). A query touching two groups routes to whichever appears earlier in the list — deterministic by construction.

Fallthrough. No keyword group hit but a finance term present → GENERAL_ADVICE; otherwise UNKNOWN (polite fallback, no agents).
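First-match precedence and the fallthrough can be sketched together. The intent names come from the routing table; every keyword, the group order, the greeting words and the finance terms below are invented for illustration and are not the project's actual lists.

```python
# Hypothetical vocabularies; list order is the precedence.
GREETINGS = ("hi", "hello", "hey")
KEYWORD_GROUPS = [
    ("IMPROVE_SAVINGS", ("save", "saving")),
    ("WHY_OVERSPENT", ("overspent", "overspend")),
    ("BUDGET_STATUS", ("budget",)),
    ("SPENDING_SUMMARY", ("summar", "spending", "spent")),
]
FINANCE_TERMS = ("money", "finance", "expense", "account")


def classify_intent(message: str) -> str:
    text = message.lower().strip()
    words = text.split()
    # Greeting is checked first, on the leading word only.
    if words and words[0].strip("!,.?") in GREETINGS:
        return "GREETING"
    # First group with any keyword hit wins; no scoring.
    for intent, keywords in KEYWORD_GROUPS:
        if any(keyword in text for keyword in keywords):
            return intent
    # Fallthrough: a finance term without a keyword group match.
    if any(term in text for term in FINANCE_TERMS):
        return "GENERAL_ADVICE"
    return "UNKNOWN"
```

A message such as "Am I saving within budget?" touches two groups and resolves to `IMPROVE_SAVINGS` simply because that group comes first in the list.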

Concurrency assumption. The store is single-flight-per-session — one in-flight turn per session_id. Fine for the demo; a real-LLM build adds a per-session lock so the in-memory dict can't race. The interface stays; swap the dict for Redis.
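One way the per-session lock could look, as a sketch under stated assumptions: the `LockedSessionStore` class, its `handle_turn` method and the history list are hypothetical, and only the idea of one in-flight turn per `session_id` comes from the text.

```python
import threading
from collections import defaultdict


class LockedSessionStore:
    """Hypothetical store that serialises turns per session_id."""

    def __init__(self) -> None:
        self._history: dict[str, list[str]] = {}
        self._locks: dict[str, threading.Lock] = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()  # protects the lock table itself

    def _session_lock(self, session_id: str) -> threading.Lock:
        with self._locks_guard:
            return self._locks[session_id]

    def handle_turn(self, session_id: str, message: str) -> int:
        # Only one turn per session runs inside this block at a time;
        # different sessions do not block each other.
        with self._session_lock(session_id):
            history = self._history.setdefault(session_id, [])
            history.append(message)
            return len(history)
```

Keeping the lock behind the store's interface means the callers would not change if the dict were later replaced by an external store, though a multi-process deployment would then need a distributed lock rather than `threading.Lock`.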