{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "923e7340",
   "metadata": {},
   "source": [
    "# Chapter 14: Financial and Legal Domain Agents\n",
    "\n",
    "**Book:** *30 Agents Every AI Engineer Must Build*\n",
    "**Author:** Imran Ahmad\n",
    "**Publisher:** Packt Publishing, 2026\n",
    "\n",
    "---\n",
    "\n",
    "> *\"The measure of intelligence is the ability to change.\"* — Albert Einstein\n",
    "\n",
    "## Chapter Overview\n",
    "\n",
    "Building agents for finance and law is a different game entirely. The general-purpose architectures from earlier chapters could afford trial and error — these cannot. A single compliance failure in a regulated domain does not just produce a bad answer. It can trigger fines, sanctions, or criminal liability. Every recommendation must be traced back to its data sources. Every decision must withstand regulatory audit. Every interaction must be logged with enough detail to reconstruct the reasoning months or years later.\n",
    "\n",
    "This notebook implements two **production-grade agent architectures** for regulated domains, corresponding to Chapter 14 (pp. 391–420) of the book:\n",
    "\n",
    "### Part 1 — Financial Advisory Agent (Section 14.1, pp. 392–408)\n",
    "\n",
    "A **supervised multi-agent system** (Figure 14.1) that coordinates specialist agents through a LangGraph `StateGraph`:\n",
    "\n",
    "- **Supervisor Agent** — Policy-aware orchestrator that routes queries to specialists and enforces compliance gates\n",
    "- **Market Data Agent** — Wraps yfinance and Finnhub for real-time stock data (prices, P/E, market cap)\n",
    "- **Financial Analysis Agent** — Computes portfolio metrics using Finnhub API endpoints\n",
    "- **News Agent** — Retrieves qualitative market context via Tavily search\n",
    "- **Risk Scoring** — Composite 0–10 scale combining annualized volatility (40%), maximum drawdown (35%), and Value at Risk (25%)\n",
    "- **Compliance-by-Architecture** — Structurally impossible for non-compliant recommendations to reach the client\n",
    "\n",
    "### Part 2 — Legal Intelligence Agent (Section 14.2, pp. 408–419)\n",
    "\n",
    "A **RAG-powered legal research system** with hybrid retrieval and citation verification:\n",
    "\n",
    "- **Legal Knowledge Base** — Hybrid search combining dense vector retrieval with authority-weighted ranking\n",
    "- **Authority Ranking** — `final_score = 0.5 × similarity + 0.3 × authority + 0.2 × recency`\n",
    "- **3-Stage Precedent Pipeline** — Issue Extraction → Multi-Dimensional Retrieval → Synthesis and Verification\n",
    "- **5-Stage Contract Analysis** — Document Ingestion → Clause Extraction → Risk Flagging → Compliance Validation → Summary Generation\n",
    "- **Citation Verification Gate** — Cross-references every citation against the knowledge base to detect hallucinated case law\n",
    "\n",
    "### Key Insight: Compliance Is Architecture, Not a Feature\n",
    "\n",
    "In regulated industries, compliance is not a feature added after the agent works correctly. It is an **architectural constraint** that shapes every design decision from the outset. Both agents demonstrate this principle — the financial agent's validation gate makes it structurally impossible for non-compliant recommendations to pass through, and the legal agent's citation verification gate ensures only verified precedent enters the final brief.\n",
    "\n",
    "---\n",
    "\n",
    "**Chapter Reference:** pp. 391–420\n",
    "**Figures:** 14.1 (Multi-Agent Financial Architecture), 14.2 (Legal Precedent Pipeline), 14.3 (Contract Analysis Framework)\n",
    "**Table:** 14.1 (Performance Impact of AI-Powered Personalization, p. 403)\n",
    "\n",
    "> ⚠️ **Disclaimer:** Both agents are designed as educational demonstrations. Financial outputs are illustrative and must not be treated as investment advice. Legal outputs are illustrative and must not be treated as legal opinions."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d789ee28",
   "metadata": {},
   "source": [
    "## Cell 0: Setup and Configuration\n",
    "\n",
    "**Ref:** Technical Requirements (p.392)\n",
    "\n",
    "This cell loads environment variables, detects API key availability per service, and configures the notebook to run in either **Live Mode** (with real APIs) or **Simulation Mode** (with chapter-faithful mock data)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0d6e1aa",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 0: Setup and Configuration\n",
    "# Ref: Technical Requirements (p.392)\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "import os\n",
    "import sys\n",
    "import json\n",
    "import operator\n",
    "import warnings\n",
    "from functools import partial\n",
    "from typing import Annotated, Sequence, TypedDict, Literal, List\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "from dotenv import load_dotenv\n",
    "load_dotenv()\n",
    "\n",
    "from mock_llm import (\n",
    "    ColorLogger,\n",
    "    ServiceConfig,\n",
    "    graceful_fallback,\n",
    "    MockChatOpenAI,\n",
    "    MockStructuredChain,\n",
    "    MockEmbeddingModel,\n",
    "    MockVectorStore,\n",
    ")\n",
    "\n",
    "from mock_data import (\n",
    "    MOCK_STOCK_DATA,\n",
    "    MOCK_FINNHUB_QUOTES,\n",
    "    MOCK_FINNHUB_FINANCIALS,\n",
    "    generate_mock_price_history,\n",
    "    MOCK_TAVILY_NEWS,\n",
    "    MOCK_CLIENT_PROFILES,\n",
    "    MOCK_LEGAL_CASES,\n",
    "    MOCK_CONTRACT,\n",
    "    MOCK_INTER_AGENT_MESSAGE,\n",
    ")\n",
    "\n",
    "warnings.filterwarnings(\"ignore\", category=DeprecationWarning)\n",
    "\n",
    "config = ServiceConfig()\n",
    "logger = ColorLogger(\"Chapter14\")\n",
    "\n",
    "# Conditional LLM selection — Ref: Technical Requirements (p.392)\n",
    "if config.is_live(\"OPENAI_API_KEY\"):\n",
    "    try:\n",
    "        from langchain_openai import ChatOpenAI\n",
    "        llm = ChatOpenAI(model=\"gpt-4o-mini-2024-07-18\", temperature=0)\n",
    "        logger.success(\"Using LIVE OpenAI LLM (gpt-4o-mini-2024-07-18)\")\n",
    "    except Exception as e:\n",
    "        logger.error(f\"ChatOpenAI init failed: {e}. Falling back to MockChatOpenAI.\")\n",
    "        llm = MockChatOpenAI(model=\"gpt-4o-mini-2024-07-18\", temperature=0)\n",
    "else:\n",
    "    llm = MockChatOpenAI(model=\"gpt-4o-mini-2024-07-18\", temperature=0)\n",
    "    logger.info(\"Using SIMULATED LLM (MockChatOpenAI)\")\n",
    "\n",
    "from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
    "from langchain_core.messages import HumanMessage, AIMessage, BaseMessage\n",
    "from langchain_core.tools import tool\n",
    "from pydantic import BaseModel\n",
    "from langgraph.prebuilt import create_react_agent\n",
    "from langgraph.graph import END, START, StateGraph\n",
    "\n",
    "logger.success(\"Setup complete — all imports loaded\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "50d88509",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Multi-provider LLM support (OpenAI / Anthropic / Google Gemini)\n",
    "# Set LLM_PROVIDER in .env to choose: openai | anthropic | google | auto\n",
    "# Auto-detection uses the first available key.\n",
    "# See supporting/llm_provider.py for details.\n",
    "\n",
    "import sys, os\n",
    "sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath('.')), ''))\n",
    "sys.path.insert(0, '..')\n",
    "\n",
    "try:\n",
    "    from supporting.llm_provider import detect_provider, get_llm, PROVIDER_MODELS, print_provider_banner\n",
    "    _PROVIDER, _PROVIDER_KEY, _PROVIDER_MODE = detect_provider()\n",
    "    print_provider_banner(_PROVIDER, _PROVIDER_MODE)\n",
    "except ImportError:\n",
    "    print('[INFO] supporting/llm_provider.py not found — using default OpenAI path')\n",
    "    _PROVIDER, _PROVIDER_KEY, _PROVIDER_MODE = 'openai', os.getenv('OPENAI_API_KEY'), 'LIVE' if os.getenv('OPENAI_API_KEY') else 'SIMULATION'\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81706cf1",
   "metadata": {},
   "source": [
    "## Cell 1: Supervisor Architecture\n",
    "\n",
    "**Ref:** Section 14.1, Figure 14.1 (pp. 393–395)\n",
    "\n",
    "The Financial Advisory Agent uses a **supervised multi-agent architecture** (Figure 14.1). A central Supervisor Agent serves as both entry point and policy-aware orchestrator, deciding which specialist to invoke, in what order, and with what state recorded at each step.\n",
    "\n",
    "**Specialist agents:** Market Data Agent, Financial Analysis Agent, News Agent\n",
    "\n",
    "**Figure 14.1 — Multi-agent architecture for the Financial Advisory Agent:**\n",
    "\n",
    "```\n",
    "                  ┌─────────────────┐\n",
    "                  │ Analysis Agent  │\n",
    "                  │(Financial       │\n",
    "                  │ Metrics)        │\n",
    "                  └───────┬─────────┘\n",
    "                          │\n",
    "                          ▼\n",
    "┌─────────────────┐ ┌───────────┐ ┌─────────────────┐\n",
    "│ Market Data     │←│Supervisor │→│   Risk Agent    │\n",
    "│ Agent           │ │   Agent   │ │  (VaR /         │\n",
    "│(yfinance /      │ │           │ │   Volatility)   │\n",
    "│ Finnhub)        │ └─────┬─────┘ └─────────────────┘\n",
    "└─────────────────┘       │\n",
    "                    FINISH (Post-Audit)\n",
    "                          │\n",
    "                          ▼\n",
    "                  ┌───────────────┐\n",
    "                  │  Compliance   │\n",
    "                  │  Validated    │\n",
    "                  │Recommendation│\n",
    "                  └───────────────┘\n",
    "```\n",
    "\n",
    "The state-graph routing mechanism is the key architectural safeguard — it turns the advisory process into a traceable sequence of states and transitions, making it possible to enforce tool permissions, require human checkpoints, and attach an audit trail to every recommendation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6b86028f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 1: Supervisor Architecture\n",
    "# Ref: Section 14.1, Figure 14.1 (p.393-395)\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "class RouteResponse(BaseModel):\n",
    "    next: Literal[\n",
    "        \"Market_Data_Agent\",\n",
    "        \"Financial_Analysis_Agent\",\n",
    "        \"News_Agent\",\n",
    "        \"FINISH\"\n",
    "    ]\n",
    "\n",
    "members = [\"Market_Data_Agent\", \"Financial_Analysis_Agent\", \"News_Agent\"]\n",
    "\n",
    "system_prompt = (\n",
    "    \"You are a Financial Services Supervisor managing: \"\n",
    "    f\"{', '.join(members)}. \"\n",
    "    \"Route queries to the appropriate specialist. \"\n",
    "    \"Use Market_Data_Agent for price and volume data. \"\n",
    "    \"Use Financial_Analysis_Agent for financial computations. \"\n",
    "    \"Use News_Agent for market news and sentiment. \"\n",
    "    \"Select FINISH when the query is fully resolved.\"\n",
    ")\n",
    "\n",
    "prompt = ChatPromptTemplate.from_messages([\n",
    "    (\"system\", system_prompt),\n",
    "    MessagesPlaceholder(variable_name=\"messages\"),\n",
    "    (\"system\", \"Choose the next agent from: {options}.\")\n",
    "]).partial(options=str(members + [\"FINISH\"]))\n",
    "\n",
    "def supervisor_agent(state):\n",
    "    \"\"\"Route to the next specialist agent via structured output.\n",
    "    Ref: Section 14.1, p.395\"\"\"\n",
    "    chain = prompt | llm.with_structured_output(RouteResponse)\n",
    "    return {\"next\": chain.invoke(state).next}\n",
    "\n",
    "logger.success(\"Supervisor architecture initialized\")\n",
    "logger.info(f\"Agent team: {members}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bb230f39",
   "metadata": {},
   "source": [
    "## Cell 2: Market Data Agent\n",
    "\n",
    "**Ref:** Section 14.1.1 (p.395–396)\n",
    "\n",
    "The Market Data Agent wraps the `yfinance` library to retrieve real-time stock information. The `@graceful_fallback` decorator ensures that if the live API fails, the agent falls back to chapter-derived mock data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5690a6c7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 2: Market Data Agent\n",
    "# Ref: Section 14.1.1, p.395-396\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "@tool\n",
    "def get_market_data(query: str) -> str:\n",
    "    \"\"\"Retrieve current market data for a given stock symbol.\n",
    "    Ref: Section 14.1.1, p.395\"\"\"\n",
    "    stock_symbol = query.strip().upper()\n",
    "\n",
    "    if config.is_live(\"OPENAI_API_KEY\"):\n",
    "        try:\n",
    "            import yfinance as yf\n",
    "            ticker = yf.Ticker(stock_symbol)\n",
    "            info = ticker.info\n",
    "            if not info or not info.get(\"currentPrice\"):\n",
    "                raise ValueError(\"Empty response\")\n",
    "            logger.success(f\"[Market Data] LIVE data for {stock_symbol}\")\n",
    "        except Exception:\n",
    "            info = MOCK_STOCK_DATA.get(stock_symbol, MOCK_STOCK_DATA[\"AAPL\"])\n",
    "            logger.info(f\"[Market Data] Fallback to mock for {stock_symbol}\")\n",
    "    else:\n",
    "        info = MOCK_STOCK_DATA.get(stock_symbol, MOCK_STOCK_DATA[\"AAPL\"])\n",
    "        logger.info(f\"[Market Data] SIMULATED data for {stock_symbol}\")\n",
    "\n",
    "    return (\n",
    "        f\"Market Data for {stock_symbol}: \"\n",
    "        f\"Price: ${info.get('currentPrice', 'N/A')}, \"\n",
    "        f\"Market Cap: ${info.get('marketCap', 'N/A')}, \"\n",
    "        f\"P/E Ratio: {info.get('trailingPE', 'N/A')}, \"\n",
    "        f\"Day Range: ${info.get('dayLow', 'N/A')}-${info.get('dayHigh', 'N/A')}, \"\n",
    "        f\"Volume: {info.get('volume', 'N/A')}\"\n",
    "    )\n",
    "\n",
    "market_agent = create_react_agent(\n",
    "    llm, tools=[get_market_data],\n",
    "    state_modifier=\"You are the Market Data Agent. \"\n",
    "    \"Retrieve real-time stock data for client queries.\"\n",
    ")\n",
    "\n",
    "# Demo\n",
    "print(get_market_data.invoke(\"AAPL\"))\n",
    "logger.success(\"Market Data Agent initialized\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ba8d2920",
   "metadata": {},
   "source": [
    "## Cell 3: Finnhub Integration — Portfolio Analysis\n",
    "\n",
    "**Ref:** Section 14.1.1 (p.396)\n",
    "\n",
    "For production deployments, the Finnhub API provides endpoints for basic financials, company metrics, real-time quotes, and company news."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14ceb268",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 3: Finnhub Integration — Portfolio Analysis\n",
    "# Ref: Section 14.1.1, p.396\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "finnhub_client = None\n",
    "if config.is_live(\"FINNHUB_API_KEY\"):\n",
    "    try:\n",
    "        import finnhub\n",
    "        finnhub_client = finnhub.Client(api_key=config.get_key(\"FINNHUB_API_KEY\"))\n",
    "        logger.success(\"[Finnhub] LIVE client initialized\")\n",
    "    except ImportError:\n",
    "        logger.warning(\"[Finnhub] finnhub-python not installed — using mock\")\n",
    "\n",
    "@tool\n",
    "def portfolio_analysis(query: str) -> str:\n",
    "    \"\"\"Fetch financial metrics using the Finnhub API.\n",
    "    Ref: Section 14.1.1, p.396\"\"\"\n",
    "    symbol = query.split()[-1].upper()\n",
    "\n",
    "    if finnhub_client is not None:\n",
    "        try:\n",
    "            financials = finnhub_client.company_basic_financials(symbol, \"all\")\n",
    "            metrics = financials.get(\"metric\", {})\n",
    "            logger.success(f\"[Finnhub] LIVE financials for {symbol}\")\n",
    "        except Exception:\n",
    "            financials = MOCK_FINNHUB_FINANCIALS.get(symbol, MOCK_FINNHUB_FINANCIALS[\"AAPL\"])\n",
    "            metrics = financials.get(\"metric\", {})\n",
    "            logger.info(f\"[Finnhub] Fallback to mock for {symbol}\")\n",
    "    else:\n",
    "        financials = MOCK_FINNHUB_FINANCIALS.get(symbol, MOCK_FINNHUB_FINANCIALS[\"AAPL\"])\n",
    "        metrics = financials.get(\"metric\", {})\n",
    "        logger.info(f\"[Finnhub] SIMULATED financials for {symbol}\")\n",
    "\n",
    "    return (\n",
    "        f\"Portfolio Analysis for {symbol}: \"\n",
    "        f\"P/E Ratio: {metrics.get('peRatio')}, \"\n",
    "        f\"Revenue Growth: {metrics.get('revenueGrowth')}, \"\n",
    "        f\"52W High: {metrics.get('52WeekHigh')}, \"\n",
    "        f\"52W Low: {metrics.get('52WeekLow')}\"\n",
    "    )\n",
    "\n",
    "analysis_agent = create_react_agent(\n",
    "    llm, tools=[portfolio_analysis],\n",
    "    state_modifier=\"You are the Financial Analysis Agent. \"\n",
    "    \"Perform portfolio analysis and compute financial metrics.\"\n",
    ")\n",
    "\n",
    "print(portfolio_analysis.invoke(\"Analyze AAPL\"))\n",
    "logger.success(\"Financial Analysis Agent initialized\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "03350b27",
   "metadata": {},
   "source": [
    "## Cell 4: Financial News Agent\n",
    "\n",
    "**Ref:** Section 14.1.1 (p.397)\n",
    "\n",
    "The Financial News Agent provides qualitative context using Tavily search-based retrieval. In Simulation Mode, it returns chapter-derived mock news results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fe26848e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 4: Financial News Agent\n",
    "# Ref: Section 14.1.1, p.397\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "@tool\n",
    "def search_financial_news(query: str) -> str:\n",
    "    \"\"\"Search for financial news using Tavily or mock data.\n",
    "    Ref: Section 14.1.1, p.397\"\"\"\n",
    "    if config.is_live(\"TAVILY_API_KEY\"):\n",
    "        try:\n",
    "            from langchain_community.tools.tavily_search import TavilySearchResults\n",
    "            tavily_tool = TavilySearchResults(max_results=5)\n",
    "            results = tavily_tool.invoke(query)\n",
    "            logger.success(f\"[Tavily] LIVE search: {len(results)} results\")\n",
    "            return json.dumps(results, indent=2)\n",
    "        except Exception as e:\n",
    "            logger.warning(f\"[Tavily] API error: {e} — using mock\")\n",
    "\n",
    "    logger.info(\"[Tavily] SIMULATED news search\")\n",
    "    return json.dumps(MOCK_TAVILY_NEWS, indent=2)\n",
    "\n",
    "financial_news_agent = create_react_agent(\n",
    "    llm, tools=[search_financial_news],\n",
    "    state_modifier=\"You are the Financial News Agent. \"\n",
    "    \"Retrieve and summarize the latest financial news \"\n",
    "    \"relevant to the user's query.\"\n",
    ")\n",
    "\n",
    "# Demo\n",
    "news_result = json.loads(search_financial_news.invoke(\"technology sector outlook\"))\n",
    "for item in news_result[:3]:\n",
    "    print(f\"  * {item['title']} (score: {item['score']})\")\n",
    "logger.success(\"Financial News Agent initialized\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "643a7c5c",
   "metadata": {},
   "source": [
    "## Cell 5: StateGraph Assembly and Streaming Execution\n",
    "\n",
    "**Ref:** Section 14.1 (p.397–399)\n",
    "\n",
    "The supervisor orchestrates the specialist agents through a LangGraph `StateGraph`. This loop-until-complete pattern ensures complex multi-source queries are fully resolved before generating a response.\n",
    "\n",
    "**Topology:** `START` → `supervisor` → conditional edge to specialist or `FINISH` → `END`. Each specialist returns to the supervisor for re-evaluation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e3604108",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 5: StateGraph Assembly and Streaming Execution\n",
    "# Ref: Section 14.1, p.397-399\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "class AgentState(TypedDict):\n",
    "    messages: Annotated[Sequence[BaseMessage], operator.add]\n",
    "    next: str\n",
    "\n",
    "def agent_node(state, agent, name):\n",
    "    \"\"\"Execute a specialist agent and wrap its result for the state graph.\n",
    "    Ref: Section 14.1, p.397\"\"\"\n",
    "    result = agent.invoke(state)\n",
    "    return {\"messages\": [HumanMessage(\n",
    "        content=result[\"messages\"][-1].content, name=name\n",
    "    )]}\n",
    "\n",
    "market_data_node = partial(agent_node, agent=market_agent, name=\"Market_Data_Agent\")\n",
    "analysis_node = partial(agent_node, agent=analysis_agent, name=\"Financial_Analysis_Agent\")\n",
    "news_node = partial(agent_node, agent=financial_news_agent, name=\"News_Agent\")\n",
    "\n",
    "workflow = StateGraph(AgentState)\n",
    "workflow.add_node(\"Market_Data_Agent\", market_data_node)\n",
    "workflow.add_node(\"Financial_Analysis_Agent\", analysis_node)\n",
    "workflow.add_node(\"News_Agent\", news_node)\n",
    "workflow.add_node(\"supervisor\", supervisor_agent)\n",
    "\n",
    "for member in members:\n",
    "    workflow.add_edge(member, \"supervisor\")\n",
    "\n",
    "conditional_map = {m: m for m in members}\n",
    "conditional_map[\"FINISH\"] = END\n",
    "\n",
    "workflow.add_conditional_edges(\n",
    "    \"supervisor\", lambda x: x[\"next\"], conditional_map\n",
    ")\n",
    "workflow.add_edge(START, \"supervisor\")\n",
    "\n",
    "graph = workflow.compile()\n",
    "logger.success(\"Financial Advisory StateGraph compiled\")\n",
    "\n",
    "# Execute streaming query — Ref: p.398-399\n",
    "MockStructuredChain.reset()\n",
    "logger.info(\"Executing: 'Analyze the portfolio for AAPL.'\")\n",
    "inputs = {\"messages\": [HumanMessage(content=\"Analyze the portfolio for AAPL.\")]}\n",
    "\n",
    "for output in graph.stream(inputs, stream_mode=\"values\"):\n",
    "    if \"messages\" in output:\n",
    "        last_msg = output[\"messages\"][-1]\n",
    "        sender = getattr(last_msg, \"name\", \"system\")\n",
    "        content = last_msg.content[:120] if last_msg.content else \"(routing)\"\n",
    "        logger.info(f\"[{sender}] {content}\")\n",
    "\n",
    "logger.success(\"StateGraph execution complete\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11686e69",
   "metadata": {},
   "source": [
    "## Cell 6: Risk Assessment Framework\n",
    "\n",
    "**Ref:** Section 14.1.2 (pp. 399–404)\n",
    "\n",
    "Three levels of risk evaluation:\n",
    "1. **Basic volatility classification** — `|dp| > 5` → HIGH, `|dp| > 2` → MODERATE, else LOW\n",
    "2. **Composite risk scoring** (`RiskScorer`) — 0.4 × volatility + 0.35 × drawdown + 0.25 × VaR (0–10 scale)\n",
    "3. **Client tolerance adjustment** (`assess_risk`) — Maps market risk against conservative/moderate/aggressive tolerance\n",
    "\n",
    "---\n",
    "\n",
    "> 📌 **Info Box — Forty-five minutes, four hundred forty million dollars (p. 399)**\n",
    ">\n",
    "> On August 1, 2012, Knight Capital Group deployed a software update to its automated trading system. A configuration error reactivated dormant code that began executing millions of unintended trades across 154 stocks. In forty-five minutes, the firm accumulated $7 billion in erroneous positions and lost approximately $440 million — nearly its entire market capitalization. Knight was rescued through an emergency capital raise but never fully recovered, merging with Getco LLC the following year. The incident remains the canonical warning for automated financial systems: **speed without safeguards is not an advantage — it is a liability.** Every compliance gate, risk threshold, and human checkpoint described in this chapter exists to prevent precisely this kind of cascading failure.\n",
    "\n",
    "---\n",
    "\n",
    "> 📌 **Info Box — Key risk metrics (p. 400)**\n",
    ">\n",
    "> **Value at Risk (VaR)** estimates the maximum expected loss over a given time horizon at a specified confidence level (e.g., a 1-day 95% VaR of $10,000 means there is a 5% chance of losing more than $10,000 in a single day). **Conditional Value at Risk (CVaR)**, also called expected shortfall, measures the average loss in the worst-case scenarios beyond the VaR threshold, making it more sensitive to tail risk. **Annualized volatility** expresses the standard deviation of returns scaled to a one-year period, providing a comparable measure of price instability across assets. **Maximum drawdown** captures the largest peak-to-trough decline over a period, reflecting the worst loss an investor would have experienced had they bought at the peak and sold at the trough.\n",
    "\n",
    "---\n",
    "\n",
    "> 📌 **Info Box — Production data feeds (p. 398)**\n",
    ">\n",
    "> The yfinance library is suitable for prototyping and demonstration but does not provide the reliability guarantees required in regulated financial environments. Production systems should source market data from commercial providers such as Bloomberg, Refinitiv (LSEG Data & Analytics), or FactSet, which offer contractual SLA commitments, real-time feeds with sub-second latency, and data quality controls. When evaluating a provider, confirm coverage for all required asset classes, jurisdictional data permissions, and API rate limits that will hold under production load."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ff2b3c04",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 6: Risk Assessment Framework\n",
    "# Ref: Section 14.1.2, p.399-404\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "# ── 1. Basic Volatility Classification ──\n",
    "# Ref: p.400 — thresholds: abs(dp) > 5 → HIGH, > 2 → MODERATE, else LOW\n",
    "\n",
    "def risk_assessment(query: str) -> str:\n",
    "    \"\"\"Evaluate investment risk using real-time volatility metrics.\n",
    "    Ref: Section 14.1.2, p.400\"\"\"\n",
    "    symbol = query.split()[-1].upper()\n",
    "\n",
    "    if finnhub_client is not None:\n",
    "        try:\n",
    "            quote = finnhub_client.quote(symbol)\n",
    "            logger.success(f\"[Risk] LIVE quote for {symbol}\")\n",
    "        except Exception:\n",
    "            quote = MOCK_FINNHUB_QUOTES.get(symbol, MOCK_FINNHUB_QUOTES[\"AAPL\"])\n",
    "            logger.info(f\"[Risk] Fallback to mock quote for {symbol}\")\n",
    "    else:\n",
    "        quote = MOCK_FINNHUB_QUOTES.get(symbol, MOCK_FINNHUB_QUOTES[\"AAPL\"])\n",
    "        logger.info(f\"[Risk] SIMULATED quote for {symbol}\")\n",
    "\n",
    "    price_change = quote.get(\"dp\", 0)\n",
    "\n",
    "    if abs(price_change) > 5:\n",
    "        risk_level = \"High Risk\"\n",
    "    elif abs(price_change) > 2:\n",
    "        risk_level = \"Moderate Risk\"\n",
    "    else:\n",
    "        risk_level = \"Low Risk\"\n",
    "\n",
    "    return (\n",
    "        f\"Risk Assessment for {symbol}: \"\n",
    "        f\"Price Change: {price_change}%, \"\n",
    "        f\"Risk Level: {risk_level}\"\n",
    "    )\n",
    "\n",
    "for sym in [\"AAPL\", \"GOOGL\", \"MSFT\"]:\n",
    "    print(risk_assessment(f\"Assess {sym}\"))\n",
    "\n",
    "print()\n",
    "\n",
    "# ── 2. Composite Risk Scoring ──\n",
    "# Ref: p.400-401 — weights: 0.4 vol + 0.35 dd + 0.25 var\n",
    "\n",
    "class RiskScorer:\n",
    "    \"\"\"Multi-dimensional risk scoring for portfolio positions.\n",
    "    Ref: Section 14.1.2, p.400-401\"\"\"\n",
    "\n",
    "    def compute_risk_score(self, symbol: str,\n",
    "                           lookback_days: int = 90) -> dict:\n",
    "        \"\"\"Compute composite risk score incorporating\n",
    "        volatility, drawdown, and VaR metrics.\"\"\"\n",
    "        try:\n",
    "            if config.is_live(\"OPENAI_API_KEY\"):\n",
    "                import yfinance as yf\n",
    "                ticker = yf.Ticker(symbol)\n",
    "                hist = ticker.history(period=f\"{lookback_days}d\")\n",
    "                if hist.empty:\n",
    "                    raise ValueError(\"Empty history\")\n",
    "                logger.success(f\"[RiskScorer] LIVE history for {symbol}\")\n",
    "            else:\n",
    "                raise ValueError(\"Simulation mode\")\n",
    "        except Exception:\n",
    "            mock_hist = generate_mock_price_history(symbol, days=lookback_days)\n",
    "            hist = pd.DataFrame(mock_hist)\n",
    "            logger.info(f\"[RiskScorer] SIMULATED history for {symbol}\")\n",
    "\n",
    "        returns = hist[\"Close\"].pct_change().dropna()\n",
    "\n",
    "        # Ref: p.401\n",
    "        volatility = returns.std() * np.sqrt(252)\n",
    "        cumulative = (1 + returns).cumprod()\n",
    "        rolling_max = cumulative.cummax()\n",
    "        drawdown = (cumulative - rolling_max) / rolling_max\n",
    "        max_drawdown = drawdown.min()\n",
    "        var_95 = np.percentile(returns, 5)\n",
    "\n",
    "        vol_score = min(volatility / 0.05, 10)\n",
    "        dd_score = min(abs(max_drawdown) / 0.05, 10)\n",
    "        var_score = min(abs(var_95) / 0.03, 10)\n",
    "\n",
    "        composite = 0.4 * vol_score + 0.35 * dd_score + 0.25 * var_score\n",
    "\n",
    "        return {\n",
    "            \"symbol\": symbol,\n",
    "            \"annualized_volatility\": round(float(volatility), 4),\n",
    "            \"max_drawdown\": round(float(max_drawdown), 4),\n",
    "            \"var_95\": round(float(var_95), 4),\n",
    "            \"composite_risk_score\": round(float(composite), 2),\n",
    "            \"risk_category\": self._categorize(float(composite)),\n",
    "        }\n",
    "\n",
    "    @staticmethod\n",
    "    def _categorize(score: float) -> str:\n",
    "        \"\"\"Ref: p.401 — >= 7.0 HIGH, >= 4.0 MODERATE, else LOW\"\"\"\n",
    "        if score >= 7.0:\n",
    "            return \"HIGH\"\n",
    "        elif score >= 4.0:\n",
    "            return \"MODERATE\"\n",
    "        return \"LOW\"\n",
    "\n",
    "scorer = RiskScorer()\n",
    "risk_result = scorer.compute_risk_score(\"AAPL\")\n",
    "print(\"--- Composite Risk Score ---\")\n",
    "for key, value in risk_result.items():\n",
    "    print(f\"  {key}: {value}\")\n",
    "print()\n",
    "\n",
    "# ── 3. Client Tolerance Adjustment ──\n",
    "# Ref: p.401-402\n",
    "\n",
    "def assess_risk(stock_symbol: str, composite_score: float,\n",
    "                client_risk_tolerance: str) -> dict:\n",
    "    \"\"\"Evaluate risk level adjusted for client tolerance.\n",
    "    Ref: Section 14.1.2, p.401-402\"\"\"\n",
    "    if composite_score >= 7.0:\n",
    "        market_risk = \"HIGH\"\n",
    "    elif composite_score >= 4.0:\n",
    "        market_risk = \"MODERATE\"\n",
    "    else:\n",
    "        market_risk = \"LOW\"\n",
    "\n",
    "    tolerance_map = {\n",
    "        \"conservative\": {\"HIGH\": \"UNACCEPTABLE\", \"MODERATE\": \"HIGH\", \"LOW\": \"MODERATE\"},\n",
    "        \"moderate\": {\"HIGH\": \"HIGH\", \"MODERATE\": \"MODERATE\", \"LOW\": \"LOW\"},\n",
    "        \"aggressive\": {\"HIGH\": \"MODERATE\", \"MODERATE\": \"LOW\", \"LOW\": \"LOW\"},\n",
    "    }\n",
    "\n",
    "    adjusted_risk = tolerance_map.get(\n",
    "        client_risk_tolerance, {}\n",
    "    ).get(market_risk, market_risk)\n",
    "\n",
    "    return {\n",
    "        \"symbol\": stock_symbol,\n",
    "        \"market_risk\": market_risk,\n",
    "        \"client_risk_tolerance\": client_risk_tolerance,\n",
    "        \"adjusted_risk\": adjusted_risk,\n",
    "    }\n",
    "\n",
    "print(\"--- Client Tolerance Adjustment ---\")\n",
    "for tolerance in [\"conservative\", \"moderate\", \"aggressive\"]:\n",
    "    result = assess_risk(\"AAPL\", risk_result[\"composite_risk_score\"], tolerance)\n",
    "    print(f\"  {tolerance}: market={result['market_risk']} -> adjusted={result['adjusted_risk']}\")\n",
    "\n",
    "logger.success(\"Risk Assessment Framework complete\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ba189706",
   "metadata": {},
   "source": [
    "## Cell 7: Personalized Financial Planning and Compliance Gate\n",
    "\n",
    "**Ref:** Section 14.1.3 (pp. 403–408)\n",
    "\n",
    "The compliance gate implements **compliance-by-architecture**: it is structurally impossible for a non-compliant recommendation to reach the client. The `validate_compliance` node checks suitability and concentration limits (max 25%). If it fails, the `revise` node adjusts and loops back for re-validation.\n",
    "\n",
    "**Table 14.1 (p. 403): Performance impact of AI-powered personalization in retail financial advisory**\n",
    "\n",
    "| Metric | Before AI | After AI |\n",
    "|:-------|:----------|:---------|\n",
    "| Response time | 4-hour average | 30 seconds |\n",
    "| Compliance accuracy | 95% | 99.99% |\n",
    "| Client capacity | 100 per advisor | 1,000+ per advisor |\n",
    "| Monthly revenue | $8,000 | $25,000 |\n",
    "\n",
    "> 📌 **Industry Examples (p. 408):** JPMorgan Chase has invested in AI-driven summarization of market research, natural language understanding for client inquiries, and compliance monitoring that flags risky advisory patterns. Virgin Money's Redi agent demonstrates how agentic systems can elevate retail banking beyond static FAQs, handling nuanced financial questions involving transactional histories and account-linked decisions with context-aware personalization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0831e017",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 7: Personalized Financial Planning and Compliance Gate\n",
    "# Ref: Section 14.1.3, p.403-408\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "class ClientProfileAgent:\n",
    "    \"\"\"Retrieves and contextualizes client financial profiles.\n",
    "    Ref: Section 14.1.3, p.403-404\"\"\"\n",
    "\n",
    "    def __init__(self, profile_store: dict):\n",
    "        self.profiles = profile_store\n",
    "\n",
    "    def get_contextualized_profile(self, client_id: str,\n",
    "                                   query_context: str = \"\") -> dict:\n",
    "        profile = self.profiles.get(client_id, {})\n",
    "        if not profile:\n",
    "            logger.warning(f\"[ClientProfile] No profile found for {client_id}\")\n",
    "            return {}\n",
    "        logger.info(f\"[ClientProfile] Retrieved profile for {profile.get('name', client_id)}\")\n",
    "        return {\n",
    "            \"profile\": profile,\n",
    "            \"risk_tolerance\": profile.get(\"risk_tolerance\"),\n",
    "            \"max_risk_tolerance\": profile.get(\"max_risk_tolerance\", 5.0),\n",
    "            \"investment_horizon\": profile.get(\"investment_horizon\"),\n",
    "            \"regulatory_constraints\": profile.get(\"constraints\", []),\n",
    "        }\n",
    "\n",
    "client_agent = ClientProfileAgent(MOCK_CLIENT_PROFILES)\n",
    "profile_data = client_agent.get_contextualized_profile(\"retail_client_4521\")\n",
    "print(f\"Client: {profile_data['profile']['name']}\")\n",
    "print(f\"  Risk tolerance: {profile_data['risk_tolerance']}, Horizon: {profile_data['investment_horizon']}\")\n",
    "print()\n",
    "\n",
    "# ── Compliance-Gated Advisory Workflow ──\n",
    "# Ref: p.405-406\n",
    "\n",
    "class AdvisoryState(TypedDict):\n",
    "    messages: list\n",
    "    client_profile: dict\n",
    "    recommendation: dict\n",
    "    compliance_result: dict\n",
    "    final_response: str\n",
    "\n",
    "policy_rules = {\"max_concentration\": 0.25}\n",
    "\n",
    "def generate_recommendation(state: AdvisoryState):\n",
    "    \"\"\"Generate allocation based on client profile. Ref: p.405, p.406-407\"\"\"\n",
    "    tolerance = state[\"client_profile\"].get(\"risk_tolerance\", \"moderate\")\n",
    "    allocations = {\n",
    "        \"conservative\": ({\"us_equities\": 0.25, \"international_equities\": 0.10,\n",
    "                          \"fixed_income\": 0.50, \"alternatives\": 0.15}, 3.2),\n",
    "        \"aggressive\": ({\"us_equities\": 0.55, \"international_equities\": 0.25,\n",
    "                        \"fixed_income\": 0.10, \"alternatives\": 0.10}, 7.8),\n",
    "        \"moderate\": ({\"us_equities\": 0.45, \"international_equities\": 0.20,\n",
    "                      \"fixed_income\": 0.25, \"alternatives\": 0.10}, 6.2),\n",
    "    }\n",
    "    alloc, risk = allocations.get(tolerance, allocations[\"moderate\"])\n",
    "    logger.info(f\"[Recommend] Generated allocation for {tolerance} client\")\n",
    "    return {\"recommendation\": {\"allocation\": alloc, \"risk_score\": risk,\n",
    "                               \"expected_annual_return\": 0.078,\n",
    "                               \"max_drawdown_estimate\": -0.18}}\n",
    "\n",
    "def validate_compliance(state: AdvisoryState):\n",
    "    \"\"\"Validate against regulatory requirements. Ref: p.405-406\"\"\"\n",
    "    rec = state[\"recommendation\"]\n",
    "    profile = state[\"client_profile\"]\n",
    "    issues = []\n",
    "    max_tol = profile.get(\"max_risk_tolerance\", 5.0)\n",
    "    if rec[\"risk_score\"] > max_tol:\n",
    "        issues.append(f\"SUITABILITY: Risk ({rec['risk_score']}) exceeds tolerance ({max_tol})\")\n",
    "    max_conc = policy_rules.get(\"max_concentration\", 0.25)\n",
    "    for asset, weight in rec[\"allocation\"].items():\n",
    "        if weight > max_conc:\n",
    "            issues.append(f\"CONCENTRATION: {asset} at {weight:.0%} exceeds {max_conc:.0%}\")\n",
    "    if issues:\n",
    "        for i in issues:\n",
    "            logger.warning(f\"[Compliance] {i}\")\n",
    "    else:\n",
    "        logger.success(\"[Compliance] All checks passed\")\n",
    "    return {\"compliance_result\": {\"passed\": len(issues) == 0, \"issues\": issues}}\n",
    "\n",
    "def route_after_compliance(state: AdvisoryState):\n",
    "    return \"deliver\" if state[\"compliance_result\"][\"passed\"] else \"revise\"\n",
    "\n",
    "def revise_recommendation(state: AdvisoryState):\n",
    "    \"\"\"Revise non-compliant recommendation. Ref: p.408\"\"\"\n",
    "    rec = state[\"recommendation\"]\n",
    "    logger.info(f\"[Revise] Adjusting recommendation\")\n",
    "    allocation = dict(rec[\"allocation\"])\n",
    "    max_conc = policy_rules.get(\"max_concentration\", 0.25)\n",
    "    # Cap all over-limit positions and redistribute evenly\n",
    "    total_excess = 0.0\n",
    "    for asset in list(allocation):\n",
    "        if allocation[asset] > max_conc:\n",
    "            total_excess += allocation[asset] - max_conc\n",
    "            allocation[asset] = max_conc\n",
    "    # Spread excess proportionally to assets still under limit\n",
    "    if total_excess > 0:\n",
    "        under = [a for a in allocation if allocation[a] < max_conc]\n",
    "        if under:\n",
    "            share = total_excess / len(under)\n",
    "            for a in under:\n",
    "                allocation[a] = round(min(allocation[a] + share, max_conc), 4)\n",
    "    new_risk = min(rec[\"risk_score\"],\n",
    "                   state[\"client_profile\"].get(\"max_risk_tolerance\", 5.0))\n",
    "    return {\"recommendation\": {**rec, \"allocation\": allocation, \"risk_score\": new_risk}}\n",
    "\n",
    "def deliver_to_client(state: AdvisoryState):\n",
    "    \"\"\"Deliver validated recommendation. Ref: p.407\"\"\"\n",
    "    rec = state[\"recommendation\"]\n",
    "    name = state[\"client_profile\"].get(\"name\", \"Client\")\n",
    "    response = (f\"Advisory Recommendation for {name}:\\n\"\n",
    "                f\"  Allocation: {json.dumps(rec['allocation'], indent=4)}\\n\"\n",
    "                f\"  Risk Score: {rec['risk_score']}\\n\"\n",
    "                f\"  Compliance: VALIDATED\")\n",
    "    logger.success(f\"[Deliver] Recommendation delivered to {name}\")\n",
    "    print(\"\\n\" + response)\n",
    "    return {\"final_response\": response}\n",
    "\n",
    "compliance_workflow = StateGraph(AdvisoryState)\n",
    "compliance_workflow.add_node(\"recommend\", generate_recommendation)\n",
    "compliance_workflow.add_node(\"comply\", validate_compliance)\n",
    "compliance_workflow.add_node(\"deliver\", deliver_to_client)\n",
    "compliance_workflow.add_node(\"revise\", revise_recommendation)\n",
    "compliance_workflow.add_edge(START, \"recommend\")\n",
    "compliance_workflow.add_edge(\"recommend\", \"comply\")\n",
    "compliance_workflow.add_conditional_edges(\n",
    "    \"comply\", route_after_compliance,\n",
    "    {\"deliver\": \"deliver\", \"revise\": \"revise\"}\n",
    ")\n",
    "compliance_workflow.add_edge(\"revise\", \"comply\")\n",
    "compliance_workflow.add_edge(\"deliver\", END)\n",
    "compliance_graph = compliance_workflow.compile()\n",
    "logger.success(\"Compliance-gated advisory workflow compiled\")\n",
    "\n",
    "# Execute for moderate client\n",
    "MockStructuredChain.reset()\n",
    "compliance_graph.invoke({\n",
    "    \"messages\": [], \"client_profile\": MOCK_CLIENT_PROFILES[\"retail_client_4521\"],\n",
    "    \"recommendation\": {}, \"compliance_result\": {}, \"final_response\": \"\",\n",
    "})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ca50eb2e",
   "metadata": {},
   "source": [
    "## Cell 8: RetailAdvisor Case Study\n",
    "\n",
    "**Ref:** Section 14.1.4 (p.406–410)\n",
    "\n",
    "End-to-end demonstration: *\"I have $50,000 to invest and want moderate growth over the next ten years.\"* Shows the inter-agent JSON communication protocol (p.407), risk scoring, and compliance validation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a60ebfb3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 8: RetailAdvisor Case Study\n",
    "# Ref: Section 14.1.4, p.406-410\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "logger.info(\"=\" * 60)\n",
    "logger.info(\"CASE STUDY: RetailAdvisor\")\n",
    "logger.info(\"=\" * 60)\n",
    "\n",
    "# Step 1: Client query from chapter p.408\n",
    "query = \"I have $50,000 to invest and want moderate growth over the next ten years.\"\n",
    "logger.info(f\"[Client Query] {query}\")\n",
    "\n",
    "# Step 2: Client profile\n",
    "cp = client_agent.get_contextualized_profile(\"retail_client_4521\")\n",
    "print(f\"\\nClient: {cp['profile']['name']}\")\n",
    "print(f\"  Investment: ${cp['profile']['initial_investment']:,}, Horizon: {cp['investment_horizon']}\")\n",
    "\n",
    "# Step 3: Risk scoring\n",
    "print(\"\\n--- Portfolio Risk Assessment ---\")\n",
    "for sym in [\"AAPL\", \"MSFT\", \"GOOGL\"]:\n",
    "    s = scorer.compute_risk_score(sym)\n",
    "    print(f\"  {sym}: composite={s['composite_risk_score']}, category={s['risk_category']}\")\n",
    "\n",
    "# Step 4: Inter-agent message protocol (p.407)\n",
    "print(\"\\n--- Inter-Agent Communication Protocol (p.407) ---\")\n",
    "print(json.dumps(MOCK_INTER_AGENT_MESSAGE, indent=2))\n",
    "\n",
    "# Step 5: Compliance-gated pipeline\n",
    "MockStructuredChain.reset()\n",
    "print(\"\\n--- Compliance-Gated Advisory Pipeline ---\")\n",
    "result = compliance_graph.invoke({\n",
    "    \"messages\": [], \"client_profile\": MOCK_CLIENT_PROFILES[\"retail_client_4521\"],\n",
    "    \"recommendation\": {}, \"compliance_result\": {}, \"final_response\": \"\",\n",
    "})\n",
    "\n",
    "# Step 6: Test with conservative client (may trigger revisions)\n",
    "print(\"\\n--- Conservative client (may trigger compliance revision) ---\")\n",
    "compliance_graph.invoke({\n",
    "    \"messages\": [], \"client_profile\": MOCK_CLIENT_PROFILES[\"retail_client_7832\"],\n",
    "    \"recommendation\": {}, \"compliance_result\": {}, \"final_response\": \"\",\n",
    "})\n",
    "\n",
    "logger.success(\"RetailAdvisor Case Study complete\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9f49dcdd",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "# Part 2: The Legal Intelligence Agent\n",
    "\n",
    "**Ref:** Section 14.2 (p.408–419)\n",
    "\n",
    "The Legal Intelligence Agent must reason over language where the same words carry different weight depending on context, jurisdiction, and the authority of the source. This section implements:\n",
    "- Legal Knowledge Base with hybrid retrieval (Section 14.2.1)\n",
    "- Precedent Finding with a 3-stage pipeline (Section 14.2.2)\n",
    "- Contract Analysis with a 5-stage pipeline (Section 14.2.3)\n",
    "- LegalBrief Case Study with citation verification (Section 14.2.4)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "515c07e3",
   "metadata": {},
   "source": [
    "## Cell 9: Legal Knowledge Base\n",
    "\n",
    "**Ref:** Section 14.2.1 (p.408–410)\n",
    "\n",
    "The foundation of the Legal Intelligence Agent is a structured, searchable repository of case law. The hybrid search strategy combines dense vector retrieval (semantic similarity) with authority-weighted ranking. Final score formula:\n",
    "\n",
    "`final_score = 0.5 * similarity + 0.3 * authority_boost + 0.2 * recency_boost`\n",
    "\n",
    "A Supreme Court decision (authority 10) outranks a district court ruling (authority 3) when semantic similarity is comparable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b6e4e013",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 9: Legal Knowledge Base\n",
    "# Ref: Section 14.2.1, p.408-410\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "from datetime import datetime\n",
    "\n",
    "class LegalKnowledgeBase:\n",
    "    \"\"\"Legal knowledge base with hierarchical authority\n",
    "    tracking and hybrid retrieval.\n",
    "    Ref: Section 14.2.1, p.408-410\"\"\"\n",
    "\n",
    "    def __init__(self, vector_store, embedding_model):\n",
    "        self.store = vector_store\n",
    "        self.embedder = embedding_model\n",
    "\n",
    "    def ingest_case(self, case: dict):\n",
    "        \"\"\"Ingest a case with structured authority metadata.\n",
    "        Ref: p.409\"\"\"\n",
    "        embedding = self.embedder.encode(case[\"text\"])\n",
    "        metadata = {\n",
    "            \"case_name\": case[\"name\"],\n",
    "            \"citation\": case[\"citation\"],\n",
    "            \"court\": case[\"court\"],\n",
    "            \"jurisdiction\": case[\"jurisdiction\"],\n",
    "            \"date\": case[\"date\"],\n",
    "            \"authority_level\": self._classify_authority(case.get(\"authority_level\", 0)),\n",
    "            \"status\": case.get(\"status\", \"good_law\"),\n",
    "            \"legal_issues\": case.get(\"issues\", []),\n",
    "            \"key_holdings\": case.get(\"holdings\", []),\n",
    "        }\n",
    "        self.store.upsert(\n",
    "            id=case[\"citation\"],\n",
    "            embedding=embedding,\n",
    "            metadata=metadata,\n",
    "        )\n",
    "\n",
    "    @staticmethod\n",
    "    def _classify_authority(level):\n",
    "        \"\"\"Return the authority level directly (already encoded in mock data).\"\"\"\n",
    "        return level\n",
    "\n",
    "    def hybrid_search(self, query: str, jurisdiction: str = None,\n",
    "                      min_authority: int = 0) -> list:\n",
    "        \"\"\"Hybrid search combining semantic similarity with\n",
    "        authority-weighted ranking.\n",
    "        Ref: Section 14.2.1, p.409-410\"\"\"\n",
    "        query_embedding = self.embedder.encode(query)\n",
    "        filters = {}\n",
    "        if jurisdiction:\n",
    "            filters[\"jurisdiction\"] = jurisdiction\n",
    "        if min_authority > 0:\n",
    "            filters[\"authority_level\"] = {\"$gte\": min_authority}\n",
    "\n",
    "        results = self.store.query(\n",
    "            embedding=query_embedding,\n",
    "            filter=filters,\n",
    "            top_k=50,\n",
    "        )\n",
    "\n",
    "        # Re-rank: 0.5 * similarity + 0.3 * authority + 0.2 * recency\n",
    "        # Ref: p.410\n",
    "        for result in results:\n",
    "            authority_boost = result.metadata[\"authority_level\"] / 10.0\n",
    "            recency_boost = self._recency_score(result.metadata[\"date\"])\n",
    "            result.final_score = (\n",
    "                0.5 * result.similarity_score +\n",
    "                0.3 * authority_boost +\n",
    "                0.2 * recency_boost\n",
    "            )\n",
    "\n",
    "        return sorted(results, key=lambda x: x.final_score, reverse=True)\n",
    "\n",
    "    @staticmethod\n",
    "    def _recency_score(date_str: str) -> float:\n",
    "        \"\"\"Compute recency score (0-1) based on how recent the case is.\"\"\"\n",
    "        try:\n",
    "            case_date = datetime.strptime(date_str, \"%Y-%m-%d\")\n",
    "            days_ago = (datetime.now() - case_date).days\n",
    "            return max(0, 1.0 - (days_ago / 3650))  # 10-year decay\n",
    "        except (ValueError, TypeError):\n",
    "            return 0.5\n",
    "\n",
    "    def citation_search(self, statutes: list, top_k: int = 10) -> list:\n",
    "        \"\"\"Search by statute/citation references. Ref: p.409-410\"\"\"\n",
    "        results = []\n",
    "        for statute in statutes:\n",
    "            embedding = self.embedder.encode(statute)\n",
    "            results.extend(self.store.query(embedding=embedding, top_k=top_k))\n",
    "        return results\n",
    "\n",
    "    def verify_citation(self, citation_text: str, jurisdiction: str = None,\n",
    "                        check_precedential: bool = True,\n",
    "                        check_good_law: bool = True) -> bool:\n",
    "        \"\"\"Verify a citation exists and is good law. Ref: p.417-418\"\"\"\n",
    "        return self.store.verify_citation(\n",
    "            citation_text, jurisdiction=jurisdiction,\n",
    "            check_precedential=check_precedential,\n",
    "            check_good_law=check_good_law,\n",
    "        )\n",
    "\n",
    "# ── Initialize and ingest cases ──\n",
    "legal_kb = LegalKnowledgeBase(\n",
    "    vector_store=MockVectorStore(),\n",
    "    embedding_model=MockEmbeddingModel(dimension=128),\n",
    ")\n",
    "\n",
    "logger.info(\"Ingesting legal case database...\")\n",
    "for case in MOCK_LEGAL_CASES:\n",
    "    legal_kb.ingest_case(case)\n",
    "logger.success(f\"Ingested {len(MOCK_LEGAL_CASES)} cases into legal knowledge base\")\n",
    "\n",
    "# ── Demo: Hybrid search ──\n",
    "print(\"\\n--- Hybrid Search: 'data privacy corporate liability' ---\")\n",
    "results = legal_kb.hybrid_search(\n",
    "    \"data privacy corporate liability\",\n",
    "    jurisdiction=\"federal\",\n",
    "    min_authority=3,\n",
    ")\n",
    "for r in results[:3]:\n",
    "    print(f\"  [{r.final_score:.3f}] {r.metadata['case_name']} \"\n",
    "          f\"({r.metadata['citation']}) — authority: {r.metadata['authority_level']}\")\n",
    "\n",
    "logger.success(\"Legal Knowledge Base initialized\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db8eacd1",
   "metadata": {},
   "source": [
    "## Cell 10: Precedent Finding — 3-Stage Pipeline\n",
    "\n",
    "**Ref:** Section 14.2.2, Figure 14.2 (pp. 410–414)\n",
    "\n",
    "The precedent finding pipeline operates in three stages (Figure 14.2):\n",
    "1. **Issue Extraction** — Decompose legal matter into discrete questions\n",
    "2. **Multi-Dimensional Retrieval** — Parallel semantic + citation search\n",
    "3. **Synthesis and Verification** — Rank by authority and factual relevance\n",
    "\n",
    "---\n",
    "\n",
    "> 📌 **Info Box — When AI cited cases that never existed (p. 411)**\n",
    ">\n",
    "> In June 2023, New York attorney Steven Schwartz made headlines that rippled across the legal profession. He had used ChatGPT to research a personal injury case, and the model generated a brief peppered with confident, properly formatted citations to cases like *Varghese v. China Southern Airlines* and *Martinez v. Delta Airlines*. **None of them existed.** The court sanctioned Schwartz and his firm, and within months, courts across the United States began issuing standing orders requiring attorneys to disclose AI usage and personally verify every AI-generated citation. It was a vivid demonstration of why citation verification in any legal AI pipeline is not a nice-to-have feature but a **professional survival requirement**.\n",
    "\n",
    "---\n",
    "\n",
    "**Figure 14.2** depicts the three-stage workflow:\n",
    "\n",
    "```\n",
    "┌─────────────────────────────────────────────┐\n",
    "│     Stage 1: Issue Extraction               │\n",
    "│  (Decomposing Matter into Discrete Questions)│\n",
    "└──────────────────┬──────────────────────────┘\n",
    "                   │\n",
    "        ┌──────────┼──────────┐\n",
    "        ▼          ▼          ▼\n",
    "┌─────────────┐ ┌──────────┐ ┌──────────────┐\n",
    "│  Semantic   │ │Authority │ │  Analogical  │\n",
    "│  Matching   │ │  Search  │ │  Reasoning   │\n",
    "│ (Concept    │ │(Statutes │ │(Factual      │\n",
    "│  Similarity)│ │ & Regs)  │ │ Patterns)    │\n",
    "└──────┬──────┘ └────┬─────┘ └──────┬───────┘\n",
    "       └─────────────┼──────────────┘\n",
    "                     ▼\n",
    "┌─────────────────────────────────────────────┐\n",
    "│   Stage 3: Synthesis & Verification         │\n",
    "│   ✓ Citation Verification Gate              │\n",
    "│     (Anti-Hallucination)                    │\n",
    "└─────────────────────┬───────────────────────┘\n",
    "                      ▼\n",
    "              STRUCTURED BRIEF\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6213a37d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 10: Precedent Finding — 3-Stage Pipeline\n",
    "# Ref: Section 14.2.2, Figure 14.2, p.410-414\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "class LegalIssue(BaseModel):\n",
    "    description: str\n",
    "    category: str\n",
    "    priority: int = 1  # 1 (high) to 3 (low)\n",
    "\n",
    "class IssueList(BaseModel):\n",
    "    issues: List[LegalIssue]\n",
    "\n",
    "class PrecedentFinder:\n",
    "    \"\"\"Identifies and analyzes relevant legal precedent\n",
    "    through multi-dimensional retrieval.\n",
    "    Ref: Section 14.2.2, p.410-414\"\"\"\n",
    "\n",
    "    def __init__(self, legal_kb, llm_instance):\n",
    "        self.knowledge_base = legal_kb\n",
    "        self.llm = llm_instance\n",
    "\n",
    "    def find_precedents(self, legal_matter: dict) -> list:\n",
    "        \"\"\"Execute the full precedent finding pipeline.\n",
    "        Ref: p.412\"\"\"\n",
    "        # Stage 1: Extract discrete legal issues\n",
    "        issues = self._extract_issues(legal_matter)\n",
    "        logger.info(f\"[PrecedentFinder] Stage 1: Extracted {len(issues)} issues\")\n",
    "\n",
    "        # Stage 2: Multi-dimensional retrieval\n",
    "        candidates = []\n",
    "        for issue in issues:\n",
    "            semantic_results = self.knowledge_base.hybrid_search(\n",
    "                query=issue.description,\n",
    "                jurisdiction=legal_matter.get(\"jurisdiction\"),\n",
    "                min_authority=3,\n",
    "            )\n",
    "            candidates.extend(semantic_results)\n",
    "        logger.info(f\"[PrecedentFinder] Stage 2: Retrieved {len(candidates)} candidates\")\n",
    "\n",
    "        # Stage 3: Deduplicate and rank\n",
    "        seen = set()\n",
    "        unique = []\n",
    "        for c in candidates:\n",
    "            if c.id not in seen:\n",
    "                seen.add(c.id)\n",
    "                unique.append(c)\n",
    "        ranked = sorted(unique, key=lambda x: x.final_score, reverse=True)\n",
    "        logger.info(f\"[PrecedentFinder] Stage 3: {len(ranked)} unique precedents ranked\")\n",
    "        return ranked\n",
    "\n",
    "    def _extract_issues(self, matter: dict) -> list:\n",
    "        \"\"\"Decompose a legal matter into discrete legal issues.\n",
    "        Ref: p.413-414\"\"\"\n",
    "        # In Simulation Mode, return chapter-faithful issues\n",
    "        description = matter.get(\"description\", \"\").lower()\n",
    "        issues = []\n",
    "\n",
    "        if \"data\" in description or \"privacy\" in description or \"breach\" in description:\n",
    "            issues.append(LegalIssue(\n",
    "                description=\"Standard of care in data protection\",\n",
    "                category=\"Regulatory Compliance\", priority=1))\n",
    "            issues.append(LegalIssue(\n",
    "                description=\"Elements of negligence in security breach\",\n",
    "                category=\"Tort Law\", priority=1))\n",
    "            issues.append(LegalIssue(\n",
    "                description=\"Applicable statutory obligations under privacy laws\",\n",
    "                category=\"Privacy Law\", priority=2))\n",
    "        elif \"jurisdiction\" in description or \"e-commerce\" in description:\n",
    "            issues.append(LegalIssue(\n",
    "                description=\"Personal jurisdiction over foreign corporations\",\n",
    "                category=\"Civil Procedure\", priority=1))\n",
    "            issues.append(LegalIssue(\n",
    "                description=\"Due process requirements for e-commerce disputes\",\n",
    "                category=\"Constitutional Law\", priority=2))\n",
    "        else:\n",
    "            issues.append(LegalIssue(\n",
    "                description=matter.get(\"description\", \"General legal matter\"),\n",
    "                category=\"General\", priority=1))\n",
    "\n",
    "        return issues\n",
    "\n",
    "# ── Demo: Precedent finding ──\n",
    "precedent_finder = PrecedentFinder(legal_kb, llm)\n",
    "\n",
    "legal_matter = {\n",
    "    \"description\": \"Liability for data breaches in healthcare settings \"\n",
    "                   \"involving unauthorized access to patient records\",\n",
    "    \"jurisdiction\": \"federal\",\n",
    "}\n",
    "\n",
    "print(\"--- Precedent Finding Pipeline ---\")\n",
    "print(f\"Matter: {legal_matter['description']}\")\n",
    "print()\n",
    "\n",
    "precedents = precedent_finder.find_precedents(legal_matter)\n",
    "print(\"Top precedents found:\")\n",
    "for i, p in enumerate(precedents[:5], 1):\n",
    "    status = p.metadata.get(\"status\", \"unknown\")\n",
    "    print(f\"  {i}. [{p.final_score:.3f}] {p.metadata['case_name']} \"\n",
    "          f\"({p.metadata['citation']}) — status: {status}\")\n",
    "\n",
    "logger.success(\"Precedent Finding pipeline complete\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "86d90bc8",
   "metadata": {},
   "source": [
    "## Cell 11: Contract Analysis — 5-Stage Pipeline\n",
    "\n",
    "**Ref:** Section 14.2.3, Figure 14.3 (pp. 414–416)\n",
    "\n",
    "The contract analysis pipeline (Figure 14.3) moves through:\n",
    "1. Document Ingestion — Parse structure, extract sections\n",
    "2. Clause Extraction — Classify clauses by type\n",
    "3. Risk Flagging — Score against risk matrix\n",
    "4. Compliance Validation — Check regulatory requirements (runs in parallel as a gate)\n",
    "5. Summary Generation — Structured output for attorney review\n",
    "\n",
    "**Figure 14.3 — End-to-end contract analysis framework:**\n",
    "\n",
    "```\n",
    "┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐\n",
    "│ 1.Document│→│ 2.Clause │→│ 3. Risk  │→│4.Compliance│→│5. Summary│→ Attorney\n",
    "│ Ingestion │  │Extraction│  │ Flagging │  │Validation │  │Generation│   Review\n",
    "└──────────┘  └──────────┘  └──────────┘  └──────────┘  └──────────┘\n",
    "                    │              │              │\n",
    "                    ▼              ▼              ▼\n",
    "              ┌─────────────────────────────────────────┐\n",
    "              │    Parallel Compliance Gate              │\n",
    "              │ • Validates at every stage               │\n",
    "              │ • Rejects unsafe outputs early           │\n",
    "              │ • Enforces required clauses              │\n",
    "              └─────────────────────────────────────────┘\n",
    "```\n",
    "\n",
    "The key architectural point is that compliance validation runs in parallel at every stage as a **gate**, not a post-processing step. This lets the system reject unsafe outputs early and ensure required clauses are present before a summary reaches counsel."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4144d7a8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 11: Contract Analysis — 5-Stage Pipeline\n",
    "# Ref: Section 14.2.3, Figure 14.3, p.414-416\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "class ContractAnalysisAgent:\n",
    "    \"\"\"Analyzes legal contracts through a multi-stage\n",
    "    pipeline with compliance validation.\n",
    "    Ref: Section 14.2.3, p.415-416\"\"\"\n",
    "\n",
    "    # Risk rules by clause type\n",
    "    RISK_RULES = {\n",
    "        \"indemnification\": {\"level\": \"HIGH\", \"reason\": \"One-sided indemnification\"},\n",
    "        \"liability\": {\"level\": \"HIGH\", \"reason\": \"Low liability cap relative to term\"},\n",
    "        \"data_processing\": {\"level\": \"CRITICAL\", \"reason\": \"Missing GDPR DPA and SCCs\"},\n",
    "    }\n",
    "\n",
    "    # Required clauses for compliance\n",
    "    REQUIRED_CLAUSES = [\"scope\", \"term\", \"payment\", \"confidentiality\", \"data_processing\"]\n",
    "\n",
    "    def __init__(self, llm_instance):\n",
    "        self.llm = llm_instance\n",
    "\n",
    "    def analyze_contract(self, document: dict, context: dict) -> dict:\n",
    "        \"\"\"Execute full contract analysis pipeline. Ref: p.415-416\"\"\"\n",
    "\n",
    "        # Stage 1: Document structure extraction\n",
    "        structure = {\n",
    "            \"title\": document.get(\"title\", \"Unknown\"),\n",
    "            \"parties\": document.get(\"parties\", {}),\n",
    "            \"effective_date\": document.get(\"effective_date\", \"Unknown\"),\n",
    "            \"governing_law\": document.get(\"governing_law\", \"Unknown\"),\n",
    "            \"total_sections\": len(document.get(\"sections\", [])),\n",
    "        }\n",
    "        logger.info(f\"[Contract] Stage 1: Parsed structure — {structure['total_sections']} sections\")\n",
    "\n",
    "        # Stage 2: Clause extraction and classification\n",
    "        clauses = []\n",
    "        for section in document.get(\"sections\", []):\n",
    "            clauses.append({\n",
    "                \"id\": section[\"id\"],\n",
    "                \"title\": section[\"title\"],\n",
    "                \"type\": section[\"type\"],\n",
    "                \"text\": section[\"text\"],\n",
    "            })\n",
    "        logger.info(f\"[Contract] Stage 2: Classified {len(clauses)} clauses\")\n",
    "\n",
    "        # Stage 3: Risk assessment per clause\n",
    "        risk_findings = []\n",
    "        for clause in clauses:\n",
    "            rule = self.RISK_RULES.get(clause[\"type\"])\n",
    "            if rule:\n",
    "                risk_findings.append({\n",
    "                    \"clause_id\": clause[\"id\"],\n",
    "                    \"clause_title\": clause[\"title\"],\n",
    "                    \"clause_type\": clause[\"type\"],\n",
    "                    \"risk_level\": rule[\"level\"],\n",
    "                    \"reason\": rule[\"reason\"],\n",
    "                    \"recommendation\": section.get(\"risk_notes\",\n",
    "                        f\"Review {clause['type']} clause with counsel\"),\n",
    "                })\n",
    "        logger.info(f\"[Contract] Stage 3: Found {len(risk_findings)} risk items\")\n",
    "\n",
    "        # Stage 4: Compliance validation\n",
    "        clause_types = {c[\"type\"] for c in clauses}\n",
    "        compliance_issues = []\n",
    "        for required in self.REQUIRED_CLAUSES:\n",
    "            if required not in clause_types:\n",
    "                compliance_issues.append(\n",
    "                    f\"MISSING REQUIRED CLAUSE: {required}\"\n",
    "                )\n",
    "\n",
    "        # Check for GDPR DPA (specific to data_processing)\n",
    "        for clause in clauses:\n",
    "            if clause[\"type\"] == \"data_processing\":\n",
    "                text_lower = clause[\"text\"].lower()\n",
    "                if \"gdpr\" not in text_lower and \"data processing addendum\" not in text_lower:\n",
    "                    compliance_issues.append(\n",
    "                        \"DATA PROCESSING: No GDPR Data Processing Addendum found\"\n",
    "                    )\n",
    "                if \"standard contractual clauses\" not in text_lower:\n",
    "                    compliance_issues.append(\n",
    "                        \"DATA PROCESSING: No Standard Contractual Clauses for cross-border transfer\"\n",
    "                    )\n",
    "        logger.info(f\"[Contract] Stage 4: {len(compliance_issues)} compliance issues\")\n",
    "\n",
    "        # Stage 5: Summary generation\n",
    "        summary = {\n",
    "            \"document\": structure[\"title\"],\n",
    "            \"parties\": structure[\"parties\"],\n",
    "            \"total_clauses\": len(clauses),\n",
    "            \"high_risk_count\": sum(1 for r in risk_findings if r[\"risk_level\"] in [\"HIGH\", \"CRITICAL\"]),\n",
    "            \"compliance_gaps\": len(compliance_issues),\n",
    "            \"overall_risk\": \"HIGH\" if risk_findings else \"LOW\",\n",
    "            \"recommendation\": \"Requires attorney review\" if risk_findings else \"Low risk\",\n",
    "        }\n",
    "        logger.info(\"[Contract] Stage 5: Summary generated\")\n",
    "\n",
    "        return {\n",
    "            \"structure\": structure,\n",
    "            \"clauses\": clauses,\n",
    "            \"risk_findings\": risk_findings,\n",
    "            \"compliance_issues\": compliance_issues,\n",
    "            \"summary\": summary,\n",
    "        }\n",
    "\n",
    "# ── Demo: Analyze mock contract ──\n",
    "contract_agent = ContractAnalysisAgent(llm)\n",
    "analysis = contract_agent.analyze_contract(\n",
    "    MOCK_CONTRACT,\n",
    "    context={\"reviewing_party\": \"client\", \"jurisdiction\": \"New York\",\n",
    "             \"regulations\": [\"GDPR\", \"CCPA\"]}\n",
    ")\n",
    "\n",
    "print(\"\\n--- Contract Analysis Results ---\")\n",
    "print(f\"Document: {analysis['summary']['document']}\")\n",
    "print(f\"Parties: {analysis['summary']['parties']}\")\n",
    "print(f\"Total clauses: {analysis['summary']['total_clauses']}\")\n",
    "\n",
    "print(\"\\nRisk Findings:\")\n",
    "for rf in analysis[\"risk_findings\"]:\n",
    "    print(f\"  [{rf['risk_level']}] Section {rf['clause_id']}: {rf['clause_title']}\")\n",
    "    print(f\"         Reason: {rf['reason']}\")\n",
    "\n",
    "print(\"\\nCompliance Issues:\")\n",
    "for ci in analysis[\"compliance_issues\"]:\n",
    "    print(f\"  * {ci}\")\n",
    "\n",
    "print(f\"\\nOverall Risk: {analysis['summary']['overall_risk']}\")\n",
    "print(f\"Recommendation: {analysis['summary']['recommendation']}\")\n",
    "\n",
    "logger.success(\"Contract Analysis pipeline complete\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49ae2159",
   "metadata": {},
   "source": [
    "## Cell 12: LegalBrief Case Study — Citation Verification\n",
    "\n",
    "**Ref:** Section 14.2.4 (p.416–419)\n",
    "\n",
    "LegalBrief is a litigation support assistant that implements a five-stage `StateGraph` pipeline:\n",
    "1. Issue Decomposition\n",
    "2. Multi-Dimensional Retrieval\n",
    "3. Doctrinal Analysis\n",
    "4. Structured Memo Drafting\n",
    "5. **Citation Verification** — The primary defense against hallucinated precedent\n",
    "\n",
    "The system flags the fabricated *Varghese v. China Southern Airlines* case, demonstrating that citation verification is non-negotiable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b9d8589b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 12: LegalBrief Case Study — Citation Verification\n",
    "# Ref: Section 14.2.4, p.416-419\n",
    "# Author: Imran Ahmad\n",
    "\n",
    "logger.info(\"=\" * 60)\n",
    "logger.info(\"CASE STUDY: LegalBrief — Legal Research Assistant\")\n",
    "logger.info(\"=\" * 60)\n",
    "\n",
    "class LegalResearchState(TypedDict):\n",
    "    query: str\n",
    "    jurisdiction: str\n",
    "    issues: list\n",
    "    precedents: list\n",
    "    analysis: dict\n",
    "    draft: str\n",
    "    citations_verified: bool\n",
    "    quality_score: float\n",
    "\n",
    "def decompose_issues(state: LegalResearchState):\n",
    "    \"\"\"Stage 1: Break research question into discrete legal issues.\n",
    "    Ref: p.417\"\"\"\n",
    "    matter = {\"description\": state[\"query\"], \"jurisdiction\": state[\"jurisdiction\"]}\n",
    "    finder = PrecedentFinder(legal_kb, llm)\n",
    "    issues = finder._extract_issues(matter)\n",
    "    logger.info(f\"[LegalBrief] Stage 1: Decomposed into {len(issues)} issues\")\n",
    "    return {\"issues\": [{\"description\": i.description, \"category\": i.category,\n",
    "                        \"priority\": i.priority} for i in issues]}\n",
    "\n",
    "def retrieve_precedents(state: LegalResearchState):\n",
    "    \"\"\"Stage 2: Multi-dimensional retrieval.\n",
    "    Ref: p.417-418\"\"\"\n",
    "    all_precedents = []\n",
    "    for issue in state[\"issues\"]:\n",
    "        results = legal_kb.hybrid_search(\n",
    "            query=issue[\"description\"],\n",
    "            jurisdiction=state[\"jurisdiction\"],\n",
    "            min_authority=3,\n",
    "        )\n",
    "        all_precedents.extend(results)\n",
    "    # Deduplicate\n",
    "    seen = set()\n",
    "    unique = []\n",
    "    for p in all_precedents:\n",
    "        if p.id not in seen:\n",
    "            seen.add(p.id)\n",
    "            unique.append(p)\n",
    "    logger.info(f\"[LegalBrief] Stage 2: Retrieved {len(unique)} unique precedents\")\n",
    "    return {\"precedents\": [{\"citation\": p.id, \"case_name\": p.metadata[\"case_name\"],\n",
    "                            \"score\": round(p.final_score, 3),\n",
    "                            \"authority\": p.metadata[\"authority_level\"],\n",
    "                            \"status\": p.metadata[\"status\"]}\n",
    "                           for p in unique]}\n",
    "\n",
    "def analyze_authorities(state: LegalResearchState):\n",
    "    \"\"\"Stage 3: Analyze authority strength.\n",
    "    Ref: p.417-418\"\"\"\n",
    "    analysis = {\n",
    "        \"total_authorities\": len(state[\"precedents\"]),\n",
    "        \"binding\": [p for p in state[\"precedents\"] if p[\"authority\"] >= 7],\n",
    "        \"persuasive\": [p for p in state[\"precedents\"] if 3 <= p[\"authority\"] < 7],\n",
    "        \"strongest\": state[\"precedents\"][0] if state[\"precedents\"] else None,\n",
    "    }\n",
    "    logger.info(f\"[LegalBrief] Stage 3: {len(analysis['binding'])} binding, \"\n",
    "                f\"{len(analysis['persuasive'])} persuasive authorities\")\n",
    "    return {\"analysis\": analysis}\n",
    "\n",
    "def generate_draft(state: LegalResearchState):\n",
    "    \"\"\"Stage 4: Generate research memo draft.\n",
    "    Ref: p.417-418\"\"\"\n",
    "    citations = [p[\"citation\"] for p in state[\"precedents\"]]\n",
    "    # Deliberately include the fabricated Varghese citation to test verification\n",
    "    citations.append(\"No. 22-cv-1234 (S.D.N.Y. 2023)\")  # Varghese — fabricated\n",
    "\n",
    "    draft_sections = [\n",
    "        f\"RESEARCH MEMORANDUM\",\n",
    "        f\"Query: {state['query']}\",\n",
    "        f\"Jurisdiction: {state['jurisdiction']}\",\n",
    "        f\"\",\n",
    "        f\"ISSUES IDENTIFIED: {len(state['issues'])}\",\n",
    "    ]\n",
    "    for i, issue in enumerate(state[\"issues\"], 1):\n",
    "        draft_sections.append(f\"  {i}. {issue['description']} ({issue['category']})\")\n",
    "\n",
    "    draft_sections.append(f\"\\nAUTHORITIES CITED: {len(citations)}\")\n",
    "    for c in citations:\n",
    "        draft_sections.append(f\"  - {c}\")\n",
    "\n",
    "    draft = \"\\n\".join(draft_sections)\n",
    "    logger.info(f\"[LegalBrief] Stage 4: Draft generated with {len(citations)} citations\")\n",
    "    return {\"draft\": draft}\n",
    "\n",
    "def verify_citations(state: LegalResearchState):\n",
    "    \"\"\"Stage 5: Cross-reference every citation against the knowledge base.\n",
    "    Ref: p.417-418 — the primary defense against hallucinated precedent.\"\"\"\n",
    "    import re\n",
    "    # Extract citations from draft\n",
    "    lines = state[\"draft\"].split(\"\\n\")\n",
    "    citations = [line.strip(\"  - \").strip() for line in lines\n",
    "                 if line.strip().startswith(\"- \")]\n",
    "\n",
    "    verified = 0\n",
    "    flagged = []\n",
    "    draft = state[\"draft\"]\n",
    "\n",
    "    for citation in citations:\n",
    "        exists = legal_kb.verify_citation(\n",
    "            citation,\n",
    "            jurisdiction=state[\"jurisdiction\"],\n",
    "            check_precedential=True,\n",
    "            check_good_law=True,\n",
    "        )\n",
    "        if exists:\n",
    "            verified += 1\n",
    "            logger.success(f\"[Citation] VERIFIED: {citation}\")\n",
    "        else:\n",
    "            flagged.append(citation)\n",
    "            logger.error(f\"[Citation] UNVERIFIED: {citation}\")\n",
    "            draft = draft.replace(citation, f\"[UNVERIFIED] {citation}\")\n",
    "\n",
    "    total = max(len(citations), 1)\n",
    "    score = verified / total\n",
    "\n",
    "    logger.info(f\"[LegalBrief] Stage 5: {verified}/{len(citations)} citations verified \"\n",
    "                f\"(quality: {score:.0%})\")\n",
    "\n",
    "    if flagged:\n",
    "        logger.warning(f\"[LegalBrief] {len(flagged)} citation(s) could not be verified:\")\n",
    "        for f_cite in flagged:\n",
    "            logger.warning(f\"  - {f_cite}\")\n",
    "\n",
    "    return {\n",
    "        \"draft\": draft,\n",
    "        \"citations_verified\": verified == len(citations),\n",
    "        \"quality_score\": score,\n",
    "    }\n",
    "\n",
    "# ── Build the LegalBrief StateGraph ──\n",
    "# Ref: p.418\n",
    "legal_workflow = StateGraph(LegalResearchState)\n",
    "legal_workflow.add_node(\"decompose\", decompose_issues)\n",
    "legal_workflow.add_node(\"retrieve\", retrieve_precedents)\n",
    "legal_workflow.add_node(\"analyze\", analyze_authorities)\n",
    "legal_workflow.add_node(\"draft_memo\", generate_draft)\n",
    "legal_workflow.add_node(\"verify\", verify_citations)\n",
    "\n",
    "legal_workflow.add_edge(START, \"decompose\")\n",
    "legal_workflow.add_edge(\"decompose\", \"retrieve\")\n",
    "legal_workflow.add_edge(\"retrieve\", \"analyze\")\n",
    "legal_workflow.add_edge(\"analyze\", \"draft_memo\")\n",
    "legal_workflow.add_edge(\"draft_memo\", \"verify\")\n",
    "legal_workflow.add_edge(\"verify\", END)\n",
    "\n",
    "legal_research_graph = legal_workflow.compile()\n",
    "logger.success(\"LegalBrief StateGraph compiled\")\n",
    "\n",
    "# ── Execute research query ──\n",
    "print(\"\\n--- Executing Legal Research Pipeline ---\")\n",
    "research_query = (\n",
    "    \"What is the current standard for establishing personal jurisdiction \"\n",
    "    \"over foreign corporations in e-commerce disputes under the Ninth Circuit?\"\n",
    ")\n",
    "\n",
    "result = legal_research_graph.invoke({\n",
    "    \"query\": research_query,\n",
    "    \"jurisdiction\": \"federal\",\n",
    "    \"issues\": [],\n",
    "    \"precedents\": [],\n",
    "    \"analysis\": {},\n",
    "    \"draft\": \"\",\n",
    "    \"citations_verified\": False,\n",
    "    \"quality_score\": 0.0,\n",
    "})\n",
    "\n",
    "print(\"\\n\" + \"=\" * 60)\n",
    "print(\"FINAL DRAFT:\")\n",
    "print(\"=\" * 60)\n",
    "print(result[\"draft\"])\n",
    "print(f\"\\nCitations Verified: {result['citations_verified']}\")\n",
    "print(f\"Quality Score: {result['quality_score']:.0%}\")\n",
    "\n",
    "logger.success(\"LegalBrief Case Study complete\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f1b1c33b",
   "metadata": {},
   "source": [
    "## Cell 13: Summary and Extensions\n",
    "\n",
    "**Ref:** Summary (p.419–420)\n",
    "\n",
    "This chapter demonstrated how agent engineering patterns converge in two of the most demanding application domains: financial advisory and legal intelligence. Both impose hard constraints absent in more permissive contexts.\n",
    "\n",
    "### Key Architectural Patterns\n",
    "\n",
    "**Financial Advisory Agent:**\n",
    "- Supervisor pattern with state-graph routing (Figure 14.1)\n",
    "- Composite risk scoring: 0.4 × volatility + 0.35 × drawdown + 0.25 × VaR\n",
    "- Client tolerance adjustment (conservative → aggressive spectrum)\n",
    "- Compliance-by-architecture: structurally impossible for non-compliant recommendations to reach the client\n",
    "\n",
    "**Legal Intelligence Agent:**\n",
    "- Hybrid retrieval combining semantic similarity with authority-weighted ranking\n",
    "- Authority ranking: 0.5 × similarity + 0.3 × authority + 0.2 × recency\n",
    "- Three-stage precedent pipeline: Issue Extraction → Multi-Dimensional Retrieval → Synthesis\n",
    "- Citation verification gate as defense against hallucinated precedent\n",
    "\n",
    "### Extension Ideas\n",
    "\n",
    "1. **Add more stock symbols** — Extend `MOCK_STOCK_DATA` and `MOCK_FINNHUB_QUOTES` with additional tickers\n",
    "2. **Connect a real vector database** — Replace `MockVectorStore` with Pinecone, Weaviate, or ChromaDB\n",
    "3. **Add more legal cases** — Expand `MOCK_LEGAL_CASES` with cases from different jurisdictions\n",
    "4. **Implement portfolio rebalancing** — Add a rebalancing agent that triggers when risk drift is detected\n",
    "5. **Add multi-jurisdiction support** — Extend the Legal Knowledge Base to handle cross-jurisdictional queries\n",
    "6. **Connect real financial APIs** — Add your Finnhub and Tavily API keys to `.env` for live data\n",
    "\n",
    "> *\"Build agents that are not only capable but accountable. Every recommendation must be traceable. Every decision must be auditable. Every interaction must be logged.\"* — Chapter 14, p.419\n",
    "\n",
    "---\n",
    "\n",
    "*Book: 30 Agents Every AI Engineer Must Build — Imran Ahmad (Packt Publishing, 2026)*\n",
    "*Chapter: 14 — Financial and Legal Domain Agents*"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}