{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "533b0fe6",
   "metadata": {},
   "source": [
    "# Chapter 12 — Explainable Agent\n",
    "\n",
    "**Book:** *30 Agents Every AI Engineer Must Build*\n",
    "**Author:** Imran Ahmad | **Publisher:** Packt Publishing, 2026\n",
    "**Notebook:** 02 of 02 — The Explainable Agent (pp. 346–360)\n",
    "\n",
    "---\n",
    "\n",
    "> *\"An agent that makes correct decisions is useful — but an agent that can explain its decisions is trusted.\"*\n",
    "\n",
    "## Chapter Context\n",
    "\n",
    "Trust is the critical enabler of adoption in every domain where agents operate alongside human decision-makers. A physician will not act on a diagnostic recommendation unless they understand the reasoning behind it. A loan officer will not approve a recommendation without a justification that satisfies regulatory requirements.\n",
    "\n",
    "Explainability is not a feature to be added at the end of development. It is an **architectural property** that must be designed into the agent from the beginning. This notebook builds the **Explainable Agent** — an architecture that makes internal reasoning visible through structured explanation frameworks and calibrated confidence communication.\n",
    "\n",
    "### What This Notebook Covers\n",
    "\n",
    "1. **Reasoning Transparency** — Decision logging and trace recording (pp. 347–348)\n",
    "2. **LIME & SHAP Frameworks** — Feature attribution for individual and global explanations (p. 349)\n",
    "3. **Counterfactual Analysis** — Minimal change explanations for recourse generation (p. 349)\n",
    "4. **Confidence Communication** — Epistemic vs. aleatoric uncertainty, temperature scaling calibration (pp. 350–352)\n",
    "5. **DiagnosticAssistant Case Study** — Multi-agent medical diagnosis with edge privacy and explanations (pp. 352–356)\n",
    "6. **Audience Adaptation** — Clinician vs. patient explanation templates (p. 356)\n",
    "7. **Production Failure Modes** — Sensor dropout, model failure, explanation failure (p. 356)\n",
    "8. **Governance & Regulatory Landscape** — Tables 12.2 and 12.3, five global frameworks (pp. 357–358)\n",
    "\n",
    "### Key Architectural Insight\n",
    "\n",
    "The Explainable Agent extends the cognitive loop with a **dedicated explanation generation layer** that runs in parallel with the decision pipeline. Explanations are faithful representations of actual system behavior — not post-hoc rationalizations.\n",
    "\n",
    "**Figures:** 12.2 (Medical Diagnosis Assistant Pipeline, p. 353)\n",
    "**Tables:** 12.2 (Global AI Regulatory Frameworks, p. 357), 12.3 (Technique Selection Reference, pp. 358–359)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3edace1b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 2 — Setup: Imports, sys.path, and mode detection\n",
    "# Ref: Technical Requirements (p.330)\n",
    "\n",
    "import sys\n",
    "import os\n",
    "\n",
    "# Ensure project root is on the path\n",
    "project_root = os.path.abspath(os.path.join(os.getcwd(), \"..\"))\n",
    "if project_root not in sys.path:\n",
    "    sys.path.insert(0, project_root)\n",
    "\n",
    "# Core utilities\n",
    "from chapter12.utils import ColorLogger, graceful_fallback, resolve_api_key, get_mode, is_simulation\n",
    "from chapter12.mock_llm import MockLLM, strip_meta\n",
    "from chapter12.synthetic_data import generate_medical_dataset, summarize_medical_dataset\n",
    "\n",
    "# Explainability core\n",
    "from chapter12.explainability_core import (\n",
    "    ExplainableAgent,\n",
    "    DecisionLogger,\n",
    "    ExplanationGenerator,\n",
    "    ConfidenceAwareAgent,\n",
    "    TemperatureScaler,\n",
    "    DiagnosticAssistant,\n",
    "    ClinicalExplainer,\n",
    "    BiometricAnalyzer,\n",
    "    SymptomInterpreter,\n",
    "    DiagnosticCoordinator,\n",
    "    ClinicalMemorySystem,\n",
    "    compute_shap_explanation,\n",
    "    compute_lime_explanation,\n",
    "    generate_counterfactual,\n",
    ")\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "# Visualization\n",
    "import matplotlib\n",
    "matplotlib.use(\"Agg\")\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "# ML tools\n",
    "from sklearn.ensemble import GradientBoostingClassifier\n",
    "\n",
    "# Initialize mode\n",
    "logger = ColorLogger(\"Notebook02\")\n",
    "api_key = resolve_api_key()\n",
    "mode = get_mode()\n",
    "logger.info(f\"Operating mode: {mode.upper()}\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "86e6bca3",
   "metadata": {},
   "source": [
    "## 1. Reasoning Transparency Techniques (p.347–348)\n",
    "\n",
    "Transparency begins with recording the agent's reasoning process in a structured, auditable format. The `ExplainableAgent` constructs an explanation **directly from the actual reasoning trace** recorded during execution — not through post-hoc rationalization.\n",
    "\n",
    "Four-step decision process:\n",
    "1. **Analyze inputs** — feature inventory, data quality\n",
    "2. **Apply domain rules** — qualification thresholds, experience weights\n",
    "3. **Assess risks and confidence** — risk level, uncertainty type\n",
    "4. **Synthesize decision** — final recommendation with rationale\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9485dd28",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 4 — ExplainableAgent four-step decision with trace\n",
    "# Ref: ExplainableAgent (p.347–348)\n",
    "\n",
    "agent = ExplainableAgent()\n",
    "\n",
    "# Sample input\n",
    "sample_input = {\n",
    "    \"wbc_count\": 12.5,\n",
    "    \"chest_imaging\": \"right_lower_consolidation\",\n",
    "    \"temperature\": 38.5,\n",
    "    \"spo2_min\": 91.5,\n",
    "    \"reported_symptoms\": \"productive cough, fever, shortness of breath\",\n",
    "}\n",
    "\n",
    "decision, explanation = agent.make_decision(sample_input, audience=\"engineer\")\n",
    "\n",
    "logger.info(\"--- Decision Result ---\")\n",
    "logger.info(f\"  Decision: {decision['decision']}\")\n",
    "logger.info(f\"  Confidence: {decision['confidence']}\")\n",
    "logger.info(f\"  Rationale: {decision['rationale']}\")\n",
    "logger.info(f\"  Explanation: {explanation['explanation']}\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9ae0bbc",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 5 — Inspect the immutable reasoning trace\n",
    "# Ref: DecisionLogger.get_trace() (p.348)\n",
    "\n",
    "trace = agent.get_reasoning_trace()\n",
    "\n",
    "logger.info(f\"Reasoning trace: {len(trace)} steps recorded\")\n",
    "for step in trace:\n",
    "    logger.info(f\"  Step {step['entry_index']}: {step['stage']}\")\n",
    "    logger.debug(f\"    Data: {str(step['data'])[:100]}\")\n",
    "    logger.debug(f\"    Timestamp: {step['timestamp']}\")\n",
    "\n",
    "logger.success(\"Trace is immutable — suitable for EU AI Act audit requirements (p.348).\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3cd9c278",
   "metadata": {},
   "source": [
    "## 2. Decision Explanation Frameworks: LIME & SHAP (p. 349)\n",
    "\n",
    "Two frameworks dominate model-agnostic explanations:\n",
    "\n",
    "- **LIME** — constructs a local interpretable model approximating the original model's behavior near the instance being explained\n",
    "- **SHAP** — unified feature attribution based on Shapley values; the **unique** method satisfying efficiency, symmetry, dummy, and additivity (Shapley Uniqueness Theorem)\n",
    "\n",
    "> 📌 **Production Guidance (p. 349):**\n",
    ">\n",
    "> - **TreeSHAP** is exact and polynomial-time for tree-based models (~10ms per prediction)\n",
    "> - **KernelSHAP** is model-agnostic but approximate\n",
    "> - For **latency-sensitive deployments**, use asynchronous explanation generation: deliver the decision synchronously, the SHAP explanation asynchronously\n",
    "> - If regulation demands **feature-level attribution** (EU AI Act), prefer SHAP. If regulation requires **contrastive explanation**, prefer counterfactual methods.\n",
    "> - **Full explanations are reserved** for decisions affecting access to resources, opportunities, or care."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "231aca7e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 7 — Train a classifier and compute SHAP explanations\n",
    "# Ref: SHAP (p.349)\n",
    "\n",
    "med_data = generate_medical_dataset(n=50, seed=42)\n",
    "\n",
    "feature_names = [\"heart_rate_avg\", \"spo2_min\", \"wbc_count\", \"temperature\"]\n",
    "X = np.array([[r[f] for f in feature_names] for r in med_data])\n",
    "y = np.array([1 if r[\"true_diagnosis\"] == \"pneumonia\" else 0 for r in med_data])\n",
    "\n",
    "model = GradientBoostingClassifier(n_estimators=50, random_state=42)\n",
    "model.fit(X, y)\n",
    "logger.success(f\"Model trained: accuracy = {model.score(X, y):.2f} on training set\")\n",
    "\n",
    "# SHAP explanation for first instance\n",
    "shap_result = compute_shap_explanation(model, X, feature_names, instance_index=0)\n",
    "\n",
    "logger.info(\"--- SHAP Feature Attributions ---\")\n",
    "for feat, val in shap_result[\"shap_values\"].items():\n",
    "    direction = \"↑ positive\" if val > 0 else \"↓ negative\"\n",
    "    logger.info(f\"  {feat}: {val:+.4f} ({direction})\")\n",
    "\n",
    "logger.info(f\"  Base value: {shap_result.get('base_value', 'N/A')}\")\n",
    "logger.info(f\"  Predicted value: {shap_result.get('predicted_value', 'N/A')}\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "22cc0adf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 8 — LIME explanation and visual comparison with SHAP\n",
    "# Ref: LIME and SHAP (p.349)\n",
    "\n",
    "lime_result = compute_lime_explanation(model, X, feature_names, instance_index=0, num_features=4)\n",
    "\n",
    "logger.info(\"--- LIME Feature Weights ---\")\n",
    "for feat, weight in lime_result[\"lime_weights\"].items():\n",
    "    logger.info(f\"  {feat}: {weight:+.4f}\")\n",
    "\n",
    "# Side-by-side visualization\n",
    "fig, axes = plt.subplots(1, 2, figsize=(14, 5))\n",
    "fig.suptitle(\"SHAP vs. LIME Explanations for Patient P-0000 (p.349)\", fontsize=13, fontweight=\"bold\")\n",
    "\n",
    "# SHAP plot\n",
    "shap_feats = list(shap_result[\"shap_values\"].keys())\n",
    "shap_vals = list(shap_result[\"shap_values\"].values())\n",
    "colors_shap = [\"#E91E63\" if v > 0 else \"#2196F3\" for v in shap_vals]\n",
    "axes[0].barh(shap_feats, shap_vals, color=colors_shap, alpha=0.8)\n",
    "axes[0].axvline(x=0, color=\"gray\", linewidth=0.5)\n",
    "axes[0].set_title(\"SHAP Feature Attributions\")\n",
    "axes[0].set_xlabel(\"SHAP Value (impact on prediction)\")\n",
    "\n",
    "# LIME plot\n",
    "lime_feats = list(lime_result[\"lime_weights\"].keys())\n",
    "lime_vals = list(lime_result[\"lime_weights\"].values())\n",
    "colors_lime = [\"#E91E63\" if v > 0 else \"#2196F3\" for v in lime_vals]\n",
    "axes[1].barh(lime_feats, lime_vals, color=colors_lime, alpha=0.8)\n",
    "axes[1].axvline(x=0, color=\"gray\", linewidth=0.5)\n",
    "axes[1].set_title(\"LIME Feature Weights\")\n",
    "axes[1].set_xlabel(\"LIME Weight (local linear approximation)\")\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.savefig(\"shap_vs_lime.png\", dpi=100, bbox_inches=\"tight\")\n",
    "plt.show()\n",
    "logger.success(\"SHAP vs. LIME comparison rendered.\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2a7045c1",
   "metadata": {},
   "source": [
    "## 3. Counterfactual Analysis (p.349)\n",
    "\n",
    "A counterfactual explanation answers: **\"What would need to change for the decision to be different?\"**\n",
    "\n",
    "The Minimal Counterfactual Theorem guarantees that the optimization produces the smallest change that flips the decision, providing users with actionable, concrete guidance.\n",
    "\n",
    "> *\"If your annual income were $5,000 higher, or if your debt-to-income ratio were below 0.35, the application would have been approved.\"* (p.349)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "177dd574",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 10 — Counterfactual analysis on a negative-class patient\n",
    "# Ref: Counterfactual Analysis (p.349)\n",
    "\n",
    "# Find a patient predicted as NOT pneumonia\n",
    "preds = model.predict(X)\n",
    "neg_indices = [i for i, p in enumerate(preds) if p == 0]\n",
    "\n",
    "if neg_indices:\n",
    "    idx = neg_indices[0]\n",
    "    logger.info(f\"Patient P-{idx:04d} predicted as NOT pneumonia. Finding counterfactual...\")\n",
    "\n",
    "    cf = generate_counterfactual(model, X[idx], feature_names, desired_class=1)\n",
    "\n",
    "    logger.info(f\"  Success: {cf['success']}\")\n",
    "    logger.info(f\"  Iterations: {cf['iterations']}\")\n",
    "    logger.info(\"  Original values:\")\n",
    "    for feat, val in cf[\"original\"].items():\n",
    "        logger.info(f\"    {feat}: {val}\")\n",
    "    logger.info(\"  Required changes:\")\n",
    "    for feat, delta in cf[\"changes\"].items():\n",
    "        direction = \"increase\" if delta > 0 else \"decrease\"\n",
    "        logger.info(f\"    {feat}: {direction} by {abs(delta):.2f}\")\n",
    "else:\n",
    "    logger.info(\"All patients predicted as pneumonia. Adjusting to find negative case...\")\n",
    "    # Use a patient with low WBC and clear imaging\n",
    "    test_instance = np.array([72.0, 98.0, 5.0, 36.5])\n",
    "    cf = generate_counterfactual(model, test_instance, feature_names, desired_class=1)\n",
    "    logger.info(f\"  Success: {cf['success']}, Changes: {cf['changes']}\")\n",
    "\n",
    "# Visualize counterfactual changes\n",
    "if cf[\"changes\"]:\n",
    "    fig, ax = plt.subplots(figsize=(8, 4))\n",
    "    feats = list(cf[\"changes\"].keys())\n",
    "    deltas = list(cf[\"changes\"].values())\n",
    "    colors = [\"#4CAF50\" if d > 0 else \"#F44336\" for d in deltas]\n",
    "    ax.barh(feats, deltas, color=colors, alpha=0.8)\n",
    "    ax.axvline(x=0, color=\"gray\", linewidth=0.5)\n",
    "    ax.set_title(\"Counterfactual: Minimum Changes to Flip Prediction (p.349)\", fontweight=\"bold\")\n",
    "    ax.set_xlabel(\"Change Required\")\n",
    "    plt.tight_layout()\n",
    "    plt.savefig(\"counterfactual_changes.png\", dpi=100, bbox_inches=\"tight\")\n",
    "    plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ec7761b3",
   "metadata": {},
   "source": [
    "## 4. Confidence Communication Methods (pp. 350–352)\n",
    "\n",
    "Explanations are incomplete without an honest assessment of confidence. The `ConfidenceAwareAgent` addresses the *illusion of confidence* by:\n",
    "\n",
    "1. Generating **multiple candidate hypotheses**\n",
    "2. Scoring each with **calibrated confidence** (TemperatureScaler)\n",
    "3. Communicating uncertainty with **qualifiers** (p. 350):\n",
    "   - \\>0.9 → \"High confidence\"\n",
    "   - \\>0.7 → \"Moderate confidence\"\n",
    "   - ≤0.7 → \"Low confidence — human review recommended\"\n",
    "\n",
    "---\n",
    "\n",
    "> 📌 **Info Box — Epistemic vs. Aleatoric Uncertainty (p. 350)**\n",
    ">\n",
    "> - **Epistemic uncertainty** arises from the agent's lack of knowledge — can be reduced with more data. *Action: defer to a specialist.*\n",
    "> - **Aleatoric uncertainty** arises from inherent randomness — cannot be reduced. *Action: recommend monitoring and repeated measurement.*\n",
    ">\n",
    "> **Ensemble methods** approximate epistemic uncertainty through prediction variance: high variance = model uncertainty; low variance with moderate confidence = inherent outcome variability.\n",
    "\n",
    "---\n",
    "\n",
    "> 📌 **Info Box — Calibration (p. 350)**\n",
    ">\n",
 **Calibration** means">
    "> **Calibration** means that when the agent says it is \"80% confident,\" roughly 80% of those predictions are correct. Techniques such as temperature scaling and Platt scaling, combined with continuous monitoring, help achieve this. Overconfident predictions invite over-reliance; underconfident ones cause unnecessary deferrals."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5f9a5ad3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 12 — ConfidenceAwareAgent multi-hypothesis ranking\n",
    "# Ref: ConfidenceAwareAgent (p.351)\n",
    "\n",
    "conf_agent = ConfidenceAwareAgent(n_hypotheses=5)\n",
    "\n",
    "hypotheses = conf_agent.reason_with_confidence(\n",
    "    \"Patient presents with productive cough, fever, and right lower lobe consolidation\",\n",
    "    context={\"domain\": \"pulmonology\"}\n",
    ")\n",
    "\n",
    "logger.info(f\"Generated {len(hypotheses)} hypotheses:\")\n",
    "for i, h in enumerate(hypotheses):\n",
    "    logger.info(f\"  H{i+1}: confidence={h['confidence']:.4f} — {h['answer'][:60]}\")\n",
    "\n",
    "# Communicate uncertainty\n",
    "comm = conf_agent.communicate_uncertainty(hypotheses)\n",
    "logger.info(f\"Recommendation: {comm['recommendation'][:60]}\")\n",
    "logger.info(f\"Confidence level: {comm['confidence_level']}\")\n",
    "logger.info(f\"Score: {comm['confidence_score']}\")\n",
    "logger.info(f\"Alternatives: {len(comm['alternative_hypotheses'])}\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c7024b6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 13 — Temperature scaling calibration demo\n",
    "# Ref: Confidence calibration (p.349)\n",
    "\n",
    "scaler = TemperatureScaler(temperature=1.0)\n",
    "\n",
    "raw_scores = np.linspace(0.1, 0.99, 50)\n",
    "calibrated_t1 = [scaler.calibrate(s) for s in raw_scores]\n",
    "\n",
    "scaler_hot = TemperatureScaler(temperature=2.0)\n",
    "calibrated_t2 = [scaler_hot.calibrate(s) for s in raw_scores]\n",
    "\n",
    "scaler_cold = TemperatureScaler(temperature=0.5)\n",
    "calibrated_t05 = [scaler_cold.calibrate(s) for s in raw_scores]\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(8, 6))\n",
    "ax.plot(raw_scores, raw_scores, \"k--\", label=\"Perfect calibration\", linewidth=1)\n",
    "ax.plot(raw_scores, calibrated_t1, label=\"T=1.0 (identity)\", linewidth=2, color=\"#2196F3\")\n",
    "ax.plot(raw_scores, calibrated_t2, label=\"T=2.0 (smoothed)\", linewidth=2, color=\"#FF9800\")\n",
    "ax.plot(raw_scores, calibrated_t05, label=\"T=0.5 (sharpened)\", linewidth=2, color=\"#E91E63\")\n",
    "ax.set_xlabel(\"Raw Confidence Score\", fontsize=11)\n",
    "ax.set_ylabel(\"Calibrated Confidence Score\", fontsize=11)\n",
    "ax.set_title(\"Temperature Scaling Calibration (p.349)\", fontsize=13, fontweight=\"bold\")\n",
    "ax.legend(fontsize=10)\n",
    "ax.set_xlim(0, 1)\n",
    "ax.set_ylim(0, 1)\n",
    "ax.grid(True, alpha=0.3)\n",
    "\n",
    "# Annotate qualifier zones\n",
    "ax.axhspan(0.9, 1.0, alpha=0.1, color=\"green\", label=\"_\")\n",
    "ax.axhspan(0.7, 0.9, alpha=0.1, color=\"orange\", label=\"_\")\n",
    "ax.axhspan(0.0, 0.7, alpha=0.1, color=\"red\", label=\"_\")\n",
    "ax.text(0.05, 0.95, \"High\", fontsize=9, color=\"green\", fontweight=\"bold\")\n",
    "ax.text(0.05, 0.80, \"Moderate\", fontsize=9, color=\"orange\", fontweight=\"bold\")\n",
    "ax.text(0.05, 0.55, \"Low — human review\", fontsize=9, color=\"red\", fontweight=\"bold\")\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.savefig(\"calibration_curves.png\", dpi=100, bbox_inches=\"tight\")\n",
    "plt.show()\n",
    "logger.success(\"Calibration curves rendered with qualifier zones.\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "89fc5946",
   "metadata": {},
   "source": [
    "## 5. Case Study: Medical Diagnosis Assistant with Explanation (pp. 352–356)\n",
    "\n",
    "The `DiagnosticAssistant` operates as a layered multi-agent system (pp. 352–353):\n",
    "\n",
    "- **BiometricAnalyzer** — processes aggregated wearable features (edge-processed, not raw)\n",
    "- **SymptomInterpreter** — maps patient-reported symptoms to SNOMED CT concepts\n",
    "- **DiagnosticCoordinator** — integrates evidence to rank differential diagnoses\n",
    "- **ClinicalExplainer** — generates audience-adapted SHAP-based explanations\n",
    "- **ConfidenceEngine** — calibrates confidence with epistemic/aleatoric distinction\n",
    "\n",
    "**Figure 12.2** (p. 353) shows the complete architecture:\n",
    "\n",
    "```\n",
    " ┌─ EDGE BOUNDARY (Local Privacy) ─────────────────────────────┐\n",
    " │                                                              │\n",
    " │ ┌──────────────┐                    ┌──────────────────────┐ │\n",
    " │ │  Biometric   │  Aggregated        │  Confidence Engine   │ │\n",
    " │ │  Agent       │──Vitals──┐         │  (Epistemic vs       │ │\n",
    " │ │ (Raw stays   │          │         │   Aleatoric)         │ │\n",
    " │ │  local)      │          ▼         └──────────────────────┘ │\n",
    " │ └──────────────┘  ┌──────────────┐  ┌──────────────────────┐ │\n",
    " │                   │  Diagnostic  │  │  Clinical Explainer  │ │\n",
    " │ ┌──────────────┐  │  Coordinator │  │  (SHAP-Based         │ │\n",
    " │ │  Symptom     │──│  (Evidence   │  │   Clinical Reasoning)│ │\n",
    " │ │  Agent       │  │   Synthesis) │  └──────────────────────┘ │\n",
    " │ │ (NLP →       │  └──────┬───────┘                           │\n",
    " │ │  SNOMED CT)  │         │          ┌──────────────────────┐ │\n",
    " │ └──────────────┘         └─────────→│  DIAGNOSTIC REPORT   │ │\n",
    " │                                     │  Differentials |      │ │\n",
    " │ ┌──────────────┐                    │  Explanations |       │ │\n",
    " │ │  Episodic    │                    │  Confidence |         │ │\n",
    " │ │  Memory      │                    │  Audit Trail          │ │\n",
    " │ └──────────────┘                    └──────────────────────┘ │\n",
    " └──────────────────────────────────────────────────────────────┘\n",
    "```\n",
    "\n",
    "> 📌 **Info Box — Edge Computing for Privacy (pp. 353–354)**\n",
    ">\n",
    "> The biometric agent processes raw patient data locally on the patient's device or a hospital edge server. Raw vitals never leave the local processing boundary — only aggregated features are transmitted. This satisfies HIPAA's minimum necessary standard and GDPR's data minimization principle by construction. The edge layer also provides a natural location for **differential privacy** mechanisms."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ba2e3d93",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 15 — Generate synthetic medical dataset\n",
    "# Ref: Medical Diagnosis Assistant (p.352–356), edge privacy (p.353–354)\n",
    "\n",
    "med_data = generate_medical_dataset(n=50, seed=42)\n",
    "med_summary = summarize_medical_dataset(med_data)\n",
    "\n",
    "logger.info(f\"Dataset: {med_summary['total_patients']} patients (de-identified)\")\n",
    "logger.info(f\"Diagnosis distribution: {med_summary['diagnosis_distribution']}\")\n",
    "logger.info(f\"Vitals: HR={med_summary['avg_heart_rate']}, SpO2={med_summary['avg_spo2']}, WBC={med_summary['avg_wbc']}\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "86eb27f6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 16 — DiagnosticAssistant: pneumonia-profile patient\n",
    "# Ref: DiagnosticAssistant.diagnose() (p.354–356)\n",
    "\n",
    "assistant = DiagnosticAssistant(n_hypotheses=5)\n",
    "\n",
    "# Patient matching the chapter's pneumonia example (p.356)\n",
    "patient_pneumonia = {\n",
    "    \"patient_id\": \"P-DEMO\",\n",
    "    \"heart_rate_avg\": 92.0,\n",
    "    \"spo2_min\": 91.5,\n",
    "    \"wbc_count\": 12.5,\n",
    "    \"temperature\": 38.5,\n",
    "    \"chest_imaging\": \"right_lower_consolidation\",\n",
    "    \"patient_history\": [\"COPD\", \"hypertension\"],\n",
    "}\n",
    "symptoms_pneumonia = [\"productive cough\", \"fever\", \"shortness of breath\"]\n",
    "\n",
    "report = assistant.diagnose(patient_pneumonia, symptoms_pneumonia, audience=\"clinician\")\n",
    "rd = report.to_dict()\n",
    "\n",
    "logger.info(\"--- Diagnostic Report (Clinician) ---\")\n",
    "logger.info(f\"Top diagnosis: {rd['differentials'][0]['answer'] if rd['differentials'] else 'N/A'}\")\n",
    "logger.info(f\"Confidence: {rd['confidence_summary']['confidence_level']}\")\n",
    "logger.info(f\"Score: {rd['confidence_summary']['confidence_score']}\")\n",
    "\n",
    "# Display differentials\n",
    "logger.info(\"Differentials:\")\n",
    "for diff in rd[\"differentials\"]:\n",
    "    logger.info(f\"  {diff['answer']}: {diff['confidence']:.4f} ({diff['qualifier']})\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "41d2f6bc",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 17 — Display the full clinical explanation narrative\n",
    "# Ref: ClinicalExplainer.generate() (p.356)\n",
    "\n",
    "explanation = rd[\"explanation\"]\n",
    "\n",
    "logger.info(\"--- Clinician-Facing Explanation ---\")\n",
    "logger.success(explanation[\"narrative\"])\n",
    "\n",
    "logger.info(\"Feature contributions (SHAP):\")\n",
    "for feat, val in explanation[\"feature_contributions\"].items():\n",
    "    bar = \"█\" * int(abs(val) * 30)\n",
    "    logger.info(f\"  {feat}: {val:.2f} {bar}\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "56a28f80",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 18 — Differential diagnosis ranking chart\n",
    "# Ref: DiagnosticAssistant case study (p.354–356)\n",
    "\n",
    "differentials = rd[\"differentials\"]\n",
    "if differentials:\n",
    "    diag_names = [d[\"answer\"] for d in differentials]\n",
    "    diag_confs = [d[\"confidence\"] for d in differentials]\n",
    "    colors = [\"#4CAF50\" if c > 0.7 else \"#FF9800\" if c > 0.3 else \"#F44336\"\n",
    "              for c in diag_confs]\n",
    "\n",
    "    fig, ax = plt.subplots(figsize=(10, 4))\n",
    "    bars = ax.barh(diag_names[::-1], diag_confs[::-1], color=colors[::-1], alpha=0.8)\n",
    "    ax.set_xlabel(\"Calibrated Confidence\", fontsize=11)\n",
    "    ax.set_title(\"Differential Diagnosis Ranking — Patient P-DEMO (p.355)\",\n",
    "                 fontsize=13, fontweight=\"bold\")\n",
    "    ax.set_xlim(0, 1.0)\n",
    "\n",
    "    for bar, conf in zip(bars, diag_confs[::-1]):\n",
    "        ax.text(bar.get_width() + 0.02, bar.get_y() + bar.get_height()/2,\n",
    "                f\"{conf:.2f}\", va=\"center\", fontweight=\"bold\", fontsize=11)\n",
    "\n",
    "    plt.tight_layout()\n",
    "    plt.savefig(\"differential_ranking.png\", dpi=100, bbox_inches=\"tight\")\n",
    "    plt.show()\n",
    "    logger.success(\"Differential diagnosis chart rendered.\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6c9a25c9",
   "metadata": {},
   "source": [
    "## 6. Audience-Adapted Explanations (p.356)\n",
    "\n",
    "The same finding is communicated differently to clinicians and patients. This adaptation is not cosmetic — it is a trust mechanism ensuring each stakeholder receives the appropriate level of detail.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b2da0e6d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 19 — Side-by-side clinician vs. patient explanation\n",
    "# Ref: Audience adaptation (p.356)\n",
    "\n",
    "# Clinician explanation (already generated)\n",
    "clinician_narrative = rd[\"explanation\"][\"narrative\"]\n",
    "\n",
    "# Generate patient-facing explanation\n",
    "report_patient = assistant.diagnose(patient_pneumonia, symptoms_pneumonia, audience=\"patient\")\n",
    "rpd = report_patient.to_dict()\n",
    "patient_narrative = rpd[\"explanation\"][\"narrative\"]\n",
    "\n",
    "logger.info(\"═══ CLINICIAN-FACING EXPLANATION ═══\")\n",
    "logger.success(clinician_narrative)\n",
    "\n",
    "logger.info(\"\")\n",
    "logger.info(\"═══ PATIENT-FACING EXPLANATION ═══\")\n",
    "logger.success(patient_narrative)\n",
    "\n",
    "logger.info(\"\")\n",
    "logger.info(\"Key differences:\")\n",
    "logger.info(\"  • Clinician: includes SHAP values, confidence scores, differential ranking\")\n",
    "logger.info(\"  • Patient: plain language, no numeric scores, emphasizes next steps with doctor\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4b7fdbfb",
   "metadata": {},
   "source": [
    "## 7. Production Failure Modes (p. 356)\n",
    "\n",
    "Three failure modes require specific architectural provisions:\n",
    "1. **Sensor dropout** — biometric agent operates on stale data with degrading confidence; after timeout (e.g., 30 min), alerts care team\n",
    "2. **Model serving failure** — coordinator falls back to rule-based triage (e.g., SpO2 < 90% triggers escalation); test via chaos engineering\n",
    "3. **Explanation generation failure** — delivers diagnosis with simplified feature-importance summary; audit trail records fallback\n",
    "\n",
    "> 📌 **Info Box — Pilot Deployment Results (p. 357)**\n",
    ">\n",
    "> Pilot deployments showed 2–3% daily data transmission failures. Graceful degradation prevented clinical impact. Results: **30% increase in early detection** of chronic condition exacerbations, **3% false alarm rate**, **40% improvement** in clinician response times. Clinicians reported structured explanations increased their confidence in recommendations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f9be4aeb",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cell 20 — Production failure mode demonstrations\n",
    "# Ref: Production failure modes (p.356)\n",
    "\n",
    "logger.info(\"═══ Failure Mode 1: Sensor Dropout ═══\")\n",
    "# Patient with missing biometrics (edge device disconnected)\n",
    "patient_stale = {\n",
    "    \"patient_id\": \"P-STALE\",\n",
    "    \"heart_rate_avg\": 75.0,  # Last known value\n",
    "    \"spo2_min\": 96.0,        # Last known value\n",
    "    \"wbc_count\": 7.5,\n",
    "    \"temperature\": 37.0,\n",
    "    \"chest_imaging\": \"clear\",\n",
    "    \"patient_history\": [],\n",
    "}\n",
    "report_stale = assistant.diagnose(patient_stale, [\"fatigue\", \"headache\"], audience=\"clinician\")\n",
    "rd_stale = report_stale.to_dict()\n",
    "logger.info(f\"  Diagnosis with stale data: {rd_stale['differentials'][0]['answer'] if rd_stale['differentials'] else 'N/A'}\")\n",
    "logger.info(f\"  Confidence: {rd_stale['confidence_summary']['confidence_level']}\")\n",
    "logger.info(\"  → In production, confidence degrades progressively after 30-min timeout (p.356)\")\n",
    "\n",
    "logger.info(\"\")\n",
    "logger.info(\"═══ Failure Mode 2: Model Serving Failure ═══\")\n",
    "# Simulated by showing the rule-based triage fallback\n",
    "logger.info(\"  When the diagnostic model endpoint becomes unavailable:\")\n",
    "logger.info(\"  → Coordinator falls back to rule-based triage:\")\n",
    "logger.info(\"    • SpO2 < 90% → immediate escalation\")\n",
    "logger.info(\"    • Temperature > 39°C → urgent review\")\n",
    "logger.info(\"    • WBC > 15 → flag for infection workup\")\n",
    "logger.info(\"  → Fallback tested via chaos engineering exercises (p.356)\")\n",
    "\n",
    "logger.info(\"\")\n",
    "logger.info(\"═══ Failure Mode 3: Explanation Generation Failure ═══\")\n",
    "logger.info(\"  When SHAP computation times out:\")\n",
    "logger.info(\"  → Diagnosis still delivered with explanation_unavailable flag\")\n",
    "logger.info(\"  → Simplified feature-importance from model attention weights\")\n",
    "logger.info(\"  → Audit trail records both the failure and the fallback used\")\n",
    "\n",
    "logger.success(\"Failure mode demonstrations complete.\")\n"
   ]
  },
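  {
   "cell_type": "markdown",
   "id": "c20a7e51",
   "metadata": {},
   "source": [
    "The rule-based triage fallback in Failure Mode 2 is only described in log messages above. A minimal sketch of what such a fallback could look like follows. The three thresholds are taken from those log messages; the function name `rule_based_triage`, the list-of-actions return shape, and the choice to skip missing fields are assumptions made for illustration, not the book's implementation. The sample patients are made-up values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c20a7e52",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical sketch of a rule-based triage fallback (not the book's code).\n",
    "# Thresholds mirror the Failure Mode 2 log messages; everything else is assumed.\n",
    "from typing import Any, Callable\n",
    "\n",
    "# Each rule: (patient field, trigger predicate, triage action)\n",
    "TRIAGE_RULES: list[tuple[str, Callable[[float], bool], str]] = [\n",
    "    (\"spo2_min\", lambda v: v < 90.0, \"immediate escalation\"),\n",
    "    (\"temperature\", lambda v: v > 39.0, \"urgent review\"),\n",
    "    (\"wbc_count\", lambda v: v > 15.0, \"flag for infection workup\"),\n",
    "]\n",
    "\n",
    "\n",
    "def rule_based_triage(patient: dict[str, Any]) -> list[str]:\n",
    "    \"\"\"Return the triage actions triggered by a patient's vitals.\n",
    "\n",
    "    Assumption: a missing field is skipped rather than escalated. A real\n",
    "    deployment may prefer to treat missing vitals as a trigger in itself.\n",
    "    \"\"\"\n",
    "    actions = []\n",
    "    for field, is_triggered, action in TRIAGE_RULES:\n",
    "        value = patient.get(field)\n",
    "        if value is not None and is_triggered(value):\n",
    "            actions.append(action)\n",
    "    return actions\n",
    "\n",
    "\n",
    "# Made-up example patients, independent of the notebook's earlier state\n",
    "stable_patient = {\"spo2_min\": 96.0, \"temperature\": 37.0, \"wbc_count\": 7.5}\n",
    "critical_patient = {\"spo2_min\": 88.0, \"temperature\": 39.4, \"wbc_count\": 16.2}\n",
    "\n",
    "print(rule_based_triage(stable_patient))    # no rule fires: []\n",
    "print(rule_based_triage(critical_patient))  # all three rules fire\n"
   ]
  },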
  {
   "cell_type": "markdown",
   "id": "bf4ee866",
   "metadata": {},
   "source": [
    "## 8. Governance and Regulatory Landscape (pp. 357–358)\n",
    "\n",
    "### Table 12.2 — Global AI Regulatory Frameworks\n",
    "\n",
    "| Framework | Scope | Key Requirements |\n",
    "|-----------|-------|-----------------|\n",
    "| **EU AI Act** | High-risk AI in EU | Risk assessment, bias mitigation, human oversight, documentation |\n",
    "| **NIST AI RMF** | US Federal guidance | Risk identification, continuous monitoring, stakeholder values |\n",
    "| **GDPR** | Personal data processing in EU | Data minimization, consent, rights around automated decision-making (often described as a \"right to explanation\") |\n",
    "| **Singapore AI Verify** | Voluntary certification | Fairness, robustness, transparency self-assessment |\n",
    "| **China CAC** | Algorithmic systems | Algorithmic transparency, user control, anti-discrimination |\n",
    "\n",
    "### Table 12.3 — Technique Selection Reference\n",
    "\n",
    "| Technique | Use Case | When to Use |\n",
    "|-----------|----------|-------------|\n",
    "| **Deontic logic** | Ethical constraint formalization | Rules must be machine-verifiable |\n",
    "| **Equalized odds** | Error-rate parity enforcement | TPR and FPR must be equalized across groups under anti-discrimination law |\n",
    "| **Disparate impact ratio** | Adverse impact detection | Regulatory safe-harbor threshold (four-fifths, or 0.8, rule) |\n",
    "| **LIME** | Local decision explanation | Individual prediction for non-technical stakeholder |\n",
    "| **SHAP** | Feature attribution | Global model audit across full feature space |\n",
    "| **Counterfactual analysis** | Recourse generation | Actionable path to a different outcome |\n",
    "| **Confidence calibration** | Uncertainty communication | Decision confidence for patients/clinicians/regulators |\n",
    "| **Compliance registry** | Multi-jurisdictional validation | Different regulatory requirements by jurisdiction |\n"
   ]
  },
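  {
   "cell_type": "markdown",
   "id": "d8b3f0a1",
   "metadata": {},
   "source": [
    "To make the disparate impact row of Table 12.3 concrete, the sketch below computes the ratio of selection rates between two groups and checks it against the 0.8 threshold. This is a minimal illustration, not the `BiasDetector` from Notebook 01: the function name `disparate_impact_ratio`, its parameters, and the toy outcomes are all assumptions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d8b3f0a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical sketch of a disparate impact check (toy data, assumed names).\n",
    "from typing import Sequence\n",
    "\n",
    "\n",
    "def disparate_impact_ratio(\n",
    "    outcomes: Sequence[int],\n",
    "    groups: Sequence[str],\n",
    "    protected: str,\n",
    "    reference: str,\n",
    ") -> float:\n",
    "    \"\"\"Selection rate of the protected group divided by that of the reference group.\"\"\"\n",
    "\n",
    "    def selection_rate(group: str) -> float:\n",
    "        selected = [o for o, g in zip(outcomes, groups) if g == group]\n",
    "        if not selected:\n",
    "            raise ValueError(f\"no records for group {group!r}\")\n",
    "        return sum(selected) / len(selected)\n",
    "\n",
    "    reference_rate = selection_rate(reference)\n",
    "    if reference_rate == 0:\n",
    "        raise ValueError(\"reference group has a zero selection rate\")\n",
    "    return selection_rate(protected) / reference_rate\n",
    "\n",
    "\n",
    "# Toy data: 1 = favorable outcome. Group A is selected 3 of 4 times, group B 1 of 4.\n",
    "outcomes = [1, 1, 1, 0, 1, 0, 0, 0]\n",
    "groups = [\"A\"] * 4 + [\"B\"] * 4\n",
    "\n",
    "ratio = disparate_impact_ratio(outcomes, groups, protected=\"B\", reference=\"A\")\n",
    "print(f\"DI ratio: {ratio:.2f}, passes four-fifths rule: {ratio >= 0.8}\")\n",
    "# prints: DI ratio: 0.33, passes four-fifths rule: False\n"
   ]
  },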
  {
   "cell_type": "markdown",
   "id": "1a92df9c",
   "metadata": {},
   "source": [
    "## Summary & Exercises\n",
    "\n",
    "### Key Takeaways\n",
    "\n",
    "1. **Reasoning transparency** (pp. 347–348) means recording the agent's logic at every stage as it happens, rather than generating post-hoc rationalizations.\n",
    "2. **SHAP** (p. 349) provides the *unique* feature attribution satisfying the efficiency, symmetry, dummy, and additivity axioms. **LIME** offers local interpretability without uniqueness guarantees.\n",
    "3. **Counterfactual analysis** (p. 349) provides actionable recourse by finding the minimal change that flips a decision. Counterfactual fairness asks a closely related question about protected attributes.\n",
    "4. **Confidence calibration** (pp. 350–352) ensures that predictions reported as \"80% confident\" are correct approximately 80% of the time. Distinguish epistemic (reducible) from aleatoric (irreducible) uncertainty.\n",
    "5. The **DiagnosticAssistant** (pp. 352–356) demonstrates how biometric data, symptom analysis, and clinical knowledge converge through explanation-aware agents.\n",
    "6. **Audience adaptation** (p. 356) is a trust mechanism: clinicians get SHAP values and differentials; patients get plain-language next steps.\n",
    "\n",
    "### Exercises\n",
    "\n",
    "1. **Add a new audience**: Implement a \"regulator\" audience template that includes the full SHAP table, the four-fifths disparate impact (DI) ratio, and the compliance registry status.\n",
    "2. **Expand the differential**: Add \"tuberculosis\" to the differential diagnosis list in `MockLLM._mock_differential_generation` and observe how the SHAP attributions change.\n",
    "3. **Calibration experiment**: Change the `TemperatureScaler` temperature to 0.5, then to 2.0. How does each setting affect the qualifier distribution?\n",
    "4. **Counterfactual fairness** (p. 349): For the HR dataset from Notebook 01, compute counterfactuals asking \"would the outcome change if gender were different?\" Then relate the result to the BiasDetector's disparate impact analysis.\n",
    "\n",
    "---\n",
    "*Author: Imran Ahmad — Packt Publishing, 2026*\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}