{ "cells": [ { "cell_type": "markdown", "id": "d21d6654", "metadata": { "papermill": { "duration": 0.016431, "end_time": "2026-05-28T02:05:37.587432+00:00", "exception": false, "start_time": "2026-05-28T02:05:37.571001+00:00", "status": "completed" }, "tags": [] }, "source": [ "# 20 · Chain-of-Verification (CoVe) — kill hallucinations with self-questioning\n", "\n", "> **TL;DR.** Produce a baseline answer, **plan verification questions** about the specific claims in that answer, **answer each question independently** (without seeing the baseline), then **revise** the baseline keeping only verified claims.\n", ">\n", "> **Reach for it when** the task is fact-heavy and the baseline tends to confabulate (lists of entities, biographical details, citations, statistics).\n", "> **Avoid when** the task has no externally-verifiable facts (creative writing, opinions) — there's nothing for verification questions to check.\n", "\n", "| Property | Value |\n", "|---|---|\n", "| Origin | Dhuliawala et al., Meta 2023. [arXiv:2309.11495](https://arxiv.org/abs/2309.11495) |\n", "| Stages | BASELINE → PLAN questions → EXECUTE answers → REVISE |\n", "| Key trick | Verification answered WITHOUT seeing the baseline (breaks consistency-bias) |\n", "| LLM-as-Scorer? | **None** — REVISE makes categorical keep/drop decisions per claim |\n", "| Default LLM | **Qwen3-Thinking** (per handoff §10) |\n", "| Cost | 1 + 1 + N + 1 = **N+3 LLM calls** (N = verification questions, usually 3-7) |\n", "\n", "**Why this is different from Reflection (nb 01).** Reflection critiques the whole answer in one shot and rewrites. CoVe *decomposes* the critique into independent atomic checks. The critical detail: verification questions are answered with no access to the baseline, so the model can't rationalise its prior claims into self-consistent (but wrong) confirmations." 
] }, { "cell_type": "markdown", "id": "0b6aa77d", "metadata": { "papermill": { "duration": 0.013319, "end_time": "2026-05-28T02:05:37.609134+00:00", "exception": false, "start_time": "2026-05-28T02:05:37.595815+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 2 · Architecture at a glance\n", "\n", "```mermaid\n", "flowchart LR\n", " A([task]) --> B[BASELINE<br/>produce initial answer]\n", " B --> P[PLAN<br/>generate verification<br/>questions per claim]\n", " P --> E[EXECUTE<br/>answer each question<br/>independently — no baseline access]\n", " E --> R[REVISE<br/>keep verified claims,<br/>drop or correct the rest]\n", " R --> Z([final answer])\n", "\n", " style B fill:#ffebee,stroke:#c62828\n", " style P fill:#e3f2fd,stroke:#1976d2\n", " style E fill:#fff3e0,stroke:#f57c00\n", " style R fill:#e8f5e9,stroke:#388e3c\n", "```\n", "\n", "The red BASELINE node is where hallucinations enter. The orange EXECUTE node is the load-bearing fix — each verification question becomes a fresh, isolated prompt." ] }, { "cell_type": "markdown", "id": "264db53f", "metadata": { "papermill": { "duration": 0.014571, "end_time": "2026-05-28T02:05:37.634510+00:00", "exception": false, "start_time": "2026-05-28T02:05:37.619939+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 3 · Theory\n", "\n", "### 3.0 · The consistency-bias problem CoVe attacks\n", "\n", "If you ask \"Is your previous answer correct?\" the model is incentivised to agree with itself — that's the most coherent response to \"is X true?\" when X is its own prior output. The fix: ask the verification questions in a fresh context where the prior answer isn't visible. The model now treats each question as a standalone factual query and answers based on its actual world model, not its consistency with an earlier (possibly wrong) commitment.\n", "\n", "Concretely: the `_execute` node in [`chain_of_verification.py`](../src/agentic_architectures/architectures/chain_of_verification.py) loops over verification questions and calls the LLM once per question, with a prompt that contains ONLY the question — no task description, no baseline answer. The model is forced to answer from its actual knowledge.\n", "\n", "### 3.1 · Why decompose the critique\n", "\n", "Reflection (nb 01) asks one big \"is this answer good?\" question. The model can only commit to a single overall judgement, and it gets pulled toward \"yes, mostly fine\". CoVe forces the model to commit, atomically, to *N independent* judgements — one per claim. 
Some commitments will land on \"actually I'm not sure about that one\" or \"no, that's wrong\", which the REVISE stage can act on.\n", "\n", "### 3.2 · No LLM-as-Scorer step → no flat-scoring pathology\n", "\n", "The `_VerificationAnswer` schema captures `confidence: 'high' | 'medium' | 'low'` for each verification answer — categorical, not numeric. REVISE makes per-claim keep/drop decisions, not weighted score composition. There is no numeric judgement anywhere in the pipeline, so the LLM-as-Scorer flatness pathology (Mental Loop nb 10 §11) cannot manifest.\n", "\n", "### 3.3 · Where this sits\n", "\n", "| Pattern | Hallucination strategy |\n", "|---|---|\n", "| Plain CoT | Hope the chain doesn't go wrong |\n", "| [Reflection (nb 01)](./01_reflection.ipynb) | Holistic critique + rewrite (one big judgement) |\n", "| **CoVe (this nb)** | **Decompose into N atomic factual checks, answered in isolation** |\n", "| [Self-Consistency (nb 21)](./21_self_consistency.ipynb) | Sample N reasoning paths, majority-vote |\n", "| RAG (nb 23+) | Ground in external retrieved documents |\n", "| [Constitutional AI (nb 32)](./32_constitutional_ai.ipynb) | Critique against a written constitution |\n", "\n", "### 3.4 · Failure modes preview\n", "\n", "1. **Bad question design.** If PLAN generates yes/no questions for claims the model is already biased toward \"yes\" on, EXECUTE will return all `yes/high` and no changes will be made. Mitigation: prompt PLAN to target the *most likely wrong* claims.\n", "2. **Verification answers also hallucinate.** The model that wrote the baseline is also answering the questions — if its world model has the same gap, verification will agree with the baseline (both wrong). Mitigation: use a stronger model for EXECUTE.\n", "3. **No claims to verify.** Free-text/creative tasks have nothing to fact-check. PLAN will produce vacuous questions." 
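, "\n",
"\n",
"### 3.5 · Sketch: the four stages in plain Python\n",
"\n",
"The isolation property from §3.0 can be illustrated without the library. This is a minimal sketch under stated assumptions, not the code in `chain_of_verification.py`: `run_cove_sketch`, the prompt wording, and the one-question-per-line PLAN format are hypothetical, and `llm` stands for any callable that maps a prompt string to a reply string.\n",
"\n",
"```python\n",
"from typing import Callable, Dict, List\n",
"\n",
"\n",
"def run_cove_sketch(task: str, llm: Callable[[str], str]) -> Dict[str, object]:\n",
"    # Hypothetical helper: BASELINE -> PLAN -> EXECUTE -> REVISE.\n",
"    # BASELINE: one call; this is where hallucinations may enter.\n",
"    baseline = llm('TASK: ' + task)\n",
"\n",
"    # PLAN: one call; assumed to return one verification question per line.\n",
"    plan = llm('PLAN QUESTIONS FOR: ' + baseline)\n",
"    questions: List[str] = [q.strip() for q in plan.splitlines() if q.strip()]\n",
"\n",
"    # EXECUTE: N calls; each prompt is ONLY the question, so the model\n",
"    # cannot see its earlier answer and rationalise it.\n",
"    answers = [llm(q) for q in questions]\n",
"\n",
"    # REVISE: one call; sees the baseline plus the independent answers.\n",
"    evidence = ' | '.join(q + ' -> ' + a for q, a in zip(questions, answers))\n",
"    revised = llm('REVISE: ' + baseline + ' GIVEN: ' + evidence)\n",
"\n",
"    return {\n",
"        'questions': questions,\n",
"        'answers': answers,\n",
"        'revised': revised,\n",
"        'llm_calls': 1 + 1 + len(questions) + 1,  # N + 3\n",
"    }\n",
"```\n",
"\n",
"Passing a stub `llm` that records its prompts makes the property checkable: the EXECUTE prompts should equal the questions verbatim, and the call count should come out at N+3."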
] }, { "cell_type": "markdown", "id": "faf0d15f", "metadata": { "papermill": { "duration": 0.014065, "end_time": "2026-05-28T02:05:37.659510+00:00", "exception": false, "start_time": "2026-05-28T02:05:37.645445+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 4 · Setup" ] }, { "cell_type": "code", "execution_count": 1, "id": "ec2996b1", "metadata": { "execution": { "iopub.execute_input": "2026-05-28T02:05:37.694159Z", "iopub.status.busy": "2026-05-28T02:05:37.694159Z", "iopub.status.idle": "2026-05-28T02:05:48.131018Z", "shell.execute_reply": "2026-05-28T02:05:48.127570Z" }, "papermill": { "duration": 10.462023, "end_time": "2026-05-28T02:05:48.135652+00:00", "exception": false, "start_time": "2026-05-28T02:05:37.673629+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "<pre>Reasoning LLM: Qwen/Qwen3-235B-A22B-Thinking-2507-fast ────────────────────────────────────────────────────────────\n", "</pre>\n" ], "text/plain": [ "\u001b[1;36mReasoning LLM: Qwen/Qwen3-235B-A22B-Thinking-\u001b[0m\u001b[1;36m2507\u001b[0m\u001b[1;36m-fast\u001b[0m \u001b[92m────────────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from agentic_architectures import get_llm, enable_langsmith, settings\n", "from agentic_architectures.architectures import ChainOfVerification\n", "from agentic_architectures.ui import print_md, print_header, print_step\n", "\n", "enable_langsmith()\n", "\n", "# Per handoff §10, nb 20 defaults to Qwen3-Thinking.\n", "reasoning_llm = get_llm(\n", " provider=\"nebius\",\n", " model=\"Qwen/Qwen3-235B-A22B-Thinking-2507-fast\",\n", " temperature=0.4,\n", ")\n", "print_header(f\"Reasoning LLM: {reasoning_llm.model}\")" ] }, { "cell_type": "markdown", "id": "7ef083cd", "metadata": { "papermill": { "duration": 0.014769, "end_time": "2026-05-28T02:05:48.161990+00:00", "exception": false, "start_time": "2026-05-28T02:05:48.147221+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 5 · Library walkthrough\n", "\n", "Source: [`src/agentic_architectures/architectures/chain_of_verification.py`](../src/agentic_architectures/architectures/chain_of_verification.py).\n", "\n", "Three structured-output schemas drive the four stages:\n", "\n", "- **`_VerificationQuestions`** — Stage 2: list of 3-7 questions, each targeting one specific claim.\n", "- **`_VerificationAnswer`** — Stage 3: per-question answer + categorical `confidence` (high/medium/low).\n", "- **`_RevisedResponse`** — Stage 4: revised answer + bullet list of changes made." 
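, "\n",
"\n",
"For orientation, the three schemas can be approximated with plain dataclasses. This is a hedged sketch, not the library's definitions: the real classes are Pydantic models, the `*Sketch` class names and the validation logic are hypothetical, and only `questions`, `question`, `answer`, `confidence`, and `revised_response` are field names visible in this notebook (`changes_made` is borrowed from the run metadata and may be named differently in the schema).\n",
"\n",
"```python\n",
"from dataclasses import dataclass, field\n",
"from typing import List\n",
"\n",
"# Categorical labels only; there is deliberately no numeric score.\n",
"CONFIDENCE_LEVELS = ('high', 'medium', 'low')\n",
"\n",
"\n",
"@dataclass\n",
"class VerificationQuestionsSketch:\n",
"    # Stage 2: standalone questions, each probing one factual claim.\n",
"    questions: List[str]\n",
"\n",
"\n",
"@dataclass\n",
"class VerificationAnswerSketch:\n",
"    # Stage 3: independent answer to a single verification question.\n",
"    question: str\n",
"    answer: str\n",
"    confidence: str\n",
"\n",
"    def __post_init__(self) -> None:\n",
"        # Reject anything that is not one of the three labels.\n",
"        if self.confidence not in CONFIDENCE_LEVELS:\n",
"            raise ValueError('confidence must be high, medium, or low')\n",
"\n",
"\n",
"@dataclass\n",
"class RevisedResponseSketch:\n",
"    # Stage 4: final answer plus the list of edits applied to the baseline.\n",
"    revised_response: str\n",
"    changes_made: List[str] = field(default_factory=list)\n",
"```\n",
"\n",
"The categorical `confidence` field is the point of the design: REVISE can branch on three labels instead of interpreting a numeric score."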
] }, { "cell_type": "code", "execution_count": 2, "id": "e885c109", "metadata": { "execution": { "iopub.execute_input": "2026-05-28T02:05:48.190485Z", "iopub.status.busy": "2026-05-28T02:05:48.189805Z", "iopub.status.idle": "2026-05-28T02:05:48.219905Z", "shell.execute_reply": "2026-05-28T02:05:48.216734Z" }, "papermill": { "duration": 0.042776, "end_time": "2026-05-28T02:05:48.221923+00:00", "exception": false, "start_time": "2026-05-28T02:05:48.179147+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- _VerificationQuestions ---\n", "{\n", " \"description\": \"Stage 2 \\u2014 questions designed to probe specific claims in the baseline.\",\n", " \"properties\": {\n", " \"questions\": {\n", " \"description\": \"3-7 verification questions. Each must target ONE specific factual claim from the baseline. Phrase as standalone questions answerable without se...\n", "\n", "--- _VerificationAnswer ---\n", "{\n", " \"description\": \"Stage 3 \\u2014 independent answer to one verification question.\",\n", " \"properties\": {\n", " \"question\": {\n", " \"description\": \"The question, copied verbatim.\",\n", " \"title\": \"Question\",\n", " \"type\": \"string\"\n", " },\n", " \"answer\": {\n", " \"description\": \"The answer in 1-2 sentences....\n", "\n", "--- _RevisedResponse ---\n", "{\n", " \"description\": \"Stage 4 \\u2014 final answer after applying verification.\",\n", " \"properties\": {\n", " \"revised_response\": {\n", " \"description\": \"The rewritten answer, keeping only claims that the verification questions confirmed (or didn't disconfirm). 
Drop or correct any claim the verification answ...\n", "\n" ] } ], "source": [ "from agentic_architectures.architectures.chain_of_verification import (\n", " _VerificationQuestions, _VerificationAnswer, _RevisedResponse,\n", ")\n", "import json\n", "for name, schema in [\n", " ('_VerificationQuestions', _VerificationQuestions),\n", " ('_VerificationAnswer', _VerificationAnswer),\n", " ('_RevisedResponse', _RevisedResponse),\n", "]:\n", " print(f'--- {name} ---')\n", " print(json.dumps(schema.model_json_schema(), indent=2)[:300] + '...')\n", " print()" ] }, { "cell_type": "markdown", "id": "dd94170d", "metadata": { "papermill": { "duration": 0.025993, "end_time": "2026-05-28T02:05:48.259997+00:00", "exception": false, "start_time": "2026-05-28T02:05:48.234004+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 6 · State" ] }, { "cell_type": "markdown", "id": "31dbe3f3", "metadata": { "papermill": { "duration": 0.006021, "end_time": "2026-05-28T02:05:48.279137+00:00", "exception": false, "start_time": "2026-05-28T02:05:48.273116+00:00", "status": "completed" }, "tags": [] }, "source": [ "| Field | Set by |\n", "|---|---|\n", "| `task` | caller |\n", "| `baseline_response` | `_baseline` |\n", "| `verification_questions` | `_plan` |\n", "| `verification_answers` (each with `confidence`) | `_execute` (one LLM call per question) |\n", "| `revised_response` / `changes_made` | `_revise` |\n", "| `history` | every node (`Annotated[..., operator.add]`) |" ] }, { "cell_type": "markdown", "id": "18fdfa91", "metadata": { "papermill": { "duration": 0.008038, "end_time": "2026-05-28T02:05:48.297756+00:00", "exception": false, "start_time": "2026-05-28T02:05:48.289718+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 7 · Build the graph" ] }, { "cell_type": "code", "execution_count": 3, "id": "adff2c97", "metadata": { "execution": { "iopub.execute_input": "2026-05-28T02:05:48.330995Z", "iopub.status.busy": "2026-05-28T02:05:48.327242Z", "iopub.status.idle": 
"2026-05-28T02:05:49.073333Z", "shell.execute_reply": "2026-05-28T02:05:49.069754Z" }, "papermill": { "duration": 0.770285, "end_time": "2026-05-28T02:05:49.077443+00:00", "exception": false, "start_time": "2026-05-28T02:05:48.307158+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAGoAAAITCAIAAABg8R7gAAAQAElEQVR4nOydCVwUZR/Hn5k9uO9DBQUURMELFQvTNEPUSvO+NTXLI/NIzQ7t0G7LTjOz8sj09TXzyjPT17zyNrzSEgQURERu2GWPmfc/O7AssDu7s8+CAzzfj9HsM8/z7Mxvn/v6y1mWRQR7kSMCBkQ+LIh8WBD5sCDyYUHkwwJXvptXVDfOF+Zla9QqPaNDiEWUnGV1FEsjijV8lLGsnqJolmXgL0I0dxcCwjXLIJZCNMVdcMhYpOduyRSsXstdsIilaYq/S8kRq2e5AODORYsQUxYJ/43GR4JbrB7+x4U3BKx0F1C40AoF5eolD4v2aBPnhjCg7Gv3nTuUf+l4XnGBDjGsQkHLnWi5glMH3lCmoPVahpJT3DuwLHehYxHNvy3nh9Fx38i/OYgH6vEC0XDLcCFTUnoN/+oUd5vhrmVypGcoxPDu3B3O3aAR/43GZ6MVFKNljfKVfTRB4STTahmtmtFqWUbHOLvJm7dx6zUiAIlHtHznD+adO5Sj16OAYOeHevs1i3JCdZmi++zRnVm3b5TodUzzth59xweKCi5OvnVLUlTFTJs470cH+6L6xd+nik7svscw7PPvtECUraFEyPfNy0n+wU7D5zRF9ZfDW7Kv/JnXfWBghx6etvi3Vb7lc288PqJxdJw7agCsmJ809rUwLz+ZVZ82yff1vKTn3wlXuqKGw7evJsfG+3VO8BL2RiNrrFyQFD+ycYPSDpj6YYuT+7Pz7+mEvVmRb92S1IBmzq0fwmoc1VHi+vn/Z1masB8h+c4eyCsu0g2dGYwaJJ17ezm70j9/cVvAj5B85/6X276rD2rADJ8dmpWmFvBgUb6/DhdAi7x7vWvficLNi3Jxl/3yZbolDxblSzya16iZC6pdEhIS0tPTxYZKSkrq378/qhlievhm3baYAC3KV5SniU3wQ7XInTt3cnNzkXiuXr2KaoxO8V7QJU+7Zl5B8yMu/14ohu59SOsa6c9CS/M///nPrl27UlNTmzdvHhcXN3369AsXLkybNg3uDhw4sGfPnsuWLYM0tWXLljNnzmRkZLRo0WLQoEHDhg3jY4iPj3/uuecOHToEocaPH79+/XpwjI2Nfemll8aOHYscDeTfKyfyQ1o7V79lXr6bV4oVTjZ3/ESyadOm1atXz5kzp1u3bocPH/7666/d3NwmTZr0+eefg+OOHTuCg7m6HhQE4RYuXAiDKykpKR999FGTJk0gCNxSKBTbtm176KGHQMTOnTuDh99++w1+D1QzePgocu+Vmr1lXr7C+1pX15oaST1//nx0dDRfWg0ePLhLly4lJSXVvX3wwQfFxcVBQUHIkLJ27tx54sQJXj7Qy8vLa/78+ahW8PBVpP+rNXvLvEaaUkaurKnU16FDh6+++mrJkiUdO3bs0aNH06bmxyAgj0M6PX78OORx3oVPlTzwA6DawsmN1mgYs7fMy8fAyCRdU/KNGTMGcusff/yxePFiuVwOte2sWbMCAiqNVjIMM3v2bI1G8+KLL0LS8/DwmDx5sqkHpVKJagtQgrZQxZqXT+kiKy3Roxp6GpoebCA5Ofn06dOrVq0qKir67LPPTP1cu3btypUrK1asgAKOdyksLAwMFDeW6ShURQxtITGZl8/dS5Gf
bT634wNlfFRUVHh4eAsDoAvUA1X85OXlwV+jXskGIAh6EBTmaJUu5oUynyibtXbTlprP7fjs27fv5ZdfPnLkSH5+/rFjx6D9AaUhuIeFhcHfAwcOXL58GWSFfA0tkoKCAqh2P/74Y2jfQMPQbIQhISHZ2dlQiRtLSceSf1/r7SdGvrZd3Rk9m3OnRhLgokWLQJ25c+dC8+2dd96BVh60TsAd6pABAwasXLkSKpbGjRu/++67ly5devzxx6E1N2PGDGj0gazGpp8p3bt3j4mJgYp4//79qAZQF+tadfIwe8vicOl3C5MDmzkPnBaEGjbXzhYd3JQ545MIs3ctdtqiunjeFRxsaCCc2X/fN9Bi78ti27j7IP+Lx/Mu/C+/Yy/zA9aZmZmjRo0ye8vd3R0qU7O3INtClwPVDGsNmL3FzQtbyGfQNjJbJvDkZWumvh9h6a7QXMfBjff+TSyY9pH5+k6n02VlZZm9pVarnZ2dzd6CCqHm2h+FBszegirI09P85Bm4w+9t9tbGD9Jg5n7cwhBkAStTRd++lhza2rXfhMao4ZF2Xf3rqtszlkUI+LEy1zH1gxY3EotU+TXVhJYye9fcefTpRsJ+rM+09RnbeO17KaiBsebt1OBw1/Y9PYS92TTPm5Op3bg09cVPI1DD4JsFyT2HBkY/bH1NgK2rDG5eKdn1fUZMD59HB9fqEHQtk3ZNtXtNRliU2xMTbSruxS0R+vb1ZIWC7vdM46BwZ1Tv2Lj0Vn625tFBgW0f8bAxiOgFart/uJP6dwmMX0d0cH90sD+q+yQeKbx4LBfGBfyaOI2cJ24BlJ3LI/euvXvrn2KthlU6wVSe3NVdLneiaH4BqDFqfuknt3oUVfkWbiWkYZEjy7Bl6yR5d4iBofj1kMbVklC90RQNQ5CmjrQMQVD4r8JFzr0Lqwc/FMOUrcCkKe6a//KK5ZdymV7DFOXrSor0pSq9TEb5NlEOn9HUjqW2dsrHU5TLnj5w/95tdUm+tlQNI6zwkibyUYZVtAb1+EW1Fbd4yQwLQHlvPCAKy71txepdGDeVyeAXABFoY5xld1kuWqOLMazx96D4Bb7cOyLTsKCXQkk5uch8GinadfNpGmn/jBiWfLVA3759N27c6Ocn0fpK6ivroWsI/TwkVYh8WBD5sJC6fFqtFibFkVSRtHwMwzdWrHfMHxSSlk/iORcR+TCR9MNJvOBDJPVhQuTDgsiHBZEPC6nLR6oO+yGpDwsiHxZEPiyg2Uzksx+S+rAg8mFB5MOCyIcFGXHBgqQ+LGQymYeHrctNHghSnyrKz89HEkbaWUMuh/yLJAyRDwsiHxZEPiyIfFhIveFC5LMfkvqwIPJhQeTDgsiHBZEPCyIfFkQ+LIh8WBD5sJC+fFLcVbR48eKdO3fyD2bYZcVB0/SZM2eQxJDiovXp06eHhYXRBqDbC39BPksHrT1YpChfYGBgQkKCqQvIN3DgQCQ9JLplYuzYsaGhocaPwcHBgwYNQtJDovLBBNvTTz9t3BDTp08fb29vJD2ku2Fn9OjRfHkXFBQ0ZMgQJElsqnmP/pJbXKTRarjTXGRyWq8z7JWSU7zZHJkC6Y1HrZXvOTaauIEEBNWmXl+2YbzM9o4BLgbO/BCqfouPNi0l/UbSjaAmQS1btuS2nqOKuzS3HRpVf3aZnNLrWNMt1og31FO5/cNvmeY9o2rIZJSTu7JtnEejECtnVFqRb/PnGffTVXInGbfN3WA/iDfdhIwWlUwejjEYeCo3sIRYXdmDwssw+orXZoxb72Usw1BGTUxvGaJl9TqKMViIorjfpWyXufH9GfjMVj3Usezxyg0VVXI09WaQj3sAvZljIeHVFEpKW8q6ecrGLwxFlhGSb9fqu1lp6uGzQpF1ux/1k72r7xQXlE56K8ySB4vybfv6TmGObvCsZqhhc3BjZt499cQ3w8zetVh1ZKaqug+ycoJTQyB+TGN1CZN0SWP2rnn5rp8uhvImIKT2DveVMk7O
smunzB+mb37IQFWi1zfEQ9PMo9MzqhLzIxfm5dMzOoapqZNz6xysnjE2vKpATHzaAs1aSEtEPuvQNAsNbLO3iHzWgba92c4JIvLZAsUd+S8m9XGWYKmasphQ9+B612KqDq4nIu2jwWoVmkKUqLJP4ueq1TYWxSBln3Vg0IchDRe7oWhWXNVBMIUr+WgxDRduhFK6w/i1DVtufL46tKUASGSX9623F8ybPx3VPIOG9P5x/fdw8cvWTfEJD6GaR3zZVxdq3uiotuPHPYdqHsowuG+WOlz2RUW1hX+oFuBmcMzjSPmgo3L23Kn//vfHy1cSw8MjZ81cENmyNbgXFRX9vOWn02f+TElJ8vP1f+SRns9Oms4b9EhLS1mzduVfiecgubdp037UiGfatYtBhq2oP6xecfLUsayszLZtYwYPHBEX173K10HmXfHNpwcPnEaGHD1p4rT8/Lx1P65ycXHpEtv1xRnz/fz8bYxKGIGMSFuWAoklNe3m9h2bx4yZ9P57n8Nw4aI35vJlwNZtmzb+Z+3IEePBferU2Yf/OAAvCe4ajWbO3CkymeyjD79a9vE3cpl84aKX1GrOws+XXy3d8svGwYNGbtzwa88e8W8tXvDHkYMCX61QKOBng2n17dsOrlvzy6XLf61d9y1/S2xUZmCQ2HafPYVfbm7OnFmv+vtz9v6eGf/8a6/PTkw8HxPTecTwcfDcoaHNeW+XLyeePnNi6pRZt26lQpChQ0bzifStNz9MvHgeEktpaen+33aNGT3x6QFDwf3JJwZCkB/XfweRCHx7cHCzcWOf5a7cPSD1/fPP33BpX1S2YzH1IfGEt2jJawe0bcOZXsu4cxsZksaZs39Of+GZhL5xveJjN//8E6iGOMtiId7ePh8uffunDavhrSDtdIyJdXd3hzeHhAkSGGOO6dA5OflGfoHQ3t7IyCjjtYeHZ3ExZyzJvqiqYEez2Z6K182twsCFq6sr4iwEcU+56ruv9uzZDtkWXqNRo8bf//D1nr07wN3JyemLz77bvWc7ZC4onoKCmk58ZkpCwpNFRZy9oZmzJ1eJPzfnvpenl6VvN1vc2BdV9bgtCeLIqkOlVhmviww/vqenF5QCv+76ZdjQMf2fGlx2q6jCGlNISNj0aXOg1D9//vTefTvf//DN0LAWfoYkPG/uQsiPpvEHBoo2meSQqMyuBuGxNN5nT92RlnbTaCPr+nXO4HXT4BCtVqtSqfz9y6xjQVY68eeRcv8pV65efKLf0xDkkUd6PPxwt35PdoPs9nivvpAwwQPkZd4nZHb4GfgULQp4AEdFZRYLvQ7GnvE+Z2eXT5a9U1BYkJeXu2Hj6sDARtAKUSqVkMQgZaVn3IaGxdJPlrRrG1NYWFBcXAxZe+nHS75Z+fnt9FtQjWzYuAbqDSg04d0mTpgKBfylS3+B3FBRzl/wwudffIjE45ioxKY+JL7w0+q08OYhIc2Hj+gHrZbWrdu8+86nfBJ+Y+H7X69YNnHSMEhlL0yfGxMTe/r0icFDe69b+8vcl16HFgZUJuAttvPDny5bGRbWAq5HjXwGWo4bN62FTA1Fapvo9vPmLUJ24cCoqmN+jcu5Qzl/7sqZ8FYEIiC0aWmyl79ixEtmlvuQAStbsFgNWJKPzBRVAPlTXK+Dpuxr+dVXWHG2yRkyVVQJMtdRM1iQzzBPjggGaDkrl4ud5yXZtxxGT+n14vq8RDsT7Oh1EGzB0pCBzNIIVwMEWi2WzrC0sESI0fNmHgmIW9+HLG3KJpkXCyIfFhY6bXKZQklWaZShdKZdXM3vSzOvUUSUF9nXYUSnYfwam7eGbV4+jwDk5EKf+jUbY1x/bAAAEABJREFUNXjy7+m0Grb7YF+zdy3m0CefDbpxsQBpUANn9/e3WnW2PL0n0DnTa9CqRck+gU6hrTyc3JHVlgxLsTT8Zxp75e4Ljcyt22KFZpVNb1KGD2afguKtQJuNAp6Jqbhl+kic1WlzYTh74Hoq5VrxvVsl
fZ9p3LyNxXkla7vJ9WjTZ7cL72t1OkZXeW9D9SFB2mBqXCAyg5lyoTkofpjC1EOlb6G4/c9mhaCoCiPaVW/RLLIgn+ku88qxIaUT7equ6DbQv3lbF2QZqRvX7tev34YNG4hxbTsh5o2xIPJhIXFrTyT1YSFp+aBaYxhGJpPuQR7EWgwWRD4siKknLEjqw4LIhwWRDwtS9mFBUh8WRD4siHxYEPmwIPJhQeTDgsiHBZEPC9JsxoKkPiyIfFhI3VpMQEAAkjCSlk+v12dlZSEJQ2wVYUHkw4LIhwWRDwsiHxZEPiykLp/ETQ+Q1IcFkQ8LqcsHgy5IwpDUhwWRDwsiHxZEPiyIfFgQ+bCQ4q6imTNnHjt2zHiQDE3TDMPAx3PnziGJIcU9z7Nnz27atCldDjIoGBISgqSHFOWLiIjo3r27abaApNezZ08kPSS6437cuHHNmlUcNwjXw4YNQ9JDovIFBwfHx5edTA0FX2xsLG8pWmpI97yHUaNG8dbd4e/IkSORJHF8wyU5UV1SYrKH31B/lhnGNuw55vZwl9+B8q3MEDZfzfKWpctMYzv1iZv8P/Uf7Vq1VWUFXLlXUOatLExZFKYmt1kKCWxml8lkUZ3dHGuq2ZENl/9+cjvnroaikVbDwEvRiGIMfw3fw701v/mdKtthz5Zvry/7TJVvja+wLM6LVW7li/fGGLIMW/74lOkmcrrMnprBG/fVpu+mVFKMHrl7K8Yvclgl7jD5Nnxwm2XY+DFN3X0lffjVoQ2ZmaklUz9qgRyBY+RbuyTV3VPZd1ITVBdIuqA6tT9z6gfNETYOqDquni5Wl+jrinZAeEcXpQu967tMhI0D5Lt+qsDNo46Z4Q4Mds1KL0XYOKDmLSnS0tLdsGweZzdKo3bAYIQD5NNpWLauHdap1er0Wgc8MzmADgsiHxYNVD4KxsEccThrQ019LEux0ij7uFON69opu4461tsB8lGIarCWuBtq5qUcY5HEEamPG0Kpc7nXMUZgHZH6aEvnNkoXziKERGpeVo/qXK8DxtZYRiI1b51Le/zYtiPmKRwQBzfi6zgBjbazaxTueUVaDzcL6bRhQeTD4sHI1//pnmNGT7p+/eqRo4fc3Nzatev4+mvveLh7VPG2ddt/T548+vffl5VOTh3ad5o8eUZwEDfbu3jJq9Bq6x3/xIdL31apSqKj202bMlucnXKKcki778HM88pk8p+3bOjff8ih388s/XB5WlrKV8s/ruLn0qW/wLFNmw5Llnzy6iuLc3Nz3nu/zLakXC6/cvXigd/3rPxm/d7dx5yUTh989BYShYN6bY6QT8ZS4qOJCI/sEhsHSQDSzsCnhx0+fKDKKnBwX/PD5rFjJnWMiQWfI4aPg2RotOqsKil5ef6bQU2CQcr4x/vdupVaUlKCah1HZF49ZUe7LyKilfE6OKgZaJeRcdtovxwZZrXB5esVy/6+drm4uJh3zMvN4a06NwsJM9rYdTfk+sLCAkdZ3bUdRzRcZMiOFryTU4UJB2cX7mBz3pq4kePH/1j4xtxWraI///Q7Lo9/tNz0Lk1jPTnloF7mA+t1mIqlVnFWuZ2dK50Ov2vPtnbtYp6bPIP/aGqSGx+u4pDV2aoDSEysWCr6743rUIRVMR9eUJAfUG6SGzh69BByHAx02fTSqDq4eTbxP+S97CyofPV6PVS7u3Zv7dWrD29E3AjULWfOnrzw11mdTgc+ecfMu3eQlHDMgJUdnbb+Tw2+cuXiim8+g+tOHbvMfPHlKh6effaFkpLiRW/MValUQwaPgrbLnTvpr742a+Hr7yLJ4IA1LusWp0L6GzonzPYgAwfHDx0y+pnxz6EHxPGdmUl/Fc1YFoHwaKgzbQ6qex0yXMpSdW602UHP6wj5GNHN5h3bDqIHCuugdSWOmaisc4nPUTii11EH7UhDN4mWSWSuA5p9dU1BaDUzemnMdXDGv+rcNDlFsVKpeesirGMmuMhgPRaOyLx03VvhQjlosN4R
VQdjV6f3gcI6aLCeZF4siHxYOEA+uROM3NYxS9wKpVzp5IDdFA54bTdPhVZdx8q+kjydwkkag/Wxj/mVFEn6vIvqZKWXBrdwQ9g4QL6m0Upvf+W2L2+hOsIfW+5Bpy1hvANOhHZYf2vPD3fvpKrbPuIT3dUTSZW0q6oLh7O1pcykt0ORI3Bkd3XPmru3/y3RaRi9jb1x29qLFPeMVsopymDYW9gPTVNyBeXb2Gn4nGDkIGqgt69HGnMHZlbXyqx6VRyHDBnyw6pVPv7+wgFl3NdaiVwpQ8jRWxdroN0nMzyogygtLXRylStt2O/6QDZ1EvPGWBD5sCDyYUHkw4JYi8GCyIcFkQ8LIh8WUrfTRuSzH5L6sCDyYUHkw4LYqMSCpD4siHxYkMyLBUl9WBD5sCDyYUHKPixI6sOCyIcFaNeoUSMkYaSe+u7evYskDLFVhAWRDwtJywetFmKj0n5I6sOCyIcFkQ8LYlwbC5L6sCDyYUHkw4LIhwWRDwsiHxZEPiyIfFgQ49r28Pzzz585c4Y/XpPbj2U4dAAuLly4gCSGFLcxT58+PTg4mLesLZPJ+Atin9dWOnXqFBMTY5otoOfboUMHJD0kuol+/PjxQUFBxo9wPXbsWCQ9JCpf69atu3btyidAhmGio6OjoqKQ9JC0cW3euntgYOCYMWOQJJGufC1atIAECEkvMjKyY8eOSJLY1HA5tj3n+rkCjVqv15X7NrUmbtzpzVsNR5V8GE4lruoojiqBbIrDBk+VnrYCmZySK+kmoa4Dplifobcu36k9OYnH8iM6+LTt5q10Ktu0TVUWkIc3Gc6aejB4EvJQ+SNtsOFiuhHc9CtMfwRUOSyq/CS2hKr+MEZunM27erLAzYseMddKa8mKfDu+zbyXXjpyXihqeOxala5R6ya8IfTugmWfHmX8WzJyZkPUDug/Jbi0hDlzIF/Aj5B8B3++r3ChUR2zm+1IvPydbvwlZGpASL6i+6UyeR07nMqxKJ1pVbHQMgehEZfSUr1WLel5wppGCwgagCcH0GFB5MNCSD65gqYbtrwwWiaTCTXshOTRaRmmjh0s52D0ekavF+q9kMyLhaB8FKp7lndrFyH5oENNUw263UdZM4IsJB+LkAQnkmoT1poRZMHMyzRYi/e2Iph5aUTJiH5CCGZeSH36hmoHxgDNWUUR8kAaLkIwnFUUIQ9C8oH2FE0yrxBC7RLQ3mANoS6xeMmre/buQLVFfWvWXb9+FTkOuVwmvCFWSD6uySxSXp1O9+2qLydNHvHUgB6vvDbr5MljvPuBA3viEx66ceMf/uPVvy/3io89YjB6aikIMqzN2PTfH594qjv8mzd/+qVLf/Hu8BHcjd6Wfrxk6rRxcAFx3snM+PiTdwYMfIy/tW//ry+8OBH8w98tv2wU24zV6fTCK7yE5IGal5v4EsOXXy2Fpxw8aOTGDb/27BH/1uIFfxzhLNolJDzZudNDyz59FxnaonDRO75fj0cfFwgCrPruqx07fl6y+JNFr78XENDolddmpqWlCHz7vj3H4e/L89/4dcdhuPj94L6Pli6ObNl64087n5s8A75l+YplyKEIpj45osWchqzRaPb/tmvM6IlPDxjq5en15BMD4x/v9+P67/i78+YuupmSBAXT9h0/5+Tcnz3rVcQNaJdaCpJfkL/5559GjZrQJTauW7ee8+ctiu0cdz8n2/bn2bNne/v2HefMftXHx7dTxy6TJkzbvn1zbm4OchyCmROSnhhzXEnJ/4KCXWK7Gl1iOnROTr7B22Nv1Kjxs5OmQ4JavXrFKwvednd3B8d//vnbUpCUm0mIW+zShneXy+VLFn/cMSbWxodhGObylUTTmDt27AKOFy85cpGglWYzw4goLHj71zNnT67inptzn7fHPmTwqLXrvoXiuH27jlaD8LecTUyYiwJ+FZio+GH1CvhXKWYxqQ+qDpnC3qkimuaO20c24+fLnSw/b+7CKmayAwMb8xdQ3jdpEgxv
teq7LyFPcUH8AywFycvLhYuSkmJkDb25pq2zs7Orq2ufhKd69Ig3dQ9qImKZJVQdeq29w6UG5URkXpCGN5BtzGLwU0NFAa8B1ykpyet+XPXlFz/otNpZc56DF4uObtc0OMRSkIiIVpBhEy+ej4pqiwwVzmsL5/TqmdC3b3+l0kmlKjF+761bqWafJzw8srCo0Bgz/Gx37qQHBjrybAnBmpcr+kSkPhcXl4kTpkLBDy0MyDtQgc5f8MLnX3yIDCXRu+8v7B3/RFTrNu3axcQ/3vf9D9+EJgvIZCkIFI4JvZ+Emnfvvp0X/jr71fKPz507xUsJuoPPoqIiuF7/0w/Z2Vn8A8AvERAQeLbcovnzk188fvwwVFbw7RD/kndemzt/GnwLchxCa1x+/uJWfpZu5ILmSAxnzp7cum3T+fOn3dzc20S3nz//DSj44CU3b16/YcNOTw/OGApkzLHjBw4bOmbSxGmWgiBDvQxSHvh9DzQAI8Ijoebp2vVRcE/PuL1s2bugESTPkSPGw10I++3Kn+DWjp1b1qxdqdNp/7Nxl4e7ByTMDRvX/HnyqFqtgpinTJnVulW07e+yd82t3Lu6qR9YVEBIvs2fpeXd049+RZx89Yl969Jz7pZOfa+FJQ/Cg/UwVN2gB6xYBhofdlcdIhsuDRAh+WQKSiZv0KnPKkLy6XUszBMjgmXIRCUWgmUfixi2Qac+rua0e57XELZBl30w3I7sn+elUMNWzzqCmVePGnbetY5gw0UO0+SIIIBww4VLgAQBBKsOKDZJ2SeIULNO5ixr4L0OGNERPjdaSD43NyXVsIcMoPhSuggV/0LydentX1rSoAu/gvvqphHuAh6E5PMNpjz9lbu+SUcNkgv7c2G86bHhvgJ+rG9I3fJZenEh02d8U3ffBpSRf1uXeT9TNeV9K0PFNm2H/vnz9PsZpTB6oNOx1RvS0DzUV1vJAPPrVea/YN6OqRyWoistQYL+Ec3ZyDb1UNFur3q3yhbcio/cuS9l72SynZiWI36TBWvwwDsan9z0i+QKGuZGPLyV4xdWmvwzi4hjcK6dLlYV66rPCtKIZqot5qhuLJwyqCXgwvWvuddgTTxQW7du79u3j6uri+FuhXfuQ/mXUvx0arlmBim5a+jwM6is0yqjaD0fmKUMO6FZ0wcwKF72vQpneYcunjbuI5XiKUKmJCQkbN682cfHB0kSqa8ulbjJDmLeGAsiHxZSlw+mwIl8dgLayWSSHjIj9nmxILaKsCDyYUHkw4KYucOCpD4siHxYEPmwIPJhQaoOLEjqw4LIhwWRDwupy0fKPj+5v4MAAA7jSURBVPshqQ8LIh8WRD4saJoOCAhAEkbqqS87W8Tu+9qH2CrCgsiHBZEPCyIfFkQ+LIh8WBD5sCDyYUHkw4LIhwWRDwsiHxZEPiyIfFgQ+bAg8mEhxW0xU6dOTU5ORoa1zXl5eS4uLgzDaLXas2fPIokhxVNuRo0aBYkuNze3oKAAxutLS0tBuyZNmiDpIUX5evXqFRkZaZot4Lply5ZIekj0jKVJkyb5+fkZP8KE0ejRo5H0kKh8cXFx0dHRRuPa4eHhXbp0QdJDuid8GROgj4/P8OHDkSSRrnwdOnSIiYmBOiQ0NPSxxx5DksT+hsvRrfeTEos4M6BayzFUMWBt+pG1yUY2tzuaovgDPeBRzZ7sQRnitfLViDvQC1W2eyVXUAoF3TjU+annGiO7sFO+Xd9nZqaqm4S6+gU76y2ZEoRHZahKlt5Md9Cbla/qFnuqbI+44cJcCIo/F541jZI1fGmVqKpHjpBCLs/P1mTcULEUM/HNUCQee+Rb/16aXo+Gzg5B9YXjW+6lJRVZPfihOqLLvlN7clUluvqkHdBtWICru/yXrzKQSETL98+FQv8mrqje0SrWJzujVGwo0fKpVXov/3poXyu0lbtOK/rIJNFCaNXQe6+HJzNpWT0j/rXqYTqyDxYhJL4NQuQrg0L2HNQqWj6WRvX2ULpa
kI+qp2ZToaFtRxNYvHzcEVD1MPmxdh1SLT7zcidv1cPkZ98ria86KJaqjwbL7ctQ4uVjoZtcDzMvpApUC2VffYUzakXVfNlXX6mlso8z4FEfjXjUUtnHHVZZLw0BsJQdSbBupKOBg+N/XP89qlEoFqF6WvaNHDE+OqodqlHsGjOoG/KNGT0R1TR2jRmIz7wUEmX7Ljn5Rq/42JMnjw0b0e+5KdxKAUvWtGfOnrzglRdNw762cM4LL05EJpkXuqVbftn4/JQx/Z7sNnXauO++X67Xl43SXblyEYI/PbDX+AlDVnzzWXGxddOq+IiXjzVY8LEZ/jCCH3/6HjLgvLmLkGVr2r16Jpw7f9r42mq1+uzZk70f72ca29atm37asHrY0DGbNu4aMGDo7j3beSPlt9NvzV/wgrpUvfyrNe8s/iQ5+d+X5k6phcVtNV518DOzXWLjhg8bG9W6jYA17Z49ezMMc/TYIT7gseOH4eNjjyWYxpZ48XyrVtF9+/b39vbp/9Tgr5evffihbuD+++97FXIFCBcSEhYW1mL+vDf+vXEdYkA1TC3VvJEto/gLAWvafn7+cH302P949+PHD3fu9JCvr59pPG3bdjh37tTSj5fs2/8rBAkOahoREYm4nJvYunUbLy9v3lvjxk2CgpqKsqMt6dFmpcGGMbJmgBvS2vKvP4FsK5PJ/jx5dNbMBVW8QbZ1dXU7fuKPj5Yulsvl4H/q87P8/QMg2mvXr0IhWyVOZDMUVTvDpTSW2VQBa9rwF+SAkvHEn0eUSiWXc3smVAlO0zTkWfiXkpJ8/vzptT+uKi4uev/dz3z9/Nu1i+Gt/Rrx8vRGtmNXr020fAzDMhjDzQLWtBH3wl6QYU+fPlFaqu72SE/e0ZT9+3dFRkY1bx4OBRz8Kywq3L1nG7iHt2j524HdHdp3An15n6Bv06Yi5vIp1h4BRZd9mGNVAta0eaACuXjxPBRwVSoNnoOH9r359ssnThyBgg9aPFDPtG3TAdyHDRsLqXX5imWQ8W/dSoWG0bPPjUy+eQPZDGuXQcQH0GweNfKZ8PDIjZvWGq1pz5u3yHgXMuynn70PKRRSX/Ww0PSBwnHhG3PhGmoVyMXDh42Da08Pzx++/++mTeumTh+XlpYC1cjL89+IbNka1TCilwitmJ8UHuP+yABH2peXAgW5+q1f3Jz5WYSoUHYM1iOCEfEDVoitjzNFdiK+4cLWx4mi2psqqqfU0mA9LYN/pPwrQ3yzWQ//6mP2tWv2lWTecuwq0u2qeRu23U9T7Cj7aCKeEfGpT89K3DIeBmSRBhZkkUbtIr7XIeMGTFG9w74SXbR8Tk4UYuph5aEq5kzLIpGIls/L3+nubRWqd1w/levsKtqknmi9h84KKsrVonq3Meb2jcL2j/qKDWXPjsqMZM3Olbdj+wW06uyB6j6FWZpfV6d37OX7UB8vJBI79/NmgoLfp0NQpRNdqqqUFKsb0a5iaJuqvAq2ykdTz3zvxrijl98QjUxsYVeELbcJXSVIFZcq36h0pqH/rtOw0XHePYaITnoI8xicc78X3E0r0VSWj5JB07qSN1pOMTpThShUyYJ25XeTUWz5kATIlJKSAlOacpmMr+3LVLMsn2EBDmvqjQvImpdP4UJ7+Tl1H2iPcGVRSbwL0adPn02bNvn62v+GNQoxro0FsQ6NBZEPCyIfFnXAuDYl4dFZYmwHCyIfFkQ+LIh8WEjdQiqRz35I6sOCyIcFkQ8LYh0aC5L6sCDyYUHkw4LIhwWpOrAgqQ8LIh8WRD4sYBLVy0v0zH9tImn5GIYpKipCEobYKsKCyIcFkQ8LIh8WRD4siHxYEPmwIPJhQeTDgsiHBZEPCyIfFkQ+LIh8WBD5sCDyYSHFbTEzZszIysqiaRpm2tLS0oKDg+Eh4Xrv3r1IYkhxY3Pv3r3T09OTkpJAO/gI1xkZGRqNBkkPKco3ePDgZs0qHc0Jo/YRERFIekh0W/0zzzxjevKmm5vbiBEj
kPSQqHxPPfVUWFgYY9jZCn+bN28eHx+PpId0D3WYMGGCtzd39K2zszMxri0aSG6RkZFQ5wYFBQ0YMABJEsc0XE7vz7l1vaQwV6fXs4we6XVV4zRuYC77WL7Pu+qX80Yzyh0ZVq/VcIvrabmMMj3hjL9kK8WGaLbqER+QNky+lKYpSkY5uVIuLrImzV26P+mHlAgTLPlO7sm7fCJPVayDF5ArZTKFTO6k4Dbj66qdFCG8A9+it3KD2cLPaMlbFRcZDW8Lv662VMvoWb2OUTrTYVFufcYFInuxU77zvxec2n8PEpCrl0twtJ/CRfQRHlIg/cr9wuxiRse0aOfeb4I9ZwHbI9+axSklhXr/Zt6NIsWcai5VCrNUGdfuURQ75f0WSCTiz21+OcnJVRkeF4TqF5ASc9MLBk0LbtrKxfZQ4uQD7QJDff3DPVF9BFqZfx9KGfdKmFegrWWRCPm+np8U0r6xR4AzqtdcPZTSd3zj8PZutni2td337WvJvsGe9V47oFXXkH3rMm30bJN8m5bdRrSsSWuJHqbiWGQutGeg26qFN23xbF2+nEzdvdvqVt2bogZDs/YBOg3zv033rPq0Lt+2FbfdfOp/nq1Cowi/v88WWPVmRT51IVtSqGvRpQmSJEXFufPfePivS78jR+MX4gGV6pFtVqwdWZFvz5oMhVOd7FHg4+LtfN1aArQiX1Z6qYefTVV4/SM4MqDK4XDVsTLTptMxgeE11TMrKLz/697PU25d1GjUrVrG9e75bGBAKLjfuZu0bPmYWVNXHzqy7vLff3h5Bsa0S3gyYYZMxuWDCxd/23fwW5WqILr1oz27jUU1hpOnnKKpq38WRne1eMqjUOpLTiyhEVVDwwF6vX7l6heSUs4PHfDqvBc3urv5frnq2ez7t+GWXMbtY/t5xwcd2/f98K1jY4Yt/uP4hsQrXAF35+6NjVvejO345KtzfomNeWrH7mWoJpHJZanXhGxdCsl3J1WNasy2xM20v7KyU0YPW9w6squnh9+AfrPcXL2P/rnJ6KFDm8c7tI2XyxXhzTv5+QTfTr8GjidO/eLt1Tjhscmurp4RLTo/HDsI1SSUnCrIEZpoFsq8MJBH19j5USmpiTBA2LJFmbU2iqJApuSUCrOITYOijNfOzh4qNWdeMDvnVuNGFeMizYKjUU0CT6XTCnVqrZR9bI3Z0Vapi/R6LTQ7TB3d3XyM15S5w7VLSgr8/SrmMJVKEaMj9kBZsQ4hJJ+bh6LmDGN5uPvByz87tlLhZbTxZwnIs1qt2vixtLRmjfCyDKtQCCkgJF9QuOuFwzmoZghuEqnRqLy9G/n7lnUH7+ekm6Y+s/h4N7l67ShMXfJCX71+DNUkjJbx8BXqcQn92qFRTtDyLi2skUU6LcO7tG7Z9eft7+XmZRYV5x0/teWLlRNPn/9VOFSHNr2hp7F99zJ4sBvJ506c2oJqEpgPCWvjKuDBStkHXY6sm7nQhUY1wLPjPv3zzNafNi9KvXUpwD+0U4d+j3YdKRykVcuH+/ed+efprS+/GQdV8Njhi7/+fiqqGcNxJfc1LGJbdxE62tvKcOnOVRkZyaWte4owNVpvSD6TSTG6SW+HCvixUlQ/PSVIW1rvzqe3DXWBus0jVnpc1pdHevgokk5lhD9sfm4ISvE3P0gwe0un00DLzuzJo40DWrw45TvkOH5YP/dmWqLZW1ptqULhVN1dqXB+c8FuZIHMf/KhzWv1FHsb5jq0aPkrN9omNLd0Pyc3w6y7Wl3k7Oxu9hZNy7297J+crk5BQbZOb34BYHFJgZur2bktytfH4kDc3wdTOz7uE/eklZaATVNFv36bCR24yEeboYZB6vm7jFYrXOrx2DTXMWBqY5kc3bpkffC6HlB0T1Wcp7JFO2T7TNvkJWHF90syruai+k5q4t0Jb4bb6FncNPmq12+6eLk1a++H6iOFWaqUxMwXPo6Q2TxEJ3qRxspXk2laFvlofZt4SzmbWZyvev79CKWY
VWv2LBGCad/sdLVngFtIjCNrzwfF3Rv5ObfynF3pSW+HIZHYuUDt9nX1vvV3SlWM0kUR0MLbu0ndmw9RFegy/8lWF5RSMtS+m88jA3zsiARreWRyovrIjsziAj2jZ2RyaMxBM1lOUQyrq7So09R2EQuNUdbEujlVvrSxympTU4M85ascWZqmmMpGdiAy7popsw5btmrVYGuXrTAZCx/hmuWs7lAwBoX0LMzhgDcXN1mbOO+Hn7B/Mscxi3PTrqmuny28f1ej17A6EFNTESe0ePQmQzYgJvdeTIU1Is58E0WDC8zLGN1pOWKMocrloxQsq61s6ok2vIGeqoiNN05EGTxU/kU464YypDTYJ2oe7Rr1sDvCRuq2iiQOMfGJBZEPCyIfFkQ+LIh8WBD5sPg/AAAA//8l2zVcAAAABklEQVQDABY2Tl922NThAAAAAElFTkSuQmCC", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import Image, display\n", "arch = ChainOfVerification(llm=reasoning_llm)\n", "graph = arch.build()\n", "display(Image(graph.get_graph().draw_mermaid_png()))" ] }, { "cell_type": "markdown", "id": "2fde045e", "metadata": { "papermill": { "duration": 0.014367, "end_time": "2026-05-28T02:05:49.102336+00:00", "exception": false, "start_time": "2026-05-28T02:05:49.087969+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 8 · Live run — a hallucination trap\n", "\n", "We pick a task **designed** to force baseline hallucination: ask for 5 novels by Ursula K. Le Guin that won the Hugo Award for Best Novel — but only **two** exist (*The Left Hand of Darkness* 1970, *The Dispossessed* 1975). The baseline must invent 3 to fill the quota; CoVe's job is to catch those inventions and revise." 
] }, { "cell_type": "code", "execution_count": 4, "id": "52676ab2", "metadata": { "execution": { "iopub.execute_input": "2026-05-28T02:05:49.129565Z", "iopub.status.busy": "2026-05-28T02:05:49.128822Z", "iopub.status.idle": "2026-05-28T02:07:13.053070Z", "shell.execute_reply": "2026-05-28T02:07:13.052063Z" }, "papermill": { "duration": 83.945676, "end_time": "2026-05-28T02:07:13.060325+00:00", "exception": false, "start_time": "2026-05-28T02:05:49.114649+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "GROUND_TRUTH: Only 2 of Le Guin's novels won Hugo Best Novel: The Left Hand of Darkness (1970) and The Dispossessed (1975).\n", "\n", "BASELINE_LINES: 2\n", "BASELINE_REAL_WINNERS_FOUND: ['The Dispossessed', 'The Left Hand of Darkness']\n", "\n", "VERIFICATION_QUESTION_COUNT: 4\n", "LOW_CONFIDENCE_COUNT: 0\n", "\n", "REVISED_LINES: 2\n", "REVISED_REAL_WINNERS_FOUND: ['The Dispossessed', 'The Left Hand of Darkness']\n", "CHANGES_MADE_COUNT: 0\n", "\n", "HALLUCINATION_REDUCED: 0 fewer lines after revise\n", "PRECISION_IMPROVED: revised=2/2 real, baseline=2/2 real\n" ] } ], "source": [ "TASK = (\n", " \"Name 5 novels by Ursula K. Le Guin that won the Hugo Award for Best Novel. 
\"\n", " \"Return them as a numbered list with the year of the win in parentheses.\"\n", ")\n", "GROUND_TRUTH = {\n", " \"The Left Hand of Darkness\": 1970,\n", " \"The Dispossessed\": 1975,\n", "}\n", "GROUND_TRUTH_NOTE = (\n", " \"Only 2 of Le Guin's novels won Hugo Best Novel: The Left Hand of Darkness (1970) \"\n", " \"and The Dispossessed (1975).\"\n", ")\n", "\n", "r = arch.run(TASK)\n", "\n", "# Programmatic check: which baseline lines / revised lines name a real winner?\n", "def _names_in(text):\n", " return {name for name in GROUND_TRUTH if name.lower() in text.lower()}\n", "\n", "baseline_real = _names_in(r.metadata['baseline_response'])\n", "revised_real = _names_in(r.output)\n", "baseline_lines = [l for l in r.metadata['baseline_response'].splitlines() if l.strip()]\n", "revised_lines = [l for l in r.output.splitlines() if l.strip()]\n", "\n", "print(f\"GROUND_TRUTH: {GROUND_TRUTH_NOTE}\")\n", "print()\n", "print(f\"BASELINE_LINES: {len(baseline_lines)}\")\n", "print(f\"BASELINE_REAL_WINNERS_FOUND: {sorted(baseline_real)}\")\n", "print()\n", "print(f\"VERIFICATION_QUESTION_COUNT: {r.metadata['question_count']}\")\n", "print(f\"LOW_CONFIDENCE_COUNT: {r.metadata['low_confidence_count']}\")\n", "print()\n", "print(f\"REVISED_LINES: {len(revised_lines)}\")\n", "print(f\"REVISED_REAL_WINNERS_FOUND: {sorted(revised_real)}\")\n", "print(f\"CHANGES_MADE_COUNT: {len(r.metadata['changes_made'])}\")\n", "print()\n", "print(f\"HALLUCINATION_REDUCED: {len(baseline_lines) - len(revised_lines)} fewer lines after revise\")\n", "print(f\"PRECISION_IMPROVED: revised={len(revised_real)}/{len(revised_lines)} real, baseline={len(baseline_real)}/{len(baseline_lines)} real\")" ] }, { "cell_type": "markdown", "id": "e5a0c5f8", "metadata": { "papermill": { "duration": 0.006044, "end_time": "2026-05-28T02:07:13.072389+00:00", "exception": false, "start_time": "2026-05-28T02:07:13.066345+00:00", "status": "completed" }, "tags": [] }, "source": [ "### 8.1 · Inspect each stage" ] 
}, { "cell_type": "code", "execution_count": 5, "id": "63caf1d7", "metadata": { "execution": { "iopub.execute_input": "2026-05-28T02:07:13.087317Z", "iopub.status.busy": "2026-05-28T02:07:13.085308Z", "iopub.status.idle": "2026-05-28T02:07:13.099588Z", "shell.execute_reply": "2026-05-28T02:07:13.099588Z" }, "papermill": { "duration": 0.023179, "end_time": "2026-05-28T02:07:13.101597+00:00", "exception": false, "start_time": "2026-05-28T02:07:13.078418+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "=== BASELINE (likely contains hallucinations) ===\n", "1. The Left Hand of Darkness (1970) \n", "2. The Dispossessed (1975)\n", "\n", "=== VERIFICATION QUESTIONS ===\n", " 1. Is 'The Left Hand of Darkness' a novel written by Ursula K. Le Guin?\n", " 2. Did 'The Left Hand of Darkness' win the Hugo Award for Best Novel in 1970?\n", " 3. Is 'The Dispossessed' a novel written by Ursula K. Le Guin?\n", " 4. Did 'The Dispossessed' win the Hugo Award for Best Novel in 1975?\n", "\n", "=== VERIFICATION ANSWERS (independent of baseline) ===\n", " [1] Q: Is 'The Left Hand of Darkness' a novel written by Ursula K. Le Guin?\n", " A: Yes, 'The Left Hand of Darkness' is a science fiction novel written by Ursula K. Le Guin, published in 1969 as part of her Hainish Cycle.\n", " confidence: high\n", "\n", " [2] Q: Did 'The Left Hand of Darkness' win the Hugo Award for Best Novel in 1970?\n", " A: Yes, 'The Left Hand of Darkness' by Ursula K. Le Guin won the Hugo Award for Best Novel in 1970.\n", " confidence: high\n", "\n", " [3] Q: Is 'The Dispossessed' a novel written by Ursula K. Le Guin?\n", " A: Yes, 'The Dispossessed' is a 1974 science fiction novel by Ursula K. Le Guin, part of her Hainish Cycle and winner of both the Hugo and Nebula awards.\n", " confidence: high\n", "\n", " [4] Q: Did 'The Dispossessed' win the Hugo Award for Best Novel in 1975?\n", " A: Yes, 'The Dispossessed' by Ursula K. 
Le Guin won the Hugo Award for Best Novel in 1975.\n", " confidence: high\n", "\n", "=== CHANGES MADE ===\n", "\n", "=== REVISED ANSWER ===\n", "1. The Left Hand of Darkness (1970) \n", "2. The Dispossessed (1975)\n" ] } ], "source": [ "print('=== BASELINE (likely contains hallucinations) ===')\n", "print(r.metadata['baseline_response'])\n", "print()\n", "print('=== VERIFICATION QUESTIONS ===')\n", "for i, q in enumerate(r.metadata['verification_questions'], 1):\n", " print(f' {i}. {q}')\n", "print()\n", "print('=== VERIFICATION ANSWERS (independent of baseline) ===')\n", "for i, a in enumerate(r.metadata['verification_answers'], 1):\n", " print(f' [{i}] Q: {a[\"question\"]}')\n", " print(f' A: {a[\"answer\"]}')\n", " print(f' confidence: {a[\"confidence\"]}')\n", " print()\n", "print('=== CHANGES MADE ===')\n", "for c in r.metadata['changes_made']:\n", " print(f' - {c}')\n", "print()\n", "print('=== REVISED ANSWER ===')\n", "print(r.output)" ] }, { "cell_type": "markdown", "id": "3af27ebc", "metadata": { "papermill": { "duration": 0.007799, "end_time": "2026-05-28T02:07:13.115435+00:00", "exception": false, "start_time": "2026-05-28T02:07:13.107636+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 9 · What we just observed\n", "\n", "The cells above ran CoVe on a hallucination trap: ask for **5** Le Guin Hugo Best Novel wins when only **2** exist. 
We measure whether the BASELINE invented fillers and whether REVISE dropped them.\n", "\n", "### 9.1 · Hallucination-reduction summary (reasoning model)\n", "\n", "| Metric | Value |\n", "|---|---|\n", "| BASELINE lines | 2 |\n", "| BASELINE real winners (of 2 possible) | 2 — ['The Dispossessed', 'The Left Hand of Darkness'] |\n", "| BASELINE hallucinated lines (lines − real) | **0** |\n", "| Verification questions generated | 4 |\n", "| Low-confidence verification answers | 0 |\n", "| REVISED lines | 2 |\n", "| REVISED real winners | 2 — ['The Dispossessed', 'The Left Hand of Darkness'] |\n", "| REVISED hallucinated lines | **0** |\n", "| Changes made (REVISE bullet count) | 0 |\n", "\n", "### 9.2 · Reasoning vs non-reasoning LLM\n", "\n", "| Model | Lines (baseline → revised) | Real winners present (baseline → revised) | Changes made |\n", "|---|---|---|---|\n", "| Reasoning (Qwen3-Thinking) | 2 → 2 | 2/2 → 2/2 | 0 |\n", "| Plain (Llama-3.3-70B) | 5 → 2 | 2/5 → 1/2 | 5 |\n", "\n", "### 9.3 · Patterns surfaced in this run\n", "\n", "- **🟰 Baseline was already correct** — no hallucinations to catch. Either the model knew the answer (Qwen-Thinking has strong factual recall) or it hedged. The interesting comparison is § 9.2 above: did Llama hallucinate where Qwen didn't?\n", "\n", "- **⚠️ Llama's revised answer has 1 hallucinated line vs Qwen-Thinking's 0, and it lost a real winner (The Dispossessed).** Same architecture, weaker model: by the lines − real count of § 9.1, CoVe cut Llama's hallucinated lines from 3 to 1, but the lift depends on the underlying model's verification accuracy.\n", "\n", "### 9.4 · Verbatim BEFORE → AFTER (reasoning model)\n", "\n", "**Baseline (before verification):**\n", "\n", "```\n", "1. The Left Hand of Darkness (1970) \n", "2. 
The Dispossessed (1975)\n", "```\n", "\n", "### 9.5 · The verification Q&A (executed independently of the baseline)\n", "\n", "| # | Verification question | Verification answer | Confidence |\n", "|---|---|---|---|\n", "| 1 | Is 'The Left Hand of Darkness' a novel written by Ursula K. Le Guin? | Yes, 'The Left Hand of Darkness' is a science fiction novel written by Ursula K. Le Guin, published in 1969 as part of h… | high |\n", "| 2 | Did 'The Left Hand of Darkness' win the Hugo Award for Best Novel in 1970? | Yes, 'The Left Hand of Darkness' by Ursula K. Le Guin won the Hugo Award for Best Novel in 1970. | high |\n", "| 3 | Is 'The Dispossessed' a novel written by Ursula K. Le Guin? | Yes, 'The Dispossessed' is a 1974 science fiction novel by Ursula K. Le Guin, part of her Hainish Cycle and winner of bo… | high |\n", "| 4 | Did 'The Dispossessed' win the Hugo Award for Best Novel in 1975? | Yes, 'The Dispossessed' by Ursula K. Le Guin won the Hugo Award for Best Novel in 1975. | high |\n", "\n", "### 9.6 · The takeaway\n", "\n", "CoVe's pedagogical value lives in two cells of § 9.1: **`BASELINE hallucinated lines`** and **`REVISED hallucinated lines`**. If the second is smaller than the first, the architecture worked. If both are zero, as in this run, the baseline had nothing to fix and the task was too easy for this model. If they are equal and non-zero, the same-model consistency-bias trap (§ 11.1) likely defeated the verification; the mitigation is to use a different / stronger model in the EXECUTE stage, or to compose with RAG (§ 11.3 extension #1)." ] }, { "cell_type": "markdown", "id": "7bef3845", "metadata": { "papermill": { "duration": 0.006053, "end_time": "2026-05-28T02:07:13.127532+00:00", "exception": false, "start_time": "2026-05-28T02:07:13.121479+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 10 · Contrast — plain Llama-3.3-70B on the same trap\n", "\n", "CoVe should help any model, but the lift is largest on hallucination-prone models. Compare the reasoning-LLM run above against plain Llama on the same Le Guin trap." 
] }, { "cell_type": "code", "execution_count": 6, "id": "4746897d", "metadata": { "execution": { "iopub.execute_input": "2026-05-28T02:07:13.142442Z", "iopub.status.busy": "2026-05-28T02:07:13.142442Z", "iopub.status.idle": "2026-05-28T02:07:55.301707Z", "shell.execute_reply": "2026-05-28T02:07:55.300183Z" }, "papermill": { "duration": 42.173121, "end_time": "2026-05-28T02:07:55.306448+00:00", "exception": false, "start_time": "2026-05-28T02:07:13.133327+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "LLAMA_BASELINE_LINES: 5\n", "LLAMA_BASELINE_REAL: ['The Dispossessed', 'The Left Hand of Darkness']\n", "LLAMA_REVISED_LINES: 2\n", "LLAMA_REVISED_REAL: ['The Left Hand of Darkness']\n", "LLAMA_CHANGES_COUNT: 5\n" ] } ], "source": [ "plain_llm = get_llm(provider=\"nebius\", model=\"meta-llama/Llama-3.3-70B-Instruct\", temperature=0.4)\n", "arch_llama = ChainOfVerification(llm=plain_llm)\n", "r_llama = arch_llama.run(TASK)\n", "\n", "llama_baseline_real = _names_in(r_llama.metadata['baseline_response'])\n", "llama_revised_real = _names_in(r_llama.output)\n", "llama_baseline_lines = [l for l in r_llama.metadata['baseline_response'].splitlines() if l.strip()]\n", "llama_revised_lines = [l for l in r_llama.output.splitlines() if l.strip()]\n", "\n", "print(f\"LLAMA_BASELINE_LINES: {len(llama_baseline_lines)}\")\n", "print(f\"LLAMA_BASELINE_REAL: {sorted(llama_baseline_real)}\")\n", "print(f\"LLAMA_REVISED_LINES: {len(llama_revised_lines)}\")\n", "print(f\"LLAMA_REVISED_REAL: {sorted(llama_revised_real)}\")\n", "print(f\"LLAMA_CHANGES_COUNT: {len(r_llama.metadata['changes_made'])}\")" ] }, { "cell_type": "markdown", "id": "7cf96167", "metadata": { "papermill": { "duration": 0.016412, "end_time": "2026-05-28T02:07:55.342905+00:00", "exception": false, "start_time": "2026-05-28T02:07:55.326493+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 11 · Failure modes, safety, extensions\n", "\n", "### 
11.1 · Where this breaks\n", "\n", "| Failure | Mechanism | Mitigation |\n", "|---|---|---|\n", "| **Verification confirms the wrong answer** | Same model wrote baseline & answers → same knowledge gap | Use a stronger / different model for EXECUTE; or RAG-ground EXECUTE in real documents |\n", "| **Vacuous questions** | PLAN asks \"Is the answer correct?\" instead of probing specific claims | Prompt PLAN to target specific entities/numbers/dates; reject questions that mention \"the answer\" |\n", "| **Over-revision** | REVISE drops valid claims because verification was unsure | The `confidence` field gives REVISE a low/medium/high signal; instruct it to drop only on *contradiction*, not on *uncertainty* |\n", "| **Verification cost** | N+3 LLM calls per task — expensive at scale | Cache verification answers per claim; reuse across tasks that share entities |\n", "\n", "### 11.2 · Production safety\n", "\n", "- **Pair with RAG.** CoVe + RAG is the production combination: PLAN generates questions, EXECUTE answers them via retrieval, REVISE only keeps claims grounded in retrieved sources.\n", "- **Track residual claims.** Even after CoVe, low-confidence claims may slip through. Surface the verification trace to the user / audit log.\n", "- **Per-claim citations.** Extend `_RevisedResponse` to also emit a citation per kept claim. Auditors love this.\n", "\n", "### 11.3 · Three extensions\n", "\n", "1. **CoVe + RAG.** EXECUTE node calls a retriever and answers from documents, not from the LLM's parametric memory. This is the variant § 11.2 recommends for production.\n", "2. **Hierarchical CoVe.** If a verification answer itself contains sub-claims, recursively verify them. Bounded by depth budget.\n", "3. 
**Confidence-weighted REVISE.** Track confidence per kept claim; drop low-confidence claims unless verified by multiple independent questions.\n", "\n", "### 11.4 · What to read next\n", "\n", "- [**01 · Reflection**](./01_reflection.ipynb) — sibling single-call critique loop.\n", "- [**21 · Self-Consistency**](./21_self_consistency.ipynb) — orthogonal hallucination-reduction strategy (sample-and-vote).\n", "- [**23 · Agentic RAG**](./23_agentic_rag.ipynb) — ground answers in retrieved documents (composes with CoVe).\n", "- [**32 · Constitutional AI**](./32_constitutional_ai.ipynb) — critique-and-revise against a written constitution.\n", "\n", "### 11.5 · References\n", "\n", "1. Dhuliawala, S. et al. *Chain-of-Verification Reduces Hallucination in Large Language Models.* 2023. [arXiv:2309.11495](https://arxiv.org/abs/2309.11495)\n", "2. Manakul, P. et al. *SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection.* EMNLP 2023. [arXiv:2303.08896](https://arxiv.org/abs/2303.08896) — sibling self-checking approach.\n", "3. Wang, X. et al. *Self-Consistency Improves Chain of Thought Reasoning.* ICLR 2023. [arXiv:2203.11171](https://arxiv.org/abs/2203.11171) — see nb 21." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.0" }, "papermill": { "default_parameters": {}, "duration": 145.504574, "end_time": "2026-05-28T02:07:58.368912+00:00", "environment_variables": {}, "exception": null, "input_path": "all-agentic-architectures/notebooks/20_chain_of_verification.ipynb", "output_path": "all-agentic-architectures/notebooks/20_chain_of_verification.ipynb", "parameters": {}, "start_time": "2026-05-28T02:05:32.864338+00:00", "version": "2.7.0" } }, "nbformat": 4, "nbformat_minor": 5 }