{ "cells": [ { "cell_type": "markdown", "id": "9cfc815e", "metadata": { "papermill": { "duration": 0.006208, "end_time": "2026-05-27T13:09:08.610164+00:00", "exception": false, "start_time": "2026-05-27T13:09:08.603956+00:00", "status": "completed" }, "tags": [] }, "source": [ "# 15 · RLHF-style Self-Improvement — editor critique + persistent archive\n", "\n", "> **TL;DR.** Generate → editor-critique → revise loop (like Reflection nb 01) *but* with a **persistent archive of accepted outputs across runs**. The next task's generation sees recent archived examples as positive priors. Quality compounds over the architecture instance's lifetime.\n", ">\n", "> **Reach for it when** you run the same agent against many similar tasks and want quality to improve over time.\n", "> **Avoid when** each task is one-shot (no future calls to benefit from the archive).\n", "\n", "| Property | Value |\n", "|---|---|\n", "| Origin | *Misleadingly named* — NOT real RL with human feedback. Editor-feedback loop (Madaan 2023) + persistent archive pattern. |\n", "| Loop body | generate → critique → revise (max_iterations) |\n", "| Archive criterion | `composite_score >= target_score` (pure Python threshold on a score composed from the editor's objective features; see § 3.2) |\n", "| Persistence | On the architecture instance (`arch.archive` list) |\n", "| Cost | 2-4 LLM calls per task; ARCHIVE GROWS across calls |\n", "\n", "The name is a historical artefact — the original notebook in this repo was called \"RLHF\" but the pattern is closer to *self-distillation with positive examples*. We keep the name for backward compatibility with the existing 3.4K-star audience." 
] }, { "cell_type": "markdown", "id": "3c8f1584", "metadata": { "papermill": { "duration": 0.006986, "end_time": "2026-05-27T13:09:08.620893+00:00", "exception": false, "start_time": "2026-05-27T13:09:08.613907+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 2 · Architecture at a glance\n", "\n", "```mermaid\n", "flowchart LR\n", " A([task]) --> G[Generate
prompt includes recent ARCHIVE examples]\n", " G --> C[Critique
editor: objective features + Python composite score]\n", " C -->|score < target
and iter < max| R[Refine
address critique]\n", " R --> C\n", " C -->|done| F[Finalize
maybe archive if score >= target]\n", " F --> M[(arch.archive
persistent list)]\n", " F --> Z([final output])\n", "\n", " style G fill:#e3f2fd,stroke:#1976d2\n", " style C fill:#fff3e0,stroke:#f57c00\n", " style F fill:#e8f5e9,stroke:#388e3c\n", " style M fill:#fce4ec,stroke:#c2185b\n", "```\n", "\n", "**The architecture is stateful across `run()` calls.** The edge from Finalize into the archive shows the side-effect: each accepted output becomes a positive example that later tasks can see via the `_generate` prompt." ] }, { "cell_type": "markdown", "id": "622a569e", "metadata": { "papermill": { "duration": 0.004011, "end_time": "2026-05-27T13:09:08.631972+00:00", "exception": false, "start_time": "2026-05-27T13:09:08.627961+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 3 · Theory\n", "\n", "### 3.0 · Why the editor's score is computed in Python, not by the LLM\n", "\n", "Earlier iterations of this notebook had the editor LLM emit a single `quality_score: 1-10`. **It came back 9/10 on every task**: the same LLM-as-Scorer flatness pathology documented in Mental Loop (nb 10 § 11) and Ensemble (nb 13 § 11).\n", "\n", "The fix — applied here — is the **multi-dimensional deterministic-scoring** generalisation of Mental Loop's `scoring_fn`. 
The editor now commits to several **objective features** (each a boolean or count), and Python composes the deciding score from them:\n", "\n", "```python\n", "from pydantic import BaseModel\n", "\n", "class _EditorCritique(BaseModel):\n", "    is_on_brief: bool\n", "    word_count: int\n", "    has_concrete_imagery: bool\n", "    avoids_cliches: bool\n", "    is_engaging: bool\n", "    overall_score: int  # preserved for comparison; NOT used by Python\n", "    critique: str\n", "\n", "def _composite_score(features, wc_range):\n", "    # Booleans count as 0/1, so each weight is earned only when the feature is True.\n", "    score = 4 * features['is_on_brief']\n", "    score += 2 if wc_range[0] <= features['word_count'] <= wc_range[1] else 0\n", "    score += 2 * features['has_concrete_imagery']\n", "    score += 1 * features['avoids_cliches']\n", "    score += 1 * features['is_engaging']\n", "    return score  # 0-10\n", "```\n", "\n", "Python's score has REAL spread on diverse tasks because it depends on five INDEPENDENT features (four booleans and a word count) that the LLM must commit to one at a time. The Llama compression that flattens a single `quality_score` doesn't flatten five independent decisions. § 9 compares the LLM's raw `overall_score` against the Python composite; the composite usually has the wider spread.\n", "\n", "### 3.1 · Difference from plain Reflection (notebook 01)\n", "\n", "Plain Reflection (nb 01) treats each task in isolation: generate → critique → refine → output, then throw away the work. Quality on task N+1 doesn't benefit from quality on task N.\n", "\n", "RLHF-style self-improvement *keeps the accepted output*. After a task's loop produces an output that passes the editor's bar, that output is **archived**. 
The next task's `_generate` prompt includes the 3 most recent archived examples as positive priors:\n", "\n", "```python\n", "prompt = f\"\"\"# Task\n", "{task}\n", "\n", "## Recent high-quality examples ...\n", "{archive[-3:]}\n", "\n", "Match or exceed these.\"\"\"\n", "```\n", "\n", "This is the *positive* version of Reflexion (nb 18), which stores *negative* examples (verbal reflections on failures).\n", "\n", "### 3.2 · Archive gate is fully deterministic now\n", "\n", "After the multi-dimensional refactor in § 3.0, the archive gate is **pure Python**:\n", "\n", "```python\n", "should_archive = composite_score >= self.target_score\n", "```\n", "\n", "No `accept_for_archive` boolean from the LLM is consulted. The composite score itself already incorporates objective LLM judgements (booleans about the output's properties) via the deterministic composition function. Two layers of LLM judgement collapse into one, followed by a Python threshold.\n", "\n", "### 3.3 · Why the prompt shows 3 archived examples, not all\n", "\n", "Including the *full* archive in every `_generate` prompt would (a) inflate context length and (b) bias each new task toward the same template. We take only the *3 most recent*: recent enough to be relevant, few enough to leave generative room. Extension idea (§ 11.3): score archive examples by similarity to the current task and pick the top-K.\n", "\n", "### 3.4 · Where this sits\n", "\n", "| Pattern | Persistence across runs? | Stores what? 
| When |\n", "|---|---|---|---|\n", "| Reflection (nb 01) | no | nothing | quality matters, one-shot |\n", "| **RLHF self-improvement** *(this nb)* | **yes** | **accepted outputs** (positive examples) | many similar tasks, quality compounds |\n", "| Reflexion (nb 18) | yes | verbal reflections on failures (negative examples) | learn from mistakes across episodes |\n", "| Episodic + Semantic Memory (nb 08) | yes | conversations + facts | personal assistant continuity |\n", "| Voyager (nb 29) | yes | learned *skills* (reusable functions) | open-ended exploration |\n", "\n", "### 3.5 · What goes wrong (you'll see in § 9)\n", "\n", "1. **Archive bloat.** Hundreds of accepted outputs → context too long. Mitigation: cap at N most recent OR retrieve by similarity.\n", "2. **Mode collapse.** Generator over-imitates archive style → all outputs sound the same. Mitigation: include explicit \"vary the structure\" instruction.\n", "3. **Sycophantic editor.** Editor accepts everything → archive grows to include mediocre work → quality decays. Mitigation: Python score threshold is the backstop.\n", "4. **Editor inconsistency.** Same draft scored 7 one round, 9 next round. Reduce via lower temperature on the editor." 
] }, { "cell_type": "markdown", "id": "9395f156", "metadata": { "papermill": { "duration": 0.003712, "end_time": "2026-05-27T13:09:08.639439+00:00", "exception": false, "start_time": "2026-05-27T13:09:08.635727+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 4 · Setup" ] }, { "cell_type": "code", "execution_count": 1, "id": "2e7387bd", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T13:09:08.644940Z", "iopub.status.busy": "2026-05-27T13:09:08.644940Z", "iopub.status.idle": "2026-05-27T13:09:09.818113Z", "shell.execute_reply": "2026-05-27T13:09:09.818113Z" }, "papermill": { "duration": 1.174104, "end_time": "2026-05-27T13:09:09.818113+00:00", "exception": false, "start_time": "2026-05-27T13:09:08.644009+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
Provider: nebius  ·  Model: meta-llama/Llama-3.3-70B-Instruct ─────────────────────────────────────────────────────\n",
       "
\n" ], "text/plain": [ "\u001b[1;36mProvider: nebius · Model: meta-llama/Llama-\u001b[0m\u001b[1;36m3.3\u001b[0m\u001b[1;36m-70B-Instruct\u001b[0m \u001b[92m─────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from agentic_architectures import get_llm, enable_langsmith, settings\n", "from agentic_architectures.architectures import RLHFSelfImprovement\n", "from agentic_architectures.ui import print_md, print_header, print_step\n", "\n", "enable_langsmith()\n", "print_header(f\"Provider: {settings.llm_provider} · Model: {settings.llm_model}\")" ] }, { "cell_type": "markdown", "id": "d31e0158", "metadata": { "papermill": { "duration": 0.0, "end_time": "2026-05-27T13:09:09.826777+00:00", "exception": false, "start_time": "2026-05-27T13:09:09.826777+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 5 · Library walkthrough\n", "\n", "Source: [`src/agentic_architectures/architectures/rlhf.py`](../src/agentic_architectures/architectures/rlhf.py).\n", "\n", "Three things make this architecture special compared to nb 01 Reflection:\n", "\n", "1. **`self.archive: list[dict]`** — initialised empty in `__init__`, **mutated across `run()` calls**.\n", "2. **`_generate` prompt embeds `self.archive[-3:]`** as positive examples — the LLM sees its own past good work.\n", "3. **`_finalize` archive gate** is a pure Python threshold, `final_score >= target_score`; no `accept_for_archive` flag from the LLM is consulted (§ 3.2)." 
] }, { "cell_type": "code", "execution_count": 2, "id": "40a5f360", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T13:09:09.841557Z", "iopub.status.busy": "2026-05-27T13:09:09.841557Z", "iopub.status.idle": "2026-05-27T13:09:09.865406Z", "shell.execute_reply": "2026-05-27T13:09:09.865406Z" }, "papermill": { "duration": 0.029872, "end_time": "2026-05-27T13:09:09.865406+00:00", "exception": false, "start_time": "2026-05-27T13:09:09.835534+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"description\": \"Multi-dimensional objective features the editor must commit to.\\n\\nThe score that drives loop continuation and archive gating is COMPUTED IN\\nPYTHON from these features, not from the LLM's `overall_score` field \\u2014\\nsidesteps the LLM-as-Scorer flatness pathology (same fix as Mental Loop).\",\n", " \"properties\": {\n", " \"is_on_brief\": {\n", " \"description\": \"True iff the output satisfies EVERY explicit constraint in the task.\",\n", " \"title\": \"Is On Brief\",\n", " \"type\": \"boolean\"...\n" ] } ], "source": [ "from agentic_architectures.architectures.rlhf import _EditorCritique\n", "import json\n", "print(json.dumps(_EditorCritique.model_json_schema(), indent=2)[:500] + '...')" ] }, { "cell_type": "markdown", "id": "e165824d", "metadata": { "papermill": { "duration": 0.005792, "end_time": "2026-05-27T13:09:09.876620+00:00", "exception": false, "start_time": "2026-05-27T13:09:09.870828+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 6 · State" ] }, { "cell_type": "markdown", "id": "ddc1f9c3", "metadata": { "papermill": { "duration": 0.006544, "end_time": "2026-05-27T13:09:09.887997+00:00", "exception": false, "start_time": "2026-05-27T13:09:09.881453+00:00", "status": "completed" }, "tags": [] }, "source": [ "| Field | Set by |\n", "|---|---|\n", "| `task` | caller |\n", "| `draft` | `_generate`, `_refine` |\n", "| `critique` / `quality_score` 
| `_critique` |\n", "| `history` | `_critique` (appended each round) |\n", "| `final_output` / `archived` | `_finalize` |\n", "| `arch.archive` | `_finalize` side-effect (**persists across run() calls**) |" ] }, { "cell_type": "markdown", "id": "3d9e0607", "metadata": { "papermill": { "duration": 0.005421, "end_time": "2026-05-27T13:09:09.898658+00:00", "exception": false, "start_time": "2026-05-27T13:09:09.893237+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 7 · Build the graph" ] }, { "cell_type": "code", "execution_count": 3, "id": "2229b9f9", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T13:09:09.914121Z", "iopub.status.busy": "2026-05-27T13:09:09.912371Z", "iopub.status.idle": "2026-05-27T13:09:12.465446Z", "shell.execute_reply": "2026-05-27T13:09:12.463949Z" }, "papermill": { "duration": 2.560221, "end_time": "2026-05-27T13:09:12.465931+00:00", "exception": false, "start_time": "2026-05-27T13:09:09.905710+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAANsAAAGwCAIAAACvpYc6AAAQAElEQVR4nOydB3wUxdvHZ/fucukVSCGN0BFIwNCUv5TQRJSOgBCqFEUFRaQjTQEBQRARUJD2IhKFKE1AqYL00GsIhJBGernk7nb3fe42uRzJZZMgtzd3O1/5nLuzs3OXud89M/PM7DNyjuMQgYANckQg4ARRJAEviCIJeEEUScALokgCXhBFEvDCuhWZl85cPpmZHF+g1SBtIaMp4D1ZLEI0Lec4luJYRMGBltIl0wj+x8JFxFGI4vSvlAyxTFEK/I+iEdwChzTNsQxV9DY0h1iq6GbIoYfT5UJwO8foT2mOYovywzsi/VsbTos+gB6FkpIpKKWj3DtA2ay9p70zIhhDWaM/UpXL/bE+ITWhgGE4hR1t5yBzcJJTFKdW6dVB6fRCy5FekRx8/YxG9zfSNKgQErmiHHooWpeHV6juVE5zWpCd/kBTrCkZQvqCi6RZdKNOu5SM4hj9nTSF2KIyablOjlzxqUxBM4aiQJH2NMMgTQFboGIhXaGgPXyUAybVRAQ91qfIjZ/H5WZpXT3k9cLd2nT3QFbOP9EZt85l5edpvbztB33mjySPNSny0PaUW+eyvXztBk8JRDbHzuWPweq/9IpH+75eSMJYjSI3L3ioLuSGzwyWK5GtkpWs/XllvIu7fNCUACRVrEORu755zGjQ259IolHb+mW8Rw3FG6N8kCSxAkX+OCfO3lE+WEp9rM3zHyIZFTndBjsnFUIjvPm/pY9hHC0pOQKRs4JgSB+16gmSHlgr8uzBzKxU9aApUhyBDp0emPxIdedCHpIYWCvy/OG0Dv0l2p0CWnXxPPJzEpIY+Coyem2io5O8frgjkiovd/aQyaiDm5ORlMBXkY/v5bfoJmnPHBDWzj32ei6SEpgq8uyf6RSFXmot6qTvzp0758yZg6pO586dExISkBlo2c0T5s1vnZWQKDFV5L2LuV5+YrvCb9y4gapOYmJiRkYGMhvOHvLLx9ORZMB07U92hrZ5R09kHuLi4tauXXvhwgXwxTZt2jQyMjIsLGzMmDEXL16Eq3v37t26dau/vz+8nj59+v79+9WqVWvXrt348ePt7e0hw5QpU2Qyma+v7+bNm8eOHfv9999DYs+ePSHPsmXL0IsmsK7jnZgcJBkwVSSj5cJec0dmQK1Wg/hatGixatUqENb69esnTZq0f//+devWDR8+PCgoaO7cuZBtw4YNmzZtWrBggbu7e05OzldffQWZP/zwQ7ikUCju3LmTl5e3fPnyJk2aNGzYcOLEiXv27KlZ0yzrd+o2d7lxPhtJBhwVmXBPTdPIzgGZg4cPH6anpw8aNKhBgwZwumjRIjCNWq22VLYhQ4ZERETUqlWLP42Jifnnn394RVIU9eTJky1btvAm09zUrGOPWE6tRnZ2SArgqMjsDA2ikJkIDAz08PD4/PPPu3fv/vLLL4eGhoaHh5fNBoYQmmwY6IA55PXq6VnSiwCliiNHHuhdpMSr/WtLQpI4jmw4hl+xbRaUSiW01G3btt2+ffuoUaN69eq1b9++stmgTYd2vHfv3rt37z5//vyIESNKFYJEhNItD2aQNMBRkS4edmZd/xEcHAw9vz/++AM6gnXq1Jk9e/atW7eMM8C7R0VFvf3226BIHx/dpBF0JZHlgMqo5m2eTgx+4KjIgPpKhuU4LTIHMNCOjo6GA2h2X3vttcWLF8vl8ps3bxrn0Wg0KpWqRo0a/CkMho4fP44sREq8Blyz9i5IImDqj4TZsyunzWKWsrKy5s2bt2LFivj4eBjlbNy4EbqJ0JuESwEBAdeuXTt37lxubi7YURDu48ePMzMzIT+4h7Kzs2F8XbZAyAmvhw4dgnuRGbhzPofGfYXWiwTTv9XRRR57xSyKBPFNnz4d3D3QIvft2/fSpUvgmwwJCYFLffr0gXH0+++/f/fu3S+++AKMaL9+/aCj2bJlywkTJsBpp06dYJR
dqkDwXL755ptQCHQ9kRl4eCvX2U2BJAOmK3aP7Xp663z22EUhSPJ8O/leyy5eLbpY/TNulQRTG9muXzV1IZNwT4WkzY0zORyLpCNHhHMEAY8adn9uSx4xJ7i8DAMGDEhJSSmbzjAMTdPQ/pq8C7w5MA2DzMDly5dhCG/ykvBH+uuvv+hyuopn9qf515HWejysn7NZNelu5PRabtVN/2ySk5Phm0ZVxM/PD5mNsr3MylDeR3pwTbVvU8L7S+sgKYF1lJV6zVyjVj8eOTfY5FVvb2+EGS9W7ge3JDZtK6H2mgdrv0LXSG9o56LXSfEBqF0rExxdZP/rJbk1y7h7ukbMDU56UHh811MkJQ5sfJqZoomcGYSkh3VEENgw80FAfeeuQ6sjCbD7u8ScdM3QGVJ8WBtZUZSV9TNiHV0V73xm4+FHNi94yGg4aBmQVLGmSFQ7lsSnJRc2buXebkA1ZHMc2pJy53JO9QDlgImSjpBmZdH6bp/POxaVrFGzfiGOEW97u1aTISsnLUF97LeniXH5CgXdLdIvsKF4yy7xxCojmp4/knnleGZ+jlauoOwd5E7ucidXuUzBqQtKAofK5BRrFFaUDzgKPmr4e+ESozWOPsrx8XBpOWKLFxzJ7SitWpdHtzSRKwqAKpPTjJbl72L1JfC3UDRF0xyjLXpfvnBDJFXwi3NcSeFyOxoxlCqXycnS5GVpWAY5uyvCIzwbt5XM8h5BrFKRBi78mfngVn5+tkajBlVxGnXJ3yKTIZYFrRU7E6jiQM2c7pLBs07LdKrVBdnVHVMsYwiMixiNbqGkLt5u8fJhw42GA1ofNBp0DqJkn70kkyNeo0WKpPmA00iupOQySmEvc3aT16xj36Kz5DyOwli3Is3NihUrqlWrNmTIEEQQC7JXgxBarVYuJ1UkKqS6hSCKFB9S3UIQRYoPqW4hNBqNQiGh9ds4QBQpBLGR4kOqWwiiSPEh1S0EUaT4kOoWAhRJ+pEiQxQpBLGR4kOqWwhQpExm9Ys5rAuiSCGIjRQfUt1CEEWKD6luIYiHXHyIIoUgNlJ8SHULQRQpPqS6hSCKFB9S3UKQfqT4EEUKQWyk+JDqFoIoUnxIdQtBFCk+pLqFIIoUH1LdQpC1P+JDFFkuDMOQZRbiQxRZLhzHBQTYeOArDCGKLBcwkHFxcYggLlLau6eKUBRF0/RzhDon/BeIIoWAgXbZjY4JZoUoUgiiSPEh/UghiCLFhyhSCKJI8SGKFIIoUnyIIoUgihQfokghiCLFhyhSCKJI8SGKFIIoUnyIIoUgihQfokghiCLFhyhSCKJI8SGKFAIUSVZaiAxRpBDERooPUaQQRJHiQ/b8MkGXLl1SU1MpPVwxTZs23bx5MyKYGbIazQQtWrSg9fCLdmUymaur69ChQxHB/BBFmgDE5+fnZ5wSEhLSuXNnRDA/RJEmaNCgQZs2bQynCoWiX79+iCAKRJGmGTx4cM2aNfnjwMDAHj16IIIoEEWaJjg4uG3btkg/3O7fvz8iiAXWY+37MaoH13JV+Rr+1LBpOgw5WFb/sSn9Fu78IY04FhXn1O3RzueRySlGyxmXYMhpKJCHTzckqtUFly5fgWJahLeE8U1Jfn7zePTsJzEuhuYMxVI0xekzGOc0fl9TJRTh6KKs18wxoL4DkhK4KpJBP8yN0xQyCjuZusDw7YE8dP8vER+lfy2jSN0xKsps/PUX5aFM3FJUGvfMu7AMBzVEFbckpW43UcKznxPpfzJc2fcyzmCyBD129rRazdjby0bMDUaSAUdFwrzduumx9Zt5tnjdHUmek7ufPrqRM3ZxLSQNcFTk91MfvNLdJzhUWq2VAFeO5dz4N+3dhcFIAmA3svlzS4qdnYzI0Zim7Vyg9T/xWwaSANgpMiW+wMWTzLaXxsldnnA/F0kA7BRZmM8iOfFJlYZl2YI8Saz5wM4aMQzHkuU2ZWA1HMtQSAKQ9pGAF9gpEvxzkjAFhHLA0kZSRJPSBTtFwgQ
GWURcFlpO0QyLJAB+itT9RxRZGlYLAz4ysrEE+oXbpNWWLvjZSGIfpQ1+IxsOEU1KGeKPJOAFUaR1QCsoGSOJxoN4yK0DmEVkpDG3iqX3h4y1JQx+q2xYGwmzMXfe1H379yBCFSHrvszF7ds3EKHq2MLIJiMj/ctFs6/fuBIYENyzZ//Hjx+dOPn3Txt3waX09LQ13y2/dj2moKCgRYs2kUNGBwQEQfqDB/dHjn57zbc/bd++8eSpo9Wr1+jQvsuYdz/gty++fv3KT5vX3bp13c3do03r/w2LHOPk5ATpUb/u2P5/GydNnDbn8ym9eg344P3Jp0+f+Ovvg1euXsrOzmrYoPHQoaObhYVDzg4Rutevls7/bu3Xv+85CscHDv4e/XvUgwf3atWq07FDl759BlVpIkD3RKU0wgZiZyN1I5sqdiOXLJ33KD7uqyVrFsxf/u+/p+Af/zArwzCTPhl7OebCpInTf9zws4e753vvD0t48hjpw1TA67LlCyIiuv154PSMaQt2/rL176OHIPFxQvzkKe8VFBasXrVx/tylsbF3J308ho+QZmdnl5+fFx29a9rUeb17DgCVL/xyZmFh4dTP5n6xcEVgYPCMmZPgNwA5D+w7Ba+fTp7Fy/HwkQOLl8ytV7fB9q3Ro0e9vytq++o1y1BVYHSziJIYa+PXalMcXRVFZmVlnjlzckD/oY0aNvbyqvbJxzOTkp7wl65evfzoUdz0afNbtXzF09Nr/LiJrm7uUVHbDfe2e61T+3adQJ2hoc39fGveuXMTEg8f3q+QK0CLoLDg4JDJn8y6e+822FGk300WVDhw4LBOEd38/QPt7e03rNvxycczwC7Cv3FjJ6pUqqvXLpf9kPv27W7atNnEj6Z6eHg2b9ZixLBxu3fvhE+OCGXATpEcQ1Vpjcv92Lvw2rhxKH/q7OzcvHlL/hjEAWoDBfCnoKew0Jdjrlw03FuvXkPDsbOzS25uDtI12TENGrzk5lb0YK6Pj6+fnz+0y4acDeq/ZDgGk7lq9Vf9BnSDZvr1N3QxMDIzSz+fxbIsdBtahJcEEmrWrAUk3rx1HRHKYPX9yJycbHh1cnI2pLi6uvEHoDCNRsN36Qy4u3sYjvnGvRRw163bN0rdlaFvi3mg7eYPkpOTPpo0unmzlrNmfNGoURNQfOeurcsWqFar4WP88OMa+GecnpUpiWcLq4rVK1KptIdXjVptSMnITOcPoBF3cHBYuOBr4/wyWiZcoKdXtSZNwkYMH2ec6OZqIpbB0WOHQG3QiYR3QaasIw807o6Ojl06v/HaaxHG6QH+QajS6LrXNFmNZgmqOrIpGjvH3Yc+H9JZuNyLF896e/vCce3a9aBjV6OGT00/fz7zk8QEdzcP4QJrh9T989De0KbNDRY0Li4Weo1lc8L42sXFlZcjcOz4kXLLrF0vJzeHH4YDYDITExPgB4MqDR/nF0kADP2RHEVX4VOB2oKCaoGzBgbRIMcVK7/09S2Ksvdy85YtW76ydOl8aF5hmZYt+gAAEABJREFUGLF7zy/jxg89cCBauMB+/d6BTh6MhWEQEx//8Pt134CfKPbBvbI5Q0LqpqU9BZ8OjMT/PfsP/BKg95mSkoR0llsJHqXz589cunwerr47asKpU0fBYQ4lw3hr3vxpH08eB7pElYejdP8kAH4jG5Ziq7h8f8rk2WDPhkb2BjcNDFYavxQKg2X+0pcLV7Rr12negmm9+nT69bcdnTq93qfPQOHSXF1cf9jws4O9w9jxQyKH9wXnEThxwHFTNmdEx65Dh4zavGU9dB9hCP/hB1M6d+q+/f82Lf/6C7j6zuCRFy+dmzX7E1WBCroB69Zuu3LlUu++ncG1lJeXC44qQ3+UYAx2U3brpj1w91a8PsK/8reA/QN75u3tw59OmzFRLpPPn7cU2RC7Vz/UFHIj5wUjW8cWZhFhBhmsI8zTgDS3bP3hwoV/33rL1oI0Q0eGlpGRjSWAYQ1VxZ/JnDmLv1o
6b/2G1ampyUGBtebMWtQivDWyLVgOOjOSmEbEbzUapetKVukWN1e3BfOqNilnfeg6V8RGWgTdqIY8aSNdyFMNBLzAcaWFRJqnKkHb0bQ0rAeGT8dSpNUuC6tmWfKcjUV4jvWRBFsCx5gWLDGREgbLeW1iI8tAK2gZ6UdaBp1DEhFKwWpY8rw2gWABiCIJeIGdIpUOlNyOPEVeGjtHGS2TRIxd7L57Byd5QY4kqr5KFOaxTq6SaNCwU2RoO8+cDDUiPEtetvbVN72RBMBOkfXDHd2q2e1aFo8IxexcGlejprJ6gAxJAEzDPh3ekRp3Pd83xDGgvjNj8nkUw8bV/JluOpwrm26cQlMUW3zJdC5klMrPZRrNaHL6+43escRJVXreU/dZKEO5hip+9hajD/xMXOGSXHJK/uhObuKD/IZt3F7tUcEDazYDvoHITkSl37uarS5gNYWV6FYavkeBdRrPKMKUJI2UUeFyD+MMXPHdRad6uZV902fKfCZdd0fZPAoFrXSiG7V0bymlfcZtJDSemVi5cqWnp+fQoUMRQSyIP1IIrVYrl5MqEhVS3UIQRYoPqW4hNBoNH9ePIBpEkUIQGyk+pLqFIIoUH1LdQhBFig+pbiGIIsWHVLcQZGQjPkSRQhAbKT6kuoUgihQfUt1CMAxDFCkypLqFgH4kUaTIkOoWgrTa4kOqWwiiSPEh1S0EUaT4kOoWgihSfEh1C0E85OJDFCkEsZHiQ6pbCKJI8SHVLQRRpPiQ6haC9CPFhyhSCGIjxYdUd7nApDZN0xSJryouRJHlwrJsWFgYIogLUWS5yGSyS5cuIYK4kEiN5cLv+A6WEhFEhChSCBjWwOAGEUSEKFIIokjxIf1IIYgixYcoUgiiSPEhihSCKFJ8iCKFIIoUH6JIIYgixYcoUgiiSPEhihSCKFJ8iCKFIIoUH6JIIYgixYcoUgiiSPEhihRCoVAQRYoMUaQQxEaKD1GkEESR4kP2/DJBs2bNkH5HQ8MjDQzD1K5dOyoqChHMDFmNZoI2bdrQeqhiHB0dBw0ahAjmhyjSBJGRkV5eXsYpAQEBvXr1QgTzQxRpgtatWzdu3NhwqlQqe/fuTR6TFQeiSNMMGzbM19eXP/b39+/RowciiAJRpGlC9SD9cLtbt27Ozs6IIArW1hIx6M5VFavRlJuh7FbuemhEsxSLuCrcFdEqMjVWIZPJmgR3vXUu28RNyFR5VPEO8OU4MUzfBd54maJ2cwckeazH+8OgTQseqnK0lIzSqqv+xCqtF4JofytV5feS2dGI4Rzd5cNmBiEJYzWKXDPlfkBdl/YDaiAbRo2O/JKU9DB/3OIQJFWsQ5HfTYl9Y1Sgh48kRrt3zqnOH0kc+6VERWkFI5tdKxNcPRUSkSNQr4WDvaMsem0SkiRWoMiMVE3Nui5ISvgEOqYlFiBJYgWGR6tmnN2k5aWys0eFhQySJNagSA0H/5CU0LKI0SBpQmbGCHhBFEnACytQJE1zFE0WcUoFK1Aky1IcK61g4DAHKdnw56TVxhGYtZDs0n6iSAJeWEM/kqJIP1I6WIEidS0YRzaVkQrWoEgxV5HhAYWk+wu0AkXqRp2UxCQJDi8y1sYWvY2U1vfDsdIda1vBCgbz2cievSM2b9lQ+XSCCFiBIs1nI98eMLRpk2b8ce++nZ8kJpRNJ4iMNfQjzdZoDx40nD9ISkrMzMwom04QH2uwkVVvtLNzsr9aOr9DRHivPp0WLJyRnKxbjx0bew9Szpw52W9At9FjdCFT+Nb50uXzg955E07fGdJz5uxP0LOt9pG/Dg4Z2gtufG/C8MSkJ3Bw+MgBSN/x8+bX32hreEd4C7h06tQx/vTAwd8hP2SA111R26v66IiUZxFtcCWsVqudOu3Dp2mpy5et/WDCpympyVOnfwiJCoUCrm7eugEa5U8+nmnI3yws/MuFK+Bg29Y9C+YtMy7q0aO4hV/MjIjotmf3XyNHjP/iy1lI/wS38AcAyS5eMrd
e3Qbbt0aPHvU+KHL1mmWoKpBZRLwBC1mVOZsz/568efPaTxt3BQYGI13InqCdv2xNT0/jA521CG/dv987lSzq4J9/uLt7RA59VyaThb/cKj3t6bVrMRXetW/f7qZNm038aCoce3h4jhg2bsnSeUMGj4RjRKgIaxhr63qSVWjD7t+/6+joyMsRAFs1c/qCGjW8i08bVr6oe/du16/fCOTIn77UWBflQrgJZln22vWYFuFtDCnNmrWAxCtXyVbdlcJaVlpUwUbm5eUqlfblXbVTKlGlgeFOzZoBhlMH+4pjTqjVao1G88OPa+CfcXpGRjqqEsRDji06k1SV9ZGOjk4qVT6YJX7L9v+Ci4trobrQcJqvyi8vJ8MWPahlb28PFrpL5zdeey3COIOfrz+qNJSEpxFtcGTToH6jgoKC23du8qcwOpn48RhoylHV8fHxu3PnJoibP42JuWC4pFDYFRYWGmJCP3r4wHCpdu16Obk5MGDi/zV+KdTLs5qh21AZpBz32Br6kfAZ6SoE+gkPbw1N7bp135w4+fe582dWrFyUmpIcFFRL4JYAfafz6NFDN25eM05v167T06epa777GpQHbiMYIRkuNWrUBDqU4OVBetfP9h2bDJfeHTXh1Kmj+/bvASlfvXp53vxpH08eB605qgpkFhFfYJIXsVX4nOCdWbpkDcuxs+d8OuWzCfYODl9+sVLYZVPTz79b1zc3blq7fv0q43QYmI8d8+Hp08c7d20NbqARw8cZLjVs8NL4cRNB9+CGnLdg2qgR76HiQU+TJmHr1m67cuUSzANNnvIe9GsXzF+urEr/VcpYQdyfVR/fa9m5WqNX3JGlgYEOiGz2rC87tO+MzMmZfal3zme/v6w2kh7WMNYGd7HUOvrkyS+coZD0XCFkzgZrKL2VxACYv/n7yHlEMCfWoUhaWoGoJI01eMhZxDHS60dK9fFLq5hFlNxDJ7o/V6qPX1qFIimpdfPJajSs0e1LSPqRksE6Wm2ywa10IE/H4gj4FiTrXrAO74/URjYsi9iqbyJlG1jJLCJptCWDNbTaFEdJLfCPhLECRSrklFwurV6VTEYp7CTakbQGRdrJstKltZeGKpeR20nUQ24FP8TqfvaP76mQlEh9VOAb5IQkiRUosud7vgV56ivHcpA0OLE7Xctwr4+sjiSJ1exmvHZqrJe3fYvO1bwC7JCNkvRAfelIamaaeswXtZBUsZ4d3xHavig+O0PDsYjRltozUPdXPOuzpPSedaq8QTpX/iJgoUuckGdU9xHKD1AE/gJTF0t2hqdkMKCRuVe3Gzi5Cs/R2h7WpEie3HTEMEWKfBAXO3PGrO3bt+n+BqO/Q6dEruTrLoquVpxBdwi9FYMLWq+VkrtpfU79+cxZM9zc3D6dPMVQbMniYV6anNExKpIzZZzHUK4MIabMhzHKwDFMj34dN2zY0KBBAyRhrG/3EGdd8JyisCf3jsX88ecOZB6uXLlyO/aivb19nibZz88PmR3ZyZMnDx48KHFFWqXTi2XZOXPmwEG/fv2Q2Vi/fn16evqTJ0+ioqKQWHTt2hVeIyMjz549iySJVSpy/PjxQ4cORebk9OnT169f54+PHj2alpaGRGTz5s3Hjh1DksTK+pEgjvbt2yPzA6I/d+4cfwwDlnHjxo0aNQqJzo8//li/fv1XX30VSQZrspGffvqpYUxjVo4cOXL79m3DKfxo9+3bl5eXh0Rn2LBhO3fuFNlCWxbrUGRGhi5IOPQaIyIikPnZtGlTZmamccrjx49BlEh0wB+0cuVKpVJ5//79CxcuIAlgBYr85ZdfTpw4AQetWrVCogAGElpqthiwkVqtFmSKLISzs3NwcPC6deukMNzBvR8Jtur777//7LPPkCWYNGlSnz59/ve//yE8uHnzZsOGDe/cuVOvXj1ko+BrI5OTk2FsAQ2WpeSI9EH2K4yDLyYgR3gFYwntBrJRMFVkSkoKjG2bNm3q4FBxoGXzgZsieZYuXer
kpFsZZJMjHhznbHJycmAo88cffyBLY9hzBDe6d+8Or3v27IFOl0XcUuYDLxup0WgGDhwIZgmccAgD8LSRBkaOHKlWq+EHXFhYiGwFvBQJ3aOFCxdatqU2BnNFIr0n39HR8datW9u2bUM2AS6K/OGHH+B18ODBtWtjFFgWf0Uivc8yNDQUet68j8zawUKRMFdmb2+P8AN6EfgrkgccVY0aNYKD3bt3I2vGwoqMjY2F1w4dOrzzTmV3hhMTq7CRBry8vOD1wYMH3333HbJaLKnI6Ojon3/+GQ5q1cJ0Eb91KZIHjCU/13r58mVkhVhSkbm5udOmTUMYA602nt4fYfgZnbt373766afI2rCAIqFZWbFCt3swjGMQ3lijjTTQv39/3m0Js1/IerCAIqdMmTJu3DhkDVi1IpG+gw6vSUlJM2fORFaCqIo8f1630QE4HfEcWZfF2hXJA76htm3bHj9+nLWGgGsiKZJhmF69etWoUQNZFVbajyxLt27dQJSgyAULFiC8EUORWVlZjx8/Xr16dWBgILIeeItC20poUfhDwN43btx4yZIlCGPM3iTFxMRAz7pLly7I2uA4rnnz5si2gJZKpVLxD4fAZA/CD7MbgLy8vL179yJrA76z1q1bW7WruTwcHBy++eabHTvM9Zz7f8TsNrJVq1bu7pbf9rVKFBYWtmvXzvAsou2B88jS+qKsmBsw6l27dj158iQiWAIxuu1r1qwB1wOyBmAQBl5lm5cjowdhiRiKrF69+unTpxH2pKen9+nTRwrBJDZv3rx27VqEJWK4f2F8B7YH4U1KSsqQIUOOHDmCJIBSqYQRN8IS0o/UkZiYOGrUKIvECCCUQiT378SJEw2BnXAjPj5+zJgxkpKj1PuRQFBQEJ7L9eLi4j788MPff/8dSYno6OhFixYhLBFpGcFHH32E4TT//fv3P/vss99++w1JDOhHUrju7CdePxKaCaymrW7fvj1nzhxspy4ki3jLCH+DeygAABAASURBVN566y18lo5Cp3b+/PmSlSO0V1qtFmGJeIoMDw83DspoQWJiYpYsWbJ161YkVcDniu3zJOItR507dy7CgIsXL3777bc//fQTkjB2dnbYrrITrx+p0Whyc3M9PDyQ5Th79uyGDRvWrVuHCLgi6g+FfxDJUsBM5qZNm4gcEelH8igUirZt24L/jz8VOWTAiRMntm/fvmbNGkTQ96THjx+PsETUWcSePXvm5ORkZmaCM8zf33/Pnj1IFI4ePQrv9fXXXyNp07dvX5jOButYUFAAB3K5HLpS4JW7dOkSwgYxRjZhYWG8J5L3ykKfGloNPkiNCBw+fPjAgQNEjki/tQDUg2GqAuQIr3Xr1kU4IUarDaNsFxcX40kC+HW2adMGmZ+DBw+CIpcuXYoICA0aNCg4ONg4BSzF66+/jnBCDEVCY92jRw/jCRsvLy8RbOTevXuPHz+O7QSuRRg6dCi4fgynAQEB0JQjnBBpZDN58uSWLVvyfVZoNZydnevUqYPMSXR0NPh6Fi5ciAhGvPnmmyEhIfwxtFodO3Z0dXVFOCHeWHvVqlX889pQEaGhocic/PrrrzCcxMQnjxvDhg1zdHREWBpIJLI/ctmyZTVr1lQqlWZ9Dnrnzp0wXTlr1ixEMEXnzp354Gnt27f39vZGmFGB9+fvnamxV3PVBSyjLb2WDO4rtaBJn/DMTuYci6gymuc4sJIm3rTc9JJiy15CAmuqaDmttJcFv+QUMbA6wpu/f3kaewXqmdFqOZN/bNm65Q+44h3kiymu+TLfTklR5Vwqr/6FSxO+SlE0LUf2DormHTxD2zujSiDk/fnr57TYq3khjd0avOzG8cOSZ2qj5Iw/0umDT9crpVT9GY6g/nSfv8zfDuk0Z5RsuFNfGlXqDv4dKVRSh6VzgP2nb57PiI3JPsKiiMH4ivLkb+n3LufWbuJa92U3WgEd7dIZKGRKpHxqOZI08Vs1XUrJJb4yKe6Z0kzUcyXL1EP
LkDof3TibeWZfqoMLXe9lR1QR5drInV8n5GcxfSdZU6Se8ohaEe/oQg342B/hx6+rn2SmaPt/Ygv1LMyOxXG1Gzt1rMg0mO5Hpj1i0hILbUOOQN+JAenJ6qQ4NcKM3FSU9KhACnIEOg7wuxOTU2E204o8tTfF0cXq4yYa4+iiOLMPuz3bju9JdnSyqXoWoEYtO1pGnTuQIZzNdHWo8hi5wkai1PHI7FB+LnarXXKztAqlTdWzMDDKSU+poKUyrcgCldYawrFWARjGCo7LLYNKpWG0mD6BZQ60BaxGXcFTuVJpMnB98k5iUFSFZkEqisSVitwn0sO0Iina1mwKrrFkOAz7Euak4q/BtCI5luNsqx9J06TZtjwUVXHgAqm02vrpNtI+WpjKWDrTirQ9i0JReLaPEutH/gcbaWsx/HB1ZkmtH1mxsOTl3mZbkiTeHxzQLaJ4bkXamJEkcVtxQGcXyMiGh5ZzlG15D6ySStgFqSiS1VIY9thkcmRjXrYKeO6RjW6yx7Y6Xnj+OYwW2yGXeeAqDlhBl3OjGTte+fn5Xyya/cabr035bEJs7L0OEeFXrpg9poLN9Iyhxj6b+kHnrq23bd8Y9euOiM4t0fPCV/7Vq7po3P+xqBeIBWzk1WuXDx3a9/57H4eFhru7e0QOHV2jhg8yM7ayAyw68teBK1cvzZ2zJCSkbkZG2tAho9GLoFHDxi+qKCGee6WFmW1kHrx2ingd5AgHI4aPQ+bHZobaeXm5Pj5+r7zyGhz7+Pg2bNgYvQignBdVlCDPO6/9HPTsHRE5ZPTxk39BE7xn91+uLq4HDv4e/XvUgwf3atWq07FDl759BkG3dsMP30JzA/l79+3cIrz1uLETR707cOXX65s2bTZ33lTIAEpdtORzlSq/UaMm48Z8xFeTVqv94cc1Z/49mZKS1LhxWO+eA1q3blulj6ebhcJPlRSNynv8zySTPh57OeYCHEBrO3rU+/b2Dmu+W37k0FlI6dWnE/y2s7Iyf9q8zsHBoUV4mwnvT/byqgaXHjy4H/37rouXziUlPQkOCunevVfPt/qVKhlabb6oU6eOzZz9SamrW3761d8/8L9/C5WZ16bLubPKA1OFQvHHvt/q1Kn/1ZJvHR0cDx85sHjJ3Hp1G2zfGg11tytq++o1yyAbHM+e9SUc/BZ1aMni1cYlyOXy6zeuHDq8b+13W/bvPam0U365eA5/6ZtVS6CE3r3e3r7t93avRcyZO+XY8aptzsVoORw3cNGpsQoV/fXy70FMwcEhfx85/87gEcaXoP5//nkzTdO7fzvy08Yo6Bpt+ul7/tK3a5adO3f6ow8/W/TlNyDHld8sPvPvqfLeonHj0OXL1hr+1a5d18fb18tL97jWf/8Wnn9em5LBr7dqJgXE7+rq9sH7k/nTfft2g9mb+NFUOPbw8BwxbNySpfOGDB4JxwKFqPLzP508m4+4ENGxGxhLGAbJZLKDf/4xeNDwt97UBWDo/nrPa9diNm9ZD5WCrJwX2zuqWTNgyDsjdUfOLmAj79y5yafPmvUl9JR8ffzguFlY+IED0WfP/dO61asmC3Fzc4c8/PGe6F0JCfGrv9kIRrewsPAFfAuUief3S1HOWJvhnsMrUb9eUXAplmWvXY+BSjFcatasBSRCl1y4hIDAYF6OgLOzC7zm5GRDzarVauPSwkJfhnFiVnYV9lq0PX9WWerVa2g4dnFxhR5n0QnH/frrjsjhfaGth3+3bt/IzEivsLR79+6s/nbpZ1M+BzMJpy/kW9A9+v18NvL55rUNQbfgo2s0GuhzwD/jDBkVVYTJcO25ubpHKj/4aFSp9Iz0NDdXN0QoxmQfDQzB1OkfaTTqd0dPCAsLd3F2KVuTZcnOyZ45++Oeb/Vv364Tn/JivoVK2EizzNnY29uDqevS+Y3XnjXpfr7P8wy/VzVdJ+aTj2dAq2ScXjWfEQxt8HNGV3Vk8xzcuXvr1q3rS79a83LzIncjaKt6tRrCdy1
YMN3b23f8uImGlBfyLejCuDzfSgtK9l9n3GrXrpeTm2PokYDJTExMqFHjeeIe+dcMVCqVSN8H4lPA1oLv39C+VwaOYTn8ZhGhCeM4834qGH3Dq0GCcXGx8K9WcG2BW7b/36bYB/d+WL/DOOTni/kW2IrX/pRjQ//z7/bdURNOnTq6b/8eaDVgVmDe/GkfTx4HrTmqOvA3Dx82FjrRUA6UAOO7yVPeW7GSxCmtFODuASfGzzu3QEP86FHcqtVfgdMtKTmxvPwxMRfXb1g98O1IEOWly+f5fykpyaJ9C+W12v/1h9ukSdi6tdvA9fj9um8KClQvNWq6YP5y/kf2HEAFgdHdvmPTxYtnnZycobRPPplZpRJ03RdJLkjz9vaZMX0BOCl79uoIDe6MafPT0p/Omj152Ih+c2aZ0BMMqJHOYbTcOBFcm337DHwR30LF/kjTkah+mh8Hna5+E4ORrfDrqjj4Y4fNDEI48dOCOEZL9Z+E16cyH9u+uO9f16HHaD+BPOV6yG3MV6L73bHYGUlaVvHY06Z47vWRlZntsTo4/P4k3bAGv9+JGXn+mBY0h2xr3R6eS9F0s2qSevKLYyuUpGlFslJbSUoQiYr3mJNKlBWYDMLQTOo95IhgTPkjG2RbUDjGtNB7yJGEqMTvr7x5bVurJ1a3FA27X5n+A0mpe8Q974pd+O2SfqQI6L8fSbl/KqbclRY21mrLZNiuIUcS4rmfjqVlFGtjHnIWx66I5PqRlXg6thzvz3Ot2MUZneNPYjGfrBSpRBAgWAvltNpymmJsqjmRK3D8ickVtLlX7GIFraDhi6ggj8lUR2c7CsmQDUEjmdJRgTDDycUOsVLaz4ainJztKshjMjWksVNeLnZ7tv0X8rI1dUMrtXepmNQPd1HlaZBkUBeyr7zpJZzHtCLD2rvY2cv+3pGCbIJju1LldlRYe1eEGQ1bOikdZIe32Ug9C7Pn28dePnayCkyk4Mz3ps8fOrrYvT7aF1kzezckqHK0Iz7Hd1XslvnxcqVMt47VpjpKJeRmcgc2Pvb0lvccX7GWKliLsfWL+OwMtUxOawqfiQihCzDAGp9SnNE6P1pGs0zJZf1jgFzxJX5C79lCKOhhlMwS6Z9Y44wzG0rQOUr1Qy7DOxry6HZE5+dAivMoFDSU6eIpHzIN981Zty2Kz0pTK+S0WsOUeqLZ8Hfph2a6mjGklOTR10bJ5udGGfgBnXH+kgKLnvQoqu1Sd+n3jS9dGtStbgUdZ+It+I3fuWdvgYkJGCXD11G9prLfRzVRJah4dRBi0IW/s/NzC565DT6Z0WDcoKHiU/rZaBq0YfaWpkElRccUTRe7PWEcRaFiERtKo2Q0p0+kEc3yJdA0r9ySPIZCylxycFaEd/CwGsPDoItHM/NzGY579sdvqFuDuExIknom/ZkM+lUzRe5Y/ioqLFT/c+qfDh3b6/d3NzIMxfNauh3eqWIhlyTrP0zRZyhJ5IyfFH/2FpqWuXgqmratQn+Jsr1FFYQKSU1NjYyM3L9/P8IPsi+iFNFqtXI5pl89UaQU0Wg0CgV23lkeokgpQmwkAS+IIgl4QRRJwAuiSAJekJENAS8YhiE2koARpNUm4AVRJAEvoB9JFEnACGIjCXhBFEnAC6JIAl4QRRLwgnjICXhBbCQBL4giCXiBsyJJ8EIpgnM/kihSipBWm4AXRJEEvHBwcHByckJYQhQpRfLy8lQqFcISokgpAk02NNwIS4gipQhRJAEvcFYk8f5IEaJIAl6QVpuAF0SRBLwgiiTgBUxqE0USMILYSAJeEEUS8IIokoAXRJEEvCCKJOAFUSQBL4giCXghk8mwVSTZYUlCDB48ODMzk6KogoKCvLy86tWrQ6JKpTp8+DDCBrLSQkJ06NDh6dOnycnJWVlZYCMT9bi64rWlLlGkhHj77beDgoKMU1iWbdeuHcIJokgJAebwrbfeUiqVhhR/f/++ffsinCCKlBYDBw4EFRpOX331VeNTHCC
KlBYKhWLAgAG8mfT19cXNQCKiSAkCKgwICICD0NDQOnXqIMwg3h98uXMh9+bZnIyn6oJchtHtfE9xLFe0mXvxDuuGbdmRfo91unj3dkM23R7tbMku7HAkoymGZSGRlkEmSr9be4kKDHvHF5VplM4ijtYJxvDGRZdoKJEGrzslk1PVaypf7ujpV0eJnheiSByJ+uZJSnwByEZmJ7NTKuT2MoVSBtJiuCJFUqAtWicKSv8NFsuD0l8t0Z9ebJReh6hsHkOK7n9UkQz4RAoV3WUMBe9Dm9CLXEazLK1Va1XZBdpCLcsgiub86zj2HOeHqg5RJF7sXP44JaHQzkFePcjdw98ZWScp9zLTE7K1GjbkJefuI72rdC9RJC6kPFJHrU6Q21F1WwcgGbIBCjLUD68mcxw7blFI5e8iisSCmONZJ/ak+tWt4RmEaYCo5+bJzfT0x1lDZ9Ry86rU74wo0vLcj8kgzdWmAAAFzklEQVTf/1Ni487ByEbRFjK3jscPnx3s7F6xKIkiLcz5P7POHkpr1DEI2TrXDz94Z2ot9+oViJL4Iy2JWoXO7E+VghyBwMY+Wxc9qDAbUaQl+fHz+54BeC29MR8uPg6OLg4bP38onI0o0mLs25jMssivoReSDCGtfPJztddO5QjkIYq0GLFXc3zrVUMSw62Gy5n9aQIZiCItw9FfnlI0ha0PPDcvY/KsVpevvvi15f5NvApVzP0r+eVlIIq0DPdiclw8HJEkUdjLT0enlneVKNIyqPK01eu6I0ni5uuclaEp7yp5FtEC3DiTQ1GUg4sdMg9xj678+feG+Mc3nJ08GtZv26XDaHt73VTQqTO/HDr24/iR323eMS05JdbXu85rrwxq0bwHf9elK38eOPK9SpXdqMH/2r36DjIb3rXdU+9ncAyiTLkmiY20AA9v5ssU5qr5p2nx32/6QKMpnDBmw7DBixOT737343iG0T0LK5MrVKqc3XuXDug1/at5Z5o27rhz94KMzCS4lJh8b/uu2eHNuk+dGBUe9saevcuQOaFp6uo/pkfcRJEWIC9bQyvMtZjiYswBuUwxfNBi7+rBPjVC+veckZB4+9rNY/xVhtF07jA6KKAJGGlQHszYJSTegfR//o1yd/Pp3H6Uo6NrnZCXW4X3QuaEktNpTwpMXiKKtAAaNYPMNncLTXaAfyMnp6JOqqeHr5en/4OHlw0ZAmu+xB84Ouic86oCna16mh7v412yQiegZiNkTjjEFRSYDmFA+pEWAQykuUJKqApy4xNugO/GODE7p8QFSBkWiBuRn59dzSvAcGpn54DMCa1ruE1bQ6JIC6B0oKlMZCZcXLxqBYV17TjGONHJyU34LmisNZqSZrSwMA+ZF8rRxfR2ykSRFsDT2y4lvhCZBz/vuhdi9oUENzMYoaSU2OpegcJ3ebj73rh1gmVZ/q4bt08ic8IyrG8te5OXSD/SAjQMd2W0LDIP4NABYUXv/1qtLkhJffjHwdXLVg+GobTwXaEvdYJ5mt17l8FY517shX/+3YXMhlrFcByqE2p6goAo0gJ417KjaJSVmI/MALS/kydst1M4rFg7bMk3A2LjLvbvNcPfr4HwXfXrturR9YPbd09/Orv1jl/nDew7W59slvFXyr0MhZIq7ypZsWsZti2KV+VTddr4Iulx+/ijmrUdeoz2MXmV2EjL0OaNahqVGkkQBmnVXHlyRGRkYylCmjgolPTjq2n+TUyvj8zKTv1q1UCTlxyUzqrCXJOXfKqHTBizHr04Zi6MKO8SzAPJZCb0E1Cz4djhq8u76/65J25eQqojrbbFiL2i2r/5yUsRwSavwvedlZ1i8hIMWezsyhmo0nJ3txroxZGe8aS8S2pNoZ3CROwKudzO1aWcdZ8MuvZX3ITltVH5EBtpMUKaOlTztY/990lIKxOxH8D8eHo8T0yIF8uL/Qy3Tj6qE1bB47+kH2lJ3v6kpqZAk3QnA0mAuIvJ9o50t0gf4WxEkRZm7KKQ9Mf
ZT2NzkU0Tdy6pMLdg+OyKn7ok/UgsWPPpfQ8/F98GtvkU2IPzSRTSDp9VqYeAiSJx4bsp92UKeb22eEW8/e/cOfFYpmBHza1VyfxEkRixY2l8WqLa2csxqNmLHC9binv/Jqmy8ms3ces+sgp/DlEkXiTcLTiwNakgj1HYKzx8nauHuCHrgkOJt9Jznqo0hRpnd8Ww6YFVjfNGFIkjSbGFR39NyXyq0apZfQRb3SywPlquUSbKxLSzfukjH+LUNEWxdxEfGZV6NrFUVt07sCWxTsvNSdG62Kksw7JaFq7KFJRvsGOP0b6K53qOiCgSa9QqFHMi62lCgSpfqy7kwG9uuAQyLQrnXHSuD6pL6/XDlvudUjLEMXoXC0UjhjVdVHEiXxJlpJCi259FpqCV9nKlI+0d5BDW7r8GjSGKJOAFmbMh4AVRJAEviCIJeEEUScALokgCXhBFEvDi/wEAAP//DeAunAAAAAZJREFUAwBRfSZi+Zh7+gAAAABJRU5ErkJggg==", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import Image, display\n", "arch = RLHFSelfImprovement(max_iterations=2, target_score=8)\n", "graph = arch.build()\n", "display(Image(graph.get_graph().draw_mermaid_png()))" ] }, { "cell_type": "markdown", "id": "741c20d5", "metadata": { "papermill": { "duration": 0.009776, "end_time": "2026-05-27T13:09:12.481835+00:00", "exception": false, "start_time": "2026-05-27T13:09:12.472059+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 8 · Live run — 3 sequential tasks (archive should grow)\n", "\n", "We run **3 similar tasks** through the same architecture instance to watch the archive grow. Each subsequent task's generation sees the prior accepted outputs." 
] }, { "cell_type": "code", "execution_count": 4, "id": "f5e1edba", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T13:09:12.494489Z", "iopub.status.busy": "2026-05-27T13:09:12.494489Z", "iopub.status.idle": "2026-05-27T13:09:41.974445Z", "shell.execute_reply": "2026-05-27T13:09:41.972563Z" }, "papermill": { "duration": 29.487017, "end_time": "2026-05-27T13:09:41.974445+00:00", "exception": false, "start_time": "2026-05-27T13:09:12.487428+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "TASK_TAG: easy\n", " COMPOSITE_SCORE (Python): 8/10\n", " LLM_OVERALL_RAW: 8/10\n", " features: on_brief=True, word_count=39, concrete_imagery=False, avoids_cliches=True, engaging=True\n", " archived=True, archive_size=1\n", " output: Expertly crafted coffee, every time. Our skilled baristas carefully prepare each drink. Quality and precision in every cup.…\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "TASK_TAG: hard\n", " COMPOSITE_SCORE (Python): 8/10\n", " LLM_OVERALL_RAW: 9/10\n", " features: on_brief=True, word_count=12, concrete_imagery=True, avoids_cliches=True, engaging=True\n", " archived=True, archive_size=2\n", " output: Step into a world of vintage pages and freshly printed stories daily.…\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "TASK_TAG: vague\n", " COMPOSITE_SCORE (Python): 10/10\n", " LLM_OVERALL_RAW: 8/10\n", " features: on_brief=True, word_count=96, concrete_imagery=True, avoids_cliches=True, engaging=True\n", " archived=True, archive_size=3\n", " output: Uncover the threads that weave our world together at our museum, where the stories of yesterday, today, and tomorrow come alive through a diverse array of artifacts and immersive experiences. 
From the…\n", "\n", "COMPOSITE_SCORES_PY: [8, 8, 10] spread=2\n", "LLM_OVERALL_RAW: [8, 9, 8] spread=1\n" ] } ], "source": [ "# Three tasks of varying difficulty so feature outcomes diverge.\n", "TASKS = [\n", " (\"easy\", \"Write a 3-sentence tagline (30-80 words total) for a coffee shop that emphasizes craftsmanship.\"),\n", " (\"hard\", \"Write a tagline for a bookstore in EXACTLY 12 words. Must avoid the words 'we', 'our', and 'discover'.\"),\n", " (\"vague\", \"Tagline for museum.\"), # deliberately under-specified — should miss on-brief\n", "]\n", "\n", "results = []\n", "for i, (tag, t) in enumerate(TASKS, 1):\n", " r = arch.run(t)\n", " h = r.trace[-1]\n", " results.append((tag, t, r, h))\n", " print(f\"TASK_TAG: {tag}\")\n", " print(f\" COMPOSITE_SCORE (Python): {r.metadata['final_score']}/10\")\n", " print(f\" LLM_OVERALL_RAW: {h.get('llm_overall_score')}/10\")\n", " print(f\" features: on_brief={h.get('is_on_brief')}, word_count={h.get('word_count')}, concrete_imagery={h.get('has_concrete_imagery')}, avoids_cliches={h.get('avoids_cliches')}, engaging={h.get('is_engaging')}\")\n", " print(f\" archived={r.metadata['archived_this_run']}, archive_size={r.metadata['archive_size']}\")\n", " print(f\" output: {r.output[:200]}…\")\n", " print()\n", "\n", "# Aggregate spread comparison\n", "composite_scores = [r.metadata['final_score'] for _, _, r, _ in results]\n", "llm_scores = [h.get('llm_overall_score', 0) for _, _, _, h in results]\n", "print(f\"COMPOSITE_SCORES_PY: {composite_scores} spread={max(composite_scores)-min(composite_scores)}\")\n", "print(f\"LLM_OVERALL_RAW: {llm_scores} spread={max(llm_scores)-min(llm_scores)}\")" ] }, { "cell_type": "markdown", "id": "04247450", "metadata": { "papermill": { "duration": 0.008931, "end_time": "2026-05-27T13:09:41.990793+00:00", "exception": false, "start_time": "2026-05-27T13:09:41.981862+00:00", "status": "completed" }, "tags": [] }, "source": [ "### 8.0 · What just happened, briefly\n", "\n", "Three 
signals:\n", "\n", "- **`archive_size` should grow monotonically** as tasks finish at or above the threshold. If it plateaus, the editor is rejecting more than it accepts, which can be good (a high bar) or bad (an over-conservative editor).\n", "- **`iters` per task** should mostly be 1, because the loop terminates early when score ≥ target. If it consistently reaches `max_iterations` (2 here), the editor is hard to satisfy.\n", "- **`score` distribution**: a healthy run sits in the 7-9 range. Pathology: all 9/10 (lenient editor) or all 6/10 (nothing clears the bar, so nothing is archived)." ] }, { "cell_type": "markdown", "id": "72373e0b", "metadata": { "papermill": { "duration": 0.006385, "end_time": "2026-05-27T13:09:42.002429+00:00", "exception": false, "start_time": "2026-05-27T13:09:41.996044+00:00", "status": "completed" }, "tags": [] }, "source": [ "### 8.1 · Did the archive influence later generations?\n", "\n", "Eyeball check: do tasks 2 and 3 share *structural patterns* with task 1's accepted output? The generator should be borrowing tone / cadence from the archive, not the literal words." ] }, { "cell_type": "markdown", "id": "4e86ead7", "metadata": { "papermill": { "duration": 0.00655, "end_time": "2026-05-27T13:09:42.015091+00:00", "exception": false, "start_time": "2026-05-27T13:09:42.008541+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 9 · What we just observed\n", "\n", "The cells above ran **3 tasks of varying difficulty** through ONE `RLHFSelfImprovement` instance, with the **multi-dimensional deterministic-scoring fix** applied (see § 3.0).\n", "\n", "### 9.1 · Per-task feature decomposition\n", "\n", "| Tag | Python COMPOSITE | LLM `overall_score` | Archived? 
| Editor feature commitments |\n", "|---|---|---|---|---|\n", "| easy | **8**/10 | 8/10 | ✓ | word_count=39, concrete_imagery=False, avoids_cliches=True |\n", "| hard | **8**/10 | 9/10 | ✓ | word_count=12, concrete_imagery=True, avoids_cliches=True |\n", "| vague | **10**/10 | 8/10 | ✓ | word_count=96, concrete_imagery=True, avoids_cliches=True |\n", "\n", "### 9.2 · Score-spread comparison\n", "\n", "| Source | Values | Spread (max−min) |\n", "|---|---|---|\n", "| **Python composite** (the deciding signal) | [8, 8, 10] | **2** |\n", "| LLM raw `overall_score` (preserved, unused) | [8, 9, 8] | 1 |\n", "\n", "### 9.3 · Patterns surfaced in this run\n", "\n", "- **Python composite scores: [8, 8, 10] (spread 2)** vs **LLM raw `overall_score`: [8, 9, 8] (spread 1)**. Python's composite has a WIDER spread than the LLM's raw score: the multi-dimensional decomposition produced more discrimination than the LLM was willing to commit to in its single `overall_score` field. With only three tasks this is suggestive rather than conclusive, but it is consistent with the deterministic-scoring fix working as designed.\n", "\n", "- **All 3 tasks got nearly the SAME feature pattern.** Llama returned `on_brief=True`, `avoids_cliches=True`, and `engaging=True` for every task; only `concrete_imagery` (False for easy, True for the other two) and `word_count` differed. Even the deliberately under-specified `vague` task was marked on-brief. This is the same flat-scoring pathology resurfacing at the feature level. The output is still explainable (you can see which features contributed), but the architecture is barely distinguishing the tasks. Use more sharply contrasting task shapes to force divergence.\n", "\n", "- **All 3 tasks archived.** This is the happy path, but watch for the sycophantic-editor pathology. If every output passes the bar regardless of obvious quality differences, raise `target_score`.\n", "\n", "### 9.4 · The takeaway\n", "\n", "The multi-dimensional fix has three properties worth checking:\n", "\n", "1. **Transparency**: every score has an explicit Python-side decomposition you can read.\n", "2. **More spread than single-score**: usually, because five independent feature judgments diverge more readily than a single numeric score, which tends to compress.\n", "3. 
**Honest residual** — even with multi-dim, identical tasks get identical features. When that happens, the architecture is admitting \"I can't distinguish these\" rather than papering it over with a fake 9/10 vs 8/10." ] }, { "cell_type": "markdown", "id": "80446b05", "metadata": { "papermill": { "duration": 0.006179, "end_time": "2026-05-27T13:09:42.026804+00:00", "exception": false, "start_time": "2026-05-27T13:09:42.020625+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 10 · Try varying `target_score`\n", "\n", "The archive's quality bar is the most important production knob." ] }, { "cell_type": "code", "execution_count": 5, "id": "80707c08", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T13:09:42.040859Z", "iopub.status.busy": "2026-05-27T13:09:42.040859Z", "iopub.status.idle": "2026-05-27T13:10:22.095908Z", "shell.execute_reply": "2026-05-27T13:10:22.093817Z" }, "papermill": { "duration": 40.069304, "end_time": "2026-05-27T13:10:22.101891+00:00", "exception": false, "start_time": "2026-05-27T13:09:42.032587+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
target_score=6 ────────────────────────────────────────────────────────────────────────────────────────────────────\n",
       "
\n" ], "text/plain": [ "\u001b[1;36mtarget_score\u001b[0m\u001b[1;36m=\u001b[0m\u001b[1;36m6\u001b[0m \u001b[92m────────────────────────────────────────────────────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " ('easy', 'Write a 3-sentence tagline (30-80 words total) for a coffee shop that emphasizes craftsmanship.') → score 10/10, archived=True\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " ('hard', \"Write a tagline for a bookstore in EXACTLY 12 words. Must avoid the words 'we', 'our', and 'discover'.\") → score 6/10, archived=True\n", " archive_size at end: 2\n", "\n" ] }, { "data": { "text/html": [ "
target_score=9 ────────────────────────────────────────────────────────────────────────────────────────────────────\n",
       "
\n" ], "text/plain": [ "\u001b[1;36mtarget_score\u001b[0m\u001b[1;36m=\u001b[0m\u001b[1;36m9\u001b[0m \u001b[92m────────────────────────────────────────────────────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " ('easy', 'Write a 3-sentence tagline (30-80 words total) for a coffee shop that emphasizes craftsmanship.') → score 10/10, archived=True\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " ('hard', \"Write a tagline for a bookstore in EXACTLY 12 words. Must avoid the words 'we', 'our', and 'discover'.\") → score 8/10, archived=False\n", " archive_size at end: 1\n", "\n" ] } ], "source": [ "# TASKS holds (tag, prompt) tuples; unpack so run() receives the prompt string, not the tuple.\n", "for ts in [6, 9]:\n", "    print_header(f\"target_score={ts}\")\n", "    fresh_arch = RLHFSelfImprovement(max_iterations=2, target_score=ts)\n", "    for tag, q in TASKS[:2]:\n", "        r = fresh_arch.run(q)\n", "        print(f\" {q[:50]} → score {r.metadata['final_score']}/10, archived={r.metadata['archived_this_run']}\")\n", "    print(f\" archive_size at end: {len(fresh_arch.archive)}\")\n", "    print()" ] }, { "cell_type": "markdown", "id": "88862c11", "metadata": { "papermill": { "duration": 0.011312, "end_time": "2026-05-27T13:10:22.123923+00:00", "exception": false, "start_time": "2026-05-27T13:10:22.112611+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 11 · Failure modes, safety, extensions\n", "\n", "### 11.1 · Where this breaks\n", "\n", "| Failure | Mechanism | Mitigation |\n", "|---|---|---|\n", "| **Archive bloat** | 100+ accepted outputs → prompt too long | Cap at N most recent (we use 3); or retrieve top-K by similarity |\n", "| **Mode collapse** | All outputs imitate the same archive item | Vary temperature; explicit \"use different structure\" instruction |\n", "| **Sycophantic editor** | Editor accepts mediocre work | **Python target_score backstop catches this** (deterministic-picker pattern) |\n", "| **Editor inconsistency** | Same 
draft scored 7 then 9 across runs | Lower editor temperature; or use a stronger model in the editor seat |\n", "| **No improvement signal** | Archive doesn't actually make next outputs better | Track output quality over time; if flat, the archive isn't helping |\n", "\n", "### 11.2 · Production safety\n", "\n", "- **Don't persist archive to disk without review.** Bad outputs in the archive corrupt all future runs.\n", "- **Track archive drift.** Compare the quality of recently archived items with the oldest ones; if it is drifting down, the editor is loosening or the task distribution is shifting.\n", "- **Diversify the editor.** A same-model generator and editor share blind spots; rotate models or use a different model in the editor seat.\n", "\n", "### 11.3 · Three extensions\n", "\n", "1. **Similarity-retrieved archive.** Use embeddings (FAISS, like Episodic Memory nb 08) to select archive examples most similar to the current task instead of last-3.\n", "2. **Persist to disk.** Save `arch.archive` to JSON between sessions (after review, per § 11.2); quality compounds across processes.\n", "3. **Real RLHF.** Train a small reward model on the archive; use it to *score* future outputs without needing the LLM editor each time. That is the reward-modelling step of RLHF; fine-tuning the generator against that reward model would be the actual RL.\n", "\n", "### 11.4 · What to read next\n", "\n", "- [**01 · Reflection**](./01_reflection.ipynb) — same loop, no archive.\n", "- [**18 · Reflexion**](./18_reflexion.ipynb) — same idea, but archives verbal reflections on *failures*.\n", "- [**08 · Episodic + Semantic Memory**](./08_episodic_semantic_memory.ipynb) — generalises the archive into vector + graph stores.\n", "- [**29 · Voyager**](./29_voyager_skill_library.ipynb) — archives learned SKILLS (reusable code), not just outputs.\n", "\n", "### 11.5 · References\n", "\n", "1. Madaan, A. et al. *Self-Refine: Iterative Refinement with Self-Feedback.* NeurIPS 2023. [arXiv:2303.17651](https://arxiv.org/abs/2303.17651)\n", "2. Ouyang, L. et al. *Training language models to follow instructions with human feedback.* NeurIPS 2022. 
(true RLHF, distinct from this pattern.)\n", "3. Self-distillation / self-improvement loops — modern LLM practice." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.0" }, "papermill": { "default_parameters": {}, "duration": 77.077119, "end_time": "2026-05-27T13:10:23.549536+00:00", "environment_variables": {}, "exception": null, "input_path": "all-agentic-architectures/notebooks/15_rlhf_self_improvement.ipynb", "output_path": "all-agentic-architectures/notebooks/15_rlhf_self_improvement.ipynb", "parameters": {}, "start_time": "2026-05-27T13:09:06.472417+00:00", "version": "2.7.0" } }, "nbformat": 4, "nbformat_minor": 5 }