{ "cells": [ { "cell_type": "markdown", "id": "4bc7495a", "metadata": { "papermill": { "duration": 0.014509, "end_time": "2026-05-27T04:42:32.829685+00:00", "exception": false, "start_time": "2026-05-27T04:42:32.815176+00:00", "status": "completed" }, "tags": [] }, "source": [ "# 04 · Planning — decompose the task, then execute the plan\n", "\n", "> **TL;DR.** Instead of reacting step-by-step like ReAct, the agent **writes a plan first**, then executes each step in order. After the plan finishes, a *replanner* decides whether to finalize or extend the plan. The whole approach is one bird's-eye-view pass through the task before any action.\n", ">\n", "> **Reach for it when** the task has natural structure: write a report, compare entities on several dimensions, run a benchmark — anything where you can sketch the steps before starting.\n", "> **Avoid when** the task can't be decomposed in advance (open-ended exploration, dialogue) — committing to a bad sub-goal locks you in.\n", "\n", "| Property | Value |\n", "|---|---|\n", "| Origin | Long lineage in classical AI (STRIPS, HTN); modern LangGraph idiom: [plan-and-execute tutorial](https://langchain-ai.github.io/langgraph/tutorials/plan-and-execute/plan-and-execute/) |\n", "| Reasoning style | Hierarchical: plan once, execute many, optionally replan |\n", "| External tools needed? | Optional (the executor is a sub-agent that may use tools) |\n", "| Memory across episodes? | No |\n", "| Structured output? | Yes — Pydantic `Plan(steps: list[str])` schema |\n", "| Composability | Executor reuses `ToolUse` (notebook 02) — **architectures compose** |\n", "\n", "This notebook continues the progression: Tool Use → ReAct → **Planning**. Each step adds one layer of abstraction. Tool Use is reactive (act → observe). ReAct adds reasoning (think → act → observe). Planning adds *structure* (plan → execute many → maybe replan)." 
] }, { "cell_type": "markdown", "id": "37048f78", "metadata": { "papermill": { "duration": 0.008026, "end_time": "2026-05-27T04:42:32.849275+00:00", "exception": false, "start_time": "2026-05-27T04:42:32.841249+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 2 · Architecture at a glance\n", "\n", "```mermaid\n", "flowchart LR\n", "    A([user task]) --> P[Plan<br/>structured-output Pydantic Plan]\n", "    P --> E[Execute<br/>delegate to ToolUse sub-agent]\n", "    E --> R[Replan<br/>structured-output ReplanDecision]\n", "    R -->|more steps queued<br/>or extended| E\n", "    R -->|is_done = True<br/>or budget exhausted| F([final answer])\n", "\n", "    style P fill:#fff3e0,stroke:#f57c00\n", "    style E fill:#e3f2fd,stroke:#1976d2\n", "    style R fill:#fce4ec,stroke:#c2185b\n", "```\n", "\n", "**Three nodes, one cycle.** The Plan node runs once at the start. The Execute node runs once per planned step (delegating to a ToolUse sub-agent). The Replan node runs after every Execute; it either routes back to Execute (if steps remain or it adds more) or routes to END." ] }, { "cell_type": "markdown", "id": "7f907e40", "metadata": { "papermill": { "duration": 0.011591, "end_time": "2026-05-27T04:42:32.880808+00:00", "exception": false, "start_time": "2026-05-27T04:42:32.869217+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 3 · Theory\n", "\n", "### 3.1 · Why plan at all?\n", "\n", "ReAct (notebook 03) is *greedy local search* — at each step, the agent picks the next-best Action given the current information. That's optimal for tasks where you can't see ahead (chess midgame, exploring an unknown environment). It's *suboptimal* for tasks where you *can* see ahead:\n", "\n", "- *\"Write a 500-word report on X covering aspects A, B, C, with references.\"* The structure is obvious; reacting step-by-step would just rediscover it.\n", "- *\"Compare Japan and South Korea on population, GDP, and education spending.\"* You can write the 4–5 steps before doing any search.\n", "- *\"Run the benchmark on three architectures and write a leaderboard.\"* The plan is implied by the task.\n", "\n", "For these tasks, **one upfront planning pass beats N greedy decisions**. 
The plan also becomes a **contract** — easy to inspect, modify, or replace — which is valuable in production: you can show the plan to a human for approval before any execution.\n", "\n", "### 3.2 · The Pydantic Plan schema\n", "\n", "```python\n", "from pydantic import BaseModel, Field\n", "\n", "class Plan(BaseModel):\n", "    steps: list[str] = Field(\n", "        description=\"Ordered list of 3-7 atomic, actionable steps...\",\n", "        min_length=1,\n", "    )\n", "```\n", "\n", "We use `llm.with_structured_output(Plan)` so the planner's reply is parsed and validated against the schema instead of being scraped from free text. The `Field` description is the part of the schema the model reads, so it is where we ask for *\"atomic, actionable steps\"*; it steers the model but is not enforced, and only the type and `min_length=1` are checked at validation time.\n", "\n", "### 3.3 · Why replan?\n", "\n", "Initial plans are often imperfect:\n", "- Step 3 turns out to require info that step 2 didn't fetch.\n", "- Step 4 reveals that step 1 was the wrong starting point.\n", "- The whole plan was based on a misunderstanding.\n", "\n", "The Replanner runs after each batch of executed steps and decides:\n", "- **`is_done = True`** + `final_response` → we're finished, here's the answer.\n", "- **`is_done = False`** + `additional_steps` → extend the plan.\n", "\n", "We cap replans (`max_replans=2` by default) to prevent runaway loops. Once the cap is hit, the next replan call is **forced** to finalize even with incomplete evidence.\n", "\n", "### 3.4 · Composability — Planning uses ToolUse internally\n", "\n", "Each `_execute` call hands the step to a **ToolUse sub-agent** (notebook 02). This is the first place in the repo where architectures compose:\n", "\n", "```\n", "Planning.run()\n", "  ↳ Plan node (structured output)\n", "  ↳ Execute node\n", "      ↳ ToolUse.run()\n", "          ↳ agent + ToolNode loop\n", "  ↳ Replan node (structured output)\n", "```\n", "\n", "This composition pattern is the whole reason every architecture in this library implements the same `Architecture` base class — they all become black-box callables. 
**Meta-Controller (notebook 11)** is the natural conclusion of this pattern: a router that picks which architecture to use per task.\n", "\n", "### 3.5 · Where Planning sits\n", "\n", "| Pattern | Plan ahead? | Replans? | Mid-execution flexibility | Use when |\n", "|---|---|---|---|---|\n", "| Tool Use (nb 02) | no | n/a | high (one decision at a time) | one-shot lookups |\n", "| ReAct (nb 03) | no | n/a | high | multi-step search where each step depends on previous result |\n", "| **Planning** *(this notebook)* | **yes** | **yes** (`max_replans`) | medium (replan extends the plan) | tasks with obvious structure |\n", "| PEV (nb 06) | yes | yes + per-step verify | high (verify can trigger replan) | high-stakes / unreliable tools |\n", "| Multi-Agent (nb 05) | yes (manager assigns) | no | low (specialists own their step) | tasks needing diverse expertise |\n", "\n", "### 3.6 · What goes wrong (you'll see in § 9)\n", "\n", "1. **Over-decomposition.** Plan says 7 steps when 2 would suffice. Cost balloons.\n", "2. **Vague steps.** Step *\"Analyze the data\"* — what does the executor do? Surfaces as junky tool calls.\n", "3. **Step interference.** Step 3 needs info step 2 didn't capture. Replanner fixes it but costs an extra round.\n", "4. **Sycophantic replanner.** Replanner thinks `is_done=True` even with thin evidence. Always inspect.\n", "5. **Replan thrash.** Replanner keeps adding 1–2 more steps without converging. 
Cap with `max_replans`.\n" ] }, { "cell_type": "markdown", "id": "ff640794", "metadata": { "papermill": { "duration": 0.013479, "end_time": "2026-05-27T04:42:32.904718+00:00", "exception": false, "start_time": "2026-05-27T04:42:32.891239+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 4 · Setup" ] }, { "cell_type": "code", "execution_count": 1, "id": "9a8000ad", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T04:42:32.931197Z", "iopub.status.busy": "2026-05-27T04:42:32.931197Z", "iopub.status.idle": "2026-05-27T04:42:34.610809Z", "shell.execute_reply": "2026-05-27T04:42:34.607176Z" }, "papermill": { "duration": 1.690551, "end_time": "2026-05-27T04:42:34.611778+00:00", "exception": false, "start_time": "2026-05-27T04:42:32.921227+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
Provider: nebius  ·  Model: meta-llama/Llama-3.3-70B-Instruct ─────────────────────────────────────────────────────\n",
       "
\n" ], "text/plain": [ "\u001b[1;36mProvider: nebius · Model: meta-llama/Llama-\u001b[0m\u001b[1;36m3.3\u001b[0m\u001b[1;36m-70B-Instruct\u001b[0m \u001b[92m─────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from agentic_architectures import get_llm, enable_langsmith, settings\n", "from agentic_architectures.architectures import Planning\n", "from agentic_architectures.ui import print_md, print_header, print_step\n", "\n", "enable_langsmith()\n", "print_header(f\"Provider: {settings.llm_provider} · Model: {settings.llm_model}\")" ] }, { "cell_type": "markdown", "id": "c28327ce", "metadata": { "papermill": { "duration": 0.006798, "end_time": "2026-05-27T04:42:34.631313+00:00", "exception": false, "start_time": "2026-05-27T04:42:34.624515+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 5 · Library walkthrough\n", "\n", "Source: [`src/agentic_architectures/architectures/planning.py`](../src/agentic_architectures/architectures/planning.py).\n", "\n", "Three nodes:\n", "\n", "1. **`_plan`** — calls `llm.with_structured_output(Plan)` on the task. Returns a typed plan; the prompt explicitly demands atomic, actionable steps.\n", "2. **`_execute`** — peels the first step off the plan, formats a context block (original task + remaining plan + history of past results), then delegates to the internal `ToolUse` sub-agent. The sub-agent's output becomes the step's result.\n", "3. **`_replan`** — when the plan is empty (or initially), calls `llm.with_structured_output(ReplanDecision)`. The decision is either *finalize with `final_response`* or *extend with `additional_steps`*. 
A `max_replans` cap forces finalisation after N extensions.\n", "\n", "Two structured-output schemas live alongside the architecture:" ] }, { "cell_type": "code", "execution_count": 2, "id": "95861d22", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T04:42:34.664201Z", "iopub.status.busy": "2026-05-27T04:42:34.664201Z", "iopub.status.idle": "2026-05-27T04:42:34.689884Z", "shell.execute_reply": "2026-05-27T04:42:34.688704Z" }, "papermill": { "duration": 0.043984, "end_time": "2026-05-27T04:42:34.691150+00:00", "exception": false, "start_time": "2026-05-27T04:42:34.647166+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- Plan schema ---\n", "{\n", " \"description\": \"Ordered, actionable steps to solve the task.\",\n", " \"properties\": {\n", " \"steps\": {\n", " \"description\": \"Ordered list of 3-7 atomic, actionable steps. Each step must be executable on its own given the previous steps' results. Avoid vague verbs like 'analyze' \\u2014 say 'compute X from Y' or 'look up Z'.\",\n", " \"items\": {\n", " \"type\": \"string\"\n", " },\n", " \"minItems\": 1,\n", " ...\n", "\n", "--- ReplanDecision schema ---\n", "{\n", " \"description\": \"Decision after executing all currently-planned steps.\",\n", " \"properties\": {\n", " \"is_done\": {\n", " \"description\": \"True iff the executed steps' results contain enough information to produce the final answer.\",\n", " \"title\": \"Is Done\",\n", " \"type\": \"boolean\"\n", " },\n", " \"final_response\": {\n", " \"anyOf\": [\n", " {\n", " \"type\": \"string\"\n", " },\n", " {\n", " \"type\"...\n" ] } ], "source": [ "from agentic_architectures.architectures.planning import Plan, ReplanDecision\n", "import json\n", "print('--- Plan schema ---')\n", "print(json.dumps(Plan.model_json_schema(), indent=2)[:400] + '...')\n", "print()\n", "print('--- ReplanDecision schema ---')\n", "print(json.dumps(ReplanDecision.model_json_schema(), 
indent=2)[:400] + '...')" ] }, { "cell_type": "markdown", "id": "d667c8c1", "metadata": { "papermill": { "duration": 0.012767, "end_time": "2026-05-27T04:42:34.718119+00:00", "exception": false, "start_time": "2026-05-27T04:42:34.705352+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 6 · State\n", "\n", "The state has five fields (four carry task data; `replan_count` is a loop counter). Most LangGraph state fields *replace* on each return; we mark `past_steps` with `operator.add` so each Execute round *appends* its result rather than overwriting.\n", "\n", "| Field | Type | Set by | Reducer |\n", "|---|---|---|---|\n", "| `input` | `str` | caller | replace |\n", "| `plan` | `list[str]` | `_plan`, `_replan`, `_execute` (consumes) | replace |\n", "| `past_steps` | `list[tuple[str, str]]` | `_execute` | **append** (`operator.add`) |\n", "| `response` | `str` | `_replan` (finalise) | replace |\n", "| `replan_count` | `int` | `_replan` (increments) | replace |" ] }, { "cell_type": "code", "execution_count": 3, "id": "b37d0314", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T04:42:34.746637Z", "iopub.status.busy": "2026-05-27T04:42:34.745524Z", "iopub.status.idle": "2026-05-27T04:42:34.766443Z", "shell.execute_reply": "2026-05-27T04:42:34.763182Z" }, "papermill": { "duration": 0.039056, "end_time": "2026-05-27T04:42:34.768450+00:00", "exception": false, "start_time": "2026-05-27T04:42:34.729394+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PlanningState fields:\n", " input : ForwardRef('str')\n", " plan : ForwardRef('list[str]')\n", " past_steps : ForwardRef('Annotated[list[tuple[str, str]], operator.add]')\n", " response : ForwardRef('str')\n", " replan_count : ForwardRef('int')\n" ] } ], "source": [ "from agentic_architectures.architectures.planning import PlanningState\n", "print('PlanningState fields:')\n", "for k, v in PlanningState.__annotations__.items():\n", "    print(f' {k:14s} : {v}')" ] }, { "cell_type": 
"markdown", "id": "b48ebf94", "metadata": { "papermill": { "duration": 0.009696, "end_time": "2026-05-27T04:42:34.789620+00:00", "exception": false, "start_time": "2026-05-27T04:42:34.779924+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 7 · Build the graph\n", "\n", "Three nodes, one cycle. Compare this to ReAct's three-node graph (think → act → tools → think) — the topology is structurally similar but the *semantics* are different: Planning's cycle is at the *batch-of-steps* level, ReAct's cycle is at the *single-action* level." ] }, { "cell_type": "code", "execution_count": 4, "id": "b0679a53", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T04:42:34.819391Z", "iopub.status.busy": "2026-05-27T04:42:34.819391Z", "iopub.status.idle": "2026-05-27T04:42:41.452378Z", "shell.execute_reply": "2026-05-27T04:42:41.450856Z" }, "papermill": { "duration": 6.648823, "end_time": "2026-05-27T04:42:41.452378+00:00", "exception": false, "start_time": "2026-05-27T04:42:34.803555+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "image/png": "", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "---\n", "config:\n", " flowchart:\n", " curve: linear\n", "---\n", "graph TD;\n", "\t__start__([<p>__start__</p>]):::first\n", "\tplan(plan)\n", "\texecute(execute)\n", "\treplan(replan)\n", "\t__end__([<p>__end__</p>]):::last\n", "\t__start__ --> plan;\n", "\texecute --> replan;\n", "\tplan --> execute;\n", "\treplan -. &nbsp;end&nbsp; .-> __end__;\n", "\treplan -.-> execute;\n", "\tclassDef default fill:#f2f0ff,line-height:1.2\n", "\tclassDef first fill-opacity:0\n", "\tclassDef last fill:#bfb6fc\n", "\n" ] } ], "source": [ "from IPython.display import Image, display\n", "\n", "arch = Planning(max_replans=2, executor_rounds=4)\n", "graph = arch.build()\n", "display(Image(graph.get_graph().draw_mermaid_png()))\n", "print(arch.diagram())" ] }, { "cell_type": "markdown", "id": "90659661", "metadata": { "papermill": { "duration": 0.014599, "end_time": "2026-05-27T04:42:41.492301+00:00", "exception": false, "start_time": "2026-05-27T04:42:41.477702+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 8 · Live run\n", "\n", "Concrete task: a comparison that *naturally* decomposes into multiple lookups + a synthesis step. We pick a comparison the model can't possibly have memorized to current accuracy." ] }, { "cell_type": "code", "execution_count": 5, "id": "fd95efd5", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T04:42:41.519723Z", "iopub.status.busy": "2026-05-27T04:42:41.519723Z", "iopub.status.idle": "2026-05-27T04:46:17.489207Z", "shell.execute_reply": "2026-05-27T04:46:17.487859Z" }, "papermill": { "duration": 215.984265, "end_time": "2026-05-27T04:46:17.489207+00:00", "exception": false, "start_time": "2026-05-27T04:42:41.504942+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
Final answer ──────────────────────────────────────────────────────────────────────────────────────────────────────\n",
       "
\n" ], "text/plain": [ "\u001b[1;36mFinal answer\u001b[0m \u001b[92m──────────────────────────────────────────────────────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
As of 2026-05-27, the latest stable release version of Python is Python 3.11.4, and the most popular package       \n",
       "manager for Python is pip. A well-known company that uses Python in production is Google. The latest stable release\n",
       "version of Rust is Rust 1.68.2, the most popular package manager for Rust is Cargo, and a well-known company that  \n",
       "uses Rust in production is Microsoft.                                                                              \n",
       "
\n" ], "text/plain": [ "As of 2026-05-27, the latest stable release version of Python is Python 3.11.4, and the most popular package \n", "manager for Python is pip. A well-known company that uses Python in production is Google. The latest stable release\n", "version of Rust is Rust 1.68.2, the most popular package manager for Rust is Cargo, and a well-known company that \n", "uses Rust in production is Microsoft. \n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
7 step(s) executed  ·  0 replan(s)  ·  budget 2 ───────────────────────────────────────────────────────────────────\n",
       "
\n" ], "text/plain": [ "\u001b[1;36m7\u001b[0m\u001b[1;36m \u001b[0m\u001b[1;36mstep\u001b[0m\u001b[1;36m(\u001b[0m\u001b[1;36ms\u001b[0m\u001b[1;36m)\u001b[0m\u001b[1;36m executed · \u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;36m \u001b[0m\u001b[1;36mreplan\u001b[0m\u001b[1;36m(\u001b[0m\u001b[1;36ms\u001b[0m\u001b[1;36m)\u001b[0m\u001b[1;36m · budget \u001b[0m\u001b[1;36m2\u001b[0m \u001b[92m───────────────────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from datetime import date\n", "\n", "TASK = (\n", " f\"As of {date.today().isoformat()}, compare Python and Rust on three dimensions: \"\n", " f\"(1) the latest stable release version of each, (2) the most popular package \"\n", " f\"manager for each, and (3) one well-known company that uses each in production. \"\n", " f\"Cite at least 2 source URLs.\"\n", ")\n", "\n", "result = arch.run(TASK)\n", "\n", "print_header(\"Final answer\")\n", "print_md(result.output)\n", "print()\n", "print_header(\n", " f\"{result.metadata['steps_executed']} step(s) executed · \"\n", " f\"{result.metadata['replans']} replan(s) · \"\n", " f\"budget {result.metadata['max_replans']}\"\n", ")" ] }, { "cell_type": "markdown", "id": "c6c87ad8", "metadata": { "papermill": { "duration": 0.011881, "end_time": "2026-05-27T04:46:17.506250+00:00", "exception": false, "start_time": "2026-05-27T04:46:17.494369+00:00", "status": "completed" }, "tags": [] }, "source": [ "### 8.0 · What just happened, briefly\n", "\n", "Look at the two counts above:\n", "\n", "- **`steps_executed`** — should be 3-7 for a well-decomposed task. Above 8 = over-decomposition. Below 3 = the planner didn't really decompose and Planning degraded toward ReAct.\n", "- **`replans`** — should be 0 for tasks where the initial plan was good; 1-2 for tasks that needed adjustment. 
If `replans == max_replans`, the agent exhausted its replan budget and may have been forced to finalize with incomplete evidence.\n", "\n", "§ 9 below quantifies and analyses what happened." ] }, { "cell_type": "markdown", "id": "e57e0205", "metadata": { "papermill": { "duration": 0.008064, "end_time": "2026-05-27T04:46:17.518257+00:00", "exception": false, "start_time": "2026-05-27T04:46:17.510193+00:00", "status": "completed" }, "tags": [] }, "source": [ "### 8.1 · Full plan + step results" ] }, { "cell_type": "code", "execution_count": 6, "id": "e1c09c17", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T04:46:17.526456Z", "iopub.status.busy": "2026-05-27T04:46:17.526456Z", "iopub.status.idle": "2026-05-27T04:46:17.634978Z", "shell.execute_reply": "2026-05-27T04:46:17.634978Z" }, "papermill": { "duration": 0.110529, "end_time": "2026-05-27T04:46:17.634978+00:00", "exception": false, "start_time": "2026-05-27T04:46:17.524449+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
 [1] STEP\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1m]\u001b[0m\u001b[1m STEP\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Look up the latest stable release version of Python as of 2026-05-27\n",
       "
\n" ], "text/plain": [ "Look up the latest stable release version of Python as of \u001b[1;36m2026\u001b[0m-\u001b[1;36m05\u001b[0m-\u001b[1;36m27\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
     RESULT\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m RESULT\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
As of 2026-05-27, the latest stable release version of Python is Python 3.11.4, according to the official Python \n",
       "website (https://www.python.org/downloads/). The most popular package manager for Python is pip \n",
       "(https://pip.pypa.io/). A well-known company that uses Python in production is Google (http...\n",
       "
\n" ], "text/plain": [ "As of \u001b[1;36m2026\u001b[0m-\u001b[1;36m05\u001b[0m-\u001b[1;36m27\u001b[0m, the latest stable release version of Python is Python \u001b[1;36m3.11\u001b[0m.\u001b[1;36m4\u001b[0m, according to the official Python \n", "website \u001b[1m(\u001b[0m\u001b[4;94mhttps://www.python.org/downloads/\u001b[0m\u001b[4;94m)\u001b[0m\u001b[4;94m.\u001b[0m The most popular package manager for Python is pip \n", "\u001b[1m(\u001b[0m\u001b[4;94mhttps://pip.pypa.io/\u001b[0m\u001b[4;94m)\u001b[0m\u001b[4;94m.\u001b[0m A well-known company that uses Python in production is Google \u001b[1m(\u001b[0mhttp\u001b[33m...\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
 [2] STEP\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m2\u001b[0m\u001b[1m]\u001b[0m\u001b[1m STEP\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Look up the latest stable release version of Rust as of 2026-05-27\n",
       "
\n" ], "text/plain": [ "Look up the latest stable release version of Rust as of \u001b[1;36m2026\u001b[0m-\u001b[1;36m05\u001b[0m-\u001b[1;36m27\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
     RESULT\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m RESULT\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
As of 2026-05-27, the latest stable release version of Rust is Rust 1.68.2, according to the official Rust website \n",
       "(https://www.rust-lang.org/).\n",
       "
\n" ], "text/plain": [ "As of \u001b[1;36m2026\u001b[0m-\u001b[1;36m05\u001b[0m-\u001b[1;36m27\u001b[0m, the latest stable release version of Rust is Rust \u001b[1;36m1.68\u001b[0m.\u001b[1;36m2\u001b[0m, according to the official Rust website \n", "\u001b[1m(\u001b[0m\u001b[4;94mhttps://www.rust-lang.org/\u001b[0m\u001b[4;94m)\u001b[0m\u001b[4;94m.\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
 [3] STEP\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m3\u001b[0m\u001b[1m]\u001b[0m\u001b[1m STEP\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Identify the most popular package manager for Python\n",
       "
\n" ], "text/plain": [ "Identify the most popular package manager for Python\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
     RESULT\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m RESULT\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
The most popular package manager for Python is pip, according to the official Python website \n",
       "(https://www.python.org/downloads/) and the pip website (https://pip.pypa.io/).\n",
       "
\n" ], "text/plain": [ "The most popular package manager for Python is pip, according to the official Python website \n", "\u001b[1m(\u001b[0m\u001b[4;94mhttps://www.python.org/downloads/\u001b[0m\u001b[4;94m)\u001b[0m and the pip website \u001b[1m(\u001b[0m\u001b[4;94mhttps://pip.pypa.io/\u001b[0m\u001b[4;94m)\u001b[0m\u001b[4;94m.\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
 [4] STEP\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m4\u001b[0m\u001b[1m]\u001b[0m\u001b[1m STEP\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Identify the most popular package manager for Rust\n",
       "
\n" ], "text/plain": [ "Identify the most popular package manager for Rust\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
     RESULT\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m RESULT\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
The most popular package manager for Rust is Cargo, according to the official Rust website \n",
       "(https://doc.rust-lang.org/cargo/) and the Rust documentation \n",
       "(https://doc.rust-lang.org/book/ch01-01-installation.html). Cargo is the package manager for Rust, and it is used \n",
       "to manage dependencies and build ...\n",
       "
\n" ], "text/plain": [ "The most popular package manager for Rust is Cargo, according to the official Rust website \n", "\u001b[1m(\u001b[0m\u001b[4;94mhttps://doc.rust-lang.org/cargo/\u001b[0m\u001b[4;94m)\u001b[0m and the Rust documentation \n", "\u001b[1m(\u001b[0m\u001b[4;94mhttps://doc.rust-lang.org/book/ch01-01-installation.html\u001b[0m\u001b[4;94m)\u001b[0m\u001b[4;94m.\u001b[0m Cargo is the package manager for Rust, and it is used \n", "to manage dependencies and build \u001b[33m...\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
 [5] STEP\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m5\u001b[0m\u001b[1m]\u001b[0m\u001b[1m STEP\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Find a well-known company that uses Python in production and cite a source URL\n",
       "
\n" ], "text/plain": [ "Find a well-known company that uses Python in production and cite a source URL\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
     RESULT\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m RESULT\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Google is a well-known company that uses Python in production, according to the official Python website \n",
       "(https://www.python.org/about/success/) and the Google website (https://www.google.com).\n",
       "
\n" ], "text/plain": [ "Google is a well-known company that uses Python in production, according to the official Python website \n", "\u001b[1m(\u001b[0m\u001b[4;94mhttps://www.python.org/about/success/\u001b[0m\u001b[4;94m)\u001b[0m and the Google website \u001b[1m(\u001b[0m\u001b[4;94mhttps://www.google.com\u001b[0m\u001b[4;94m)\u001b[0m\u001b[4;94m.\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
 [6] STEP\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m6\u001b[0m\u001b[1m]\u001b[0m\u001b[1m STEP\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Find a well-known company that uses Rust in production and cite a source URL\n",
       "
\n" ], "text/plain": [ "Find a well-known company that uses Rust in production and cite a source URL\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
     RESULT\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m RESULT\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
As of 2026-05-27, a well-known company that uses Rust in production is Microsoft, according to the official Rust \n",
       "website (https://www.rust-lang.org/) and the Microsoft website (https://www.microsoft.com/).\n",
       "
\n" ], "text/plain": [ "As of \u001b[1;36m2026\u001b[0m-\u001b[1;36m05\u001b[0m-\u001b[1;36m27\u001b[0m, a well-known company that uses Rust in production is Microsoft, according to the official Rust \n", "website \u001b[1m(\u001b[0m\u001b[4;94mhttps://www.rust-lang.org/\u001b[0m\u001b[4;94m)\u001b[0m and the Microsoft website \u001b[1m(\u001b[0m\u001b[4;94mhttps://www.microsoft.com/\u001b[0m\u001b[4;94m)\u001b[0m\u001b[4;94m.\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
 [7] STEP\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m7\u001b[0m\u001b[1m]\u001b[0m\u001b[1m STEP\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Compare the results from the previous steps and summarize the comparison\n",
       "
\n" ], "text/plain": [ "Compare the results from the previous steps and summarize the comparison\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
     RESULT\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m RESULT\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
As of 2026-05-27, the latest stable release version of Python is Python 3.11.4, according to the official Python \n",
       "website (https://www.python.org/downloads/). The most popular package manager for Python is pip \n",
       "(https://pip.pypa.io/). A well-known company that uses Python in production is Google (http...\n",
       "
\n" ], "text/plain": [ "As of \u001b[1;36m2026\u001b[0m-\u001b[1;36m05\u001b[0m-\u001b[1;36m27\u001b[0m, the latest stable release version of Python is Python \u001b[1;36m3.11\u001b[0m.\u001b[1;36m4\u001b[0m, according to the official Python \n", "website \u001b[1m(\u001b[0m\u001b[4;94mhttps://www.python.org/downloads/\u001b[0m\u001b[4;94m)\u001b[0m\u001b[4;94m.\u001b[0m The most popular package manager for Python is pip \n", "\u001b[1m(\u001b[0m\u001b[4;94mhttps://pip.pypa.io/\u001b[0m\u001b[4;94m)\u001b[0m\u001b[4;94m.\u001b[0m A well-known company that uses Python in production is Google \u001b[1m(\u001b[0mhttp\u001b[33m...\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "for i, t in enumerate(result.trace, 1):\n", " print_step(f\"[{i}] STEP\", t['step'])\n", " snippet = (t['result'] or '')[:300].replace('\\n', ' ')\n", " print_step(f\" RESULT\", snippet + ('...' if t['result'] and len(t['result']) > 300 else ''))\n", " print()" ] }, { "cell_type": "markdown", "id": "c54b875d", "metadata": { "papermill": { "duration": 0.013438, "end_time": "2026-05-27T04:46:17.662484+00:00", "exception": false, "start_time": "2026-05-27T04:46:17.649046+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 9 · What we just observed\n", "\n", "The cells above are live. 
Below: a quantitative + qualitative breakdown of the **actual** Plan-Execute-Replan loop the Nebius-hosted Llama-3.3-70B agent produced on this run.\n", "\n", "### 9.1 · Quantitative summary\n", "\n", "| Metric | Value |\n", "|---|---|\n", "| Plan steps executed | **7** |\n", "| Replans triggered | **0** / 2 |\n", "| Final answer length | 385 chars |\n", "\n", "### 9.2 · Plan ↔ result alignment\n", "\n", "| # | Plan step | Execution result (truncated) |\n", "|---|---|---|\n", "| 1 | Look up the latest stable release version of Python as of 2026-05-27 | As of 2026-05-27, the latest stable release version of Python is Python 3.11.4, according to the official Python website… |\n", "| 2 | Look up the latest stable release version of Rust as of 2026-05-27 | As of 2026-05-27, the latest stable release version of Rust is Rust 1.68.2, according to the official Rust website (http… |\n", "| 3 | Identify the most popular package manager for Python | The most popular package manager for Python is pip, according to the official Python website (https://www.python.org/dow… |\n", "| 4 | Identify the most popular package manager for Rust | The most popular package manager for Rust is Cargo, according to the official Rust website (https://doc.rust-lang.org/ca… |\n", "| 5 | Find a well-known company that uses Python in production and cite a source URL | Google is a well-known company that uses Python in production, according to the official Python website (https://www.pyt… |\n", "| 6 | Find a well-known company that uses Rust in production and cite a source URL | As of 2026-05-27, a well-known company that uses Rust in production is Microsoft, according to the official Rust website… |\n", "| 7 | Compare the results from the previous steps and summarize the comparison | As of 2026-05-27, the latest stable release version of Python is Python 3.11.4, according to the official Python website… |\n", "\n", "### 9.3 · Pathologies / patterns surfaced in this run\n", "\n", "- 
**Over-decomposition.** 7 steps sits at the top of the 3-7 range from § 8.0 and is likely more than a task this size needs — each step is a sub-agent call. Tighten the Plan schema description: *'Use 3-5 steps; combine atomic lookups when possible.'*\n", "\n", "- **No replans needed.** The replanner set `is_done=True` as soon as the initial plan finished, so the plan was never extended. That is the best case for control flow, but it says nothing about answer quality.\n", "\n", "- **No URLs in the final answer** despite the task asking for at least 2 source URLs. The step results in § 8.1 do cite URLs, so the citations were dropped at synthesis time rather than carried over from the executor's results — consider tightening the `_synthesize_from_history` prompt or adding a citation-required schema field.\n", "\n", "- **Stale version numbers.** Python 3.11.4 and Rust 1.68.2 are both 2023 releases, not the latest stable versions as of the run date. This suggests the executor answered from parametric knowledge rather than from a live lookup, and neither the plan nor the replanner caught it.\n", "\n", "### 9.4 · The final answer (verbatim)\n", "\n", "> As of 2026-05-27, the latest stable release version of Python is Python 3.11.4, and the most popular package \n", "> manager for Python is pip. A well-known company that uses Python in production is Google. The latest stable release\n", "> version of Rust is Rust 1.68.2, the most popular package manager for Rust is Cargo, and a well-known company that \n", "> uses Rust in production is Microsoft.\n", "\n", "### 9.5 · The takeaway\n", "\n", "When a task **naturally decomposes** (multi-fact comparison, structured report, multi-step computation), Planning is usually the right tool: the plan is written once, which can save tokens compared with ReAct's per-step thinking, *and* you gain a human-inspectable contract. When the task is **open-ended or one-shot**, planning is overhead — fall back to ReAct or plain Tool Use.\n", "\n", "The cleanest signal: did your run use any replans? If yes, the initial plan wasn't quite right, but the recovery worked. If there were no replans and 3-7 steps executed, the control flow was ideal. Still check the final answer against the task: this run met both conditions and dropped its citations."
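
The reading of the two counts from § 8.0 can be written down as a small check. This is a minimal sketch, assuming the thresholds stated in this notebook (3 to 7 steps as the healthy range, `replans == max_replans` as budget exhaustion) and treating anything above 7 steps as over-decomposed. `diagnose_run` is a hypothetical helper, not part of `agentic_architectures`; it reads only the three metadata keys printed in § 8.

```python
def diagnose_run(metadata: dict) -> list[str]:
    """Flag control-flow patterns in a Planning run (hypothetical helper)."""
    steps = metadata["steps_executed"]
    replans = metadata["replans"]
    budget = metadata["max_replans"]

    flags = []
    # Step count: assumed healthy range is 3 to 7, per section 8.0.
    if steps < 3:
        flags.append("under-decomposed: fewer than 3 steps, close to plain ReAct")
    elif steps > 7:
        flags.append("over-decomposed: more than 7 steps")

    # Replan count: hitting the budget means finalization may have been forced.
    if budget > 0 and replans >= budget:
        flags.append("replan budget exhausted: answer may rest on incomplete evidence")
    elif replans > 0:
        flags.append("initial plan needed adjustment, recovery worked")

    if not flags:
        flags.append("clean trace: plan held on the first try")
    return flags


# Counts taken from the live run above: 7 steps, 0 replans, budget 2.
print(diagnose_run({"steps_executed": 7, "replans": 0, "max_replans": 2}))
```

The check only looks at control flow. It would report a clean trace for the run above even though the final answer lacks the requested URLs, so answer quality needs its own check.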
] }, { "cell_type": "markdown", "id": "0ac38f26", "metadata": { "papermill": { "duration": 0.009648, "end_time": "2026-05-27T04:46:17.677896+00:00", "exception": false, "start_time": "2026-05-27T04:46:17.668248+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 10 · Try other providers · also: a planner-focused reasoning model\n", "\n", "Planning relies on **structured output** (the Pydantic Plan schema). Every major provider supports it, but reasoning models are particularly good at *generating* plans because the upfront planning pass is itself a multi-step reasoning task." ] }, { "cell_type": "code", "execution_count": 7, "id": "7c4e7991", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T04:46:17.701395Z", "iopub.status.busy": "2026-05-27T04:46:17.701395Z", "iopub.status.idle": "2026-05-27T04:48:14.519780Z", "shell.execute_reply": "2026-05-27T04:48:14.517382Z" }, "papermill": { "duration": 116.844354, "end_time": "2026-05-27T04:48:14.527581+00:00", "exception": false, "start_time": "2026-05-27T04:46:17.683227+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[skip] openai: no API key in .env\n", "[skip] anthropic: no API key in .env\n" ] }, { "data": { "text/html": [ "
Re-running on Nebius Qwen3-Thinking (reasoning model — see if the plan is deeper) ─────────────────────────────────\n",
       "
\n" ], "text/plain": [ "\u001b[1;36mRe-running on Nebius Qwen3-Thinking \u001b[0m\u001b[1;36m(\u001b[0m\u001b[1;36mreasoning model — see if the plan is deeper\u001b[0m\u001b[1;36m)\u001b[0m \u001b[92m─────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "1. **Mandarin Chinese**: 1,346 million native speakers (Encyclopaedia Britannica, 2023)\n", "2. **Spanish**: 485 million native speakers (Encyclopaedia Britannica, 2023)\n", "3. **English**: 379 million native speakers (Encyclopaedia Britannica, 2023)\n", "\n", "*Note: While Ethnologue (2023) confirms these rankings, e\n", " steps: 6, replans: 0\n", " First plan step: Search Ethnologue's latest report for 'languages by number of native speakers' to obtain ranked data.\n" ] } ], "source": [ "from agentic_architectures.llm.factory import provider_supports_structured_output\n", "\n", "for p in [\"openai\", \"anthropic\"]:\n", " key = settings.api_key_for(p)\n", " if key is None or not key.get_secret_value():\n", " print(f\"[skip] {p}: no API key in .env\")\n", " continue\n", " if not provider_supports_structured_output(p):\n", " print(f\"[skip] {p}: no structured output\")\n", " continue\n", " print_header(f\"Re-running Planning on {p}\")\n", " r = Planning(llm=get_llm(provider=p), max_replans=1, executor_rounds=3).run(\n", " \"List the three most spoken native languages and their approximate number of speakers (millions). 
Cite sources.\"\n", " )\n", " print(r.output[:300])\n", " print(f\" steps: {r.metadata['steps_executed']}, replans: {r.metadata['replans']}\")\n", " print()\n", "\n", "# Swap to a Nebius reasoning model for a more sophisticated plan.\n", "print_header(\"Re-running on Nebius Qwen3-Thinking (reasoning model — see if the plan is deeper)\")\n", "thinking_llm = get_llm(\n", " provider=\"nebius\",\n", " model=\"Qwen/Qwen3-235B-A22B-Thinking-2507-fast\",\n", " temperature=0.0,\n", ")\n", "thinking_arch = Planning(llm=thinking_llm, max_replans=1, executor_rounds=3)\n", "r = thinking_arch.run(\n", " \"List the three most spoken native languages and their approximate number of speakers (millions). Cite sources.\"\n", ")\n", "print(r.output[:300])\n", "print(f\" steps: {r.metadata['steps_executed']}, replans: {r.metadata['replans']}\")\n", "if r.trace:\n", " print(' First plan step:', r.trace[0]['step'][:200])" ] }, { "cell_type": "markdown", "id": "57aca68c", "metadata": { "papermill": { "duration": 0.007501, "end_time": "2026-05-27T04:48:14.541387+00:00", "exception": false, "start_time": "2026-05-27T04:48:14.533886+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 11 · Failure modes, safety, extensions\n", "\n", "### 11.1 · Where this breaks\n", "\n", "| Failure | Mechanism | Mitigation |\n", "|---|---|---|\n", "| **Over-decomposition** | Plan has 7+ steps when 2 would suffice | Tighten Plan schema description: \"Use 3-5 steps; combine atomic lookups\"; or use a smaller model for planning |\n", "| **Vague step text** | \"Analyze the data\" — no concrete tool action | Schema description forces concrete verbs (\"compute\", \"look up\"); penalise vague steps in the prompt |\n", "| **Step interference** | Step 3 needs info step 2 didn't fetch | Replanner handles it; cap with `max_replans` |\n", "| **Sycophantic replanner** | `is_done=True` even with thin evidence | Add an LLMJudge-style evidence check before accepting `final_response` |\n", "| **Replan thrash** | 
Replanner keeps adding 1–2 steps without converging | `max_replans` cap (we use 2 by default); force-finalisation at budget |\n", "| **Static plans for dynamic tasks** | Plan made upfront can't react to surprises mid-execution | Use **PEV (nb 06)** which verifies each step's outcome and can replan on failure |\n", "\n", "### 11.2 · Production safety\n", "\n", "- **Show the plan to a human before executing** for high-stakes tasks. The plan is a *contract*; the human-in-the-loop catches bad decompositions before any side effect.\n", "- **Bound the executor budget** — each sub-agent's `max_rounds` applies per step, independently of the planner. A 5-step plan with `executor_rounds=5` can reach 5 × 5 = 25 executor LLM calls, before counting the planner and replanner calls.\n", "- **Idempotency.** If a step has side effects (e.g., sending email), make sure replanning doesn't re-execute it. Track step IDs in `past_steps`.\n", "\n", "### 11.3 · Three extensions\n", "\n", "1. **Hierarchical planning** — let the planner emit *sub-plans* (lists of steps) for complex steps. This is a bridge to **HTN-style** planning.\n", "2. **Plan + verify per step** — wrap each `_execute` in a verifier that checks the step actually produced what was asked. That's **PEV (notebook 06)**.\n", "3. **Parallel execution** — when steps are independent, run them concurrently with LangGraph's fan-out edges or its `Send` API. This can cut latency substantially for fan-out tasks.\n", "\n", "### 11.4 · What to read next\n", "\n", "- [**05 · Multi-Agent Systems**](./05_multi_agent.ipynb) — Planning where specialist sub-agents own each step.\n", "- [**06 · PEV (Plan-Execute-Verify)**](./06_pev.ipynb) — Planning + per-step verification + recovery.\n", "- [**11 · Meta-Controller**](./11_meta_controller.ipynb) — a router over architectures; picks Tool Use / ReAct / Planning per task.\n", "- [**26 · Adaptive RAG**](./26_adaptive_rag.ipynb) — same routing idea, but specialised for RAG complexity.\n", "\n", "### 11.5 · References\n", "\n", "1. Wang, L. et al. *Plan-and-Solve Prompting.* ACL 2023. 
[arXiv:2305.04091](https://arxiv.org/abs/2305.04091)\n", "2. LangGraph plan-and-execute tutorial — [official docs](https://langchain-ai.github.io/langgraph/tutorials/plan-and-execute/plan-and-execute/)\n", "3. Hierarchical Task Networks — classical AI planning ([overview](https://en.wikipedia.org/wiki/Hierarchical_task_network))\n", "4. Pydantic structured output — [langchain docs](https://python.langchain.com/docs/concepts/structured_outputs/)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.0" }, "papermill": { "default_parameters": {}, "duration": 346.204563, "end_time": "2026-05-27T04:48:15.743122+00:00", "environment_variables": {}, "exception": null, "input_path": "all-agentic-architectures/notebooks/04_planning.ipynb", "output_path": "all-agentic-architectures/notebooks/04_planning.ipynb", "parameters": {}, "start_time": "2026-05-27T04:42:29.538559+00:00", "version": "2.7.0" } }, "nbformat": 4, "nbformat_minor": 5 }