{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "7a969399",
   "metadata": {
    "papermill": {
     "duration": 0.008287,
     "end_time": "2026-05-27T03:49:14.321358+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:14.313071+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# 02 · Tool Use — give the agent access to external tools\n",
    "\n",
    "> **TL;DR.** Bind one or more *tools* (Python functions exposed to the LLM via a JSON schema) to the model. The LLM decides when to call a tool, reads the tool's result, and either calls another tool or produces a final answer. The simplest \"agentic\" pattern that goes beyond a single forward pass.\n",
    ">\n",
    "> **Reach for it when** the answer requires information the model can't have memorized: live data (weather, stock prices, news), private data (your company's docs), or deterministic computation (math, code execution).\n",
    "> **Avoid when** the answer is in the model's training data and structured output would do — calling a tool you don't need adds latency and one more failure point.\n",
    "\n",
    "| Property | Value |\n",
    "|---|---|\n",
    "| Origin | OpenAI function-calling API, June 2023 — conceptual ancestor: Toolformer (Schick et al., 2023) |\n",
    "| Reasoning type | Reactive (no explicit *thought* step — see ReAct, notebook 03, for that) |\n",
    "| External tools needed? | **Yes** (web search by default) |\n",
    "| Memory across episodes? | No |\n",
    "| Provider requirement | Must support **tool calling** (Nebius, OpenAI, Anthropic, Groq, Together, Fireworks, Mistral, Google, recent Ollama) |\n",
    "| Typical tool calls | 1–4 per task |\n",
    "\n",
    "This notebook keeps the original scenario (research assistant doing live web queries) but rebuilds the implementation on top of the library's `ToolUse` class."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5c313a5c",
   "metadata": {
    "papermill": {
     "duration": 0.008037,
     "end_time": "2026-05-27T03:49:14.329395+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:14.321358+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 2 · Architecture at a glance\n",
    "\n",
    "```mermaid\n",
    "flowchart LR\n",
    "    A([user task]) --> AG[Agent<br/><sub>LLM bound with tools</sub>]\n",
    "    AG --> Q{tool_calls<br/>present?}\n",
    "    Q -->|yes| T[ToolNode<br/><sub>executes the called tools</sub>]\n",
    "    T --> AG\n",
    "    Q -->|no| F([final answer])\n",
    "\n",
    "    style AG fill:#e3f2fd,stroke:#1976d2\n",
    "    style T fill:#fff3e0,stroke:#f57c00\n",
    "```\n",
    "\n",
    "The graph has only two nodes: an **Agent** (the LLM with `bind_tools(...)`) and a **ToolNode** (LangGraph's prebuilt that calls the requested tools in parallel). `tools_condition` — also a LangGraph prebuilt — inspects the latest message and routes to `tools` if there are pending tool calls, else to `END`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b44cb13",
   "metadata": {
    "papermill": {
     "duration": 0.0,
     "end_time": "2026-05-27T03:49:14.337800+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:14.337800+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 3 · Theory\n",
    "\n",
    "### 3.1 · The shift from \"completion\" to \"tool-augmented agent\"\n",
    "\n",
    "Before tool-calling (mid-2023), the LLM produced a single chunk of text in response to a prompt. Anything that required *up-to-date facts*, *private data*, or *arithmetic the model can't do reliably* had to be handled with awkward workarounds — ReAct-style prompting, plugins, or downstream scaffolding that parsed natural-language pseudo-commands out of the output.\n",
    "\n",
    "OpenAI's function-calling API changed the game by training models to emit a **structured `tool_calls` field** rather than embedding tool invocations in free text. Now the LLM produces:\n",
    "\n",
    "```json\n",
    "{\n",
    "  \"tool_calls\": [\n",
    "    {\"name\": \"web_search\", \"args\": {\"query\": \"LangGraph release date\"}}\n",
    "  ]\n",
    "}\n",
    "```\n",
    "\n",
    "— which downstream code can execute deterministically. After execution, the tool's result is appended to the conversation as a `ToolMessage`, the LLM is invoked again, and it sees both the original request and the tool output before deciding what to do next.\n",
    "\n",
    "### 3.2 · The minimal control loop\n",
    "\n",
    "```python\n",
    "while True:\n",
    "    response = llm_with_tools.invoke(messages)\n",
    "    messages.append(response)\n",
    "    if not response.tool_calls:\n",
    "        return response.content       # final answer\n",
    "    for tc in response.tool_calls:\n",
    "        result = run_tool(tc.name, tc.args)\n",
    "        messages.append(ToolMessage(content=result, tool_call_id=tc.id))\n",
    "```\n",
    "\n",
    "LangGraph's `StateGraph + ToolNode + tools_condition` is exactly this loop, expressed as a graph so it's stoppable, observable in LangSmith, and replaceable with a more elaborate routing strategy when you grow into ReAct (notebook 03), Planning (notebook 04), or PEV (notebook 06).\n",
    "\n",
    "### 3.3 · Where Tool Use sits in the taxonomy\n",
    "\n",
    "| Pattern | Loop body | Thought step? | Plan ahead? | Use this when... |\n",
    "|---|---|---|---|---|\n",
    "| **Tool Use** *(this notebook)* | act → observe | no | no | a single query benefits from one or two external calls |\n",
    "| ReAct (nb 03) | think → act → observe | **yes** | no | multi-step reasoning needs intermediate thoughts to stay coherent |\n",
    "| Planning (nb 04) | plan once → execute step-by-step | no | **yes** | the task naturally decomposes into a fixed sequence |\n",
    "| PEV (nb 06) | plan → exec → verify → maybe replan | no | yes + verification | actions can fail and you need automatic recovery |\n",
    "| Agentic RAG (nb 23) | decide-to-retrieve → retrieve → answer | yes | no | the agent owns *when* to retrieve, not just *what* |\n",
    "\n",
    "Tool Use is the \"single forward step\" version of ReAct. If you find yourself wanting the agent to write a *because-of-this* sentence between tool calls, you've grown into ReAct.\n",
    "\n",
    "### 3.4 · The three failure modes you'll see in § 9\n",
    "\n",
    "1. **Over-search** — the agent keeps calling the search tool even after it has enough information. Fix: a system prompt that explicitly tells it to stop after 2–3 calls (we use this).\n",
    "2. **Result drift** — the agent searches, gets a relevant result, then *answers from parametric knowledge anyway*, ignoring the result. We'll see this live in the captured run.\n",
    "3. **Bad query selection** — the agent issues queries that are too vague to be useful (\"information about X\" instead of \"X release date 2024\"). Tool Use has no built-in fix; ReAct's *thought* step is the standard upgrade.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b0620cfb",
   "metadata": {
    "papermill": {
     "duration": 0.003357,
     "end_time": "2026-05-27T03:49:14.349705+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:14.346348+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 4 · Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "8c6323b0",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-05-27T03:49:14.362602Z",
     "iopub.status.busy": "2026-05-27T03:49:14.362602Z",
     "iopub.status.idle": "2026-05-27T03:49:15.586048Z",
     "shell.execute_reply": "2026-05-27T03:49:15.585720Z"
    },
    "papermill": {
     "duration": 1.230961,
     "end_time": "2026-05-27T03:49:15.586048+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:14.355087+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">Provider: nebius  ·  Model: meta-llama/Llama-</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.3</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">-70B-Instruct</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">─────────────────────────────────────────────────────</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;36mProvider: nebius  ·  Model: meta-llama/Llama-\u001b[0m\u001b[1;36m3.3\u001b[0m\u001b[1;36m-70B-Instruct\u001b[0m \u001b[92m─────────────────────────────────────────────────────\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">LangSmith tracing: enabled                                                                                         \n",
       "</pre>\n"
      ],
      "text/plain": [
       "LangSmith tracing: enabled                                                                                         \n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Tavily key set: <span style=\"font-weight: bold\">True</span>                                                                                               \n",
       "</pre>\n"
      ],
      "text/plain": [
       "Tavily key set: \u001b[1mTrue\u001b[0m                                                                                               \n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from agentic_architectures import get_llm, enable_langsmith, settings\n",
    "from agentic_architectures.architectures import ToolUse\n",
    "from agentic_architectures.ui import print_md, print_header, print_step\n",
    "\n",
    "traced = enable_langsmith()\n",
    "print_header(f\"Provider: {settings.llm_provider}  ·  Model: {settings.llm_model}\")\n",
    "print_md(f\"LangSmith tracing: {'enabled' if traced else 'disabled (no LANGSMITH_API_KEY)'}\")\n",
    "print_md(f\"Tavily key set: **{settings.tavily_api_key is not None}**\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7e0e03c5",
   "metadata": {
    "papermill": {
     "duration": 0.007779,
     "end_time": "2026-05-27T03:49:15.593827+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:15.586048+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 5 · Library walkthrough\n",
    "\n",
    "Source: [`src/agentic_architectures/architectures/tool_use.py`](../src/agentic_architectures/architectures/tool_use.py).\n",
    "\n",
    "The class is short — most of the heavy lifting is done by LangGraph prebuilts (`ToolNode`, `tools_condition`) and the library's `web_search_tool` wrapper around `langchain_tavily.TavilySearch` (which replaces the deprecated `TavilySearchResults` from the original repo).\n",
    "\n",
    "Key design choices:\n",
    "\n",
    "1. **`provider_supports_tools()` check at construction time.** Fails fast with a helpful error if you try Tool Use on a provider that doesn't support tool-calling (e.g., `huggingface`).\n",
    "2. **Default system prompt** caps over-search (\"after at most 2–3 searches, STOP and answer\"). The original `bind_tools(...)` call alone produces 6+ searches per question for chatty models like Llama 3.3 — see § 9.\n",
    "3. **Single `_agent` node** that prepends the system message on first turn only. `add_messages` reducer takes care of the rest.\n",
    "4. **`recursion_limit = 4 × max_rounds + 4`** — LangGraph counts edge traversals, so we budget generously."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "0b57492f",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-05-27T03:49:15.601932Z",
     "iopub.status.busy": "2026-05-27T03:49:15.601932Z",
     "iopub.status.idle": "2026-05-27T03:49:15.626145Z",
     "shell.execute_reply": "2026-05-27T03:49:15.626145Z"
    },
    "papermill": {
     "duration": 0.024213,
     "end_time": "2026-05-27T03:49:15.626145+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:15.601932+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "def __init__(tools, max_rounds, system_prompt):\n",
      "def _agent(state):\n",
      "def build():\n",
      "def run(task):\n",
      "\n",
      "Default system prompt:\n",
      "You are a research assistant with access to web search.\n",
      "\n",
      "Rules:\n",
      "1. Use the search tool only when you need facts you don't already know.\n",
      "2. After at most 2-3 searches, STOP searching and answer using what you found.\n",
      "3. Cite your sources with URLs in the final answer.\n",
      "4. If a search returns enough information, do NOT search again - answer the user.\n"
     ]
    }
   ],
   "source": [
    "import inspect, ast\n",
    "from agentic_architectures.architectures import tool_use as tu_mod\n",
    "\n",
    "src = inspect.getsource(tu_mod.ToolUse)\n",
    "tree = ast.parse(src)\n",
    "for node in ast.walk(tree):\n",
    "    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):\n",
    "        args = ', '.join(a.arg for a in node.args.args if a.arg != 'self')\n",
    "        print(f\"def {node.name}({args}):\")\n",
    "print()\n",
    "print('Default system prompt:')\n",
    "print(tu_mod.ToolUse.DEFAULT_SYSTEM_PROMPT)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c5ff4a8",
   "metadata": {
    "papermill": {
     "duration": 0.004012,
     "end_time": "2026-05-27T03:49:15.636203+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:15.632191+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 6 · State & messages\n",
    "\n",
    "`ToolUseState` has just one field — the message list. LangGraph's `add_messages` reducer means each node returns *the messages it produced* (not a replacement list); the reducer appends in order. That's why both `_agent` and `ToolNode` can return `{'messages': [...]}` without colliding.\n",
    "\n",
    "| Message type | Produced by | Contains |\n",
    "|---|---|---|\n",
    "| `SystemMessage` | first call to `_agent` | the cap-search instruction |\n",
    "| `HumanMessage` | the caller | the task |\n",
    "| `AIMessage` (with `tool_calls`) | `_agent`, mid-loop | the next tool(s) to call |\n",
    "| `ToolMessage` | `ToolNode` | the tool's stringified result |\n",
    "| `AIMessage` (no `tool_calls`) | `_agent`, terminal | the final answer |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "b3de3b18",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-05-27T03:49:15.641959Z",
     "iopub.status.busy": "2026-05-27T03:49:15.641959Z",
     "iopub.status.idle": "2026-05-27T03:49:15.723274Z",
     "shell.execute_reply": "2026-05-27T03:49:15.721478Z"
    },
    "papermill": {
     "duration": 0.087643,
     "end_time": "2026-05-27T03:49:15.725850+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:15.638207+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ToolUseState fields: ['messages']\n"
     ]
    }
   ],
   "source": [
    "from agentic_architectures.architectures.tool_use import ToolUseState\n",
    "print('ToolUseState fields:', list(ToolUseState.__annotations__.keys()))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0d016516",
   "metadata": {
    "papermill": {
     "duration": 0.005791,
     "end_time": "2026-05-27T03:49:15.737488+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:15.731697+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 7 · Build the graph\n",
    "\n",
    "The cell below renders the **actual compiled `StateGraph`** as a PNG (via `mermaid.ink`). If this rendered diagram ever disagrees with the static one in § 2, the implementation has drifted from the documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "5ff9a5bc",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-05-27T03:49:15.748250Z",
     "iopub.status.busy": "2026-05-27T03:49:15.748250Z",
     "iopub.status.idle": "2026-05-27T03:49:18.302352Z",
     "shell.execute_reply": "2026-05-27T03:49:18.300990Z"
    },
    "papermill": {
     "duration": 2.563533,
     "end_time": "2026-05-27T03:49:18.303304+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:15.739771+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAANgAAAD5CAIAAADKsmwpAAAQAElEQVR4nOydB2AURdvHZ/dKLr33QhICAUKJvAEUFRAQ9aUrrwgEAT+kCeInRf0AQSwgiIqKYASESInSW6QoTQidlxJKEJKQkEp6LuXa7vfsXXI5krtAILc3ezc/w7k7M7uX7P1vyvPMPCNmWRYRCJZGjAgEDCBCJGABESIBC4gQCVhAhEjAAiJEAhYQIdYn767y2pnS4hylopoBNEqEaIQYRNEsy1CIAmsXhVjEUtx/ukTuDHJoitGw8H/dfXQFuAOapRjtAaRxt+KO4YBl6r91TSLNcmUo7gJtKnedYTFazDLqB1Kk9pREQts5ifzD7WP6uCEBQhE7oo6sfxRHtuaXFigYhhWJaZmDSGJHi8RIrWAoEWI1NXLkXlntj04o2kROfZxyKFbDIqr+nWkQKKN9yJRWx1r90WKKUdc9+VpZUyyUFFFIwym75pOhKcQ88BnVuxaQymiNhlJVaxRVjErN2MlE/mGyAeP9kXAgQkR5GardcfdU1Yyrl7Tjs64dnndBgkaDjmwtSLsur67U+ATLhr0biISArQtxyzdZefeqQiKdBk3wQ9ZFQa5m38+ZleWaF4b5tenqiPDGpoX485xUqUQ0ZkELZL1cP1V+fMf9oEiHAf+D9TfNdoW4Zl5aYEvHl8f6IBtg9dz0Lv3cO/VwRbhio0Jc9cGdiGiXviO8kc2wek6ad7D94EmY1os0sj3Wzk9v0cbRplQIjP88LD+j8u/tBQhLbE6Iu3/KAXPdK+OsbWjyKLz9WfiVkyUIS2xMiBqUeati3PxQZJvQKLiV49oF6Qg/bEuI8YszvIPskQ0zaJK/sopJuVCBMMO2hFhWqHx9ujAMvObDr4Usafd9hBk2JMQ9cTkOzhKe/+IPP/xw165dqOm8+OKLWVlZyAz0Hx9QUa5GmGFDQsxJqw6J5Ltdvn79Omo6OTk5xcXFyDxIpAjc6H9uxqtStCEhqlRMTB9PZB5Onjw5ceLE5557bsiQIfPnzy8o4KwkMTEx2dnZn376aa9eveBULpevWrVqzJgxumLffPNNdXW17vI+ffps3rz57bffhkuOHTs2cOBASBw8ePCMGTOQGXD3kWanViKcsBUh3rlSSVPIzVeEzMDNmzenT5/epUuXrVu3zp49+9atWwsWLEBadcLrvHnzjh49CgcJCQnr1q0bPXr0t99+C+UPHToUFxenu4NEItmxY0dkZOSKFSueffZZKACJ0KYvW7YMmQH/UHtlJYNwwlbmI+akVYkkFDIPly5dkslkb731Fk3Tfn5+7dq1u337dsNisbGxUPOFhYXpTi9fvpyUlPTuu+8iboIY5erqOnPmTMQLPsHS5FNEiJagSq6haHMJMTo6GhrZ9957r1u3bj169AgODoYWtmExqPZOnToFDTdUmWo1N1zw8PDQ54J8EV94eNuxDF6uXVtpmrmpqWbzqrdp0+a7777z9vb+/vvvhw4dOmXKFKjtGhaDXGiLocDOnTvPnz8/btw4w1ypVIp4QyxCyFxfy8fDVoTo4CQ266Pv3r079AX37NkDvcPS0lKoHXV1nh6WZbdt2zZ8+HAQIjTfkFJeXo4sREl+NcIMWxGid6BUWa1B5uHChQvQ2+Pexdt7wIABMNQFkYEJxrCMSqWqqqry8amZdaZUKo8fP44sRF6Gghbh9dHbihDbdHWGlllZaZbWGRpiGCxv374djH/JyckwOgZF+vv729nZgfJOnz4NDTGMY0JDQ3fv3n3v3r2SkpKFCxdCz7KsrKyiwoi3DUrCKwyr4W7IDOTdrZY5EiFaCImUPn2gCJkBGA5Dg/vVV1+BO2TChAmOjo7QFxSLuYEgDKXPnTsHdSRUh1988QUMrocNGwZGxK5du06dOhVO+/btC7bGejcMCgoCUyIYHaFbicxAYU61b6AM4YQNTYz9bdm9yjL1uE9Ckc3z/f/+M35huL2zWayqj4cN1YgvjvKVl2HnY+WfxF9ywMWHlQqRTS2w9/CTyBzonSuzh0wOMFpAo9GAwdloFowtwAoIZueGWeHh4WvXrkXmYZ0Wo1lOTk7gMzSaFRUVBR4aZIK05IrOvd0RZtjWmpV7/yh2rcp8Z1mEqQINu2s64COHD95oFvQF9WPhZqdci9EsMKFDF9NoFnxnYLRkNOvgxvy0ZPnEReEIM2xu8dSmJRmMho39yJqXkDbCihm3X50S4t+SR+P5o2Fza1ZGzg6pKFWfSTTXJCuc+WVBelArRwxViGxzFd/ExS0vHC4qy7etpmDzkntgwBo8CdOAOLa7wH7FzDt93wiIjLGJJSzxn2Z4BEhxDvZg0yFHfpyd6h8sGzotAFk1a+alyRxFoz4MQRhj60GY1nycplIwT7/iGd1LkGEFG2fHj9nZdypbRbv0G417ZBUSlg6d3FV45UQJ2HgDW9q/HOtHSZDQSbtaeXp/QXGeysFFPHZOC4SX6do4RIg1/L298OaFMkWVhqKRo4vYyU1q7yASSRiVsu75iMQijbpmCo8uqCaljdupfYT6CK9IIqZUtYE09fE2DaNr1gTkhEQR9/xrQsfW3gByaRpp1Ky+mD68rFhMq9WM/lR/ALZ2jZoCB6ZcrqmWq+E+zh6SXsO8g1oJpgdMhFifE7sKs1OrKss0KiUDz0ZjEJuVplmGqXGu6BSmc7VoD+BB1mSJJEijqr2mVl60GDFa/yLDMDRnq9D+o7i4xPU/AQreCDEapHsHbUpN9GLdTerei2IQy91HasdFm7WzF7l4iltHu0R2wT0aYkOIEPlm2rRpI0eOfOaZZxDBABLMnW/UarVuhhjBEPJE+IYI0SjkifANEaJRyBPhG5VKJZEI30TU3BAh8g2pEY1CngjfECEahTwRviFCNAp5InwDQiR9xIYQIfINqRGNQp4I3xAhGoU8Eb4hQjQKeSJ8Q4RoFPJE+AYM2kSIDSFPhFe4yYcMIxIJYaoqvxAh8gppl01BHgqvECGagjwUXiEzHkxBhMgrpEY0BXkovEKEaAryUHiFCNEU5KHwChGiKchD4RUyWDEFESKvkBrRFOSh8I2pWK42DhEir4BzLzc3FxEaQITIK9Au19sajaCDCJFXiBBNQYTIK0SIpiBC5BUiRFMQIfIKEaIpiBB5hQjRFESIvEKEaAoiRF4hQjQFESKvgBA1Gg0iNMAWd56yLOBcIVpsCBEi35DW2ShEiHxDhGgU0kfkGyJEoxAh8g0RolGIEPmGCNEoRIh8Q4RoFLLzFE9ER0fTdM3QEJ45HMPrgAEDFi5ciAhk1MwbHTt2RNxufhxgSqQoyt/fPzY2FhG0ECHyxJtvvuno+MBejZ06dWrdujUiaCFC5Im+ffsays7T03PEiBGIUAsRIn+MHTvWxcVFd9ymTZsOHTogQi1EiPzx/PPPR0ZGwoGrq+uoUaMQwQAyam6ABh3fXVxRplQrNTCo0Gi45yOWiuCU0u45DynwymjTRRJao+I2kdeXpGmK23WeGxdz24rrpjfQIm4jcKCkpORK8iUXJzcYRHNXiZGm1pJD0dy1+n3K9Zfoc7nNwg0+LLGEUqse+Oyk9mK/YPtOPZ2RACFCfIDfl2UV5FVLpCKWYTUqlhJRrE5wUqRRcnvLc4IAfYg4vSKtLnWK1G9QzxXglEghWrvVvEa71XxteYClGcSARLl0SkyxtcqjuMaJZRmqppzBJQ/cthaRBGlUD/zyUhnomLMN9RnuF/GUAxIUxKBdx65V2ZVlzOg5LZGQuXNJ/mdCHi31DY8SkhZJjVjD9uXZlXLN4KnByCrY8Hlq7KxwZ+FENyGDlRpy71X3GRWErAUvP9meNZlIOBAhciT/XQ7jBid3ClkL/uEOFWVC8miTPiIHNMqMClkTMkdKpRTSggQiRA41o9YwVtVXhp4/wyABQYRIwAIiRA6aAhDBghAhcoCtmEXWZcaiOD8NEg5EiBzgPNP+syK4PqKQvlpEiAQsIELkgEaM9BEtCxEiB8sgxrpcnQzFCuurRYTIwXKTYayqSqRZSljfLCJEDmK9sThEiFq42oPMQrIkRIgcDAUytKo6UWgtMxGiFsoKa0OBdTbINDAObuY+xmLcsfP3RV/Ob9IlgvtqkRqRg3PwYexZSUm5jqwdIsTHRC6Xb9m64ey5U+npdzw9vLp37/nWuMkymQyyGIZZ/t2XJ04elUqkffq83D6q00dz3tu25YCHh6darV6z9sfTZ07k5+e2bx89dPDrTz/9nO6GQ17tO27spNLSkvXxcfb29l1inpn6zkxPT6/33p9w+fJFKHDw4L49u446OTkha4Q0zRwURTe1S7V9R8KmzeuGvz76i8+/nThx+tFjh0BAuqwtWzfu2bt92tRZq1ZtsLd3AOUhbdQbeP3u+yVbt20aOmT4po17evboM/+T2ceO/6W7SiKR/PZbPBTbueOv9b9su5p8ad36nyD926/j2rZt369f/yN/nW+SCoXVRyQ1IgfLNtmx8vp/YkFJLVqE6U6Tky+fPZc0ccK7cHzg4N4ez/fu1bMvHI8aOQ7SdWUUCgVkjRwxdtDA1+D0368Mhqvif/0Z7qMrEBgYHDvqLe7IyRlqxFu3bqAnQFizOIgQHxOowM6dP7X4y/m379zSxTt0d/eAV41Gk56e+srLg/Qlezzf58qV/8IBCEupVILC9FnRnf71x/7dpWWlri6ucNq6dVt9lrOzS0WFHD0JxLMiOLjF7U0cZ8b9/H1i4k5olEFYvr5+q9esSPxjF6TLK+Qsyzo41AX+cnV10x3I5eXwOm36/9S7VXFRoU6ItuzfIULkYJmmtWQgtT17tw17beSA/kN1KTqRAQ723LJ2lapuLVZxcaHuwNOLW2Y84/050AQb3s3Hxw/ZPESIHNxYhW5CjQjtb1VVlZeXj+4UGtykU8d1x9Bk+/j4wlBaX/hk0jHdQVBgiJ2dHRw8FR2jSykuLtJWn80fkkEbnURI9SsZNXNwYxWmCR+bWCwOCQmF7l1W9j0wuCz5amGH9tHl5WUVFRWQ2/2ZHgcP7Tt3/jSIDEbQkK67CgQ3dsxEGJ1cvXoJtAvj5Zmzp3y7fPFD3w5q0Bs3ki/+95xhRds4XKwcQRm1iRAfk3lzvpDZycaOGxb75pB/de46fvxUOB36Wt+c3Owxb07o0OGp2R9MHf3m0Lt306AFR5x2JfD6xvA3Z838eFPCuoGDe4GtMcA/aMaMuQ99r4H9X4Xu46zZ71RWVqBHhBHYYIXEvuE4tbfgwuHSMfObJ/xSdXU12KuhytSdJvwWv3Hj2j27jyIeuXmm9Mz++1O/jkACgdSIHM1rcgPlTZg0atv2BGi1Dx85+PuWDYMGDUOERiGDFQ5uYmzzfSXHjplQWlp88ODen1d/7+3tC34UMGsjfmG4wQpZxSc0oH/CNmuAjunvfoAsCs0NVsi6ZqFBN2uNiAVCG6wQIXIwzV0jEpoKESKHNc7QFhhEiFq47hQRoyUhQtRCWduEA848TAYrgoNF1hYNjPteCcpVQYSoDYFbmAAAEABJREFUhbW6aGBCgwiRg2jQ4hAhaiHRwCwNESKHdjkpIlgQIkQOqVQskVlXlUgjiUSEhAOZfcMR1NKBEdLuOA+nJEclrK8WESKHX7hUIqXP/VGErIV7d+QB4ULaFJIIsYZXxgSkXCxGVsH+X3JYhn15jA8SDmSGdg1VVVXvT5/TwfUdD19ZWFsXO0dWbWoaBMX5ptkHErhTmkIMixr6CrnCVL0bcLNjWLruWlSz3En/vxq427LGTZz6C/UHYlpUmKPMSCmTOYhGzBbYBpdEiDX8+uuvUVFRndt3TlieWV6kVqoZRl33ZPR+Cr0kGj41XRlDj0atB1ubbJiu/VfvwevsRywXDKpuDoa+ZD3ZcWsOmZpi+jtL7CiJRKwS5XV4UdWqVSsfH1IjCoeioqLly5d/8skniC+mT58+fPjw7t27IzOwZs2auDguhpOzs7OLi0tISEinTp1at27duXNnhDe2br6ZO3cuKAPxiJeXl6OjIzIPo0aN2rdvX0ZGhlwuz8rKunnz5qFDh9zc3OAdd+3ahTDGRmvE3NzcM2fODB48GFkdq1atWr16db1E+JQvXLiAMMYWR82lpaXjx49/+umnkSWA74BCoUBmY9iwYYGBgYYpdnZ2mKsQ2ZoQc3JyoMFSq9V79+719fVFluCDDz64ffs2MhvQ9D/33HP6hg4OFi1ahLDHhoR4+fLlCRMmwOfk6emJLAd8AcwR7MaQESNGeHtzAZ90LfLOnTtXrlyJ8MYmhJiXl4e0cTL37NmjC4NkQZYsWRIWFobMSVBQUExMDMMwfn5cnLGvv/5aKpVOmzYNYYz1D1ZgtHj48GGw0SA8gL4BVIpisdntFf369Tt48KD+9NSpU3PmzImPjweZIvyw5hqxrIwLw1VZWYmPCoHJkyfn5+cj82OoQuCZZ56BNnrq1KkHDhxA+GG1Qly7dm1iYiLSdpgQTkBzCQZnZAnAxA1aPH78+DfffIMwwwqbZpVKdf/+fXjiU6ZMQQRjbNq0CborDc2NFsTahAgPF/pGUOtA9xxhCbg9oJem2+3CgoANYdKkSevXrwcHIMIAq2qat27dCjZCcLBiq0IgNja2uroaWRrwQUMbvWDBAmg6EAZYiRC3bNkCr71794ZvOcKbgIAATL4nEokE2ujk5OTPP/8cWRprEOKMGTN0HQwPDw+EPQkJCTzYbh6duXPntmvXbtSoUbrdYiyFsPuI58+fB8stWObqeVdx5u7duy1atECYkZKSMmbMmJ9++gmabGQJhFojKpVK8O7ruvwCUiH0DqHuQfgRGRl5+vTp7777bvPmzcgSCFKIRUVFBQUFy5Ytw3++Zz2g/QkPD0e4smbNmuzsbGisEe8IrGkG/b399ttgrHZ3d0cE87B///64uDiw7Dg7OyO+EJgQt2/f3qVLl+DgYCRMNBpNTk4Ont5eQ8DYCV3GxYsXd+vWDfGCMJrm1NTUd955Bw5effVV4aoQAJcP/gYmAGyxR44ciY+Ph8YH8YIwhAj+ko8//hgJH4qiMBwym2LFihUKhQKsY8j8YN00X7t27cqVK7jNWrA1jh07tmjRIqgdzbo+Fd8aEYbGS5cuHTBgALIiwOoEw1IkKHr27Llhw4axY8devXoVmQ18hQjuh3Xr1vE5cOOBqqqq+fPnC86J4OXllZiYCFZG3Vx3c4CpEDdu3Hj27Flkdbi6uv7444979uxhGOHt63Lp0iXzrTjDdIF9fn6+lYX51yORSAYNGpSZmQluIQH5hP7555+ICDPudYqpEGGAgtXMgGYHjFCDBw/etGmT+aI+NC8gxFatWiGzgWnT7OfnB/0SZNXs2rUrJSVFLpcjIXDnzh2z1oiYCnHHjh27d+9G1g74yrOyspKSkhD2mLtpxlSI4FMGVxiyASIjIxMSEvCvF2/fvm1WIWJq0AZXGIwrLRUVhH/AuAh/L7Y+6NLSUnCu/vXXX8hsYFojent7244KkXb9QHFxsaXmAj4Uc1eHCFshHjhw4LfffkO2RIcOHaBeBIs3wg/bFWJhYaHgXGFPjm7xzcWLFxFmmNt2g7AV4ksvvfTGG28g28PBwUEmk33xxRcIJ6BGNLcQMTUaWzZynGVp167dzZs3EU7YbtN87Nix9evXI1sFhqjwioklFbyRMHY0dzg/TIUI9oKMjAxk28DwZebMmcjS8NBBRNg2zT169BDcCr1mJywsbOzYscjS8NAuI2xrRDc3N/xXGPFA+/bt4dWyUeRsWohnz57FP+wzb0C9aMElV/w0zZgKEXyvaWlpiKDF3d196dKlcKAPT/Pyyy8PHDgQmR+FQpGfn8/DyklMhRgTE6NbP0rQoVsyARbvioqKAQMGFBQUgEuQhyDEPFgQdWAqRBcXFwEtu+SN5cuXv/LKK7m5uUi7/MWssxB0mHv2lx5MhXjt2rVly5YhwoMMHz68srJSd0xRVEpKik6U5oOfkQrCVojwuM26PZMQGTly5J07dwxT8vLywPKPzAk/IxWErRDBzTVr1ixEMEA3YVEkEulTlErloUOHkDkx9woBPZgatB0dHXEO32YREhISLl68eO7cuTNnzoBVIScnx9exM1vmcWj7LX9/v7pylHZzccPpzlDbMNxu5Gyj6yLrtjZnazZILy8vD/V8PvM6lYnK6rYuR4aHhtcbSaVpyifIzivw4aGa8ZqhPX78eHjE8CtB01xWVgZmC6gG4PjPP/9EBAN+WZhaWaqhaKTh7DkP6IvTw4Oio2mkXUXN1itZD70QtRc3KGlwT31JQ+1RtW9gqCexBARGSaRUx2fdu/3bDZkGrxoRWuQNGzbot34AUwXSztZGBAPiPkr1DrEfNtkf4bt3wgNcSyq9erLIP9QupJ3JnY7w6iPGxsY29Ox17doVEWqJ+7/UtjGefUcKRoVAVHfX4bPCEtfnnD9YaqoMXkL08fHp37+/YYqnpyeeQactwh/r88USUXRfVyRA2nZzu3Ss0FQudqPmESNGGFaK0dHRmGyNhAN5GdVe/jIkTDr38VCpWKWJdbPYCRF8KuBF1cUb8fDwGD16NCLUolKoxTIBb40DY6aCPOOrw3D8q/SVYnstiFCLWsmqlSokWBgNy5jYVeiJRs3KKpS0ryDvbnVFmVqlBAsBBe9Ul/3gUB5sDSxj7LS2GEVTLMMdgXWgV4tF6iC1VCRZOTsVmTBcUZzpiar3dnXGMMM/Ugw3p8Ui5OghCWwp6z7AdhfEYMtjCnF/fF7GjQqVgqElYOqnaanIzkkKMmINBFNPPfUkUndae1RnnaKQlH3QRmVMXkblaUKIIlC4RqEuylXlZVRd+KvIwVnSurPz80OIInGhyUL845e81GtykYhy9nYOjBLA3ncN0Sg1964VXTlRcvVkSecX3J7+t2DkCM2IoMNGao3txn//pgnxpw/SoLpp0cHfyce8a7rMikgqavEUGMm9798phdrxxpnycZ+EIiEAnRlB753I9d6Q8d//UQcrGTerfnj/trOPY5teIYJWoSHeLV2j+oYhkeTHmXcQwfxwlaGJ79EjCbH0vnp3XFa73mEB7aywUxXWxc8v0lsQWoRmzUoDOj+CEG9frty45G77F8NoEbJWPIIcw7oEr5iJ/QxIquafQNH2EY1nPVyIB9bnRHQV8K5jj4i9i8irhcdPH2G9YkvofURues7jNc1xc9KdfZ2kTtZbGRrgG+FKi+lNSzMRgXcaE+KxrQVqpSako5UHVTekVfegwmxFTpoSEcwCix6jab52ptQ7XJCWwifBycN+7+osRDADlMmW2bQQk3YXgqfEO9QFYcmlq3/OnNdNXlGMmpuwGD9FpaasAMudoSgLDFWGvNo3/tfVqDl4nD7i9bNlDm4m59NaNyIpvT8e23i1TZPiJws/TPxjF8Iek0KsqtD4Rdhcu6zD2de5MFeBMIRFLGraqDkl5ToSAsZdfDfOyMGtae9mrtno6RlXDh5ZnXnvupOje9vI5/q9MF4m43YCO3l6y6Fjaye/tTI+4aO8/FR/34ge3Ud06VyzU+7e/d+fv5xoJ3V4quNLPl4hyGz4tfQozSpDwueFPjHwuvSrT1eu+mbPrqNwfPLksfXxcXcz0lxd3SIiIqdP+8DXt2YFYCNZOsBytG375gMH9mbeu9siJCwm5um3xk02XN76CLBN6yOm35CLJOZaV1VQmPnTumkqlWLqhNVjRn6Zk/fPyrWTNdrlaCKxpKqqfOe+r14f8n9LF57u2L737zs/Ky7hghkknd2WdHbrq/1nTZ/4i6d7wKEja5DZEEs5B8Y/F4SxOVkj7E88Ca+zZs7TqfD8hTMfL5jVr1//3xMS589bnJeX8+13i3UlG8nSs317woaNa4e9NjJh096BA1/bl7gz4bd41DRMOoaMC7G8WCMWmatbfPHyfrFIMnbEl77eoX4+4f8ZPCcrJyX5Rk3EAo1G9eIL41sEd6AoKia6P3wLs3JuQfqJU793jOoD0nRwcIE6MiI8BpkTWkTnZmLZOj8Ba39Z2eP53qAkqPOiojpOmfz+6dMnbmrb7kay9Fy+cjEyst1LLw1wc3Mf0H/oih/Wdev6LGomjAtRrdKYz6kJ7XJwUDtHx5pVrh7u/p4eQWl3L+kLhARG6Q4c7Lkxe1V1OcixoCjT1ydMXyYooA0yKyxbWYHdXGjO1/wE4+bU1H/atInSn0a2bgevN29eazxLT/v2nS5cOLNk6cL9B/aUlpUGBgRFRDTbciKT7S+DzGW/qKqWZ2ZdB+OLYWJZed36roZT7qoVFQyjsbNz0KdIpWYe0VOUiMJuHQXDck4+9FjI5XKFQmFnV7f2ysGBe56VlRWNZBneAepLBwfHk0nHvlzyiVgs7tXrxYlvv+vl1YRV5424yo0LUWonopC5fJrOzp5hLaJf6j3BMNHRsbElkjI7R5oWqVTV+hSFshKZE6iDZQ7YOTafZPKNTMbprLq6bu1ShVZnnh5ejWQZ3oGmaWiR4Sc9PfXixbPr4uMqKuRffNaUsMqUSYO2cSG6eEoKcszl5grwbXXhcmJ46FP6iA65+aneno2NgqGOdHfzT8+42rO2T3Ij5SQyJwzD+oXhZ0alHn+GNtRhka3bXrt2RZ+iOw5v2aqRLMM7wHi5deu2YWEtQ0PD4adcXr4vcQdqCtwioyYZtFt1ctaozdU0g0WGYZjdf3yjVFbn37+798APy34YmZP3kClYndr3vXr9CDhU4Pjw3/F37yUjs6GUa6AVjOjkgDBDO7G0CS2VnZ2dt7fP+fOn/3vpvFqtHjpk+ImTR7dt21xWXgYpP678uvNTXVpFRELJRrL0/HV4P4ysk5KOQwcRhjJ/nzjcPqoTaiaM14hhHezhDy6/X+3s3fzLuWHYO3PqpiN///rtqjH599NDgqL+M2TOQwcffXuOq6go3pm4bMPvc6BlH/TKe5u2fGymOVF5acViGa4TjppYIY4a+dYv61adPZe0edNesM7cL8j/bcuvP/y4DGyEMf96+u3xUwN8t4UAAAP+SURBVHXFGsnSM+P9uT+s+GrOvPcRt+TcE9ro/wyLRc2EyWhg6z+9q2FF4V38ke2RcjwzIFQ2cKIfwoyVs+8ERti/MDwACZN1C24PnRQYFGmkz2NyYNiph3tVqbUZ0h4RlUKNoQoRsoJ1Aixq6iq+6J4up/bez7lZ5N/GuMe5pDTvqx9GGs2yt3OqUhh3S/h5h0+d8DNqPuZ+3sdUFnhrRCIjf2BoSMfxo02O9W6fyXZylSAsEfLsbB2Uqb+hMT9el5c9z+4vNCVEZyfP96f8ajQLRiFSqfHOJU03s+fQ1O/A/RoqhVRiZMGhWNSYD726TDF5MR/Beh8DWkRRtHWunmpMFjF93K6eKE07nxMWY6SnCJWNh7vlOyvN+zvcOp4Z3MpBhGvoQUbD6qKyWB8PcR6Mm9+iulxRkmNe6zEm3Lt6nxaxgyfjOxSw6eWkkxe3vHctH1k7OTeKywsqxn8WhjCGiy1knRXioyywp9HkJS2TD6UVZVltvZh5pbA0vwz+TIQ31BN4VjChyWtWDBGJ0NSvI3Ju5qeey0FWR8rfmZUlFZMWC2A3DW2NKOwqsWnzEY3yzlctEaO+fjg9N6X5lyxZhPRL+df+Snd1E09chHWLrIerEW1w1NyQtxaEnjlQfPlocXF2mczJzjfCw8FdOMHtaynKkhemlSoVKqmMHjohOKC1YP4ErkYU9qjZ5C/fZKtet5fc4efCnyWXT5SkX8zm5lOIaA4RhR6MCVu7z0ztVjD1IsaaiMOpL0ZRqGalEMXtP6OP4ak9YCkuXmzNqW6WW90pd4cHCsBYGDE0wzBqhYbhspCzh+TFEQGh7YW2TFHwtaHJP+Axzcv/6usGP3Bw+7/yW5fk5UUqRRWjtXLVlRGJKY1aKySttvSnOmiRLpJLzTGjqb1KQmlUXCromNVGe4FfnqZZXYEacdNazTO1qqW0d1DDW3A7MdFiitWwkAWXgPkc0sUSipYgmYPUxV3c7hmXwJZCDcyPrECKJnhSP0fEU07wgwj8wCIrtd7guikkwSgSqUgsEXBALLFY23gZzUIE4SCRUYpKLGOhPBrQaQ8KNz40FPDuMTZIaFtcQ1A8Akm7C+zsRchEhU6EKCR6vuYBg5XDmwTpcb2bXNb7Pz6mcvHar5nwKMR/lgHmg869vFpECWD4Ly9hL/55/+7N8jFzQx1dTXZwiRAFyZZvs4pylRo1ozHc6qvBzuCclfdxfdPGtw9HD91//AHAugyeIHsncb9RvgERjX1tiBCFjBJVVWnqTh9wDBiY+A3RuxmMJhq4GViaoup5cfT+g4auCMqE00Qksn804x4RIgELiPmGgAVEiAQsIEIkYAERIgELiBAJWECESMCC/wcAAP//XBFHRQAAAAZJREFUAwCMoYZjcDValQAAAABJRU5ErkJggg==",
      "text/plain": [
       "<IPython.core.display.Image object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---\n",
      "config:\n",
      "  flowchart:\n",
      "    curve: linear\n",
      "---\n",
      "graph TD;\n",
      "\t__start__([<p>__start__</p>]):::first\n",
      "\tagent(agent)\n",
      "\ttools(tools)\n",
      "\t__end__([<p>__end__</p>]):::last\n",
      "\t__start__ --> agent;\n",
      "\tagent -.-> __end__;\n",
      "\tagent -.-> tools;\n",
      "\ttools --> agent;\n",
      "\tclassDef default fill:#f2f0ff,line-height:1.2\n",
      "\tclassDef first fill-opacity:0\n",
      "\tclassDef last fill:#bfb6fc\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from IPython.display import Image, display\n",
    "\n",
    "arch = ToolUse(max_rounds=4)\n",
    "graph = arch.build()\n",
    "display(Image(graph.get_graph().draw_mermaid_png()))\n",
    "print(arch.diagram())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a0d25d1",
   "metadata": {
    "papermill": {
     "duration": 0.008397,
     "end_time": "2026-05-27T03:49:18.317625+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:18.309228+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 8 · Live run\n",
    "\n",
    "Concrete task: time-sensitive research that the model can't possibly have memorized. The question below asks about the *current* state of an open-source project — the model's training cutoff is well in the past, so an honest answer requires real web search and the citation requirement forces grounding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "961ce3b8",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-05-27T03:49:18.330237Z",
     "iopub.status.busy": "2026-05-27T03:49:18.325748Z",
     "iopub.status.idle": "2026-05-27T03:49:52.522374Z",
     "shell.execute_reply": "2026-05-27T03:49:52.520961Z"
    },
    "papermill": {
     "duration": 34.220081,
     "end_time": "2026-05-27T03:49:52.537706+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:18.317625+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">Final answer</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">──────────────────────────────────────────────────────────────────────────────────────────────────────</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;36mFinal answer\u001b[0m \u001b[92m──────────────────────────────────────────────────────────────────────────────────────────────────────\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">The latest stable Python release version as of 2026-05-27 is Python 3.14.5. Two user-visible features new in this  \n",
       "release are <span style=\"color: #008080; text-decoration-color: #008080; background-color: #000000; font-weight: bold\">compression.zstd</span> and <span style=\"color: #008080; text-decoration-color: #008080; background-color: #000000; font-weight: bold\">except*</span> statements.                                                               \n",
       "\n",
       "For more information, see the official Python blog at https://blog.python.org/2026/05/python-3145-is-out and the   \n",
       "Python documentation at https://docs.python.org/3.16/whatsnew/3.16.html.                                           \n",
       "</pre>\n"
      ],
      "text/plain": [
       "The latest stable Python release version as of 2026-05-27 is Python 3.14.5. Two user-visible features new in this  \n",
       "release are \u001b[1;36;40mcompression.zstd\u001b[0m and \u001b[1;36;40mexcept*\u001b[0m statements.                                                               \n",
       "\n",
       "For more information, see the official Python blog at https://blog.python.org/2026/05/python-3145-is-out and the   \n",
       "Python documentation at https://docs.python.org/3.16/whatsnew/3.16.html.                                           \n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> tool call(s)  ·  </span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> final agent round(s)  ·  tools used: tavily_search</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">───────────────────────────────────────────</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;36m2\u001b[0m\u001b[1;36m tool \u001b[0m\u001b[1;36mcall\u001b[0m\u001b[1;36m(\u001b[0m\u001b[1;36ms\u001b[0m\u001b[1;36m)\u001b[0m\u001b[1;36m  ·  \u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;36m final agent \u001b[0m\u001b[1;36mround\u001b[0m\u001b[1;36m(\u001b[0m\u001b[1;36ms\u001b[0m\u001b[1;36m)\u001b[0m\u001b[1;36m  ·  tools used: tavily_search\u001b[0m \u001b[92m───────────────────────────────────────────\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from datetime import date\n",
    "\n",
    "TASK = (\n",
    "    f\"As of {date.today().isoformat()}, what is the latest stable Python release \"\n",
    "    f\"version, and name 2 user-visible features new in that release. \"\n",
    "    f\"You MUST cite at least 1 source URL (e.g. python.org or PEP page) — \"\n",
    "    f\"answers without a URL will be considered ungrounded.\"\n",
    ")\n",
    "\n",
    "result = arch.run(TASK)\n",
    "\n",
    "print_header(\"Final answer\")\n",
    "print_md(result.output)\n",
    "print()\n",
    "print_header(\n",
    "    f\"{result.metadata['tool_calls']} tool call(s)  ·  \"\n",
    "    f\"{result.metadata['rounds']} final agent round(s)  ·  \"\n",
    "    f\"tools used: {', '.join(result.metadata['tools_used']) or 'none'}\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1c59a2a8",
   "metadata": {
    "papermill": {
     "duration": 0.01702,
     "end_time": "2026-05-27T03:49:52.571820+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:52.554800+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "### 8.0 · What just happened, briefly\n",
    "\n",
    "Look at the **tool call count** above. Three regimes you might see, each meaningful:\n",
    "\n",
    "- **`tool_calls = 0`** — the agent answered from parametric knowledge, ignoring the citation requirement. That's *result drift*, the most dangerous Tool-Use failure mode (the answer *looks* confident but is ungrounded).\n",
    "- **`tool_calls = 1–3`** — focused use; the agent searched, found enough, answered. This is what we want.\n",
    "- **`tool_calls ≥ 4`** — *over-search*. The agent kept searching past the point of diminishing returns. Usually a sign that either (a) the model didn't trust the first results, or (b) it ignored its own system-prompt cap.\n",
    "\n",
    "§ 9 below will quantify which regime this specific run fell into."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fdcff209",
   "metadata": {
    "papermill": {
     "duration": 0.007624,
     "end_time": "2026-05-27T03:49:52.594358+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:52.586734+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "### 8.1 · Full trace\n",
    "\n",
    "Every event the agent took, in order. `tool_call` events show *what the model asked for*; `tool_result` events show what came back; `agent` events are the model's natural-language outputs (only the final one has no tool calls)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "a63ededd",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-05-27T03:49:52.642752Z",
     "iopub.status.busy": "2026-05-27T03:49:52.641722Z",
     "iopub.status.idle": "2026-05-27T03:49:52.755367Z",
     "shell.execute_reply": "2026-05-27T03:49:52.751294Z"
    },
    "papermill": {
     "duration": 0.144899,
     "end_time": "2026-05-27T03:49:52.757796+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:52.612897+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">›</span> <span style=\"font-weight: bold\">[</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span><span style=\"font-weight: bold\">] USER</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1m]\u001b[0m\u001b[1m USER\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">As of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2026</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">05</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">27</span>, what is the latest stable Python release version, and name <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> user-visible features new in that \n",
       "release. You MUST cite at least <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> source URL <span style=\"font-weight: bold\">(</span>e.g. python.org or PEP page<span style=\"font-weight: bold\">)</span> — answers wi\n",
       "</pre>\n"
      ],
      "text/plain": [
       "As of \u001b[1;36m2026\u001b[0m-\u001b[1;36m05\u001b[0m-\u001b[1;36m27\u001b[0m, what is the latest stable Python release version, and name \u001b[1;36m2\u001b[0m user-visible features new in that \n",
       "release. You MUST cite at least \u001b[1;36m1\u001b[0m source URL \u001b[1m(\u001b[0me.g. python.org or PEP page\u001b[1m)\u001b[0m — answers wi\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">›</span> <span style=\"font-weight: bold\">[</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span><span style=\"font-weight: bold\">] TOOL CALL → tavily_search</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m2\u001b[0m\u001b[1m]\u001b[0m\u001b[1m TOOL CALL → tavily_search\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">`latest stable Python release version and new features`\n",
       "</pre>\n"
      ],
      "text/plain": [
       "`latest stable Python release version and new features`\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">›</span> <span style=\"font-weight: bold\">[</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span><span style=\"font-weight: bold\">] TOOL RESULT (tavily_search)</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m3\u001b[0m\u001b[1m]\u001b[0m\u001b[1m TOOL RESULT \u001b[0m\u001b[1m(\u001b[0m\u001b[1mtavily_search\u001b[0m\u001b[1m)\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span><span style=\"color: #008000; text-decoration-color: #008000\">'error'</span>: <span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">ValueError</span><span style=\"font-weight: bold\">(</span><span style=\"color: #008000; text-decoration-color: #008000\">'Error 400: When time_range is set, start_date or end_date cannot be set'</span><span style=\"font-weight: bold\">)}</span><span style=\"color: #808000; text-decoration-color: #808000\">...</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\u001b[32m'error'\u001b[0m: \u001b[1;35mValueError\u001b[0m\u001b[1m(\u001b[0m\u001b[32m'Error 400: When time_range is set, start_date or end_date cannot be set'\u001b[0m\u001b[1m)\u001b[0m\u001b[1m}\u001b[0m\u001b[33m...\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">›</span> <span style=\"font-weight: bold\">[</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span><span style=\"font-weight: bold\">] TOOL CALL → tavily_search</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m4\u001b[0m\u001b[1m]\u001b[0m\u001b[1m TOOL CALL → tavily_search\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">`latest stable Python release version and new features`\n",
       "</pre>\n"
      ],
      "text/plain": [
       "`latest stable Python release version and new features`\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">›</span> <span style=\"font-weight: bold\">[</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span><span style=\"font-weight: bold\">] TOOL RESULT (tavily_search)</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m5\u001b[0m\u001b[1m]\u001b[0m\u001b[1m TOOL RESULT \u001b[0m\u001b[1m(\u001b[0m\u001b[1mtavily_search\u001b[0m\u001b[1m)\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span><span style=\"color: #008000; text-decoration-color: #008000\">\"query\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"latest stable Python release version and new features\"</span>, <span style=\"color: #008000; text-decoration-color: #008000\">\"follow_up_questions\"</span>: null, <span style=\"color: #008000; text-decoration-color: #008000\">\"answer\"</span>: null, \n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">\"images\"</span>: <span style=\"font-weight: bold\">[]</span>, <span style=\"color: #008000; text-decoration-color: #008000\">\"results\"</span>: <span style=\"font-weight: bold\">[{</span><span style=\"color: #008000; text-decoration-color: #008000\">\"url\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"https://blog.python.org/2026/05/python-3145-is-out\"</span>, <span style=\"color: #008000; text-decoration-color: #008000\">\"title\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"Python 3.14.5 is </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">out! | Python Insider\"</span>, <span style=\"color: #008000; text-decoration-color: #008000\">\"content\"</span>: \"## Major new features of the <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.14</span> ser<span style=\"color: #808000; text-decoration-color: #808000\">...</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\u001b[32m\"query\"\u001b[0m: \u001b[32m\"latest stable Python release version and new features\"\u001b[0m, \u001b[32m\"follow_up_questions\"\u001b[0m: null, \u001b[32m\"answer\"\u001b[0m: null, \n",
       "\u001b[32m\"images\"\u001b[0m: \u001b[1m[\u001b[0m\u001b[1m]\u001b[0m, \u001b[32m\"results\"\u001b[0m: \u001b[1m[\u001b[0m\u001b[1m{\u001b[0m\u001b[32m\"url\"\u001b[0m: \u001b[32m\"https://blog.python.org/2026/05/python-3145-is-out\"\u001b[0m, \u001b[32m\"title\"\u001b[0m: \u001b[32m\"Python 3.14.5 is \u001b[0m\n",
       "\u001b[32mout! | Python Insider\"\u001b[0m, \u001b[32m\"content\"\u001b[0m: \"## Major new features of the \u001b[1;36m3.14\u001b[0m ser\u001b[33m...\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">›</span> <span style=\"font-weight: bold\">[</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6</span><span style=\"font-weight: bold\">] AGENT</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;35m›\u001b[0m \u001b[1m[\u001b[0m\u001b[1;36m6\u001b[0m\u001b[1m]\u001b[0m\u001b[1m AGENT\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">The latest stable Python release version as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2026</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">05</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">27</span> is Python <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.14</span>.<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>. Two user-visible features new in this \n",
       "release are `compression.zstd` and `except*` statements. \n",
       "\n",
       "For more information, see the official Python blog at <span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://blog.python.org/2026/05/python-3145-is-out</span> and the \n",
       "Python documen\n",
       "</pre>\n"
      ],
      "text/plain": [
       "The latest stable Python release version as of \u001b[1;36m2026\u001b[0m-\u001b[1;36m05\u001b[0m-\u001b[1;36m27\u001b[0m is Python \u001b[1;36m3.14\u001b[0m.\u001b[1;36m5\u001b[0m. Two user-visible features new in this \n",
       "release are `compression.zstd` and `except*` statements. \n",
       "\n",
       "For more information, see the official Python blog at \u001b[4;94mhttps://blog.python.org/2026/05/python-3145-is-out\u001b[0m and the \n",
       "Python documen\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "for i, t in enumerate(result.trace, 1):\n",
    "    if t['type'] == 'user':\n",
    "        print_step(f\"[{i}] USER\", t['content'][:200])\n",
    "    elif t['type'] == 'tool_call':\n",
    "        args = t['args'] if isinstance(t['args'], dict) else str(t['args'])\n",
    "        query = args.get('query', args) if isinstance(args, dict) else args\n",
    "        print_step(f\"[{i}] TOOL CALL → {t['tool']}\", f\"`{query}`\")\n",
    "    elif t['type'] == 'tool_result':\n",
    "        snippet = t['content'][:300].replace('\\n', ' ')\n",
    "        print_step(f\"[{i}] TOOL RESULT ({t['tool']})\", snippet + '...')\n",
    "    elif t['type'] == 'agent':\n",
    "        print_step(f\"[{i}] AGENT\", (t.get('content') or '')[:300])\n",
    "    print()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ba45894a",
   "metadata": {
    "papermill": {
     "duration": 0.026235,
     "end_time": "2026-05-27T03:49:52.834361+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:52.808126+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 9 · What we just observed\n",
    "\n",
    "The cells above are live. Below: a quantitative breakdown of the **actual** tool-call sequence the Nebius-hosted Llama-3.3-70B agent produced on this run.\n",
    "\n",
    "### 9.1 · Quantitative summary\n",
    "\n",
    "| Metric | Value |\n",
    "|---|---|\n",
    "| Tool calls made | **2** |\n",
    "| Tools used | tavily_search |\n",
    "| Final agent rounds | 1 |\n",
    "| Final answer length (chars) | 583 |\n",
    "\n",
    "**Queries the agent issued to Tavily:**\n",
    "\n",
    "| # | Query |\n",
    "|---|---|\n",
    "| 1 | `latest stable Python release version and new features` |\n",
    "| 2 | `latest stable Python release version and new features` |\n",
    "\n",
    "\n",
    "### 9.2 · Pathologies surfaced in this run\n",
    "\n",
    "- **Repeated queries.** 1 of the 2 queries were duplicates. The agent has no memory that it already asked. This is a real limitation of Tool Use — ReAct's *thought* step partly fixes it because the model has to justify each search.\n",
    "\n",
    "\n",
    "### 9.3 · The final answer (verbatim)\n",
    "\n",
    "> The latest stable Python release version as of 2026-05-27 is Python 3.14.5. Two user-visible features new in this  \n",
    "> release are compression.zstd and except* statements.                                                               \n",
    "> \n",
    "> For more information, see the official Python blog at https://blog.python.org/2026/05/python-3145-is-out and the   \n",
    "> Python documentation at https://docs.python.org/3.16/whatsnew/3.16.html.                                           \n",
    "> \n",
    "> \n",
    "> \n",
    "> 2 tool call(s)  ·  1 final agent…\n",
    "\n",
    "### 9.4 · The takeaway\n",
    "\n",
    "Tool Use is the **right** pattern when the model needs one or two facts from outside its training data. It's the **wrong** pattern when you need:\n",
    "\n",
    "- *Multi-step reasoning* between calls → use **ReAct (nb 03)**.\n",
    "- *Guaranteed grounding* of the final answer → use **Self-RAG (nb 25)** or **Corrective RAG (nb 24)**.\n",
    "- *Recovery* from failed tool calls → use **PEV (nb 06)**.\n",
    "\n",
    "The pathologies you saw above are not bugs in the implementation — they're inherent to the act-only loop. They motivate the next several notebooks."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c3d4894",
   "metadata": {
    "papermill": {
     "duration": 0.021154,
     "end_time": "2026-05-27T03:49:52.867643+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:52.846489+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 10 · Try other providers\n",
    "\n",
    "Tool Use needs **tool-calling support**. The library's capability matrix (`provider_supports_tools`) gates this — providers without tool-calling (e.g., `huggingface`) will refuse to construct `ToolUse(...)`. Everywhere else, the same notebook runs unchanged."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "93ea2c93",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-05-27T03:49:52.903081Z",
     "iopub.status.busy": "2026-05-27T03:49:52.903081Z",
     "iopub.status.idle": "2026-05-27T03:49:52.932647Z",
     "shell.execute_reply": "2026-05-27T03:49:52.929403Z"
    },
    "papermill": {
     "duration": 0.047263,
     "end_time": "2026-05-27T03:49:52.932647+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:52.885384+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[skip] openai: no API key in .env"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "[skip] anthropic: no API key in .env\n",
      "[skip] groq: no API key in .env\n"
     ]
    }
   ],
   "source": [
    "from agentic_architectures.llm.factory import provider_supports_tools\n",
    "\n",
    "PROVIDERS_TO_TRY = [\"openai\", \"anthropic\", \"groq\"]\n",
    "for p in PROVIDERS_TO_TRY:\n",
    "    key = settings.api_key_for(p)\n",
    "    if key is None or not key.get_secret_value():\n",
    "        print(f\"[skip] {p}: no API key in .env\")\n",
    "        continue\n",
    "    if not provider_supports_tools(p):\n",
    "        print(f\"[skip] {p}: provider does not advertise tool-calling\")\n",
    "        continue\n",
    "\n",
    "    print_header(f\"Re-running Tool Use on {p}\")\n",
    "    other_llm = get_llm(provider=p)\n",
    "    other_arch = ToolUse(llm=other_llm, max_rounds=2)\n",
    "    r = other_arch.run(\"What is the current price (USD) of one Bitcoin? Cite the source URL.\")\n",
    "    print(r.output[:400])\n",
    "    print(f\"  tool_calls: {r.metadata['tool_calls']}, rounds: {r.metadata['rounds']}\")\n",
    "    print()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee3c71a0",
   "metadata": {
    "papermill": {
     "duration": 0.019857,
     "end_time": "2026-05-27T03:49:52.979522+00:00",
     "exception": false,
     "start_time": "2026-05-27T03:49:52.959665+00:00",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## 11 · Failure modes, safety, extensions\n",
    "\n",
    "### 11.1 · Where this breaks\n",
    "\n",
    "| Failure | Mechanism | Mitigation |\n",
    "|---|---|---|\n",
    "| **Over-search** | Model keeps calling tools even when it has enough info | System prompt cap (we use this) + `max_rounds` bound |\n",
    "| **Result drift / hallucination** | Tool result is in context but model answers from parametric knowledge anyway | Force grounding in prompt; switch to **Self-RAG (nb 25)** or **Corrective RAG (nb 24)** |\n",
    "| **Vague queries** | Model issues queries too generic to retrieve anything useful | Add a thought step → upgrade to **ReAct (nb 03)** |\n",
    "| **Repeated identical queries** | Agent forgets it already asked and re-asks | Maintain a dedup cache; or use a planner (nb 04) |\n",
    "| **Tool execution errors** | Tool raises (network down, rate limit) and agent doesn't recover | The library's `web_search_tool` wraps with `tenacity` exponential backoff; for tools you write yourself, do the same |\n",
    "| **Prompt injection through tool output** | Tool result contains adversarial text that hijacks the agent | Treat tool output as untrusted; sanitize / quote when re-prompting |\n",
    "\n",
    "### 11.2 · Production safety\n",
    "\n",
    "- **Cap rounds + recursion limit.** Both are configured by default in this library; never remove them.\n",
    "- **Whitelist tools.** Don't bind every tool you have — give the agent only the tools relevant to the task. Each extra tool widens the prompt-injection surface.\n",
    "- **Don't let tool output flow to the user verbatim.** A user-asked-for \"summary\" should still be model-generated, not pasted from a search result the user can't see.\n",
    "- **Add a per-tool timeout.** A hung tool with no timeout will deadlock the whole graph.\n",
    "\n",
    "### 11.3 · Three extensions to try\n",
    "\n",
    "1. **Add a Python REPL tool.** Use `agentic_architectures.tools.code_exec.python_repl_tool` to give the agent arithmetic / data-manipulation power. Useful for \"calculate this from the search results\" tasks.\n",
    "2. **Multi-tool agent.** Bind both `web_search_tool` and a domain-specific tool (e.g., a SQL query tool). Watch how the agent picks between them. This is the path toward **Meta-Controller (nb 11)**.\n",
    "3. **Swap to ReAct (nb 03).** Same task, with explicit *thought* before each action. You'll see the agent's queries get more specific because it has to write a justification first.\n",
    "\n",
    "### 11.4 · What to read next\n",
    "\n",
    "- [**03 · ReAct**](./03_react.ipynb) — Tool Use + an explicit reasoning step. The natural next stop.\n",
    "- [**04 · Planning**](./04_planning.ipynb) — when the task is big enough that planning ahead beats reacting.\n",
    "- [**06 · PEV**](./06_pev.ipynb) — Tool Use + an automatic verifier that catches bad tool outcomes.\n",
    "- [**23 · Agentic RAG**](./23_agentic_rag.ipynb) — Tool Use where the tool is a vector retriever and the agent decides when to retrieve.\n",
    "\n",
    "### 11.5 · References\n",
    "\n",
    "1. Schick, T. et al. *Toolformer: Language Models Can Teach Themselves to Use Tools.* 2023. [arXiv:2302.04761](https://arxiv.org/abs/2302.04761)\n",
    "2. OpenAI. *Function calling and other API updates.* June 2023. [openai.com/blog/function-calling-and-other-api-updates](https://openai.com/blog/function-calling-and-other-api-updates)\n",
    "3. LangGraph `ToolNode` & `tools_condition` — [official prebuilts docs](https://langchain-ai.github.io/langgraph/reference/prebuilt/)\n",
    "4. Tavily search API — [tavily.com](https://tavily.com)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  },
  "papermill": {
   "default_parameters": {},
   "duration": 42.252407,
   "end_time": "2026-05-27T03:49:54.324863+00:00",
   "environment_variables": {},
   "exception": null,
   "input_path": "all-agentic-architectures/notebooks/02_tool_use.ipynb",
   "output_path": "all-agentic-architectures/notebooks/02_tool_use.ipynb",
   "parameters": {},
   "start_time": "2026-05-27T03:49:12.072456+00:00",
   "version": "2.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}