{
"cells": [
{
"cell_type": "markdown",
"id": "59377cbd",
"metadata": {
"papermill": {
"duration": 0.0,
"end_time": "2026-05-27T12:43:16.474314+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:16.474314+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"# 13 · Ensemble — N parallel voters + aggregator\n",
"\n",
"> **TL;DR.** Run K voter agents (each with a different perspective system prompt) against the **same** task. An aggregator LLM synthesises their answers into one balanced response. Compared to Multi-Agent (notebook 05), where specialists divide labour, Ensemble has all voters answer the **full** task — value comes from *diverse opinions*, not divided work.\n",
">\n",
"> **Reach for it when** the question is contested / has multiple legitimate framings (forecasts, judgement calls, fact-checking).\n",
"> **Avoid when** the answer is objectively correct or wrong — Ensemble adds noise, not signal.\n",
"\n",
"| Property | Value |\n",
"|---|---|\n",
"| Origin | \"Wisdom of crowds\" (Surowiecki 2004); modern LLM ensembles in practice |\n",
"| Voter selection | Different *perspectives* (analytical / skeptical / pragmatic by default) |\n",
"| Aggregation modes | `llm_synth` (default), `highest_confidence`, `majority_vote` (extension) |\n",
"| External tools needed? | No |\n",
"| Cost | K voter calls + 1 aggregator call |\n",
"| Composability | Each voter is just an LLM with a system prompt — could be replaced with full architectures |\n",
"\n",
"This pattern is *structurally similar* to Blackboard (notebook 07) but the key difference is **everyone answers the full question at once**, no turn-taking, no bidding. Cheaper than Blackboard for the same number of opinions."
]
},
{
"cell_type": "markdown",
"id": "ad796252",
"metadata": {
"papermill": {
"duration": 0.0,
"end_time": "2026-05-27T12:43:16.474314+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:16.474314+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"## 2 · Architecture at a glance\n",
"\n",
"```mermaid\n",
"flowchart TB\n",
" A([task]) --> V[Vote round
each voter independently
answers the SAME task]\n",
" V --> O1[analytical opinion]\n",
" V --> O2[skeptical opinion]\n",
" V --> O3[pragmatic opinion]\n",
" O1 --> Ag[Aggregate
llm_synth: balanced synthesis
OR highest_confidence pick]\n",
" O2 --> Ag\n",
" O3 --> Ag\n",
" Ag --> Z([final answer])\n",
"\n",
" style V fill:#e3f2fd,stroke:#1976d2\n",
" style Ag fill:#fff3e0,stroke:#f57c00\n",
"```\n",
"\n",
"**Fan-out then fan-in.** The voters operate in parallel (we run them sequentially in the demo for clarity — a real production path uses LangGraph parallel branches). The aggregator sees ALL voter opinions and produces one balanced synthesis."
]
},
{
"cell_type": "markdown",
"id": "71cb2d78",
"metadata": {
"papermill": {
"duration": 0.0,
"end_time": "2026-05-27T12:43:16.485601+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:16.485601+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"## 3 · Theory\n",
"\n",
"### 3.1 · Why use multiple voters at all?\n",
"\n",
"A single LLM has *systematic biases* — recency bias toward training-data viewpoints, sycophancy toward the question's framing, mode collapse toward \"safe\" hedged answers. Putting the *same* question to one LLM 3 times with the same prompt mostly produces the same answer (LLMs are mostly deterministic at temperature 0).\n",
"\n",
"Ensemble breaks the bias by **varying the prompt perspective**. Each voter is *forced* to look at the question through a different lens:\n",
"- The Analytical voter focuses on data / evidence / mechanism.\n",
"- The Skeptical voter looks for what could go wrong / what's missing.\n",
"- The Pragmatic voter focuses on what actually ships / works in practice.\n",
"\n",
"These three perspectives produce *substantively different* answers — not because the model knows different facts, but because each prompt activates a different reasoning pattern.\n",
"\n",
"### 3.2 · The structured-output `_VoterOpinion`\n",
"\n",
"```python\n",
"class _VoterOpinion(BaseModel):\n",
" bottom_line: str # 1-2 sentence direct answer\n",
" key_points: list[str] # 2-4 supporting points\n",
" confidence: int = Field(ge=1, le=5) # self-reported confidence\n",
"```\n",
"\n",
"Three crucial design choices:\n",
"\n",
"1. **Bottom-line first** — forces each voter to commit to a directional answer (yes / no / depends) before listing supporting points. Without this, voters hedge endlessly.\n",
"2. **Key points are a *list*** — not free prose. Easier to compare across voters in the aggregator.\n",
"3. **Self-reported confidence** — drives the `highest_confidence` aggregator mode. Note: LLM-self-reported confidence is *noisy* (see § 11.1).\n",
"\n",
"### 3.3 · Aggregation modes\n",
"\n",
"| Mode | What it does | When to use |\n",
"|---|---|---|\n",
"| **`llm_synth`** (default) | Aggregator LLM weaves all K opinions into one balanced response | Long-form questions, contested topics, when you want nuance preserved |\n",
"| **`highest_confidence`** | Pick the voter with highest self-reported confidence | Short factual answers; deferring to the most assured voice |\n",
"| **`majority_vote`** (extension) | Tally categorical answers, return mode | Yes/no, A/B/C, classification |\n",
"\n",
"The default `llm_synth` aggregator's prompt explicitly asks it to:\n",
"1. State the most-likely answer.\n",
"2. Identify points of *agreement* across voters.\n",
"3. Identify points of *genuine disagreement* (not paper over them).\n",
"4. End with a hedged recommendation.\n",
"\n",
"That structure forces the aggregator to preserve the multi-perspective nature, not flatten it.\n",
"\n",
"### 3.4 · Where Ensemble sits\n",
"\n",
"| Pattern | Voters on same task? | Coordination | Use when |\n",
"|---|---|---|---|\n",
"| ReAct (nb 03) | n/a | n/a | single focused query |\n",
"| Multi-Agent (nb 05) | no — different sub-tasks | central supervisor | task spans domains |\n",
"| Blackboard (nb 07) | no — turn-taking, dynamic | distributed bidding | exploratory |\n",
"| **Ensemble** *(this notebook)* | **yes — full task each** | **fan-out / fan-in** | contested / forecasting / fact-checking |\n",
"| Self-Consistency (nb 21) | yes — same prompt, N samples | majority vote | classification / arithmetic where vote tally helps |\n",
"| Multi-Agent Debate (nb 28) | yes — adversarial back-and-forth | converge via critique | controversial topics where iterative refinement helps |\n",
"\n",
"### 3.5 · What goes wrong (you'll see in § 9)\n",
"\n",
"1. **Flat confidence scores** — same Llama-as-Scorer pathology as ToT/Mental Loop. The bottom-line answers differ but confidence values are similar. Watch for it in § 9.\n",
"2. **Aggregator washout** — synthesis blends opinions so much that the minority view disappears. The aggregator prompt explicitly forbids this but it still happens.\n",
"3. **Hidden conformity** — all 3 voters arrive at the same answer despite different prompts. Either the question wasn't really contested, or the perspective prompts weren't different enough.\n",
"4. **Adversarial perspective wins** — Skeptical voter is the loudest because \"what could go wrong\" is easy to generate. Aggregator may over-weight skepticism.\n"
]
},
{
"cell_type": "markdown",
"id": "2c095e57",
"metadata": {
"papermill": {
"duration": 0.0,
"end_time": "2026-05-27T12:43:16.490169+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:16.490169+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"## 4 · Setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "06156927",
"metadata": {
"execution": {
"iopub.execute_input": "2026-05-27T12:43:16.490169Z",
"iopub.status.busy": "2026-05-27T12:43:16.490169Z",
"iopub.status.idle": "2026-05-27T12:43:17.390419Z",
"shell.execute_reply": "2026-05-27T12:43:17.389460Z"
},
"papermill": {
"duration": 0.901291,
"end_time": "2026-05-27T12:43:17.391460+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:16.490169+00:00",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"
Provider: nebius · Model: meta-llama/Llama-3.3-70B-Instruct ─────────────────────────────────────────────────────\n", "\n" ], "text/plain": [ "\u001b[1;36mProvider: nebius · Model: meta-llama/Llama-\u001b[0m\u001b[1;36m3.3\u001b[0m\u001b[1;36m-70B-Instruct\u001b[0m \u001b[92m─────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Default voters: analytical, skeptical, pragmatic \n",
"\n"
],
"text/plain": [
"Default voters: \u001b[1manalytical, skeptical, pragmatic\u001b[0m \n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from agentic_architectures import get_llm, enable_langsmith, settings\n",
"from agentic_architectures.architectures import Ensemble\n",
"from agentic_architectures.architectures.ensemble import DEFAULT_VOTERS\n",
"from agentic_architectures.ui import print_md, print_header, print_step\n",
"\n",
"enable_langsmith()\n",
"print_header(f\"Provider: {settings.llm_provider} · Model: {settings.llm_model}\")\n",
"print_md(f\"Default voters: **{', '.join(DEFAULT_VOTERS.keys())}**\")"
]
},
{
"cell_type": "markdown",
"id": "5f786a61",
"metadata": {
"papermill": {
"duration": 0.003001,
"end_time": "2026-05-27T12:43:17.397419+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:17.394418+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"## 5 · Library walkthrough\n",
"\n",
"Source: [`src/agentic_architectures/architectures/ensemble.py`](../src/agentic_architectures/architectures/ensemble.py).\n",
"\n",
"Two nodes:\n",
"\n",
"1. **`_vote`** — runs each voter LLM with their unique perspective prompt; collects `_VoterOpinion` structured outputs into a list.\n",
"2. **`_aggregate`** — branches on `aggregator_mode`:\n",
" - `llm_synth`: LLM synthesises a balanced response.\n",
" - `highest_confidence`: returns the most-confident voter's opinion verbatim.\n",
"\n",
"The voters are run sequentially for trace clarity; a production path can fan them out with `langgraph.graph.parallel` for an N× latency win."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a5b0a037",
"metadata": {
"execution": {
"iopub.execute_input": "2026-05-27T12:43:17.404267Z",
"iopub.status.busy": "2026-05-27T12:43:17.403268Z",
"iopub.status.idle": "2026-05-27T12:43:17.424675Z",
"shell.execute_reply": "2026-05-27T12:43:17.423721Z"
},
"papermill": {
"duration": 0.026214,
"end_time": "2026-05-27T12:43:17.425676+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:17.399462+00:00",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--- VoterOpinion schema ---\n",
"{\n",
" \"description\": \"One voter's typed answer to the shared task.\\n\\nThe `categorical_answer` field is the key to the deterministic-picker fix\\n(see module docstring): for yes/no or multiple-choice questions, each\\nvoter commits to a SHORT discrete string (e.g. \\\"YES\\\" / \\\"NO\\\" / \\\"A\\\" / \\\"B\\\")\\nwhich Python can tally to produce a `majority_vote` argmax \\u2014 sidestepping\\nthe unreliable self-repo...\n",
"\n",
"--- Voter perspectives ---\n",
"\n",
" analytical:\n",
" You are an ANALYTICAL voter. Approach the question with data, logic, and structured reasoning. Cite concrete numbers, mechanisms, and evidence-based arguments.\n",
"\n",
" skeptical:\n",
" You are a SKEPTICAL voter. Identify weaknesses, edge cases, missing context, and assumptions in the question itself. Argue what could go wrong with the most obvious answer.\n",
"\n",
" pragmatic:\n",
" You are a PRAGMATIC voter. Focus on what actually works in practice, real-world constraints, and shippable recommendations. Skip theoretical ideals; favour decisions a busy professional would actually make.\n"
]
}
],
"source": [
"from agentic_architectures.architectures.ensemble import _VoterOpinion, DEFAULT_VOTERS\n",
"import json\n",
"print('--- VoterOpinion schema ---')\n",
"print(json.dumps(_VoterOpinion.model_json_schema(), indent=2)[:400] + '...')\n",
"print()\n",
"print('--- Voter perspectives ---')\n",
"for name, prompt in DEFAULT_VOTERS.items():\n",
" print(f'\\n {name}:')\n",
" print(f' {prompt}')"
]
},
{
"cell_type": "markdown",
"id": "08276d8c",
"metadata": {
"papermill": {
"duration": 0.002999,
"end_time": "2026-05-27T12:43:17.431675+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:17.428676+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"## 6 · State\n",
"\n",
"| Field | Type | Set by |\n",
"|---|---|---|\n",
"| `task` | `str` | caller |\n",
"| `voter_opinions` | `list[dict]` (one per voter) | `_vote` (appended) |\n",
"| `aggregated_answer` | `str` | `_aggregate` |\n",
"| `aggregator_mode` | `Literal[...]` | caller / default |"
]
},
{
"cell_type": "markdown",
"id": "7bbf1d62",
"metadata": {
"papermill": {
"duration": 0.003002,
"end_time": "2026-05-27T12:43:17.436676+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:17.433674+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"## 7 · Build the graph"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "73fa1898",
"metadata": {
"execution": {
"iopub.execute_input": "2026-05-27T12:43:17.443718Z",
"iopub.status.busy": "2026-05-27T12:43:17.442676Z",
"iopub.status.idle": "2026-05-27T12:43:20.602539Z",
"shell.execute_reply": "2026-05-27T12:43:20.601533Z"
},
"papermill": {
"duration": 3.163873,
"end_time": "2026-05-27T12:43:20.603548+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:17.439675+00:00",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAHYAAAFNCAIAAABaDwycAAAQAElEQVR4nOydB3wTR9rGZ7WyJbkX3Luptmk2Nj02NeEoRw0HAULApBAgmHZHAuQ7Si4XSkgBwtGSCyUcgYQWOqHagDGm17hhcAHcLUtW2/1eaY1xkYzW1jgszB//xGpmdnb1aPadd4pmxCzLIgJOxIiAGSIxdojE2CESY4dIjB0iMXYsL/HdS0Vp1xTF+RqdFmk1eo+QpkU6HUPTlE7HikQUoGNYCundRREN/yOGMSSzEjFaRh8ooliIZQwHnE8JqQxvISWcwhjygWMKEsC5lD4zfWIGMmMrTzechyANHHCXqPjMtEirY6reMy2mrCWUnbPYP8SmdWcnZFEoS/nF8Xsf30suU5bq4GOJrUAvuGma1ek/nohGDATTiIVXEbynWJ3hoqzhLaV/C+9FYn0CLlD/Hg5ow+2xeqUMKUAqQyxIScGtVxwbovVSMgwDwfo8EeIkhkTwjXKXePaZ4ULa6iqIWPj6dVpWVc7AiTI7UbN2ttHDPJAlsIDEJ3c9vpNYAgceAdJOb7h4N7VBQib3viLxUEFuhgoKfotwu16jPFHDaKjEmz5N06jZdjFOnf/iil4uLv2el3y8GMxO7OJg1ADqL/GTbOWOFVn+LaWD3vNFLy8HNmWn31QMiPUIDLVH9aKeEquVunWfpA+a7BnQwg697OQ/Uv7076zYxUEyOxrxpz4S56TLf1mVO2VFM/QqsXpOSt9Rri06OCOeiBB/QN+Rs3zQK8Z7SwKObstH/OEt8fr5aUFhMjdvGXrFsJJYhUTarZ+XinjCT+IjW3IZDdN/4itXhDl6jfYEH3v/xmxeZ/GTOOWKvPNAF/QK02Nkk/s3FbxO4SHxsf/lQsOp3WuvtMTN2jlYS6mDP+SYfwoPiVMvy/1bCbvlZhGC2trev82jIPOQWKNCfcdYptluPn379s3KykI8SU1NHThwIMJDn1GeWjWrKFabmd5ciRP2PYHOHejZQY1ITk5OYWEh4s+tW7cQTiQ21IUjBWYmNlfinIxyiQ0ufaH5s23btrfeeqtbt25jx45dtWqVTqdLSkoaNGgQxA4ePHjWrFnIUDa/+OKLESNGdO3aFZLt3LmTOz0lJSUyMvLs2bP9+vUbPXr02rVrFy5cmJubC4Fbt25FGJDa0k8emluKze0vVpTqbO1xSbx9+/ZNmzbFxcWBxCdPnly9erWtre2ECRO++uorCNyzZ4+Pj95NXLFiRXZ29rx586B/MiMjA+T28vKCU6ysrCB2w4YN48aNa9++fVhYmFqtPnLkyP79+xEeQIrSIq2Zic2VGPpSrRwphIfk5OTQ0FDOeg4dOjQqKkqhMFKffP7552VlZd7e3nAMJXTv3r0JCQkgsb5HGKHOnTuPGTMGNQrWUpFOY25icyVmoQ8c4SrF7dq1+/bbbxctWhQeHh4dHe3r62viHlgo7/Hx8ffv3+dCuNLNERISghoLlk/fjrkS0zSrUZv7aPAFrDBYhlOnToENFYvF4EV89NFHbm5uVdPAiMb06dPBAkydOhWKsL29fWxsbNUEEokENRZqlY4yu7yZK7HMntYPGuFBJBINNZCWlpaYmLhu3Tq5XL5y5cqqae7cuXPz5s01a9Z07NiRCyktLXV3d0d/BooirVRmrsbmehRuPtKyEgbhAeol8BbgIDg4eNSoUeAV3L17t0aaoqIieK3UNM0A+pOQl+icvazNTGyuxJ37O+s0uCYYHjp0aM6cOadPny4uLgbf6/fffwfrDOGBgYHwevTo0Rs3boD6YEM2b95cUlIC7sSyZcugfgPH2WiG/v7+eXl54JxUWm3LAnVd+xgHMxObK7HUxooWoxM7HiEMzJ8/HxScOXNm7969Fy9eHBMTA54ZhEO9B64x+LlQGXp6ei5ZsuT69eu9evWaMWPGlClTwEEG6eG1dobdu3cH72327NmHDx9GliZhXx68egXYmp
meR824b31Wbrrq3X81aKzwJWDjp2nO7lbDpvqZmZ5HH8Wgd31USibzHr+uvJeMkgK1spQxX1/EdzaQTzPJ0c25pga9wUS+8847RqOgdWDqcRkyZAg04RAeIOcrV64YjXJ0dATTbzQKKoYBAwYYjdrxZaaLFz/ReA+Prv1HarsYxy79m9SOgo4Fo60yQKlUymTGx6Kg+SuVShEe4H7groxGaTQaruVdG7gfo1GXT+Wf21f44XJ+48K857QNm+b988osoxLTNA0tAqNnmQrHjY2NJTu44/cU9p/AuzuX9/Cou68svIfjf+byHiUUOus+SQnpZBvchndZqedUlcy7ZXv/kzP1y1dlKsWqmSkDYj2DwuozL6f+E67OH3hy+URx5BvOUX1ettlsVbkaXxj/S36rzna93qzn/MEGTRvMzlDs/S5bZkcPmezt2KTxemEaB4VctevrHHmR9o3xHsGt61+XWGDy686vHzzKVNk60K062nX+ixsSPhePFNy6UFxaqHP3tR450x81DItN4f511YNHD9TQcw/d1dAtZ+9ESyQihqpWndIipGO4Oe0VIYZZ7hU9/YaOdUOUYUr208CKO+Qm2VecJYK+Te50xDxLWS1bw7xvfUjVcMhMP9NbVxFYGQU3oSrTKssYRalWrWLhPl19JW9+xKN9UQcWk5gjJ115/WzRkyy1RqXTamDQulrmIopioG+/6mcWPZ3Orn+j/8caOv8rT6tMzE2yR4aOY1pMs0zFDxSqJDV+XLXVA6qKafDf2RpNIbgNGPyV2oqbeFuFdXXya25JV8/CEjcC0B8PI6dIOAjsF0tarRYaOEhQCE9i6DVGgoJIjB2B3W4dfTcvLKQUY4dIjB0iMXaILcYOKcXYIRJjh0iMHSIxdkh1hx1SirFDJMYOkRg7RGLsEImxQyTGDpEYO0Ri7JCmB3ZIKcaOwG4XirCtrbm/Y3lBEJjEOp2utLQUCQqhPXRiMdgKJCiIxNghEmOHSIwdIjF2iMTYIRJjh0iMHSIxdojE2CESY4dIjB0iMXaIxNghEmOHSIwdYfx6NC4u7tSpU9zyo/pf8BqQSqXx8fHohac++3o0PtOmTfPz8xMZoGkaXkFof/+G/sa+cRCGxE2bNu3atSvDPFtRUiaTjRw5EgkBYUgMjB07Fgpy5VtPT8+hQ4ciISAYiX19faOjo7ljbqVYJBAEIzHw9ttvcwUZ5B42bBgSCNg9intXiu/fVmpU1QK5VU+qXtmw52WVBT70215WRFcEGlbxSEtNTc/ICAwMBOtc40JPF0ph0dOcn+VWsZBITcTWTEAr25YRjggnGCXW6XTf/1+6Rg3OrEijrnYV/Xo2LKp6ZW55lYrtWquLwu3kWpGM0m/UavDZELc2zbOFVLhFcJ46dpUhyPRykmJrVqfRr6cydoG/TGbuesR8wSUx6Ltubnpga1n3IS/6nlcXDubeS5JP+Ke/zA6Lyrgk/u4fKR1edw6JFMYSbmm3CxJ2FUxe1gxhAEt1d3hzttiKEoq+QHCIi7WM2rv+AcIAFomfPFA7uAhsiqqjm6QgB0vvBxaJVUrOAxASYmuRWolwgKWnjdEhRmDdYYjVUlUb6BZEYJ2Z+NBX+ng21SASP8Ww1iPCABaJRWL4E5gtFlFcq8XyYJGY1cGfwNaJhHtmBGQo2OqNY2FA6W0FwgCxxRUIrboTIVyGDRs0rXeNEQbwSMwgwa3Zq9MhrZr4xTjRP3UCctpoMUXTAjMU+qeOEU51p9OyOqE5bbQVZYVnl2ssElMiXC0lfOg0rEaFZedPPMOjLNUIjvHCRXMPHNyDLIW+AY1wgCXXxml63L1r0V3hWYoVkF8MxYGXnZg2PVYmlS39YlVlyMfz4oqLi9as+gGOf9y84fCR/Xl5j93dPdu36zAj7mORSNSzdyRELVu++Lu1K/ftOQnHhw7v27tvV3p6SlBQs149Xx8+bDQv31xf1vAYN0yGAvEqxD1j+l5KTiwrK+PelpeXJyWd79
OrHxx//8Pa3Xt2TH4/bufPh2Mnfnjy1NGfd26F8EMH9BMG58xewOl77PihL5YubNG81bYteyfFTtm5a9uqNSsQH1gKU/sZ31QVPgUiJqYPdIefOfs79/Zs/El426NH31J56U/b/ztu7KTu3XvY29n3iOkzdMjftmzdqNHU3K39wIHdbduGx02f6+zsEhEeNWH8B7t374DnAJmNwbhh0RibLeZj11xdm4AFOHP2BPc2Pv5kh4iOLi6uDx7cBzVDQlpXpmzRIkQul2dlVRvHhO/jxs2rUZFdKkPCw6Mg8PbtG8hsBNb0AI9NxPO7gzK7avVyMBE0TZ87f+ajaX+HwIIC/XbAUsmzbTNlMv3WPEplte031Wo1fBMbN62Bv6rhRcWFyGwE1vQwFGNeJ+gl/ubbpQnnTltbW+utRExfCLS11W/hpyx/NmypUOjttYtLtW05pVKpjY3N630HREf3rhru5xuAzIYyTDtCGMDVX8y3QDg6OIJxSExMUKnKu3WN4fYMbdq0BRTqmzevhrQK45LBsw9G2c3NvYY5hpRguMPbR3JvITYnJwvsDzIbqO7w+Gx4bDFYCRH/Pgqo9K5dS7506QKUaC7Ewd6hb5/+W7ZuSkg4XVJacuTIb7/u/t+IEWPAaZNIJCA0OB6XryRptdp3Y6eCBYeWCDwB169fWbT445mzP6hdK9YFNkOBZcLV+k/S7ZzEA9/nt18cOG1/HdITtNu7+0TlSmFQuYHn+/uJw6Cjt7dvn95/GT1qPBe7Z+9OcOm0Ws1P2/ZD0Ya6ceu278GOl5crw0LbvvfeR61ahpp/9aNbch5nKD5Y1hRZGiwSb1iQYe8k7j/JFwmHo1uzH2eUf7A0GFkaTMOjLCO0njZ8/Sp4Bvn5O21/OvpGv4Aa0CyLabwcIyy2oTA8frHwCjHX0yasQX6hjUDjA5vEgpurgq1LHo9HwQjPGFN6jwLhAE83ECU8O8EiJKjqTiRMUyygUoxvliNGRLjGJ7CN3QnOa2OENW1QiJNfsUHmF2MHi8TWEiSWCMxSiMTI2kY4ox4SW6pcrkaCQl5Ubi1FOMBS1sJ7OZYVY5kfhg95gS60sxPCABaJW0Y42zehty9NQQJhx/IUWwc6oqcLwgDG9SiO/ZSdelXh09zGu7mNtYlde6ha/j5r+NrZWtGV63cY1vUwfsXKGT1Pfx9MoaerhVTNn3r6qlYzOWml2X8ovJvZDJjojfCAd1WVk7tyQWVVOcPwGag0SqXEVY5Mp3lOwgpoMbKSivxDZK+/5YWwIYyl8KoSGRmZlJSEhIPwtrGiaSxz2fFBdgrDDpEYO0Ri7BCJsUMkxg6RGDtke1fskFKMHSIxdojE2CESY4dUd9ghpRg7RGLsEImxQ2wxdkgpxg6RGDtEYuwQibFDJMYOkRg7ArtdqVQqEtrscIFJXF5eXlxcjAQF2XsUO0Ri7BCJsUMkxg6RGDtEYuwQibFDJMYOkRg7RGLsEImxQyTGDpEYO0Ri7BCJsUMkxo4wfj06bty469evcz9qrHrDE6UmCQAAEABJREFUycnJ6IVHGIM0M2fOdHd3pwyInhIcbPmFcHEgDInDw8PDwsKq7tMMWg8YMAAJAcEMNcbGxrq4PFsvws/Pb8iQIUgICEbi1q1bd+rUiTPE8NqzZ09nZ2ckBIQ0YD5hwgQPDw848Pb2HjlyJBIIZjlt6bdLGE1dSxQY1tagUK1VUqg6F0msGstS+n+oLlgR8uoePvxiUlKXDl3kj+zkj8qqRRsWVan7ivpFV9jnL2PFmrd/ES3WBYY6PDfZc5y27cvSCx7p4MZ12obdk5l33YAM6ljQBhMiWn9PTm7it/4eWEeyuiTesjRNXca8NtTDM8geEYzxJEtxemcOCP3OApO7VZiU+IeFabQEDZksDN/zz2Xf+nRFsW7S4mZGY41XdzfPFZaXMURfMxn0bpBWjS79nm801rjEtxNLpHbCW3H/T0TmILqXXGI0yriOqnKKFtoc0z8Xqd
RKrTBe2xrXUatmWIZsacADrYbVqo3XaqSoYodIjB0iMXaIxNghEmPHuNMmElFkjyRLYVxihhHcgrAvLiLToURjHtTRnWbcFhvGyIil4IN+o0o+rTsCb0wXSCKxZWAZZGrTR+JRYOfV9SgWLpp74OAehJ9Xt1P47t1bqFGwmC1OT0/du29n8uWLubnZgQHB/fsPGfzXEVxUYWHB5//+9Oata/5+gYMHv/nwYeaZsyf++/1OiLp16/pXX//7YVZmmzbhb4+dtHbd18FBzWbEfZyWlhL77qjPP/tq+ZdLnJycN6z7SavVbty05vyFs48f57Zu3X7o4JGdO3d/bv7nzp35/cTha9cvl5QUh7RqPW7cpPD2kRDes7f+ddnyxd+tXblvz0k4PnR43959u9LTU4KCmvXq+frwYaN520oTyS0m8eo1K0DcmTPnwZ1lZmZ8/c0XHh5enTt1g6ilyxdlPshYtnSNh7vnqtXLQQJuwYPy8vJP5s9o2SJk0cLlJaXFoHVBQV7T4OYQxS1j9eOWDX8bOQ4EheNvvl168NDeaVPnxMT0iY8/+X8L//7Jx4tjonvXnf9nn8+PCO849x8L4e2pU8fmzZ+x5cfdLi6uhw7E9+vfbc7sBf3/Mhiijh0/9MXShVAmPlv8ZXpG6tJlC3Nys6dNmY14YcK0mjIUvCu7BQs+X7ZsTUR4FBQTuFcQLvFiAoQXFxedP3925JvjQkNau7o2mTVzPnwT3ClQJCH2/feme3p6tWje6t1JUx89yq24vKEERUV2fnPEmJBWYSqV6vCR/W+Nfuevg4Y7OjiCLr179ftx8/q685dKpRvWbZ81cx7cEvx98H6cUqm8fuNK7Zs/cGB327bhcdPnOju7wEeYMP6D3bt3wMOBLIHxUkyJWN6jHiz7yy/bLyTGP3hwnwvw8vKB19S0P5B+ulQ7LtDOzi4ioiMUOqS3LSnwNji4YuAWVLC3rzbzo0XzEO7g3r3barU6KrJLZVT7dh2gUBeXFNeRP6BQlG3YuOrK1Uv5+XlcSFFRYY0bZxjmxs2rb497tzIkPDwKAsG8cE+JOVDQ7uBlKFieW0TCDc39ZLpGo4aS2B6UsrOfNj2Wiyot1Q8a2traVSZ2cHCsiJKX2tjYVs0HzG7Vt9YSCXcgl5fCa2WelRQW5NeRPzwT02dMAkOxYN6/QkPbwJPR943OqBbw5Wk0GjD08Fctc16lmGVN2W7L2OJ7f9y5c+fm8mVrOkR05EJAFLcm7nAgkeh3kdOon21/V1hUcetSiVStrrYtXn7+E6P5uzZxg1d45H18/KqGu7t75j7KMZX/yVNHIX8wxDKZDBkrvxW3IZXa2Ni83ndAdPUy6+3li8xGv1Elg3PsDgwivHKaAhkZafAXFKifIOPnFwCvUIcEBupnZcjl8uTkRKgJ4Rj0go9dUJAP9Q+8vXwlSaFQGM3f18dfYijRnD+ADEUMXHeQpo78wYsAy8PpC5w6fdzU/Tdt2gIeqcrMoVDn5GS5u3sg86GQqVJsmeoOvDSxWPy/HZtLSkvAnfh21TKoqbjy5ePtGxAQ9N8f12VlP4TP/9XXn3M2GujcqTtN05C4rKzsYdaDzZs3uLm5G80fpHxn/PtQv12/fgUKJog1++8fggdSd/7Bwc3BBIMrBg7fhcQEkN7R0Ql8PqR/tiRwraSk8/C9Quy7sVPBS4GWCFg8uMSixR/PnP1BjSfsObDIVGuN/uc//1k79OrpIsRS5u9iCJWMv3/gseMH16796mLSubiP/gFvd/y85cTJI0OHjIRa69z5M99/vzYxMaFjx64yqQxK7qBBw0E4V1c3qLW2btt0+86NcWMngVvt5xsAaaAA/rr7f3379gcFuUtAhRYY2HTHzi1frvwMkoFvN3v2AqnBCpnKH1xshtHt3LXtP+u+KS4uBDujVCqgHIBr2KXLa9bWErj08eMHBw8eCY8C1GzHjh0ENxyEhiphzpxPPT147NB2N6kYBvkjehmZ8my8n/
O/izPAoxgeF4AsAZgRcFE9PDy5tx/PixPT4sWLlsMxFD14lh0MjgTcycC/xkx8Z/Lw4aP5ZF9X/o3G3u8yy+W62CVBtaMao6cNegPAV508eUbbNuHw2F66dOGzJSuRQZoPp4xv1rRFbOwUcEg3blwNna49evRFPDGV/wtCo5TikuJl0ADLzHjy5FGAfxAYhG7dYrio27dvrN+wCtxYtUoVEtJ6yoezwMIgntSRf6Oxdy2UYiZ2cWDtKOMS//hZBqujhk23jMSvArwNBRRhFs8G9S8r4LBRIn5OG0uG7niBvelBoEQmS7GpbiCKjPHzgvfYHUumqlgOYiiwY0piUtnxhH83EDETPDHdDUQMBXaIxNgxLrG1FaUlv1jiA02zYms+TpvEjmK0OkQwG7WGldoaL6/GJW4Xba8oJRLzQF6kbdnR1miUcYmbtnW2cxbv+joNEczg19UpMjuq/WuuRmPrWizh19UP87PL2/VwbdVRGEvEND53kwuvnMi3d7b62wyTHb/PWfLj1zUPHt1X67QsY7Rv09g6HMYXLjFjyY/aC6Lob66GP29kYY9qWde4euVyL6YyMCzmwprIrOYN1DyXQmIr5OZrNXxaXR3rZi2FpyxUypXGFq7RLxWjv4fKPPRTYvQfsmaeVeXTp4EETM1vAvqqmMpUhkwpQ/41VsKZOGHCxu+/r8zTkEZ/zaefp6pk0OBiOY0rg0RwZerZRaoqwF2TS8utwqOfEkWxladTFZ28FentpDqZiww9D7P8YpmzTPZimAqdTpdb+IebtzUSDmSnMOwQibFDJMYOkRg7ZB9o7JBSjB0iMXaIxNgRnsTEFuOFlGLsEImxQyTGDvjFRGK8kFKMHSIxdojE2CF9FNghpRg7RGLsEImxQ2wxdkgpxo5EIhHK7lWVCExilUpVXFyMBAXZexQ7RGLsEImxQyTGDpEYO0Ri7BCJsUMkxg6RGDtEYuwIT2KdTmA/CBSYxDRNk1KMF2IosEMkxg4MecDABxIUpBRjhxLEUlZjx47Ny8uDW4Uu+dLSUqlUCkJDcU5OTkYvPMLYOgUklsvl+fn58EpRFAgNrltQUBASAsKQuF+/fs2bN68R2K1bNyQEBLMB0IQJE+zsnm174O3tPWLECCQEBCNxdHR0aGgodwxGOSoqyt/fHwkBIW1jNWnSJG6E393dfdSoUUggCEniiIiINm3aQEUXGRlZ2zS/sGBx2o7/lPswVSEvZAxr8OlDXkzHkHq6hgolQg4utEeg9PUxPHZBMPsqlpM4/Zb85M9Pyop08GxYSWkbJ6mtk8RKBs0Fq2qLxxhZocaMEPMxdi5rWGClVkr9aizKUrWiUKks0WjLNQzDyuzo7oNdWkQ4IgthMYl/WJgmL2JkDtZ+4R7WEqEuYqhWqx9czVOVqiQyOnaRZfxuC0icfCIvYW+R1N6qWRcemxK94KQmZiuLVBG9HboOdEcNo6ESn/st79KxooAO7vautujlQlmmTI3PDetq3/NNPpst1aJBEicezr94pDCsjzAasvXj5rH0ttGOrw12Q/Wl/hKf3p17M0Ee0vNl1pfj5on0Zm1kb4zzQfWinn5xQa7y2ulXQl8grGfQH8nK+3fkqF7UU+IdK7NdfO3QK4NbkMP+DbmoXtRH4sObs8HCeIfU3zwJDo/mrjRN7Vn7EPGnPhKnXlW4BlnMMxcKHi1dHv5RjvjDW+Izux9DQ8ktwNzdBhsZeVnh7AWdrlw/hiyNs5cDtLOPbs3heR7/sbs/LpdJ7AT2syxLIXOUpt8qQzzhXYoVpTqXAAf0SuLW1FGt4H0Wv1KceVe/6bKThz3CQ0lp/r6DX2U8uKZWl7ds3rlPzER3N/3SwPHnfz56atPkid/9uP3jR4/TvDyaRXcdHRUxkDvr8rUjh47/BzpyQlu9FtNtDMKGnZMNvN65WNQqioed5FeK02+oEDagI3jtpg9TM5KHD5o7a+o2O1uXb9ZNzMvXV+K02EqpLN392/KRQz5Ztu
h829a9duxeUlik96JyHqVs2/lpZHj/uXG7ItsP2PPbCoQTMMeZd5W8TuEncUmBRiTGtRlFeuaVx3kZo0csbNWii4O966B+H9naOJ05t52L1ek0fXtOCvDTb0oOUkKjNCvnHoQnXNjl5OjZt0esjY1Ds+AOnSKHIJzAx5cX8pvIwc9QqJQYp4lk3L9K01bNgyv2uwYpmwZFpGVcrkzg7xPGHdjI9JWBsly/i3xewQNPj+DKNH4+oQgnIormtT004iuxlbXI1GZNDUdZLoeiCi5X1UA722c/xzV6aYWipInrs13kra2fv/J4g6BZaym/M/hJbOtEI2ybktrbuYJAE8dUM6Yi0XNMGdgHjeZZi0Cl4u1U8YLVslIbmtcp/CT2DpbdvYjrM/h4tVCrlU5OHk1cKrr28wuyqpZiozg7ed26c4ZhGO7LuHX3LMIJo2Pc/CW8TuFX3YV2dILBLlUZFr+iedOoVs27/Lz7M3AV5GVF8Rd2fr32ncTkfXWf1S6sD7Todv+2AirAlLRLCRd2IpwwOhTV25XXKbxbd9ZS6nFKkV+7Bg0EmGLi2C/PXfxly4759x9cd2sSENGu32td/lb3KS2bdxr4xrRzib/M+bQzuBZj3ly4esP7mIa8s24/seJXgvXw7pLfvyHrYYqqVUwAevW4c/q+u4/VsKl+vM7i3YAeOMlHq2ZUSoxtkBcWrYoZ+B7vx7c+o/GunlaZlx8372ryy5z/WW+j4VqtGjxfo76Xp1vw1PfWI8uxcfPM9MyrRqM0GpWViQd+ybzjyAQp5x46NhFbW/PetaWeY3erZ6UERHnZORp3EQsKs42Gl5fLpVLjYyUikdjJsaHD6VUpKcnT6ow3EsoUJbY2xnuyXJy9jYar1eo/TmVN+bIZ4k8955SEdrW/fT43tFeg0VhTN9qYODg0MRVVj9tLjc9pFl7PWQz1HLvrOdzD3olOSajPQIvgSE3MsrETvTGuntPd6j8zc9y8QIrS3TqZjl5q7pzOpI/ahqkAAAFASURBVBjt+E8DUX1p6GygbUszS4q0rV57OX24u2fvS6Wi8QsCUQOwwJy2Lf9KL87XeYW4uvi8PKMhxY/lWTef2NrT4xc0dK6IZWZmXjiYn3S0UCwVuTdzcfbCNSbSOJQ+kWffKdCU69pG20cPsUAj1pLzi3etepibru/0ktpZO3jZvrCj1EbJe1BYnK1QlamhJ9Hdz/rNOIv9kMTys+RP7nyUelWhLNNV9BOIuN0kq2wPWmUjTcN7w6aeNRKgmt0MFXtcVgsybC3KmrX7qYimGF2tTyqCrlm2ctdLia2oaZhNr1GeyKJg/PWoSqlJvVZWlK9RK6s7LrU0Nghc4zaqiY5MTnOvkRUXKKrdqU3RLKurqTucLJZSDq7WQWG2Nva4pp0L4we6gkaovxgQEERi7BCJsUMkxg6RGDtEYuz8PwAAAP//n+OMowAAAAZJREFUAwAzA+N37YjGyAAAAABJRU5ErkJggg==",
"text/plain": [
"Aggregated answer (llm_synth) ─────────────────────────────────────────────────────────────────────────────────────\n",
"\n"
],
"text/plain": [
"\u001b[1;36mAggregated answer \u001b[0m\u001b[1;36m(\u001b[0m\u001b[1;36mllm_synth\u001b[0m\u001b[1;36m)\u001b[0m \u001b[92m─────────────────────────────────────────────────────────────────────────────────────\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Based on the perspectives provided, it is unlikely that electric vehicles will account for over 50% of new car \n",
"sales globally by 2030, due to current infrastructure and production limitations. The majority of voters, including\n",
"the Skeptical and Pragmatic ones, highlight significant hurdles such as high upfront costs, limited charging \n",
"infrastructure, and inadequate production capacity. \n",
"\n",
"There are points of agreement among the voters, including the impact of declining battery costs and the role of \n",
"government regulations in promoting electric vehicle adoption. All voters also acknowledge the importance of \n",
"factors such as global economic conditions, technological advancements, and consumer preferences in influencing \n",
"electric vehicle sales. \n",
"\n",
"However, genuine disagreements exist regarding the pace of electric vehicle adoption, with the Analytical voter \n",
"being more optimistic about the industry's ability to meet the 50% target. The Skeptical and Pragmatic voters, on \n",
"the other hand, are more cautious due to the existing limitations. Given the uncertainty and complexity of the \n",
"issue, it is recommended to continue monitoring the progress of electric vehicle technology, infrastructure \n",
"development, and government policies, as these factors will ultimately determine the feasibility of achieving the \n",
"50% target by 2030. \n",
"\n"
],
"text/plain": [
"Based on the perspectives provided, it is unlikely that electric vehicles will account for over 50% of new car \n",
"sales globally by 2030, due to current infrastructure and production limitations. The majority of voters, including\n",
"the Skeptical and Pragmatic ones, highlight significant hurdles such as high upfront costs, limited charging \n",
"infrastructure, and inadequate production capacity. \n",
"\n",
"There are points of agreement among the voters, including the impact of declining battery costs and the role of \n",
"government regulations in promoting electric vehicle adoption. All voters also acknowledge the importance of \n",
"factors such as global economic conditions, technological advancements, and consumer preferences in influencing \n",
"electric vehicle sales. \n",
"\n",
"However, genuine disagreements exist regarding the pace of electric vehicle adoption, with the Analytical voter \n",
"being more optimistic about the industry's ability to meet the 50% target. The Skeptical and Pragmatic voters, on \n",
"the other hand, are more cautious due to the existing limitations. Given the uncertainty and complexity of the \n",
"issue, it is recommended to continue monitoring the progress of electric vehicle technology, infrastructure \n",
"development, and government policies, as these factors will ultimately determine the feasibility of achieving the \n",
"50% target by 2030. \n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"VOTERS: ['analytical', 'skeptical', 'pragmatic']\n",
"CONFIDENCES: [4, 4, 4]\n",
"CONFIDENCE_SPREAD: 0\n"
]
}
],
"source": [
"TASK = (\n",
" \"Will electric vehicles account for over 50% of new car sales globally by 2030? \"\n",
" \"Answer YES or NO with a 2-3 sentence rationale.\"\n",
")\n",
"\n",
"result = arch.run(TASK)\n",
"\n",
"print_header(\"Aggregated answer (llm_synth)\")\n",
"print_md(result.output)\n",
"print()\n",
"print(f\"VOTERS: {result.state['voters_used']}\")\n",
"print(f\"CONFIDENCES: {result.metadata['confidences']}\")\n",
"print(f\"CONFIDENCE_SPREAD: {result.metadata['confidence_spread']}\")"
]
},
{
"cell_type": "markdown",
"id": "96d33310",
"metadata": {
"papermill": {
"duration": 0.0,
"end_time": "2026-05-27T12:43:41.542083+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:41.542083+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"### 8.0 · What just happened, briefly\n",
"\n",
"Three things to look at:\n",
"\n",
"- **Voter disagreement** — read each voter's bottom_line in §8.1. If all 3 agree, the question wasn't actually contested or the perspective prompts didn't activate distinct framings.\n",
"- **Confidence spread** — if everyone is 4/5 (flat), the LLM-as-Scorer pathology again (see Mental Loop nb 10 §9). Bottom-line *content* discrimination matters more than the confidence number on a contested question.\n",
"- **Aggregator quality** — does the synthesis preserve genuine disagreement or wash it out?"
]
},
{
"cell_type": "markdown",
"id": "dbed2e24",
"metadata": {
"papermill": {
"duration": 0.015971,
"end_time": "2026-05-27T12:43:41.558054+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:41.542083+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"### 8.1 · Per-voter opinions"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "67eda73c",
"metadata": {
"execution": {
"iopub.execute_input": "2026-05-27T12:43:41.558054Z",
"iopub.status.busy": "2026-05-27T12:43:41.558054Z",
"iopub.status.idle": "2026-05-27T12:43:41.609559Z",
"shell.execute_reply": "2026-05-27T12:43:41.608877Z"
},
"papermill": {
"duration": 0.051505,
"end_time": "2026-05-27T12:43:41.609559+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:41.558054+00:00",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"› === ANALYTICAL (confidence 4/5) ===\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m=== ANALYTICAL \u001b[0m\u001b[1m(\u001b[0m\u001b[1mconfidence \u001b[0m\u001b[1;36m4\u001b[0m\u001b[1m/\u001b[0m\u001b[1;36m5\u001b[0m\u001b[1m)\u001b[0m\u001b[1m ===\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
BOTTOM LINE: YES, electric vehicles will account for over 50% of new car sales globally by 2030, driven by \n", "declining battery costs and increasing government regulations.\n", "\n" ], "text/plain": [ "BOTTOM LINE: YES, electric vehicles will account for over \u001b[1;36m50\u001b[0m% of new car sales globally by \u001b[1;36m2030\u001b[0m, driven by \n", "declining battery costs and increasing government regulations.\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
› point\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m point\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
The cost of electric vehicle batteries has fallen by over 80% in the last decade, making them more competitive with\n",
"internal combustion engines.\n",
"\n"
],
"text/plain": [
"The cost of electric vehicle batteries has fallen by over \u001b[1;36m80\u001b[0m% in the last decade, making them more competitive with\n",
"internal combustion engines.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"› point\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m point\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Many countries have set targets for electric vehicle adoption, such as Norway's goal of 100% electric vehicle sales\n", "by 2025.\n", "\n" ], "text/plain": [ "Many countries have set targets for electric vehicle adoption, such as Norway's goal of \u001b[1;36m100\u001b[0m% electric vehicle sales\n", "by \u001b[1;36m2025\u001b[0m.\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
› point\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m point\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Several major automakers, including Volkswagen and General Motors, have announced plans to electrify their entire \n",
"lineups in the coming years.\n",
"\n"
],
"text/plain": [
"Several major automakers, including Volkswagen and General Motors, have announced plans to electrify their entire \n",
"lineups in the coming years.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"data": {
"text/html": [
"› === SKEPTICAL (confidence 4/5) ===\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m=== SKEPTICAL \u001b[0m\u001b[1m(\u001b[0m\u001b[1mconfidence \u001b[0m\u001b[1;36m4\u001b[0m\u001b[1m/\u001b[0m\u001b[1;36m5\u001b[0m\u001b[1m)\u001b[0m\u001b[1m ===\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
BOTTOM LINE: NO\n",
"\n"
],
"text/plain": [
"BOTTOM LINE: NO\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"› point\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m point\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
The adoption of electric vehicles is hindered by high upfront costs and limited charging infrastructure in many \n",
"parts of the world.\n",
"\n"
],
"text/plain": [
"The adoption of electric vehicles is hindered by high upfront costs and limited charging infrastructure in many \n",
"parts of the world.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"› point\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m point\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Government policies and incentives play a crucial role in promoting electric vehicle sales, but their impact can \n",
"vary greatly by region.\n",
"\n"
],
"text/plain": [
"Government policies and incentives play a crucial role in promoting electric vehicle sales, but their impact can \n",
"vary greatly by region.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"› point\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m point\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Global economic conditions, technological advancements, and consumer preferences are also significant factors that \n",
"can influence the sales of electric vehicles.\n",
"\n"
],
"text/plain": [
"Global economic conditions, technological advancements, and consumer preferences are also significant factors that \n",
"can influence the sales of electric vehicles.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"data": {
"text/html": [
"› === PRAGMATIC (confidence 4/5) ===\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m=== PRAGMATIC \u001b[0m\u001b[1m(\u001b[0m\u001b[1mconfidence \u001b[0m\u001b[1;36m4\u001b[0m\u001b[1m/\u001b[0m\u001b[1;36m5\u001b[0m\u001b[1m)\u001b[0m\u001b[1m ===\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
BOTTOM LINE: NO, electric vehicles will not account for over 50% of new car sales globally by 2030, due to current \n", "infrastructure and production limitations.\n", "\n" ], "text/plain": [ "BOTTOM LINE: NO, electric vehicles will not account for over \u001b[1;36m50\u001b[0m% of new car sales globally by \u001b[1;36m2030\u001b[0m, due to current \n", "infrastructure and production limitations.\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
› point\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m point\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Current charging infrastructure is not yet widespread enough to support a majority of electric vehicles\n",
"\n"
],
"text/plain": [
"Current charging infrastructure is not yet widespread enough to support a majority of electric vehicles\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"› point\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m point\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Production capacity for electric vehicles is still ramping up and may not meet demand by 2030\n",
"\n"
],
"text/plain": [
"Production capacity for electric vehicles is still ramping up and may not meet demand by \u001b[1;36m2030\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"› point\n", "\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1m point\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Many countries still lack clear policies and incentives to drive adoption of electric vehicles\n",
"\n"
],
"text/plain": [
"Many countries still lack clear policies and incentives to drive adoption of electric vehicles\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"# Print each voter's headline answer, then each key point (truncated to 200 characters).\n",
"for t in result.trace:\n",
"    print_step(\n",
"        f\"=== {t['voter'].upper()} (confidence {t['confidence']}/5) ===\",\n",
"        f\"BOTTOM LINE: {t['bottom_line']}\"\n",
"    )\n",
"    for pt in t.get('key_points', []):\n",
"        print_step(\" point\", pt[:200])\n",
"    print()"
]
},
{
"cell_type": "markdown",
"id": "56c1e111",
"metadata": {
"papermill": {
"duration": 0.004009,
"end_time": "2026-05-27T12:43:41.618027+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:41.614018+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"## 9 · What we just observed\n",
"\n",
"The cells above ran 3 voters (analytical / skeptical / pragmatic) against the same contested forecasting question, then aggregated their opinions.\n",
"\n",
"### 9.1 · Quantitative summary\n",
"\n",
"| Metric | Value |\n",
"|---|---|\n",
"| Voters run | 3 |\n",
"| Confidence values | [4, 4, 4] |\n",
"| Confidence spread | 0 |\n",
"| Voters who answered YES/likely | 1 |\n",
"| Voters who answered NO/doubt | 2 |\n",
"\n",
"### 9.2 · Per-voter bottom-line answers\n",
"\n",
"| Voter | Confidence | Bottom line |\n",
"|---|---|---|\n",
"| analytical | 4/5 | YES, electric vehicles will account for over 50% of new car sales globally by 2030, driven by declining battery costs and increasing government regulations. |\n",
"| skeptical | 4/5 | NO |\n",
"| pragmatic | 4/5 | NO, electric vehicles will not account for over 50% of new car sales globally by 2030, due to current infrastructure and production limitations. |\n",
"\n",
"### 9.3 · Patterns surfaced in this run\n",
"\n",
"- **Flat confidence scores.** All 3 voters reported 4/5 confidence, the familiar Llama-as-Scorer pathology (see Mental Loop, notebook 10, § 9). Self-reported confidence carries no signal in this run; compare the content of each bottom line instead.\n",
"\n",
"- **Genuine perspective disagreement detected.** The analytical voter answered YES while the skeptical and pragmatic voters answered NO. This is what Ensemble is for: different perspectives producing different directional answers to the same question.\n",
"\n",
"- **Aggregator preserved nuance.** The synthesis uses hedging language ('however', 'uncertainty', etc.), which suggests the minority view was not washed out.\n",
"\n",
"### 9.4 · The aggregated answer (verbatim)\n",
"\n",
"> Based on the perspectives provided, it is unlikely that electric vehicles will account for over 50% of new car sales globally by 2030, due to current infrastructure and production limitations. The majority of voters, including the Skeptical and Pragmatic ones, highlight significant hurdles such as high upfront costs, limited charging infrastructure, and inadequate production capacity. There are points of agreement among the voters, including the impact of declining battery costs and the role of government regulations in promoting electric vehicle adoption. All voters also acknowledge the impor…\n",
"\n",
"### 9.5 · The takeaway\n",
"\n",
"A *healthy* Ensemble run has:\n",
"\n",
"1. **Genuine disagreement** — at least 2 of K voters produce different directional answers.\n",
"2. **Aggregator preserves nuance** — hedging language ('however', 'on the other hand') in the synthesis.\n",
"3. **Confidence values are NOT the signal.** They are noisy; what matters is whether the bottom-line content distinguishes the voters.\n",
"4. **The synthesis is shorter than the sum of inputs** — extracts insight, doesn't just concatenate."
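,
"\n",
"Three of these criteria can be checked mechanically (criterion 3 is a caution, not a test). The sketch below is a rough illustration under stated assumptions: `direction` and `ensemble_health` are hypothetical helpers that are not part of the library, the first word of each bottom line is assumed to carry the directional answer (as in § 9.2), and the hedge-word list is an arbitrary starting point.\n",
"\n",
"```python\n",
"# Hypothetical hedge markers; extend to suit your aggregator's style.\n",
"HEDGES = ('however', 'on the other hand', 'uncertainty')\n",
"\n",
"def direction(bottom_line):\n",
"    # Crude directional label: first word of the bottom line, punctuation stripped, upper-cased.\n",
"    words = bottom_line.strip().split()\n",
"    return words[0].strip('.,:;').upper() if words else ''\n",
"\n",
"def ensemble_health(bottom_lines, key_points, synthesis):\n",
"    # One boolean per takeaway criterion that can be checked without an LLM.\n",
"    directions = {direction(b) for b in bottom_lines}\n",
"    input_chars = sum(len(b) for b in bottom_lines) + sum(len(p) for p in key_points)\n",
"    return {\n",
"        'genuine_disagreement': len(directions) >= 2,\n",
"        'hedging_present': any(h in synthesis.lower() for h in HEDGES),\n",
"        'synthesis_shorter_than_inputs': len(synthesis) < input_chars,\n",
"    }\n",
"\n",
"report = ensemble_health(\n",
"    bottom_lines=['YES, driven by falling battery costs.', 'NO', 'NO, infrastructure lags.'],\n",
"    key_points=['Battery costs fell sharply.', 'Charging coverage is uneven.'],\n",
"    synthesis='Unlikely by 2030; however, falling costs could shift the outlook.',\n",
")\n",
"print(report)\n",
"```\n",
"\n",
"A keyword check like this only flags candidates for review; it cannot confirm that the synthesis represents the minority view fairly."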
]
},
{
"cell_type": "markdown",
"id": "cf0e971b",
"metadata": {
"papermill": {
"duration": 0.004011,
"end_time": "2026-05-27T12:43:41.628065+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:41.624054+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"## 10 · `majority_vote` mode — the deterministic-picker fix\n",
"\n",
"The default `llm_synth` mode in § 8 uses the *content* of each voter's `bottom_line`, so it works even when confidence scores are flat. But what if you need a *decisive single answer* (YES / NO / A / B) rather than a balanced synthesis?\n",
"\n",
"The naive approach is `highest_confidence` mode: pick the voter with the highest self-reported confidence. **This breaks on Llama**: confidences come back `[4, 4, 4]`, so the argmax is an arbitrary tie-break.\n",
"\n",
"The fix is `majority_vote` mode: each voter emits a categorical answer (e.g. `\"YES\"`), and **Python tallies the votes** deterministically. Confidence numbers are ignored entirely. This is the same pattern as Mental Loop's `scoring_fn` (notebook 10): **let the LLM predict the underlying signal, let Python compute the picker**."
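,
"\n",
"The tally step can be sketched in a few lines of plain Python. This is a minimal illustration, not the library's implementation: `extract_label` and `tally_votes` are hypothetical helper names, the vote dicts assume only the `categorical_answer` and `bottom_line` fields shown in this notebook, and reporting an exact tie as `UNCERTAIN` is an assumption of the sketch.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"LABELS = ('YES', 'NO', 'UNCERTAIN')  # assumed label set, taken from the prompt used in this section\n",
"\n",
"def extract_label(vote):\n",
"    # Prefer the explicit categorical field when the voter filled it in.\n",
"    label = vote.get('categorical_answer')\n",
"    if label:\n",
"        return label.strip().upper()\n",
"    # Hypothetical fallback: accept the bottom line only if it names exactly one label.\n",
"    text = vote.get('bottom_line') or ''\n",
"    words = {w.strip('.,;:!?') for w in text.upper().split()}\n",
"    found = [l for l in LABELS if l in words]\n",
"    return found[0] if len(found) == 1 else None\n",
"\n",
"def tally_votes(votes):\n",
"    # Count labels deterministically; confidence numbers are never read.\n",
"    counts = Counter(l for l in map(extract_label, votes) if l is not None)\n",
"    if not counts:\n",
"        return None, counts\n",
"    ranked = counts.most_common()\n",
"    # Assumption: an exact tie for first place is reported as UNCERTAIN.\n",
"    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:\n",
"        return 'UNCERTAIN', counts\n",
"    return ranked[0][0], counts\n",
"\n",
"votes = [\n",
"    {'voter': 'analytical', 'categorical_answer': 'YES', 'bottom_line': 'YES, likely.'},\n",
"    {'voter': 'skeptical', 'categorical_answer': None, 'bottom_line': 'NO, hurdles remain.'},\n",
"    {'voter': 'pragmatic', 'categorical_answer': 'YES', 'bottom_line': 'YES, on balance.'},\n",
"]\n",
"winner, counts = tally_votes(votes)\n",
"print(winner, dict(counts))\n",
"```\n",
"\n",
"Because the picker never reads `confidence`, a flat `[4, 4, 4]` cannot influence the result."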
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "dc162679",
"metadata": {
"execution": {
"iopub.execute_input": "2026-05-27T12:43:41.639868Z",
"iopub.status.busy": "2026-05-27T12:43:41.639868Z",
"iopub.status.idle": "2026-05-27T12:43:51.534803Z",
"shell.execute_reply": "2026-05-27T12:43:51.534039Z"
},
"papermill": {
"duration": 9.902302,
"end_time": "2026-05-27T12:43:51.536380+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:41.634078+00:00",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"Mode: majority_vote (deterministic Python picker) ─────────────────────────────────────────────────────────────────\n", "\n" ], "text/plain": [ "\u001b[1;36mMode: majority_vote \u001b[0m\u001b[1;36m(\u001b[0m\u001b[1;36mdeterministic Python picker\u001b[0m\u001b[1;36m)\u001b[0m \u001b[92m─────────────────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Majority answer: YES (2/3 voters) — tally: YES=2, NO=1. \n", "\n", "Supporting voters' bottom lines: \n", "\n", " • (analytical) Electric vehicles will likely account for over 50% of new car sales globally by 2030 due to \n", " decreasing battery costs and increasing government regulations. \n", " • (pragmatic) Electric vehicles will likely account for over 50% of new car sales globally by 2030 due to \n", " decreasing battery costs, government incentives, and growing consumer demand. \n", "\n", "Dissenting voter(s): \n", "\n", " • (skeptical -> NO) I do not think electric vehicles will account for over 50% of new car sales globally by 2030. \n", "\n" ], "text/plain": [ "\u001b[1mMajority answer: YES\u001b[0m (2/3 voters) — tally: YES=2, NO=1. \n", "\n", "Supporting voters' bottom lines: \n", "\n", "\u001b[1m • \u001b[0m(analytical) Electric vehicles will likely account for over 50% of new car sales globally by 2030 due to \n", "\u001b[1m \u001b[0mdecreasing battery costs and increasing government regulations. \n", "\u001b[1m • \u001b[0m(pragmatic) Electric vehicles will likely account for over 50% of new car sales globally by 2030 due to \n", "\u001b[1m \u001b[0mdecreasing battery costs, government incentives, and growing consumer demand. \n", "\n", "Dissenting voter(s): \n", "\n", "\u001b[1m • \u001b[0m(skeptical -> NO) I do not think electric vehicles will account for over 50% of new car sales globally by 2030. 
\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"VOTE_TALLY (Python-computed): {'YES': 2, 'NO': 1}\n",
"CATEGORICAL_ANSWERS (LLM-supplied): ['YES', None, 'YES']\n",
"CONFIDENCES (unused for argmax — note flatness): [4, 4, 4]\n"
]
}
],
"source": [
"# Re-run the ensemble with the deterministic picker; the prompt asks for an explicit YES/NO/UNCERTAIN label.\n",
"print_header(\"Mode: majority_vote (deterministic Python picker)\")\n",
"mv_arch = Ensemble(aggregator_mode=\"majority_vote\")\n",
"mv_result = mv_arch.run(\n",
"    \"Will electric vehicles account for over 50% of new car sales globally by 2030? Answer YES, NO, or UNCERTAIN.\"\n",
")\n",
"# Show the first 700 characters of the report, then the raw metadata behind the pick.\n",
"print_md(mv_result.output[:700])\n",
"print()\n",
"print(f\"VOTE_TALLY (Python-computed): {mv_result.metadata['vote_tally']}\")\n",
"print(f\"CATEGORICAL_ANSWERS (LLM-supplied): {mv_result.metadata['categorical_answers']}\")\n",
"print(f\"CONFIDENCES (unused for argmax — note flatness): {mv_result.metadata['confidences']}\")"
]
},
{
"cell_type": "markdown",
"id": "12fec1f3",
"metadata": {
"papermill": {
"duration": 0.002138,
"end_time": "2026-05-27T12:43:51.550363+00:00",
"exception": false,
"start_time": "2026-05-27T12:43:51.548225+00:00",
"status": "completed"
},
"tags": []
},
"source": [
"## 11 · Failure modes, safety, extensions\n",
"\n",
"### 11.1 · Where this breaks\n",
"\n",
"| Failure | Mechanism | Mitigation |\n",
"|---|---|---|\n",
"| **Flat confidences** | Same Llama-as-Scorer pathology: every voter reports confidence 4/5 | **Use `aggregator_mode=\"majority_vote\"`** (see § 10): Python tallies discrete `categorical_answer` values, sidestepping the flat-confidence noise |\n",
"| **Aggregator washout** | Synthesis blends opinions so the minority view disappears | The aggregator prompt explicitly asks to preserve disagreement (we do this); washout still happens occasionally |\n",
"| **Hidden conformity** | All 3 voters give the same answer despite different prompts | Run at a higher temperature, or use genuinely different LLMs (gpt-4o + claude + llama) |\n",
"| **Voters skip `categorical_answer`** | Llama leaves the optional field null even after instruction | The library has a keyword fallback: it scans `bottom_line` for YES/NO/UNCERTAIN if `categorical_answer` is missing |\n",
"| **Adversarial-voice winner** | The skeptical voter's \"what could go wrong\" is loudest | Counterbalance with an explicit \"what could go right\" voter |\n",
"| **Cost** | K voter calls + 1 aggregator call = K+1 LLM calls per task | Don't use Ensemble for cheap tasks; reserve it for high-stakes decisions |\n",
"\n",
"### 11.2 · Production safety\n",
"\n",
"- **Confidence is unreliable.** Don't expose voters' self-reported confidence to users as if it were a calibrated probability; it isn't.\n",
"- **Aggregator is a single point of failure.** If the aggregator LLM produces a bad sample, the whole ensemble's output is corrupted. Run the aggregator 2-3 times and pick the best (meta-ensemble).\n",
"- **Voter diversity matters more than count.** 3 *genuinely different* voters beat 7 paraphrases of the same voter.\n",
"\n",
"### 11.3 · Three extensions\n",
"\n",
"1. **Parallel voter execution.** Replace the sequential `_vote` loop with a parallel fan-out (for example, LangGraph's `Send` API or a thread pool). Voter latency then approaches that of the slowest single call rather than the sum of K calls.\n",
"2. **Voters are full architectures.** Instead of 3 LLM calls with different prompts, use 3 *different architectures* (ReAct + Reflection + Planning) on the same task. Higher cost, much richer ensemble.\n",
"3. **Confidence-weighted aggregation.** Replace the LLM aggregator with a weighted majority vote in which each voter's contribution is weighted by its confidence. (Use only after fixing self-reported confidence calibration.)\n",
"\n",
"### 11.4 · What to read next\n",
"\n",
"- [**05 · Multi-Agent**](./05_multi_agent.ipynb): specialists with DIFFERENT sub-tasks vs Ensemble's same-task voters.\n",
"- [**07 · Blackboard**](./07_blackboard.ipynb): distributed bidding instead of fan-out.\n",
"- [**21 · Self-Consistency**](./21_self_consistency.ipynb): N samples + majority vote (Ensemble's simpler cousin).\n",
"- [**28 · Multi-Agent Debate**](./28_agent_debate.ipynb): adversarial back-and-forth instead of one-shot vote.\n",
"\n",
"### 11.5 · References\n",
"\n",
"1. Surowiecki, J. *The Wisdom of Crowds.* 2004.\n",
"2. Wang, X. et al. *Self-Consistency Improves Chain-of-Thought Reasoning in Language Models.* ICLR 2023. [arXiv:2203.11171](https://arxiv.org/abs/2203.11171)\n",
"3. Du, Y. et al. *Improving Factuality and Reasoning in Language Models through Multiagent Debate.* 2023. [arXiv:2305.14325](https://arxiv.org/abs/2305.14325)\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
},
"papermill": {
"default_parameters": {},
"duration": 37.898106,
"end_time": "2026-05-27T12:43:52.663198+00:00",
"environment_variables": {},
"exception": null,
"input_path": "all-agentic-architectures/notebooks/13_ensemble.ipynb",
"output_path": "all-agentic-architectures/notebooks/13_ensemble.ipynb",
"parameters": {},
"start_time": "2026-05-27T12:43:14.765092+00:00",
"version": "2.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}