{ "cells": [ { "cell_type": "markdown", "id": "f5c67cc4", "metadata": { "papermill": { "duration": 0.026531, "end_time": "2026-05-27T11:53:24.565065+00:00", "exception": false, "start_time": "2026-05-27T11:53:24.538534+00:00", "status": "completed" }, "tags": [] }, "source": [ "# 12 · Graph Memory (World Model) — knowledge graph + traversal Q&A\n", "\n", "> **TL;DR.** Extract `(subject, predicate, object)` triples from a corpus into a knowledge graph. Answer questions by **traversing the graph** — entity match + N-hop expansion — rather than re-reading the source text. The graph **is** the agent's world model.\n", ">\n", "> **Reach for it when** you have a structured-fact-heavy corpus (company filings, medical records, biographical data) and questions are entity-anchored (\"who founded X?\", \"where is X headquartered?\").\n", "> **Avoid when** questions need fuzzy/semantic matching (\"what's the overall mood?\") — vector RAG fits better.\n", "\n", "| Property | Value |\n", "|---|---|\n", "| Origin | Knowledge graphs (Berners-Lee, semantic web); modern LLM revival: Microsoft GraphRAG (2024) |\n", "| Memory backend | **NetworkX** (in-process, zero setup) by default; switch to **Neo4j** with one env var |\n", "| Persistence | On the architecture instance; `ingest()` adds, `run(question)` queries |\n", "| Cost per query | 1 LLM call (final synthesis); entity match and traversal are pure Python |\n", "| Cost per ingest | 1 structured-output LLM call per document |\n", "| Composability | Reuses `SemanticMemory` from the library (same backend as nb 08) |\n", "\n", "This is the **read-mostly** companion to notebook 08 (Episodic + Semantic Memory). nb 08 builds memory *interactively* across a conversation; nb 12 builds memory *upfront* from a corpus and queries it many times." 
] }, { "cell_type": "markdown", "id": "7ba8040a", "metadata": { "papermill": { "duration": 0.03596, "end_time": "2026-05-27T11:53:24.634653+00:00", "exception": false, "start_time": "2026-05-27T11:53:24.598693+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 2 · Architecture at a glance\n", "\n", "```mermaid\n", "flowchart LR\n", " T[corpus text] -.ingest.-> X[Extract triples<br/>structured-output<br/>(s, p, o) list]\n", " X -.write.-> G[(Knowledge graph<br/>NetworkX / Neo4j)]\n", " Q([question]) --> M[Entity match<br/>find graph nodes mentioned in question]\n", " M --> Tr[Graph traversal<br/>N-hop facts from matched entities]\n", " G -.read.-> Tr\n", " Tr --> S[Synthesise<br/>LLM with retrieved triples as context]\n", " S --> Z([answer])\n", "\n", " style X fill:#fff3e0,stroke:#f57c00\n", " style M fill:#e3f2fd,stroke:#1976d2\n", " style Tr fill:#fce4ec,stroke:#c2185b\n", " style S fill:#e8f5e9,stroke:#388e3c\n", "```\n", "\n", "**Ingest path** (left): text → LLM extracts triples → write to graph.\n", "**Query path** (right): question → entity match (Python) → graph traversal (Python) → synthesis (LLM). The query path uses only ONE LLM call (the final synthesis); retrieval is mechanical." ] }, { "cell_type": "markdown", "id": "d8afd03d", "metadata": { "papermill": { "duration": 0.279742, "end_time": "2026-05-27T11:53:25.183299+00:00", "exception": false, "start_time": "2026-05-27T11:53:24.903557+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 3 · Theory\n", "\n", "### 3.1 · Why a graph, not just vectors?\n", "\n", "Vector RAG (notebook 23) is great for *semantic* questions: \"what does the corpus say *about* X?\" — embeddings recall passages by topic similarity.\n", "\n", "Vector RAG is *bad* for *structural* questions:\n", "- \"What does the graph say is the relationship between X and Y?\"\n", "- \"Who works at the same company as X?\"\n", "- \"Find all entities that are connected to X via path A→B→C.\"\n", "\n", "These need **structured traversal**, not similarity. The graph encodes the relationships explicitly; the query walks them.\n", "\n", "### 3.2 · Ingest = structured-output triple extraction\n", "\n", "```python\n", "class _IngestionTriple(BaseModel):\n", " subject: str # specific named entity\n", " predicate: str # snake_case verb (founded_by, headquartered_in, ceo_is)\n", " object: str # specific entity OR literal\n", "```\n", "\n", "The `predicate` discipline matters: free-form predicates fragment the graph (`works_at` vs `is_employed_by` vs `works_for` produce 3 different edges). 
The schema description nudges the model toward snake_case verbs, but production use needs **predicate normalisation** (a fixed vocabulary or entity-linking layer).\n", "\n", "### 3.3 · Query = entity match + N-hop expansion\n", "\n", "```python\n", "def run(question):\n", " entities = entities_in_query(question) # Python string match\n", " facts = facts_about(entities, depth=2) # graph traversal\n", " return llm.invoke(f\"Question: {question}\\nFacts: {facts}\")\n", "```\n", "\n", "Two design choices worth flagging:\n", "\n", "1. **Pure Python entity match.** Faster + cheaper than LLM-based entity recognition, but fragile to synonyms (\"the CEO\" vs \"Sam Altman\"). Production version uses an entity-linker.\n", "2. **N-hop expansion**, not just direct facts. If the question is about X, we also retrieve facts about Y where X-[*]→Y for N hops. This catches transitive answers like *\"who reports to the CEO?\"*\n", "\n", "### 3.4 · Graph backend swap — same API, different scale\n", "\n", "```python\n", "# Default (zero setup):\n", "arch = GraphMemoryAgent() # NetworkX in-process\n", "\n", "# Neo4j / AuraDB Free (when GRAPH_BACKEND=neo4j):\n", "from agentic_architectures.memory import SemanticMemory\n", "arch = GraphMemoryAgent(semantic=SemanticMemory(backend=\"neo4j\"))\n", "```\n", "\n", "The `SemanticMemory` class (from the library) wraps both NetworkX and Neo4j behind a uniform Cypher-subset API. 
The architecture doesn't know which backend it's running on.\n", "\n", "### 3.5 · Where Graph Memory sits\n", "\n", "| Pattern | Storage | Recall mechanism | Best for |\n", "|---|---|---|---|\n", "| Vector RAG (nb 23) | Dense vectors | Cosine similarity | Semantic / fuzzy questions |\n", "| **Graph Memory** *(this notebook)* | **Triples in graph** | **Entity match + traversal** | Structural / relational questions |\n", "| Hybrid RAG | Both | RRF / reranker | Best-of-both at higher cost |\n", "| GraphRAG (nb 27) | Graph + community summaries | Traversal + summary recall | Global questions over a corpus |\n", "| Episodic + Semantic (nb 08) | Vector + Graph | Both | Personal-assistant continuity + facts |\n", "\n", "### 3.6 · What goes wrong (you'll see in § 9)\n", "\n", "1. **Predicate fragmentation.** `works_at`, `is_employed_by`, `works_for` → 3 different edges instead of 1. Production needs normalisation.\n", "2. **Entity drift.** \"OpenAI\" vs \"Open AI\" vs \"OpenAI Inc\" → 3 different nodes. Production needs entity linking.\n", "3. **Brittle entity match.** Query \"the CEO\" doesn't match graph node \"Sam Altman\". Mitigation: alias-aware match, or LLM-based entity extraction at query time.\n", "4. **N-hop noise.** Depth=3 retrieval pulls in irrelevant facts at the periphery. 
Cap depth + score by relevance.\n" ] }, { "cell_type": "markdown", "id": "051b7454", "metadata": { "papermill": { "duration": 0.037854, "end_time": "2026-05-27T11:53:25.270733+00:00", "exception": false, "start_time": "2026-05-27T11:53:25.232879+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 4 · Setup" ] }, { "cell_type": "code", "execution_count": 1, "id": "7960aea9", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T11:53:25.355229Z", "iopub.status.busy": "2026-05-27T11:53:25.352228Z", "iopub.status.idle": "2026-05-27T11:53:35.615338Z", "shell.execute_reply": "2026-05-27T11:53:35.607651Z" }, "papermill": { "duration": 10.317961, "end_time": "2026-05-27T11:53:35.623894+00:00", "exception": false, "start_time": "2026-05-27T11:53:25.305933+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
Provider: nebius  ·  Model: meta-llama/Llama-3.3-70B-Instruct ─────────────────────────────────────────────────────\n",
       "
\n" ], "text/plain": [ "\u001b[1;36mProvider: nebius · Model: meta-llama/Llama-\u001b[0m\u001b[1;36m3.3\u001b[0m\u001b[1;36m-70B-Instruct\u001b[0m \u001b[92m─────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Graph backend: networkx                                                                                            \n",
       "
\n" ], "text/plain": [ "Graph backend: \u001b[1mnetworkx\u001b[0m \n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from agentic_architectures import get_llm, enable_langsmith, settings\n", "from agentic_architectures.architectures import GraphMemoryAgent\n", "from agentic_architectures.ui import print_md, print_header, print_step\n", "\n", "enable_langsmith()\n", "print_header(f\"Provider: {settings.llm_provider} · Model: {settings.llm_model}\")\n", "print_md(f\"Graph backend: **{settings.graph_backend}**\")" ] }, { "cell_type": "markdown", "id": "05dc932c", "metadata": { "papermill": { "duration": 0.033525, "end_time": "2026-05-27T11:53:35.697354+00:00", "exception": false, "start_time": "2026-05-27T11:53:35.663829+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 5 · Library walkthrough\n", "\n", "Source: [`src/agentic_architectures/architectures/graph_memory.py`](../src/agentic_architectures/architectures/graph_memory.py).\n", "\n", "Two public methods (`ingest`, `run`) and two internal helpers:\n", "\n", "| Method | Purpose | LLM calls |\n", "|---|---|---|\n", "| `ingest(text)` | Extract triples from text → write to graph | 1 (structured output) |\n", "| `run(question)` | Query the graph | 1 (final synthesis) |\n", "| `_entities_in_query` | Python string match against graph node names | 0 |\n", "| `_facts_block(entities)` | N-hop traversal via `SemanticMemory.facts_about` | 0 |\n", "\n", "`SemanticMemory` lives at [`src/agentic_architectures/memory/semantic.py`](../src/agentic_architectures/memory/semantic.py); the NetworkX/Neo4j abstraction is at [`src/agentic_architectures/memory/graph.py`](../src/agentic_architectures/memory/graph.py)." 
] }, { "cell_type": "code", "execution_count": 2, "id": "852cbff0", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T11:53:35.759003Z", "iopub.status.busy": "2026-05-27T11:53:35.757470Z", "iopub.status.idle": "2026-05-27T11:53:35.830908Z", "shell.execute_reply": "2026-05-27T11:53:35.821804Z" }, "papermill": { "duration": 0.115287, "end_time": "2026-05-27T11:53:35.837022+00:00", "exception": false, "start_time": "2026-05-27T11:53:35.721735+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Triple schema:\n", "{\n", " \"properties\": {\n", " \"subject\": {\n", " \"description\": \"A specific named entity (capitalised).\",\n", " \"title\": \"Subject\",\n", " \"type\": \"string\"\n", " },\n", " \"predicate\": {\n", " \"description\": \"A short relation verb in snake_case (e.g. founded_by, headquartered_in, ceo_is).\",\n", " \"title\": \"Pred...\n" ] } ], "source": [ "from agentic_architectures.architectures.graph_memory import _IngestionResult, _IngestionTriple\n", "import json\n", "print('Triple schema:')\n", "print(json.dumps(_IngestionTriple.model_json_schema(), indent=2)[:300] + '...')" ] }, { "cell_type": "markdown", "id": "9cdcfdba", "metadata": { "papermill": { "duration": 0.032419, "end_time": "2026-05-27T11:53:36.096410+00:00", "exception": false, "start_time": "2026-05-27T11:53:36.063991+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 6 · State\n", "\n", "The graph lives on the **architecture instance** (`arch.semantic`), not in LangGraph state. To reset, create a new `GraphMemoryAgent()`. To persist across process restarts, swap `semantic=SemanticMemory(backend='neo4j')` and use AuraDB Free." 
] }, { "cell_type": "markdown", "id": "314a6287", "metadata": { "papermill": { "duration": 0.042594, "end_time": "2026-05-27T11:53:36.174087+00:00", "exception": false, "start_time": "2026-05-27T11:53:36.131493+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 7 · Build the graph (notebook compatibility — the real work is in `ingest()` + `run()`)" ] }, { "cell_type": "code", "execution_count": 3, "id": "5abdac71", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T11:53:36.254406Z", "iopub.status.busy": "2026-05-27T11:53:36.251383Z", "iopub.status.idle": "2026-05-27T11:53:54.326148Z", "shell.execute_reply": "2026-05-27T11:53:54.314061Z" }, "papermill": { "duration": 18.143919, "end_time": "2026-05-27T11:53:54.336128+00:00", "exception": false, "start_time": "2026-05-27T11:53:36.192209+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAGoAAADqCAIAAADF80cYAAAQAElEQVR4nOydB3gU1drHz8y2JJu+SUiy6RAICRpKgHApkY5KFwFBbEhvSvHDB8QbRBERRQUuinIRFPCCBTAPRYiUBAUkoHRIQnpCsim72ZItM/Od3U1ZktkyM+wyLPODZ5/dc87MTv57ynvqyycIAnDQhQ84GMDJxwhOPkZw8jGCk48RnHyMYCrf3WsNd3IU9bX6Bg1m0BKgjRWEAwIFiGUIygM4BgACEyMkd0QJgDeG8wQA0xvfmJK2Tkwgxn9NKRFMT7S6vOmGxoewhCdChCLU248fneidlOoNGIDQs/tyMhVXsmtVCgN8Mr4AEXqifCG8FSCwtndDwP2i8vgIZiAQFBA42QOhCIE3pkcFCG4WBUFAm+ckEAQhWqe0vLzp+wDA7g8QoDgO9A2YXkfAJ/EQ82KTvAdODALUoSxfzom6v47XwCcMkXqkDJFEdRaBRxllNXHmUGXpHbVBj8d08R7xUjtKl1OTb+d7hWollpjqN2CcBLgXN84pz2ZU4Rgx4904IHT0Kgry/WdpXlCkx/OLpMB9OblPdu3Pur6jg7um+TmS3lH5Ni3OHTQxNJFZRfuosGVp7tS3Y/0kPLspHZJv85LcGWs6CD3B48OX/5efMlTSY4idPIgCe/znrfxBk8IeK+0gs9bFnTsqk1dhtpPZke/b1YXtIkWde4nB40fv4ZLd6wtsp7El31/H6zRqbPwCd24rbNBjiL+XD//Hz0ttpLEtX01SL4caIHdlwsKI8gKNjQRW5fv7pAI3EP3Hu5t9RwmxH8/Lh/fTJqsZ0Kp8l87Utot0dXsxdOjQ0tJSqlfl5eWNHDkSOIfkAQGVxVprsVbl09RjPYfQ6QbSpry8vLa2FlDn+vXrwGn0GOwPuyKFN9WkseQjLncuq6A5GJXkcOeFCtDS3LNnz6+//lpYWBgbG5uamjpnzpxLly7Nnj0bxo4ZMyYtLW3Dhg0wT+3fv//ChQtlZWVxcXFjx46dMGGC+Q6DBw9+/fXXMzMz4VXTpk3btWsXDExJSXnzzTenTp0KHjSe3rxrZ+uj
E7zaRpHLd/eqSuC0oYC9e/du3779jTfe6Nu378mTJzdv3iwWi1999dWNGzfCwAMHDkilxrYeKgiFW7FiBRxYKSgoWLduXVhYGLwERgkEgp9//rlXr15QxB49esAEx44dg78HcA5iP37NPfLySy6folrv4WW/y0KPnJycxMREc201bty4nj17qtUkRWPt2rUqlSo8PByYctbBgwfPnj1rlg/q5efnt3TpUuASfCWCsjzy9pdcPp0WEwidJV9ycvIXX3yxevXqbt26DRgwICIigjQZLOMwn2ZnZ8Mybg4x50oz8AcArgIWXr2OvPtBLp9x1JOHA+cwZcoUWFpPnTqVnp7O5/Nha7tw4cLg4GDLNDiOL1q0SKfTzZ8/H2Y9Hx+f6dOnWyYQCp1SL5OCIsb8ThpFLp/Qk6dV2+nu0X8aFB1nIj8///z581999ZVSqfz0008t09y8efPatWtbtmyBFZw5pL6+PiQkBDwM1PU4ilKRT+wjqKvSAecA6/jOnTu3b98+zgTUBbYDrdLU1dXB12a98k3AS8DDQC7T84XkFh55aFSit0HnrLUvR44cWbZs2enTp+VyeVZWFrQ/YG0Iw2NiYuDrb7/9dvXqVSgrLNfQIlEoFLDZXb9+PbRvoGFIesOoqCiZTAYb8eZa8sGilOsDQ8gNEXL5uvQRYxghK3NKBly5ciVUZ/HixdB8e++996CVB60TGA7bkFGjRm3duhU2LKGhoWvWrLly5cqgQYOgNTdv3jxo9EFZm00/S/r169e1a1fYEB89ehQ4AXW9oWM38jEnq8OlX76d3y7SY+zccPB4c+OCMvOHinkfdyCNtdppS+zlW1HYAB57/jpWHRhitZW3Ok3ef1zQlWz536fkyVYmTSoqKiZPnkwa5e3tDRtT0ihYbGGXAziHHSZIo0xzwuTlDNpGpHWCmTqZfsZ7VpssW3MdmXsqb19Wzl4XRxprMBgqKytJoxoaGjw8PEijYIPgPPuj3gRpFGyCfH19SaNgOPy9SaP2fFQMxwumvh0FrGBnqmjbivyoBK/h00LB40fRrYZDX5XM29DBRho7cx0z3o+7c0mpkTvLhGYzGV+X9h9rp6DYn2kb/mLojg8KwGPG9n8XRHYSP9nf13Yyh+Z5ayp0u9cXz9/wcIx+17N1ef6AcSGJve2vCXB0lcHdq+qM7WXJ/QP6u93qFkuKbmoy/lsWk+j99MsOrRWiskQIA1+uzBeK0BEvhYbFeQC3Y/dHxXKZrs/I4K4DfB28hPICtYztFYU3VJ5iXodkb2gbgkefy6frr2bXyqv1klDR5KURlK6luTzy8PZ7xXkqg5bgCxE4l+zhhcIxLpTfZnkkYlrY2DRyiKJwIK9xuSdiarQsV0gixlWoCG5a3GhOaV7riPJR3IA339C81hJ+F24AFimNMeY35nuaw3l8gBlAq6WYPD4c/sRVdQaNCtNqMB4PCQoTPTdbCqjPT9CUz4yyhjj/W3VVsUapMOi1OIKieJvVpZbrQs3vjZJAYUwDaJZfbhnSmBL+HDjBF6B4G8OJxwMYZpnSqKzxDXLfHaA0cOyj1dpUPh/hCxCRJy+gneCJfwVEdKI/rcNIPhcwfPjw3bt3SyQsba/YvrIedg1hPw+wFU4+RnDyMYLt8un1ejgpDtgKq+XDjWaOcWYOsBVWy8fykgs4+RjC6odjecUHuNzHEE4+RnDyMYKTjxFsl49rOujD5T5GcPIxgpOPEdBs5uSjD5f7GMHJxwhOPkZw8jGCG3FhBJf7GMHj8Xx8fACLYftUkVwuByyG3UWDz4flF7AYTj5GcPIxgpOPEZx8jGC74cLJRx8u9zGCk48RnHyM4ORjBCcfIzj5GMHJxwhOPkZw8jGC/fKxcVdRenr6wYMHzQ8GXxETKIpeuHABsAw2LlqfM2dOTEwMagJ2e+ErlM/aQWsPFzbKFxISMmTIEMsQKN+YMWMA+2DplokXX3wxOjq6+aNUKh07dixgHyyVD06wjRo1qnlD
zLBhw/z9/QH7YO+GnSlTppjru/Dw8PHjxwNWwrTlrSzRXc1WNKj0GImbnZY93y0hzRvKCdDWsRFiOmWw2VtOaVlJbm5eeFh4fMd4y13pTbcyPnzz4zdvKyfw1u59YBQBWl/u6SWISfLu0JXR2dSM5Nu5pkgtN/A9UYMOJ8hOykF5BI7dd+6ieV+8UQtA4uqpVThMjGE40rj1vE16c8nB77szsJaSaB0oEqE6PSEQIq/8O4ZH95xW+vLtSC/08hE+PT0MPMpcPFZz44J8Znosj1YupCnfjtVF/oGiwdOoOUZiJ4XXddkHSmd9GAuoQ6fpKLyh06gM7qEdJDpRKBShR3dVAerQ6fNeP1fr4emsU50fCj4SYWWRmvp1tORTyzED5lZOkWEbrm2g8xfRkQ+DRorBreQz4Dimp3PMN+fi04T5AB3qcPIZMY7qoK4qvAiP3k/FXoimg5yoQkc+2MFg98lNlMFxgqDl4YBW7kPdLfeZHCK4qvCaj0ED7oR1lxK2oVV4jWe9uV/l57rc5251H2E+/486nOFiggCuq/tg04G6W9PhwroP0PulWAzh0roPIO4moEtbXuMP5ValFzFNEQDqcC2vEYKgWZxc1PKq1er3167MyTlvMBjmzV0ik1WePpO5c8ePMOrpZ/u9/NLMyZNeMqf8aP3qvLzbX279Dpg2pH6zfcuf57IqKyu6dOk6bszE1NR+wOg4K3f6jMlr39/48Sdr/P0DxGJvkVD00bpNzV/3zqql1TWyLZt2OPh4CN26iNY8L/VO2ycbP8jPu7Px020/7MkoKSk6fuKwI/ucP//io/0/7h43dtLu7w+lDRj8bvpbp06fACYflfB153dfT5o4bcnilc+MGHMx53xNTbX5qoaGBqj4sKHPAocxn/QMqENHPoRiVlepVKdOHZ84cVqnjp0DAyXz5i7m8wV2WzqtVnv02K9TXnhl9Kjn/Hz9nnl6zOBBI3bu2gYau6igZ0rq8xOmdk5IGjhwmJeXV+bvjW6ysrJPwtdBg4YDx0HptRy05KNa9ZWWFcNimJCQZP4In7Rz5y525bt9+4ZOp+uZ0qc5pGtyD1hs5YrGHb4d4zub3wiFwiGDnz5+/LD545kzmX3/lebr4+ix/UbggAvhOrMZoXQaa12d0Wmxl2eLf1vL99ZQKo1ehxYsmt4qvLam2rxDXyhqOSx95LPjfzmwr7SsRBIYdO589jsrPgBUMB2t78LxPpzK6Ji3t3FDvVbX4uJWpVZZS4w1nU8vCTI6/VyyeIVUGmmZICQktKZG1uqq9u3jYY4+fPhAfHyCp6dX7959ARVoDpbSbHkp/lLtQoyeom7evNYxPgGYjhO+fu0fUZMrLTjJqtG0TBIWFzc6SoyQRolM+atb1xRzSG1tDSzysJqrqSH5Flg57v1hJ2yXYEGmfoAETUuMVstLcbxKIgnq0iX56282l5QWy2RVn25cW69UNMcmJj4B21OzX7dd330DbRpzOJTplZdnwbbiypXLsBKEaZa+NXfjZx9a+5ZBA4dXV1fBkgt1BBRBXNl0GFflUPyx3l6+OqFT4oyZLzw/6WmVSpk2oGXx6Px5SwMDJKPGPDV0eKpW2wCb1+YoaAwuW7pq994dMPazz9eFh0UsWbLS2ldAuXv06B0VGRMbS9mnEqyLXNfnRXkIQlH28DCppVlrmYmk4RGfbNhq7UJoncD/rQIjIqJ+P/FXq0CYQ2E+nTljAaCOyWx2VdOBYzQnVpxERUU5tI1++nlvdHQsjZILTLYUvaaDTuFFBQjKpmHWE5lHli6bC3sdK95eQ08Hgu4yPZq5D2O2xuWNRcvBg2PqlFfhf8AAlKIl2wwtsxlOy7vXTCWc58VdNs8LKz6cTXXfQ4RWyysArKr7mGPa9uUqwwU3ELh7LVAzDcC5rM9L98tYDM3cQKvwAoCwdzsNHYzDVS4brEdoju6wF9Mf5Kq6z/hLud1UkevW9wECuJl8tKGX+9xtopI2dJoAgQdP6OlW
bYfIUyDydNVMW1C4yKB1q+ynkuvF3nQOiKYjX/+xEr0eq63AgLugqNY/+VQAoA7NMpjYy+/wjiLgFuz7pNA/WNipu/3Jv7bQ35BadEtzeEd5cIRXZLwY4d9/k1Zjt60+ogTAEStP03oWxbyWCyYn2XbRdNsW/9Cm1ISlS2pTGmN3Fmk9m4aivPJcZXmBpv0T4oGTggEtGG2HLriqOXOwSlNv0GnJZ5mbPV/fF4mYVzSRAMfd8Kat5JZXtby3ENjYy7cUyRhijLWUDwHwdijprfgixNODH9/dp+/oQEAXtjvXHjFixPfff88516YJ596YEZx8jGC5tycu9zGC1fLBZg3HcR6PvRv/OW8xjODkYwTn6okRXO5jBCcfIzj5GMHVfYzgch8jOPkYwcnHCE4+RnDyMYKTjxGcfIzg5GMEgEYNzgAAB19JREFUZzYzgst9jODkYwTbvcUEB9Oc/3cNrJYPw7DKykrAYjhfRYzg5GMEJx8jOPkYwcnHCE4+RrBdPmi7ABbD5T5GcPIxgu3ywUEXwGK43McITj5GcPIxgpOPEZx8jODkYwQbdxUtWLAgKyur+TQqFEVxHIcfL168CFgGG3c1L1q0KCIiAm0CmBSMiooC7ION8nXo0KFfv36WxQJmvbS0NMA+2OtcOzKy5cRX+H7ChAmAfbBUPqlUOnjwYPN7WPGlpKSYPUWzDfae6DB58mSzd3f4OmnSJMBKHqThIq/Eqsq1Oo0Bb9uY379PvHFPMtk25Zad54hoWJ/Xf284+USnLuqq4KuVCkB6iUWIxbVN28otHoGPAlSA+ocIQyKE4AHB1HDJvay+cERWI9OZT6vm8VHjRrQ2Z0si958s3bTLnOy46daHsNI8lJX8UpOsKM94VqRfoKBjd5+UYXROgGi5H235fv+f7PZFhd6ACz34XgEekgg/T78H9qs6FZ0OyIvlyhq1Vq0nMFzawXP0rHBACzry1RYZ9m0pMmBEgNQ/rJM/eJSpK1VX3q3B9Fi3p/xTn6F8qAFl+Y7tqrx9WSEJ8wtLon+CAtuoLdOU37jnFySYupyacU5Nvt/2VOVeVnZ+io0dAObk/lHK5+OvrIpx/BIK8v20qayiqCFxYDRwX+6cLYWN36vpjv6Njtp9GdvvVZZo3Vs7SPy/pIDH+3ZNoYPpHZKv4Jqm8IYyIc09y2wrYnuGNSixwzvuOZLYIfmO7CoPin60W1hKdEqLzruidCSlffkyvqmAAx4h7R8j+SBefh47VtsvwvblK7qpCm7P0jOQnEdcz1C1wgC7obaT2ZHvz4wa2MMKlIoBK1Gqape+0/vylePACQg8+cd2l9tOY0e+Wzn1Iu9Hoyv2wAkI860u19lOY0c+lcIQGE7F4ZkbERTrazAQteW2yq+tAau6e0Y3IP5OK7mK+upDhzcWFP+j0zV0ik8dkvZaSLDRriy/l7dh05SFs7Znnv726o1Tfr4hXZ8Y+szQeebjhC79c+zIiS81GkViQv+0vlOBM+Hz0Stn6wY8Z7Xqt5X78q4qnOfCHcOwrdvn5hXkPDdq+ZL5u73FgZ9/9ZqsugRG8XnGjVj7Dqzt9uTwD9/NmjIh/VT2939fM1Zw5fdyd+9fldLtmeVv/JjS9dkDGRuAM0H4PFm51kYCW/LJq3TOO1z9btHlSlnBCxPSEzr28fWRjBqxUOzlf+aPvc0JkpMGJXcZzOcL2sd2lwRIS0pvwsCz53709wsd+tR0Ly/fDnE9eqeMBc6EIDCV3Fb1Z6vwmmdXgXMoKPybxxPExzW6sINfBGXKL7jUnCAivHPzew8PH02D0eeirKY4tF1cc3ikNBE4E6NnA5vlz5Z8AhHfeceDa2DPCNNDs8My0FvcMvZL6sxMrVYESVpm4IRCT+BMCAyxXXvZki8gRIA7TT4fbwn841+bel/lhdpzuATLrF7f0PxRq1UBp4LjYh+RjXhb8iV08z39s7O2lEnDOup0Gn//dkGBjTOQ1TWllrmPlAD/sOs3z8Ba
xSz09VtZwJnACZzQaA8bCWz92kJvgPLR6sJ64ATi2/dMiO+z75f3a+sqlKq67HP7P9v6yvmcQ7avSk4aAnsav2RsgBZVbv7Fs+f2A2cCf6fuQ211WO1MVPr4C2rL6yXRPsAJvPbiJ39c+Om7/60sLL4SHBTdPXlE/z525nM7xfceOXzBH+d/WrYqFTbBU59P3/z1LCe5v7h3q5YvQG17ErYz2vzPaUXWIVniIDcfJSXldlZJu0jhmNlhNtLYqaqfHOALK5l7uXXg8UPXoLetHXBklUHHHj63LsrbdSAf74O1w6q1Q0mjDAYdtOxILcfQ4Lj5M7eBB8c3uxbfLfqbNEqv1woEJK2nUOCx6q0MYIW8P8sDQ2y1uWYcmiratuKuV4BYmkReiSoUMtJwrU4jsmKX8Xh8sfhBjr+q1HLMQL4DRKNVeYrIuu0IAns75JcoDHcvlMz92L6fX4fk06nAtlW5SUNiwePB9czC5P7+jrgCcGiuQygG3QcGXc8sAI8BudmlQWEiB90oODpR2Wekf7eBAddO3AVuzY2ThZJwwcTFUgfTU1tlcOmk/I9fq9v3Dhd5s/pwH3rc/L0oIFQwaTGFdZiU17hcOqnIOlDpHeAB50OBu1B2o6a2RBHZUTx6diilC2kuUNv+bkGDGvfyE8X0oPZ9bKPserX8npLHQ0bPlIbGUp7Vob++73aO6swvlep6A4+HwllRrwAPn3ZiT9YXap0GU8k0CplaU6/FdJhAhCSl+tP2t8N4WwwGMv5bUVmiVdXrzethUYCQD3NZWyZKFm5aymtvpJbkwvuDyBKgKBxIRIQiXpBU2HuEJCzWvm1sgwe/q0ijNE5kWHxDi9ulFidWJK+Wi5abXVs1rVGGoCa3TZY3bBVi7t603KpJ/+YQHvD04j3YxfBsd/XEctzLx7jL4eRjBCcfIzj5GMHJxwhOPkb8PwAAAP//mk6eSQAAAAZJREFUAwDDFs4LCJrWDAAAAABJRU5ErkJggg==", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import Image, display\n", "arch = GraphMemoryAgent(traversal_depth=2)\n", "graph = arch.build()\n", "display(Image(graph.get_graph().draw_mermaid_png()))" ] }, { "cell_type": "markdown", "id": "a4e8f9c2", "metadata": { "papermill": { "duration": 0.058808, "end_time": "2026-05-27T11:53:54.439762+00:00", "exception": false, "start_time": "2026-05-27T11:53:54.380954+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 8 · Live run — ingest then query\n", "\n", "Ingest a short corpus about Anthropic + OpenAI, then ask multiple structural questions over the resulting graph." 
] }, { "cell_type": "code", "execution_count": 4, "id": "bdf6a3ba", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T11:53:55.666929Z", "iopub.status.busy": "2026-05-27T11:53:55.662932Z", "iopub.status.idle": "2026-05-27T11:54:47.169363Z", "shell.execute_reply": "2026-05-27T11:54:47.162731Z" }, "papermill": { "duration": 52.106535, "end_time": "2026-05-27T11:54:47.169363+00:00", "exception": false, "start_time": "2026-05-27T11:53:55.062828+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
Ingesting corpus into knowledge graph ─────────────────────────────────────────────────────────────────────────────\n",
       "
\n" ], "text/plain": [ "\u001b[1;36mIngesting corpus into knowledge graph\u001b[0m \u001b[92m─────────────────────────────────────────────────────────────────────────────\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " extracted: 18 triples\n", " total entities in graph: 17\n", "\n" ] }, { "data": { "text/html": [ "
 Q: Who founded Anthropic and when?\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1mQ: Who founded Anthropic and when?\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
A: Anthropic was founded by Dario Amodei and Daniela Amodei in 2021. \n",
       "I used the following triples: \n",
       "- (Anthropic, founded_by, Dario Amodei)\n",
       "- (Anthropic, founded_by, Daniela Amodei)\n",
       "- (Anthropic, founded_in, 2021)\n",
       "
\n" ], "text/plain": [ "A: Anthropic was founded by Dario Amodei and Daniela Amodei in \u001b[1;36m2021\u001b[0m. \n", "I used the following triples: \n", "- \u001b[1m(\u001b[0mAnthropic, founded_by, Dario Amodei\u001b[1m)\u001b[0m\n", "- \u001b[1m(\u001b[0mAnthropic, founded_by, Daniela Amodei\u001b[1m)\u001b[0m\n", "- \u001b[1m(\u001b[0mAnthropic, founded_in, \u001b[1;36m2021\u001b[0m\u001b[1m)\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " (matched 1 entity/ies, used 9 fact(s))\n", "\n" ] }, { "data": { "text/html": [ "
 Q: Where is OpenAI headquartered?\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1mQ: Where is OpenAI headquartered?\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
A: OpenAI is headquartered in San Francisco. \n",
       "I used the triple: (OpenAI, headquartered_in, San Francisco)\n",
       "
\n" ], "text/plain": [ "A: OpenAI is headquartered in San Francisco. \n", "I used the triple: \u001b[1m(\u001b[0mOpenAI, headquartered_in, San Francisco\u001b[1m)\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " (matched 1 entity/ies, used 7 fact(s))\n", "\n" ] }, { "data": { "text/html": [ "
 Q: Which companies have funded Anthropic?\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1mQ: Which companies have funded Anthropic?\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
A: Anthropic has received funding from Google and Amazon. \n",
       "I used the following triples: \n",
       "- (Anthropic, received_funding_from, Google)\n",
       "- (Anthropic, received_funding_from, Amazon)\n",
       "
\n" ], "text/plain": [ "A: Anthropic has received funding from Google and Amazon. \n", "I used the following triples: \n", "- \u001b[1m(\u001b[0mAnthropic, received_funding_from, Google\u001b[1m)\u001b[0m\n", "- \u001b[1m(\u001b[0mAnthropic, received_funding_from, Amazon\u001b[1m)\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " (matched 1 entity/ies, used 9 fact(s))\n", "\n" ] }, { "data": { "text/html": [ "
 Q: Who is the CEO of each company?\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1mQ: Who is the CEO of each company?\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
A: The CEOs of the companies mentioned are:\n",
       "- Dario Amodei, CEO of Anthropic [(Dario Amodei, ceo_of, Anthropic)]\n",
       "- Sam Altman, CEO of OpenAI [(Sam Altman, ceo_of, OpenAI)]\n",
       "\n",
       "Note: There is no information about the CEOs of other companies, such as Amazon or Google, in the provided graph \n",
       "facts.\n",
       "
\n" ], "text/plain": [ "A: The CEOs of the companies mentioned are:\n", "- Dario Amodei, CEO of Anthropic \u001b[1m[\u001b[0m\u001b[1m(\u001b[0mDario Amodei, ceo_of, Anthropic\u001b[1m)\u001b[0m\u001b[1m]\u001b[0m\n", "- Sam Altman, CEO of OpenAI \u001b[1m[\u001b[0m\u001b[1m(\u001b[0mSam Altman, ceo_of, OpenAI\u001b[1m)\u001b[0m\u001b[1m]\u001b[0m\n", "\n", "Note: There is no information about the CEOs of other companies, such as Amazon or Google, in the provided graph \n", "facts.\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " (matched 17 entity/ies, used 18 fact(s))\n", "\n" ] }, { "data": { "text/html": [ "
 Q: What products did Anthropic produce?\n",
       "
\n" ], "text/plain": [ "\u001b[1;35m›\u001b[0m \u001b[1mQ: What products did Anthropic produce?\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
A: Anthropic produced Claude. \n",
       "I used the triple: (Anthropic, produced, Claude).\n",
       "
\n" ], "text/plain": [ "A: Anthropic produced Claude. \n", "I used the triple: \u001b[1m(\u001b[0mAnthropic, produced, Claude\u001b[1m)\u001b[0m.\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " (matched 1 entity/ies, used 9 fact(s))\n", "\n" ] } ], "source": [ "CORPUS = \"\"\"\n", "Anthropic is an AI safety company founded in 2021 by Dario Amodei and Daniela Amodei.\n", "Dario Amodei is the CEO of Anthropic. Daniela Amodei is the President of Anthropic.\n", "The company is headquartered in San Francisco, California.\n", "Anthropic produced Claude, a family of large language models.\n", "Claude 3.5 Sonnet was released in 2024.\n", "Anthropic received funding from Google and Amazon.\n", "Sam Altman is the CEO of OpenAI. OpenAI is headquartered in San Francisco.\n", "OpenAI was founded by Sam Altman, Elon Musk, Ilya Sutskever, Greg Brockman, and others in 2015.\n", "Elon Musk left the OpenAI board in 2018.\n", "\"\"\"\n", "\n", "print_header(\"Ingesting corpus into knowledge graph\")\n", "triples = arch.ingest(CORPUS)\n", "print(f\" extracted: {len(triples)} triples\")\n", "print(f\" total entities in graph: {arch._count_entities() if hasattr(arch, '_count_entities') else len(arch.semantic.backend._g.nodes)}\")\n", "print()\n", "\n", "QUESTIONS = [\n", " \"Who founded Anthropic and when?\",\n", " \"Where is OpenAI headquartered?\",\n", " \"Which companies have funded Anthropic?\",\n", " \"Who is the CEO of each company?\",\n", " \"What products did Anthropic produce?\",\n", "]\n", "\n", "results = []\n", "for q in QUESTIONS:\n", " r = arch.run(q)\n", " results.append((q, r))\n", " print_step(f\"Q: {q}\", f\"A: {r.output}\")\n", " print(f\" (matched {r.metadata['matched_entities']} entity/ies, used {r.metadata['facts_retrieved']} fact(s))\")\n", " print()" ] }, { "cell_type": "markdown", "id": "78eabf43", "metadata": { "papermill": { "duration": 0.050146, "end_time": "2026-05-27T11:54:47.271962+00:00", 
"exception": false, "start_time": "2026-05-27T11:54:47.221816+00:00", "status": "completed" }, "tags": [] }, "source": [ "### 8.0 · What just happened, briefly\n", "\n", "Three signals to inspect:\n", "\n", "- **Triples extracted** — should grow roughly linearly with sentence count. The 18 triples from ~10 sentences in this run are healthy.\n", "- **Entities matched per question** — `1-3` is typical for well-formed questions. `0` means entity-match failed (synonyms / case sensitivity). A count close to the total node count (17 for question 4 in this run) suggests the match was broad, not selective.\n", "- **Facts retrieved per question** — at depth=2, expect 3-10 facts. >20 = too much noise; <2 = traversal didn't expand enough.\n", "\n", "### 8.1 · Inspect the full graph" ] }, { "cell_type": "code", "execution_count": 5, "id": "d4daa6c3", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T11:54:47.797965Z", "iopub.status.busy": "2026-05-27T11:54:47.791023Z", "iopub.status.idle": "2026-05-27T11:54:47.854603Z", "shell.execute_reply": "2026-05-27T11:54:47.845269Z" }, "papermill": { "duration": 0.544217, "end_time": "2026-05-27T11:54:47.865601+00:00", "exception": false, "start_time": "2026-05-27T11:54:47.321384+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Nodes: 17\n", "Edges: 18\n", "\n", "All triples in the knowledge graph:\n", " (Anthropic) --[founded_by]--> (Dario Amodei)\n", " (Anthropic) --[founded_by]--> (Daniela Amodei)\n", " (Anthropic) --[founded_in]--> (2021)\n", " (Anthropic) --[headquartered_in]--> (San Francisco)\n", " (Anthropic) --[produced]--> (Claude)\n", " (Anthropic) --[received_funding_from]--> (Google)\n", " (Anthropic) --[received_funding_from]--> (Amazon)\n", " (Dario Amodei) --[ceo_of]--> (Anthropic)\n", " (Daniela Amodei) --[president_of]--> (Anthropic)\n", " (Claude 3.5 Sonnet) --[released_in]--> (2024)\n", " (Sam Altman) --[ceo_of]--> (OpenAI)\n", " (OpenAI) --[headquartered_in]--> (San Francisco)\n", " (OpenAI) --[founded_by]--> (Sam Altman)\n", " (OpenAI) --[founded_by]--> (Elon 
Musk)\n", " (OpenAI) --[founded_by]--> (Ilya Sutskever)\n", " (OpenAI) --[founded_by]--> (Greg Brockman)\n", " (OpenAI) --[founded_in]--> (2015)\n", " (Elon Musk) --[left_board_in]--> (2018)\n" ] } ], "source": [ "from agentic_architectures.memory.graph import NetworkXGraphMemory\n", "backend = arch.semantic.backend\n", "if isinstance(backend, NetworkXGraphMemory):\n", " print(f\"Nodes: {len(backend._g.nodes)}\")\n", " print(f\"Edges: {len(backend._g.edges)}\")\n", " print()\n", " print(\"All triples in the knowledge graph:\")\n", " for s, o, d in backend._g.edges(data=True):\n", " print(f\" ({s}) --[{d.get('predicate', '?')}]--> ({o})\")" ] }, { "cell_type": "markdown", "id": "da544b1c", "metadata": { "papermill": { "duration": 0.510228, "end_time": "2026-05-27T11:54:48.819104+00:00", "exception": false, "start_time": "2026-05-27T11:54:48.308876+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 9 · What we just observed\n", "\n", "The cells above ran ingest + 5 query rounds against the Anthropic / OpenAI corpus.\n", "\n", "### 9.1 · Graph statistics\n", "\n", "| Metric | Value |\n", "|---|---|\n", "| Triples extracted from corpus | **18** |\n", "| Total nodes in graph | **17** |\n", "| Total edges in graph | **18** |\n", "| Distinct predicates | **9** |\n", "\n", "### 9.2 · Per-question results\n", "\n", "| # | Question | Entities matched | Facts used | Answer (truncated) |\n", "|---|---|---|---|---|\n", "| 1 | Who founded Anthropic and when? | 1 | 9 | Anthropic was founded by Dario Amodei and Daniela Amodei in 2021. I used the following triples: - (Anthropic, founded_by, Dario Amodei) - (Anthropic, founded_by, Daniela Amodei) - (Anthropic, founded_… |\n", "| 2 | Where is OpenAI headquartered? | 1 | 7 | OpenAI is headquartered in San Francisco. I used the triple: (OpenAI, headquartered_in, San Francisco) |\n", "| 3 | Which companies have funded Anthropic? | 1 | 9 | Anthropic has received funding from Google and Amazon. 
I used the following triples: - (Anthropic, received_funding_from, Google) - (Anthropic, received_funding_from, Amazon) |\n", "| 4 | Who is the CEO of each company? | 17 | 18 | The CEOs of the companies mentioned are: - Dario Amodei, CEO of Anthropic [(Dario Amodei, ceo_of, Anthropic)] - Sam Altman, CEO of OpenAI [(Sam Altman, ceo_of, OpenAI)] Note: There is no information a… |\n", "| 5 | What products did Anthropic produce? | 1 | 9 | Anthropic produced Claude. I used the triple: (Anthropic, produced, Claude). |\n", "\n", "### 9.3 · Sample of stored triples\n", "\n", "| Subject | Predicate | Object |\n", "|---|---|---|\n", "| Anthropic | founded_by | Dario Amodei |\n", "| Anthropic | founded_by | Daniela Amodei |\n", "| Anthropic | founded_in | 2021 |\n", "| Anthropic | headquartered_in | San Francisco |\n", "| Anthropic | produced | Claude |\n", "| Anthropic | received_funding_from | Google |\n", "| Anthropic | received_funding_from | Amazon |\n", "| Dario Amodei | ceo_of | Anthropic |\n", "| Daniela Amodei | president_of | Anthropic |\n", "| Claude 3.5 Sonnet | released_in | 2024 |\n", "| Sam Altman | ceo_of | OpenAI |\n", "| OpenAI | headquartered_in | San Francisco |\n", "| OpenAI | founded_by | Sam Altman |\n", "| OpenAI | founded_by | Elon Musk |\n", "| OpenAI | founded_by | Ilya Sutskever |\n", "\n", "### 9.4 · Patterns surfaced in this run\n", "\n", "- **Healthy extraction**: 18 triples from the ~10-sentence corpus (1.8 triples/sentence).\n", "\n", "- **Entity-match hit rate**: 5/5 questions matched ≥1 entity, so every question found a referent in the graph. Question 4 matched all 17 nodes, well above the typical 1-3, which suggests a match-everything fallback rather than a precise hit.\n", "\n", "- **Predicate fragmentation detected**: 8 predicate pair(s) were flagged as possible synonyms (e.g. `founded_by` ↔ `founded_in`). That example pair is a false positive, since one predicate names a founder and the other a year, so treat the count as an upper bound. In a large corpus, genuine synonyms fragment the graph; a production version needs predicate normalisation.\n", "\n", "### 9.5 · The takeaway\n", "\n", "A *healthy* Graph Memory run has:\n", "\n", "1. **Triples-per-sentence ≥ 1** during ingest.\n", "2. 
**Every question matches ≥1 entity** during query.\n", "3. **Answers cite specific triples** (not \"I think\" / \"probably\").\n", "4. **The distinct-predicate count is small** (<25 for a small corpus), which shows predicate fragmentation is under control.\n", "\n", "When entity match fails, the answer either says \"no information\" (good) or hallucinates from parametric knowledge (bad). The synthesis prompt explicitly forbids the latter; check the verbatim answers above to verify." ] }, { "cell_type": "markdown", "id": "cd19ac44", "metadata": { "papermill": { "duration": 0.53937, "end_time": "2026-05-27T11:54:49.865604+00:00", "exception": false, "start_time": "2026-05-27T11:54:49.326234+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 10 · Try Neo4j backend" ] }, { "cell_type": "code", "execution_count": 6, "id": "766587d2", "metadata": { "execution": { "iopub.execute_input": "2026-05-27T11:54:50.869995Z", "iopub.status.busy": "2026-05-27T11:54:50.867019Z", "iopub.status.idle": "2026-05-27T11:54:50.930780Z", "shell.execute_reply": "2026-05-27T11:54:50.928868Z" }, "papermill": { "duration": 0.554511, "end_time": "2026-05-27T11:54:50.930780+00:00", "exception": false, "start_time": "2026-05-27T11:54:50.376269+00:00", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[skip] NEO4J_PASSWORD not set in .env — staying on NetworkX backend.\n" ] } ], "source": [ "from agentic_architectures.memory import SemanticMemory\n", "\n", "if settings.neo4j_password and settings.neo4j_password.get_secret_value():\n", "    try:\n", "        neo4j_arch = GraphMemoryAgent(semantic=SemanticMemory(backend=\"neo4j\"))\n", "        neo4j_arch.semantic.reset()\n", "        neo4j_arch.ingest(\"My dog is named Buddy. Buddy is a Golden Retriever. 
Golden Retrievers are friendly dogs.\")\n", "        r = neo4j_arch.run(\"What breed is Buddy?\")\n", "        print_md(f\"**A:** {r.output}\")\n", "        print(f\" (Neo4j backend confirmed working — {r.metadata['total_entities_in_graph']} entities)\")\n", "    except Exception as e:\n", "        print(f\"[Neo4j skip] {type(e).__name__}: {str(e)[:200]}\")\n", "else:\n", "    print(\"[skip] NEO4J_PASSWORD not set in .env — staying on NetworkX backend.\")" ] }, { "cell_type": "markdown", "id": "7d73f8cb", "metadata": { "papermill": { "duration": 0.481988, "end_time": "2026-05-27T11:54:51.827136+00:00", "exception": false, "start_time": "2026-05-27T11:54:51.345148+00:00", "status": "completed" }, "tags": [] }, "source": [ "## 11 · Failure modes, safety, extensions\n", "\n", "### 11.1 · Where this breaks\n", "\n", "| Failure | Mechanism | Mitigation |\n", "|---|---|---|\n", "| **Predicate fragmentation** | `works_at` / `is_employed_by` / `employed_at` are 3 different edges | Fixed predicate vocabulary; or normalise via embeddings post-extraction |\n", "| **Entity drift** | \"OpenAI\" / \"Open AI\" / \"OpenAI Inc\" = 3 nodes | Entity-linking pass; or lowercase + strip suffixes |\n", "| **Brittle query-side matching** | \"the CEO\" doesn't match node \"Sam Altman\" | LLM-based query-time entity extraction; alias table |\n", "| **N-hop noise** | Depth=3 expansion pulls in irrelevant edges | Cap depth at 2 by default; score retrieved facts by relevance |\n", "| **No support for global questions** | \"What are the themes of the corpus?\" can't be answered by entity match | Use **GraphRAG (nb 27)** which adds community summaries |\n", "\n", "### 11.2 · Production safety\n", "\n", "- **Versioned ingest.** Tag every triple with `source_doc_id` + `extraction_timestamp` so you can roll back a bad ingest.\n", "- **Audit the LLM extractor.** Triples are user-input-controlled; an adversarial corpus could inject misleading facts. 
Allow-list predicates and entity types.\n", "- **Backend choice.** NetworkX for prototypes (<10K nodes), Neo4j AuraDB Free for production small-medium (<5GB), self-hosted for large.\n", "\n", "### 11.3 · Three extensions\n", "\n", "1. **Predicate canonicalisation.** Run extracted predicates through an embedding-clustering pass to normalise synonyms.\n", "2. **Multi-hop query optimisation.** Compile the question into a Cypher path query rather than depth-N traversal — much faster for large graphs.\n", "3. **Provenance tracking.** Store `(source_text, char_offset)` alongside each triple; surface in the answer for verifiability.\n", "\n", "### 11.4 · What to read next\n", "\n", "- [**08 · Episodic + Semantic Memory**](./08_episodic_semantic_memory.ipynb) — dual-memory companion using the same `SemanticMemory` backend.\n", "- [**23 · Agentic RAG**](./23_agentic_rag.ipynb) — vector counterpart for semantic queries.\n", "- [**27 · GraphRAG**](./27_graphrag.ipynb) — graph + community summaries for global questions.\n", "\n", "### 11.5 · References\n", "\n", "1. Hogan, A. et al. *Knowledge Graphs.* ACM Computing Surveys 54(4), 2021. [arXiv:2003.02320](https://arxiv.org/abs/2003.02320)\n", "2. Microsoft Research. *From Local to Global: A Graph RAG Approach.* 2024. [arXiv:2404.16130](https://arxiv.org/abs/2404.16130)\n", "3. 
Neo4j AuraDB Free — [neo4j.com/cloud/aura-free](https://neo4j.com/cloud/aura-free/)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.0" }, "papermill": { "default_parameters": {}, "duration": 132.503033, "end_time": "2026-05-27T11:54:55.521505+00:00", "environment_variables": {}, "exception": null, "input_path": "all-agentic-architectures/notebooks/12_graph_memory.ipynb", "output_path": "all-agentic-architectures/notebooks/12_graph_memory.ipynb", "parameters": {}, "start_time": "2026-05-27T11:52:43.018472+00:00", "version": "2.7.0" } }, "nbformat": 4, "nbformat_minor": 5 }