Agentic Architectures

Thirty-five production-grade agentic AI patterns. End to end.

A library and a living textbook — real LLM outputs, provider-agnostic, deterministic-picker discipline throughout, and a comparative benchmark leaderboard that ranks every architecture against every relevant task.

35 architectures · 283 passing tests · 17 benchmark tasks · 9 LLM providers · 0 mocked runs

Overview

A single Python library that packages every major agentic AI pattern from the literature as a runnable Architecture class with a uniform contract. Each pattern ships with a fully executed Jupyter notebook whose theory is written against the captured run — not synthetic examples. The library is multi-provider (Nebius, OpenAI, Anthropic, Groq, Ollama, Together, Fireworks, Mistral, Google) and built on top of LangGraph state machines.

The central technical discipline of the repository is the deterministic-picker pattern: on every LLM-as-Scorer surface, the LLM commits to categorical features (booleans, enums) and Python composes the deciding signal from them. This is the repository's escape from the LLM-as-Scorer flat-band pathology. It is applied in 13 of 35 architectures; 9 more are immune by design.
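A minimal sketch of the idea, not the library's actual implementation: the LLM is assumed to return only categorical judgments, and a fixed Python function turns them into a score. The `Relevance` enum, the `Features` schema, and the weights are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Relevance(Enum):
    OFF_TOPIC = "off_topic"
    PARTIAL = "partial"
    ON_TOPIC = "on_topic"


@dataclass(frozen=True)
class Features:
    """Categorical judgments the LLM commits to (hypothetical schema)."""
    follows_format: bool
    factually_grounded: bool
    relevance: Relevance


# Weights live in Python, so identical features always yield the same score.
RELEVANCE_POINTS = {Relevance.OFF_TOPIC: 0, Relevance.PARTIAL: 2, Relevance.ON_TOPIC: 4}


def compose_score(f: Features) -> int:
    """Deterministically map categorical features to a 0-10 score."""
    score = RELEVANCE_POINTS[f.relevance]
    score += 3 if f.follows_format else 0
    score += 3 if f.factually_grounded else 0
    return score


def pick_best(candidates: dict[str, Features]) -> str:
    """Return the highest-scoring candidate; ties break alphabetically by name."""
    return max(sorted(candidates), key=lambda name: compose_score(candidates[name]))
```

Because the LLM never emits a number, two candidates differ in score only when they differ in at least one categorical feature, and the tie-break rule is explicit rather than left to sampling noise.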

Quickstart

pip install "agentic-architectures[nebius,faiss,tavily]"

from agentic_architectures import get_llm
from agentic_architectures.architectures import Reflection

arch = Reflection(llm=get_llm(), max_iterations=2, target_score=8)
result = arch.run("Write a haiku about a glacier.")

print(result.output)
print("score:", result.metadata["final_score"], "/ 10")

Same .run(task) interface across all 35 architectures. Same ArchitectureResult return shape. Swap the class, swap the pattern — your downstream code does not change.
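The uniform contract can be pictured with a structural type. The sketch below is self-contained and does not import the library: `Result` is a stand-in for `ArchitectureResult` with only the two fields the Quickstart uses, and `EchoPattern`, `ShoutPattern`, and `solve` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


@dataclass
class Result:
    """Stand-in for ArchitectureResult (fields assumed from the Quickstart)."""
    output: str
    metadata: dict[str, Any] = field(default_factory=dict)


class Architecture(Protocol):
    """Anything with a run(task) method returning a Result satisfies the contract."""
    def run(self, task: str) -> Result: ...


class EchoPattern:
    """Hypothetical pattern: returns the task unchanged."""
    def run(self, task: str) -> Result:
        return Result(output=task, metadata={"pattern": "echo"})


class ShoutPattern:
    """Hypothetical pattern: returns the task upper-cased."""
    def run(self, task: str) -> Result:
        return Result(output=task.upper(), metadata={"pattern": "shout"})


def solve(arch: Architecture, task: str) -> str:
    """Downstream code depends only on .run() and the result shape."""
    return arch.run(task).output
```

Swapping `EchoPattern()` for `ShoutPattern()` changes the behaviour but not the call site, which is the property the library's shared interface is described as providing.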

Set up a virtualenv from a fresh clone

git clone https://github.com/FareedKhan-dev/all-agentic-architectures
cd all-agentic-architectures

python -m venv .venv
.venv\Scripts\activate              # Windows
source .venv/bin/activate           # macOS / Linux

pip install -e ".[dev,test,docs,nebius,faiss,tavily,networkx]"
cp .env.example .env                # then fill in NEBIUS_API_KEY etc.

pytest -q                           # 283 tests pass in ~30s

Architecture families

Reasoning & Reflection

Self-critique loops that drive answer quality up through iteration.

Reflection · Reflexion · Chain-of-Verification · Self-Discover · Constitutional AI

Sampling & Search

Sample many paths or grow a tree with rewards.

Self-Consistency · Tree of Thoughts · LATS · Mental Loop · Ensemble

Retrieval (RAG)

Ground every claim — five retrieval shapes.

Agentic RAG · Corrective RAG · Self-RAG · Adaptive RAG · GraphRAG

Memory

Learn across calls — pick the storage shape.

Episodic + Semantic · Graph Memory · MemGPT · Voyager · Agent Workflow Memory

Tools & Actions

From one search tool to a real Chromium browser.

Tool Use · ReAct · Planning · PEV · SWE-Agent · BrowserAgent

Multi-Agent

Specialists, debate, multi-perspective research.

Multi-Agent · Blackboard · Debate · STORM · Meta-Controller

Safety & Routing

Categorical actions through deterministic Python gates.

Dry-Run · Reflexive Metacognitive · Computer Use

Specialty

Patterns with a unique shape.

RLHF Self-Improvement · Cellular Automata

Cross-cutting

Patterns that appear across families.

Deterministic-picker · Memory variants

The 35 architectures

Reasoning & Reflection

Architecture	Pattern	Reference
Reflection	Generate → critique → refine	Madaan 2023
Reflexion	Verbal reflections in episodic memory	Shinn 2023
Chain-of-Verification (CoVe)	Verify each baseline claim independently	Dhuliawala 2023
Self-Discover	SELECT → ADAPT → IMPLEMENT → SOLVE	Zhou 2024
Constitutional AI	Per-rule pass/fail → revise	Bai 2022
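The Reflection row (generate → critique → refine) can be sketched as a bounded loop. This is an illustration under assumptions, not the library's code: the three callables stand in for LLM calls, and the parameter names `max_iterations` and `target_score` are borrowed from the Quickstart with assumed semantics.

```python
from typing import Callable


def reflect(
    task: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], tuple[int, str]],
    refine: Callable[[str, str, str], str],
    max_iterations: int = 2,
    target_score: int = 8,
) -> tuple[str, int]:
    """Generate a draft, then refine it until the critique score reaches the target.

    critique returns (score, feedback); refine receives (task, draft, feedback).
    At most max_iterations refinements are performed.
    """
    draft = generate(task)
    score, feedback = critique(task, draft)
    for _ in range(max_iterations):
        if score >= target_score:
            break
        draft = refine(task, draft, feedback)
        score, feedback = critique(task, draft)
    return draft, score
```

The loop stops on whichever comes first, the score threshold or the iteration cap, so cost stays bounded even when the critic never approves.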

Sampling & Search

Architecture	Pattern	Reference
Self-Consistency	Sample N paths, majority-vote	Wang 2022
Tree of Thoughts	Beam search over thoughts	Yao 2023
LATS	MCTS tree with reward backup	Zhou 2024
Mental Loop	Simulate → score (deterministic-picker)	this repo
Ensemble	N voters, weighted aggregation	this repo
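For the Self-Consistency row (sample N paths, majority-vote), a minimal sketch follows. The `sample` callable is a hypothetical stand-in for one stochastic LLM call that returns a final answer string; the normalisation step (strip and lower-case) is an assumption.

```python
from collections import Counter
from typing import Callable


def self_consistency(
    task: str,
    sample: Callable[[str, int], str],
    n: int = 5,
) -> tuple[str, float]:
    """Sample n answers and return the most common one with its vote share.

    Answers are normalised before voting. Ties go to the answer seen first,
    which is how Counter.most_common orders equal counts.
    """
    answers = [sample(task, i).strip().lower() for i in range(n)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n
```

The vote share doubles as a rough agreement signal: a winner with a low share suggests the sampled reasoning paths disagree.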

Retrieval (RAG)

Architecture	Pattern	Reference
Agentic RAG	Agent decides when & what to retrieve	LangGraph reference
Corrective RAG (CRAG)	Grade docs, fall back to web	Yan 2024
Self-RAG	Per-doc reflection tokens	Asai 2024
Adaptive RAG	Pre-route by query complexity	Jeong 2024
GraphRAG	KG + community summaries	Microsoft 2024
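The Corrective RAG row (grade docs, fall back to web) reduces to a filter plus a fallback. The sketch below assumes a binary relevance grader and a `min_relevant` threshold; all four callables and the threshold are hypothetical, and the published CRAG method distinguishes more cases than this.

```python
from typing import Callable


def corrective_retrieve(
    query: str,
    retrieve: Callable[[str], list[str]],
    is_relevant: Callable[[str, str], bool],
    web_search: Callable[[str], list[str]],
    min_relevant: int = 1,
) -> tuple[list[str], str]:
    """Keep locally retrieved docs that pass grading; otherwise fall back to web search.

    Returns the documents to ground on and the source used ("local" or "web").
    """
    docs = retrieve(query)
    kept = [d for d in docs if is_relevant(query, d)]
    if len(kept) >= min_relevant:
        return kept, "local"
    return web_search(query), "web"
```

Here the grader returns a boolean per document and Python makes the routing decision, in the spirit of the deterministic-picker discipline described in the Overview.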

Memory

Architecture	Stored unit	Reference
Episodic + Semantic	Conversation turns + triples	Park 2023
Graph Memory	(subject, predicate, object) triples	this repo
MemGPT	OS-style context + archival tiers	Packer 2023
Voyager	Reusable Python skills (real subprocess)	Wang 2023
Agent Workflow Memory	High-level workflow recipes	Wang 2024
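To make the "stored unit" column concrete, here is a toy in-memory store for the Graph Memory row's (subject, predicate, object) triples. The class name and methods are hypothetical and say nothing about how the library persists or queries its graph.

```python
from collections import defaultdict
from typing import Optional


class TripleStore:
    """Tiny (subject, predicate, object) store indexed by subject."""

    def __init__(self) -> None:
        self._by_subject: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        # Using a set makes repeated insertions of the same triple idempotent.
        self._by_subject[subject].add((predicate, obj))

    def query(self, subject: str, predicate: Optional[str] = None) -> list[str]:
        """Return sorted objects for a subject, optionally filtered by predicate."""
        facts = self._by_subject.get(subject, set())
        return sorted(o for p, o in facts if predicate is None or p == predicate)
```

Compared with storing raw conversation turns, triples trade recall of exact wording for cheap, exact lookups by entity.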

Tools & Actions

Architecture	Pattern	Reference
Tool Use	Agent with one tool	LangChain reference
ReAct	Thought → Action → Observation	Yao 2022
Planning	Decompose → execute → replan	Wei 2022
Plan-Execute-Verify (PEV)	Post-execution verification per step	this repo
SWE-Agent	Sandboxed file-system agent	Yang 2024
BrowserAgent	Real Playwright + safety gate	Anthropic Computer-Use 2024
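The ReAct row (Thought → Action → Observation) can be sketched as a step-limited loop. The `policy` callable stands in for the LLM and is assumed to return a `(thought, action, argument)` tuple, with the action name `"finish"` ending the episode; that convention and the tool registry shape are assumptions for illustration.

```python
from typing import Callable, Optional

Step = tuple[str, str, str, str]  # (thought, action, argument, observation)
Policy = Callable[[str, list[Step]], tuple[str, str, str]]


def react(
    question: str,
    policy: Policy,
    tools: dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> tuple[Optional[str], list[Step]]:
    """Alternate policy decisions and tool calls until "finish" or the step cap."""
    transcript: list[Step] = []
    for _ in range(max_steps):
        thought, action, arg = policy(question, transcript)
        if action == "finish":
            return arg, transcript
        if action in tools:
            observation = tools[action](arg)
        else:
            # Report unknown tools back to the policy instead of crashing.
            observation = f"unknown tool: {action}"
        transcript.append((thought, action, arg, observation))
    return None, transcript
```

Returning `None` when the cap is hit lets the caller distinguish "no answer" from an answer, rather than silently returning the last observation.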

Multi-Agent

Architecture	Pattern	Reference
Multi-Agent	Supervisor + specialists	LangGraph reference
Blackboard	Shared workspace + agents	classical AI
Debate	N agents × K rounds	Du 2023
STORM	Multi-perspective research → article	Shao 2024
Meta-Controller	Router over architectures	this repo
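The Debate row (N agents × K rounds) can be sketched as repeated answer revision followed by a vote. Each agent is a hypothetical callable that sees the question and its peers' previous answers; the final majority vote is an assumed aggregation rule, not necessarily the one the library uses.

```python
from collections import Counter
from typing import Callable

Agent = Callable[[str, list[str]], str]


def debate(question: str, agents: list[Agent], rounds: int = 2) -> tuple[str, list[str]]:
    """Run `rounds` rounds of debate and return (majority verdict, final answers).

    Round 1 is answered independently. In later rounds every agent sees the
    other agents' answers from the previous round.
    """
    if rounds < 1:
        raise ValueError("rounds must be >= 1")
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds - 1):
        answers = [
            agent(question, answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
    verdict = Counter(answers).most_common(1)[0][0]
    return verdict, answers
```

If agents tend to adopt the majority view of their peers, a confident wrong majority pulls dissenters along within a round or two, which is one way a debate can converge on a wrong answer.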

Safety, Routing & Specialty

Architecture	Pattern	Reference
Dry-Run	Propose → simulate → approval gate	this repo
Reflexive Metacognitive	Self-aware capability routing	this repo
RLHF Self-Improvement	Multi-dim deterministic scoring + archive	this repo
Cellular Automata	LLM rules over a grid	this repo
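The Dry-Run row (propose → simulate → approval gate) pairs a categorical risk label with a deterministic Python gate. In the sketch below, the `Risk` levels, the `Proposal` shape, and the gate policy are hypothetical; in practice the risk label would come from the model or a classifier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Risk(Enum):
    SAFE = "safe"
    REVERSIBLE = "reversible"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)
class Proposal:
    command: str
    risk: Risk  # categorical label, assumed to be supplied upstream


def simulate(p: Proposal) -> str:
    """Describe the action without performing it."""
    return f"[dry-run] would execute: {p.command}"


def gate(p: Proposal, approve: Callable[[Proposal], bool]) -> bool:
    """Deterministic policy: auto-allow safe, always block destructive, ask otherwise."""
    if p.risk is Risk.SAFE:
        return True
    if p.risk is Risk.DESTRUCTIVE:
        return False
    return approve(p)


def dry_run(
    p: Proposal,
    execute: Callable[[str], str],
    approve: Callable[[Proposal], bool],
) -> str:
    """Simulate first; execute only if the gate allows it."""
    preview = simulate(p)
    if not gate(p, approve):
        return preview + " (blocked)"
    return execute(p.command)
```

Note that the approver is never consulted for destructive proposals in this sketch, so a careless "yes" cannot override the hard block.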

Provider compatibility

Provider	Install extra	Notes
Nebius (default)	`[nebius]`	Llama-3.3-70B + Qwen3-Thinking; cheapest for the included demos
OpenAI	`[openai]`	All architectures work; highest quality for reasoning patterns
Anthropic	`[anthropic]`	Strong on long context; required for production Computer-Use
Groq	`[groq]`	Fast inference; great for high-volume Self-Consistency
Ollama (local)	`[ollama]`	No API key; tool calling depends on the model
Together	`[together]`	Wide model catalogue
Fireworks	`[fireworks]`	Function-calling first-class
Mistral	`[mistral]`	EU-hosted option
Google	`[google]`	Gemini 2.x via Generative AI API

Switch via LLM_PROVIDER + the corresponding key in .env. No code changes.
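One way such env-driven switching could work is sketched below. Only `LLM_PROVIDER` and `NEBIUS_API_KEY` appear in this README; the other key names follow the vendors' usual conventions and are assumptions here, as are the `resolve_provider` helper and the subset of providers shown.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Provider -> required key variable (None means no key is needed).
# Only NEBIUS_API_KEY is confirmed by the README; the rest are assumed names.
KEY_ENV: dict[str, Optional[str]] = {
    "nebius": "NEBIUS_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "ollama": None,  # local models, no API key
}


def resolve_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Read LLM_PROVIDER (default: nebius) and check that its key is present."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "nebius").lower()
    if provider not in KEY_ENV:
        raise ValueError(f"unknown provider: {provider!r}")
    key_var = KEY_ENV[provider]
    if key_var is not None and not env.get(key_var):
        raise RuntimeError(f"{key_var} must be set for provider {provider!r}")
    return provider
```

Failing at startup with the name of the missing variable is friendlier than an authentication error surfacing on the first model call.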

Benchmarks

A 17-task suite runs each architecture on its relevant tasks and scores the results. The most recent run used real Nebius Llama-3.3-70B calls and took ~25 min and ~$1.50 in tokens:

Outcome	Architectures
Strong (2/2 or 3/3)	Reflection · SelfConsistency · SelfDiscover · BrowserAgent
Perfect on attempted (1/1)	21 more (see leaderboard)
Pattern-fit failures	LATS on arithmetic (wrong shape) · Debate + Ensemble on Sally trick (group-think) · Reflexion + AWM on raw-fact recall (wrong memory shape)
Overall	33 / 42 correct (78.6%)

Full leaderboard with per-task answer excerpts: docs/benchmarks.md
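For readers who want to reproduce the tallies from raw results, the sketch below shows one plausible aggregation. The record shape `(architecture, correct)` and the sort order are assumptions; the benchmark runner's real output format is not shown in this README.

```python
from collections import defaultdict


def leaderboard(records: list[tuple[str, bool]]) -> list[tuple[str, int, int]]:
    """Aggregate (architecture, correct) records into (name, n_correct, n_attempted).

    Rows are ordered by accuracy, then by number of attempts, then by name.
    """
    tally: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for arch, correct in records:
        tally[arch][0] += int(correct)
        tally[arch][1] += 1
    rows = [(arch, c, n) for arch, (c, n) in tally.items()]
    return sorted(rows, key=lambda r: (-r[1] / r[2], -r[2], r[0]))


def overall(rows: list[tuple[str, int, int]]) -> tuple[int, int, float]:
    """Return (total correct, total attempted, percentage to one decimal)."""
    correct = sum(c for _, c, _ in rows)
    total = sum(n for _, _, n in rows)
    return correct, total, round(100 * correct / total, 1)
```

Applied to the overall figure reported above, 33 correct out of 42 attempts works out to 78.6%.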

Learning paths

Four curated reading orders, depending on what you're trying to do.

Path	For	Order
Beginner	Mental model	Reflection → Tool Use → ReAct → Planning → Self-Consistency
RAG-focused	Production retrieval	Agentic RAG → CRAG → Self-RAG → Adaptive RAG → GraphRAG
Multi-agent	Coordination	Multi-Agent → Blackboard → Debate → STORM → Meta-Controller
Safety	Guardrails	Dry-Run → Constitutional AI → Reflexive Metacognitive → BrowserAgent (safety gate)

Tested

pytest -q
283 passed, 37 skipped (env-gated integration), 1 warning in ~30s

Suite	Coverage
Registry sweep	All 35 architectures (metadata + instantiate + build)
Pure-Python helpers	Haiku checker, composite scorers, subprocess executor, safety gate, sandbox path
Notebook integrity	All 35 notebooks executed, no error outputs, §9 commentary tailored from real captured runs
Integration (env-gated)	One real-LLM happy-path per architecture, gated via `RUN_INTEGRATION=1`
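Env-gating of this kind can be sketched with the standard library alone. The repository most likely uses pytest's own skip markers; the version below uses `unittest.skipUnless`, which pytest also collects, so that the example has no third-party imports. The helper, the decorator name, and the test class are hypothetical.

```python
import os
import unittest
from collections.abc import Mapping
from typing import Optional


def integration_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True only when RUN_INTEGRATION is exactly "1"."""
    env = os.environ if env is None else env
    return env.get("RUN_INTEGRATION") == "1"


# Evaluated at import time, so the variable must be set before tests are collected.
requires_integration = unittest.skipUnless(
    integration_enabled(),
    "set RUN_INTEGRATION=1 to run real-LLM tests",
)


class TestReflectionIntegration(unittest.TestCase):
    @requires_integration
    def test_happy_path(self) -> None:
        # Placeholder: a real test would build the architecture and call .run() here.
        self.assertTrue(integration_enabled())
```

With the variable unset, the test is reported as skipped rather than failed, which matches the "37 skipped (env-gated integration)" line in the test summary.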

Documentation


Full docs site	Dark-mode site with embedded notebooks (live after first deploy)
Quickstart	One-command install, 8-line example
Switching providers	Capability matrix; one env var to swap
Add your own architecture	5-step contributor recipe
Deterministic-picker pattern	The central technical pattern, explained once
Memory variants	Comparison of all 7 memory shapes
API reference	mkdocstrings auto-gen from docstrings (live after first deploy)
Benchmarks	Full per-task leaderboard with answer excerpts

Contributing

Contributions welcome. Two paths:

Add a new architecture — follow the 5-step recipe. The PR template includes a deterministic-picker checklist.
Improve an existing one — bug fix, prompt tuning, performance, scoring rubric. Open an issue first to discuss scope.

See CONTRIBUTING.md for the dev setup, code style, and commit-message convention (Conventional Commits — release-please auto-generates the CHANGELOG).

Citation

@misc{khan2026agentic,
  title         = {Agentic Architectures: A Library of 35 Production-Grade Agentic AI Patterns},
  author        = {Khan, Fareed},
  year          = {2026},
  howpublished  = {\url{https://github.com/FareedKhan-dev/all-agentic-architectures}},
  note          = {MIT licensed Python library and runnable textbook}
}

License

MIT.

Built on LangGraph · Docs powered by Material for MkDocs · Default LLM via Nebius
