🏛️ Production-Grade MCP Server + Agentic System

A reference implementation of an MCP server designed to actually ship

Multi-tenant · Authenticated · Observable · Rate-limited · Cached · Circuit-broken · Governed

Python 3.11+ · MCP 2026 · License: MIT · Docker


📖 Full Step-by-Step Blog Walkthrough

This repository is the companion codebase for a long-form blog post that walks through every single component end to end, with every line of code explained in context. Start there if you want to understand the "why" behind the architecture before reading the code.

🔗 Building a Production-Grade MCP Server Architecture with Agentic System →


🎯 What This Is

Most MCP tutorials end with a @tool decorator that returns "hello world". That is fine for a demo. It is not what ships.

This repository is a reference implementation of an MCP server designed to run in production: multi-tenant, authenticated, observable, rate-limited, cached, circuit-broken, and governed. It exposes a company's heterogeneous data layer (Postgres, Elasticsearch, S3, vector DB) to AI agents as a single, secure tool surface, and ships with a four-agent support copilot (Planner → Retriever → Synthesizer → Critic) that uses it end to end.

The codebase is deliberately organised around twelve components that keep showing up on the 3 AM pager when teams skip them. Each one lives in its own module and can be read, replaced, or extended independently.


🏗️ Architecture Overview

Full Architecture

The complete production-grade system: MCP server dispatch pipeline on the right, four-agent orchestrator on the left, data plane on top, observability on the bottom, identity and governance as crosscutting concerns.


🧩 The 12 Components

| # | Component | Lives in | What it gives you |
|---|-----------|----------|-------------------|
| 1 | 🚪 Transport & Session Layer | server.py | stdio for local, Streamable HTTP for remote, horizontal-scale-friendly sessions |
| 2 | 🔐 Authentication Server | auth/oauth.py | OAuth 2.1 + PKCE, short-lived JWTs, JWKS validation |
| 3 | ⚖️ Authorization & Policy Engine | auth/policy.py | Tool-level RBAC, tenant-scoped ABAC, deny-by-default |
| 4 | 📚 Tool Registry & Discovery | tools/registry.py | Dynamic toolsets, .well-known capability metadata |
| 5 | Input Validation Layer | validation/schemas.py | Pydantic schemas, enum constraints, agent-adversarial input as default threat model |
| 6 | 🔧 Tool Execution Engine | tools/base.py | Three-level hierarchy (atomic / composed / workflow) |
| 7 | 🔄 Circuit Breaker & Retry | reliability/ | Closed → open → half-open, Adaptive Timeout Budget Allocation |
| 8 | 🚦 Rate Limiting & Quotas | ratelimit/limiter.py | Redis token-bucket (Lua-atomic), per-tenant and per-tool |
| 9 | Caching Layer | cache/manager.py | Two-tier (L1 in-process, L2 Redis), stampede prevention |
| 10 | 🧱 Structured Error Framework | errors/framework.py | Machine-readable errors with retryable and hint fields |
| 11 | 🔭 Observability Stack | observability/ | OpenTelemetry traces, Prometheus metrics, audit logs |
| 12 | 🛡️ Governance & Multi-Tenancy | governance/ | Tenant isolation, approval gates, outbound HTTP allowlisting |

📖 Diving Deeper, Section by Section

Each diagram below links back to the corresponding section in the blog, where every line of code is walked through in detail.

📦 Data Persistence Layer

Data Persistence Layer

Postgres + Row-Level Security · Tenant isolation at the DB layer

🚪 Transport & Session Layer

Transport Layer

Dual transport · Stateless session · Middleware chain

🔐 Authentication, Policy & Governance

Auth & Policy

OAuth 2.1 · YAML policies · Human-in-the-loop approvals
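The policy semantics named in this README (default-deny, deny-beats-allow, glob matching) can be sketched in a few lines. This is an illustration under stated assumptions, not the engine in auth/policy.py: the `Rule` shape, the role name, and the tool names are hypothetical, and rules are built in Python here rather than loaded from YAML.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    effect: str        # "allow" or "deny"
    role: str          # role the rule applies to
    tool_pattern: str  # glob matched against the tool name


def is_allowed(rules: list[Rule], role: str, tool: str) -> bool:
    # Collect every rule that applies to this role and tool.
    matching = [
        r for r in rules
        if r.role == role and fnmatchcase(tool, r.tool_pattern)
    ]
    # Deny beats allow: one matching deny rule blocks the call.
    if any(r.effect == "deny" for r in matching):
        return False
    # Default deny: with no matching allow rule, the call is rejected.
    return any(r.effect == "allow" for r in matching)


# Hypothetical policy: support agents may use order tools, except refund tools.
RULES = [
    Rule("allow", "support_agent", "orders.*"),
    Rule("deny", "support_agent", "orders.refund_*"),
]
```

With these rules, `is_allowed(RULES, "support_agent", "orders.get")` is true, while `orders.refund_issue` is blocked by the deny rule and any tool outside `orders.*` falls through to the default deny.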

🔧 Tool Execution Engine

Tool Execution

Three-level hierarchy · Atomic · Composed · Workflow
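One way to picture the three levels is as plain functions that build on each other. The sketch below is an assumption-heavy illustration: the function names, the in-memory `ORDERS` and `TICKETS` stand-ins, and the triage rules are all hypothetical, and the real tools derive from the abstract base in tools/base.py rather than being bare functions.

```python
# Stand-in backends; real atomic tools would query one data store each.
ORDERS = {"o_9002": {"customer": "CUST-1001", "status": "refund_pending"}}
TICKETS = [{"order_id": "o_9002", "subject": "Where is my refund?"}]


def get_order(order_id: str) -> dict:
    """Level 1 (atomic): reads exactly one backend."""
    return {"order_id": order_id, **ORDERS[order_id]}


def search_tickets(order_id: str) -> list[dict]:
    """Level 1 (atomic): reads exactly one backend."""
    return [t for t in TICKETS if t["order_id"] == order_id]


def order_context(order_id: str) -> dict:
    """Level 2 (composed): a fixed, deterministic chain of atomic tools."""
    return {"order": get_order(order_id), "tickets": search_tickets(order_id)}


def refund_triage(order_id: str) -> dict:
    """Level 3 (workflow): a multi-step procedure that branches on results."""
    context = order_context(order_id)
    if context["order"]["status"] != "refund_pending":
        return {"action": "none", "reason": "no refund in flight"}
    if context["tickets"]:
        return {"action": "escalate", "reason": "customer already asked about the refund"}
    return {"action": "wait", "reason": "refund pending, no customer contact yet"}
```

Keeping the levels separate means an agent can be granted a composed or workflow tool without being granted the raw atomic tools underneath it.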

🔄 Reliability Layer

Reliability

Circuit breakers · Retry with jitter · ATBA budget allocator
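The closed → open → half-open state machine can be sketched as follows. This is a minimal single-threaded illustration, not the code in reliability/circuit_breaker.py: the class name, the threshold and timeout parameters, and the injectable `clock` are assumptions, and the sketch does not limit how many probe calls pass while half-open.

```python
import time
from typing import Callable


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self.state = "closed"
        self._failures = 0
        self._opened_at = 0.0

    def allow(self) -> bool:
        """Return True if a call may proceed right now."""
        if self.state == "open":
            # After the cool-down, move to half-open and let a probe through.
            if self._clock() - self._opened_at >= self.reset_timeout:
                self.state = "half_open"
                return True
            return False
        return True

    def record_success(self) -> None:
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self.state = "closed"

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe, or too many failures while closed, opens the circuit.
        if self.state == "half_open" or self._failures >= self.failure_threshold:
            self.state = "open"
            self._opened_at = self._clock()
```

A caller would check `allow()` before invoking a tool, then report the outcome with `record_success()` or `record_failure()`. Passing a fake `clock` makes the transitions easy to test without sleeping.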

⚡ Rate Limiting & Caching

Rate Limit & Cache

Redis token bucket · Two-tier cache · Stampede lock
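The token-bucket idea behind the rate limiter can be shown in-process. Note the swap: the repository runs this logic atomically in Redis via Lua, whereas the sketch below keeps the bucket state in a Python dict and is not safe across processes. The class, the `allow_call` helper, and the default limits are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    def __init__(
        self,
        capacity: float,
        refill_per_sec: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        self._tokens = capacity  # start full
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        # Refill based on elapsed time, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_sec)
        self._last = now
        # Spend tokens only if enough are available.
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# One bucket per (tenant, tool) pair, mirroring per-tenant and per-tool limits.
_buckets: dict[tuple[str, str], TokenBucket] = {}


def allow_call(
    tenant: str,
    tool: str,
    capacity: float = 5.0,
    refill_per_sec: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    key = (tenant, tool)
    if key not in _buckets:
        _buckets[key] = TokenBucket(capacity, refill_per_sec, clock)
    return _buckets[key].try_acquire()
```

Keying the bucket on both tenant and tool means one noisy tenant exhausts only its own budget for that tool, not the budget of other tenants.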

🔭 Observability Stack

Observability

OpenTelemetry · Prometheus · Audit logs · One trace ID

🤖 Multi-Agentic Architecture

Multi-Agent

Four-agent design · Planner · Retriever · Synthesizer · Critic

🎼 The Orchestrator Flow

Orchestrator

End-to-end agent orchestration with one bounded revise loop
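The control flow of Planner → Retriever → Synthesizer → Critic with one bounded revise loop might look roughly like this. It is a sketch under assumptions: the agents are passed in as plain callables, the `Verdict` type and `run_copilot` signature are hypothetical, the stub agents at the bottom replace real LLM calls, and a revised draft is returned without a second critic pass.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    approved: bool
    feedback: str = ""


def run_copilot(
    question: str,
    plan: Callable[[str], list[str]],
    retrieve: Callable[[list[str]], list[str]],
    synthesize: Callable[[str, list[str], str], str],
    critique: Callable[[str, list[str]], Verdict],
    max_revisions: int = 1,
) -> str:
    steps = plan(question)
    evidence = retrieve(steps)
    draft = synthesize(question, evidence, "")
    # Bounded loop: the critic can send the draft back at most max_revisions times.
    for _ in range(max_revisions):
        verdict = critique(draft, evidence)
        if verdict.approved:
            break
        draft = synthesize(question, evidence, verdict.feedback)
    return draft


# Stub agents for illustration only.
critic_calls: list[str] = []


def stub_synthesize(question: str, evidence: list[str], feedback: str) -> str:
    return "Refund is queued [S1]" if feedback else "Refund is queued"


def stub_critique(draft: str, evidence: list[str]) -> Verdict:
    critic_calls.append(draft)
    return Verdict(approved="[S1]" in draft, feedback="Add citations.")


result = run_copilot(
    "Why was the refund delayed?",
    plan=lambda q: ["lookup_order"],
    retrieve=lambda steps: ["[S1] refund queued for review"],
    synthesize=stub_synthesize,
    critique=stub_critique,
)
```

Capping the loop keeps latency and token spend predictable: a critic that never approves cannot trap the run in an endless revise cycle.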


🚀 Quick Start

Prerequisites

  • Docker & Docker Compose
  • Python 3.11+ (only for running the CLI locally)
  • An Anthropic API key (for the agent layer)

1. Clone and Configure

    git clone https://github.com/FareedKhan-dev/production-grade-mcp-agentic-system.git
    cd production-grade-mcp-agentic-system
    cp .env.example .env

Edit .env and set at minimum:

  • ANTHROPIC_API_KEY — for the agent layer
  • ATLAS_AUTH_JWKS_URL — your OAuth 2.1 provider's JWKS endpoint (or leave default for dev)

2. Bring Up the Stack

docker compose up -d

That brings up the full local environment:

| Service | URL | What it is |
|---------|-----|------------|
| 🏛️ MCP Server | http://localhost:8080/mcp | Streamable HTTP endpoint |
| 🔍 Discovery | http://localhost:8080/.well-known/mcp-server | Unauthenticated capability metadata |
| 📊 Metrics | http://localhost:8080/metrics | Prometheus scrape target |
| ❤️ Health | http://localhost:8080/healthz | Liveness probe |
| 🔭 Jaeger | http://localhost:16686 | Distributed tracing UI |
| 📈 Grafana | http://localhost:3000 | Metrics dashboards (admin / admin) |
| 🗄️ MinIO Console | http://localhost:9001 | S3-compatible storage UI |

3. Run the Support Copilot CLI

    pip install -e .
    export ATLAS_MCP_URL=http://localhost:8080
    export ATLAS_MCP_TOKEN=dev-token
    export ATLAS_TENANT=acme
    export ANTHROPIC_API_KEY=sk-ant-...
    atlas-copilot "Why was the refund on order o_9002 for CUST-1001 delayed?"

You will see the four agents run end-to-end, the final draft printed with [S1][S2] citations, and a full trace summary including token counts, tool calls, and the run_id that ties back to Jaeger.

4. Connect from Claude Desktop / Cursor

Add this to your MCP host config:

    {
      "mcpServers": {
        "production-mcp": {
          "type": "http",
          "url": "http://localhost:8080/mcp",
          "headers": {
            "Authorization": "Bearer ${ATLAS_MCP_TOKEN}",
            "X-Tenant-Id": "acme"
          }
        }
      }
    }

📂 Repository Structure

.
├── 📄 README.md
├── 🐳 docker-compose.yml          # Full local stack: app + data + observability
├── 🐳 Dockerfile                  # Two-stage build, non-root runtime
├── 📜 LICENSE
├── 📦 pyproject.toml              # Dependencies, dev tools, CLI entry points
├── ⚙️  .env.example                # Every setting documented by component
│
├── 🔧 config/                     # Runtime configuration (hot-reloadable)
│   ├── http_allowlist.yaml       # Per-tenant outbound HTTP allowlist
│   └── policy.yaml               # YAML-driven authorization policies
│
├── 🚢 deploy/                     # Deployment sidecar configs
│   ├── otel/config.yaml          # OpenTelemetry Collector pipeline
│   ├── prometheus/prometheus.yml # Prometheus scrape targets
│   └── sql/init.sql              # Schema + RLS policies + seed data
│
├── 📚 docs/                       # Deep-dive documentation
│   ├── AGENT_SYSTEM.md           # Multi-agent orchestrator internals
│   ├── ARCHITECTURE.md           # The 12 components in detail
│   └── DEPLOYMENT.md             # K8s, Cloudflare Workers, bare-metal
│
├── 🧠 src/atlas_mcp/              # Main application source
│   ├── config.py                 # Centralized typed settings
│   ├── server.py                 # ⚡ Component 1: Transport & dispatch
│   │
│   ├── 🤖 agents/                 # Four-agent support copilot
│   │   ├── planner.py            # Emits retrieval plan JSON
│   │   ├── retriever.py          # Bounded tool-calling loop
│   │   ├── synthesizer.py        # Drafts reply with citations
│   │   ├── critic.py             # Approves or sends one revise
│   │   ├── orchestrator.py       # Wires the four agents together
│   │   ├── mcp_client.py         # Thin JSON-RPC MCP client
│   │   ├── memory.py             # STM (Redis) + LTM (vector)
│   │   └── cli.py                # atlas-copilot CLI entry point
│   │
│   ├── 🔐 auth/                   # Components 2 + 3
│   │   ├── oauth.py              # JWT + JWKS validation
│   │   ├── middleware.py         # Bearer token extraction
│   │   └── policy.py             # YAML-driven policy engine
│   │
│   ├── 🛡️  governance/             # Component 12
│   │   ├── tenant.py             # Tenant pinning middleware
│   │   └── approval.py           # Human-in-the-loop gate
│   │
│   ├── 🔧 tools/                  # Components 4 + 6
│   │   ├── registry.py           # In-memory tool index + discovery
│   │   ├── base.py               # Tool abstract base + metadata
│   │   ├── atomic/               # Level 1: one backend each
│   │   ├── composed/             # Level 2: deterministic chains
│   │   └── workflow/             # Level 3: multi-step procedures
│   │
│   ├── 🔄 reliability/            # Component 7
│   │   ├── circuit_breaker.py    # 3-state machine per tool
│   │   ├── retry.py              # Exponential backoff + jitter
│   │   └── atba.py               # Adaptive Timeout Budget Allocation
│   │
│   ├── 🚦 ratelimit/              # Component 8
│   │   └── limiter.py            # Redis token bucket (Lua-atomic)
│   │
│   ├── ⚡ cache/                   # Component 9
│   │   └── manager.py            # L1 + L2 cache with stampede lock
│   │
│   ├── 🧱 errors/                 # Component 10
│   │   └── framework.py          # Structured Error Recovery (SERF)
│   │
│   ├── 🔭 observability/          # Component 11
│   │   ├── tracing.py            # OpenTelemetry spans
│   │   ├── metrics.py            # Prometheus instruments
│   │   └── audit.py              # Structured JSONL audit log
│   │
│   └── ✅ validation/             # Component 5
│       └── schemas.py            # Tool call envelope
│
└── 🧪 tests/                      # Narrow tests, load-bearing properties
    ├── test_circuit_breaker.py   # State machine transitions
    ├── test_errors.py            # SERF wire format + retry semantics
    └── test_policy.py            # Deny-beats-allow + default-deny

🎨 Tech Stack

| Layer | Technology |
|-------|------------|
| Language | Python 3.11+ |
| Web framework | Starlette + Uvicorn |
| MCP SDK | mcp>=1.2.0 |
| Auth | PyJWT + Authlib (OAuth 2.1 resource server) |
| Validation | Pydantic v2 + Pydantic Settings |
| Database | asyncpg (PostgreSQL 16 with RLS) |
| Search | Elasticsearch 8 (async client) |
| Vector DB | Qdrant |
| Object storage | aioboto3 (MinIO / S3) |
| Cache + queues | Redis 7 (redis[hiredis]) |
| Reliability | tenacity (retries) + custom breaker + custom ATBA |
| Tracing | OpenTelemetry SDK + OTLP exporter |
| Metrics | prometheus_client |
| Logging | structlog (JSON) |
| LLM | Anthropic Messages API (Claude) |

🧪 Testing

The test suite is deliberately narrow, covering the three load-bearing safety properties:

    pip install -e ".[dev]"
    pytest -v

  • test_circuit_breaker.py: state machine transitions, retryable vs deterministic error classification
  • test_errors.py: SERF wire format, retry semantics, MCP-level error data
  • test_policy.py: default-deny, deny-beats-allow, glob matching, PII condition blocking

🛣️ Production Deployment

For running this in an actual production environment (managed Postgres, real OAuth provider, SIEM integration, Kubernetes), see docs/DEPLOYMENT.md.

Key swaps between local dev and production:

| Local (docker-compose) | Production |
|------------------------|------------|
| Dev JWT issuer | WorkOS AuthKit / Auth0 / Keycloak |
| MinIO | AWS S3 / GCS / Azure Blob |
| Local Postgres | AWS RDS / Cloud SQL / Supabase |
| Redis container | Upstash / ElastiCache / MemoryDB |
| Local OTel collector | Datadog / Honeycomb / Grafana Cloud |
| File-based audit log | Splunk / Chronicle / SIEM of choice |

📚 Documentation


📜 License

MIT. See LICENSE.


⭐ If this helped you, please consider starring the repo

Built with ☕ and a lot of 3 AM debugging

📖 Read the full blog walkthrough · 🐛 Report an issue · 💬 Start a discussion
