Maggy

From Claude Bootstrap to autonomous AI engineering.

🤖 AI agents (Claude Code, Codex, Gemini CLI): Read GETSTARTED.md first. Architecture, routing table, commands, conventions — everything you need in 2 minutes.

This project started as Claude Bootstrap — an opinionated set of skills, hooks, and rules for Claude Code. Over time it grew into something much bigger: a multi-model routing system, a persistent memory layer, an intent-tracking code graph, container-based orchestration, and a full engineering command center. The bootstrap scaffolding is still here, but the future of this project is Maggy — an autonomous engineering system that routes work across AI models, learns from outcomes, and manages the full development lifecycle.

We ship mWP (minimum wowable product, 5-7 on the 11-star scale), not MVP. Every feature should make you think "I need this" — not just "it works."

62 skills, TDD enforcement via Stop hooks, agent teams, persistent memory (Mnemos), intent tracking (iCPG), and a multi-model AI command center. Works with Claude Code, Kimi CLI, and OpenAI Codex CLI.

Quick Start

    git clone https://github.com/alinaqi/maggy.git
    cd maggy && ./install.sh

    # In any project directory
    claude
    > /initialize-project

Claude will validate tools, ask about your stack, create the repo structure, copy skills/rules/hooks, and spawn an agent team.

Maggy — Autonomous Engineering System

Maggy is the core of this project. It routes tasks across models, tracks performance, learns from outcomes, and manages the full development lifecycle from a local dashboard or CLI REPL.

  • Multi-model routing — semantic blast scoring routes tasks across Claude/Codex/Kimi/Ollama based on complexity, cost, and proven performance
  • Task blueprints — self-learning workflows; Maggy captures tool sequences from successful tasks and replays them with cheaper models
  • Chat — interactive sessions with markdown rendering, streaming, session persistence, and file upload
  • Execute — one-click TDD pipeline with iCPG context enrichment
  • Tasks — AI-prioritized inbox from GitHub Issues or Asana
  • Competitors — auto-discovered competitors + daily AI briefing
  • Insights — CLI session analysis, health signals, reviewer evaluation
  • Reviewer knowledge map — tracks which reviewer (CodeRabbit, Codex, local) is best at which finding category

    cd maggy && pip install -e .
    maggy serve   # dashboard at localhost:8080
    maggy         # CLI REPL (runs from any project directory)

See maggy/README.md for setup and routing details.
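
As a rough illustration of the reviewer knowledge map idea, the sketch below tallies how often each reviewer's findings were confirmed per category and picks the best one. The class name, the `confirmed` flag, and the `min_samples` cutoff are assumptions made for this example; Maggy's actual data model is not shown in this README.

```python
from collections import defaultdict
from typing import Optional


class ReviewerKnowledgeMap:
    """Tallies, per finding category, how often each reviewer's findings were confirmed."""

    def __init__(self) -> None:
        # category -> reviewer -> [confirmed_count, total_count]
        self._stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def record(self, reviewer: str, category: str, confirmed: bool) -> None:
        entry = self._stats[category][reviewer]
        entry[0] += int(confirmed)
        entry[1] += 1

    def best_reviewer(self, category: str, min_samples: int = 3) -> Optional[str]:
        """Return the reviewer with the highest confirmation rate, or None if data is too thin."""
        rates = {
            name: confirmed / total
            for name, (confirmed, total) in self._stats.get(category, {}).items()
            if total >= min_samples
        }
        if not rates:
            return None
        return max(rates, key=rates.get)
```

With three confirmed security findings from CodeRabbit and one out of three from Codex, `best_reviewer("security")` would return `"CodeRabbit"`; a category with no history returns `None`.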

Bootstrap Layer

The original scaffolding that sets up any project for AI-assisted development:

| Layer | What | Why |
|---|---|---|
| Skills | 62 skills loaded via @include in CLAUDE.md | Language, framework, security, AI patterns |
| Rules | Conditional rules (activate by file path) | Quality gates, TDD workflow, security (only when relevant) |
| Hooks | Stop hooks for TDD loops | Tests run after every Claude response, failures feed back automatically |
| Agents | Team Lead + Quality + Security + Review + Merger + Feature | Coordinated pipeline: spec → test → implement → review → PR |
| Memory | Mnemos (typed graph on disk) | Survives compaction, crashes, restarts |
| Intent | iCPG (code property graph) | Tracks why code exists, detects drift |
| Explore | iCPG-powered code explorer | trace_path, search_graph, query_graph instead of grep |
| Routing | Plan-vs-execute classifier | CLAUDE tier → PLAN FIRST. DEEPSEEK/GEMINI → EXECUTE DIRECTLY |
| Plugins | Event-driven plugin system | Drop folder into ~/.maggy/plugins/, auto-discovered on startup |
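
To make "conditional rules (activate by file path)" concrete, here is a minimal sketch of glob-based rule activation. The patterns and rule names are hypothetical, and the real rule files in this repository may use a different matching scheme.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Hypothetical rule table: glob pattern -> rule names.
# The actual patterns and rule names live in the repository and may differ.
RULES: Dict[str, List[str]] = {
    "*.py": ["python-style", "tdd-workflow"],
    "src/auth/*": ["security"],
    "*.tsx": ["react"],
}


def active_rules(path: str) -> List[str]:
    """Return the rules whose glob matches the file path, in table order, without duplicates."""
    matched: List[str] = []
    for pattern, names in RULES.items():
        if fnmatchcase(path, pattern):
            matched.extend(name for name in names if name not in matched)
    return matched
```

Under these assumed patterns, editing `src/auth/login.py` activates the Python and TDD rules plus the security rule, while editing `README.md` activates none.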

Plugin System

Maggy has an mWP-first plugin architecture. Drop a folder with plugin.yaml + plugin.py into ~/.maggy/plugins/ or plugins/ — it's auto-discovered and loaded at startup. Works standalone with Claude Bootstrap (no Maggy server needed).

    # plugin.yaml
    id: my-plugin
    version: 1
    entrypoint: plugin.py
    hooks:
      - event: on_pr_merged
        handler: handle_pr_merged
      - event: on_feature_shipped
        handler: handle_feature_shipped
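
A matching `plugin.py` might look roughly like the sketch below. The handler names come from the manifest above, but the handler signature (one `dict` payload in, one `dict` out), the payload keys, and the `dispatch` helper are assumptions for illustration; check the plugin loader for the real contract.

```python
# plugin.py (hypothetical sketch; the real handler signature may differ)
from typing import Any, Callable, Dict, Optional

Payload = Dict[str, Any]


def handle_pr_merged(payload: Payload) -> Payload:
    """Handle an on_pr_merged event. The payload keys used here are assumed."""
    number = payload.get("number")
    title = payload.get("title", "(untitled)")
    return {"event": "on_pr_merged", "summary": f"PR #{number} merged: {title}"}


def handle_feature_shipped(payload: Payload) -> Payload:
    """Handle an on_feature_shipped event. The payload keys used here are assumed."""
    feature = payload.get("feature", "(unnamed)")
    return {"event": "on_feature_shipped", "summary": f"Shipped: {feature}"}


# Mirrors the hooks list in plugin.yaml; a loader would normally build this from the manifest.
HANDLERS: Dict[str, Callable[[Payload], Payload]] = {
    "on_pr_merged": handle_pr_merged,
    "on_feature_shipped": handle_feature_shipped,
}


def dispatch(event: str, payload: Payload) -> Optional[Payload]:
    """Call the handler registered for the event, or return None for unknown events."""
    handler = HANDLERS.get(event)
    return handler(payload) if handler else None
```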

First plugin: Build-in-Public — autonomous storyteller that notices your work, synthesizes a narrative, and publishes across channels without you asking.

    PR merged → AI extracts narrative arc → anonymizes sensitive names
              → formats per channel (LinkedIn teaches, X punches)
              → schedules via Buffer API

  • Multi-channel: LinkedIn (professional deep dives) + X (sharp one-liners) — different voice per platform
  • Auto-redaction: anonymize.yaml replaces company names, strips revenue/user data
  • AI-powered: DeepSeek synthesizes the story — not templates
  • Zero-click: Triggers from hooks, never asks for manual approval
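
The auto-redaction step could be sketched as follows. The alias table and the two regular expressions are assumptions for this example; the actual `anonymize.yaml` schema is not shown in this README.

```python
import re

# Hypothetical alias table; in the real plugin these would come from anonymize.yaml.
COMPANY_ALIASES = {"Acme Corp": "a client", "Globex": "a partner"}

# Money amounts such as "$1.2M" or "$40,000".
MONEY = re.compile(r"\$\d[\d,]*(?:\.\d+)?[kKmMbB]?")
# User counts such as "12,500 users" or "3k users".
USER_COUNT = re.compile(r"\b\d[\d,]*(?:\.\d+)?[kKmM]?\s+users\b")


def redact(text: str) -> str:
    """Replace company names with aliases and strip revenue and user figures."""
    for name, alias in COMPANY_ALIASES.items():
        text = re.sub(re.escape(name), alias, text, flags=re.IGNORECASE)
    text = MONEY.sub("[redacted amount]", text)
    text = USER_COUNT.sub("[redacted] users", text)
    return text
```

For example, `redact("Shipped billing for Acme Corp: $1.2M ARR and 12,500 users.")` yields `"Shipped billing for a client: [redacted amount] ARR and [redacted] users."`. Regex-based stripping is a blunt tool, so a real pipeline would likely want a review of what slips through.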

See skills/build-in-public/SKILL.md for channel best practices.

Skills (62)

Core — TDD, memory, intent tracking, code review, agent teams, security, commit hygiene, cross-agent delegation, Polyphony orchestration

Languages — Python, TypeScript, Node.js, React, React Native, Android (Java/Kotlin), Flutter

Databases — Supabase, Firebase, Cloudflare D1, DynamoDB, Aurora, Cosmos DB

AI — Agentic development, LLM patterns, AI models reference

UI — Web (Tailwind), mobile, visual testing, Playwright, PWA

Integrations — Stripe, Reddit, Shopify, WooCommerce, Medusa, Klaviyo, Teams, PostHog

See the full skills catalog for details.

Cross-Tool Compatibility

| Feature | Claude Code | Kimi CLI | Codex CLI | DeepSeek V4 |
|---|---|---|---|---|
| Skills | .claude/skills/ | .kimi/skills/ | .codex/skills/ | via Claude Code |
| Instructions | CLAUDE.md | (uses skills) | AGENTS.md | via Claude Code |
| Memory | 9-section XML summary | None | Encrypted blob / text summary | Mnemos typed graph |
| Routing | Manual | Manual | Manual | 6-tier auto-routing |

install.sh auto-detects installed tools. /sync-agents syncs config across tools on demand.

Memory: Mnemos vs. Codex vs. Claude Code

Every AI coding tool loses context on compaction. The difference is whether it prevents failure or just reacts to it.

Codex compaction is an opaque encrypted blob triggered by a single token counter. When it misfires, the agent enters documented "death spirals" — up to 26 compactions per session, re-reading the same files 10-20×, burning 160M+ tokens on work that used to cost 89M. No telemetry surfaces why compaction fired. No memory survives the session.

Claude Code uses 9-section XML summarization at a hardcoded ~95% token threshold. The summary is opaque to the user, discard decisions are invisible, and critical context (active errors, file contents) is silently dropped. No cross-session recall, no team context, no signal that the agent is struggling before the summary happens.

Maggy Mnemos treats memory as a typed graph in which goals and constraints are never evicted, while ephemeral context decays by relevance. A 4-dimension fatigue model (token pressure, scope scatter, re-read ratio, error density) triggers consolidation early: in the COMPRESS state at 40-60% load, long before a death spiral. Re-read ratio is measured explicitly because it is the leading indicator of a compaction death spiral: when the agent starts re-reading files it has already read, fatigue rises and consolidation triggers before the context window is full. Every eviction decision is auditable in SQLite, and cross-session memory is provided by Engram.

| | Codex | Claude Code | Maggy (Mnemos) |
|---|---|---|---|
| Compaction trigger | Configurable token threshold, blind to workload | Hardcoded ~95% token threshold, blind | 4-dimension fatigue score: token-aware, but not blind to workload |
| What survives | Opaque AES-encrypted blob (both paths) | 9-section XML summary | Typed memory nodes with per-type eviction (goals/constraints never evicted) |
| Transparency | Zero: cannot audit the summary | Readable but discard decisions invisible | Fully auditable: SQLite + JSONL, every node and eviction on disk |
| Death spiral prevention | None: known to compact for hours | None: no pre-failure signal | Re-read ratio + fatigue scoring triggers consolidation at 40-60%, before the window is full |
| Cross-session memory | None | None | Engram store: typed, queryable, persists across sessions |
| Pre-compaction safety | None: compacts reactively | None | Checkpoint written before compaction: critical nodes survive even if compaction fails |
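
To illustrate how a 4-dimension fatigue score could fire before the context window fills, here is a minimal sketch. The equal weighting, the `NORMAL` state name, and the reading of "40-60%" as a score band are assumptions; only the four dimension names and the COMPRESS, PRE_SLEEP (0.60), and REM (0.75) labels come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FatigueSignals:
    """Each signal is normalized to the range 0.0 to 1.0."""
    token_pressure: float  # fraction of the context window in use
    scope_scatter: float   # how widely the work is spread across unrelated files
    reread_ratio: float    # share of file reads that repeat an earlier read
    error_density: float   # share of recent steps that ended in an error

    def score(self) -> float:
        """Equal-weight mean of the four signals (the weighting is an assumption)."""
        parts = (self.token_pressure, self.scope_scatter, self.reread_ratio, self.error_density)
        clamped = [min(1.0, max(0.0, p)) for p in parts]
        return sum(clamped) / len(clamped)


def fatigue_state(score: float) -> str:
    """Map a fatigue score to a state name using the thresholds quoted in this README."""
    if score >= 0.75:
        return "REM"
    if score >= 0.60:
        return "PRE_SLEEP"
    if score >= 0.40:
        return "COMPRESS"
    return "NORMAL"  # hypothetical name for the low-fatigue state
```

In this sketch an agent at only 50% token pressure but with a 0.75 re-read ratio already scores 0.5 and lands in COMPRESS, which is the point of measuring more than a token counter.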

Routing: Maggy vs. the Landscape

Every AI tool claims to pick the right model. Here's how they actually compare:

| | OpenRouter | Martian | Portkey | Semantic Router | Maggy |
|---|---|---|---|---|---|
| Approach | Performance-based, user-defined fallbacks | LLM-as-Classifier, trained router model | Gateway: retries, load balancing, rule-based | Embedding similarity, pre-defined routes | LLM-as-Classifier with cascading fallback |
| Classification cost | None (user picks) | API call (~$0.001) | None (rule-based) | None (embeddings) | $0 (local qwen3) |
| Classifier resilience | N/A | Single point of failure | N/A | N/A | Cascade: qwen3 → kimi → deepseek → cache |
| Fatigue-aware | No | No | No | No | Yes: 4-dimension fatigue, PRE_SLEEP/REM escalation |
| Mid-task switching | No | No | No | No | Checkpoint-based state transfer (in progress) |
| Memory-aware | Token count only | No | Token count only | No | Semantic: typed nodes, per-type eviction, re-read ratio |
| Self-learning | No | No | No | No | Per-project routing profiles with success/failure tracking |

Three things only Maggy has:

  1. Fatigue-aware routing — nobody routes based on agent state. When Mnemos detects PRE_SLEEP (0.60), Maggy skips cheap tiers. At REM (0.75), it forces premium models. OpenRouter can't do this. Martian can't. No paper proposes it.

  2. Cascading classifier resilience — every other router has a single point of failure. If Martian's classifier is down, routing stops. Maggy cascades through qwen3 → kimi → deepseek-flash → cached tier. The classifier itself is multi-model.

  3. Semantic memory, not token counting — Portkey checks token_count > 8000 to switch context windows. Maggy tracks what KIND of memory matters: goals survive compaction, error traces decay, code-refs persist. Routes based on semantic importance, not a counter.
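
The first two points can be sketched together: a classifier cascade that falls back to a cached tier, followed by a fatigue floor on the chosen tier. The three tier names and both function names are hypothetical (the README mentions six tiers without listing them); the 0.60 and 0.75 thresholds are the ones quoted above.

```python
from typing import Callable, Sequence

# Hypothetical tier names, cheapest first. Maggy's real tier list is not shown here.
TIERS = ["cheap", "standard", "premium"]


def classify_with_cascade(
    task: str,
    classifiers: Sequence[Callable[[str], str]],
    cached_tier: str,
) -> str:
    """Try each classifier in order; fall back to the cached tier if all of them fail."""
    for classify in classifiers:
        try:
            tier = classify(task)
        except Exception:
            continue  # this classifier is down or errored; try the next one
        if tier in TIERS:
            return tier
    return cached_tier


def apply_fatigue_floor(tier: str, fatigue: float) -> str:
    """Raise the tier when the agent is fatigued: skip cheap at 0.60, force premium at 0.75."""
    if fatigue >= 0.75:
        return "premium"
    if fatigue >= 0.60 and tier == "cheap":
        return "standard"
    return tier
```

A caller would pass the classifiers in cascade order (for example wrappers around qwen3, kimi, and deepseek) and then run the result through `apply_fatigue_floor` with the current fatigue score.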

Core Concepts

TDD via Stop Hooks: tests run after every Claude response. Failures feed back automatically. No plugins needed.
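
A Stop hook that closes the TDD loop could look roughly like this. It assumes pytest as the test runner and relies on the Claude Code convention, as we understand it, that a Stop hook exiting with code 2 blocks the stop and returns stderr to the model; verify both against the hook scripts shipped in this repository and the current Claude Code hooks documentation. The function names are hypothetical.

```python
import subprocess
import sys
from typing import List, Tuple


def run_tests_for_stop_hook(test_cmd: List[str]) -> Tuple[int, str]:
    """Run the test command and return (hook_exit_code, feedback_for_the_agent)."""
    result = subprocess.run(test_cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return 0, ""  # tests pass: let the agent stop
    output = (result.stdout + result.stderr).strip()
    # Exit code 2 is assumed to block the stop and feed stderr back to the agent.
    return 2, "Tests failed; fix them before finishing:\n" + output


def main() -> int:
    """Entry point for the hook script; pytest is an assumed test runner."""
    code, feedback = run_tests_for_stop_hook([sys.executable, "-m", "pytest", "-q"])
    if feedback:
        print(feedback, file=sys.stderr)
    return code
```

A hook script would end with `sys.exit(main())`. A production hook would also need a guard against looping forever when tests cannot be fixed.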

Mnemos Memory: typed graph on disk (goals, constraints, results, context). Survives compaction, crashes, multi-agent failures. 4-dimension fatigue model writes checkpoints before things go wrong.
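
Per-type eviction can be illustrated with a small sketch: goal and constraint nodes are pinned, and everything else is evicted lowest relevance first until the token budget is met. The `MemoryNode` fields and the `evict` function are assumptions for this example, not the Mnemos schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

PINNED_TYPES = {"goal", "constraint"}  # never evicted, per the description above


@dataclass
class MemoryNode:
    node_id: str
    node_type: str    # "goal", "constraint", "result", or "context"
    relevance: float  # higher means more worth keeping
    tokens: int


def evict(nodes: List[MemoryNode], token_budget: int) -> Tuple[List[MemoryNode], List[MemoryNode]]:
    """Return (kept, evicted). Pinned nodes stay even if the budget is exceeded."""
    kept = list(nodes)
    evicted: List[MemoryNode] = []
    total = sum(node.tokens for node in kept)
    candidates = sorted(
        (node for node in kept if node.node_type not in PINNED_TYPES),
        key=lambda node: node.relevance,
    )
    for node in candidates:
        if total <= token_budget:
            break
        kept.remove(node)
        evicted.append(node)
        total -= node.tokens
    return kept, evicted
```

Returning the evicted list alongside the kept list is what would let every eviction be logged for later audit.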

iCPG Intent Tracking: links every code change to a ReasonNode with intent, postconditions, and invariants. 6-dimension drift detection.
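
As a simplified picture of a ReasonNode, the sketch below stores an intent with named postconditions and invariants, links change IDs to it, and reports which checks an observed state violates. The field layout and the check-as-callable design are assumptions; iCPG's real schema and its six drift dimensions are not described in this README and are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Check = Callable[[dict], bool]  # takes observed facts, returns True when satisfied


@dataclass
class ReasonNode:
    intent: str
    postconditions: Dict[str, Check] = field(default_factory=dict)
    invariants: Dict[str, Check] = field(default_factory=dict)
    linked_changes: List[str] = field(default_factory=list)

    def link(self, change_id: str) -> None:
        """Record that a code change exists because of this intent."""
        self.linked_changes.append(change_id)

    def violations(self, observed: dict) -> List[str]:
        """Return the names of postconditions and invariants the observed state breaks."""
        checks = {**self.postconditions, **self.invariants}
        return [name for name, check in checks.items() if not check(observed)]
```

A non-empty `violations` list is one simple signal that the code has drifted from the reason it was written.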

Agent Teams: 6 agents with enforced pipeline (spec → test → implement → review → security → PR). Only Feature agents can edit code.
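
One way to picture the enforced pipeline is a tiny state machine that only allows moving to the next stage, plus a role check for code edits. The stage order is taken from the line above; the class, the lowercase stage and role identifiers, and the error type are assumptions for illustration.

```python
PIPELINE = ["spec", "test", "implement", "review", "security", "pr"]


class PipelineError(Exception):
    """Raised when a stage is skipped or visited out of order."""


class FeaturePipeline:
    def __init__(self) -> None:
        self._index = 0

    @property
    def stage(self) -> str:
        return PIPELINE[self._index]

    def advance(self, to_stage: str) -> None:
        """Move to the next stage; any other target is rejected."""
        next_index = self._index + 1
        expected = PIPELINE[next_index] if next_index < len(PIPELINE) else None
        if to_stage != expected:
            raise PipelineError(f"cannot go from {self.stage!r} to {to_stage!r}; expected {expected!r}")
        self._index = next_index


def can_edit_code(agent_role: str) -> bool:
    """Only Feature agents may edit code; the role string is a hypothetical identifier."""
    return agent_role == "feature"
```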

Usage

    # New project
    mkdir my-app && cd my-app
    claude
    > /initialize-project

    # Existing project
    cd my-existing-app
    claude
    > /initialize-project   # auto-detects existing code

    # Update skills globally
    cd "$(cat ~/.claude/.bootstrap-dir)"
    git pull && ./install.sh

Docs

License

MIT — See LICENSE


