easy-agent

A white-box Python foundation for inspectable, testable, and extensible agent runtimes.

easy-agent is the runtime layer underneath an agent product, not the product itself. It keeps orchestration, tool calling, persistence, approvals, federation, and evaluation explicit so teams can evolve their systems without hiding critical behavior behind opaque framework abstractions.

The latest published patch is 0.3.5.

What This Project Is

Most agent projects move quickly from "call a model" to "ship an application". The runtime layer in the middle then accumulates hidden assumptions around tools, memory, approvals, transport, and recovery.

easy-agent exists to keep that middle layer explicit:

  • It separates runtime engineering from product logic.
  • It keeps scheduling, orchestration, and protocol adaptation inspectable.
  • It lets you mount tools, skills, MCP servers, and plugins without rewriting the core.
  • It provides durable harnesses, checkpoints, and replay instead of relying on one oversized prompt.

Who It Is For

  • Engineering teams building agent products that need a reusable runtime instead of a one-off demo.
  • Developers who want direct control over tool calling, approvals, persistence, and resume behavior.
  • Projects that need to evolve with provider APIs, MCP, and multi-agent patterns over time.

Tech Stack

  • Runtime: Python 3.12, uv, AnyIO, Typer
  • Model surface: OpenAI-compatible, Anthropic-style, and Gemini-style payload adaptation
  • Persistence: SQLite + JSONL traces
  • Integration surface: direct tools, command skills, Python hook skills, MCP, plugins
  • Isolation surface: process, container, and microVM workbench executors

Features

  • White-box runtime layers for scheduler, orchestrator, tool registry, storage, and protocol adapters.
  • Support for single_agent, sub_agent, graph workflows, Agent Teams, and long-running harnesses.
  • Session memory, checkpoints, replay, branchable resume, and approval-aware recovery.
  • Guardrails, schema-aware tool validation, runtime event streaming, and persistent traces.
  • A2A-style remote federation with durable task state and signed callback verification.
  • Public evaluation helpers for benchmark, BFCL, tau2 mock, provider-schema compatibility, and real-network regression tracking.
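
As an illustration of schema-aware tool validation with a repair loop, the sketch below checks proposed tool arguments against a small JSON Schema subset and feeds the errors back to the caller until the arguments pass. The names `validate_args`, `call_with_repair`, and the `propose` callback are hypothetical and are not easy-agent's API; nullable fields are assumed to be modeled as a type list such as `["integer", "null"]`.

```python
from typing import Any, Callable


def _matches(value: Any, type_name: str) -> bool:
    """Check one value against one JSON Schema primitive type name."""
    if type_name == "null":
        return value is None
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "number":
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "array":
        return isinstance(value, list)
    if type_name == "object":
        return isinstance(value, dict)
    return False


def validate_args(args: dict, schema: dict) -> list[str]:
    """Return human-readable errors; an empty list means the arguments are valid."""
    errors: list[str] = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")
    for name, value in args.items():
        spec = props.get(name)
        if spec is None:
            errors.append(f"unexpected field: {name}")
            continue
        expected = spec.get("type")
        if expected is None:
            continue  # no type constraint declared for this field
        allowed = expected if isinstance(expected, list) else [expected]
        if not any(_matches(value, t) for t in allowed):
            errors.append(
                f"{name}: expected {' or '.join(allowed)}, got {type(value).__name__}"
            )
    return errors


def call_with_repair(
    propose: Callable[[list[str]], dict], schema: dict, max_attempts: int = 3
) -> dict:
    """Ask `propose` for arguments, passing back the previous errors each round."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        args = propose(feedback)
        feedback = validate_args(args, schema)
        if not feedback:
            return args
    raise ValueError(f"tool arguments still invalid: {feedback}")
```

In a real runtime, `propose` would wrap a model call that receives the error list as repair context; here it can be any callable, which keeps the loop easy to test.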

Human Loop, Replay, and MCP

easy-agent already ships the reliability controls that many projects leave as future work:

  • Sensitive tools, swarm handoffs, and resumptions can enter a durable approval flow.
  • Runs expose safe-point interrupts, checkpoint listing, replay, and forked resume.
  • MCP integrations support explicit roots, root snapshots, notifications/roots/list_changed, resource and prompt catalog management, durable resource subscriptions, resource-template snapshots, prompt-detail invalidation, elicitation approval state, streamable_http transport, and persisted OAuth state.
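
A durable approval flow of the kind described above can be sketched with SQLite: a sensitive tool call is recorded as a pending row and the run pauses until a decision is stored. The table layout, the function names, and the tool names in `SENSITIVE_TOOLS` are assumptions made for illustration; easy-agent's actual schema and API may differ.

```python
import sqlite3

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS approvals (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    run_id TEXT NOT NULL,
    tool_name TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
)
"""

SENSITIVE_TOOLS = {"shell_exec", "delete_file"}  # illustrative tool names


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the approval store; pass a file path for durability."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def request_tool_call(conn: sqlite3.Connection, run_id: str, tool_name: str) -> str:
    """Return 'run' for ordinary tools; persist a pending approval otherwise."""
    if tool_name not in SENSITIVE_TOOLS:
        return "run"
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO approvals (run_id, tool_name) VALUES (?, ?)",
            (run_id, tool_name),
        )
    return "paused"


def decide(conn: sqlite3.Connection, approval_id: int, approved: bool) -> None:
    """Record a human decision for one pending approval."""
    with conn:
        conn.execute(
            "UPDATE approvals SET status = ? WHERE id = ?",
            ("approved" if approved else "rejected", approval_id),
        )


def pending(conn: sqlite3.Connection, run_id: str) -> list[tuple[int, str]]:
    """List (id, tool_name) for approvals that still block the run."""
    return conn.execute(
        "SELECT id, tool_name FROM approvals "
        "WHERE run_id = ? AND status = 'pending' ORDER BY id",
        (run_id,),
    ).fetchall()
```

Because the pending state lives in the database rather than in process memory, a restarted process can call `pending(...)` and pick up where the run paused.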

A2A Remote Agent Federation

The federation layer publishes local agents, teams, and harnesses through a durable A2A-style surface:

  • Well-known discovery, richer cards, push or poll delivery, retry, and resubscribe flows.
  • OAuth/OIDC token acquisition and refresh for remote federation clients.
  • JWKS/JWS validation for signed cards and signed callbacks.
  • Stricter tenant/task authorization boundaries before federated state is revealed or mutated.

Operational detail and comparison notes are documented in reference/en/test-results.md.

Executor / Workbench Isolation

The executor/workbench layer gives long-lived tools and MCP subprocesses a reusable runtime boundary:

  • Named executors for process, container, and microvm.
  • Persistent workbench sessions, manifests, snapshots, and TTL cleanup.
  • Real-network regression coverage for warm-start latency and snapshot drift.
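
The combination of named executors, persistent sessions, and TTL cleanup can be sketched as a small in-memory pool. `WorkbenchSession`, `WorkbenchPool`, and their fields are hypothetical names for illustration, and the clock is passed in explicitly so the behavior is deterministic; a real workbench would also persist manifests and snapshots.

```python
from dataclasses import dataclass, field

EXECUTOR_KINDS = ("process", "container", "microvm")


@dataclass
class WorkbenchSession:
    session_id: str
    executor: str
    last_used: float
    ttl_seconds: float

    def expired(self, now: float) -> bool:
        """A session expires once it has been idle longer than its TTL."""
        return now - self.last_used > self.ttl_seconds


@dataclass
class WorkbenchPool:
    sessions: dict[str, WorkbenchSession] = field(default_factory=dict)

    def acquire(
        self, session_id: str, executor: str, now: float, ttl_seconds: float = 300.0
    ) -> WorkbenchSession:
        """Reuse a live session (warm start) or create a fresh one."""
        if executor not in EXECUTOR_KINDS:
            raise ValueError(f"unknown executor: {executor}")
        session = self.sessions.get(session_id)
        if session is None or session.expired(now) or session.executor != executor:
            session = WorkbenchSession(session_id, executor, now, ttl_seconds)
            self.sessions[session_id] = session
        session.last_used = now  # touching a session extends its lifetime
        return session

    def cleanup(self, now: float) -> list[str]:
        """Drop expired sessions and return their ids."""
        expired = [sid for sid, s in self.sessions.items() if s.expired(now)]
        for sid in expired:
            del self.sessions[sid]
        return expired
```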

Detailed operational notes are documented in reference/en/usage-guide.md.

Architecture

The runtime is intentionally modular and observable:

  • scheduler coordinates direct-agent and graph execution.
  • orchestrator runs agent and team turns.
  • harness manages initializer, worker, and evaluator loops.
  • registry exposes tools, skills, MCP tools, and mounted plugins.
  • storage persists runs, checkpoints, approvals, sessions, federation state, and workbench state.

    flowchart LR
        User[User] --> CLI[Typer CLI]
        CLI --> Runtime[EasyAgentRuntime]
        Runtime --> Scheduler[GraphScheduler]
        Runtime --> Harness[HarnessRuntime]
        Scheduler --> Orchestrator[AgentOrchestrator]
        Harness --> Orchestrator
        Orchestrator --> Registry[ToolRegistry]
        Orchestrator --> Store[SQLiteRunStore]
        Orchestrator --> Client[HttpModelClient]
        Client --> Adapter[ProtocolAdapter]
        Adapter --> Provider[Provider API]
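
The registry role in this architecture, a single lookup surface for direct tools and mounted integrations, can be sketched as follows. `ToolRegistrySketch` and its source-prefix naming scheme are assumptions for illustration and do not describe the real `ToolRegistry` class.

```python
from typing import Any, Callable


class ToolRegistrySketch:
    """One namespace for direct tools and tools mounted from other sources."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any], source: str = "direct") -> str:
        """Register a callable; non-direct sources get a 'source.name' key."""
        key = name if source == "direct" else f"{source}.{name}"
        if key in self._tools:
            raise ValueError(f"tool already registered: {key}")
        self._tools[key] = fn
        return key

    def names(self) -> list[str]:
        """Return every registered key in sorted order."""
        return sorted(self._tools)

    def call(self, key: str, **kwargs: Any) -> Any:
        """Invoke a registered tool with keyword arguments."""
        return self._tools[key](**kwargs)
```

Prefixing mounted tools with their source keeps an MCP tool and a direct tool with the same short name from colliding.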

Long-Running Harness Design

Harnesses are first-class runtime objects rather than prompt conventions. Each harness defines:

  • an initializer_agent
  • a worker_target
  • an evaluator_agent
  • an explicit completion_contract

The worker loop persists artifacts and checkpoints so long-running tasks can continue, replan, or resume without discarding state.

Protocol and Tool Model

  • Model protocols: OpenAI-compatible chat-completions or Responses API payload normalization, Anthropic-style payloads, and Gemini-style payload normalization.
  • Tool calling: strict schema transport, nullable/optional modeling, validation-repair loops, provider-neutral tool-choice controls, and provider-schema compatibility telemetry.
  • Web-search eval hardening: SerpApi /search.json, grounded source ledgers, cache-first contents reuse, replay-backed contents fallback, raw official BFCL manifest normalization, and single-call regression guards.
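
Payload adaptation for tool definitions can be illustrated by mapping one provider-neutral spec onto the shapes each provider family expects. `ToolSpec` and the `to_*` helpers are hypothetical names, and the payload shapes reflect the publicly documented provider formats as understood at the time of writing; treat them as assumptions and verify against the current provider references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Provider-neutral tool description (hypothetical type)."""
    name: str
    description: str
    parameters: dict  # JSON Schema for the tool arguments


def to_openai_chat(tool: ToolSpec) -> dict:
    """Chat-completions style: the function is nested under a 'function' key."""
    return {
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.parameters,
        },
    }


def to_openai_responses(tool: ToolSpec) -> dict:
    """Responses API style: the same fields, flattened to the top level."""
    return {
        "type": "function",
        "name": tool.name,
        "description": tool.description,
        "parameters": tool.parameters,
    }


def to_anthropic(tool: ToolSpec) -> dict:
    """Anthropic style: the schema travels under 'input_schema'."""
    return {
        "name": tool.name,
        "description": tool.description,
        "input_schema": tool.parameters,
    }


def to_gemini(tools: list[ToolSpec]) -> dict:
    """Gemini REST style: declarations are grouped in one tool entry."""
    return {
        "functionDeclarations": [
            {"name": t.name, "description": t.description, "parameters": t.parameters}
            for t in tools
        ]
    }
```

Keeping the neutral spec as the single source of truth means a provider format change touches one adapter function rather than every tool definition.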

Provider behavior details and structured-output notes live in reference/en/next-reinforcement.md.

Project Layout

    src/
      agent_cli/
      agent_common/
      agent_config/
      agent_graph/
      agent_integrations/
      agent_protocols/
      agent_runtime/
    skills/
    configs/
    tests/
    reference/
      en/
      zh/

Quick Start

    uv venv --python 3.12
    uv sync --dev
    uv run easy-agent --help
    uv run easy-agent doctor -c easy-agent.yml

Detailed setup, local credentials, CLI commands, and examples:

What a Harness Run Produces

A harness run persists durable artifacts under the configured artifact directory and durable session storage, including:

  • bootstrap and progress markdown
  • feature snapshots
  • checkpoints and replay state
  • workbench session metadata

Artifact details are documented in reference/en/usage-guide.md.

Verification

Release 0.3.5 refreshes the benchmark, public-eval, and real-network snapshots on April 14, 2026, and extends the runtime with grounded web-search source ledgers, cache-first grounded contents recovery, and stricter regression coverage for the Responses API and structured output. Methodology notes, public comparison rows, and detailed matrices live in reference/en/test-results.md.

Score Summary

| Test Set | Score |
| --- | --- |
| benchmark.overall | 100.0 |
| public_eval.bfcl_overall | 100.0 |
| public_eval.tau2_mock | 100.0 |

Real Network Test Set Results

The real-network matrix is reported as score-only in this README. Durations, telemetry, warm-start budgets, and snapshot-drift detail are tracked in reference/en/test-results.md.

| Test Set | Score |
| --- | --- |
| real_network.overall | 100.0 |

Next Reinforcement

The next reinforcement track is documented in full at reference/en/next-reinforcement.md. The near-term focus remains:

  • turning the shipped chat-completions and Responses API parity into live provider-specific compatibility evidence
  • extending BFCL web-search from grounded replay safety toward richer source-ledger, source-aware, and multihop official coverage
  • deepening MCP notification parity around resource updates, prompt-detail refresh, and template diff telemetry

Acknowledgements

  • Linux.do for community discussion and open knowledge sharing.
  • DeepSeek for the real verification baseline and model endpoint.

License

MIT. See LICENSE.
