Star 历史趋势
数据来源: GitHub API · 生成自 Stargazers.cn
README.md

Dash

A self-learning data agent built with systems engineering principles. It grounds answers in 6 layers of context and improves with every query.

Chat with Dash via Slack, the terminal, or the AgentOS web UI.

Quick Start

# Clone the repo git clone https://github.com/agno-agi/dash.git && cd dash cp example.env .env # Edit .env and add your OPENAI_API_KEY # Start the system docker compose up -d --build # Generate sample data and load knowledge docker exec -it dash-api python scripts/generate_data.py docker exec -it dash-api python scripts/load_knowledge.py

Confirm Dash is running at http://localhost:8000/docs.

Connect to the Web UI

  1. Open os.agno.com and login
  2. Add OS → Local → http://localhost:8000
  3. Click "Connect"

Try it (SaaS metrics dataset):

  • What's our current MRR?
  • Which plan has the highest churn rate?
  • Show me revenue trends by plan over the last 6 months
  • Which customers are at risk of churning?

Deploy to Railway

Railway deployment uses .env.production to keep production credentials separate from local dev.

cp example.env .env.production # Edit .env.production — set OPENAI_API_KEY

Step 1: Deploy infrastructure

This creates the Railway project, database, and app service. The app will crash-loop until the JWT key is added in the next step — that's expected.

railway login ./scripts/railway_up.sh

Step 2: Get your JWT key

Production requires a JWT_VERIFICATION_KEY from AgentOS. You need the Railway domain from step 1 to set this up.

  1. Copy your Railway domain from the output of step 1 (e.g. dash-production-xxxx.up.railway.app)
  2. Open os.agno.com and login
  3. Add OS → Live → paste your Railway URL
  4. Go to Settings and generate a key pair
  5. Add the public key to .env.production (wrap in single quotes):
JWT_VERIFICATION_KEY='-----BEGIN PUBLIC KEY----- MIIBIjANBgkq... -----END PUBLIC KEY-----'

Step 3: Push environment and redeploy

./scripts/railway_env.sh ./scripts/railway_redeploy.sh

railway_env.sh reads .env.production and sets each variable on the Railway service. Safe to run repeatedly. Handles multiline values (PEM keys) correctly.

Production operations

Database scripts must run inside Railway's network (the internal hostname pgvector.railway.internal isn't reachable from your local machine). Use SSH to connect to the running container:

railway ssh --service dash # Inside the container: python scripts/generate_data.py python scripts/load_knowledge.py

Other operations run locally:

railway logs --service dash railway open

Why Dash Exists

Ask a question in English, get a correct, meaningful answer. That's the goal. But raw LLMs writing SQL hit a wall fast: schemas lack meaning, types are misleading, tribal knowledge is missing, there's no way to learn from mistakes, and results lack interpretation.

The root cause is missing context and missing memory. Dash solves this with six layers of grounded context, a self-learning loop that improves with every query, and a focus on delivering insights you can act on.

Architecture: Five Layers, One System

Agentic software is just software with the business logic replaced by agents. Everything else is systems engineering. Dash is built across five layers that reinforce each other.

Agent Engineering     →  dash/team.py + dash/agents/
Data Engineering      →  knowledge/ + Agno Learning Machine + PostgreSQL
Security Engineering  →  AgentOS auth + RBAC + read-only SQL enforcement
Interface Engineering →  app/main.py (FastAPI) + Slack + AgentOS
Infrastructure        →  Dockerfile + compose.yaml + scripts/

1. Agent Engineering

The agent team and execution flow. Model, instructions, tools, knowledge, and the self-learning loop.

AgentOS (app/main.py)  [scheduler=True, tracing=True]
 ├── FastAPI / Uvicorn
 ├── Slack Interface (optional)
 └── Dash Team (dash/team.py, coordinate mode)
     ├─ Analyst (dash/agents/analyst.py)         reads public + dash
     │  ├─ SQLTools (read-only)  → public schema (company data)
     │  ├─ introspect_schema     → both schemas
     │  ├─ save_validated_query  → knowledge base
     │  └─ ReasoningTools
     ├─ Engineer (dash/agents/engineer.py)       reads public, writes dash
     │  ├─ SQLTools (full)       → dash schema (agent-managed)
     │  ├─ introspect_schema     → both schemas
     │  ├─ update_knowledge      → knowledge base (schema changes)
     │  └─ ReasoningTools
     │
     Leader tools: SlackTools (optional)
     Knowledge:    dash_knowledge (table schemas, queries, business rules, dash views)
     Learnings:    dash_learnings (error patterns, type gotchas, fixes)

2. Data Engineering

Context is data. Memory is data. Knowledge is data. All managed with data engineering principles: well-designed schemas, structured querying, databases for fast read/writes.

Six layers of grounded context:

LayerPurposeSource
Table UsageSchema, columns, relationshipsknowledge/tables/*.json
Human AnnotationsMetrics, definitions, business rulesknowledge/business/*.json
Query PatternsSQL that is known to workknowledge/queries/*.sql
Institutional KnowledgeDocs, wikis, external referencesMCP (optional)
LearningsError patterns and discovered fixesAgno Learning Machine
Runtime ContextLive schema changesintrospect_schema tool

The self-learning loop:

User Question
     ↓
Retrieve Knowledge + Learnings
     ↓
Reason about intent
     ↓
Generate grounded SQL
     ↓
Execute and interpret
     ↓
 ┌────┴────┐
 ↓         ↓
Success    Error
 ↓         ↓
 ↓         Diagnose → Fix → Save Learning
 ↓                           (never repeated)
 ↓
Return insight
 ↓
Optionally save as Knowledge

Two complementary systems:

SystemStoresHow It Evolves
KnowledgeValidated queries and business contextCurated by you + Dash
LearningsError patterns and fixesManaged by Learning Machine automatically

Dual schema enforcement: A structural boundary between company data and agent-managed data.

SchemaOwnerAccess
publicCompany (loaded externally)Read-only — never modified by agents
dashEngineer agentViews, summary tables, computed data

The Engineer builds reusable data assets (dash.monthly_mrr, dash.customer_health_score, dash.churn_risk) and records them to knowledge. The Analyst discovers and prefers these views over raw table queries.

3. Security Engineering

Auth uses RBAC with JWT verification in production. Every query is scoped to user_id. Read-only access is a tool configuration, not a prompt instruction. The Analyst agent's SQL tools are scoped to read-only at the system level.

See Security for setup details.

4. Interface Engineering

One agent definition, multiple surfaces. Dash is reachable via REST API (FastAPI), Slack threads, and the AgentOS web UI. Each surface has its own identity system: a Slack user ID maps to sessions via thread timestamps, the API uses JWT-backed auth.

5. Infrastructure Engineering

Dockerfile, Docker Compose, one-command deployment. Scheduled tasks for proactive behavior. The infrastructure layer is boring on purpose. 95% of running an agent is identical to running any other service.

Slack

Dash can receive Slack DMs, @mentions, and thread replies, and can also post to channels proactively.

Quick setup:

  1. Run Dash and give it a public URL (ngrok locally, or your Railway domain).
  2. Follow docs/SLACK_CONNECT.md to create and install the Slack app from the manifest.
  3. Set SLACK_TOKEN and SLACK_SIGNING_SECRET, then restart Dash.
  4. In Slack, confirm Event Subscriptions is verified and send a DM or @mention to test it.

Each Slack thread maps to one Dash session. For the manifest, ngrok commands, Railway deployment, permissions, and troubleshooting, see docs/SLACK_CONNECT.md.

Data Model (SaaS Metrics)

Synthetic B2B SaaS dataset (~900 customers, 2 years of data):

TableDescription
customersCompany info, industry, size, acquisition source, status
subscriptionsPlan, MRR, seats, billing cycle, lifecycle status
plan_changesUpgrades, downgrades, cancellations with MRR impact
invoicesBilling records, payment status, billing periods
usage_metricsDaily API calls, active users, storage, reports
support_ticketsPriority, category, resolution time, satisfaction

Adding Knowledge

Dash works best when it understands how your organization talks about data.

knowledge/
├── tables/      # Table meaning and caveats
├── queries/     # Proven SQL patterns
└── business/    # Metrics and language

Table Metadata

{ "table_name": "customers", "table_description": "B2B SaaS customer accounts with company info and lifecycle status", "use_cases": ["Churn analysis", "Cohort segmentation", "Acquisition reporting"], "data_quality_notes": [ "signup_date is DATE (not TIMESTAMP) — no time component", "status values: active, churned, trial", "company_size is self-reported" ] }

Query Patterns

-- <query monthly_mrr> -- <description>Monthly MRR from active subscriptions</description> -- <query> SELECT DATE_TRUNC('month', started_at) AS month, SUM(mrr) AS total_mrr FROM subscriptions WHERE ended_at IS NULL GROUP BY 1 ORDER BY 1 DESC -- </query>

Business Rules

{ "metrics": [ { "name": "MRR", "definition": "Sum of active subscriptions excluding trials" } ], "common_gotchas": [ { "issue": "Active subscription detection", "solution": "Filter on ended_at IS NULL, not status column" } ] }

Load Knowledge

python scripts/load_knowledge.py # Upsert changes python scripts/load_knowledge.py --recreate # Fresh start

Evaluations

Five eval categories using Agno's eval framework:

CategoryEval TypeWhat It Tests
accuracyAccuracyEval (1-10)Correct data and meaningful insights
routingReliabilityEvalTeam routes to correct agent/tools
securityAgentAsJudgeEval (binary)No credential or secret leaks
governanceAgentAsJudgeEval (binary)Refuses destructive SQL operations
boundariesAgentAsJudgeEval (binary)Schema access boundaries respected
python -m evals # Run all evals python -m evals --category accuracy # Run specific category python -m evals --verbose # Show response details

Local Development

./scripts/venv_setup.sh && source .venv/bin/activate docker compose up -d dash-db python scripts/generate_data.py python scripts/load_knowledge.py python -m dash # CLI mode python -m app.main # AgentOS mode (web UI at os.agno.com)

Environment Variables

VariableRequiredDefaultPurpose
OPENAI_API_KEYYesOpenAI API key
SLACK_TOKENNo""Slack bot token (interface + tools)
SLACK_SIGNING_SECRETNo""Slack signing secret (interface only)
DB_HOSTNolocalhostPostgreSQL host
DB_PORTNo5432PostgreSQL port
DB_USERNoaiPostgreSQL user
DB_PASSNoaiPostgreSQL password
DB_DATABASENoaiPostgreSQL database
PORTNo8000API port
RUNTIME_ENVNoprddev enables hot reload
AGENTOS_URLNohttp://127.0.0.1:8000Scheduler callback URL (production)
JWT_VERIFICATION_KEYProductionRBAC public key from os.agno.com

Security

Production deployments require authentication via Agno AgentOS. Dash enables RBAC authorization when RUNTIME_ENV=prd (the default). Without a valid JWT_VERIFICATION_KEY, production endpoints will reject all requests.

Local development (RUNTIME_ENV=dev, set by Docker Compose) runs without auth so you can iterate freely.

Auth Setup

See Deploy to Railway for the full setup flow, including how to get your JWT_VERIFICATION_KEY from AgentOS. The Agno control plane handles JWT issuance, session management, traces, metrics, and the web UI. See the AgentOS Security docs for details.

Schema-Level Enforcement

Beyond API-level auth, Dash enforces data access at the database level:

  • Analyst connects with default_transaction_read_only=on — PostgreSQL rejects any write attempt
  • Engineer writes are scoped to the dash schema — a SQLAlchemy event listener blocks any DDL/DML targeting public
  • Leader has no direct database access

These are infrastructure guardrails, not prompt instructions. They hold regardless of what the model generates.

Learn More

Built on Agno · the runtime for agentic software

关于 About

A self-learning data agent built with systems engineering principles. It grounds answers in 6 layers of context and improves with every query.

语言 Languages

Python87.9%
Shell11.3%
Dockerfile0.8%

提交活跃度 Commit Activity

代码提交热力图
过去 52 周的开发活跃度
62
Total Commits
峰值: 35次/周
Less
More

核心贡献者 Contributors