CatchMe: Make Your AI Agents Truly Personal

Capture Your Entire Digital Footprint: Lightweight & Vectorless & Powerful.

Just do your thing. CatchMe captures everything else — stored locally to ensure privacy and security.

CatchMe Terminal Demo

🦞 Makes Your Agents Truly Personal. CatchMe ships as an agent-compatible skill for CLI agents (OpenClaw, NanoBot, Claude, Cursor, etc.). Run CatchMe independently. Your agents query memories via CLI commands only.

🎯 Enrich Your Personal Digital Context

Coding

💻 Personal Coding Assistant

"What was I coding in Claude Code today?"

• Code session replay
• Recall your edited files
• Trace what you typed

Research

🔍 Personal Deep Research

"What was I reading about AI yesterday?"

• Web/PDF viewed
• Search queries typed
• Reading info tracked

Files

📁 Personal Files Manager

"Which files did I change today?"

• File changes tracked
• Docs accessed
• Edits reviewed

Digital Life

🧩 Digital Life Overview

"How did I spend my afternoon?"

• App usage tracked
• Workflows replayed
• Activities recalled

✨ Key Features

📹 Always-On Event Capture

  • Event-Driven Recording: No timers or delays. Mouse actions are captured instantly, with crosshair annotation.
  • Comprehensive Context: Five recorders track windows, keyboard, clipboard, notifications, and files around mouse actions.

🌲 Intelligent Memory Hierarchy

  • Auto-Organization: Raw streams structure into five tiers: Day → Session → App → Location → Action.
  • Smart Summaries: LLM summaries at each level, transforming logs into searchable knowledge trees.

🔍 Tree-Based Retrieval

  • No Vector Complexity: Skip embeddings and VDBs — our system uses tree-based reasoning for navigation.
  • Top-Down Search: LLM reads summaries, selects relevant branches, and drills down to evidence.

🤖 Zero-Config Agent Integration

  • One-File Setup: Drop a single skill file into any AI agent for instant integration.
  • Immediate Access: CLI-based screen history queries with zero configuration required.

🪶 Ultralight & Privacy-First

  • Minimal Footprint: ~0.2GB runtime RAM with efficient SQLite + FTS5 storage.
  • Local & Offline: All data stays on your machine with full offline mode via Ollama/vLLM/LM Studio.

🖥️ Rich Web Interface

  • Visual Exploration: Interactive timelines, memory tree navigation, and real-time system monitoring.
  • Natural Conversation: Chat with your complete digital footprint using natural language.

CatchMe Web Dashboard

💡 CatchMe Architecture

CatchMe transforms raw digital activity into structured, searchable memory through three concurrent stages:

🔄 Record → Organize → Reason: Turn digital chaos into queryable memory

Capture. Six background recorders silently track your activity. They monitor window focus, keystrokes, mouse movement, screenshots, clipboard, and notifications.

Index. Raw events auto-organize into a Hierarchical Activity Tree: Day → Session → App → Location → Action. Each node gets an LLM-generated summary, which gives fast, meaningful recall without vector embeddings.

Retrieve. You ask a question. The LLM traverses your memory tree top-down, selects relevant nodes, and inspects raw data such as screenshots or keystrokes. It then synthesizes a precise answer.

CatchMe Pipeline: Capturing → Indexing → Retrieving

🌲 Hierarchical Activity Tree

The Activity Tree is CatchMe's memory core. It provides structured, multi-level views of your digital life. Browse high-level summaries or dive into granular details.
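
As a rough illustration of how flat events could be grouped into the five tiers (Day → Session → App → Location → Action), here is a minimal sketch. The `Event` fields, the `Node` class, and `build_tree` are hypothetical names chosen for this example and are not CatchMe's actual data model. Summaries are left as empty placeholders where an LLM would fill them in.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    # Hypothetical event record; field names are assumptions for this sketch.
    day: str        # e.g. "2025-01-01"
    session: str    # e.g. "morning"
    app: str        # e.g. "VS Code"
    location: str   # e.g. a file path or URL
    action: str     # e.g. "ran tests"


@dataclass
class Node:
    label: str
    summary: str = ""  # placeholder for an LLM-generated summary
    children: dict = field(default_factory=dict)

    def child(self, label: str) -> "Node":
        # Return the existing child with this label, creating it if needed.
        return self.children.setdefault(label, Node(label))


def build_tree(events: list[Event]) -> Node:
    """Group flat events into Day -> Session -> App -> Location -> Action."""
    root = Node("root")
    for e in events:
        node = root
        for label in (e.day, e.session, e.app, e.location, e.action):
            node = node.child(label)
    return root


events = [
    Event("2025-01-01", "morning", "VS Code", "main.py", "edited function"),
    Event("2025-01-01", "morning", "VS Code", "main.py", "ran tests"),
    Event("2025-01-01", "morning", "Chrome", "docs.python.org", "read page"),
]
tree = build_tree(events)
```

Events that share a day, session, app, and location end up under the same Location node, so a summary written at that node can describe both actions at once.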

Hierarchical Activity Tree Structure

🔍 Intelligent Tree Retrieval

CatchMe skips traditional vector search. Instead, the LLM directly navigates your Activity Tree. This enables complex, cross-day reasoning and precise evidence gathering from raw activity history.
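
The top-down idea can be sketched in a few lines. This is only an illustration under stated assumptions: the nested-dict tree shape, the `retrieve_top_down` function, and the `max_select` parameter are hypothetical, and `score` is a plain keyword-overlap stand-in for the relevance judgment that CatchMe delegates to an LLM.

```python
def score(question: str, summary: str) -> int:
    # Stand-in for the LLM's relevance judgment: count shared words.
    q_words = set(question.lower().split())
    return len(q_words & set(summary.lower().split()))


def retrieve_top_down(tree: dict, question: str, max_select: int = 2) -> list[str]:
    """Walk the tree from the root, following only the most relevant branches.

    `tree` is a nested dict: {"summary": str, "children": [subtrees]}.
    Nodes without children are treated as raw evidence and returned.
    """
    evidence = []
    frontier = [tree]
    while frontier:
        node = frontier.pop()
        children = node.get("children", [])
        if not children:
            evidence.append(node["summary"])
            continue
        # Rank children by relevance; keep the best few with a nonzero score.
        ranked = sorted(children, key=lambda c: score(question, c["summary"]), reverse=True)
        frontier.extend(c for c in ranked[:max_select] if score(question, c["summary"]) > 0)
    return evidence


tree = {
    "summary": "monday coding and reading",
    "children": [
        {"summary": "coding session in editor", "children": [
            {"summary": "edited parser module while coding"},
            {"summary": "coding tests for parser"},
        ]},
        {"summary": "reading news articles", "children": [
            {"summary": "read sports news"},
        ]},
    ],
}
evidence = retrieve_top_down(tree, "what was I coding")
```

Only the coding branch is expanded here, so the reading branch is never inspected. Pruning whole subtrees from their summaries is what keeps this kind of search cheap without an embedding index.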

Tree-based Retrieval Process

📖 Learn More: Detailed design insights and technical deep-dive available in our blog.

🧠 LLM Configuration

❗️ Data Privacy Notice

100% Local Storage: All raw data (screenshots, keystrokes, activity trees) stays in ~/data/ and never leaves your machine.

Offline-First Options: Local LLMs (Ollama, vLLM, LM Studio) enable fully offline operation without any cloud dependency.

⚠️ Cloud Provider Caution: If you use a cloud provider, its API will summarize your daily activities. Untrusted endpoints may expose private data, so review your provider's data policies carefully.

📋 Requirements

Multimodal support: Your model should be able to handle text + images.

Context window: Make sure your model's context window exceeds the max_tokens limits in config.json.

Cost control: To cap costs, set a limit via llm.max_calls, or increase filter.mouse_cluster_gap to reduce summarization frequency.
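
For example, the two cost-related settings could be applied to a configuration dictionary like this. It is a minimal sketch: `apply_cost_limits` is a hypothetical helper, and the values 20 and 6.0 are arbitrary illustrations rather than recommended settings. Only the key names `llm.max_calls` and `filter.mouse_cluster_gap` come from the parameter reference.

```python
import copy
import json


def apply_cost_limits(config: dict, max_calls: int, mouse_cluster_gap: float) -> dict:
    """Return a copy of `config` with the two cost-related settings applied."""
    updated = copy.deepcopy(config)
    updated.setdefault("llm", {})["max_calls"] = max_calls
    updated.setdefault("filter", {})["mouse_cluster_gap"] = mouse_cluster_gap
    return updated


base = {"llm": {"provider": "ollama", "model": "gemma3:4b"}}
limited = apply_cost_limits(base, max_calls=20, mouse_cluster_gap=6.0)

# The printed JSON can be merged into config.json by hand.
print(json.dumps(limited, indent=4))
```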

CatchMe requires an LLM for background summarization and intelligent retrieval. Use catchme init (see Get Started) for guided setup, or follow the manual configuration steps below.

For cloud API services:

{
    "llm": {
        "provider": "openrouter",
        "api_key": "sk-or-...",
        "api_url": null,
        "model": "google/gemini-3-flash-preview"
    }
}

For local/offline operation:

{
    "llm": {
        "provider": "ollama",
        "api_key": null,
        "api_url": null,
        "model": "gemma3:4b"
    }
}

Supported LLM Providers

| Provider | Config name | Default API URL | Get Key |
| --- | --- | --- | --- |
| OpenRouter (gateway) | openrouter | https://openrouter.ai/api/v1 | openrouter.ai/keys |
| AiHubMix (gateway) | aihubmix | https://aihubmix.com/v1 | aihubmix.com |
| SiliconFlow (gateway) | siliconflow | https://api.siliconflow.cn/v1 | cloud.siliconflow.cn |
| OpenAI | openai | https://api.openai.com/v1 | platform.openai.com |
| Anthropic | anthropic | https://api.anthropic.com/v1 | console.anthropic.com |
| DeepSeek | deepseek | https://api.deepseek.com/v1 | platform.deepseek.com |
| Gemini | gemini | https://generativelanguage.googleapis.com/v1beta | aistudio.google.com |
| Groq | groq | https://api.groq.com/openai/v1 | console.groq.com |
| Mistral | mistral | https://api.mistral.ai/v1 | console.mistral.ai |
| Moonshot / Kimi | moonshot | https://api.moonshot.ai/v1 | platform.moonshot.cn |
| MiniMax | minimax | https://api.minimax.io/v1 | platform.minimaxi.com |
| Zhipu AI (GLM) | zhipu | https://open.bigmodel.cn/api/paas/v4 | open.bigmodel.cn |
| DashScope (Qwen) | dashscope | https://dashscope.aliyuncs.com/compatible-mode/v1 | dashscope.console.aliyun.com |
| VolcEngine | volcengine | https://ark.cn-beijing.volces.com/api/v3 | console.volcengine.com |
| VolcEngine Coding | volcengine_coding_plan | https://ark.cn-beijing.volces.com/api/coding/v3 | console.volcengine.com |
| BytePlus | byteplus | https://ark.ap-southeast.bytepluses.com/api/v3 | console.byteplus.com |
| BytePlus Coding | byteplus_coding_plan | https://ark.ap-southeast.bytepluses.com/api/coding/v3 | console.byteplus.com |
| Ollama (local) | ollama | http://localhost:11434/v1 | |
| vLLM (local) | vllm | http://localhost:8000/v1 | |
| LM Studio (local) | lmstudio | http://localhost:1234/v1 | |

Any OpenAI-compatible endpoint works — just set api_url and api_key directly.

All Configuration Parameters

| Section | Parameter | Default | Description |
| --- | --- | --- | --- |
| web | host | 127.0.0.1 | Dashboard bind address |
| | port | 8765 | Dashboard port |
| llm | provider | | LLM provider name (see table above) |
| | api_key | | API key for the provider |
| | api_url | (auto) | Custom endpoint; auto-set per provider if omitted |
| | model | | Model name (provider-specific) |
| | wire_api | (omit) | Set to "responses" for providers that only expose POST /v1/responses instead of chat completions |
| | max_calls | 0 | Max LLM calls per cycle (0 = unlimited; set to limit costs) |
| | max_images_per_cluster | 5 | Max screenshots sent per event cluster |
| filter | window_min_dwell | 3.0 | Min window dwell time (sec) before recording |
| | keyboard_cluster_gap | 3.0 | Keyboard event clustering gap (sec) |
| | mouse_cluster_gap | 3.0 | Time gap (sec) to merge mouse events; larger values reduce LLM summaries |
| summarize | language | en | Summary output language (en, zh, etc.) |
| | max_tokens_l0 – l3 | 1200 | Max tokens per tree level (L0=Action … L3=Session) |
| | temperature | 0.4 | LLM temperature for summarization |
| | max_workers | 2 | Concurrent summarization workers |
| | debounce_sec | 3.0 | Debounce before triggering summary |
| | save_interval_sec | 5.0 | Tree auto-save interval |
| retrieve | max_prompt_chars | 42000 | Max chars in retrieval prompt |
| | max_iterations | 15 | Max tree traversal iterations |
| | max_file_chars | 8000 | Max chars from extracted files |
| | max_select_nodes | 7 | Max nodes selected per iteration |
| | max_tokens_step | 4096 | Max tokens per retrieval step |
| | max_tokens_answer | 8192 | Max tokens for final answer |
| | temperature_select | 0.3 | Temperature for node selection |
| | temperature_answer | 0.5 | Temperature for answer generation |
| | temperature_time_resolve | 0.1 | Temperature for time resolution |
| | max_tokens_time_resolve | 1000 | Max tokens for time resolution |
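
To see why a larger cluster gap means fewer summaries, consider a simple gap-based grouping of event timestamps. The README does not document CatchMe's actual clustering logic, so `cluster_by_gap` below is an assumed, commonly used approach shown only for intuition.

```python
def cluster_by_gap(timestamps: list[float], gap: float = 3.0) -> list[list[float]]:
    """Group timestamps; start a new cluster when the pause exceeds `gap`."""
    clusters: list[list[float]] = []
    for t in sorted(timestamps):
        if clusters and t - clusters[-1][-1] <= gap:
            clusters[-1].append(t)   # close enough: extend the current cluster
        else:
            clusters.append([t])     # long pause: open a new cluster
    return clusters


clicks = [0.0, 1.0, 2.5, 10.0, 11.0, 30.0]  # seconds
print(len(cluster_by_gap(clicks, gap=3.0)))   # 3 clusters
print(len(cluster_by_gap(clicks, gap=10.0)))  # 2 clusters
```

If each cluster triggers one summarization call, raising the gap from 3 to 10 seconds merges the first two bursts and saves a call, at the price of coarser summaries.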

🚀 Get Started

📦 Install

git clone https://github.com/HKUDS/catchme.git && cd catchme

conda create -n catchme python=3.11 -y && conda activate catchme

pip install -e .

macOS: grant Accessibility, Input Monitoring, and Screen Recording permissions in System Settings → Privacy & Security.

Windows: run as Administrator for global input monitoring.

⚡ Init

catchme init                  # interactive setup: provider, API key, LLM model

🔥 Run

catchme awake                 # start recording
catchme web                   # visualize and chat

# or through the CLI
catchme ask -- "What am I doing today?"

Full CLI Reference

| Command | Description |
| --- | --- |
| catchme awake | Start the recording daemon |
| catchme web [-p PORT] | Launch web dashboard (default http://127.0.0.1:8765) |
| catchme ask -- "question" | Query your activity in natural language |
| catchme cost | Show LLM token usage (last 10 min / today / all time) |
| catchme disk | Show storage breakdown & event count |
| catchme ram | Show memory usage of running processes |
| catchme init | Interactive setup: LLM provider, API key & model |

🦞 CatchMe Makes Your Agents Truly Personal

CatchMe ships as an agent-compatible skill for CLI agents (OpenClaw, NanoBot, Claude, Cursor, etc.).

🪶 Agent Integration, Option A (Light Skill): Run CatchMe independently. Your agents query memories via CLI commands only.

# 1. Start CatchMe yourself
catchme awake

# 2. Give the light skill to your agent
cp CATCHME-light.md ~/.cursor/skills/catchme/SKILL.md

Option B (Full Skill): the agent manages the full CatchMe lifecycle autonomously.

cp CATCHME-full.md ~/.cursor/skills/catchme/SKILL.md
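
If your agent is driven from Python rather than a skill file, the same CLI can be wrapped with the standard library. This is a hedged sketch: `build_ask_command` and `ask_catchme` are hypothetical helpers, the 60-second timeout is an arbitrary assumption, and the format of the CLI's output is not specified here, so the text is returned unparsed.

```python
import subprocess


def build_ask_command(question: str) -> list[str]:
    # Mirrors the documented CLI form: catchme ask -- "question"
    return ["catchme", "ask", "--", question]


def ask_catchme(question: str, timeout: float = 60.0) -> str:
    """Run the query and return whatever the CLI prints to stdout."""
    result = subprocess.run(
        build_ask_command(question),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a nonzero exit code
    )
    return result.stdout.strip()


# Example (requires CatchMe to be installed and configured):
# print(ask_catchme("Which files did I change today?"))
```

Passing the question as its own list element avoids shell quoting problems, since no shell is involved.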

🔧 Integrate into your current workflow

from catchme import CatchMe
from catchme.pipelines.retrieve import retrieve

# 1. One-line search — fast keyword lookup over all recorded activity
with CatchMe() as mem:
    for e in mem.search("meeting notes"):
        print(e.timestamp, e.data)

# 2. LLM-powered retrieval — natural language Q&A over your screen history
for step in retrieve("What was I working on this morning?"):
    if step["type"] == "answer":
        print(step["content"])

📊 Cost & Efficiency

Benchmarked with 2 hours of intensive, continuous computer use on a MacBook Air M4.

| Metric | Value |
| --- | --- |
| Runtime RAM | ~0.2 GB |
| Disk Usage | ~200 MB |
| Token Throughput | input ~6 M, output ~0.7 M |
| LLM cost | qwen-3.5-plus: ~$0.42 via Aliyun DashScope |
| LLM cost | gemini-3-flash-preview: ~$5.00 via OpenRouter |
| Full Retrieval Speed (depends on question) | 5-20 s per query using gemini-3-flash-preview |
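
For a rough sense of scale, the 2-hour benchmark figures can be extrapolated linearly. This is back-of-the-envelope arithmetic only: it assumes your usage is as intensive as the benchmark and that cost grows in proportion to time, neither of which is guaranteed. The helper names are hypothetical.

```python
HOURS_BENCHMARKED = 2
INPUT_TOKENS = 6_000_000
OUTPUT_TOKENS = 700_000
COST_USD = {"qwen-3.5-plus": 0.42, "gemini-3-flash-preview": 5.00}


def per_hour(value: float) -> float:
    """Average per-hour rate over the benchmark window."""
    return value / HOURS_BENCHMARKED


def projected_cost(model: str, hours: float) -> float:
    """Linear extrapolation of the benchmark cost to `hours` of similar use."""
    return round(per_hour(COST_USD[model]) * hours, 2)


print(per_hour(INPUT_TOKENS))                       # 3000000.0 input tokens per hour
print(projected_cost("qwen-3.5-plus", 8))           # 1.68
print(projected_cost("gemini-3-flash-preview", 8))  # 20.0
```

Under these assumptions, an 8-hour day of equally intensive use would cost about $1.68 with qwen-3.5-plus and about $20 with gemini-3-flash-preview.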

🚀 Roadmap

CatchMe evolves with community input. Upcoming features include:

Multi-Device Recording. Capture and unify GUI activities across all your machines via LAN synchronization.

Dynamic Clustering. Adaptive clustering algorithms that better reflect your actual work patterns and flows, reducing unnecessary costs.

Enhanced Data Utilization. Unlock deeper insights from screenshots and metadata beyond current processing pipelines.

🌟 Star this repo to follow our future updates — your interest keeps us motivated!

We welcome contributions of any kind - whether it's a comment, a bug report, a feature idea, or a pull request. See CONTRIBUTING.md to get started.

🤝 Community

Acknowledgments

CatchMe is inspired by these excellent open-source projects:

| Project | Inspiration |
| --- | --- |
| ActivityWatch | Pioneering open-source activity tracking |
| Screenpipe | Screen recording infrastructure for AI agents |
| Windrecorder | Personal screen recording & search on Windows |
| OpenRecall | Open-source alternative to Windows Recall |
| Selfspy | Classic daemon-style activity logging |
| PageIndex | Tree-structured document retrieval without embeddings |
| MineContext | Proactive context-aware AI partner & screen capture |

🏛️ Ecosystem

CatchMe is part of the HKUDS agent ecosystem — building the infrastructure layer for personal AI agents:

• NanoBot: Ultra-Lightweight Personal AI Assistant
• CLI-Anything: Making All Software Agent-Native
• ClawWork: AI Assistant → AI Coworker Evolution
• ClawTeam: Agent Swarm Intelligence for Full Team Automation

Thanks for visiting ✨ CatchMe