TryVoice

Hands-free voice runtime for AI agents. Talk to your AI coding assistant without touching the keyboard.

TryVoice wraps AI agents (like OpenClaw and Claude Code) into a voice interface with wake word activation, push-to-talk, and real-time streaming — all running in your browser.

Early Preview (v0.1.0-alpha) — actively developed, expect rough edges.

TryVoice Demo — multi-bot voice interaction with cross-device sync

What It Does

  • Wake word activation — say a keyword to start talking, no hands needed (powered by OpenWakeWord)
  • Push-to-talk — hold a button to speak, release to send
  • Real-time streaming — hear the AI respond as it generates, with interruptible playback
  • Multi-bot slots — run multiple independent agent sessions side by side
  • Mobile-ready — PWA support, works on phone browsers
  • Pluggable adapters — connect any AI agent via the Adapter SDK

Prerequisites

TryVoice is a voice layer on top of existing AI agents. You need at least one of:

  • Claude Code — installed on the same machine (claude CLI available in PATH)
  • OpenClaw — running with a gateway endpoint

More agent adapters coming soon. See Building an Adapter to connect your own agent.
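
One quick way to confirm the Claude Code prerequisite is to look up the claude executable on PATH. This is a minimal sketch using only the Python standard library; the helper name find_cli is hypothetical and not part of TryVoice.

```python
import shutil
from typing import Optional


def find_cli(name: str = "claude") -> Optional[str]:
    """Return the full path of an executable on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_cli()
    if path:
        print(f"claude CLI found at {path}")
    else:
        print("claude CLI not found on PATH; install Claude Code or use OpenClaw")
```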

Quick Start

Option A: Install from PyPI (recommended)

pip install tryvoice
tryvoice    # Start the server and open browser
# First launch shows Setup Wizard in browser: configure adapter, TTS, etc.
# If "command not found", try: python3 -m backend.cli

Option B: Install from source

git clone https://github.com/AaronZ021/tryvoice-oss.git
cd tryvoice-oss    # git clone names the directory after the repository
bash scripts/setup.sh    # Creates venv, installs packages, builds frontend
source .venv/bin/activate
tryvoice    # Start the server and open browser
# First launch shows Setup Wizard in browser

Configure

On first launch, the browser opens a Setup Wizard that walks you through:

  1. API Keys (optional but recommended) — enter a Groq API key for faster speech-to-text (lower latency than local Whisper), and an Azure Speech key for high-quality text-to-speech
  2. Adapter — choose Claude Code or OpenClaw and enter connection details
  3. Wake word — pick a keyword (e.g., "jarvis", "americano") for hands-free voice activation

All settings can be changed later from the in-app settings panel.

Docker

git clone https://github.com/AaronZ021/tryvoice-oss.git
cd tryvoice-oss    # git clone names the directory after the repository
docker compose up
# Open https://localhost:7860 (Setup Wizard runs on first launch)

Architecture

┌─────────────┐     WebSocket      ┌──────────────────┐
│  Browser UI  │◄──────────────────►│   TryVoice       │
│  (PWA)       │                    │   Runtime         │
│              │                    │                   │
│  Wake Word   │                    │  ┌────────────┐   │
│  STT / TTS   │                    │  │  Adapter    │   │──► Claude Code
│  Audio I/O   │                    │  │  Registry   │   │──► OpenClaw
│              │                    │  │  (plugin)   │   │──► Your adapter
└─────────────┘                    └──┴────────────┴───┘

Voice flow: Wake word / PTT → STT (browser Web Speech API or Groq Whisper) → Adapter → Agent → Streaming text → TTS (Edge TTS) → Audio playback
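
The flow above can be pictured as a chain of asynchronous stages. The sketch below uses stub functions in place of real STT, agent, and TTS providers. Every name in it (transcribe, agent_stream, synthesize, run_turn) is hypothetical and does not reflect TryVoice's internal API.

```python
import asyncio
from typing import AsyncIterator, List


async def transcribe(audio: bytes) -> str:
    """Stub STT: pretend the audio bytes are UTF-8 text."""
    return audio.decode("utf-8")


async def agent_stream(prompt: str) -> AsyncIterator[str]:
    """Stub agent: stream the reply one word at a time."""
    for word in f"You said: {prompt}".split():
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield word


async def synthesize(text: str) -> bytes:
    """Stub TTS: return the text as bytes instead of audio."""
    return text.encode("utf-8")


async def run_turn(audio: bytes) -> List[bytes]:
    """One voice turn: STT -> agent -> streaming text -> TTS chunks."""
    prompt = await transcribe(audio)
    chunks: List[bytes] = []
    async for token in agent_stream(prompt):
        # Synthesize each token as it arrives instead of waiting for the full reply.
        chunks.append(await synthesize(token))
    return chunks
```

Synthesizing per token is what lets playback start before the agent finishes. A real pipeline would more likely batch tokens into phrases before sending them to TTS.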

Configuration

| Variable | Default | Description |
|---|---|---|
| TRYVOICE_ACTIVE_ADAPTER | echo | Active adapter (claude-code, openclaw, or custom) |
| GROQ_API_KEY | (none) | Groq API key for server-side STT (optional, browser fallback) |
| EDGE_TTS_VOICE | zh-CN-XiaoxiaoNeural | Edge TTS voice (300+ voices available) |
| PORT | 7860 | Server port |

See .env.example for all options, or run tryvoice --setup for an interactive wizard.
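
The variables listed above can be read with their documented defaults as in the sketch below. The Settings dataclass and load_settings helper are hypothetical names for illustration; TryVoice's actual configuration loader may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    active_adapter: str
    groq_api_key: Optional[str]
    edge_tts_voice: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        active_adapter=env.get("TRYVOICE_ACTIVE_ADAPTER", "echo"),
        groq_api_key=env.get("GROQ_API_KEY") or None,  # unset or empty -> None
        edge_tts_voice=env.get("EDGE_TTS_VOICE", "zh-CN-XiaoxiaoNeural"),
        port=int(env.get("PORT", "7860")),
    )
```

Passing a plain dict instead of os.environ makes the defaults easy to check, for example load_settings({}).port gives 7860.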

Built-in Adapters

| Adapter | Use Case |
|---|---|
| claude-code | Voice control for Claude Code terminal sessions |
| openclaw | Voice interface to OpenClaw agent gateway |
| echo | Testing and demo (echoes your speech back) |

Building an Adapter

Connect TryVoice to any AI agent by implementing the Adapter protocol:

from backend.adapter_sdk import AdapterCapabilities, AdapterEvent

class MyAdapter:
    def report_capabilities(self) -> AdapterCapabilities:
        return AdapterCapabilities(supports_stream=True, ...)

    async def stream_user_turn(self, session_key, text, ...):
        # Call your agent, yield AdapterEvent chunks
        yield AdapterEvent(kind="token", text="Hello!")
        yield AdapterEvent(kind="turn_end")
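
To try the streaming contract without installing TryVoice, the sketch below defines a stand-in AdapterEvent dataclass and a toy adapter that returns the user's words in upper case. The stand-in mirrors only the kind and text fields used in the example above; the real classes in backend.adapter_sdk may carry more fields and parameters. UppercaseAdapter and collect_reply are hypothetical names.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class AdapterEvent:
    """Stand-in for backend.adapter_sdk.AdapterEvent (assumed fields only)."""
    kind: str
    text: str = ""


class UppercaseAdapter:
    """Toy adapter: streams the user's words back in upper case."""

    async def stream_user_turn(self, session_key: str, text: str) -> AsyncIterator[AdapterEvent]:
        for word in text.split():
            yield AdapterEvent(kind="token", text=word.upper())
        yield AdapterEvent(kind="turn_end")


async def collect_reply(adapter: UppercaseAdapter, text: str) -> str:
    """Consume events until turn_end and join the token texts."""
    words: List[str] = []
    async for event in adapter.stream_user_turn("demo-session", text):
        if event.kind == "turn_end":
            break
        words.append(event.text)
    return " ".join(words)
```

A consumer like collect_reply is also a convenient test harness for a new adapter before wiring it into the runtime.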

Register via entry point in pyproject.toml:

[project.entry-points."tryvoice.adapters"]
my-agent = "my_package.adapter:MyAdapter"
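
Entry points declared this way become visible through the standard library's importlib.metadata once the package is installed. How TryVoice itself loads adapters is not shown in this README, so the sketch below only lists what is registered under the group. The helper name discover_adapters is hypothetical.

```python
from importlib.metadata import entry_points
from typing import Dict


def discover_adapters(group: str = "tryvoice.adapters") -> Dict[str, str]:
    """Map each entry-point name in the group to its 'module:attr' target."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+
        selected = eps.select(group=group)
    else:  # Python 3.9: entry_points() returns a dict keyed by group
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    for name, target in discover_adapters().items():
        print(f"{name} -> {target}")
```

If the adapter does not appear in this listing, reinstalling the package (for example with pip install -e .) usually refreshes the entry-point metadata.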

Development

Prerequisites

  • Python 3.9+ (3.11 recommended)
  • Node.js 20+ (for frontend build)

Setup

git clone https://github.com/AaronZ021/tryvoice-oss.git
cd tryvoice-oss    # git clone names the directory after the repository
bash scripts/setup.sh
source .venv/bin/activate
tryvoice

Project structure

tryvoice/
├── apps/
│   ├── host-runtime/      # Python FastAPI backend (adapter layer, session FSM, voice providers)
│   └── client-web/        # TypeScript frontend (Vite, state machine, wake word, audio)
├── scripts/               # Setup and build scripts
├── pyproject.toml          # Python package config
├── Dockerfile              # Multi-stage build (Node + Python)
└── docker-compose.yml      # Single-command deployment

License

Apache License 2.0 — see LICENSE.
