# ChromeClaw
A lightweight [OpenClaw](https://github.com/openclaw)-inspired AI agent running entirely in the Chrome browser sandbox, with multi-provider LLM support, messaging channels (WhatsApp, Telegram), voice (TTS/STT), memory, agents, and browser automation.
[Install from the Chrome Web Store](https://chromewebstore.google.com/detail/chromeclaw-your-own-perso/lnahopfgnfhcfchffbckmbbkopcmojme)
## Overview
ChromeClaw brings the capabilities of a full AI agent platform into a Chrome extension that is super easy to install and set up: just load the extension, add an API key, and start chatting. No server, no Docker, no CLI. Protected by the modern browser sandbox and inspired by the [OpenClaw](https://github.com/openclaw) project, it delivers a lightweight, self-contained alternative that runs entirely in the browser's side panel. It supports multiple LLM providers (OpenAI, Anthropic, Google, OpenRouter, and any OpenAI-compatible endpoint) using your own API keys. Beyond chat, it connects to WhatsApp and Telegram as messaging channels, speaks and listens via local or cloud TTS/STT, and remembers context across sessions with a hybrid memory system.
## Features
- **Multi-provider LLM support**: OpenAI, Anthropic, Google, OpenRouter, custom endpoints
- **Streaming responses**: Real-time text and reasoning deltas with markdown rendering
- **Messaging channels**: WhatsApp (Baileys WebSocket client) and Telegram (Bot API long-polling) via offscreen document
- **Voice**: TTS (Kokoro local ONNX + OpenAI cloud), STT (Whisper local via Transformers.js + OpenAI cloud)
- **Memory system**: BM25 full-text search + optional vector embeddings with MMR re-ranking and temporal decay
- **Multi-agent system**: Named agents with per-agent models, tools, workspace files, and custom JS tools
- **Tool calling**: 25+ built-in tools including web search, documents, browser automation, Google services, and more
- **Google integration**: Gmail, Calendar, Drive tools via OAuth (`chrome.identity`)
- **Deep research**: Multi-step autonomous research with parallel search, fetch, and synthesize phases
- **Browser automation**: Chrome DevTools Protocol with DOM snapshots, click/type, screenshots, JS evaluation
- **Local LLM**: On-device inference via Transformers.js (WebGPU/WASM)
- **Cron/scheduler**: Alarm-based one-shot, interval, and cron-expression tasks with optional channel delivery
- **Custom tools**: Register workspace JS files as callable LLM tools with `@tool` metadata comments
- **Context compaction**: Sliding-window + LLM summarization when approaching token limits; adaptive multi-part summarization for very long histories
- **Session journaling**: Auto-converts chat transcripts to durable memory entries on session end
- **Artifacts**: Create and view text, code, spreadsheets, and images
- **Chat history**: Persistent IndexedDB storage with search, date grouping, and auto-titling
- **Reasoning display**: Collapsible thinking/reasoning output for supported models
- **Workspace files**: Attach AGENTS.md, SOUL.md, USER.md, IDENTITY.md, TOOLS.md, MEMORY.md, and custom files as persistent LLM context
- **Skills system**: Configurable prompt templates with variable substitution
- **Firefox support**: Cross-browser builds via a single flag
## Architecture
```
┌────────────────────────────────────────────────────────────────────────────┐
│  Chrome Extension (Manifest V3, React + Vite + TypeScript + Tailwind)      │
│                                                                            │
│  ┌───────────────────────┐  ┌──────────────────┐  ┌────────────────────┐   │
│  │ Side Panel            │  │ Full-Page Chat   │  │ Options            │   │
│  │ - Chat UI + Streaming │  │ - Push sidebar   │  │ - Model config     │   │
│  │ - Artifacts           │  │   mode           │  │ - Tool management  │   │
│  │ - Chat history        │  │                  │  │ - Channel setup    │   │
│  │ - Voice input/output  │  │                  │  │ - Agent management │   │
│  └───────────┬───────────┘  └─────────┬────────┘  └──────────┬─────────┘   │
│              └────────────────────────┼──────────────────────┘             │
│                                       │  chrome.runtime.Port / sendMessage │
│                                       ▼                                    │
│  ┌──────────────────────────────────────────────────────────────────────┐  │
│  │  Background Service Worker                                           │  │
│  │                                                                      │  │
│  │ ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐ │  │
│  │ │ Agent    │  │ Tools    │  │ Memory   │  │ Cron     │  │ Channels │ │  │
│  │ │ System   │  │ (25+)    │  │ (BM25 +  │  │ Scheduler│  │ Registry │ │  │
│  │ │          │  │          │  │  vectors)│  │          │  │          │ │  │
│  │ └─────┬────┘  └─────┬────┘  └─────┬────┘  └─────┬────┘  └─────┬────┘ │  │
│  │       └─────────────┴─────────────┼─────────────┴─────────────┘      │  │
│  │                                   │                                  │  │
│  │              ┌────────────────────┴────────────────────┐             │  │
│  │              │  Provider Factory + Context Compaction  │             │  │
│  │              │   pi-mono streamSimple() / Local LLM    │             │  │
│  │              └────────────────────┬────────────────────┘             │  │
│  └───────────────────────────────────┬──────────────────────────────────┘  │
│                                      │                                     │
│  ┌───────────────────────────────────┴──────────────────────────────────┐  │
│  │  Offscreen Document (persistent)                                     │  │
│  │                                                                      │  │
│  │ ┌────────────┐  ┌────────────┐  ┌──────────┐  ┌────────────────────┐ │  │
│  │ │ WhatsApp   │  │ Telegram   │  │ Kokoro   │  │ Whisper STT /      │ │  │
│  │ │ Worker     │  │ Worker     │  │ TTS      │  │ Local LLM Worker   │ │  │
│  │ │ (Baileys)  │  │ (Bot API)  │  │ Worker   │  │ (Transformers.js)  │ │  │
│  │ └──────┬─────┘  └──────┬─────┘  └─────┬────┘  └──────────┬─────────┘ │  │
│  └────────┼───────────────┼──────────────┼──────────────────┼───────────┘  │
│           │               │              │                  │              │
└───────────┼───────────────┼──────────────┼──────────────────┼──────────────┘
            │               │              │                  │
            ▼               ▼              ▼                  ▼
    ┌──────────────┐ ┌────────────┐  ┌──────────┐     ┌──────────────┐
    │ WhatsApp     │ │ Telegram   │  │ Audio    │     │ On-device    │
    │ (WebSocket)  │ │ Bot API    │  │ Output   │     │ Inference    │
    └──────────────┘ └────────────┘  └──────────┘     └──────────────┘

┌──────────────────────┐
│ External Services    │
│                      │
│ ┌──────────────────┐ │
│ │ LLM Providers    │ │
│ │ - OpenAI         │ │
│ │ - Anthropic      │ │
│ │ - Google         │ │
│ │ - OpenRouter     │ │
│ │ - Custom         │ │
│ └──────────────────┘ │
└──────────────────────┘

Storage:
  chrome.storage (local/session) ── settings, tool configs
  IndexedDB (Dexie.js)           ── chats, messages, artifacts, agents, models,
                                    workspaceFiles, memoryChunks, scheduledTasks,
                                    taskRunLogs, embeddingCache
```
## Tech Stack
| Category | Technology |
|----------|------------|
| UI | React 19, TypeScript, Tailwind CSS, shadcn/ui, Radix UI, Lucide icons, Framer Motion |
| AI/LLM & Agents | pi-mono (`@mariozechner/pi-ai`, `@mariozechner/pi-agent-core`) |
| AI/ML (local) | Transformers.js (local inference, embeddings, Whisper STT), ONNX Runtime Web (WebGPU/WASM) |
| Channels | Baileys 6.x (WhatsApp WebSocket client), Telegram Bot API (direct HTTP long-polling) |
| Voice | Kokoro-JS + Kokoro-82M ONNX (local TTS), OpenAI TTS API, Whisper ONNX (local STT) |
| Storage | Dexie.js 4 (IndexedDB), Chrome Storage API |
| Auth | Google OAuth (`chrome.identity`) |
| Build | Vite 6, Turborepo, pnpm workspaces |
| Testing | Vitest, Playwright |
| Code Quality | ESLint (flat config), Prettier, TypeScript strict mode |
## Getting Started
### Prerequisites
- **Node.js** ≥ 22.15.1
- **pnpm** 10.x
### Install & Build
```bash
pnpm install
pnpm build
```
### Install from Chrome Web Store
Install ChromeClaw directly from the [Chrome Web Store](https://chromewebstore.google.com/detail/chromeclaw-your-own-perso/lnahopfgnfhcfchffbckmbbkopcmojme); no build step is required.
### Load from Source
1. Open `chrome://extensions`
2. Enable **Developer mode** (top-right toggle)
3. Click **Load unpacked**
4. Select the `dist/` directory
5. Open any page and click the ChromeClaw icon to open the side panel
### First Run
No login required. Open the Options page, add your API key for any supported provider, select a model, and start chatting.
## Project Structure
```
chrome-extension/                # Background service worker
└── src/background/
    ├── index.ts                 # Main background entry
    ├── local-llm-bridge.ts      # Local model IPC bridge
    ├── agents/                  # Agent system (loop, setup, model adapter, streaming)
    ├── channels/                # Channel registry + adapters (WhatsApp, Telegram)
    ├── context/                 # Context compaction + summarization
    ├── cron/                    # Scheduler service (alarms, executor, store)
    ├── errors/                  # Error handling
    ├── logging/                 # Logging utilities
    ├── media-understanding/     # Speech-to-text, media transcription
    ├── memory/                  # Memory system (BM25, embeddings, MMR, journaling)
    ├── tts/                     # TTS engine routing (Kokoro bridge, OpenAI)
    └── tools/                   # All tool implementations
pages/                           # Extension UI pages
├── side-panel/                  # Primary chat interface
├── full-page-chat/              # Full-page chat (push sidebar mode)
├── options/                     # Settings & configuration
└── offscreen-channels/          # Offscreen document: WhatsApp, Telegram,
                                 # Kokoro TTS, Whisper STT, local LLM workers
packages/                        # Shared monorepo packages
├── baileys/                     # Bundled Baileys fork (WhatsApp Web client)
├── config-panels/               # Options page tab panels and tab group definitions
├── shared/                      # Types, hooks, prompts, env config
├── skills/                      # Skill template loading and parsing
├── storage/                     # Chrome storage + IndexedDB (Dexie.js)
├── ui/                          # shadcn/ui components
├── env/                         # Build-time environment variables
├── i18n/                        # Internationalization
└── ...                          # hmr, vite-config, tailwindcss-config, etc.
tests/                           # E2E test suites (Playwright)
package.json
turbo.json
pnpm-workspace.yaml
```
## Development
### Watch Mode
```bash
pnpm dev
```
This cleans the `dist/` folder, builds all packages, then starts Vite in watch mode via Turborepo. After loading the extension once, changes are picked up automatically (reload the extension page to apply).
### Code Quality
```bash
pnpm lint # ESLint
pnpm format:check # Prettier check
pnpm type-check # TypeScript
pnpm test # Vitest unit tests
pnpm quality # All of the above
```
### E2E Tests
```bash
pnpm build && pnpm test:e2e # Build, then run Playwright tests (Chrome)
```
### Firefox Build
```bash
pnpm build:firefox
```
## Configuration
### Model Management
Add your API key and base URL on the Options page. Supported providers: OpenAI, Anthropic, Google, OpenRouter, and any OpenAI-compatible endpoint.
**Local models**: Select a Transformers.js-compatible model for on-device inference via WebGPU or WASM. No API key required.
### Workspace Files
Workspace files provide persistent context to every conversation:
- `AGENTS.md`: Agent behavior instructions
- `SOUL.md`: Personality and tone
- `USER.md`: User-specific context
- `IDENTITY.md`: Agent identity
- `TOOLS.md`: Tool usage guidance
- `MEMORY.md`: Auto-curated memory summary
- Custom files via the workspace tool configuration
### Skills
Skills are reusable prompt templates with variable substitution (`{{variable}}`). Configure them on the Options page under the Skills tab. Skills appear as quick actions in the chat input.
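As an illustration of `{{variable}}` substitution, here is a minimal sketch. The function name `renderSkillTemplate` and the choice to leave unknown placeholders untouched are assumptions made for this example, not ChromeClaw's actual implementation.

```typescript
// Hypothetical helper: replaces {{name}} placeholders with values from `vars`.
// Placeholders without a matching value are left as-is so gaps stay visible.
function renderSkillTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w.-]+)\s*\}\}/g, (match: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : undefined;
    return value ?? match;
  });
}

const prompt = renderSkillTemplate("Summarize {{url}} in {{language}}.", {
  url: "https://example.com",
  language: "English",
});
console.log(prompt); // Summarize https://example.com in English.
```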
### Suggested Actions
Configurable quick-action buttons shown below the chat input. Managed on the Options page.
## Channels
ChromeClaw can send and receive messages on WhatsApp and Telegram. Channel workers run in a persistent offscreen document; inbound messages are routed through the agent system and replies are sent back via the same channel.
### WhatsApp
- **Connection**: QR code pairing via Baileys WebSocket client
- **Auth storage**: Credentials persisted in `chrome.storage.local`
- **Sender control**: `allowedSenderIds` allowlist, `acceptFromMe` / `acceptFromOthers` flags
- **Per-channel model**: Assign a specific model to handle WhatsApp conversations
- **Voice messages**: Inbound audio is decrypted and transcribed via STT; outbound TTS replies are sent as PTT voice messages
- **Message limits**: Long messages are auto-split at 4096 characters
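
The 4096-character split could look roughly like the sketch below. Preferring a newline or space as the cut point is an assumption added for readability of the resulting parts; the name `splitMessage` is hypothetical.

```typescript
const MAX_MESSAGE_LENGTH = 4096;

// Splits `text` into parts of at most `maxLen` UTF-16 code units.
// Assumption: break at the last newline or space inside the window when one
// exists, otherwise cut hard at `maxLen` (which may split a surrogate pair).
function splitMessage(text: string, maxLen: number = MAX_MESSAGE_LENGTH): string[] {
  const parts: string[] = [];
  let rest = text;
  while (rest.length > maxLen) {
    const window = rest.slice(0, maxLen);
    let cut = Math.max(window.lastIndexOf("\n"), window.lastIndexOf(" "));
    if (cut <= 0) cut = maxLen; // no usable whitespace: hard cut
    parts.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\s+/, ""); // drop the whitespace we split on
  }
  if (rest.length > 0) parts.push(rest);
  return parts;
}
```

Every returned part is at most `maxLen` long, so each can be sent as its own message in order.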
### Telegram
- **Connection**: Bot token with HTTP long-polling (25s poll timeout)
- **Sender control**: `allowedSenderIds` allowlist
- **Per-channel model**: Assign a specific model to handle Telegram conversations
- **Rate limiting**: Automatic retry on 429/409 responses
Both channels are configured on the Options page under the Channels section.
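
For the Telegram retry behavior, a hedged sketch of the delay decision follows. The Bot API reports a wait time in `parameters.retry_after` (seconds) on 429 responses, and returns 409 when another `getUpdates` call or a webhook is active for the same bot. The exponential backoff for 409, the 30-second cap, and the name `retryDelayMs` are assumptions for illustration only.

```typescript
// Relevant fields of a Telegram Bot API error response.
interface TelegramError {
  error_code: number;
  parameters?: { retry_after?: number };
}

// Returns how long to wait (in ms) before polling again,
// or null when the error should not be retried.
function retryDelayMs(err: TelegramError, attempt: number): number | null {
  if (err.error_code === 429) {
    // Rate limited: honor the server-provided wait, default to 1 s.
    const seconds = err.parameters?.retry_after ?? 1;
    return seconds * 1000;
  }
  if (err.error_code === 409) {
    // Conflict with another poller: back off exponentially (assumed policy).
    return Math.min(1000 * Math.pow(2, attempt), 30000);
  }
  return null;
}
```

A polling loop would call this after a failed `getUpdates` request, sleep for the returned delay, and then issue the next long-poll request.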
## Tools
Configured on the Options page under the Tools tab. Tools can be enabled/disabled globally and overridden per agent.
| Tool | Description |
|------|-------------|
| **Web Search** | Brave Search API (requires API key) |
| **Fetch URL** | Retrieve and extract content from web pages |
| **Create Document** | Create text, code, spreadsheet, and image artifacts |
| **Browser** | Chrome DevTools Protocol: DOM snapshots, click/type, screenshots, JS eval, console/network logs |
| **Read / Write / Edit / List** | Workspace file operations |
| **Memory Search** | BM25 + vector search over memory chunks |
| **Memory Get** | Retrieve a specific memory entry |
| **Deep Research** | Multi-step autonomous research with parallel search and synthesis |
| **Agent Manager** | List, create, remove, and switch between named agents |
| **Scheduler** | Create one-shot, interval, and cron-expression tasks |
| **Execute JavaScript** | Run JS in a sandboxed tab; register custom tool files |
| **Gmail** | Search, read, send, and draft emails (OAuth) |
| **Calendar** | List, create, update, and delete events (OAuth) |
| **Drive** | Search, read, and create files (OAuth) |
| **Custom JS tools** | Workspace files with `@tool` metadata, registered per agent |
## Voice
### Text-to-Speech (TTS)
| Engine | Description |
|--------|-------------|
| **Kokoro** (local) | On-device synthesis via Kokoro-82M ONNX model. Supports streaming (per-sentence) and batched modes. Configurable voice and speed. |
| **OpenAI** (cloud) | OpenAI `/audio/speech` endpoint with Opus output. Works with any OpenAI-compatible TTS API. |
### Speech-to-Text (STT)
| Engine | Description |
|--------|-------------|
| **Whisper** (local) | On-device transcription via Whisper ONNX models (tiny/base/small). Audio resampled to 16kHz mono PCM. Supports language selection. |
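
To show what "resampled to 16kHz mono PCM" involves, here is a small sketch that averages channels and resamples with linear interpolation. The interpolation method and both function names are assumptions; the README does not say how ChromeClaw resamples, and a production resampler would normally low-pass filter before downsampling.

```typescript
const WHISPER_SAMPLE_RATE = 16000;

// Averages all channels into a single mono channel.
function downmixToMono(channels: Float32Array[]): Float32Array {
  const length = channels.length > 0 ? channels[0].length : 0;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) mono[i] += channel[i] / channels.length;
  }
  return mono;
}

// Resamples by linear interpolation between neighboring input samples.
// No anti-aliasing filter is applied (simplification for this sketch).
function resampleLinear(
  input: Float32Array,
  inputRate: number,
  outputRate: number = WHISPER_SAMPLE_RATE,
): Float32Array {
  if (inputRate === outputRate) return input.slice();
  const outLength = Math.floor((input.length * outputRate) / inputRate);
  const output = new Float32Array(outLength);
  const step = inputRate / outputRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}
```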
### Auto-mode
TTS auto-mode controls when responses are spoken aloud:
- `off`: TTS disabled
- `always`: Every response is spoken
- `inbound`: Only speak responses triggered by voice input or channel messages
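
The three modes reduce to a small decision function. The sketch below is illustrative: the `ResponseTrigger` type and the name `shouldSpeak` are hypothetical, and only the mode names come from the list above.

```typescript
type TtsAutoMode = "off" | "always" | "inbound";

// Hypothetical description of what triggered the response.
type ResponseTrigger = "typed" | "voice" | "channel";

function shouldSpeak(mode: TtsAutoMode, trigger: ResponseTrigger): boolean {
  switch (mode) {
    case "off":
      return false;
    case "always":
      return true;
    case "inbound":
      // Speak only when the request arrived as voice input or a channel message.
      return trigger === "voice" || trigger === "channel";
  }
}
```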
## Memory
The memory system provides long-term context recall across sessions.
### Search
- **BM25 full-text search** over workspace file chunks (always available)
- **Optional vector embeddings** via OpenAI-compatible API for semantic search
- **Hybrid ranking** combines BM25 and vector scores with configurable weights
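
One way to picture hybrid ranking is a weighted sum of a normalized BM25 score and a vector similarity. This is a sketch under stated assumptions: BM25 is normalized by the best score in the result set, vector similarity is taken to be in [0, 1], and the 0.5/0.5 default weights and all names here are hypothetical rather than ChromeClaw's actual values.

```typescript
interface SearchHit {
  id: string;
  bm25: number;    // raw BM25 score (unbounded)
  vector?: number; // similarity in [0, 1]; absent when embeddings are disabled
}

interface HybridWeights {
  bm25: number;
  vector: number;
}

function hybridRank(
  hits: SearchHit[],
  weights: HybridWeights = { bm25: 0.5, vector: 0.5 },
): { id: string; score: number }[] {
  // Normalize BM25 by the best score so both signals share a 0..1 scale.
  const maxBm25 = hits.reduce((max, h) => Math.max(max, h.bm25), 0);
  return hits
    .map((h) => {
      const bm25Norm = maxBm25 > 0 ? h.bm25 / maxBm25 : 0;
      const vector = h.vector ?? 0;
      return { id: h.id, score: weights.bm25 * bm25Norm + weights.vector * vector };
    })
    .sort((a, b) => b.score - a.score);
}
```

With no embeddings configured, every `vector` value is missing and the ranking falls back to BM25 order.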
### Ranking
- **MMR re-ranking** (Maximal Marginal Relevance): reduces redundancy by balancing relevance against diversity (configurable lambda, default 0.7)
- **Temporal decay**: exponential decay with configurable half-life (default 30 days). Dated entries (`memory/YYYY-MM-DD.md`) decay; evergreen files (`MEMORY.md`) do not
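
The two ranking steps can be sketched as follows. Half-life decay multiplies a score by 0.5 for every `halfLifeDays` of age, and MMR greedily picks the candidate maximizing `lambda * relevance - (1 - lambda) * maxSimilarityToSelected`. The token-overlap (Jaccard) similarity and all names are assumptions for this example; the real system may compare embeddings instead.

```typescript
interface Candidate {
  id: string;
  text: string;
  score: number; // relevance after hybrid ranking
}

// Exponential decay: the score halves every `halfLifeDays`; evergreen files keep their score.
function decayedScore(score: number, ageDays: number, halfLifeDays = 30, evergreen = false): number {
  return evergreen ? score : score * Math.pow(0.5, ageDays / halfLifeDays);
}

// Assumed similarity measure: Jaccard overlap of lowercase word sets.
function jaccard(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Greedy MMR selection of up to k candidates.
function mmrRerank(candidates: Candidate[], k: number, lambda = 0.7): Candidate[] {
  const remaining = [...candidates];
  const selected: Candidate[] = [];
  while (selected.length < k && remaining.length > 0) {
    let bestIndex = 0;
    let bestValue = -Infinity;
    remaining.forEach((c, i) => {
      const redundancy = selected.reduce((max, s) => Math.max(max, jaccard(c.text, s.text)), 0);
      const value = lambda * c.score - (1 - lambda) * redundancy;
      if (value > bestValue) {
        bestValue = value;
        bestIndex = i;
      }
    });
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}
```

A lower lambda pushes the selection toward diversity; lambda = 1 reduces MMR to plain relevance ordering.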
### Session Journaling
When the user switches chats, the LLM extracts durable memories from the conversation transcript and:
- Appends dated entries to `memory/YYYY-MM-DD.md`
- Curates the `MEMORY.md` summary (max 4000 chars)
- Deduplicates against existing memories before writing
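
The bookkeeping around journaling might look like the sketch below. It covers only the mechanical parts (dated path, deduplication, size cap); the memory extraction itself is done by the LLM. The exact-match deduplication after normalization is a simplified stand-in, hard truncation is an assumption about how the 4000-character cap is enforced, and all function names are hypothetical.

```typescript
// Builds the dated journal path, e.g. memory/2025-01-05.md (UTC date).
function journalPath(date: Date): string {
  return `memory/${date.toISOString().slice(0, 10)}.md`;
}

const normalizeMemory = (s: string): string => s.toLowerCase().replace(/\s+/g, " ").trim();

// Keeps only candidates not already present (case- and whitespace-insensitive).
function dedupeMemories(candidates: string[], existing: string[]): string[] {
  const seen = new Set(existing.map(normalizeMemory));
  const fresh: string[] = [];
  for (const candidate of candidates) {
    const key = normalizeMemory(candidate);
    if (key.length > 0 && !seen.has(key)) {
      seen.add(key);
      fresh.push(candidate);
    }
  }
  return fresh;
}

// Enforces the MEMORY.md size cap by truncation (assumed strategy).
function capSummary(summary: string, maxChars = 4000): string {
  return summary.length <= maxChars ? summary : summary.slice(0, maxChars);
}
```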
## Environment Variables
Set in `.env` (copied from `.example.env` on install):
| Variable | Description |
|----------|-------------|
| `CEB_GOOGLE_CLIENT_ID` | Google OAuth2 client ID (for Gmail/Calendar/Drive tools) |
| `CEB_ENABLE_DOCUMENTS` | Enable documents tool (`false` by default) |
| `CEB_ENABLE_WHATSAPP` | Enable WhatsApp channel (`false` by default) |
| `CEB_ENABLE_WEBGPU_MODELS` | Enable WebGPU local models (`false` by default) |
| `CEB_ENABLE_DEBUGGER_TOOL` | Enable CDP debugger tool (`false` by default) |
| `CEB_DEV_LOCALE` | Force locale for development |
| `CEB_CI` | CI mode flag |
CLI flags (set on the command line):
| Variable | Description |
|----------|-------------|
| `CLI_CEB_DEV` | Enable development mode (set automatically by `pnpm dev`) |
| `CLI_CEB_FIREFOX` | Build for Firefox (set automatically by `pnpm build:firefox`) |
## Known Limitations
- **Side panel width**: Chrome enforces a fixed side panel width; the UI is constrained to ~400px
- **MV3 service worker idle**: The background service worker may be terminated after 30s of inactivity; long-running streams use keep-alive mechanisms
- **No Pyodide**: Code execution in the browser is not supported; code artifacts are display-only
- **Local LLM performance**: On-device inference speed depends on hardware; WebGPU is preferred over WASM for acceptable throughput
- **WhatsApp connection**: Requires a persistent offscreen document to maintain the Baileys WebSocket connection; Chrome may reclaim the offscreen document under memory pressure
## License
MIT. See [LICENSE](LICENSE).
### Third-party code
This repository includes a bundled fork of [Baileys](https://github.com/WhiskeySockets/Baileys) (`packages/baileys/`), a TypeScript/JavaScript API for WhatsApp Web by WhiskeySockets. Baileys is licensed under the [MIT License](https://github.com/WhiskeySockets/Baileys/blob/master/LICENSE).
## Star History
[Star history chart](https://star-history.com/#algopian/chromeclaw&Date)