- TypeScript 45.6%
- JavaScript 33.5%
- Jinja 20.8%
- HTML 0.1%
| .claude | ||
| background | ||
| content | ||
| entrypoints | ||
| models | ||
| offscreen | ||
| public | ||
| scripts | ||
| shared | ||
| .gitattributes | ||
| .gitignore | ||
| CHANGELOG.md | ||
| CLAUDE.md | ||
| LICENSE | ||
| package.json | ||
| pnpm-lock.yaml | ||
| README.md | ||
| screenshot.png | ||
| screenshot2.jpg | ||
| tsconfig.json | ||
| wxt.config.ts | ||
Chrome Clippy
Your personal AI assistant living right inside the browser. Chrome Clippy runs Google's Gemma 4 model entirely on-device via WebGPU — no API keys, no cloud, no data leaving your machine. It can read pages, click buttons, fill forms, run JavaScript, and answer questions about any site you visit.
Requirements
- Chrome with WebGPU support
- Local model files under
models/ - A local HTTP server for those model files
Setup
- Install dependencies:
pnpm install
- Place the required q4f16 model files into:
models/gemma-4-e2b
models/gemma-4-e4b
Mapped upstream repos:
models/gemma-4-e2b←onnx-community/gemma-4-E2B-it-ONNXmodels/gemma-4-e4b←onnx-community/gemma-4-E4B-it-ONNX
- Start the local model server:
pnpm serve:models
- Build the extension:
pnpm build
The runtime now fetches models from http://127.0.0.1:8765/ using the same URL shape as the original Hugging Face downloads, so Chrome can cache them through the normal browser path.
Load the extension in chrome://extensions (developer mode) from .output/chrome-mv3-dev/.
Usage
- Navigate to any page
- Click the Chrome Clippy icon (bottom-right corner) to open the chat
- Wait for the model to initialize (progress shown on icon + chat)
- Ask questions about the page or request actions
Architecture
Offscreen Document Service Worker Content Script
(Gemma 4 + Agent Loop) <-> (Message Router) <-> (Chat UI + DOM Tools)
| |
WebGPU inference Screenshot capture
Token streaming JS execution
- Offscreen document: Hosts the model via
@huggingface/transformers+ WebGPU. Runs the agent loop. - Service worker: Routes messages between content scripts and offscreen document. Handles
take_screenshotandrun_javascript. - Content script: Injects the Chrome Clippy icon + shadow DOM chat overlay. Executes DOM tools (
read_page_content,click_element,type_text,scroll_page).
Tools
| Tool | Description | Runs in |
|---|---|---|
read_page_content |
Read text/HTML of the page or a CSS selector | Content script |
take_screenshot |
Capture visible page as PNG | Service worker |
click_element |
Click an element by CSS selector | Content script |
type_text |
Type into an input by CSS selector | Content script |
scroll_page |
Scroll up/down by pixel amount | Content script |
run_javascript |
Execute JS in the page context with full DOM access | Service worker |
Settings
Click the gear icon in the chat header:
- Model: Switch between Gemma 4 E2B and E4B served from the local model server. Selection persists across sessions.
- Thinking: Toggle native Gemma 4 thinking
- Max iterations: Cap on tool call loops per request
- Clear context: Reset conversation history for the current page
- Disable on this site: Disable the extension per-hostname (persisted)
Development
pnpm build # Development build (with logging, source maps)
pnpm build:prod # Production build (logging silenced, minified)
Tech Stack
- WXT — Chrome extension framework (Vite-based)
- @huggingface/transformers — Browser ML inference
- marked — Markdown rendering in chat
- Gemma 4 E2B / E4B (
onnx-community/gemma-4-E2B-it-ONNX,onnx-community/gemma-4-E4B-it-ONNX) — q4f16 quantization, 128K context
Localhost Model Server
- The extension points
transformers.jsathttp://127.0.0.1:8765/and keeps the original{model}/resolve/{revision}/...URL structure. pnpm serve:modelsserves the required q4f16 files frommodels/gemma-4-e2bandmodels/gemma-4-e4bwith CORS and range request support.- Chrome can then cache those responses through the normal browser cache path used by the original extension behavior.
Debugging
All logs are prefixed with [Chrome Clippy]. In development builds, info/debug/warn logs are active. Production builds only log errors.
- Service worker logs:
chrome://extensions→ Chrome Clippy → "Inspect views: service worker" - Offscreen document logs:
chrome://extensions→ Chrome Clippy → "Inspect views: offscreen.html" - Content script logs: Open DevTools on any page → Console
- All extension pages:
chrome://inspect#otherlists all inspectable extension contexts (service worker, offscreen document, etc.)
The offscreen document logs are the most useful — they show model loading, prompt construction, token counts, raw model output, and tool execution.
Notes
The agent/ directory has zero dependencies. It defines interfaces (ModelBackend, ToolExecutor) and can be extracted to a standalone library.

