A fully local Chrome extension running pre-embedded Gemma 4 models. Forked from https://github.com/kessler/gemma-gem

Chrome Clippy

Your personal AI assistant living right inside the browser. Chrome Clippy runs Google's Gemma 4 model entirely on-device via WebGPU — no API keys, no cloud, no data leaving your machine. It can read pages, click buttons, fill forms, run JavaScript, and answer questions about any site you visit.

Requirements

  • Chrome with WebGPU support
  • Local model files under models/
  • A local HTTP server for those model files

Setup

  1. Install dependencies:
pnpm install
  2. Place the required q4f16 model files into:
models/gemma-4-e2b
models/gemma-4-e4b

Mapped upstream repos:

  • models/gemma-4-e2b → onnx-community/gemma-4-E2B-it-ONNX
  • models/gemma-4-e4b → onnx-community/gemma-4-E4B-it-ONNX
  3. Start the local model server:
pnpm serve:models
  4. Build the extension:
pnpm build

The runtime now fetches models from http://127.0.0.1:8765/ using the same URL shape as the original Hugging Face downloads, so Chrome can cache them through the normal browser path.
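As a rough illustration of that URL shape, the sketch below expands a `{model}/resolve/{revision}/` template against the local host. The helper name `modelFileUrl`, the `main` revision, the model id used in the call, and the `config.json` file are assumptions for illustration; they are not taken from this repository.

```typescript
// Hypothetical helper: expands a Hugging Face style path template against a host.
const LOCAL_HOST = "http://127.0.0.1:8765/"; // host stated in this README
const PATH_TEMPLATE = "{model}/resolve/{revision}/"; // URL structure stated in this README

function modelFileUrl(
  model: string,
  file: string,
  revision: string = "main", // assumed default revision
  host: string = LOCAL_HOST,
): string {
  const path = PATH_TEMPLATE
    .replace("{model}", model)
    .replace("{revision}", encodeURIComponent(revision));
  // Normalise slashes so host, path, and file join with exactly one "/" each.
  return host.replace(/\/+$/, "") + "/" + path + file.replace(/^\/+/, "");
}

// Example call; the model id and file name are illustrative only.
console.log(modelFileUrl("onnx-community/gemma-4-E2B-it-ONNX", "config.json"));
```

In transformers.js this kind of URL is governed by the `env.remoteHost` and `env.remotePathTemplate` settings; whether the extension sets them in exactly this way is an assumption.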

Load the extension from .output/chrome-mv3-dev/: open chrome://extensions, enable Developer mode, and choose "Load unpacked".

Usage

  1. Navigate to any page
  2. Click the Chrome Clippy icon (bottom-right corner) to open the chat
  3. Wait for the model to initialize (progress shown on icon + chat)
  4. Ask questions about the page or request actions

Architecture

Offscreen Document          Service Worker           Content Script
(Gemma 4 + Agent Loop)  <-> (Message Router)    <-> (Chat UI + DOM Tools)
       |                         |
  WebGPU inference          Screenshot capture
  Token streaming           JS execution
  • Offscreen document: Hosts the model via @huggingface/transformers + WebGPU. Runs the agent loop.
  • Service worker: Routes messages between content scripts and offscreen document. Handles take_screenshot and run_javascript.
  • Content script: Injects the Chrome Clippy icon + shadow DOM chat overlay. Executes DOM tools (read_page_content, click_element, type_text, scroll_page).
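
The routing role of the service worker can be sketched as a pure function that decides which context should receive a message. The `Envelope` shape and the `kind` values below are hypothetical; the extension's real message types may look quite different. Only the tool names come from this README.

```typescript
// Hypothetical message envelope; not the extension's actual message format.
type Context = "content" | "offscreen" | "service-worker";

interface Envelope {
  from: Context;
  kind: "user_prompt" | "token" | "tool_call" | "tool_result";
  tool?: string;
}

// Tools this README says the service worker handles itself.
const SERVICE_WORKER_TOOLS = new Set(["take_screenshot", "run_javascript"]);

// Decide which context should receive a message passing through the service worker.
function routeTarget(msg: Envelope): Context {
  switch (msg.kind) {
    case "user_prompt":
      return "offscreen"; // chat input goes to the agent loop
    case "token":
      return "content"; // streamed tokens go back to the chat UI
    case "tool_call":
      return msg.tool !== undefined && SERVICE_WORKER_TOOLS.has(msg.tool)
        ? "service-worker" // screenshot capture and JS execution
        : "content"; // DOM tools run in the content script
    case "tool_result":
      return "offscreen"; // results feed the next agent iteration
  }
}
```

Keeping the decision in one pure function makes the routing easy to unit test without a browser; the actual delivery would go through the extension messaging APIs.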

Tools

| Tool | Description | Runs in |
|------|-------------|---------|
| read_page_content | Read text/HTML of the page or a CSS selector | Content script |
| take_screenshot | Capture visible page as PNG | Service worker |
| click_element | Click an element by CSS selector | Content script |
| type_text | Type into an input by CSS selector | Content script |
| scroll_page | Scroll up/down by pixel amount | Content script |
| run_javascript | Execute JS in the page context with full DOM access | Service worker |
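
To make the table concrete, here is a minimal sketch of a lookup from tool name to execution context, plus one possible shape for a DOM tool such as click_element. The `ToolResult` shape, the `QueryRoot` interface, and the function name `clickElement` are assumptions for illustration, not the extension's actual code.

```typescript
type RunsIn = "content-script" | "service-worker";

// Mirrors the table above: tool name -> where it executes.
const TOOL_RUNS_IN: Record<string, RunsIn> = {
  read_page_content: "content-script",
  take_screenshot: "service-worker",
  click_element: "content-script",
  type_text: "content-script",
  scroll_page: "content-script",
  run_javascript: "service-worker",
};

// Hypothetical result shape: failures are returned rather than thrown,
// so the model can read the message and try a different selector.
interface ToolResult {
  ok: boolean;
  message: string;
}

// Minimal structural types so the sketch runs outside a browser.
// In a content script, a thin wrapper around document.querySelector would play this role.
interface Clickable {
  click(): void;
}
interface QueryRoot {
  querySelector(selector: string): Clickable | null;
}

function clickElement(root: QueryRoot, selector: string): ToolResult {
  const el = root.querySelector(selector);
  if (el === null) {
    return { ok: false, message: `No element matches ${selector}` };
  }
  el.click();
  return { ok: true, message: `Clicked ${selector}` };
}
```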

Settings

Click the gear icon in the chat header:

  • Model: Switch between Gemma 4 E2B and E4B served from the local model server. Selection persists across sessions.
  • Thinking: Toggle native Gemma 4 thinking
  • Max iterations: Cap on tool call loops per request
  • Clear context: Reset conversation history for the current page
  • Disable on this site: Disable the extension per-hostname (persisted)
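
One way to model these settings is a plain object merged over defaults, with the per-hostname switch stored as a list of hostnames. The field names, the default values (including `maxIterations: 10`), and the helper names below are assumptions for illustration; the README does not specify them.

```typescript
// Hypothetical settings shape; field names and defaults are assumptions.
interface Settings {
  model: "gemma-4-e2b" | "gemma-4-e4b";
  thinking: boolean;
  maxIterations: number;
  disabledHosts: string[];
}

const DEFAULTS: Settings = {
  model: "gemma-4-e2b",
  thinking: false,
  maxIterations: 10,
  disabledHosts: [],
};

// Merge persisted values over the defaults and keep maxIterations a positive integer.
function loadSettings(stored: Partial<Settings>): Settings {
  const merged = { ...DEFAULTS, ...stored };
  return { ...merged, maxIterations: Math.max(1, Math.floor(merged.maxIterations)) };
}

// Per-hostname check: only the hostname is compared, so port and path are ignored.
function isDisabledOn(settings: Settings, pageUrl: string): boolean {
  return settings.disabledHosts.includes(new URL(pageUrl).hostname);
}

// Return a new settings object with the hostname toggled in the disabled list.
function toggleHost(settings: Settings, hostname: string): Settings {
  const disabledHosts = settings.disabledHosts.includes(hostname)
    ? settings.disabledHosts.filter((h) => h !== hostname)
    : [...settings.disabledHosts, hostname];
  return { ...settings, disabledHosts };
}
```

In the extension itself the stored value would presumably come from an extension storage API; that part is left out here so the sketch stays runnable on its own.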

Development

pnpm build              # Development build (with logging, source maps)
pnpm build:prod         # Production build (logging silenced, minified)

Tech Stack

  • WXT — Chrome extension framework (Vite-based)
  • @huggingface/transformers — Browser ML inference
  • marked — Markdown rendering in chat
  • Gemma 4 E2B / E4B (onnx-community/gemma-4-E2B-it-ONNX, onnx-community/gemma-4-E4B-it-ONNX) — q4f16 quantization, 128K context

Localhost Model Server

  • The extension points transformers.js at http://127.0.0.1:8765/ and keeps the original {model}/resolve/{revision}/... URL structure.
  • pnpm serve:models serves the required q4f16 files from models/gemma-4-e2b and models/gemma-4-e4b with CORS and range request support.
  • Chrome can then cache those responses through the normal browser cache path used by the original extension behavior.
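
The CORS and range request support mentioned above can be sketched with two small pure functions. This is not the code behind pnpm serve:models; the function names and the permissive `Access-Control-Allow-Origin: *` value are assumptions, and only single-range requests are handled.

```typescript
// Parse a single-range "bytes=start-end" header value against a file size.
// Returns null when the header is absent, malformed, or unsatisfiable.
function parseRange(
  header: string | undefined,
  size: number,
): { start: number; end: number } | null {
  const m = /^bytes=(\d*)-(\d*)$/.exec(header ?? "");
  if (!m || (m[1] === "" && m[2] === "")) return null;
  let start: number;
  let end: number;
  if (m[1] === "") {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const n = Number(m[2]);
    if (n === 0) return null;
    start = Math.max(0, size - n);
    end = size - 1;
  } else {
    start = Number(m[1]);
    // Open-ended "bytes=N-" runs to the end; an oversized end is clamped.
    end = m[2] === "" ? size - 1 : Math.min(Number(m[2]), size - 1);
  }
  return start <= end && start < size ? { start, end } : null;
}

// Headers for a 206 Partial Content response (assumed permissive CORS policy).
function partialHeaders(start: number, end: number, size: number): Record<string, string> {
  return {
    "Access-Control-Allow-Origin": "*",
    "Accept-Ranges": "bytes",
    "Content-Range": `bytes ${start}-${end}/${size}`,
    "Content-Length": String(end - start + 1),
  };
}
```

A server built on these would answer with status 206 and the headers above when `parseRange` returns a range, and fall back to a full 200 response when the request carries no Range header.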

Debugging

All logs are prefixed with [Chrome Clippy]. In development builds, info/debug/warn logs are active. Production builds only log errors.

  • Service worker logs: chrome://extensions → Chrome Clippy → "Inspect views: service worker"
  • Offscreen document logs: chrome://extensions → Chrome Clippy → "Inspect views: offscreen.html"
  • Content script logs: Open DevTools on any page → Console
  • All extension pages: chrome://inspect#other lists all inspectable extension contexts (service worker, offscreen document, etc.)

The offscreen document logs are the most useful — they show model loading, prompt construction, token counts, raw model output, and tool execution.
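
The prefix and log-level behaviour described in this section could be implemented along the following lines. The `createLogger` name, the `production` flag, and the injectable `sink` are hypothetical; a real build would more likely inject the mode at compile time.

```typescript
type Level = "debug" | "info" | "warn" | "error";
type Sink = (level: Level, line: string) => void;

const PREFIX = "[Chrome Clippy]";

// Default sink: forward to the matching console method.
function defaultSink(level: Level, line: string): void {
  console[level](line);
}

// `production` is a hypothetical flag: when true, only errors are emitted.
function createLogger(production: boolean, sink: Sink = defaultSink) {
  const enabled = (level: Level): boolean => !production || level === "error";
  const log = (level: Level) => (...parts: unknown[]): void => {
    if (enabled(level)) sink(level, [PREFIX, ...parts.map(String)].join(" "));
  };
  return { debug: log("debug"), info: log("info"), warn: log("warn"), error: log("error") };
}

// Usage: a development logger prints everything, a production logger only errors.
const devLog = createLogger(false);
devLog.info("model loading", 42);
```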

Notes

The agent/ directory has zero dependencies. It defines interfaces (ModelBackend, ToolExecutor) and can be extracted to a standalone library.
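
To illustrate how two such interfaces might fit together, here is a minimal agent loop sketch with an iteration cap like the "Max iterations" setting. The README names ModelBackend and ToolExecutor but not their members, so every method, field, and the string-based history below is an assumption rather than the contents of agent/.

```typescript
// Hypothetical shapes; the real interfaces in agent/ may differ.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}
interface ModelTurn {
  text: string;
  toolCall?: ToolCall;
}
interface ModelBackend {
  generate(history: string[]): Promise<ModelTurn>;
}
interface ToolExecutor {
  execute(call: ToolCall): Promise<string>;
}

// Alternate model and tool steps until the model answers without a tool call
// or the iteration cap is reached.
async function runAgent(
  model: ModelBackend,
  tools: ToolExecutor,
  prompt: string,
  maxIterations: number,
): Promise<string> {
  const history = [`user: ${prompt}`];
  for (let i = 0; i < maxIterations; i++) {
    const turn = await model.generate(history);
    if (!turn.toolCall) return turn.text; // final answer
    const result = await tools.execute(turn.toolCall);
    history.push(`tool_call: ${turn.toolCall.name}`, `tool_result: ${result}`);
  }
  return "Stopped: reached the maximum number of tool iterations.";
}
```

Because the loop depends only on the two interfaces, it can be exercised with in-memory fakes and has no need for WebGPU or extension APIs, which is consistent with the zero-dependency claim above.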

[Screenshots: Chrome Clippy in action]