This Developer Built a Browser Extension That Gives Claude Memory of Everything You Browse
You've been researching a complex architectural problem for three days. You've opened dozens of tabs—Stack Overflow threads, GitHub issues, documentation pages, blog posts. Now you open Claude Desktop to synthesize what you've learned, and you face a familiar problem: Claude has no idea what you've been reading.
A developer just shared a solution to this exact problem on Dev.to, and the architecture is worth studying. They built a browser extension that streams your browsing history to Claude Desktop through the Model Context Protocol (MCP), creating a persistent memory layer that survives across conversations. The implementation runs hybrid search over SQLite and ChromaDB, with a thoughtful fallback for when the embedding model isn't available.
The Architecture: MCP as the Bridge
The Model Context Protocol, Anthropic's standardized way to connect AI assistants to external data sources, is the foundation of this project. Here's how the pieces fit together:
The browser extension captures page visits (URLs, titles, text content) and sends them to a local MCP server. That server maintains two parallel storage systems:
SQLite for structured metadata — URLs, timestamps, visit counts, and page titles go into a relational database. This gives you fast exact lookups and chronological queries without touching the vector store.
ChromaDB for semantic search — Page content gets embedded and stored in a local vector database. When Claude needs to recall "that article about database indexing strategies," ChromaDB handles the semantic similarity search.
The MCP server exposes tools that Claude Desktop can call: search_browsing_history, get_recent_pages, find_by_domain. When you ask Claude about something you read last week, it queries the MCP server, which orchestrates the SQLite + ChromaDB lookup and returns the relevant context.
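To make the SQLite side concrete, here is a minimal sketch of what the metadata store behind get_recent_pages and find_by_domain might look like. The table layout, column names, and function signatures are assumptions for illustration; the post only says that URLs, timestamps, visit counts, and titles are stored, and the project's MCP server is written for Node.js rather than Python.

```python
import sqlite3
import time
from urllib.parse import urlsplit

# Hypothetical schema: column names are illustrative, not taken from the project.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pages (
    url         TEXT PRIMARY KEY,
    domain      TEXT NOT NULL,
    title       TEXT NOT NULL,
    last_visit  REAL NOT NULL,               -- Unix timestamp of the latest visit
    visit_count INTEGER NOT NULL DEFAULT 1
)
"""


def open_store(path=":memory:"):
    """Open (or create) the metadata database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_visit(conn, url, title, ts=None):
    """Insert a page, or bump its visit count if the URL was seen before."""
    ts = time.time() if ts is None else ts
    domain = urlsplit(url).hostname or ""
    conn.execute(
        """
        INSERT INTO pages (url, domain, title, last_visit) VALUES (?, ?, ?, ?)
        ON CONFLICT(url) DO UPDATE SET
            title = excluded.title,
            last_visit = excluded.last_visit,
            visit_count = visit_count + 1
        """,
        (url, domain, title, ts),
    )
    conn.commit()


def get_recent_pages(conn, since_ts, limit=20):
    """Chronological lookup, newest first; the vector store is never touched."""
    rows = conn.execute(
        "SELECT url, title, visit_count FROM pages "
        "WHERE last_visit > ? ORDER BY last_visit DESC LIMIT ?",
        (since_ts, limit),
    )
    return rows.fetchall()


def find_by_domain(conn, domain, limit=20):
    """Exact-match lookup on the stored hostname."""
    rows = conn.execute(
        "SELECT url, title, visit_count FROM pages "
        "WHERE domain = ? ORDER BY last_visit DESC LIMIT ?",
        (domain, limit),
    )
    return rows.fetchall()
```

In this sketch each MCP tool would be a thin wrapper around one of these functions, so a question like "what did I look at on GitHub?" resolves to a single indexed query.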
Hybrid Search: Why Two Databases?
The dual-storage approach solves a real problem. Vector search alone is expensive for simple queries ("What was that GitHub repo I looked at yesterday?"), and it can return semantically similar but wrong results. SQLite alone can't answer semantic queries ("Find pages about performance optimization").
The hybrid strategy gives you the best of both:
- Recency queries hit SQLite: "Show me pages from the last 3 hours" is a simple SQL WHERE timestamp > X query.
- Domain/URL filters stay in SQLite: "Find all GitHub repos I visited" is fast pattern matching.
- Semantic questions use ChromaDB: "What have I read about authentication patterns?" gets embedded and compared against stored vectors.
- Combined queries merge results: "Recent pages about Docker" filters by timestamp in SQLite, then does semantic search on that subset.
This isn't over-engineering—it's recognizing that different query patterns have different optimal data structures. The developer reports that cold-start queries (first search after opening Claude) take ~200ms, while warm queries are sub-50ms.
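The combined-query case ("Recent pages about Docker") is the least obvious of the four, so here is a rough sketch of the filter-then-rank idea. It is not the project's code: the table layout and function names are hypothetical, and toy_embed is a bag-of-words stand-in for the real embedding model and ChromaDB lookup, used only so the example runs without extra dependencies.

```python
import math
import re
import sqlite3
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recent_pages_about(conn, query, since_ts, top_k=3):
    """Filter by timestamp in SQLite, then rank only the survivors by similarity."""
    # Step 1: cheap structured filter, no embeddings involved.
    rows = conn.execute(
        "SELECT url, content FROM pages WHERE visited_at > ?", (since_ts,)
    ).fetchall()
    # Step 2: similarity ranking on the reduced candidate set.
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(content)), url) for url, content in rows]
    scored = [(score, url) for score, url in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored[:top_k]]


def demo_store():
    """Build a tiny in-memory database with hypothetical example pages."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (url TEXT, content TEXT, visited_at REAL)")
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?, ?)",
        [
            ("https://a.example/docker-old", "docker compose networking guide", 10.0),
            ("https://b.example/docker-new", "docker multi-stage builds", 100.0),
            ("https://c.example/css", "css grid layout tricks", 110.0),
        ],
    )
    return conn


if __name__ == "__main__":
    print(recent_pages_about(demo_store(), "docker builds", since_ts=50.0))
```

The point of the ordering is cost: the timestamp filter shrinks the candidate set before any similarity computation happens, so the expensive step only sees pages that could possibly qualify.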
The No-LLM Fallback: Graceful Degradation
Here's the detail that separates a demo from production-ready software: the system works even when the embedding model isn't available.
ChromaDB needs an embedding function to convert text into vectors. Implementations typically call a hosted API such as OpenAI's or run a local model. But what happens when your API quota is exhausted, or the local model isn't running?
This implementation falls back to SQLite full-text search (FTS5). The MCP server detects when ChromaDB can't generate embeddings and rewrites semantic queries as SQL MATCH operations. You lose semantic understanding—searching for "auth patterns" won't find pages about "authentication strategies"—but you get exact keyword matches instead of an error.
The fallback is transparent to Claude. From the AI's perspective, search_browsing_history always returns results. The quality degrades gracefully rather than failing hard.
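The fallback path can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: EmbeddingUnavailable, the pages_fts table, and the semantic_search callable are hypothetical names, and the example assumes a Python build whose SQLite includes FTS5.

```python
import sqlite3


class EmbeddingUnavailable(Exception):
    """Raised by the (hypothetical) semantic backend when it cannot embed text."""


def build_fts_index(conn, pages):
    """Create an FTS5 index over (url, title, content) tuples."""
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pages_fts "
        "USING fts5(url UNINDEXED, title, content)"
    )
    conn.executemany(
        "INSERT INTO pages_fts (url, title, content) VALUES (?, ?, ?)", pages
    )


def to_fts_query(text):
    """Quote each term so user punctuation is not parsed as FTS5 query syntax."""
    terms = [t.replace('"', '""') for t in text.split()]
    return " ".join(f'"{t}"' for t in terms)


def keyword_search(conn, query, limit=5):
    """Exact keyword match: all terms must appear (FTS5 implicit AND)."""
    fts_query = to_fts_query(query)
    if not fts_query:
        return []
    rows = conn.execute(
        "SELECT url FROM pages_fts WHERE pages_fts MATCH ? ORDER BY rank LIMIT ?",
        (fts_query, limit),
    )
    return [row[0] for row in rows]


def search_browsing_history(conn, query, semantic_search, limit=5):
    """Try semantic search first; degrade to keyword search if embedding fails."""
    try:
        return semantic_search(query, limit)
    except EmbeddingUnavailable:
        return keyword_search(conn, query, limit)
```

Because the caller receives a list of URLs either way, the tool's contract stays the same in both modes; only the recall changes. A search for "auth patterns" in keyword mode matches pages containing both of those exact tokens and misses pages that only say "authentication".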
Privacy and Storage
Everything stays local. The browser extension talks to localhost, the MCP server runs on your machine, and SQLite + ChromaDB are local files. No browsing history leaves your computer unless you explicitly share it in a Claude conversation.
The developer added filtering rules to exclude sensitive domains (banking, healthcare, internal tools) and implemented automatic cleanup—browsing data older than 90 days gets purged unless you star it.
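Both rules are simple to express. The sketch below assumes a blocklist of hostnames and a starred column on the pages table; the domain names, column names, and function names are made up for illustration, and the post does not describe how the project stores them.

```python
import sqlite3
from urllib.parse import urlsplit

# Hypothetical blocklist; a real one would be user-configurable.
BLOCKED_DOMAINS = {"mybank.example", "health.example", "intranet.example"}
RETENTION_SECONDS = 90 * 24 * 60 * 60  # 90 days


def should_capture(url, blocked=BLOCKED_DOMAINS):
    """Return False for a blocked domain or any of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return not any(host == d or host.endswith("." + d) for d in blocked)


def purge_old_pages(conn, now):
    """Delete unstarred pages older than the retention window; return the count."""
    cutoff = now - RETENTION_SECONDS
    cur = conn.execute(
        "DELETE FROM pages WHERE visited_at < ? AND starred = 0", (cutoff,)
    )
    conn.commit()
    return cur.rowcount
```

Checking should_capture before a page is ever sent to the server means excluded content is never stored or embedded, which is a stronger guarantee than filtering at query time.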
Storage overhead is reasonable: ~2MB per 1,000 pages with embeddings. At that rate, a heavy browser user (500 pages/week, or about 26,000 pages/year) accumulates roughly 52MB per year.
Why This Matters for AI Tooling
This project demonstrates a pattern we'll see more of: personal context engines that feed AI assistants.
The models behind Claude, ChatGPT, and other assistants are stateless by default. Unless something supplies prior context, every conversation starts from zero. MCP creates a standard way to bolt on memory: not just browsing history, but email archives, Slack messages, code you've written, documentation you've bookmarked.
The architecture here is reusable:
- Replace the browser extension with an email client plugin, and you get "What did Sarah say about the deployment schedule?"
- Point it at your file system, and you get "Find that design doc I worked on in March."
- Connect it to your terminal history, and you get "How did I fix that DNS issue last time?"
The SQLite + vector DB hybrid, the graceful fallback, and the privacy-first local-only design are all patterns worth stealing.
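One way to picture that reuse is a common record shape that every source adapter emits, so the storage and search layers never need to know where the data came from. The names below (ContextRecord, ContextSource, ShellHistorySource, ingest) are hypothetical and do not come from the project; this is only a sketch of the pattern.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Protocol, Tuple


@dataclass(frozen=True)
class ContextRecord:
    """Source-agnostic unit of memory handed to the storage layer."""
    source: str       # e.g. "browser", "email", "shell"
    uri: str          # stable identifier within that source
    title: str
    text: str
    timestamp: float  # Unix timestamp


class ContextSource(Protocol):
    """Anything that can yield records: a browser extension, mail plugin, etc."""
    def records(self) -> Iterable[ContextRecord]: ...


class ShellHistorySource:
    """Example adapter over (timestamp, command) pairs from terminal history."""

    def __init__(self, entries: Iterable[Tuple[float, str]]):
        self._entries = list(entries)

    def records(self) -> Iterator[ContextRecord]:
        for ts, command in self._entries:
            parts = command.split()
            if not parts:
                continue  # skip blank history lines
            yield ContextRecord("shell", f"shell://{ts}", parts[0], command, ts)


def ingest(sources: Iterable[ContextSource], store: Callable[[ContextRecord], None]) -> int:
    """Feed every record from every source into the store; return the count."""
    count = 0
    for source in sources:
        for record in source.records():
            store(record)
            count += 1
    return count
```

Swapping the browser extension for another source then means writing one new adapter class while the indexing and search code stays unchanged.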
Getting Started
The developer open-sourced the project (search Dev.to for "Claude memory browser extension" to find the post with repo links). The setup requires:
- Claude Desktop (or any MCP-compatible client)
- Node.js for the MCP server
- The browser extension (Chrome/Firefox)
- ~15 minutes of configuration
The README includes a Docker Compose setup if you don't want to install ChromaDB directly.
The Takeaway
This isn't just a clever hack—it's a reference implementation for how to build durable context into AI workflows. The technical decisions are sound: MCP for standardization, hybrid storage for query efficiency, graceful degradation for reliability.
If you're building tools that connect AI assistants to personal data, study this architecture. The code is open, the patterns are proven, and the problem it solves—giving AI memory of what you've actually been working on—is universal.
The future of AI assistants isn't just smarter models. It's better context. This project shows how to build it.