plan: add web-fetch-local-agent implementation plan (#2801)

## Summary

- Add planning document for implementing a new `web_fetch` tool for local agent mode
- This tool will fetch and read website content when users share URLs, available to all users (free + Pro)
- Unlike the existing Pro-only `web_crawl` tool, `web_fetch` performs direct local HTTP fetch at zero infrastructure cost

## Test plan

- Manual review of the plan document to ensure implementation strategy is clear
- Implementation will follow the detailed testing strategy outlined in the plan

🤖 Generated with [Claude Code](https://claude.com/claude-code)

f6584527 · Will Chen · GitHub · 12c3456a · f6584527
--- a/plans/web-fetch-local-agent.md
+++ b/plans/web-fetch-local-agent.md
+# Web Fetch Tool for Local Agent Mode
+
+> Generated by swarm planning session on 2026-02-25
+
+## Summary
+
+Add a new `web_fetch` tool to the local agent that fetches and reads website content when users share URLs for reference. Unlike the existing Pro-only `web_crawl` tool (which uses Firecrawl for visual cloning with screenshots), `web_fetch` performs a direct local HTTP fetch from the user's machine, making it available to all users (free + Pro) at zero infrastructure cost.
+
+## Problem Statement
+
+When users paste a URL into the Dyad chat (e.g., "Help me integrate this API: https://docs.stripe.com/api"), the agent cannot access the content behind that URL. Users must manually copy-paste page content, breaking their flow. This is especially painful for developers building with APIs, following tutorials, or referencing documentation — the most common use cases for Dyad's target audience. The existing `web_crawl` tool only activates for "clone/copy/replicate" intent and requires Dyad Pro, leaving a gap for the broader "read this page for context" use case.
+
+## Scope
+
+### In Scope (MVP)
+
+- New `web_fetch` tool that fetches a URL and returns content as markdown
+- Available to **all users** (free + Pro) — no `isDyadPro` gate
+- LLM-triggered via standard tool call mechanism (not auto-detected)
+- HTML-to-markdown conversion using `turndown` + `@mozilla/readability` for content extraction
+- Content-Type detection: HTML → markdown, JSON → code block, text → as-is, PDF/images → "not supported" message
+- URL scheme validation (`http:` and `https:` only; block `file:`, `ftp:`, `data:`, `javascript:`, `blob:` schemes)
+- Private/localhost IPs allowed (consent dialog is sufficient protection)
+- Consent-gated with `"ask"` default
+- Content truncation at 16,000 characters (matching existing `MAX_TEXT_SNIPPET_LENGTH`)
+- Timeout at 15 seconds via `AbortController`
+- XML streaming preview via `<dyad-web-fetch>` tag
+- Clear error messages for timeout, 403/blocked, empty content, unsupported content types
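
The URL scheme rule in the list above can be sketched as an allowlist on the parsed URL's protocol. This is a minimal sketch under stated assumptions, not the plan's implementation: `validateFetchUrl` and the `UrlCheck` result shape are hypothetical names introduced here for illustration.

```typescript
// Hypothetical helper; the name and result shape are illustrative only.
type UrlCheck =
  | { ok: true; url: URL }
  | { ok: false; reason: string };

// Allowlist: anything that is not http: or https: is rejected.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:"]);

export function validateFetchUrl(raw: string): UrlCheck {
  let url: URL;
  try {
    // The WHATWG URL parser throws a TypeError on malformed input.
    url = new URL(raw.trim());
  } catch {
    return { ok: false, reason: `Invalid URL: ${raw}` };
  }
  // `protocol` is lowercased by the parser and includes the trailing colon.
  if (!ALLOWED_PROTOCOLS.has(url.protocol)) {
    return {
      ok: false,
      reason: `Unsupported URL scheme "${url.protocol}"; only http: and https: are allowed.`,
    };
  }
  // Private and localhost hosts are intentionally not blocked (see the scope list).
  return { ok: true, url };
}
```

An allowlist rejects `file:`, `ftp:`, `data:`, `javascript:`, and `blob:` without enumerating them, and it also covers schemes that nobody thought to list.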
+
+### Out of Scope (Follow-up)
+
+- Auto-detection of URLs in user input (pre-fetching before LLM runs)
+- JavaScript rendering / headless browser for SPAs
+- Screenshot capture
+- PDF content extraction
+- Caching of fetched pages within a session
+- Batch consent UI for multiple URLs in one message
+- Re-fetch / refresh button on completed cards
+- Link preview in chat input area
+
+## User Stories
+
+- As a developer building an app, I want to paste an API documentation URL and have the agent understand its contents, so that I can say "integrate this API" without manually copying docs.
+- As a user following a tutorial, I want to share a blog post or tutorial URL with the agent, so that it can follow the instructions and implement what the tutorial describes.
+- As a user referencing a design, I want to share a website URL for style reference (without cloning), so that the agent understands the content and direction I'm going for.
+- As a free-tier user, I want basic web fetching to work without a Pro subscription, so that I can reference external content in my workflow.
+
+## UX Design
+
+### User Flow
+
+1. User types a message that includes a URL (e.g., "Use the Stripe API docs at https://docs.stripe.com/api to add payments")
+2. The LLM recognizes the URL and determines it needs the page content to fulfill the request
+3. A consent dialog appears: `Fetch page content: "https://docs.stripe.com/api"`
+4. User approves (accept-once / accept-always / decline)
+5. A `<dyad-web-fetch>` card appears in the chat showing the URL being fetched with a loading state
+6. Content is fetched, processed through Readability + Turndown, truncated if needed, and returned as the tool result
+7. The card transitions to a completed state showing the page title (extracted by Readability) and URL
+8. The AI continues its response using the fetched content as context
+
+### Key States
+
+- **Loading**: Card with URL, spinner, "Fetching..." label (use existing `DyadStateIndicator` pattern)
+- **Completed (HTML)**: Card with page title (extracted by Readability) + URL in muted text, expandable to show markdown preview
+- **Completed (JSON)**: Card with `application/json` badge + URL, expandable content as code block
+- **Completed (text)**: Card with `text/plain` badge + URL, content displayed as-is
+- **Error — Timeout**: "This page couldn't be reached. Check the URL and try again."
+- **Error — Blocked (403)**: "This page blocked the request. You may need to copy-paste its content manually."
+- **Error — Empty/JS-only**: "This page returned no readable content. It may require JavaScript to render."
+- **Warning — Unsupported type**: Amber/warning state (not red error): "PDF files cannot be fetched as text. Try copying the relevant content and pasting it into the chat." (Use `<dyad-output type="warning">`)
+- **Truncated**: Show note on card: "Content truncated (showing first 16,000 characters)"
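
One way to keep the error and warning states above consistent between the tool result and the card is a small discriminated union. The sketch below is illustrative only: `FetchFailure`, `CardMessage`, and `describeFailure` are hypothetical names, and the generic wording for non-PDF unsupported types is an assumption. The other message strings are copied from the list above.

```typescript
// Hypothetical types for illustration; only the message strings come from the plan.
type FetchFailure =
  | { kind: "timeout" }
  | { kind: "blocked" }
  | { kind: "empty" }
  | { kind: "unsupported"; contentType: string };

interface CardMessage {
  severity: "error" | "warning";
  text: string;
}

export function describeFailure(failure: FetchFailure): CardMessage {
  switch (failure.kind) {
    case "timeout":
      return {
        severity: "error",
        text: "This page couldn't be reached. Check the URL and try again.",
      };
    case "blocked":
      return {
        severity: "error",
        text: "This page blocked the request. You may need to copy-paste its content manually.",
      };
    case "empty":
      return {
        severity: "error",
        text: "This page returned no readable content. It may require JavaScript to render.",
      };
    case "unsupported": {
      // Assumption: non-PDF types reuse the PDF wording with a generic label.
      const label =
        failure.contentType === "application/pdf"
          ? "PDF files"
          : `Files of type ${failure.contentType}`;
      return {
        severity: "warning",
        text: `${label} cannot be fetched as text. Try copying the relevant content and pasting it into the chat.`,
      };
    }
  }
}
```

Keeping the severity next to the text lets the renderer choose the amber warning style for unsupported types without parsing the message.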
+
+### Interaction Details
+
+- Consent preview text: `Fetch page content: "https://..."` (action-focused, not implementation-detail-focused)
+- Card icon: Use `Link` from lucide-react (differentiated from `Globe` for web_search and `ScanQrCode` for web_crawl)
+- Badge color: Use `purple` to differentiate from the `blue` used by web_search and web_crawl
+- Completed card is collapsed by default with page title visible; expandable to show markdown preview
+- When truncation occurs, surface it in the card UI so users understand the AI only saw partial content
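
Surfacing truncation in the card requires the tool to report whether it cut anything, not just return the shortened text. A minimal sketch, assuming a limit that mirrors the existing `MAX_TEXT_SNIPPET_LENGTH` (redeclared here so the example is self-contained); `truncateContent` and `truncationNote` are hypothetical helpers, not existing functions.

```typescript
// Assumed to mirror the existing MAX_TEXT_SNIPPET_LENGTH constant.
const MAX_TEXT_SNIPPET_LENGTH = 16000;

interface TruncatedContent {
  text: string;
  truncated: boolean;
  originalLength: number;
}

// Hypothetical helper: cuts content to the limit and records whether it did.
export function truncateContent(
  content: string,
  maxLength: number = MAX_TEXT_SNIPPET_LENGTH,
): TruncatedContent {
  const originalLength = content.length;
  if (originalLength <= maxLength) {
    return { text: content, truncated: false, originalLength };
  }
  // Note: slice counts UTF-16 code units, so a surrogate pair at the
  // boundary could be split; acceptable for a sketch.
  return { text: content.slice(0, maxLength), truncated: true, originalLength };
}

// Builds the note shown on the card, or undefined when nothing was cut.
export function truncationNote(result: TruncatedContent): string | undefined {
  if (!result.truncated) return undefined;
  const shown = result.text.length.toLocaleString("en-US");
  return `Content truncated (showing first ${shown} characters)`;
}
```

Returning a flag alongside the text keeps the renderer from having to guess truncation from the content length.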
+
+### Accessibility
+
+- Consent dialog: keyboard-navigable via standard button focus (existing pattern)
+- Expandable cards: Enter/Space to toggle (existing `DyadCard` pattern)
+- Screen reader: announce "Web Fetch completed: [page title]" or "Web Fetch failed: [error]"
+
+## Technical Design
+
+### Architecture
+
+New tool following the established `ToolDefinition<T>` pattern. Performs a direct HTTP fetch from the Electron main process using Node.js `fetch()`, processes HTML through `@mozilla/readability` for content extraction, then converts to markdown via `turndown`. Returns the markdown string as the tool result. No changes to existing tools or the agent handler.
+
+**Dependency pipeline:** `fetch(url)` → `linkedom.parseHTML(html)` → `new Readability(doc).parse()` → `new TurndownService().turndown(article.content)` → `truncateText(markdown)`
+
+`linkedom` is required because `@mozilla/readability` needs a DOM `Document` and Electron's main process doesn't have one. (`turndown` accepts an HTML string and parses it itself under Node.js, so it does not depend on `linkedom`.) `linkedom` is lightweight (~50KB) and much faster than JSDOM.
+
+### Components Affected
+
+- **New file:** `src/pro/main/ipc/handlers/local_agent/tools/web_fetch.ts` — Tool implementation
+- **Modified:** `src/pro/main/ipc/handlers/local_agent/tool_definitions.ts` — Import and register `webFetchTool` in `TOOL_DEFINITIONS` array
+- **Modified:** `package.json` — Add `turndown`, `@types/turndown`, `linkedom`, `@mozilla/readability` (or `defuddle`)
+- **New file (renderer):** `DyadWebFetch` component for rendering the `<dyad-web-fetch>` XML tag in chat
+- **No changes to:** `web_crawl.ts`, `engine_fetch.ts`, `local_agent_handler.ts`, `types.ts`
+
+### Data Model Changes
+
+None. The tool returns a string result via the existing `ToolResult` type. No schema or storage changes.
+
+### API Changes
+
+No external API changes. Internally:
+
+- New tool `web_fetch` added to `TOOL_DEFINITIONS` array
+- New XML tag `<dyad-web-fetch>` for renderer
+
+### Tool Description (Critical)
+
+The tool description guides LLM behavior and is the single biggest factor in feature success:
+
+```
+Fetch and read content from a URL. Works with web pages (returns cleaned markdown) and API endpoints (returns JSON).
+
+### When to Use
+Use this tool when the user shares a URL and wants you to reference, understand, or use information from that page. Examples:
+- User shares API documentation and asks you to integrate it
+- User shares a tutorial or blog post and wants you to follow it
+- User shares a web page and asks about its content
+- User shares an API endpoint URL and wants you to understand the response
+
+### When NOT to Use
+- User wants to CLONE / COPY / REPLICATE / RECREATE a website's visual design — use web_crawl instead
+- User mentions a URL in passing without wanting you to read it
+- You need to search the web for information (no specific URL) — use web_search instead
+
+### Limitations
+- Cannot render JavaScript — some dynamic/SPA pages may return limited content
+- Content is truncated to ~16,000 characters for very long pages
+- PDF and image files are not supported
+```
+
+### Key Implementation Details
+
+````typescript
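+// web_fetch.ts - Core structure
+import { z } from "zod";
+// ToolDefinition, escapeXmlContent, and DESCRIPTION are assumed to come from
+// the existing local agent tool modules and the tool description above.
+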
+const webFetchSchema = z.object({
+  url: z.string().describe("URL to fetch"),
+});
+
+// URL validation: only http: and https: schemes
+// No private IP blocking (user decision: allow with consent)
+// Timeout: 15 seconds via AbortController
+// User-Agent: set a reasonable browser-like string
+
+// Content-Type handling:
+// text/html → Readability extraction → Turndown markdown → truncate
+// application/json → return as ```json code block → truncate
+// text/plain, text/markdown → return as-is → truncate
+// application/pdf, image/* → return "not supported" message
+// other → attempt text extraction, fall back to "not supported"
+
+// Truncation: reuse MAX_TEXT_SNIPPET_LENGTH (16,000 chars) pattern
+
+export const webFetchTool: ToolDefinition<z.infer<typeof webFetchSchema>> = {
+  name: "web_fetch",
+  description: DESCRIPTION,
+  inputSchema: webFetchSchema,
+  defaultConsent: "ask",
+  // No isEnabled gate — available to all users
+
+  getConsentPreview: (args) => `Fetch page content: "${args.url}"`,
+
+  buildXml: (args, isComplete) => {
+    if (!args.url) return undefined;
+    let xml = `<dyad-web-fetch url="${escapeXmlContent(args.url)}">`;
+    if (isComplete) xml += "</dyad-web-fetch>";
+    return xml;
+  },
+
+  execute: async (args, ctx) => {
+    // 1. Validate URL scheme (http/https only)
+    // 2. Fetch with timeout (AbortController, 15s)
+    // 3. Check Content-Type header
+    // 4. For HTML: parse with Readability, convert with Turndown
+    // 5. For JSON: wrap in code block
+    // 6. For text: return as-is
+    // 7. For unsupported: return clear message
+    // 8. Truncate to MAX_TEXT_SNIPPET_LENGTH
+    // 9. Return markdown string as tool result
+  },
+};
+````
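
The Content-Type handling described in the comments above can be sketched as a pure routing function plus a JSON formatter. This is an illustration under stated assumptions: `routeContentType`, `ContentRoute`, and `formatJsonBody` are hypothetical names, and pretty-printing the JSON is a choice made here rather than something the plan specifies.

```typescript
// Hypothetical route labels matching the cases listed in the plan.
type ContentRoute = "html" | "json" | "text" | "unsupported" | "other";

export function routeContentType(header: string | null): ContentRoute {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mime = (header ?? "").split(";")[0].trim().toLowerCase();
  if (mime === "text/html") return "html";
  if (mime === "application/json") return "json";
  if (mime === "text/plain" || mime === "text/markdown") return "text";
  if (mime === "application/pdf" || mime.startsWith("image/")) return "unsupported";
  // Missing or unknown types: the caller may attempt text extraction.
  return "other";
}

// Built at runtime so this example does not contain a literal code fence.
const FENCE = "`".repeat(3);

export function formatJsonBody(body: string): string {
  let pretty = body;
  try {
    // Assumption: pretty-print valid JSON; fall back to the raw body otherwise.
    pretty = JSON.stringify(JSON.parse(body), null, 2);
  } catch {
    // Keep the raw body when it is not valid JSON.
  }
  return `${FENCE}json\n${pretty}\n${FENCE}`;
}
```

Keeping the routing separate from the network call makes the Content-Type unit tests in the testing strategy straightforward, since they need no mocked responses.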
+
+## Implementation Plan
+
+### Phase 1: Core Tool
+
+- [ ] Add dependencies: `turndown`, `@types/turndown`, `linkedom`, `@mozilla/readability` (evaluate `defuddle` as alternative)
+- [ ] Create `src/pro/main/ipc/handlers/local_agent/tools/web_fetch.ts` with:
+  - URL scheme validation
+  - Fetch with AbortController timeout (15 seconds)
+  - Content-Type detection and routing
+  - Readability extraction for HTML
+  - Turndown markdown conversion
+  - JSON/text/unsupported content handling
+  - Truncation using existing pattern
+  - Proper error messages for common failure modes
+- [ ] Register `webFetchTool` in `tool_definitions.ts` TOOL_DEFINITIONS array
+- [ ] Write tool description with clear when-to-use / when-not-to-use guidance
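
The timeout item in Phase 1 can be sketched with the standard `AbortController` pattern. This is a minimal sketch, not the plan's code: `fetchWithTimeout` is a hypothetical helper, and the injectable `fetchImpl` parameter is an assumption added so the behavior can be unit-tested without network access.

```typescript
// Hypothetical helper; `fetchImpl` defaults to the global fetch (Node 18+).
export async function fetchWithTimeout(
  url: string,
  timeoutMs: number = 15000,
  fetchImpl: typeof fetch = fetch,
): Promise<Response> {
  const controller = new AbortController();
  // Abort the request if no response arrives within the time limit.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { signal: controller.signal });
  } catch (error) {
    // Distinguish our own timeout from other network failures.
    if (controller.signal.aborted) {
      throw new Error(`Timed out after ${timeoutMs} ms fetching ${url}`);
    }
    throw error;
  } finally {
    // The timer only guards the wait for response headers; reading a
    // slow body afterwards is not covered by this sketch.
    clearTimeout(timer);
  }
}
```

Converting the abort into a dedicated timeout message makes it easier to map the failure to the timeout error state instead of a generic network error.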
+
+### Phase 2: Renderer Component
+
+- [ ] Create `DyadWebFetch` component to render `<dyad-web-fetch>` XML tags
+- [ ] Implement loading state (URL + spinner)
+- [ ] Implement completed state (page title + URL, expandable markdown preview)
+- [ ] Implement error states
+- [ ] Show truncation indicator when content was truncated
+- [ ] Register in the markdown parser's XML tag handler
+
+### Phase 3: Testing
+
+- [ ] Unit tests for URL validation (scheme checking, malformed URLs)
+- [ ] Unit tests for Content-Type handling (HTML, JSON, text, PDF, images)
+- [ ] Unit tests for HTML-to-markdown conversion (simple pages, complex pages, empty bodies)
+- [ ] Unit tests for truncation behavior
+- [ ] Unit tests for timeout/error handling (mock fetch failures, non-200 responses)
+- [ ] Integration test: verify tool appears in `buildAgentToolSet` output (no `isEnabled` gate)
+- [ ] Manual E2E testing with real URLs in local agent chat
+
+## Testing Strategy
+
+- [ ] Unit test URL scheme validation: verify `file://`, `ftp://`, `data:` are rejected; `http://` and `https://` are accepted
+- [ ] Unit test Content-Type routing: verify HTML → readability+turndown, JSON → code block, text → as-is, PDF → error message
+- [ ] Unit test HTML conversion with various inputs: simple pages, pages with scripts/styles, empty bodies, non-UTF-8 encoding
+- [ ] Unit test truncation: verify content over 16K chars is truncated with indicator
+- [ ] Unit test error handling: mock network failures, timeouts, 403/404 responses, non-200 status codes
+- [ ] Integration test: verify `webFetchTool` is included in tool set for both Pro and non-Pro contexts
+- [ ] Manual test: verify consent dialog, loading card, completed card, error states in the actual UI
+- [ ] Manual test: verify tool is NOT triggered for clone/replicate intent (web_crawl should be used instead)
+
+## Risks & Mitigations
+
+| Risk                                                                  | Likelihood | Impact | Mitigation                                                                                     |
+| --------------------------------------------------------------------- | ---------- | ------ | ---------------------------------------------------------------------------------------------- |
+| JS-rendered SPAs return minimal content                               | Medium     | Medium | Clear tool description noting limitation; LLM can explain to user; Pro users can use web_crawl |
+| LLM confuses web_fetch with web_crawl or web_search                   | Low        | Medium | Precise, mutually-exclusive tool descriptions with explicit when/when-not guidance             |
+| Large HTML pages block Electron main process during conversion        | Low        | Medium | Truncate raw HTML before processing; move to worker thread in follow-up if needed              |
+| Content quality varies across sites (paywalls, anti-bot)              | Medium     | Low    | Return clear error messages; user can fall back to manual copy-paste                           |
+| New dependencies (turndown, readability) introduce maintenance burden | Low        | Low    | Both are mature, stable libraries with large install bases                                     |
+| "Accept always" consent enables unbounded fetch loops                 | Low        | Medium | Monitor; consider per-turn fetch limit in follow-up if abuse is observed                       |
+
+## Open Questions
+
+- **Readability vs. Defuddle:** Evaluate `defuddle` (by kepano, created for Obsidian Web Clipper) as a potential alternative to `@mozilla/readability`. Defuddle may offer better extraction for modern web pages. Decision can be made during implementation based on testing.
+- **DOM library:** `linkedom` is included as the DOM implementation since `@mozilla/readability` requires a DOM `Document` and Electron's main process doesn't provide one (`turndown` parses HTML strings itself under Node.js). `linkedom` is lightweight (~50KB) and much faster than JSDOM.
+- **Multiple URLs per message:** When a user pastes 2-5 URLs, the LLM may call `web_fetch` multiple times. Each triggers a separate consent dialog. If this proves disruptive, consider batch consent UI in a follow-up.
+- **Stale content:** Fetched content is point-in-time. For long conversations, consider adding timestamps to fetch cards and a re-fetch capability in a follow-up.
+
+## Decision Log
+
+| Decision                                                 | Reasoning                                                                                                                                                                        |
+| -------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| New tool (`web_fetch`) rather than extending `web_crawl` | Use cases are fundamentally different (read vs. clone). Separate tools = cleaner code, clearer LLM descriptions, independent consent settings. All 3 roles agreed independently. |
+| Available to all users (free + Pro)                      | Local fetch has zero infrastructure cost. Differentiates free tier. Natural upsell to Pro for enhanced crawl+screenshot.                                                         |
+| LLM-triggered, not auto-detected                         | Consistent with existing tool architecture. Auto-detection would require new handler-layer logic and might fetch URLs users didn't intend.                                       |
+| Allow private/localhost IPs                              | Dyad runs locally; SSRF is a server-side threat model. Fetching localhost:3000 or internal docs is a legitimate use case. Consent dialog provides sufficient protection.         |
+| Include @mozilla/readability in v1                       | Dramatically better content extraction (strips nav, footer, ads). Small marginal cost (one extra dependency). All roles agreed.                                                  |
+| Handle Content-Type gracefully                           | ~15 lines of code prevents confusing failures for JSON, text, PDF URLs. Better UX for minimal effort.                                                                            |
+| Consent default: "ask"                                   | Consistent with web_crawl and web_search. Network requests to arbitrary external URLs warrant explicit approval.                                                                 |
+| Truncation at 16K characters                             | Matches existing `MAX_TEXT_SNIPPET_LENGTH`. Prevents context window overflow while providing substantial content.                                                                |
+| Tool name: `web_fetch`                                   | Consistent with `web_search`, `web_crawl` naming convention. Clear, concise, action-oriented.                                                                                    |
+
+---
+
+_Generated by dyad:swarm-to-plan_