Unverified commit a30c45d7, authored by Will Chen, committed by GitHub

Add dangerous action safeguards plan (#2733)

## Summary
- Add planning documentation for dangerous-action guardrails and implementation approach.
- Document detection and mitigation strategies for potentially destructive operations.
- Define acceptance criteria and rollout/testing recommendations.
## Test plan
- Manual review of the plan document for completeness and consistency.
- Validate markdown formatting and consistency with repository conventions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Parent 5b7497fb
# Dangerous Action Guards
> Generated by swarm planning session on 2026-02-14
## Summary
Add automatic safety guards that detect and warn users before executing dangerous actions -- destructive SQL queries, malicious npm packages, and suspicious code patterns -- even when auto-approve is enabled. Includes a "dangerous approval override" toggle for power users who want to bypass all safety checks.
## Problem Statement
Users building apps with Dyad can inadvertently (or through prompt injection) execute destructive actions. Today, Dyad's only defense is the consent banner ("Allow once / Always allow / Decline"), which users frequently bypass with auto-approve or "Always allow" settings. Once bypassed, there is **zero validation**:
- SQL queries run as-is -- a single `DROP TABLE` can destroy hours of work
- Package names are passed directly to shell commands with no validation (and there is an **existing command injection vulnerability** in `executeAddDependency.ts`)
- File writes from the LLM are completely unscanned
The LLM is an untrusted actor. Prompt injection, hallucination, and model errors can generate destructive operations the user never intended. Auto-approve removes the last line of defense. Users trust Dyad to help them build safely.
## Scope
### In Scope (MVP)
1. **Dangerous SQL detection** -- Heuristic pattern matching for destructive SQL operations (DROP, TRUNCATE, DELETE without WHERE, etc.). Force an enhanced consent prompt even if auto-approve is enabled.
2. **Malicious npm package detection** -- Input sanitization (fix command injection vulnerability), registry existence check pre-install, `npm audit` post-install for known CVEs.
3. **Narrow code injection scanning** -- High-confidence pattern detection for reverse shells, crypto miners, credential exfiltration, and obfuscated eval payloads. Near-zero false positive tolerance.
4. **Enhanced consent banner** -- Danger variant with red/destructive styling, human-readable explanations, and two-button design (no "Always allow" for dangerous actions).
5. **Dangerous approval override** -- Settings toggle to skip all danger checks, with confirmation dialog requiring typed acknowledgment and persistent UI indicator when active.
6. **package.json write detection** -- When `write_file` or `search_replace` targets `package.json`, run the same package validation on newly-added dependencies.
7. **Telemetry** -- Track danger detections, categories, and user decisions (allow/decline) to tune false positive rates.
### Out of Scope (Follow-up)
- LLM-based SQL semantic analysis (expensive, latency, provider dependency)
- Comprehensive code security scanning beyond the narrow pattern set
- MCP tool danger detection (MCP tools are opaque -- we don't control their behavior)
- Typosquatting detection (requires maintaining/fetching popular package lists)
- Sandboxed SQL execution / dry-run mode
- Build-mode proposal security risk interception (separate code path from tool consent)
- Per-category danger guard enable/disable in settings
## User Stories
- As a user with auto-approve enabled, I want Dyad to still warn me before executing destructive SQL so that I don't accidentally lose data.
- As a user building with Supabase, I want to see exactly why a SQL query was flagged as dangerous so that I can make an informed decision to proceed or decline.
- As a user adding dependencies, I want Dyad to warn me if a package is known-malicious or has known vulnerabilities so that I don't introduce security issues into my app.
- As a user, I want to see a clear explanation of why an action was flagged so that I can dismiss false positives confidently.
- As a power user, I want to disable danger checks entirely so that I can work without interruption when I know what I'm doing.
- As a user reviewing agent actions (auto-approve OFF), I want danger context in the consent banner so that I can make better-informed decisions about which actions to allow.
## UX Design
### User Flow
**Flow 1: Dangerous SQL detected (auto-approve ON)**
1. User has auto-approve enabled and is iterating on their app
2. Agent generates a SQL query (e.g., `DROP TABLE users`)
3. `dangerCheck` on the SQL tool detects destructive pattern
4. Instead of auto-executing, the system intercepts and shows a **danger consent banner**
5. Banner shows: "Auto-approve paused: this query will permanently delete the `users` table and all its data"
6. User clicks "Allow anyway" (destructive style) or "Decline" (default focus)
7. If approved, execution continues; if declined, the agent gets feedback that the action was blocked
**Flow 2: Dangerous SQL detected (auto-approve OFF)**
1. Agent generates a destructive SQL query
2. `dangerCheck` detects the pattern
3. The normal consent banner is shown but with **enhanced danger styling** (red border, ShieldAlert icon, explanation text)
4. User reviews and decides with better context than the standard consent banner provides
**Flow 3: Malicious npm package detected**
1. Agent attempts to install a package
2. Package name is validated (sanitization regex) -- invalid names are rejected immediately
3. Registry existence check confirms the package exists
4. If the consent banner fires (ask mode or danger-escalated), it includes package metadata
5. After installation, `npm audit --json` runs and parses results
6. If vulnerabilities found: critical/high severity shows red `danger` banner; moderate/low shows amber `warning` banner
7. User reviews advisory details and decides
**Flow 4: Suspicious code detected**
1. Agent writes code via `write_file`, `edit_file`, or `search_replace`
2. Content is scanned against the high-confidence pattern set
3. If a pattern matches, a danger banner appears showing the filename, flagged snippet, and a specific explanation (e.g., "This code appears to open a reverse shell connection to an external server")
4. User reviews and decides
**Flow 5: Enabling dangerous approval override**
1. User navigates to Settings > Safety section
2. Finds "Skip all danger checks" toggle (default: OFF)
3. Toggling ON opens a confirmation dialog: "This will skip all safety warnings for dangerous SQL, suspicious packages, and potentially malicious code. Actions will proceed without review."
4. Dialog requires typing "I understand" to confirm
5. Once enabled, a persistent shield-off icon appears in the chat header/status bar
6. Icon is clickable to jump back to the setting
7. All danger checks are bypassed; normal consent flow still applies per tool settings
### Key States
- **Default (no danger)**: Invisible. Zero friction. Actions proceed normally per consent settings.
- **Danger detected (auto-approve ON)**: Red/destructive banner with ShieldAlert icon, explanation, two buttons. Auto-approve paused.
- **Danger detected (auto-approve OFF)**: Enhanced consent banner with red styling and danger explanation. Same two buttons.
- **Warning detected (lower severity)**: Amber banner with AlertTriangle icon. Moderate/low npm advisories, DELETE with WHERE clause, etc.
- **Checking safety (async)**: Brief inline indicator ("Checking packages...") only for async checks like npm registry lookup. Not shown for instant checks (SQL regex).
- **Override active**: Persistent shield-off indicator in chat header. All danger checks bypassed.
- **Check failed/unavailable**: Fail-open with subtle notification: "Safety check unavailable -- proceeding." User knows the guard wasn't active.
### Interaction Details
**Danger consent banner:**
- Visually distinct from standard consent banner: red/destructive color scheme, ShieldAlert icon (not Bot icon)
- Includes: category label ("Dangerous SQL" / "Vulnerable Package" / "Suspicious Code"), human-readable explanation, expandable content preview
- Two buttons only: "Allow anyway" (destructive variant) and "Decline" (default style)
- No "Always allow" option -- you cannot permanently approve dangerous actions by category
- Not dismissible via X button -- only explicit button clicks
- Takes priority in consent queue (dangerous items shown first)
- When auto-approve is ON, banner copy reads "Auto-approve paused: [explanation]"
**Keyboard navigation:**
- "Decline" is default focused (Enter = safe action)
- "Allow anyway" requires Tab + Enter (deliberate action)
**Queue behavior:**
- If the agent fires 5 actions with auto-approve ON and 1 of them is dangerous, the 4 safe ones auto-execute and the dangerous one pauses for consent
- Multiple dangerous actions in parallel: show sequentially with queue count
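The queue priority described above could be sketched as a stable sort that moves dangerous requests to the front. `PendingConsent` and `orderConsentQueue` are hypothetical names for illustration; only the `dangerInfo` field mirrors this plan.

```typescript
// Hypothetical shape of a queued consent request (not the actual Dyad type).
interface PendingConsent {
  requestId: string;
  dangerInfo: { level: "warning" | "danger"; message: string } | null;
}

// Dangerous requests come first. Within each group, arrival order is kept,
// because Array.prototype.sort is stable in ES2019 and later.
function orderConsentQueue(queue: PendingConsent[]): PendingConsent[] {
  const rank = (request: PendingConsent): number => (request.dangerInfo ? 0 : 1);
  return [...queue].sort((a, b) => rank(a) - rank(b));
}
```

The queue count shown in the banner would then just be the number of leading entries with a non-null `dangerInfo`.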
**Danger explanation quality (required templates):**
| Pattern | Explanation Template |
| ----------------------------- | ------------------------------------------------------------------------------------- |
| `DROP TABLE x` | "This query will permanently delete the `{table}` table and all its data" |
| `DROP DATABASE x` | "This query will permanently delete the entire `{database}` database" |
| `TRUNCATE x` | "This query will delete all rows from the `{table}` table" |
| `DELETE FROM x` (no WHERE) | "This query will delete all rows from the `{table}` table" |
| `ALTER TABLE x DROP COLUMN y` | "This query will permanently remove the `{column}` column from the `{table}` table" |
| `GRANT` / `REVOKE` | "This query modifies database permissions" |
| npm critical/high advisory | "Package `{name}` has a known vulnerability: {advisory_title} (severity: {severity})" |
| npm moderate/low advisory | "Package `{name}` has a known advisory: {advisory_title} (severity: {severity})" |
| Reverse shell pattern | "This code appears to open a reverse shell connection to an external server" |
| Crypto miner pattern | "This code contains patterns associated with cryptocurrency mining" |
| Credential exfiltration | "This code appears to send environment variables to an external URL" |
| Obfuscated eval | "This code contains an obfuscated execution pattern (base64-decoded eval)" |
### Accessibility
- Not color-alone: danger banner differs via icon (ShieldAlert vs Bot), text label ("Potentially dangerous" vs standard), AND color
- `aria-live="polite"` on danger banner (not "assertive" -- the agent is paused, no urgency to interrupt)
- Focus moves to danger banner when it appears; returns to chat input on resolution
- "Skip all danger checks" toggle uses `aria-describedby` to reference its warning text
- Confirmation dialog is keyboard-navigable and screen-reader announced
## Technical Design
### Architecture
Add a `dangerCheck` method to the existing `ToolDefinition` interface. This runs before consent and can escalate the consent level from "always" to forced-ask with danger context. The detection logic is per-tool (each tool knows its domain), while the consent escalation is centralized in `buildAgentToolSet`.
```
Tool invocation → dangerCheck() → if dangerous, force consent with dangerInfo
                                → if safe, proceed with normal consent flow
```
New module: `src/pro/main/ipc/handlers/local_agent/danger_detection/` containing:
- `sql_heuristics.ts` -- SQL pattern matching
- `npm_validation.ts` -- Package name sanitization + registry/audit checks
- `code_scanning.ts` -- High-confidence malicious code patterns
- `types.ts` -- Shared types (`DangerCheckResult`)
### Components Affected
| Component | File(s) | Change Type |
| --------------------- | ---------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| Tool definition types | `tools/types.ts` | Add `dangerCheck` to `ToolDefinition` interface |
| Tool set builder | `tool_definitions.ts` | Wire `dangerCheck` into execute wrapper, pass `dangerInfo` to consent request |
| SQL tool | `tools/execute_sql.ts` | Add SQL danger heuristics via `dangerCheck` |
| Dependency tool | `tools/add_dependency.ts` | Add package validation via `dangerCheck` |
| Dependency processor | `executeAddDependency.ts` | **Fix command injection**: use `execFile` with array args; add post-install `npm audit` |
| File write tools | `tools/write_file.ts`, `tools/edit_file.ts`, `tools/search_replace.ts` | Add code scanning via `dangerCheck`; add package.json filename detection |
| Settings schema | `src/lib/schemas.ts` | Add `dangerousApprovalOverride` field |
| Settings UI | New "Safety" section in settings | Toggle with confirmation dialog |
| Consent banner | `AgentConsentBanner.tsx` | Danger variant (red styling, two buttons, explanation, priority queue) |
| Consent types | IPC payload types | Add `dangerInfo` to consent request |
| Chat UI | Chat header/status area | Persistent shield-off indicator when override is active |
| Telemetry | Agent handler | Emit danger detection events |
### Data Model Changes
**UserSettings additions (in `schemas.ts`):**
```typescript
dangerousApprovalOverride: z.boolean().optional(), // default: false
```
**New types:**
```typescript
interface DangerCheckResult {
  level: "warning" | "danger";
  category: "destructive_sql" | "malicious_package" | "suspicious_code";
  message: string; // Human-readable explanation (required, specific)
  details?: string; // Extended details (full query, advisory URL, etc.)
}
```
**Extended consent request payload:**
```typescript
// In agent-tool:consent-request IPC event
{
  requestId: string;
  chatId: number;
  toolName: string;
  toolDescription: string;
  inputPreview: string;
  dangerInfo: DangerCheckResult | null; // NEW
}
```
**Extended ToolDefinition interface:**
```typescript
interface ToolDefinition<T> {
  // ... existing fields ...
  dangerCheck?: (
    args: T,
    ctx: AgentContext,
  ) => Promise<DangerCheckResult | null>;
}
```
### API Changes
- **Modified `buildAgentToolSet` execute wrapper**: Before calling `requireConsent`, run `dangerCheck`. If result is non-null and `dangerousApprovalOverride` is not enabled, force consent to "ask" and include `dangerInfo` in the consent request payload.
- **Modified consent request IPC**: Add `dangerInfo` field to `agent-tool:consent-request` event.
- **Modified consent response**: When `dangerInfo` is present, only accept `"accept-once"` or `"decline"` (no `"accept-always"`).
- **New telemetry events**: `danger_check:detected` and `danger_check:override` with category, tool name, and user decision.
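As a rough illustration of the escalation rules above, the decision can be isolated into a pure function that the execute wrapper would call before `requireConsent`. This is a minimal sketch, not the actual `buildAgentToolSet` code: `planConsent`, `ConsentPlan`, and the two-value `ConsentLevel` are assumptions made for the example.

```typescript
interface DangerCheckResult {
  level: "warning" | "danger";
  category: "destructive_sql" | "malicious_package" | "suspicious_code";
  message: string;
  details?: string;
}

type ConsentLevel = "always" | "ask"; // assumed subset of the real consent settings
type ConsentResponse = "accept-once" | "accept-always" | "decline";

interface ConsentPlan {
  mustAsk: boolean;
  dangerInfo: DangerCheckResult | null;
  allowedResponses: ConsentResponse[];
}

// Decide whether to prompt, and which responses the banner may return.
function planConsent(
  configured: ConsentLevel,
  danger: DangerCheckResult | null,
  overrideEnabled: boolean,
): ConsentPlan {
  // The override suppresses danger escalation but not the normal consent flow.
  const effectiveDanger = overrideEnabled ? null : danger;
  if (effectiveDanger) {
    return {
      mustAsk: true,
      dangerInfo: effectiveDanger,
      allowedResponses: ["accept-once", "decline"], // no "accept-always"
    };
  }
  return {
    mustAsk: configured === "ask",
    dangerInfo: null,
    allowedResponses: ["accept-once", "accept-always", "decline"],
  };
}
```

Keeping this logic free of IPC and UI concerns would make the integration tests in the Testing Strategy section easier to write.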
### SQL Danger Heuristics
Patterns to detect (case-insensitive, ignoring SQL comments):
| Pattern | Level | Template |
| ----------------------------- | ------- | ----------------------------------------------------- |
| `DROP TABLE` | danger | "permanently delete the `{table}` table" |
| `DROP DATABASE` | danger | "permanently delete the entire `{database}` database" |
| `TRUNCATE [TABLE]` | danger | "delete all rows from the `{table}` table" |
| `DELETE FROM` without `WHERE` | danger | "delete all rows from the `{table}` table" |
| `ALTER TABLE ... DROP COLUMN` | warning | "permanently remove the `{column}` column" |
| `GRANT` / `REVOKE` | warning | "modifies database permissions" |
| `DROP SCHEMA` / `DROP INDEX` | warning | "permanently delete database object" |
Implementation notes:
- Strip SQL comments (`--`, `/* */`) before pattern matching to prevent bypass
- Handle multi-statement queries (split on `;` and check each)
- Sub-millisecond execution (regex only, no parsing)
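The implementation notes above might look roughly like the following. This is a sketch covering three of the patterns only; `SqlRule`, `SQL_RULES`, and `checkSql` are hypothetical names, and the comment stripping and `;` splitting are deliberately naive (they do not account for comment markers or semicolons inside string literals).

```typescript
interface SqlFinding {
  level: "warning" | "danger";
  message: string;
}

interface SqlRule {
  pattern: RegExp; // must capture the object name in group 1
  unless?: RegExp; // the statement is not flagged if this also matches
  level: "warning" | "danger";
  explain: (name: string) => string;
}

const SQL_RULES: SqlRule[] = [
  {
    pattern: /^DROP\s+TABLE\s+(?:IF\s+EXISTS\s+)?([\w."]+)/i,
    level: "danger",
    explain: (t) => `This query will permanently delete the \`${t}\` table and all its data`,
  },
  {
    pattern: /^TRUNCATE\s+(?:TABLE\s+)?([\w."]+)/i,
    level: "danger",
    explain: (t) => `This query will delete all rows from the \`${t}\` table`,
  },
  {
    pattern: /^DELETE\s+FROM\s+([\w."]+)/i,
    unless: /\bWHERE\b/i,
    level: "danger",
    explain: (t) => `This query will delete all rows from the \`${t}\` table`,
  },
];

// Remove /* */ block comments and -- line comments before matching.
function stripSqlComments(sql: string): string {
  return sql.replace(/\/\*[\s\S]*?\*\//g, " ").replace(/--[^\n]*/g, " ");
}

// Check each statement of a (possibly multi-statement) query.
function checkSql(sql: string): SqlFinding[] {
  const findings: SqlFinding[] = [];
  for (const raw of stripSqlComments(sql).split(";")) {
    const statement = raw.trim();
    for (const rule of SQL_RULES) {
      const match = rule.pattern.exec(statement);
      if (match && !(rule.unless && rule.unless.test(statement))) {
        findings.push({ level: rule.level, message: rule.explain(match[1]) });
        break; // one finding per statement is enough for the banner
      }
    }
  }
  return findings;
}
```

Because the rules are anchored at the start of each statement, destructive statements nested inside CTEs or functions are not caught; that limitation is consistent with the regex-only approach chosen here.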
### npm Package Validation
**Pre-install (in `dangerCheck`):**
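1. Validate package name against npm naming rules, with a restricted version suffix: `^(@[a-z0-9-~][a-z0-9-._~]*/)?[a-z0-9-~][a-z0-9-._~]*(@[A-Za-z0-9._~^+-]+)?$` (a permissive `(@.*)?` suffix would let shell metacharacters through after the `@`)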
2. Reject any name that doesn't match (prevents command injection AND invalid packages)
3. Fetch `https://registry.npmjs.org/{package}` to confirm existence and check `deprecated` flag
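A minimal sketch of the name validation step, assuming the name and the optional `@version` suffix are validated separately. `parsePackageSpec` and both regex constants are hypothetical; the version character set is an assumption of this sketch (it accepts plain versions, `^`/`~` ranges, and dist-tags, but not ranges containing spaces or comparison operators).

```typescript
// Name rule following npm's lowercase, URL-safe naming convention.
const NAME_RE = /^(?:@[a-z0-9-~][a-z0-9-._~]*\/)?[a-z0-9-~][a-z0-9-._~]*$/;
// Assumption: versions and dist-tags without spaces or shell metacharacters.
const VERSION_RE = /^[A-Za-z0-9._~^+-]+$/;

interface PackageSpec {
  name: string;
  version?: string;
}

// Returns the parsed spec, or null when the input should be rejected.
function parsePackageSpec(spec: string): PackageSpec | null {
  // A leading "-" could be read as a CLI flag by the package manager.
  if (spec.startsWith("-")) return null;
  // The version separator is the last "@" that is not the leading scope marker.
  const at = spec.lastIndexOf("@");
  const name = at > 0 ? spec.slice(0, at) : spec;
  const version = at > 0 ? spec.slice(at + 1) : undefined;
  if (name.length > 214 || !NAME_RE.test(name)) return null; // npm caps names at 214 chars
  if (version !== undefined && !VERSION_RE.test(version)) return null;
  return version === undefined ? { name } : { name, version };
}
```

Rejecting a leading `-` matters even when `execFile` is used, since array arguments prevent shell injection but not option injection.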
**Post-install (in `executeAddDependency`):**
1. Run `npm audit --json` or `pnpm audit --json` in the app directory
2. Parse output for new vulnerabilities
3. If critical/high: show `danger` banner with advisory details
4. If moderate/low: show `warning` banner
5. Cache advisory data locally with 24-hour TTL for repeated installs
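The severity mapping in steps 3 and 4 could be sketched as below. It assumes the report exposes per-severity counts under `metadata.vulnerabilities`, as `npm audit --json` does in npm 7 and later; the exact shape for `pnpm audit --json` should be verified before relying on it. `auditLevel` is a hypothetical helper name.

```typescript
type Severity = "info" | "low" | "moderate" | "high" | "critical";

// Assumed subset of the audit report: only the severity counts are read.
interface AuditReport {
  metadata?: { vulnerabilities?: Partial<Record<Severity, number>> };
}

// Map raw audit output to the banner level, or null when nothing should be shown.
function auditLevel(json: string): "danger" | "warning" | null {
  let report: AuditReport;
  try {
    report = JSON.parse(json) as AuditReport;
  } catch {
    return null; // unparseable output: treat the check as unavailable (fail-open)
  }
  const counts = report.metadata?.vulnerabilities ?? {};
  if ((counts.critical ?? 0) + (counts.high ?? 0) > 0) return "danger";
  if ((counts.moderate ?? 0) + (counts.low ?? 0) > 0) return "warning";
  return null;
}
```

This reports on the whole dependency tree. To surface only new vulnerabilities, as step 2 requires, the counts (or advisory IDs) would need to be compared against an audit taken before the install.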
**Command injection fix (immediate, independent):**
- Replace ``exec(`pnpm add ${packageStr}`)`` with `execFile("pnpm", ["add", ...packages])` or equivalent
- Validate all package name strings before any shell interaction
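A hedged sketch of the fix, using Node's `execFile` so that package specs are passed as discrete arguments rather than interpolated into a shell string. `buildAddArgs`, `addDependencies`, and `SAFE_SPEC` are hypothetical names, and the regex is an assumption of this sketch rather than the project's final rule.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Assumed conservative rule: npm-style name plus an optional simple version or tag.
const SAFE_SPEC =
  /^(?:@[a-z0-9-~][a-z0-9-._~]*\/)?[a-z0-9-~][a-z0-9-._~]*(?:@[A-Za-z0-9._~^+-]+)?$/;

// Build the argument array, rejecting anything that is not a plain package spec.
function buildAddArgs(packages: string[]): string[] {
  for (const spec of packages) {
    // A leading "-" would be parsed as an option by the package manager.
    if (spec.startsWith("-") || !SAFE_SPEC.test(spec)) {
      throw new Error(`Rejected package spec: ${JSON.stringify(spec)}`);
    }
  }
  return ["add", ...packages];
}

// No shell is involved, so metacharacters in arguments are never interpreted.
async function addDependencies(packages: string[], appPath: string): Promise<string> {
  const { stdout } = await execFileAsync("pnpm", buildAddArgs(packages), { cwd: appPath });
  return stdout;
}
```

On Windows, `pnpm` is usually installed as a `.cmd` shim, which `execFile` may not be able to launch without a shell, so the real fix may need a platform-specific path while keeping the validation in place.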
### Code Injection Patterns
High-confidence, near-zero false positive patterns:
```typescript
const DANGER_PATTERNS = [
  // Reverse shells
  {
    pattern: /\b(nc|ncat|netcat)\s+-[a-z]*e\s/i,
    message: "reverse shell connection",
  },
  { pattern: /\/dev\/tcp\//, message: "reverse shell connection" },
  {
    // Word boundaries keep "sh" from matching inside words such as "push"
    pattern: /child_process.*?(exec|spawn).*?\b(bash|sh|cmd|powershell)\b/s,
    message: "shell execution",
  },
  // Crypto miners
  {
    pattern: /\b(coinhive|cryptonight|stratum\+tcp|xmrig)\b/i,
    message: "cryptocurrency mining",
  },
  // Credential exfiltration
  // Note: with the `s` flag, `.*?` spans the whole scanned content, so these
  // two match any snippet that reads process.env and later mentions fetch/http
  {
    pattern: /process\.env\b.*?\bfetch\s*\(/s,
    message: "environment variable exfiltration",
  },
  {
    pattern: /process\.env\b.*?\bhttp/s,
    message: "environment variable exfiltration",
  },
  // Obfuscated payloads
  { pattern: /\batob\s*\(.*?\beval\b/s, message: "obfuscated code execution" },
  {
    pattern: /Buffer\.from\s*\([^)]+,\s*['"]base64['"]\).*?\beval\b/s,
    message: "obfuscated code execution",
  },
];
```
Applied to content in `write_file`, `edit_file` (edit sketch content), and `search_replace` (replacement content). Not applied to the full file to avoid false positives from existing code.
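A minimal sketch of how the shared `scanContentForDangers` function might apply such patterns to the written content and return a snippet for the banner. Only two patterns are included to keep the example short; the `CodeFinding` shape and the 20-character context window are assumptions.

```typescript
interface CodeFinding {
  message: string;
  snippet: string; // short excerpt around the match, for the banner preview
}

// Two of the plan's patterns, paired with their explanation templates.
const SCAN_PATTERNS: { pattern: RegExp; message: string }[] = [
  {
    pattern: /\/dev\/tcp\//,
    message: "This code appears to open a reverse shell connection to an external server",
  },
  {
    pattern: /\b(coinhive|cryptonight|stratum\+tcp|xmrig)\b/i,
    message: "This code contains patterns associated with cryptocurrency mining",
  },
];

// Scan only the content being written (not the existing file on disk).
function scanContentForDangers(content: string): CodeFinding[] {
  const findings: CodeFinding[] = [];
  for (const { pattern, message } of SCAN_PATTERNS) {
    const match = pattern.exec(content);
    if (match) {
      const start = Math.max(0, match.index - 20);
      const end = match.index + match[0].length + 20;
      findings.push({ message, snippet: content.slice(start, end) });
    }
  }
  return findings;
}
```

For `search_replace`, the caller would pass the replacement text; for `edit_file`, the edit sketch content, as stated above.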
## Implementation Plan
### Phase 0: Security Fix (Independent, Ship Immediately)
- [ ] Fix command injection in `executeAddDependency.ts` -- replace string interpolation with `execFile` array args or validate package names with regex before shell execution
- [ ] Add unit tests for package name validation
### Phase 1: Foundation
- [ ] Add `dangerCheck` field to `ToolDefinition` interface in `tools/types.ts`
- [ ] Add `DangerCheckResult` type to `danger_detection/types.ts`
- [ ] Wire `dangerCheck` into `buildAgentToolSet` execute wrapper -- run before consent, force "ask" if dangerous
- [ ] Extend consent request IPC payload with `dangerInfo: DangerCheckResult | null`
- [ ] Update `AgentConsentBanner.tsx` with danger variant: red styling, ShieldAlert icon, explanation text, two-button layout (no "Always allow"), priority queue ordering, not X-dismissible
- [ ] Add `aria-live="polite"`, focus management, keyboard defaults (Decline focused)
- [ ] Add danger detection telemetry: `danger_check:detected`, `danger_check:user_decision`
### Phase 2: SQL Danger Heuristics
- [ ] Implement `sql_heuristics.ts` with pattern matching for destructive operations
- [ ] Add `dangerCheck` to `executeSqlTool` that calls SQL heuristics
- [ ] Handle SQL comment stripping, multi-statement queries
- [ ] Add human-readable explanation templates with table/column name extraction
- [ ] Unit tests: corpus of dangerous and safe SQL, edge cases (DROP in comments, DELETE with complex WHERE, multi-statement)
### Phase 3: npm Package Validation
- [ ] Implement `npm_validation.ts` with package name sanitization regex
- [ ] Add pre-install registry existence check (`https://registry.npmjs.org/{package}`)
- [ ] Add `dangerCheck` to `addDependencyTool` for pre-install validation
- [ ] Add post-install `npm audit --json` / `pnpm audit --json` parsing in `executeAddDependency.ts`
- [ ] Map npm advisory severity to danger levels (critical/high = danger, moderate/low = warning)
- [ ] Add local caching for advisory data (24-hour TTL)
- [ ] Handle `@version` suffix in package names
- [ ] Unit tests: valid/invalid names, known vulnerable packages (mocked registry), severity mapping
### Phase 4: Code Injection Scanning
- [ ] Implement `code_scanning.ts` with high-confidence pattern set
- [ ] Add shared `scanContentForDangers(content: string)` function
- [ ] Add `dangerCheck` to `writeFileTool`, `editFileTool`, `searchReplaceTool`
- [ ] For `edit_file`: scan the edit sketch content, not the final merged file
- [ ] Add package.json detection: if target file is `package.json`, parse diff and run npm validation on new dependencies
- [ ] Per-pattern explanation templates
- [ ] Unit tests: known malicious patterns, legitimate code that looks suspicious (build tools, base64 in tests)
- [ ] Performance benchmark: verify sub-millisecond execution for regex patterns
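For the package.json detection item above, one possible approach is to compare the dependency maps before and after the write and validate only entries that are new or changed. This is a sketch under assumptions: `dependenciesToValidate` is a hypothetical helper, it reads only `dependencies` and `devDependencies`, and it resolves `npm:` aliases to the real package name (one answer to the alias question raised under Open Questions).

```typescript
interface PackageJsonLike {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Collect all dependency entries from package.json text (throws on invalid JSON).
function collectDependencies(text: string): Map<string, string> {
  const pkg = JSON.parse(text) as PackageJsonLike;
  return new Map(Object.entries({ ...pkg.devDependencies, ...pkg.dependencies }));
}

// Return the package names that are new or whose version spec changed.
function dependenciesToValidate(before: string, after: string): string[] {
  const previous = collectDependencies(before);
  const names: string[] = [];
  for (const [name, range] of collectDependencies(after)) {
    if (previous.get(name) === range) continue; // unchanged entry
    // "my-pkg": "npm:real-pkg@1.0.0" installs real-pkg, so validate that name.
    const alias = /^npm:((?:@[^@/]+\/)?[^@/]+)/.exec(range);
    names.push(alias ? alias[1] : name);
  }
  return names;
}
```

The returned names can then go through the same pre-install validation as `add_dependency`; a caller would need to decide how to treat a `package.json` that fails to parse.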
### Phase 5: Dangerous Approval Override
- [ ] Add `dangerousApprovalOverride: boolean` to `BaseUserSettingsFields` in `schemas.ts` (default: false)
- [ ] Wire override check into `buildAgentToolSet` -- skip `dangerCheck` when enabled
- [ ] Add "Safety" section in settings UI, visually separated from auto-approve
- [ ] Implement confirmation dialog with "I understand" text input requirement
- [ ] Add persistent shield-off indicator in chat header when override is active (clickable to jump to setting)
- [ ] Add telemetry for override enable/disable events
- [ ] Consider auto-expiry on app update (re-prompt user to re-enable)
## Testing Strategy
- [ ] **Unit tests for SQL heuristics**: Corpus of 50+ dangerous and safe SQL queries. Edge cases: DROP inside comments, DELETE with complex WHERE clauses, multi-statement queries, case variations, GRANT/REVOKE.
- [ ] **Unit tests for npm validation**: Valid package names, invalid/malicious names, `@scope/package` format, `package@version` format, names with special characters (command injection attempts).
- [ ] **Unit tests for code scanning**: Known malicious patterns, legitimate code that resembles patterns (build tools using eval, base64 in unit tests, process.env in config files).
- [ ] **Integration tests**: Verify that `dangerCheck` results flow through the consent system correctly -- forced consent shows danger banner even when consent is "always", danger info appears in banner, "accept-always" is not an option.
- [ ] **E2E tests**: Simulate agent attempting dangerous SQL with auto-approve ON; verify danger banner appears with correct explanation. Test override toggle flow.
- [ ] **Regression tests**: Ensure existing auto-approve workflows are not broken for non-dangerous operations. Verify zero-friction happy path.
- [ ] **Performance tests**: Benchmark SQL heuristics and code scanning to verify sub-millisecond execution on typical inputs.
## Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
| ---------------------------------------------------------------------- | ---------- | ------ | -------------------------------------------------------------------------------------------------------------------------------- |
| False positives erode user trust | HIGH | HIGH | Start with very high-confidence patterns only. Track override rates via telemetry. Remove patterns that produce false positives. |
| Command injection via package names (EXISTING) | HIGH | HIGH | Fix immediately in Phase 0, independent of feature work. Use `execFile` with array args. |
| Override + auto-approve = zero guardrails | MEDIUM | HIGH | Track this state in telemetry. Consider auto-expiry on app update. Persistent UI indicator. |
| Narrow code scanning creates false sense of security | MEDIUM | MEDIUM | Honest messaging: "checks for common malicious patterns" not "security scanning." Document known limitations. |
| npm audit coverage gaps (no typosquats, zero-days) | MEDIUM | MEDIUM | Accept as known limitation. Document. Consider Socket.dev integration in v2. |
| Performance impact on file writes from code scanning | LOW | MEDIUM | Regex-only patterns (sub-millisecond). Benchmark before shipping. |
| Bypass via indirect paths (write benign script that downloads malware) | MEDIUM | LOW | Fundamental limitation of static analysis. Accept and document. |
| npm registry/audit API unavailable (offline/outage) | LOW | LOW | Fail-open with notification: "Safety check unavailable -- proceeding." |
| Pattern list goes stale as threats evolve | LOW | MEDIUM | Keep pattern set small and high-signal. Easy to update (single file). |
| MCP tools bypass all danger checks | LOW | LOW | Document as known limitation. Out of scope for v1. |
## Open Questions
- **Build mode coverage**: The `autoApproveChanges` setting in build mode bypasses the proposal flow, including existing `SecurityRisk` warnings. This feature only covers local-agent mode. Should build mode be covered in v2?
- **`npm:` protocol aliases**: `package.json` edits could use `"my-pkg": "npm:malicious-pkg@1.0.0"` to bypass name validation. Should we parse these in the package.json detection?
- **Per-category danger guard settings**: Should users be able to disable SQL checks but keep npm checks? The `category` field on `DangerCheckResult` enables this in the future, but it's not in the MVP.
- **MCP tool danger detection**: MCP tools are opaque but could execute SQL or install packages. Future option: let MCP server authors declare danger levels in tool metadata.
## Decision Log
| Decision | Reasoning |
| ------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Heuristic SQL detection over LLM-based | LLM adds latency, cost, and provider dependency (violates Backend-Flexible principle). Heuristics catch 95%+ of destructive patterns with zero false positives on the obvious cases. |
| npm audit advisories over Socket.dev | Free, official, no API key needed. Socket.dev is more comprehensive but adds external dependency. Can upgrade later. |
| Include narrow code injection scanning in MVP | User decided. Scoped to near-zero false positive patterns (reverse shells, crypto miners, credential exfiltration). Performance impact is minimal (regex-only). |
| Include dangerous approval override in MVP | User decided. Mitigated with confirmation dialog (typed "I understand"), persistent UI indicator, and telemetry tracking. |
| Always show danger context (even with auto-approve OFF) | Enhances decision quality for all users. Same consent banner component, just with upgraded styling when danger is detected. |
| Advisory (forced consent) over blocking | Users can still proceed past warnings. This respects user autonomy while ensuring informed consent. The override toggle is the escape hatch from even this. |
| Two buttons only on danger banner (no "Always allow") | Permanently auto-approving dangerous actions defeats the purpose. Users approve per-instance or use the global override. |
| `dangerCheck` per-tool over centralized detection | Each tool knows its domain best. SQL heuristics are completely different from npm validation. Co-locating detection with the tool is cleaner and more extensible. |
| Fix command injection independently | This is a security bug that exists today, not a feature. Ship the fix immediately without waiting for the full danger guards feature. |
| Fail-open when checks are unavailable | Fail-closed would mean a third-party API outage blocks the user's work. Fail-open with notification is the right balance for a local-first tool. |
| `aria-live="polite"` over "assertive" | The agent is paused waiting for consent -- there's no urgency. "Assertive" would disruptively interrupt screen reader users. |
---
_Generated by dyad:swarm-to-plan_