Replace prompt-based stop hook with Sonnet-powered analysis (#2331)

## Summary

- Replace broken prompt-based Stop hook with command-based hook using Claude Sonnet
- Add .claude/hooks/stop-hook.py that reads conversation transcript and uses Sonnet to analyze task completion
- Includes infinite loop prevention via stop_hook_active check
- Add unit tests for the stop hook

## Test plan

- [x] Run pytest .claude/hooks/tests/test_stop_hook.py -v - all 9 tests pass
- [ ] Manual testing: verify stop hook fires and correctly analyzes task completion

#skip-bugbot

---

## Summary by cubic

Replaced the broken prompt-based stop hook with a command-based hook that blocks when tasks remain and uses Sonnet analysis as a fallback. Adds loop protection and tests.

- **New Features**
  - Added .claude/hooks/stop-hook.py that blocks when TaskCreate/TaskUpdate show remaining tasks, returning {"decision":"block","reason":...}. If none remain, it analyzes a 32k transcript (middle truncation) with Sonnet.
  - Added unit tests and a stop_hook_active guard to prevent infinite loops.
- **Refactors**
  - Updated .claude/settings.json to use the command-based hook (30000 ms timeout) instead of the prompt hook.
  - Added --no-session-persistence to Claude CLI calls in stop-hook.py and permission-request-hook.py.

<sup>Written for commit 575426cee9efb0fa7e1f4be64a8405ae2e717a3b. Summary will update on new commits.</sup>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
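
The "32k transcript (middle truncation)" step in the description could look roughly like the sketch below. This is not the code from stop-hook.py: the function name `truncate_middle`, the marker text, and the reading of "32k" as a character count (rather than tokens) are all assumptions made for illustration.

```python
def truncate_middle(text: str, limit: int = 32_000,
                    marker: str = "\n...[truncated]...\n") -> str:
    """Keep the start and end of `text`, dropping the middle when it exceeds `limit`.

    Hypothetical helper: `limit` is counted in characters, which is an assumption.
    """
    if len(text) <= limit:
        return text
    keep = limit - len(marker)          # characters of original text we can keep
    if keep <= 0:
        return text[:limit]             # limit too small to fit the marker
    head = (keep + 1) // 2              # first half (gets the extra char if odd)
    tail = keep // 2                    # second half
    # text[-0:] would return the whole string, so guard the tail == 0 case
    return text[:head] + marker + (text[-tail:] if tail else "")


print(truncate_middle("a" * 10 + "b" * 10, limit=12, marker=".."))  # aaaaa..bbbbb
```

Dropping the middle rather than the start keeps both the original user request and the most recent assistant output, which are the two parts a completion check is most likely to need.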

6ba51165 · Will Chen · GitHub · 67fe5048
--- a/.claude/hooks/permission-request-hook.py
+++ b/.claude/hooks/permission-request-hook.py
@@ -84,6 +84,7 @@ Analyze this request and provide your safety assessment. Respond with ONLY a JSO
                "--print",
                "--output-format", "text",
                "--model", "sonnet",
+                "--no-session-persistence",
                prompt
            ],
            capture_output=True,

--- a/.claude/hooks/stop-hook.py
+++ b/.claude/hooks/stop-hook.py
--- a/.claude/hooks/tests/fixtures/completed_tasks.jsonl
+++ b/.claude/hooks/tests/fixtures/completed_tasks.jsonl
+{"type": "user", "message": {"content": "Fix all the PR review comments on this branch"}}
+{"type": "assistant", "message": {"content": [{"type": "text", "text": "I'll help you fix the PR review comments. Let me create a task list."}, {"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Determine the PR", "activeForm": "Determining PR"}}]}}
+{"type": "tool_result", "content": "Task 1 created"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Fetch review threads", "activeForm": "Fetching threads"}}]}}
+{"type": "tool_result", "content": "Task 2 created"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Address comments", "activeForm": "Addressing comments"}}]}}
+{"type": "tool_result", "content": "Task 3 created"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "1", "status": "in_progress"}}]}}
+{"type": "tool_result", "content": "Task 1 in_progress"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "1", "status": "completed"}}]}}
+{"type": "tool_result", "content": "Task 1 completed"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "2", "status": "in_progress"}}]}}
+{"type": "tool_result", "content": "Task 2 in_progress"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "2", "status": "completed"}}]}}
+{"type": "tool_result", "content": "Task 2 completed"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "3", "status": "in_progress"}}]}}
+{"type": "tool_result", "content": "Task 3 in_progress"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "Edit", "input": {"file_path": "src/main.ts"}}]}}
+{"type": "tool_result", "content": "File edited"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "3", "status": "completed"}}]}}
+{"type": "tool_result", "content": "Task 3 completed"}
+{"type": "assistant", "message": {"content": [{"type": "text", "text": "I've completed all the PR review comments. Here's a summary:\n\n1. Found PR #123\n2. Fetched 2 review threads\n3. Fixed both issues by updating src/main.ts\n\nAll tasks are complete. The changes have been committed and pushed."}]}}
--- a/.claude/hooks/tests/fixtures/incomplete_tasks.jsonl
+++ b/.claude/hooks/tests/fixtures/incomplete_tasks.jsonl
+{"type": "user", "message": {"content": "Fix all the PR review comments on this branch"}}
+{"type": "assistant", "message": {"content": [{"type": "text", "text": "I'll help you fix the PR review comments. Let me create a task list to track progress."}, {"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Determine the PR to work on", "description": "Get the current branch's PR", "activeForm": "Determining PR"}}]}}
+{"type": "tool_result", "content": "Task 1 created"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Fetch unresolved PR review threads", "description": "Get all review threads", "activeForm": "Fetching review threads"}}]}}
+{"type": "tool_result", "content": "Task 2 created"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Process and address review comments", "description": "Fix each comment", "activeForm": "Processing comments"}}]}}
+{"type": "tool_result", "content": "Task 3 created"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Verify, commit, and push changes", "description": "Run lint, commit, push", "activeForm": "Verifying and pushing"}}]}}
+{"type": "tool_result", "content": "Task 4 created"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Provide summary to user", "description": "Report what was done", "activeForm": "Providing summary"}}]}}
+{"type": "tool_result", "content": "Task 5 created"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "1", "status": "in_progress"}}]}}
+{"type": "tool_result", "content": "Task 1 updated to in_progress"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "Bash", "input": {"command": "gh pr view --json number"}}]}}
+{"type": "tool_result", "content": "PR #123"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "1", "status": "completed"}}]}}
+{"type": "tool_result", "content": "Task 1 completed"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "2", "status": "in_progress"}}]}}
+{"type": "tool_result", "content": "Task 2 updated to in_progress"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "Bash", "input": {"command": "gh api graphql"}}]}}
+{"type": "tool_result", "content": "Found 2 review threads"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "2", "status": "completed"}}]}}
+{"type": "tool_result", "content": "Task 2 completed"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "3", "status": "in_progress"}}]}}
+{"type": "tool_result", "content": "Task 3 updated to in_progress"}
+{"type": "assistant", "message": {"content": [{"type": "text", "text": "I found 2 review comments to address."}, {"type": "tool_use", "name": "Edit", "input": {"file_path": "src/main.ts", "old_string": "foo", "new_string": "bar"}}]}}
+{"type": "tool_result", "content": "File edited"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "3", "status": "completed"}}]}}
+{"type": "tool_result", "content": "Task 3 completed"}
+{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "4", "status": "in_progress"}}]}}
+{"type": "tool_result", "content": "Task 4 updated to in_progress"}
+{"type": "assistant", "message": {"content": [{"type": "text", "text": "Now running lint checks..."}, {"type": "tool_use", "name": "Bash", "input": {"command": "npm run lint"}}]}}
+{"type": "tool_result", "content": "Lint passed"}
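
A minimal sketch of how remaining tasks might be derived from a transcript shaped like the fixtures above. The body of stop-hook.py is not shown in this diff, so `remaining_tasks` is a hypothetical helper, and it assumes task IDs are assigned sequentially from 1 in creation order (as the "Task N created" results in the fixtures suggest).

```python
import json


def remaining_tasks(lines):
    """Replay TaskCreate/TaskUpdate tool calls; return subjects of unfinished tasks."""
    tasks = {}      # task id (str) -> {"subject": ..., "status": ...}
    next_id = 1     # assumption: IDs are handed out sequentially from 1
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue                      # skip malformed transcript lines
        if entry.get("type") != "assistant":
            continue
        content = entry.get("message", {}).get("content", [])
        if not isinstance(content, list):
            continue
        for block in content:
            if not isinstance(block, dict) or block.get("type") != "tool_use":
                continue
            args = block.get("input", {})
            if block.get("name") == "TaskCreate":
                tasks[str(next_id)] = {"subject": args.get("subject", ""),
                                       "status": "pending"}
                next_id += 1
            elif block.get("name") == "TaskUpdate":
                task = tasks.get(str(args.get("taskId")))
                if task and "status" in args:
                    task["status"] = args["status"]
    return [t["subject"] for t in tasks.values() if t["status"] != "completed"]


# Shortened sample in the same shape as the fixture lines.
sample = [json.dumps(e) for e in (
    {"type": "assistant", "message": {"content": [
        {"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Determine the PR"}}]}},
    {"type": "tool_result", "content": "Task 1 created"},
    {"type": "assistant", "message": {"content": [
        {"type": "tool_use", "name": "TaskCreate", "input": {"subject": "Fetch review threads"}}]}},
    {"type": "assistant", "message": {"content": [
        {"type": "tool_use", "name": "TaskUpdate", "input": {"taskId": "1", "status": "completed"}}]}},
)]
print(remaining_tasks(sample))  # ['Fetch review threads']
```

Under the same assumptions, replaying incomplete_tasks.jsonl leaves task 4 (still in_progress) and task 5 (never updated) open, while completed_tasks.jsonl leaves none.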
--- a/.claude/hooks/tests/test_stop_hook.py
+++ b/.claude/hooks/tests/test_stop_hook.py
--- a/.claude/settings.json
+++ b/.claude/settings.json
@@ -154,9 +154,9 @@
      {
        "hooks": [
          {
-            "type": "prompt",
-            "prompt": "You are evaluating whether Claude should stop working. Your default bias should be to CONTINUE WORKING (ok: false). Only allow stopping when absolutely certain all work is complete.\n\nContext: $ARGUMENTS\n\nAnalyze the conversation and determine:\n\n1. TASK COMPLETION CHECK\n   - Has EVERY task the user requested been fully completed?\n   - Not just acknowledged, but actually done with verifiable output?\n   - Are there any implied subtasks that haven't been addressed?\n\n2. ERROR CHECK\n   - Did any errors occur that weren't resolved?\n   - Are there failed operations that need retry?\n   - Did Claude say it would do something but didn't actually do it?\n\n3. FOLLOW-UP CHECK\n   - Is there obvious next-step work that should be done?\n   - Did the user's request imply multiple steps where only some were completed?\n   - Would a thorough assistant do more here?\n\n4. QUALITY CHECK\n   - Was the work done to completion, not just partially?\n   - Are there placeholders, TODOs, or incomplete sections?\n   - Did Claude stop prematurely mid-task?\n\nDECISION RULES:\n- If ANY task is incomplete → continue (ok: false)\n- If ANY error is unresolved → continue (ok: false)\n- If there's reasonable follow-up work → continue (ok: false)\n- If Claude said \"I'll do X\" but didn't → continue (ok: false)\n- If work quality is partial/rushed → continue (ok: false)\n- ONLY return ok: true if ALL tasks are genuinely, fully complete AND no errors exist AND no reasonable follow-up remains\n\nRespond with JSON:\n- {\"ok\": false, \"reason\": \"specific explanation of what still needs to be done\"} — DEFAULT, use this unless everything is truly complete\n- {\"ok\": true} — ONLY when absolutely certain all work is finished and nothing remains",
-            "timeout": 30
+            "type": "command",
+            "command": "\"$CLAUDE_PROJECT_DIR\"/.claude/hooks/stop-hook.py",
+            "timeout": 30000
          }
        ]
      }
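
For orientation, here is a sketch of the decision step a command hook like the one registered above might perform. The `stop_hook_active` field and the `{"decision": "block", "reason": ...}` shape come from the PR description; the function names `decide` and `handle`, the reason wording, and the assumption that the hook input arrives as a JSON string (for example from stdin) are illustrative only.

```python
import json
from typing import List, Optional


def decide(hook_input: dict, remaining: List[str]) -> Optional[dict]:
    """Return a block decision, or None to let the session stop."""
    # Loop guard: if this stop was already triggered by a previous block,
    # do not block again.
    if hook_input.get("stop_hook_active"):
        return None
    if remaining:
        return {"decision": "block",
                "reason": "Tasks still open: " + "; ".join(remaining)}
    return None


def handle(raw_input: str, remaining: List[str]) -> str:
    """Parse the hook input JSON and return the text the hook would print."""
    decision = decide(json.loads(raw_input), remaining)
    return json.dumps(decision) if decision else ""


print(handle('{"stop_hook_active": false}', ["Provide summary to user"]))
```

Checking `stop_hook_active` before anything else is what keeps a blocked stop from triggering another block indefinitely; the Sonnet fallback described in the summary would slot in where `decide` currently returns None for an empty task list.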