Improve test failure reporting with snapshot detection (#2484)

## Summary

Enhanced the Playwright test summary generation to better handle snapshot failures and provide more actionable test commands in PR comments.

## Key Changes

- **Added snapshot failure detection**: New `isSnapshotFailure()` function that identifies snapshot-related test failures by checking error messages for common snapshot comparison keywords
- **Simplified test command output**: Removed the separate `generateCommands()` function and consolidated to a single command per test
- **Smart command generation**: Automatically appends the `--update-snapshots` flag only for snapshot failures, reducing confusion about when to use it
- **Improved readability**:
  - Reformatted test commands into a single code block with better organization
  - Added error preview (first 120 chars) to help developers understand failure context
  - Added `export PLAYWRIGHT_HTML_OPEN=never` to prevent browser windows from opening during test runs
  - Changed section description from "run or update snapshots" to "re-run failing tests locally" for clarity

## Implementation Details

- The snapshot detection checks for 10 different snapshot-related error message patterns (case-insensitive)
- Test commands are now grouped in a single bash code block for easier copy-pasting
- Error messages are truncated to 120 characters to keep the comment concise while still providing useful context

https://claude.ai/code/session_014DzYzvnt4559rZEGLwbVm5

---

## Summary by cubic

Improved Playwright PR test failure reporting by detecting snapshot mismatches and generating clearer, single-run commands. Makes it faster and safer to re-run failures locally and reduces confusion around snapshot updates.

- **New Features**
  - Detect snapshot failures from error text and auto-append `--update-snapshots` only when needed.
  - Consolidate all failing test commands into one bash block with `PLAYWRIGHT_HTML_OPEN=never` set once.
  - Add a short error preview (first 120 chars) as a comment above each command.
  - Sanitize spec paths and escape test names/error previews to prevent injection.

<sup>Written for commit 0515c892605dc303c60d1dbae4a2c0cd3a537481.</sup>

---

> [!NOTE]
> **Low Risk**
> Low risk: changes only affect CI-generated PR comments and local rerun command formatting, with added input sanitization reducing injection risk.
>
> **Overview**
> Improves the Playwright PR comment generator to **detect snapshot-related failures** and tailor rerun commands accordingly.
>
> The macOS "Test Commands" section is reformatted into a single copy-pasteable bash block (including `PLAYWRIGHT_HTML_OPEN=never`), adds a short error preview per test, and appends `--update-snapshots` *only* when a failure looks snapshot-related. Command generation now also sanitizes the spec path and removes the separate run/update command helper.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 0515c892605dc303c60d1dbae4a2c0cd3a537481.</sup>

---------

Co-authored-by: Claude <noreply@anthropic.com>
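
The behavior described above can be sketched as a small standalone Node.js script. This is an illustration under stated assumptions, not the script from this PR: `isSnapshotFailure` mirrors the function in the diff but with an abbreviated pattern list, while `SNAPSHOT_PATTERNS`, `buildRerunCommand`, `previewError`, and the sample failure data are hypothetical names and values introduced only for this example. The guard against a missing error in `previewError` is also an assumption that the diff does not include.

```javascript
// Abbreviated pattern list for illustration; the script in the diff checks 10 patterns.
const SNAPSHOT_PATTERNS = [
  "screenshot comparison failed",
  "tomatchsnapshot",
  "tohavescreenshot",
  "ratio of different pixels",
];

// Case-insensitive substring check, as described in the PR summary.
function isSnapshotFailure(errorMessage) {
  if (!errorMessage) return false;
  const lower = errorMessage.toLowerCase();
  return SNAPSHOT_PATTERNS.some((pattern) => lower.includes(pattern));
}

// Hypothetical helper: builds one rerun command from an already-parsed
// spec file and test name, appending --update-snapshots only when needed.
function buildRerunCommand(specFile, testName, error) {
  // Keep only characters that are safe in a relative path.
  const safeSpecFile = specFile.replace(/[^a-zA-Z0-9._\-/]/g, "");
  // Escape regex metacharacters so -g matches the title literally.
  const escapedTestName = testName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const base = `npm run e2e e2e-tests/${safeSpecFile} -- -g "${escapedTestName}"`;
  return isSnapshotFailure(error) ? `${base} --update-snapshots` : base;
}

// Hypothetical helper: truncates to `limit` characters, then flattens
// newlines and backticks so the preview fits on one bash comment line.
function previewError(error, limit = 120) {
  const text = error || ""; // assumption: treat a missing error as empty
  const cut = text.length > limit ? text.substring(0, limit) + "..." : text;
  return cut.replace(/\n/g, " ").replace(/`/g, "'");
}

// Made-up sample failures, not taken from a real report.
const failures = [
  {
    specFile: "chat.spec.ts",
    testName: "renders chat (dark)",
    error: "Error: expect(page).toHaveScreenshot(expected)\n  ratio of different pixels",
  },
  {
    specFile: "setup.spec.ts",
    testName: "opens settings",
    error: "TimeoutError: locator.click: Timeout 30000ms exceeded.",
  },
];

// Assemble the body of the single bash block, with the export set once.
const lines = ["export PLAYWRIGHT_HTML_OPEN=never"];
for (const f of failures) {
  lines.push("", `# ${f.testName}`, `# ${previewError(f.error)}`);
  lines.push(buildRerunCommand(f.specFile, f.testName, f.error));
}
console.log(lines.join("\n"));
```

Running it with `node` prints one `export` line followed by a commented command per failure; only the first sample command ends in `--update-snapshots`, because only its error text matches a snapshot pattern.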

ea3b676f · Will Chen · GitHub · 98c169b9 · ea3b676f
--- a/scripts/generate-playwright-summary.js
+++ b/scripts/generate-playwright-summary.js
@@ -70,21 +70,32 @@ function parseTestTitle(fullTitle) {
  return { specFile, testName };
 }
+// Detect if a test failure is due to a snapshot mismatch
+function isSnapshotFailure(errorMessage) {
+  if (!errorMessage) return false;
+  const lower = errorMessage.toLowerCase();
+  return [
+    "screenshot comparison failed",
+    "snapshot comparison failed",
+    "expected to match snapshot",
+    "tomatchsnapshot",
+    "tohavescreenshot",
+    "screenshots are different",
+    "snapshots don't match",
+    "snapshot mismatch",
+    "snapshot",
+    "ratio of different pixels",
+  ].some((pattern) => lower.includes(pattern));
+}
 // Generate copy-paste command for running a specific test
 function generateTestCommand(fullTitle) {
  const { specFile, testName } = parseTestTitle(fullTitle);
+  // Sanitize specFile to only allow safe path characters
+  const safeSpecFile = specFile.replace(/[^a-zA-Z0-9._\-/]/g, "");
  // Escape special characters in testName for the grep pattern
  const escapedTestName = testName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
-  return `npm run e2e e2e-tests/${specFile} -- -g "${escapedTestName}"`;
+  return `npm run e2e e2e-tests/${safeSpecFile} -- -g "${escapedTestName}"`;
-}
-// Generate both run and update commands for a test
-function generateCommands(fullTitle) {
-  const testCmd = generateTestCommand(fullTitle);
-  return {
-    run: testCmd,
-    update: `${testCmd} --update-snapshots`,
-  };
 }
 function detectOperatingSystemsFromReport(report) {
@@ -379,21 +390,28 @@ async function run({ github, context, core }) {
    const macOsFailures = resultsByOs["macOS"]?.failures || [];
    if (macOsFailures.length > 0) {
      comment += "### 📋 Test Commands (macOS)\n\n";
-      comment +=
-        "Copy and paste these commands to run or update snapshots for failed tests:\n\n";
+      comment += "Copy and paste to re-run failing tests locally:\n\n";
      if (macOsFailures.length > 5) {
        comment += `<details>\n<summary>Show all ${macOsFailures.length} test commands</summary>\n\n`;
      }
+      comment += "```bash\n";
+      comment += "export PLAYWRIGHT_HTML_OPEN=never\n";
      for (const f of macOsFailures) {
-        const cmds = generateCommands(f.title);
-        comment += `**\`${f.title}\`**\n`;
-        comment += "```bash\n";
-        comment += `# Run test\n${cmds.run}\n\n`;
-        comment += `# Update snapshots\n${cmds.update}\n`;
-        comment += "```\n\n";
+        const cmd = generateTestCommand(f.title);
+        const snapshot = isSnapshotFailure(f.error);
+        const errorPreview =
+          f.error.length > 120 ? f.error.substring(0, 120) + "..." : f.error;
+        comment += `\n# ${f.title.replace(/\n/g, " ").replace(/`/g, "'")}\n`;
+        comment += `# Expected: ${errorPreview.replace(/\n/g, " ").replace(/`/g, "'")}\n`;
+        if (snapshot) {
+          comment += `${cmd} --update-snapshots\n`;
+        } else {
+          comment += `${cmd}\n`;
+        }
      }
+      comment += "```\n\n";
      if (macOsFailures.length > 5) {
        comment += "</details>\n";