    Replace deflake-e2e-recent-prs with deflake-e2e-recent-commits (#2607) · 50a72da9
    Committed by Will Chen
    ## Summary
    - Replaced `deflake-e2e-recent-prs` command with
    `deflake-e2e-recent-commits` that scans CI workflow runs on main instead
    of PR comments
    - Downloads the `html-report` artifact (`results.json`) from each CI run
    to extract flaky test data, which works for push events that don't post
    PR comments
    - Updated `claude-deflake-e2e.yml` workflow to use the new command
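
The flaky-test extraction described above can be sketched as follows. This is a minimal illustration, not the command's actual script. It assumes `results.json` follows the Playwright JSON reporter layout (nested `suites`, each with `specs`, `tests`, and per-attempt `results`), and treats a test as flaky when its final attempt passed after an earlier `failed`, `timedOut`, or `interrupted` attempt. The `Pw*` type names and both function names are hypothetical.

```typescript
// Assumed subset of the Playwright JSON reporter shape; verify against a real results.json.
interface PwResult { status: string }
interface PwTest { results: PwResult[] }
interface PwSpec { title: string; tests: PwTest[] }
interface PwSuite { title: string; file?: string; specs?: PwSpec[]; suites?: PwSuite[] }
interface PwReport { suites: PwSuite[] }

const FAILED_STATUSES = new Set(["failed", "timedOut", "interrupted"]);

// A test is flaky when its final attempt passed after at least one failed attempt.
function isFlakyTest(test: PwTest): boolean {
  const results = test.results;
  const last = results[results.length - 1];
  if (!last || last.status !== "passed") return false;
  return results.slice(0, -1).some((r) => FAILED_STATUSES.has(r.status));
}

// Walk nested suites and collect "<spec file> > Suite > Test" titles for flaky tests.
function findFlakyTitles(report: PwReport): string[] {
  const titles: string[] = [];
  const visit = (suite: PwSuite, path: string[]): void => {
    for (const spec of suite.specs ?? []) {
      if (spec.tests.some(isFlakyTest)) {
        titles.push([...path, spec.title].join(" > "));
      }
    }
    for (const child of suite.suites ?? []) {
      visit(child, [...path, child.title]);
    }
  };
  for (const root of report.suites) {
    // Top-level suites are assumed to represent spec files.
    visit(root, [root.file ?? root.title]);
  }
  return titles;
}
```

A test that passes on its first attempt, or that never passes, is not reported. Only retry-passed tests count as flakes.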
    
    ## Test plan
    - [ ] Trigger the `Claude Deflake E2E` workflow manually and verify it
    correctly scans recent main branch CI runs
    - [ ] Verify flaky tests are correctly parsed from `results.json`
    artifacts
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    <!-- CURSOR_SUMMARY -->
    ---
    
    > [!NOTE]
    > **Low Risk**
    > Low risk doc/workflow tweak that changes how the deflake automation
    sources flaky tests (GitHub Actions runs/artifacts) but does not touch
    production code or test logic.
    > 
    > **Overview**
    > Updates the deflake automation to **scan recent `main` CI workflow
    runs** (push events) instead of PR Playwright summary comments, by
    downloading each run’s `html-report` artifact and parsing `results.json`
    to detect retry-passed tests with prior failures/timeouts.
    > 
    > Adjusts the scheduled `Claude Deflake E2E` workflow to accept
    `commit_count`, grant `actions: read`, and invoke
    `/dyad:deflake-e2e-recent-commits` rather than the old PR-based command.
    > 
    > <sup>Written by [Cursor
    Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
    0da1e67da43e509577d5b8dc1f155779742d1529. This will update automatically
    on new commits. Configure
    [here](https://cursor.com/dashboard?tab=bugbot).</sup>
    <!-- /CURSOR_SUMMARY -->
    
    <!-- This is an auto-generated description by cubic. -->
    ---
    ## Summary by cubic
    Switched the deflake command to scan recent `main` CI runs and parse the
    `results.json` in each run's `html-report` artifact to find flaky E2E
    tests. Updated the Claude Deflake E2E workflow to accept `commit_count`
    and added the `actions: read` permission.
    
    - **Refactors**
      - List completed main push runs via gh api, fetch 3x commit_count, and
      filter to success/failure.
      - Download non-expired html-report artifacts; parse results.json with a
      Node.js script to detect flakes (final passed after
      fail/timedOut/interrupted).
      - Build "<spec_path.spec.ts> > Suite > Test" titles; group and rank by
      frequency; clean up artifacts.
      - Skip runs without artifacts; note 3-day artifact retention.
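
The run filtering and frequency ranking in the list above might look roughly like this. It is a sketch under assumptions. The `WorkflowRun` fields mirror a subset of the GitHub REST API workflow run object. Capping the filtered list at `commitCount` is an assumption about why 3x as many runs are fetched. Both function names are hypothetical.

```typescript
// Assumed subset of a workflow run object from the GitHub REST API.
interface WorkflowRun {
  id: number;
  event: string;
  status: string;
  conclusion: string | null;
}

// Keep completed push runs that ended in success or failure, capped at commitCount.
// The caller is assumed to have fetched up to 3 * commitCount runs so that
// cancelled or skipped runs can be dropped without falling short.
function selectRuns(runs: WorkflowRun[], commitCount: number): WorkflowRun[] {
  return runs
    .filter((r) => r.event === "push" && r.status === "completed")
    .filter((r) => r.conclusion === "success" || r.conclusion === "failure")
    .slice(0, commitCount);
}

// Count how often each flaky title appears across runs, most frequent first.
// Ties are broken alphabetically so the ordering is deterministic.
function rankByFrequency(titles: string[]): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const title of titles) {
    counts.set(title, (counts.get(title) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0]),
  );
}
```

Feeding the concatenated flaky titles from every selected run into `rankByFrequency` yields a list in which tests that flake in the most runs come first.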
    
    - **Bug Fixes**
      - Updated command doc to reference the TodoWrite tool.
    
    <sup>Written for commit 0da1e67da43e509577d5b8dc1f155779742d1529.
    Summary will update on new commits.</sup>
    
    <!-- End of auto-generated description by cubic. -->
    
    ---------
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
deflake-e2e-recent-commits.md 6.1 KB