    perf: reduce number of native git calls when extracting a codebase (#3105) · 0beeb501
    Committed by Ryan Groch
    Currently, `collectFiles` calls `isGitIgnored` on every (non-excluded)
    recursion step. Despite the caching, we frequently execute Git just to
    check whether a single file or directory is gitignored, so the number of
    Git invocations scales with the number of files in the user's app.
    
    This amounts to a substantial number of Git invocations. For smaller
    projects it could be dozens; for larger projects it could be thousands.
    It's particularly a problem for native Git, because each `exec` call
    comes with a lot of overhead even though Git itself is quite fast.
    
    Although I'm not 100% sure, I suspect that this was the underlying cause
    of both #2795 and #1642, because:
    1. Both mention Dyad freezing when dealing with larger projects, and
    this issue is far more noticeable for large projects.
    2. Both specifically mention that the freeze happens upon opening their
    project, which is when `collectFiles` runs.
    3. I was able to replicate the crash consistently on Windows 10 and
    inconsistently on Linux Mint by importing a large project into Dyad. I
    don't yet have a good automated test for this, though.
    
    The solution in this PR hands the responsibility of traversing the
    app's files to native Git instead of doing it manually. This means that
    we'll only have one Git invocation per call to the function formerly
    named `collectFiles`.
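
As a rough illustration of the one-invocation idea, here is a minimal sketch. It assumes `git ls-files` is the listing command, which the description above does not confirm. The names `listFilesWithNativeGit` and `parseLsFilesOutput` are hypothetical, and the `EXCLUDED_DIRS` set here contains only the two entries mentioned in this description.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical subset of EXCLUDED_DIRS; the real list lives in the codebase.
const EXCLUDED_DIRS = new Set(["node_modules", ".next"]);

// Split NUL-separated `git ls-files -z` output into repo-relative paths and
// drop any path that has an excluded directory among its parent segments.
function parseLsFilesOutput(stdout: string): string[] {
  return stdout
    .split("\0")
    .filter((p) => p.length > 0)
    .filter((p) => !p.split("/").slice(0, -1).some((dir) => EXCLUDED_DIRS.has(dir)));
}

// One Git process per call: tracked files plus untracked files that are not
// gitignored. This is an assumed command, not necessarily the PR's.
async function listFilesWithNativeGit(appPath: string): Promise<string[]> {
  const { stdout } = await execFileAsync(
    "git",
    ["ls-files", "--cached", "--others", "--exclude-standard", "-z"],
    { cwd: appPath, maxBuffer: 64 * 1024 * 1024 },
  );
  return parseLsFilesOutput(stdout);
}
```

Note that `--cached` also reports tracked files that match a gitignore pattern or have been deleted from the working tree, so a real implementation would likely need extra filtering to match the original `collectFiles` output.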
    
    I've also done my best to keep the output of `collectFilesNativeGit` as
    close as possible to the original `collectFiles`. The ordering of the
    files will be different, but I don't think that should make a difference
    given that we later sort them anyway.
    
    Some alternatives I've thought of if we decide we want to keep the
    current traversal logic:
    - Run `git check-ignore` on batches of files (e.g. each result of
    `fsAsync.readdir`) rather than one at a time. This would still result in
    multiple Git calls, though.
    - Run `git check-ignore` on all of the files at once at the end of
    `collectFiles`. We wouldn't be able to prune gitignored directories in
    our traversal, but at least we'd still avoid the directories in
    `EXCLUDED_DIRS`, such as `node_modules` and `.next`.
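
The batching alternative could look roughly like the sketch below, which feeds a whole batch of paths to a single `git check-ignore --stdin -z` process. The function names `checkIgnoredBatch` and `partitionByIgnored` are hypothetical and are not part of the PR.

```typescript
import { spawn } from "node:child_process";

// Split a batch into kept and dropped paths, given the set Git reported as ignored.
function partitionByIgnored(paths: string[], ignored: Set<string>) {
  return {
    kept: paths.filter((p) => !ignored.has(p)),
    dropped: paths.filter((p) => ignored.has(p)),
  };
}

// Ask Git about a whole batch of paths using one process instead of one per path.
function checkIgnoredBatch(appPath: string, paths: string[]): Promise<Set<string>> {
  return new Promise((resolve, reject) => {
    const child = spawn("git", ["check-ignore", "--stdin", "-z"], { cwd: appPath });
    let stdout = "";
    child.stdout.setEncoding("utf8");
    child.stdout.on("data", (chunk: string) => {
      stdout += chunk;
    });
    child.on("error", reject);
    child.stdin.on("error", reject);
    child.on("close", (code) => {
      // Exit status 0: at least one path is ignored; 1: none are; other: failure.
      if (code !== 0 && code !== 1) {
        reject(new Error(`git check-ignore exited with code ${code}`));
        return;
      }
      resolve(new Set(stdout.split("\0").filter((p) => p.length > 0)));
    });
    // With -z, input paths are NUL-terminated rather than newline-separated.
    child.stdin.end(paths.map((p) => p + "\0").join(""));
  });
}
```

A caller might pass each `fsAsync.readdir` result through `checkIgnoredBatch` and then `partitionByIgnored`, recursing only into the kept directories. That still costs one Git process per directory visited, which is the drawback noted in the first bullet.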
    
    I've left the logic of `collectFiles` untouched for isomorphic-git for
    now. There might be a good way to optimize that as well, but it will
    likely be a bit different because isomorphic-git has different
    capabilities than native Git.
    
    ---------
    Co-authored-by: Claude <noreply@anthropic.com>
Name
.agents
.claude
.cursor/rules
.devcontainer
.github
.husky
.storybook
assets
docs
drizzle
e2e-tests
makers
packages/@dyad-sh
plans
rules
scaffold
scripts
shared
src
testing
tools
worker
workers/tsc
.cursorignore
.env.example
.eslintrc.json
.gitattributes
.gitignore
.npmrc
.oxfmtrc.json
.oxlintrc.json
.prettierignore
.prettierrc
AGENTS.md
CLA.md
CLAUDE.md
CONTRIBUTING.md
LICENSE
README.md
SECURITY.md
biome.json
components.json
drizzle.config.ts
forge.config.ts
forge.env.d.ts
index.html
lint-staged.config.js
merge.config.ts
package-lock.json
package.json
playwright.config.ts
tsconfig.app.json
tsconfig.json
tsconfig.node.json
vite.main.config.mts
vite.preload.config.mts
vite.renderer.config.mts
vite.worker.config.mts
vitest.config.ts
windowsSign.ts