Unverified commit 3deb70b1, authored by wwwillchen-bot, committed by GitHub

fix: limit grep results to prevent context_length_exceeded errors (#2510)

## Summary

- Add a configurable `limit` parameter to the grep tool (default: 100, max: 250) to prevent context window overflow
- Truncate individual line text to 500 characters to handle very long lines
- Add a clear truncation notice telling the AI to narrow its search when results are truncated
- Add `total` and `truncated` attributes to the XML output for visibility

## Test plan

- [x] Build passes (`npm run build`)
- [x] Lint passes (`npm run lint`)
- [x] All 669 unit tests pass (`npm test`)
- Manual testing: grep with a broad query like "import" should now return limited results with a truncation notice instead of overflowing the context window

Fixes #2509

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

## Summary by cubic

Limits grep results and line lengths to prevent context window overflows and `context_length_exceeded` errors. Addresses Linear #2509 with a configurable cap, consistent sorting, clearer UI output, and ignoring `include_pattern` "*" when it matches all files.

- **Bug Fixes**
  - Added a `limit` parameter (default 100, max 250) to cap matches.
  - Truncated each matched line to 500 characters.
  - Sorted results by path and line number, and ignored `include_pattern` "*" with a note to avoid broad searches.
  - Added a truncation notice and "X of Y matches" in the UI.
  - Exposed `total` and `truncated` flags in the XML output for visibility.

<sup>Written for commit ad5979b9352ffc754019e13ead9c6be2e7b24ce9. Summary will update on new commits.</sup>

---------

Co-authored-by: Will Chen <willchen90@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
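
For readers skimming the description, here is a minimal standalone sketch of the capping behaviour it describes: clamp the requested limit, sort by path and line number, truncate long lines, and append a notice when matches were dropped. The constant names and the `path` / `lineNumber` / `lineText` fields mirror the diff; `GrepMatch` and `formatMatches` are hypothetical names used only for this illustration, and the notice text is shortened.

```typescript
// Hypothetical match shape; field names follow those used in the diff.
interface GrepMatch {
  path: string;
  lineNumber: number;
  lineText: string;
}

const DEFAULT_LIMIT = 100;
const MAX_LIMIT = 250;
const MAX_LINE_LENGTH = 500;

// Cut very long lines so a single match cannot flood the context window.
function truncateLineText(text: string): string {
  return text.length <= MAX_LINE_LENGTH
    ? text
    : text.slice(0, MAX_LINE_LENGTH) + "...";
}

// Hypothetical helper: format at most `limit` matches and flag truncation.
function formatMatches(allMatches: GrepMatch[], requestedLimit?: number): string {
  // Clamp the caller's limit to the hard maximum.
  const limit = Math.min(requestedLimit ?? DEFAULT_LIMIT, MAX_LIMIT);

  // Sort a copy so the output is deterministic regardless of input order.
  const sorted = [...allMatches].sort(
    (a, b) => a.path.localeCompare(b.path) || a.lineNumber - b.lineNumber,
  );
  const shown = sorted.slice(0, limit);

  let text = shown
    .map((m) => `${m.path}:${m.lineNumber}: ${truncateLineText(m.lineText)}`)
    .join("\n");

  // Tell the consumer that some matches were left out.
  if (allMatches.length > limit) {
    text += `\n\n[TRUNCATED: Showing ${shown.length} of ${allMatches.length} matches.]`;
  }
  return text;
}
```

Sorting before slicing means the same query always keeps the same subset, which is what makes the snapshot tests in this commit stable.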
Parent aa71f805
@@ -96,7 +96,7 @@
       "type": "function",
       "function": {
         "name": "grep",
-        "description": "Search for a regex pattern in the codebase using ripgrep.\n\n- Returns matching lines with file paths and line numbers\n- By default, the search is case-insensitive\n- Use include_pattern to filter by file type (e.g. '*.tsx')\n- Use exclude_pattern to skip certain files (e.g. '*.test.ts')",
+        "description": "Search for a regex pattern in the codebase using ripgrep.\n\n- Returns matching lines with file paths and line numbers\n- By default, the search is case-insensitive\n- Use include_pattern to filter by file type (e.g. '*.tsx')\n- Use exclude_pattern to skip certain files (e.g. '*.test.ts')\n- Results are limited to 100 matches by default (max 250). If results are truncated, narrow your search with include_pattern or a more specific query.",
         "parameters": {
           "$schema": "http://json-schema.org/draft-07/schema#",
           "type": "object",
@@ -116,6 +116,12 @@
             "case_sensitive": {
               "description": "Whether the search should be case sensitive (default: false)",
               "type": "boolean"
+            },
+            "limit": {
+              "description": "Maximum number of matches to return (default: 100, max: 250). Use include_pattern to narrow results if limit is reached.",
+              "type": "number",
+              "minimum": 1,
+              "maximum": 250
             }
           },
           "required": [
...
@@ -232,7 +232,7 @@
     {
       "type": "function",
       "name": "grep",
-      "description": "Search for a regex pattern in the codebase using ripgrep.\n\n- Returns matching lines with file paths and line numbers\n- By default, the search is case-insensitive\n- Use include_pattern to filter by file type (e.g. '*.tsx')\n- Use exclude_pattern to skip certain files (e.g. '*.test.ts')",
+      "description": "Search for a regex pattern in the codebase using ripgrep.\n\n- Returns matching lines with file paths and line numbers\n- By default, the search is case-insensitive\n- Use include_pattern to filter by file type (e.g. '*.tsx')\n- Use exclude_pattern to skip certain files (e.g. '*.test.ts')\n- Results are limited to 100 matches by default (max 250). If results are truncated, narrow your search with include_pattern or a more specific query.",
       "parameters": {
         "$schema": "http://json-schema.org/draft-07/schema#",
         "type": "object",
@@ -252,6 +252,12 @@
           "case_sensitive": {
             "description": "Whether the search should be case sensitive (default: false)",
             "type": "boolean"
+          },
+          "limit": {
+            "description": "Maximum number of matches to return (default: 100, max: 250). Use include_pattern to narrow results if limit is reached.",
+            "type": "number",
+            "minimum": 1,
+            "maximum": 250
           }
         },
         "required": [
...
@@ -231,7 +231,7 @@
       "type": "function",
       "function": {
         "name": "grep",
-        "description": "Search for a regex pattern in the codebase using ripgrep.\n\n- Returns matching lines with file paths and line numbers\n- By default, the search is case-insensitive\n- Use include_pattern to filter by file type (e.g. '*.tsx')\n- Use exclude_pattern to skip certain files (e.g. '*.test.ts')",
+        "description": "Search for a regex pattern in the codebase using ripgrep.\n\n- Returns matching lines with file paths and line numbers\n- By default, the search is case-insensitive\n- Use include_pattern to filter by file type (e.g. '*.tsx')\n- Use exclude_pattern to skip certain files (e.g. '*.test.ts')\n- Results are limited to 100 matches by default (max 250). If results are truncated, narrow your search with include_pattern or a more specific query.",
         "parameters": {
           "$schema": "http://json-schema.org/draft-07/schema#",
           "type": "object",
@@ -251,6 +251,12 @@
             "case_sensitive": {
               "description": "Whether the search should be case sensitive (default: false)",
               "type": "boolean"
+            },
+            "limit": {
+              "description": "Maximum number of matches to return (default: 100, max: 250). Use include_pattern to narrow results if limit is reached.",
+              "type": "number",
+              "minimum": 1,
+              "maximum": 250
             }
           },
           "required": [
...
@@ -34,7 +34,7 @@
 - button "Copy":
   - img
 - text: log
-- code: "src/main.tsx:2: import App from \"./App.tsx\"; src/main.tsx:4: createRoot(document.getElementById(\"root\")!).render(<App />); src/App.tsx:1: const App = () => <div>Minimal imported app</div>; src/App.tsx:3: export default App;"
+- code: "src/App.tsx:1: const App = () => <div>Minimal imported app</div>; src/App.tsx:3: export default App; src/main.tsx:2: import App from \"./App.tsx\"; src/main.tsx:4: createRoot(document.getElementById(\"root\")!).render(<App />);"
 - paragraph: I found the matches! The React app is initialized in src/main.tsx using createRoot, and the App component is defined in src/App.tsx and imported in src/main.tsx.
 - button "Copy":
   - img
...
@@ -5,4 +5,4 @@
 - button "Copy":
   - img
 - text: log
-- code: "src/main.tsx:2: import App from \"./App.tsx\"; src/main.tsx:4: createRoot(document.getElementById(\"root\")!).render(<App />); src/App.tsx:1: const App = () => <div>Minimal imported app</div>; src/App.tsx:3: export default App;"
\ No newline at end of file
+- code: "src/App.tsx:1: const App = () => <div>Minimal imported app</div>; src/App.tsx:3: export default App; src/main.tsx:2: import App from \"./App.tsx\"; src/main.tsx:4: createRoot(document.getElementById(\"root\")!).render(<App />);"
\ No newline at end of file
@@ -21,6 +21,8 @@ interface DyadGrepProps {
       exclude?: string;
       "case-sensitive"?: string;
       count?: string;
+      total?: string;
+      truncated?: string;
     };
   };
 }
@@ -39,6 +41,8 @@ export const DyadGrep: React.FC<DyadGrepProps> = ({ children, node }) => {
   const excludePattern = node?.properties?.exclude || "";
   const caseSensitive = node?.properties?.["case-sensitive"] === "true";
   const count = node?.properties?.count || "";
+  const total = node?.properties?.total || "";
+  const truncated = node?.properties?.truncated === "true";
   const hasResults = count !== "" && count !== "0";
 
   // Build description
@@ -55,7 +59,9 @@ export const DyadGrep: React.FC<DyadGrepProps> = ({ children, node }) => {
 
   // Build result summary
   const resultSummary = count
-    ? `${count} match${count === "1" ? "" : "es"}`
+    ? truncated && total
+      ? `${count} of ${total} matches`
+      : `${count} match${count === "1" ? "" : "es"}`
     : "";
 
   // Dynamic border styling
...
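
As a reading aid for the nested ternary in the hunk above, the same decision can be written as a small pure function. This is only a sketch: `buildResultSummary` is a hypothetical name that does not appear in the commit, and it assumes (as the component does) that `count` and `total` arrive as attribute strings.

```typescript
// Hypothetical helper mirroring the summary logic in the DyadGrep hunk.
// `count` and `total` are attribute strings; "" means "not provided".
function buildResultSummary(
  count: string,
  total: string,
  truncated: boolean,
): string {
  if (!count) {
    return ""; // no count attribute, so nothing to show
  }
  if (truncated && total) {
    return `${count} of ${total} matches`; // capped result set
  }
  // Full result set: singular only when exactly one match.
  return `${count} match${count === "1" ? "" : "es"}`;
}

console.log(buildResultSummary("100", "342", true));
console.log(buildResultSummary("1", "", false));
```

Note that `total` is only emitted when results were actually cut off, so the "X of Y" form never appears for a complete result set.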
@@ -523,6 +523,8 @@ function renderCustomTag(
               exclude: attributes.exclude || "",
               "case-sensitive": attributes["case-sensitive"] || "",
               count: attributes.count || "",
+              total: attributes.total || "",
+              truncated: attributes.truncated || "",
             },
           }}
         >
...
@@ -15,6 +15,10 @@ import log from "electron-log";
 
 const logger = log.scope("grep");
 
+const DEFAULT_LIMIT = 100;
+const MAX_LIMIT = 250;
+const MAX_LINE_LENGTH = 500;
+
 const grepSchema = z.object({
   query: z.string().describe("The regex pattern to search for"),
   include_pattern: z
@@ -31,6 +35,14 @@ const grepSchema = z.object({
     .boolean()
     .optional()
     .describe("Whether the search should be case sensitive (default: false)"),
+  limit: z
+    .number()
+    .min(1)
+    .max(MAX_LIMIT)
+    .optional()
+    .describe(
+      `Maximum number of matches to return (default: ${DEFAULT_LIMIT}, max: ${MAX_LIMIT}). Use include_pattern to narrow results if limit is reached.`,
+    ),
 });
 
 interface RipgrepMatch {
@@ -42,6 +54,7 @@ interface RipgrepMatch {
 function buildGrepAttributes(
   args: Partial<z.infer<typeof grepSchema>>,
   count?: number,
+  totalCount?: number,
 ): string {
   const attrs: string[] = [];
   if (args.query) {
@@ -59,9 +72,20 @@ function buildGrepAttributes(
   if (count !== undefined) {
     attrs.push(`count="${count}"`);
   }
+  if (totalCount !== undefined && totalCount > (count ?? 0)) {
+    attrs.push(`total="${totalCount}"`);
+    attrs.push(`truncated="true"`);
+  }
   return attrs.join(" ");
 }
 
+function truncateLineText(text: string): string {
+  if (text.length <= MAX_LINE_LENGTH) {
+    return text;
+  }
+  return text.slice(0, MAX_LINE_LENGTH) + "...";
+}
+
 async function runRipgrep({
   appPath,
   query,
@@ -82,7 +106,6 @@ async function runRipgrep({
     "--no-config",
     "--max-filesize",
     `${MAX_FILE_SEARCH_SIZE}`,
-    ...RIPGREP_EXCLUDED_GLOBS.flatMap((glob) => ["--glob", glob]),
   ];
 
   // Case sensitivity: default is case-insensitive
@@ -90,8 +113,9 @@ async function runRipgrep({
     args.push("--ignore-case");
   }
 
-  // Include pattern
-  if (includePat) {
+  // Include pattern (skip no-op "*" which would override exclusion globs
+  // and .gitignore rules since --glob always takes precedence over ignore logic)
+  if (includePat && includePat !== "*") {
     args.push("--glob", includePat);
   }
 
@@ -100,6 +124,10 @@ async function runRipgrep({
     args.push("--glob", `!${excludePat}`);
   }
 
+  // Exclusion globs come LAST so they always take precedence over any
+  // include pattern (later --glob flags override earlier ones in ripgrep)
+  args.push(...RIPGREP_EXCLUDED_GLOBS.flatMap((glob) => ["--glob", glob]));
+
   args.push("--", query, ".");
 
   const rg = spawn(getRgExecutablePath(), args, { cwd: appPath });
@@ -168,7 +196,8 @@ export const grepTool: ToolDefinition<z.infer<typeof grepSchema>> = {
 
 - Returns matching lines with file paths and line numbers
 - By default, the search is case-insensitive
 - Use include_pattern to filter by file type (e.g. '*.tsx')
-- Use exclude_pattern to skip certain files (e.g. '*.test.ts')`,
+- Use exclude_pattern to skip certain files (e.g. '*.test.ts')
+- Results are limited to ${DEFAULT_LIMIT} matches by default (max ${MAX_LIMIT}). If results are truncated, narrow your search with include_pattern or a more specific query.`,
   inputSchema: grepSchema,
   defaultConsent: "always",
@@ -192,7 +221,9 @@ export const grepTool: ToolDefinition<z.infer<typeof grepSchema>> = {
   },
 
   execute: async (args, ctx: AgentContext) => {
-    const matches = await runRipgrep({
+    const includePatWasWildcard = args.include_pattern === "*";
+
+    const allMatches = await runRipgrep({
       appPath: ctx.appPath,
       query: args.query,
       includePat: args.include_pattern,
@@ -200,18 +231,37 @@ export const grepTool: ToolDefinition<z.infer<typeof grepSchema>> = {
       caseSensitive: args.case_sensitive,
     });
 
-    const attrs = buildGrepAttributes(args, matches.length);
+    const totalCount = allMatches.length;
+    const limit = Math.min(args.limit ?? DEFAULT_LIMIT, MAX_LIMIT);
+    // Sort for deterministic output (ripgrep's parallel execution can produce varying order)
+    const sortedMatches = [...allMatches].sort(
+      (a, b) => a.path.localeCompare(b.path) || a.lineNumber - b.lineNumber,
+    );
+    const matches = sortedMatches.slice(0, limit);
+    const wasTruncated = totalCount > limit;
+
+    const attrs = buildGrepAttributes(args, matches.length, totalCount);
 
     if (matches.length === 0) {
       ctx.onXmlComplete(`<dyad-grep ${attrs}>No matches found.</dyad-grep>`);
       return "No matches found.";
     }
 
-    // Format output: path:line: content
+    // Format output: path:line: content (with truncated line text)
     const lines = matches.map(
-      (m) => `${m.path}:${m.lineNumber}: ${m.lineText}`,
+      (m) => `${m.path}:${m.lineNumber}: ${truncateLineText(m.lineText)}`,
     );
-    const resultText = lines.join("\n");
+    let resultText = lines.join("\n");
+
+    // Add truncation notice for the AI
+    if (wasTruncated) {
+      resultText += `\n\n[TRUNCATED: Showing ${matches.length} of ${totalCount} matches. Use include_pattern to narrow your search (e.g., include_pattern="*.tsx") or use a more specific query.]`;
+    }
+
+    // Warn the LLM that "*" was ignored so it doesn't retry with the same pattern
+    if (includePatWasWildcard) {
+      resultText += `\n\n[NOTE: include_pattern="*" was ignored because it matches all files including git-ignored files! Omit include_pattern to search all files, or use a specific glob like "*.ts".]`;
+    }
 
     ctx.onXmlComplete(
       `<dyad-grep ${attrs}>\n${escapeXmlContent(resultText)}\n</dyad-grep>`,
...
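
The ordering change to the `--glob` flags is the least obvious part of this commit, so here is a minimal sketch of how the argument list ends up being assembled. It relies on ripgrep's documented rule that when several globs match a path, the one given later on the command line wins. `buildGlobArgs` is a hypothetical helper and `EXCLUDED_GLOBS` holds illustrative placeholder values, not the project's real `RIPGREP_EXCLUDED_GLOBS`.

```typescript
// Placeholder values for illustration only; the real list lives in
// RIPGREP_EXCLUDED_GLOBS and is not shown in this diff.
const EXCLUDED_GLOBS: string[] = ["!node_modules/**", "!.git/**"];

// Hypothetical helper: assemble only the --glob portion of the rg arguments.
function buildGlobArgs(includePat?: string, excludePat?: string): string[] {
  const args: string[] = [];

  // A bare "*" is a no-op filter that would still override ignore rules,
  // so it is skipped entirely.
  if (includePat && includePat !== "*") {
    args.push("--glob", includePat);
  }

  // A user-supplied exclusion becomes a negated glob.
  if (excludePat) {
    args.push("--glob", `!${excludePat}`);
  }

  // Built-in exclusions go last so they override any earlier include glob.
  for (const glob of EXCLUDED_GLOBS) {
    args.push("--glob", glob);
  }
  return args;
}

console.log(buildGlobArgs("*.tsx", "*.test.ts").join(" "));
```

With the previous ordering, an include pattern such as `*.js` came after the built-in exclusions and could pull excluded directories back into the search; placing the exclusions last avoids that.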