Unverified commit 970b440e, authored by Will Chen, committed by GitHub

Keep local agent running after chat summary (#3260)

## Summary

- Stop treating `set_chat_summary` as a local-agent stream stop condition.
- Update the local-agent prompt and tool guidance to call the chat summary tool early and exactly once.
- Add coverage and update prompt/request snapshots for the new behavior.

## Test plan

- `npm run fmt && npm run lint:fix && npm run ts`
- `npm test`
Parent 3e2e137c
@@ -176,7 +176,7 @@
"type": "function",
"function": {
"name": "set_chat_summary",
- "description": "Set the title/summary for this chat message. You should only call this tool at the end of the turn when you have finished calling all the other tools.",
+ "description": "Set the title/summary for this chat. Call this tool exactly once early in the turn, as soon as you understand the user's request well enough to write a short title. Do not wait until the end of the turn.",
"parameters": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
@@ -341,7 +341,7 @@
"type": "function",
"function": {
"name": "set_chat_summary",
- "description": "Set the title/summary for this chat message. You should only call this tool at the end of the turn when you have finished calling all the other tools.",
+ "description": "Set the title/summary for this chat. Call this tool exactly once early in the turn, as soon as you understand the user's request well enough to write a short title. Do not wait until the end of the turn.",
"parameters": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
@@ -18,3 +18,7 @@ Agent tool definitions live in `src/pro/main/ipc/handlers/local_agent/tools/`. E
## Metadata-only stop tools
- If a metadata-only tool such as `set_chat_summary` is added to `stopWhen`, audit downstream pass gates that inspect the final step's `toolCalls`. A final metadata tool call should not suppress safety follow-up passes such as incomplete todo reminders.
+ ## Prompt and request snapshots
+ - When changing local-agent prompt text or tool descriptions, update both prompt unit snapshots and E2E request snapshots; stale request snapshots can still contain old tool descriptions even after unit prompt snapshots pass.
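One way to catch the stale-snapshot case described above is to scan snapshot contents for the old description text. This is a minimal sketch, assuming the snapshots are available as a map from file name to contents; `findStaleSnapshots` and the sample file names are hypothetical and not part of the repository.

```typescript
// Hypothetical helper: returns the names of snapshots that still contain
// text that should have been replaced.
function findStaleSnapshots(
  snapshots: Record<string, string>,
  staleText: string,
): string[] {
  return Object.entries(snapshots)
    .filter(([, contents]) => contents.includes(staleText))
    .map(([fileName]) => fileName)
    .sort();
}

// Fragment of the old tool description removed by this commit.
const OLD_DESCRIPTION = "You should only call this tool at the end of the turn";

// Illustrative data only; real snapshot files live on disk.
const exampleSnapshots: Record<string, string> = {
  "prompt.unit.snap": "Call this tool exactly once early in the turn",
  "request.e2e.snap": `Set the title/summary. ${OLD_DESCRIPTION} when done.`,
};

const staleSnapshotNames = findStaleSnapshots(exampleSnapshots, OLD_DESCRIPTION);
console.log(staleSnapshotNames);
```

Here only the E2E request snapshot is reported, which mirrors the situation the guideline warns about: the unit prompt snapshot is already up to date while the request snapshot is not.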
@@ -27,13 +27,13 @@ If you output one of these commands, tell the user to look for the action button
- Always reply to the user in the same language they are using.
- Keep explanations concise and focused
- If the user asks for help or wants to give feedback, tell them to use the Help button in the bottom left.
+ - Set a chat summary early in the turn using the \`set_chat_summary\` tool. Call it exactly once, as soon as you understand the user's request well enough to write a short title. Do not wait until the end of the turn.
- Be careful not to introduce security vulnerabilities such as command injection, XSS, SQL injection, and other OWASP top 10 vulnerabilities. If you notice that you wrote insecure code, immediately fix it. Prioritize writing safe, secure, and correct code.
- Before proceeding with any code edits, check whether the user's request has already been implemented. If the requested change has already been made in the codebase, point this out to the user, e.g., "This feature is already implemented as described."
- Only edit files that are related to the user's request and leave all other files alone.
- All edits you make on the codebase will directly be built and rendered, therefore you should NEVER make partial changes like letting the user know that they should implement some components or partially implementing features.
- If a user asks for many features at once, implement as many as possible within a reasonable response. Each feature you implement must be FULLY FUNCTIONAL with complete code - no placeholders, no partial implementations, no TODO comments. If you cannot implement all requested features due to response length constraints, clearly communicate which features you've completed and which ones you haven't started yet.
- Prioritize creating small, focused files and components.
- - Set a chat summary at the end of a turn using the \`set_chat_summary\` tool.
- Avoid over-engineering. Only make changes that are directly requested or clearly necessary. Keep solutions simple and focused.
- Don't add features, refactor code, or make "improvements" beyond what was asked. A bug fix doesn't need surrounding code cleaned up. A simple feature doesn't need extra configurability. Don't add docstrings, comments, or type annotations to code you didn't change. Only add comments where the logic isn't self-evident.
- Don't add error handling, fallbacks, or validation for scenarios that can't happen. Trust internal code and framework guarantees. Only validate at system boundaries (user input, external APIs). Don't use feature flags or backwards-compatibility shims when you can just change the code.
@@ -141,6 +141,7 @@ You are friendly and helpful, always aiming to provide clear explanations. You t
- Always reply to the user in the same language they are using.
- Keep explanations concise and focused
- If the user asks for help or wants to give feedback, tell them to use the Help button in the bottom left.
+ - Set a chat summary early in the turn using the \`set_chat_summary\` tool. Call it exactly once, as soon as you understand the user's request well enough to write a short title. Do not wait until the end of the turn.
- Use your tools to read and understand the codebase before answering questions
- Provide clear, accurate explanations based on the actual code
- When explaining code, reference specific files and line numbers when helpful
@@ -211,13 +212,13 @@ If you output one of these commands, tell the user to look for the action button
- Always reply to the user in the same language they are using.
- Keep explanations concise and focused
- If the user asks for help or wants to give feedback, tell them to use the Help button in the bottom left.
+ - Set a chat summary early in the turn using the \`set_chat_summary\` tool. Call it exactly once, as soon as you understand the user's request well enough to write a short title. Do not wait until the end of the turn.
- Be careful not to introduce security vulnerabilities such as command injection, XSS, SQL injection, and other OWASP top 10 vulnerabilities. If you notice that you wrote insecure code, immediately fix it. Prioritize writing safe, secure, and correct code.
- Before proceeding with any code edits, check whether the user's request has already been implemented. If the requested change has already been made in the codebase, point this out to the user, e.g., "This feature is already implemented as described."
- Only edit files that are related to the user's request and leave all other files alone.
- All edits you make on the codebase will directly be built and rendered, therefore you should NEVER make partial changes like letting the user know that they should implement some components or partially implementing features.
- If a user asks for many features at once, implement as many as possible within a reasonable response. Each feature you implement must be FULLY FUNCTIONAL with complete code - no placeholders, no partial implementations, no TODO comments. If you cannot implement all requested features due to response length constraints, clearly communicate which features you've completed and which ones you haven't started yet.
- Prioritize creating small, focused files and components.
- - Set a chat summary at the end of a turn using the \`set_chat_summary\` tool.
- Avoid over-engineering. Only make changes that are directly requested or clearly necessary. Keep solutions simple and focused.
- Don't add features, refactor code, or make "improvements" beyond what was asked. A bug fix doesn't need surrounding code cleaned up. A simple feature doesn't need extra configurability. Don't add docstrings, comments, or type annotations to code you didn't change. Only add comments where the logic isn't self-evident.
- Don't add error handling, fallbacks, or validation for scenarios that can't happen. Trust internal code and framework guarantees. Only validate at system boundaries (user input, external APIs). Don't use feature flags or backwards-compatibility shims when you can just change the code.
@@ -1368,6 +1368,32 @@ describe("handleLocalAgentStream", () => {
   });
   describe("Todo follow-up", () => {
+    it("does not stop the stream when set_chat_summary is called", async () => {
+      // Arrange
+      const { event } = createFakeEvent();
+      mockSettings = buildTestSettings({ enableDyadPro: true });
+      mockChatData = buildTestChat();
+      mockStreamResult = createFakeStream([]);
+      // Act
+      await handleLocalAgentStream(
+        event,
+        { chatId: 1, prompt: "test" },
+        new AbortController(),
+        {
+          placeholderMessageId: 10,
+          systemPrompt: "You are helpful",
+          dyadRequestId,
+        },
+      );
+      // Assert
+      const streamOptions = vi.mocked(streamText).mock.calls[0]?.[0] as any;
+      expect(streamOptions.stopWhen).not.toContainEqual({
+        toolName: "set_chat_summary",
+      });
+    });
     it("runs a follow-up pass when the first pass ends with set_chat_summary and incomplete todos remain", async () => {
       // Arrange
       const { event } = createFakeEvent();
@@ -720,8 +720,6 @@ export async function handleLocalAgentStream(
tools: allTools,
stopWhen: [
stepCountIs(maxToolCallSteps),
- // We instruct AI to only emit set chat summary tool call at the end of the turn.
- hasToolCall(setChatSummaryTool.name),
// User needs to explicitly set up integration before AI can continue.
hasToolCall(addIntegrationTool.name),
// In plan mode, also stop after writing a plan or exiting plan mode.
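The effect of dropping the `set_chat_summary` entry from `stopWhen` can be illustrated with a small model of how stop conditions are evaluated. This is only a sketch: `stepCountIs` and `hasToolCall` below are simplified stand-ins written for this example, not the SDK's implementations, and the tool name `add_integration` and the step limit of 25 are assumptions.

```typescript
// Simplified stand-ins for the stop-condition helpers; the real ones are
// imported from the SDK and may differ in shape and behavior.
interface ToolCall {
  toolName: string;
}
interface Step {
  toolCalls: ToolCall[];
}
type StopCondition = (steps: Step[]) => boolean;

// Stop once the number of completed steps reaches the limit.
const stepCountIs =
  (max: number): StopCondition =>
  (steps) =>
    steps.length >= max;

// Stop when the most recent step called the named tool.
const hasToolCall =
  (toolName: string): StopCondition =>
  (steps) => {
    const last = steps[steps.length - 1];
    return (
      last !== undefined &&
      last.toolCalls.some((call) => call.toolName === toolName)
    );
  };

// The loop stops as soon as any condition matches.
function shouldStop(conditions: StopCondition[], steps: Step[]): boolean {
  return conditions.some((condition) => condition(steps));
}

// Condition lists before and after this change (tool names assumed).
const conditionsBefore: StopCondition[] = [
  stepCountIs(25),
  hasToolCall("set_chat_summary"),
  hasToolCall("add_integration"),
];
const conditionsAfter: StopCondition[] = [
  stepCountIs(25),
  hasToolCall("add_integration"),
];

// A first step in which the model only sets the chat summary.
const summaryOnlySteps: Step[] = [
  { toolCalls: [{ toolName: "set_chat_summary" }] },
];

console.log(shouldStop(conditionsBefore, summaryOnlySteps));
console.log(shouldStop(conditionsAfter, summaryOnlySteps));
```

With the old list, an early summary call would end the stream before any real work happened; with the new list, the agent keeps running, which is what makes the "call it early" prompt guidance workable.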
@@ -1213,7 +1211,7 @@ export async function handleLocalAgentStream(
}
// Check if the model ended with text only (no tool calls in the final step).
- // A final set_chat_summary call is end-of-turn metadata, so it should not
+ // set_chat_summary is metadata, so a summary-only final step should not
// suppress the todo safety follow-up when the pass already produced text.
// This is more reliable than passProducedChatText which is set on any text-delta
// during the stream (including preambles before tool calls).
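The check described in the comment above can be sketched as a filter over the final step's tool calls. The names `METADATA_ONLY_TOOLS`, `endedWithTextOnly`, and `needsTodoFollowUp` are hypothetical, as are the `FinalStep` shape and the `write_file` tool name; the handler's actual variables are not shown in this hunk.

```typescript
interface ToolCall {
  toolName: string;
}
// Assumed shape of the last step of a pass.
interface FinalStep {
  text: string;
  toolCalls: ToolCall[];
}

// Assumption: only set_chat_summary is named as metadata-only in the source.
const METADATA_ONLY_TOOLS = new Set<string>(["set_chat_summary"]);

// True when the final step produced text and made no tool calls other than
// metadata-only ones.
function endedWithTextOnly(step: FinalStep): boolean {
  const substantiveCalls = step.toolCalls.filter(
    (call) => !METADATA_ONLY_TOOLS.has(call.toolName),
  );
  return step.text.trim().length > 0 && substantiveCalls.length === 0;
}

// The todo safety follow-up runs only if todos remain and the pass ended
// with text (optionally accompanied by a metadata-only call).
function needsTodoFollowUp(step: FinalStep, incompleteTodos: number): boolean {
  return incompleteTodos > 0 && endedWithTextOnly(step);
}

const summaryOnlyFinalStep: FinalStep = {
  text: "All done.",
  toolCalls: [{ toolName: "set_chat_summary" }],
};
console.log(needsTodoFollowUp(summaryOnlyFinalStep, 2));
```

Filtering by tool name keeps the gate independent of when the summary call happens, so it behaves the same whether the model sets the summary early or, against guidance, in the final step.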
@@ -13,7 +13,7 @@ export const setChatSummaryTool: ToolDefinition<
> = {
name: "set_chat_summary",
description:
- "Set the title/summary for this chat message. You should only call this tool at the end of the turn when you have finished calling all the other tools.",
+ "Set the title/summary for this chat. Call this tool exactly once early in the turn, as soon as you understand the user's request well enough to write a short title. Do not wait until the end of the turn.",
inputSchema: setChatSummarySchema,
defaultConsent: "always",
@@ -31,7 +31,8 @@ If you output one of these commands, tell the user to look for the action button
const COMMON_GUIDELINES = `- All text you output outside of tool use is displayed to the user. Output text to communicate with the user. You can use Github-flavored markdown for formatting.
- Always reply to the user in the same language they are using.
- Keep explanations concise and focused
- - If the user asks for help or wants to give feedback, tell them to use the Help button in the bottom left.`;
+ - If the user asks for help or wants to give feedback, tell them to use the Help button in the bottom left.
+ - Set a chat summary early in the turn using the \`set_chat_summary\` tool. Call it exactly once, as soon as you understand the user's request well enough to write a short title. Do not wait until the end of the turn.`;
const GENERAL_GUIDELINES_BLOCK = `<general_guidelines>
${COMMON_GUIDELINES}
@@ -41,7 +42,6 @@ ${COMMON_GUIDELINES}
- All edits you make on the codebase will directly be built and rendered, therefore you should NEVER make partial changes like letting the user know that they should implement some components or partially implementing features.
- If a user asks for many features at once, implement as many as possible within a reasonable response. Each feature you implement must be FULLY FUNCTIONAL with complete code - no placeholders, no partial implementations, no TODO comments. If you cannot implement all requested features due to response length constraints, clearly communicate which features you've completed and which ones you haven't started yet.
- Prioritize creating small, focused files and components.
- - Set a chat summary at the end of a turn using the \`set_chat_summary\` tool.
- Avoid over-engineering. Only make changes that are directly requested or clearly necessary. Keep solutions simple and focused.
- Don't add features, refactor code, or make "improvements" beyond what was asked. A bug fix doesn't need surrounding code cleaned up. A simple feature doesn't need extra configurability. Don't add docstrings, comments, or type annotations to code you didn't change. Only add comments where the logic isn't self-evident.
- Don't add error handling, fallbacks, or validation for scenarios that can't happen. Trust internal code and framework guarantees. Only validate at system boundaries (user input, external APIs). Don't use feature flags or backwards-compatibility shims when you can just change the code.
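Moving the summary bullet into `COMMON_GUIDELINES` is what makes it appear in every prompt block that interpolates the shared text, including the block that previously had no summary instruction. A minimal sketch of that composition, with shortened bullet text; `ASK_GUIDELINES_BLOCK`, the closing tags, and `countOccurrences` are assumptions made for illustration:

```typescript
// Shared bullets, mirroring the role of COMMON_GUIDELINES in the diff
// (bullet text shortened for the example).
const COMMON_GUIDELINES = [
  "- Keep explanations concise and focused",
  "- Set a chat summary early in the turn using the `set_chat_summary` tool.",
].join("\n");

// Each mode-specific block interpolates the shared bullets.
const GENERAL_GUIDELINES_BLOCK = `<general_guidelines>
${COMMON_GUIDELINES}
- Prioritize creating small, focused files and components.
</general_guidelines>`;

// Hypothetical second block standing in for another mode's prompt.
const ASK_GUIDELINES_BLOCK = `<general_guidelines>
${COMMON_GUIDELINES}
- Use your tools to read and understand the codebase before answering questions
</general_guidelines>`;

// Counts non-overlapping occurrences of needle in text.
function countOccurrences(text: string, needle: string): number {
  return text.split(needle).length - 1;
}

const summaryMentions = [GENERAL_GUIDELINES_BLOCK, ASK_GUIDELINES_BLOCK].map(
  (block) => countOccurrences(block, "set_chat_summary"),
);
console.log(summaryMentions);
```

Keeping the instruction in one shared constant also avoids the duplicate that would result from leaving the old end-of-turn bullet in the general block, which is why the diff removes it there.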