Unverified commit 2ebe6208, authored by Will Chen, committed by GitHub

Support compaction mid-turn (#2524)

> [!NOTE]
> **Medium Risk**
> Changes local-agent streaming/history rebuilding and compaction timing/persistence, which can impact message ordering, UI rendering, and tool-loop continuity if edge cases are missed.
>
> **Overview**
> **Enables mid-turn context compaction in local-agent mode** by triggering compaction between AI SDK steps (when token usage crosses the threshold and the step included tool calls) so the agent can finish the same user turn.
>
> Updates `handleLocalAgentStream` to rebuild message history after compaction while preserving in-flight tool-loop messages, inline the compaction indicator into the active assistant response, hide newly-inserted compaction-summary DB rows from streamed message lists, and persist only the post-compaction `aiMessagesJson` slice. Also hardens compaction output by XML-escaping the stored summary, adds `createdAtStrategy` handling for compaction insertion timing, and includes new unit/E2E coverage (fixture + snapshots) plus expanded git workflow documentation.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit ac3ccb6e1221012141954ba6560ef2426bf07253.</sup>

---

## Summary by cubic

Enables mid-turn context compaction in local-agent tool loops so the agent can finish the same user turn. Improves history rebuilding, inline compaction messaging, streaming behavior, scheduling, and safe persistence.

- **New Features**
  - Triggers mid-turn compaction when per-step tokens cross the threshold and tool calls exist; schedules in `onStepFinish` and applies before the next step.
  - Centralizes history via `buildChatMessageHistory`, preserving in-flight assistant/tool messages, hiding mid-turn compaction DB rows from streaming, and placing the summary after the triggering user message.
  - Streams a compaction preview over current content and inlines the final compaction summary into the current assistant turn; selects `createdAtStrategy` (`"now"` mid-turn, `"before-latest-user"` pre-turn) with a 1-second margin to keep turn order.
  - Persists only post-compaction AI messages when compaction happens mid-turn, slicing correctly with 0-indexed `stepNumber`.
- **Bug Fixes**
  - Escapes the compaction summary in both the live preview and the persisted DB message to prevent XSS.
  - Re-queries the chat only on successful compaction and filters hidden compaction summaries out of streaming payloads.
  - Clears injected messages after mid-turn compaction to avoid stale insertion indices; prevents repeat attempts and skips further compaction checks in the same turn after success.
  - Always runs `checkAndMarkForCompaction` in `onStepFinish` to mark next-turn compaction when appropriate.

<sup>Written for commit ac3ccb6e1221012141954ba6560ef2426bf07253.</sup>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Parent af50a6c4
......@@ -91,3 +91,51 @@ Use unit testing for pure business logic and util functions.
### E2E testing
See [rules/e2e-testing.md](rules/e2e-testing.md) for full E2E testing guidance, including Playwright tips and fixture setup.
## Git workflow
When pushing changes and creating PRs:
1. If the branch already has an associated PR, push to whichever remote the branch is tracking.
2. If the branch hasn't been pushed before, default to pushing to `origin` (the fork `wwwillchen/dyad`), then create a PR from the fork to the upstream repo (`dyad-sh/dyad`).
3. If you cannot push to the fork due to permissions, push directly to `upstream` (`dyad-sh/dyad`) as a last resort.
### Skipping automated review
Add `#skip-bugbot` to the PR description for trivial PRs that won't affect end-users, such as:
- Claude settings, commands, or agent configuration
- Linting or test setup changes
- Documentation-only changes
- CI/build configuration updates
## Learnings
### Cross-repo PR workflows (forks)
When running GitHub Actions with `pull_request_target` on cross-repo PRs (from forks):
- The checkout action sets `origin` to the **fork** (head repo), not the base repo
- To rebase onto the base repo's main, you must add an `upstream` remote: `git remote add upstream https://github.com/<base-repo>.git`
- Remote setup for cross-repo PRs: `origin` → fork (push here), `upstream` → base repo (rebase from here)
- The `GITHUB_TOKEN` can push to the fork if the PR author enabled "Allow edits from maintainers"
### AI SDK step.usage in onStepFinish vs onFinish
In the AI SDK's `streamText`, `step.usage.totalTokens` in `onStepFinish` is **per-step** (single LLM call), not cumulative. The cumulative usage across all steps is only available in `onFinish` via `response.usage.totalTokens`. For context window comparisons (e.g., compaction thresholds), per-step usage is actually more accurate since each step's input tokens already include the full conversation context.
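A minimal sketch of the per-step threshold check described above. The `StepResult` shape, the `shouldCompactMidTurn` helper, and the 180,000-token threshold are assumptions made for illustration; they are not taken from the AI SDK or from this repository.

```typescript
// Hypothetical shapes: the real AI SDK step object carries more fields.
interface StepUsage {
  totalTokens: number;
}

interface StepResult {
  usage: StepUsage;
  toolCalls: unknown[];
}

// Assumed threshold, chosen only for this example.
const COMPACTION_THRESHOLD_TOKENS = 180_000;

// Decide from a single step's usage whether to compact before the next step.
function shouldCompactMidTurn(
  step: StepResult,
  threshold: number = COMPACTION_THRESHOLD_TOKENS,
): boolean {
  // Per-step usage already reflects the full context sent to the model,
  // so it is compared to the threshold directly (no summing across steps).
  const crossedThreshold = step.usage.totalTokens >= threshold;
  // Mid-turn compaction only matters if the tool loop will continue.
  const loopContinues = step.toolCalls.length > 0;
  return crossedThreshold && loopContinues;
}

const exampleSteps: StepResult[] = [
  { usage: { totalTokens: 60_000 }, toolCalls: [{ name: "read_file" }] },
  { usage: { totalTokens: 70_000 }, toolCalls: [{ name: "read_file" }] },
  { usage: { totalTokens: 200_000 }, toolCalls: [{ name: "read_file" }] },
];

const compactionDecisions = exampleSteps.map((step) =>
  shouldCompactMidTurn(step),
);
console.log(compactionDecisions); // [false, false, true]
```

Summing the three steps would give 330,000 tokens, which overstates the context size because each step's input already contains the earlier conversation.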
### AI SDK stepNumber is 0-indexed
In `prepareStep`, the AI SDK sets `stepNumber = steps.length`. The first call has `steps = []` so `stepNumber = 0`, the second call has one step so `stepNumber = 1`, etc. When writing tests that mock `prepareStep`, use 0-indexed step numbers to match real SDK behavior.
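The indexing can be illustrated with a small stand-in for the step loop. Both `simulateStepNumbers` and `messagesFromStep` are hypothetical helpers written for this note; they only mimic the `stepNumber = steps.length` rule and one possible way to slice per-step messages with it.

```typescript
// Stand-in for the SDK loop: each call sees stepNumber equal to steps.length.
function simulateStepNumbers(totalSteps: number): number[] {
  const steps: string[] = [];
  const seen: number[] = [];
  for (let i = 0; i < totalSteps; i++) {
    const stepNumber = steps.length; // 0 on the first call
    seen.push(stepNumber);
    steps.push(`step-${stepNumber}`);
  }
  return seen;
}

// Hypothetical helper: keep only messages produced at or after the step
// before which compaction was applied (0-indexed, like stepNumber).
function messagesFromStep<T>(perStep: T[][], firstKeptStep: number): T[] {
  return perStep
    .slice(firstKeptStep)
    .reduce<T[]>((acc, msgs) => acc.concat(msgs), []);
}

const seenStepNumbers = simulateStepNumbers(3);
console.log(seenStepNumbers); // [0, 1, 2]

const perStepMessages = [
  ["assistant-0", "tool-0"],
  ["assistant-1", "tool-1"],
  ["assistant-2"],
];
// Compaction applied before the third call, i.e. stepNumber 2.
const keptMessages = messagesFromStep(perStepMessages, 2);
console.log(keptMessages); // ["assistant-2"]
```

A test that mocked the first call with `stepNumber: 1` would slice one step too late and silently drop that step's messages.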
### Custom chat message indicators
The `<dyad-status>` tag in chat messages renders as a collapsible status indicator box. Use it for system messages like compaction notifications:
```
<dyad-status title="My Title" state="finished">
Content here
</dyad-status>
```
Valid states: `"finished"`, `"in-progress"`, `"aborted"`
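As an illustration, a status message could be assembled with a small helper like the one below. `buildDyadStatus` and both escaping functions are hypothetical and written only for this example; in particular, whether the `<dyad-status>` renderer decodes escaped content is an assumption that should be checked against the actual component.

```typescript
type DyadStatusState = "finished" | "in-progress" | "aborted";

// Example-only escaper; it is not the repository's escapeXmlContent.
function escapeForTagBody(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Attribute values additionally need double quotes escaped.
function escapeForAttribute(text: string): string {
  return escapeForTagBody(text).replace(/"/g, "&quot;");
}

// Builds the tag in the same layout as the example above.
function buildDyadStatus(
  title: string,
  state: DyadStatusState,
  content: string,
): string {
  const open = `<dyad-status title="${escapeForAttribute(title)}" state="${state}">`;
  return `${open}\n${escapeForTagBody(content)}\n</dyad-status>`;
}

const statusMessage = buildDyadStatus("My Title", "finished", "1 < 2 & 3");
console.log(statusMessage);
```

Typing `state` as a union of the three valid values lets the compiler reject misspelled states before they reach the chat renderer.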
......@@ -35,3 +35,33 @@ testSkipIfWindows(
await po.snapshotMessages({ replaceDumpPath: true });
},
);
testSkipIfWindows(
"local-agent - context compaction can run mid-turn",
async ({ po }) => {
await po.setUpDyadPro({ localAgent: true });
await po.importApp("minimal");
await po.chatActions.selectLocalAgentMode();
await po.sendPrompt("hi");
// This fixture runs several tool-call steps; the third reports high token
// usage, which triggers compaction before the remaining tool call and the
// final text response of the same user turn.
await po.sendPrompt("tc=local-agent/compaction-mid-turn");
// Mid-turn compaction summary should be visible after a single prompt.
await expect(po.page.getByText("Conversation compacted")).toBeVisible({
timeout: Timeout.MEDIUM,
});
// The agent should still complete the response in the same turn.
await expect(po.page.getByText("END OF COMPACTED TURN.")).toBeVisible({
timeout: Timeout.MEDIUM,
});
await po.sendPrompt("[dump] hi");
await po.snapshotServerDump("all-messages");
// Snapshot the messages to capture the compaction summary + second response
await po.snapshotMessages({ replaceDumpPath: true });
},
);
import type { LocalAgentFixture } from "../../../../testing/fake-llm-server/localAgentTypes";
/**
* Fixture that triggers compaction during the same user turn:
* 1) Steps 1 and 2 make ordinary tool calls
* 2) Step 3 makes a tool call and reports high token usage (200k)
* 3) Step 4 makes another tool call after compaction
* 4) Step 5 returns the final text
*
* Local agent should compact between step 3 and step 4.
*/
export const fixture: LocalAgentFixture = {
description: "Trigger compaction between tool-loop steps in one turn",
turns: [
{
text: "first step",
toolCalls: [
{
name: "read_file",
args: {
path: "README.md",
},
},
],
},
{
text: "second step",
toolCalls: [
{
name: "read_file",
args: {
path: "AI_RULES.md",
},
},
],
},
{
text: "This tool call will trigger compaction.",
toolCalls: [
{
name: "read_file",
args: {
path: "src/App.tsx",
},
},
],
usage: {
prompt_tokens: 199_900,
completion_tokens: 100,
total_tokens: 200_000,
},
},
{
text: "post-compaction step",
toolCalls: [
{
name: "read_file",
args: {
path: "SOMEFILE.md",
},
},
],
},
{
text: "END OF COMPACTED TURN.",
},
],
};
- paragraph: /Generate an AI_RULES\.md file for this app\. Describe the tech stack in 5-\d+ bullet points and describe clear rules about what libraries to use for what\./
- button "file1.txt file1.txt Edit":
- img
- text: ""
- button "Edit":
- img
- text: ""
- img
- paragraph: More EOM
- button "Copy":
- img
- img
- text: Approved
- img
- text: claude-opus-4-5
- img
- text: less than a minute ago
- button "Copy Request ID":
- img
- text: ""
- paragraph: hi
- button "file1.txt file1.txt Edit":
- img
- text: ""
- button "Edit":
- img
- text: ""
- img
- paragraph: More EOM
- button "Copy":
- img
- img
- text: Approved
- img
- text: claude-opus-4-5
- img
- text: less than a minute ago
- button "Copy Request ID":
- img
- text: ""
- paragraph: tc=local-agent/compaction-mid-turn
- paragraph: first step
- img
- text: Read README.md
- 'button "Error Tool ''read_file'' failed: File does not exist: README.md Copy Fix with AI"':
- img
- text: ""
- img
- button "Copy":
- img
- text: ""
- button "Fix with AI":
- img
- text: ""
- paragraph: second step
- img
- text: Read AI_RULES.md
- 'button "Error Tool ''read_file'' failed: File does not exist: AI_RULES.md Copy Fix with AI"':
- img
- text: ""
- img
- button "Copy":
- img
- text: ""
- button "Fix with AI":
- img
- text: ""
- paragraph: This tool call will trigger compaction.
- img
- text: Read src/App.tsx
- button "Conversation compacted":
- img
- text: ""
- img
- heading "Key Decisions Made" [level=2]
- list:
- listitem: Completed initial task as requested
- heading "Current Task State" [level=2]
- paragraph: Conversation was compacted to save context space.
- paragraph: "If you need to retrieve earlier parts of the conversation history, you can read the backup file at: [[compaction-backup-path]] Note: This file may be large. Read only the sections you need or use grep to search for specific content rather than reading the entire file. post-compaction step"
- img
- text: Read SOMEFILE.md
- 'button "Error Tool ''read_file'' failed: File does not exist: SOMEFILE.md Copy Fix with AI"':
- img
- text: ""
- img
- button "Copy":
- img
- text: ""
- button "Fix with AI":
- img
- text: ""
- button "Fix All Errors (3)":
- img
- text: ""
- paragraph: END OF COMPACTED TURN.
- button "Copy":
- img
- img
- text: Approved
- img
- text: claude-opus-4-5
- img
- text: less than a minute ago
- img
- text: (1 files changed)
- button "Copy Request ID":
- img
- text: ""
- paragraph: "[dump] hi"
- paragraph: "[[dyad-dump-path=*]]"
- button "Copy":
- img
- img
- text: claude-opus-4-5
- img
- text: less than a minute ago
- button "Copy Request ID":
- img
- text: ""
- button "Undo":
- img
- text: ""
- button "Retry":
- img
- text: ""
\ No newline at end of file
===
role: system
message:
<role>
You are Dyad, an AI assistant that creates and modifies web applications. You assist users by chatting with them and making changes to their code in real-time. You understand that users can see a live preview of their application in an iframe on the right side of the screen while you make code changes.
You make efficient and effective changes to codebases while following best practices for maintainability and readability. You take pride in keeping things simple and elegant. You are friendly and helpful, always aiming to provide clear explanations.
</role>
<app_commands>
Do *not* tell the user to run shell commands. Instead, they can do one of the following commands in the UI:
- **Rebuild**: This will rebuild the app from scratch. First it deletes the node_modules folder and then it re-installs the npm packages and then starts the app server.
- **Restart**: This will restart the app server.
- **Refresh**: This will refresh the app preview page.
You can suggest one of these commands by using the <dyad-command> tag like this:
<dyad-command type="rebuild"></dyad-command>
<dyad-command type="restart"></dyad-command>
<dyad-command type="refresh"></dyad-command>
If you output one of these commands, tell the user to look for the action button above the chat input.
</app_commands>
<general_guidelines>
- Always reply to the user in the same language they are using.
- Before proceeding with any code edits, check whether the user's request has already been implemented. If the requested change has already been made in the codebase, point this out to the user, e.g., "This feature is already implemented as described."
- Only edit files that are related to the user's request and leave all other files alone.
- All edits you make on the codebase will directly be built and rendered, therefore you should NEVER make partial changes like letting the user know that they should implement some components or partially implementing features.
- If a user asks for many features at once, implement as many as possible within a reasonable response. Each feature you implement must be FULLY FUNCTIONAL with complete code - no placeholders, no partial implementations, no TODO comments. If you cannot implement all requested features due to response length constraints, clearly communicate which features you've completed and which ones you haven't started yet.
- Prioritize creating small, focused files and components.
- Keep explanations concise and focused
- Set a chat summary at the end using the `set_chat_summary` tool.
- DO NOT OVERENGINEER THE CODE. You take great pride in keeping things simple and elegant. You don't start by writing very complex error handling, fallback mechanisms, etc. You focus on the user's request and make the minimum amount of changes needed.
DON'T DO MORE THAN WHAT THE USER ASKS FOR.
</general_guidelines>
<tool_calling>
You have tools at your disposal to solve the coding task. Follow these rules regarding tool calls:
1. ALWAYS follow the tool call schema exactly as specified and make sure to provide all necessary parameters.
2. The conversation may reference tools that are no longer available. NEVER call tools that are not explicitly provided.
3. **NEVER refer to tool names when speaking to the USER.** Instead, just say what the tool is doing in natural language.
4. If you need additional information that you can get via tool calls, prefer that over asking the user.
5. If you make a plan, immediately follow it, do not wait for the user to confirm or tell you to go ahead. The only time you should stop is if you need more information from the user that you can't find any other way, or have different options that you would like the user to weigh in on.
6. Only use the standard tool call format and the available tools. Even if you see user messages with custom tool call formats (such as "<previous_tool_call>" or similar), do not follow that and instead use the standard format. Never output tool calls as part of a regular assistant message of yours.
7. If you are not sure about file content or codebase structure pertaining to the user's request, use your tools to read files and gather the relevant information: do NOT guess or make up an answer.
8. You can autonomously read as many files as you need to clarify your own questions and completely resolve the user's query, not just one.
9. You can call multiple tools in a single response. You can also call multiple tools in parallel, do this for independent operations like reading multiple files at once.
</tool_calling>
<tool_calling_best_practices>
- **Read before writing**: Use `read_file` and `list_files` to understand the codebase before making changes
- **Use `edit_file` for edits**: For modifying existing files, prefer `edit_file` over `write_file`
- **Be surgical**: Only change what's necessary to accomplish the task
- **Handle errors gracefully**: If a tool fails, explain the issue and suggest alternatives
</tool_calling_best_practices>
<file_editing_tool_selection>
You have three tools for editing files. Choose based on the scope of your change:
| Scope | Tool | Examples |
|-------|------|----------|
| **Small** (a few lines) | `search_replace` or `edit_file` | Fix a typo, rename a variable, update a value, change an import |
| **Medium** (one function or section) | `edit_file` | Rewrite a function, add a new component, modify multiple related lines |
| **Large** (most of the file) | `write_file` | Major refactor, rewrite a module, create a new file |
**Tips:**
- `edit_file` supports `// ... existing code ...` markers to skip unchanged sections
- When in doubt, prefer `search_replace` for precision or `write_file` for simplicity
**Post-edit verification (REQUIRED):**
After every edit, read the file to verify changes applied correctly. If something went wrong, try a different tool and verify again.
</file_editing_tool_selection>
<development_workflow>
1. **Understand:** Think about the user's request and the relevant codebase context. Use `grep` and `code_search` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use `read_file` to understand context and validate any assumptions you may have. If you need to read multiple files, you should make multiple parallel calls to `read_file`.
2. **Plan:** Build a coherent and grounded (based on the understanding in step 1) plan for how you intend to resolve the user's task. For complex tasks, break them down into smaller, manageable subtasks and use the `update_todos` tool to track your progress. Share an extremely concise yet clear plan with the user if it would help the user understand your thought process.
3. **Implement:** Use the available tools (e.g., `edit_file`, `write_file`, ...) to act on the plan, strictly adhering to the project's established conventions. When debugging, add targeted console.log statements to trace data flow and identify root causes. **Important:** After adding logs, you must ask the user to interact with the application (e.g., click a button, submit a form, navigate to a page) to trigger the code paths where logs were added—the logs will only be available once that code actually executes.
4. **Verify:** After making code changes, use `run_type_checks` to verify that the changes are correct and read the file contents to ensure the changes are what you intended.
5. **Finalize:** After all verification passes, consider the task complete and briefly summarize the changes you made.
</development_workflow>
# Tech Stack
- You are building a React application.
- Use TypeScript.
- Use React Router. KEEP the routes in src/App.tsx
- Always put source code in the src folder.
- Put pages into src/pages/
- Put components into src/components/
- The main page (default page) is src/pages/Index.tsx
- UPDATE the main page to include the new components. OTHERWISE, the user can NOT see any components!
- ALWAYS try to use the shadcn/ui library.
- Tailwind CSS: always use Tailwind CSS for styling components. Utilize Tailwind classes extensively for layout, spacing, colors, and other design aspects.
Available packages and libraries:
- The lucide-react package is installed for icons.
- You ALREADY have ALL the shadcn/ui components and their dependencies installed. So you don't need to install them again.
- You have ALL the necessary Radix UI components installed.
- Use prebuilt components from the shadcn/ui library after importing them. Note that these files shouldn't be edited, so make new components if you need to change them.
===
role: user
message: tc=local-agent/compaction-mid-turn
===
role: assistant
message: <dyad-compaction title="Conversation compacted" state="finished">
## Key Decisions Made
- Completed initial task as requested
## Current Task State
Conversation was compacted to save context space.
</dyad-compaction>
If you need to retrieve earlier parts of the conversation history, you can read the backup file at: [[compaction-backup-path]]
Note: This file may be large. Read only the sections you need or use grep to search for specific content rather than reading the entire file.
===
role: assistant
message: post-compaction step
===
role: tool
message: File does not exist: SOMEFILE.md
===
role: assistant
message: END OF COMPACTED TURN.
===
role: user
message: [dump] hi
\ No newline at end of file
......@@ -29,6 +29,25 @@ When running GitHub Actions with `pull_request_target` on cross-repo PRs (from f
Actions performed using the default `GITHUB_TOKEN` (including labels added by `github-actions[bot]` via `actions/github-script`) do **not** trigger `pull_request_target` or other workflow events. This is a GitHub limitation to prevent infinite loops. If one workflow adds a label that should trigger another workflow (e.g., `label-rebase-prs.yml` adds `cc:rebase` to trigger `claude-rebase.yml`), the label-adding step must use a **PAT** or **GitHub App token** (e.g., `PR_RW_GITHUB_TOKEN`) instead of `GITHUB_TOKEN`.
## GitHub API calls with special characters
When using `gh api` to post comments or replies containing backticks, `$()`, or other shell metacharacters, the security hook will block the command. Instead of passing the body inline with `-f body="..."`, write a JSON file and use `--input`:
```bash
# Write JSON body to a file (use the Write tool, not echo/cat)
# File: .claude/tmp/reply_body.json
# {"body": "Your comment with `backticks` and special chars"}
gh api repos/dyad-sh/dyad/pulls/123/comments/456/replies --input .claude/tmp/reply_body.json
```
Similarly for GraphQL mutations, write the full query + variables as JSON and use `--input`:
```bash
# {"query": "mutation($threadId: ID!) { ... }", "variables": {"threadId": "PRRT_abc123"}}
gh api graphql --input .claude/tmp/resolve_thread.json
```
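The JSON bodies for both calls can be produced with `JSON.stringify`, which sidesteps shell quoting entirely. The sketch below only builds the file contents; the helper names are hypothetical, and the GraphQL query body stays elided exactly as in the example above.

```typescript
// Build the REST reply body as JSON text; backticks and $() need no escaping.
function buildReplyBody(comment: string): string {
  return JSON.stringify({ body: comment });
}

// Build a GraphQL request body with the query and its variables.
function buildGraphqlBody(
  query: string,
  variables: Record<string, string>,
): string {
  return JSON.stringify({ query, variables });
}

const replyComment = "Use `npm ci` instead of $(npm install)";
const replyBodyJson = buildReplyBody(replyComment);
console.log(replyBodyJson);

// The mutation body is elided here, as in the documentation example.
const graphqlBodyJson = buildGraphqlBody("mutation($threadId: ID!) { ... }", {
  threadId: "PRRT_abc123",
});
console.log(graphqlBodyJson);
```

Each string would then be saved to a file under `.claude/tmp/` and passed to `gh api` with `--input`, so the shell never sees the special characters.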
## Adding labels to PRs
`gh pr edit --add-label` fails with a GraphQL "Projects (classic)" deprecation error on repos that had classic projects. Use the REST API instead:
......@@ -41,13 +60,28 @@ gh api repos/dyad-sh/dyad/issues/{PR_NUMBER}/labels -f "labels[]=label-name"
In CI, `claude-code-action` restricts file access to the repo working directory (e.g., `/home/runner/work/dyad/dyad`). Skills that save intermediate files (like PR diffs) must use `./filename` (current working directory), **never** `/tmp/`. Using `/tmp/` causes errors like: `cat in '/tmp/pr_*_diff.patch' was blocked. For security, Claude Code may only concatenate files from the allowed working directories`.
-## Rebase conflict resolution tips
+## Rebase workflow and conflict resolution
### Handling unstaged changes during rebase
If `git rebase` fails with "You have unstaged changes" (common with spurious `package-lock.json` changes):
```bash
git stash push -m "Stashing changes before rebase"
git rebase upstream/main
git stash pop
```
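`git stash pop` reapplies the stashed changes on top of the rebased branch. If they conflict with the rebased code, Git reports the conflict and keeps the stash entry until you resolve it.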
### Conflict resolution tips
- **Before rebasing:** If `npm install` modified `package-lock.json` (common in CI/local), discard changes with `git restore package-lock.json` to avoid "unstaged changes" errors
- When resolving import conflicts (e.g., `<<<<<<< HEAD` with different imports), keep **both** imports if both are valid and needed by the component
- When resolving conflicts in i18n-related commits, watch for duplicate constant definitions that conflict with imports from `@/lib/schemas` (e.g., `DEFAULT_ZOOM_LEVEL`)
- If both sides of a conflict have valid imports/hooks, keep both and remove any duplicate constant redefinitions
- When rebasing documentation/table conflicts (e.g., workflow README tables), prefer keeping **both** additions from HEAD and upstream - merge new rows/content from both branches rather than choosing one side
- **Complementary additions**: When both sides added new sections at the end of a file (e.g., both added different documentation tips), keep both sections rather than choosing one — they're not truly conflicting, just different additions
## Rebasing with uncommitted changes
......@@ -59,3 +93,11 @@ If you need to rebase but have uncommitted changes (e.g., package-lock.json from
4. Discard spurious changes like package-lock.json (if package.json unchanged): `git restore package-lock.json`
This prevents rebase conflicts from uncommitted changes while preserving any work in progress.
## Resolving documentation rebase conflicts
When rebasing a PR branch that conflicts with upstream documentation changes (e.g., AGENTS.md):
- If upstream has reorganized content (e.g., moved sections to separate `rules/*.md` files), keep upstream's version
- Discard the PR's inline content that conflicts with the new organization
- The PR's documentation changes may need to be re-applied to the new file locations after the rebase
......@@ -25,6 +25,7 @@ import {
} from "./compaction_storage";
import { getPostCompactionMessages } from "./compaction_utils";
import { getProviderOptions, getAiHeaders } from "@/ipc/utils/provider_options";
import { escapeXmlContent } from "../../../../shared/xmlEscape";
const logger = log.scope("compaction_handler");
......@@ -115,6 +116,9 @@ export async function performCompaction(
appPath: string,
dyadRequestId: string,
onSummaryChunk?: (accumulatedText: string) => void,
options?: {
createdAtStrategy?: "before-latest-user" | "now";
},
): Promise<CompactionResult> {
const settings = readSettings();
......@@ -197,7 +201,7 @@ export async function performCompaction(
// Create the compaction indicator message
// Include relative backup path so the AI can read the full original conversation later
const compactionMessageContent = `<dyad-compaction title="Conversation compacted" state="finished">
-${summary}
+${escapeXmlContent(summary)}
</dyad-compaction>
If you need to retrieve earlier parts of the conversation history, you can read the backup file at: ${backupPath}
......@@ -218,8 +222,11 @@ Note: This file may be large. Read only the sections you need or use grep to sea
const latestUserMessage = [...chatMessages]
.reverse()
.find((m) => m.role === "user");
-const compactionCreatedAt = latestUserMessage
-? new Date(latestUserMessage.createdAt.getTime() - 1)
+const compactionCreatedAt =
+options?.createdAtStrategy === "now"
+? new Date()
+: latestUserMessage
+? new Date(latestUserMessage.createdAt.getTime() - 1000)
: new Date();
await db.insert(messages).values({
chatId,
......
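The timestamp selection in the last hunk can be read as a small pure function. The sketch below mirrors that ternary under two assumptions: the name `resolveCompactionCreatedAt` is hypothetical, and the current time is passed in as a parameter (rather than read via `new Date()`) so the behavior is easy to test.

```typescript
type CreatedAtStrategy = "before-latest-user" | "now";

// One-second margin, matching the `- 1000` in the diff.
const TURN_ORDER_MARGIN_MS = 1000;

function resolveCompactionCreatedAt(
  strategy: CreatedAtStrategy | undefined,
  latestUserCreatedAt: Date | undefined,
  now: Date,
): Date {
  // Mid-turn: the summary belongs after the user message that triggered it.
  if (strategy === "now") {
    return now;
  }
  // Pre-turn (default): place the summary just before the latest user message.
  if (latestUserCreatedAt) {
    return new Date(latestUserCreatedAt.getTime() - TURN_ORDER_MARGIN_MS);
  }
  // No user message yet: fall back to the current time.
  return now;
}

const latestUserAt = new Date("2024-01-01T00:00:10.000Z");
const currentTime = new Date("2024-01-01T00:00:20.000Z");

const preTurnAt = resolveCompactionCreatedAt(undefined, latestUserAt, currentTime);
const midTurnAt = resolveCompactionCreatedAt("now", latestUserAt, currentTime);

console.log(preTurnAt.toISOString()); // 2024-01-01T00:00:09.000Z
console.log(midTurnAt.toISOString()); // 2024-01-01T00:00:20.000Z
```

With the earlier 1 ms offset, a summary and the user message could collapse to the same value if timestamps are stored at coarser precision, which is presumably why the margin was widened to a full second.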