Support compaction mid-turn (#2524)

> [!NOTE]
> **Medium Risk**
> Changes local-agent streaming/history rebuilding and compaction timing/persistence, which can impact message ordering, UI rendering, and tool-loop continuity if edge cases are missed.
>
> **Overview**
> **Enables mid-turn context compaction in local-agent mode** by triggering compaction between AI SDK steps (when token usage crosses the threshold and the step included tool calls) so the agent can finish the same user turn.
>
> Updates `handleLocalAgentStream` to rebuild message history after compaction while preserving in-flight tool-loop messages, inline the compaction indicator into the active assistant response, hide newly-inserted compaction-summary DB rows from streamed message lists, and persist only the post-compaction `aiMessagesJson` slice. Also hardens compaction output by XML-escaping the stored summary, adds `createdAtStrategy` handling for compaction insertion timing, and includes new unit/E2E coverage (fixture + snapshots) plus expanded git workflow documentation.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit ac3ccb6e1221012141954ba6560ef2426bf07253.</sup>

---

## Summary by cubic

Enables mid-turn context compaction in local-agent tool loops so the agent can finish the same user turn. Improves history rebuilding, inline compaction messaging, streaming behavior, scheduling, and safe persistence.

- **New Features**
  - Triggers mid-turn compaction when per-step tokens cross the threshold and tool calls exist; schedules in `onStepFinish` and applies before the next step.
  - Centralizes history via `buildChatMessageHistory`, preserving in-flight assistant/tool messages, hiding mid-turn compaction DB rows from streaming, and placing the summary after the triggering user message.
  - Streams a compaction preview over current content and inlines the final compaction summary into the current assistant turn; selects `createdAtStrategy` (`"now"` mid-turn, `"before-latest-user"` pre-turn) with a 1-second margin to keep turn order.
  - Persists only post-compaction AI messages when compaction happens mid-turn, slicing correctly with 0-indexed `stepNumber`.
- **Bug Fixes**
  - Escapes the compaction summary in both the live preview and the persisted DB message to prevent XSS.
  - Re-queries the chat only on successful compaction and filters hidden compaction summaries out of streaming payloads.
  - Clears injected messages after mid-turn compaction to avoid stale insertion indices; prevents repeat attempts and skips further compaction checks in the same turn after success.
  - Always runs `checkAndMarkForCompaction` in `onStepFinish` to mark next-turn compaction when appropriate.

<sup>Written for commit ac3ccb6e1221012141954ba6560ef2426bf07253.</sup>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
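
The scheduling described in the summary (decide in `onStepFinish`, apply before the next step, then persist only the post-compaction slice) can be illustrated with a small sketch. This is not the code from this PR: `createMidTurnCompactionScheduler`, its method names, and the threshold passed in are hypothetical, and the hooks are modeled as plain functions rather than the real AI SDK callbacks.

```typescript
// Illustrative only: hypothetical names, not the implementation in this PR.
interface FinishedStep {
  usage: { totalTokens: number }; // per-step usage, not cumulative
  toolCalls: unknown[];
}

function createMidTurnCompactionScheduler(tokenThreshold: number) {
  let pending = false;
  let compactedBeforeStep: number | null = null;

  return {
    // Called from an onStepFinish-style hook: only schedules, never compacts.
    onStepFinish(step: FinishedStep): void {
      if (compactedBeforeStep !== null) return; // at most one compaction per turn
      const crossedThreshold = step.usage.totalTokens >= tokenThreshold;
      const toolLoopContinues = step.toolCalls.length > 0;
      if (crossedThreshold && toolLoopContinues) pending = true;
    },

    // Called from a prepareStep-style hook with the 0-indexed step number.
    // Returns true when the caller should compact before running this step.
    prepareStep(stepNumber: number): boolean {
      if (!pending) return false;
      pending = false;
      compactedBeforeStep = stepNumber;
      return true;
    },

    // Keeps only messages produced at or after the step where compaction ran.
    messagesToPersist<T>(messagesPerStep: T[][]): T[] {
      const start = compactedBeforeStep ?? 0;
      return ([] as T[]).concat(...messagesPerStep.slice(start));
    },
  };
}
```

Because `stepNumber` is 0-indexed, a compaction applied in `prepareStep` for step 1 drops exactly the messages of step 0 from the persisted slice.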

2ebe6208 · Will Chen · GitHub · af50a6c4 · 2ebe6208 · 2ebe6208
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -91,3 +91,51 @@ Use unit testing for pure business logic and util functions.
 ### E2E testing
 See [rules/e2e-testing.md](rules/e2e-testing.md) for full E2E testing guidance, including Playwright tips and fixture setup.
+## Git workflow
+When pushing changes and creating PRs:
+1. If the branch already has an associated PR, push to whichever remote the branch is tracking.
+2. If the branch hasn't been pushed before, default to pushing to `origin` (the fork `wwwillchen/dyad`), then create a PR from the fork to the upstream repo (`dyad-sh/dyad`).
+3. If you cannot push to the fork due to permissions, push directly to `upstream` (`dyad-sh/dyad`) as a last resort.
+### Skipping automated review
+Add `#skip-bugbot` to the PR description for trivial PRs that won't affect end-users, such as:
+- Claude settings, commands, or agent configuration
+- Linting or test setup changes
+- Documentation-only changes
+- CI/build configuration updates
+## Learnings
+### Cross-repo PR workflows (forks)
+When running GitHub Actions with `pull_request_target` on cross-repo PRs (from forks):
+- The checkout action sets `origin` to the **fork** (head repo), not the base repo
+- To rebase onto the base repo's main, you must add an `upstream` remote: `git remote add upstream https://github.com/<base-repo>.git`
+- Remote setup for cross-repo PRs: `origin` → fork (push here), `upstream` → base repo (rebase from here)
+- The `GITHUB_TOKEN` can push to the fork if the PR author enabled "Allow edits from maintainers"
+### AI SDK step.usage in onStepFinish vs onFinish
+In the AI SDK's `streamText`, `step.usage.totalTokens` in `onStepFinish` is **per-step** (single LLM call), not cumulative. The cumulative usage across all steps is only available in `onFinish` via `response.usage.totalTokens`. For context window comparisons (e.g., compaction thresholds), per-step usage is actually more accurate since each step's input tokens already include the full conversation context.
+### AI SDK stepNumber is 0-indexed
+In `prepareStep`, the AI SDK sets `stepNumber = steps.length`. The first call has `steps = []` so `stepNumber = 0`, the second call has one step so `stepNumber = 1`, etc. When writing tests that mock `prepareStep`, use 0-indexed step numbers to match real SDK behavior.
+### Custom chat message indicators
+The `<dyad-status>` tag in chat messages renders as a collapsible status indicator box. Use it for system messages like compaction notifications:
+```
+<dyad-status title="My Title" state="finished">
+Content here
+</dyad-status>
+```
+Valid states: `"finished"`, `"in-progress"`, `"aborted"`
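
As a companion to the `<dyad-status>` note in the AGENTS.md change above, here is a minimal sketch of building such a tag while escaping its content, in the spirit of the XML-escaping fix mentioned in the PR description. The helpers `escapeXml` and `buildDyadStatus` are hypothetical names for illustration; how the renderer treats escaped entities is an assumption and not confirmed by this diff.

```typescript
// Illustrative only: hypothetical helpers, not part of this PR.
type DyadStatusState = "finished" | "in-progress" | "aborted";

// Escape characters that could otherwise be parsed as markup.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a status indicator block with the three states listed in AGENTS.md.
function buildDyadStatus(
  title: string,
  state: DyadStatusState,
  content: string,
): string {
  return (
    `<dyad-status title="${escapeXml(title)}" state="${state}">\n` +
    `${escapeXml(content)}\n` +
    `</dyad-status>`
  );
}
```

Restricting `state` to a string-literal union lets the type checker reject any value outside the three valid states.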
--- a/e2e-tests/context_compaction.spec.ts
+++ b/e2e-tests/context_compaction.spec.ts
@@ -35,3 +35,33 @@ testSkipIfWindows(
    await po.snapshotMessages({ replaceDumpPath: true });
  },
 );
+testSkipIfWindows(
+  "local-agent - context compaction can run mid-turn",
+  async ({ po }) => {
+    await po.setUpDyadPro({ localAgent: true });
+    await po.importApp("minimal");
+    await po.chatActions.selectLocalAgentMode();
+    await po.sendPrompt("hi");
+    // This fixture runs several tool-call steps, one of which reports high
+    // token usage, then returns a final text response in the same user turn.
+    await po.sendPrompt("tc=local-agent/compaction-mid-turn");
+    // Mid-turn compaction summary should be visible after a single prompt.
+    await expect(po.page.getByText("Conversation compacted")).toBeVisible({
+      timeout: Timeout.MEDIUM,
+    });
+    // The agent should still complete the response in the same turn.
+    await expect(po.page.getByText("END OF COMPACTED TURN.")).toBeVisible({
+      timeout: Timeout.MEDIUM,
+    });
+    await po.sendPrompt("[dump] hi");
+    await po.snapshotServerDump("all-messages");
+    // Snapshot the messages to capture the compaction summary + second response
+    await po.snapshotMessages({ replaceDumpPath: true });
+  },
+);
--- a/e2e-tests/fixtures/engine/local-agent/compaction-mid-turn.ts
+++ b/e2e-tests/fixtures/engine/local-agent/compaction-mid-turn.ts
+import type { LocalAgentFixture } from "../../../../testing/fake-llm-server/localAgentTypes";
+/**
+ * Fixture that triggers compaction during the same user turn:
+ * 1) Steps 1 and 2 make ordinary tool calls
+ * 2) Step 3 makes a tool call and reports high token usage (200k)
+ * 3) Step 4 makes one more tool call after compaction
+ * 4) Step 5 returns the final text
+ *
+ * Local agent should compact between step 3 and step 4.
+ */
+export const fixture: LocalAgentFixture = {
+  description: "Trigger compaction between tool-loop steps in one turn",
+  turns: [
+    {
+      text: "first step",
+      toolCalls: [
+        {
+          name: "read_file",
+          args: {
+            path: "README.md",
+          },
+        },
+      ],
+    },
+    {
+      text: "second step",
+      toolCalls: [
+        {
+          name: "read_file",
+          args: {
+            path: "AI_RULES.md",
+          },
+        },
+      ],
+    },
+    {
+      text: "This tool call will trigger compaction.",
+      toolCalls: [
+        {
+          name: "read_file",
+          args: {
+            path: "src/App.tsx",
+          },
+        },
+      ],
+      usage: {
+        prompt_tokens: 199_900,
+        completion_tokens: 100,
+        total_tokens: 200_000,
+      },
+    },
+    {
+      text: "post-compaction step",
+      toolCalls: [
+        {
+          name: "read_file",
+          args: {
+            path: "SOMEFILE.md",
+          },
+        },
+      ],
+    },
+    {
+      text: "END OF COMPACTED TURN.",
+    },
+  ],
+};
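
To make it easier to see which entry in the fixture above is expected to trigger compaction, the following sketch scans a list of fixture turns for the first one that both reports high usage and includes a tool call. The `FixtureTurn` type here is a simplified stand-in for `LocalAgentFixture` turns, and `findCompactionTriggerIndex` and the threshold argument are hypothetical; the real threshold is not shown in this diff.

```typescript
// Illustrative only: simplified stand-in types, not the fake-llm-server's own.
interface FixtureTurn {
  text?: string;
  toolCalls?: { name: string; args: Record<string, unknown> }[];
  usage?: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}

// Returns the 0-indexed turn that should trigger mid-turn compaction, or -1.
function findCompactionTriggerIndex(
  turns: FixtureTurn[],
  tokenThreshold: number,
): number {
  return turns.findIndex(
    (turn) =>
      (turn.usage?.total_tokens ?? 0) >= tokenThreshold &&
      (turn.toolCalls?.length ?? 0) > 0,
  );
}
```

Applied to the five entries in this fixture with any threshold at or below 200,000, the function points at the third entry (index 2), which matches where the snapshot shows the "Conversation compacted" indicator.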
--- a/e2e-tests/snapshots/context_compaction.spec.ts_local-agent---context-compaction-can-run-mid-turn-1.aria.yml
+++ b/e2e-tests/snapshots/context_compaction.spec.ts_local-agent---context-compaction-can-run-mid-turn-1.aria.yml
+- paragraph: /Generate an AI_RULES\.md file for this app\. Describe the tech stack in 5-\d+ bullet points and describe clear rules about what libraries to use for what\./
+- button "file1.txt file1.txt Edit":
+  - img
+  - text: ""
+  - button "Edit":
+    - img
+    - text: ""
+  - img
+- paragraph: More EOM
+- button "Copy":
+  - img
+- img
+- text: Approved
+- img
+- text: claude-opus-4-5
+- img
+- text: less than a minute ago
+- button "Copy Request ID":
+  - img
+  - text: ""
+- paragraph: hi
+- button "file1.txt file1.txt Edit":
+  - img
+  - text: ""
+  - button "Edit":
+    - img
+    - text: ""
+  - img
+- paragraph: More EOM
+- button "Copy":
+  - img
+- img
+- text: Approved
+- img
+- text: claude-opus-4-5
+- img
+- text: less than a minute ago
+- button "Copy Request ID":
+  - img
+  - text: ""
+- paragraph: tc=local-agent/compaction-mid-turn
+- paragraph: first step
+- img
+- text: Read README.md
+- 'button "Error Tool ''read_file'' failed: File does not exist: README.md Copy Fix with AI"':
+  - img
+  - text: ""
+  - img
+  - button "Copy":
+    - img
+    - text: ""
+  - button "Fix with AI":
+    - img
+    - text: ""
+- paragraph: second step
+- img
+- text: Read AI_RULES.md
+- 'button "Error Tool ''read_file'' failed: File does not exist: AI_RULES.md Copy Fix with AI"':
+  - img
+  - text: ""
+  - img
+  - button "Copy":
+    - img
+    - text: ""
+  - button "Fix with AI":
+    - img
+    - text: ""
+- paragraph: This tool call will trigger compaction.
+- img
+- text: Read src/App.tsx
+- button "Conversation compacted":
+  - img
+  - text: ""
+  - img
+- heading "Key Decisions Made" [level=2]
+- list:
+  - listitem: Completed initial task as requested
+- heading "Current Task State" [level=2]
+- paragraph: Conversation was compacted to save context space.
+- paragraph: "If you need to retrieve earlier parts of the conversation history, you can read the backup file at: [[compaction-backup-path]] Note: This file may be large. Read only the sections you need or use grep to search for specific content rather than reading the entire file. post-compaction step"
+- img
+- text: Read SOMEFILE.md
+- 'button "Error Tool ''read_file'' failed: File does not exist: SOMEFILE.md Copy Fix with AI"':
+  - img
+  - text: ""
+  - img
+  - button "Copy":
+    - img
+    - text: ""
+  - button "Fix with AI":
+    - img
+    - text: ""
+- button "Fix All Errors (3)":
+  - img
+  - text: ""
+- paragraph: END OF COMPACTED TURN.
+- button "Copy":
+  - img
+- img
+- text: Approved
+- img
+- text: claude-opus-4-5
+- img
+- text: less than a minute ago
+- img
+- text: (1 files changed)
+- button "Copy Request ID":
+  - img
+  - text: ""
+- paragraph: "[dump] hi"
+- paragraph: "[[dyad-dump-path=*]]"
+- button "Copy":
+  - img
+- img
+- text: claude-opus-4-5
+- img
+- text: less than a minute ago
+- button "Copy Request ID":
+  - img
+  - text: ""
+- button "Undo":
+  - img
+  - text: ""
+- button "Retry":
+  - img
+  - text: ""
\ No newline at end of file
--- a/e2e-tests/snapshots/context_compaction.spec.ts_local-agent---context-compaction-can-run-mid-turn-1.txt
+++ b/e2e-tests/snapshots/context_compaction.spec.ts_local-agent---context-compaction-can-run-mid-turn-1.txt
+===
+role: system
+message: 
+<role>
+You are Dyad, an AI assistant that creates and modifies web applications. You assist users by chatting with them and making changes to their code in real-time. You understand that users can see a live preview of their application in an iframe on the right side of the screen while you make code changes.
+You make efficient and effective changes to codebases while following best practices for maintainability and readability. You take pride in keeping things simple and elegant. You are friendly and helpful, always aiming to provide clear explanations. 
+</role>
+<app_commands>
+Do *not* tell the user to run shell commands. Instead, they can do one of the following commands in the UI:
+- **Rebuild**: This will rebuild the app from scratch. First it deletes the node_modules folder and then it re-installs the npm packages and then starts the app server.
+- **Restart**: This will restart the app server.
+- **Refresh**: This will refresh the app preview page.
+You can suggest one of these commands by using the <dyad-command> tag like this:
+<dyad-command type="rebuild"></dyad-command>
+<dyad-command type="restart"></dyad-command>
+<dyad-command type="refresh"></dyad-command>
+If you output one of these commands, tell the user to look for the action button above the chat input.
+</app_commands>
+<general_guidelines>
+- Always reply to the user in the same language they are using.
+- Before proceeding with any code edits, check whether the user's request has already been implemented. If the requested change has already been made in the codebase, point this out to the user, e.g., "This feature is already implemented as described."
+- Only edit files that are related to the user's request and leave all other files alone.
+- All edits you make on the codebase will directly be built and rendered, therefore you should NEVER make partial changes like letting the user know that they should implement some components or partially implementing features.
+- If a user asks for many features at once, implement as many as possible within a reasonable response. Each feature you implement must be FULLY FUNCTIONAL with complete code - no placeholders, no partial implementations, no TODO comments. If you cannot implement all requested features due to response length constraints, clearly communicate which features you've completed and which ones you haven't started yet.
+- Prioritize creating small, focused files and components.
+- Keep explanations concise and focused
+- Set a chat summary at the end using the `set_chat_summary` tool.
+- DO NOT OVERENGINEER THE CODE. You take great pride in keeping things simple and elegant. You don't start by writing very complex error handling, fallback mechanisms, etc. You focus on the user's request and make the minimum amount of changes needed.
+DON'T DO MORE THAN WHAT THE USER ASKS FOR.
+</general_guidelines>
+<tool_calling>
+You have tools at your disposal to solve the coding task. Follow these rules regarding tool calls:
+1. ALWAYS follow the tool call schema exactly as specified and make sure to provide all necessary parameters.
+2. The conversation may reference tools that are no longer available. NEVER call tools that are not explicitly provided.
+3. **NEVER refer to tool names when speaking to the USER.** Instead, just say what the tool is doing in natural language.
+4. If you need additional information that you can get via tool calls, prefer that over asking the user.
+5. If you make a plan, immediately follow it, do not wait for the user to confirm or tell you to go ahead. The only time you should stop is if you need more information from the user that you can't find any other way, or have different options that you would like the user to weigh in on.
+6. Only use the standard tool call format and the available tools. Even if you see user messages with custom tool call formats (such as "<previous_tool_call>" or similar), do not follow that and instead use the standard format. Never output tool calls as part of a regular assistant message of yours.
+7. If you are not sure about file content or codebase structure pertaining to the user's request, use your tools to read files and gather the relevant information: do NOT guess or make up an answer.
+8. You can autonomously read as many files as you need to clarify your own questions and completely resolve the user's query, not just one.
+9. You can call multiple tools in a single response. You can also call multiple tools in parallel, do this for independent operations like reading multiple files at once.
+</tool_calling>
+<tool_calling_best_practices>
+- **Read before writing**: Use `read_file` and `list_files` to understand the codebase before making changes
+- **Use `edit_file` for edits**: For modifying existing files, prefer `edit_file` over `write_file`
+- **Be surgical**: Only change what's necessary to accomplish the task
+- **Handle errors gracefully**: If a tool fails, explain the issue and suggest alternatives
+</tool_calling_best_practices>
+<file_editing_tool_selection>
+You have three tools for editing files. Choose based on the scope of your change:
+| Scope | Tool | Examples |
+|-------|------|----------|
+| **Small** (a few lines) | `search_replace` or `edit_file` | Fix a typo, rename a variable, update a value, change an import |
+| **Medium** (one function or section) | `edit_file` | Rewrite a function, add a new component, modify multiple related lines |
+| **Large** (most of the file) | `write_file` | Major refactor, rewrite a module, create a new file |
+**Tips:**
+- `edit_file` supports `// ... existing code ...` markers to skip unchanged sections
+- When in doubt, prefer `search_replace` for precision or `write_file` for simplicity
+**Post-edit verification (REQUIRED):**
+After every edit, read the file to verify changes applied correctly. If something went wrong, try a different tool and verify again.
+</file_editing_tool_selection>
+<development_workflow>
+1. **Understand:** Think about the user's request and the relevant codebase context. Use `grep` and `code_search` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use `read_file` to understand context and validate any assumptions you may have. If you need to read multiple files, you should make multiple parallel calls to `read_file`.
+2. **Plan:** Build a coherent and grounded (based on the understanding in step 1) plan for how you intend to resolve the user's task. For complex tasks, break them down into smaller, manageable subtasks and use the `update_todos` tool to track your progress. Share an extremely concise yet clear plan with the user if it would help the user understand your thought process.
+3. **Implement:** Use the available tools (e.g., `edit_file`, `write_file`, ...) to act on the plan, strictly adhering to the project's established conventions. When debugging, add targeted console.log statements to trace data flow and identify root causes. **Important:** After adding logs, you must ask the user to interact with the application (e.g., click a button, submit a form, navigate to a page) to trigger the code paths where logs were added—the logs will only be available once that code actually executes.
+4. **Verify:** After making code changes, use `run_type_checks` to verify that the changes are correct and read the file contents to ensure the changes are what you intended.
+5. **Finalize:** After all verification passes, consider the task complete and briefly summarize the changes you made.
+</development_workflow>
+# Tech Stack
+- You are building a React application.
+- Use TypeScript.
+- Use React Router. KEEP the routes in src/App.tsx
+- Always put source code in the src folder.
+- Put pages into src/pages/
+- Put components into src/components/
+- The main page (default page) is src/pages/Index.tsx
+- UPDATE the main page to include the new components. OTHERWISE, the user can NOT see any components!
+- ALWAYS try to use the shadcn/ui library.
+- Tailwind CSS: always use Tailwind CSS for styling components. Utilize Tailwind classes extensively for layout, spacing, colors, and other design aspects.
+Available packages and libraries:
+- The lucide-react package is installed for icons.
+- You ALREADY have ALL the shadcn/ui components and their dependencies installed. So you don't need to install them again.
+- You have ALL the necessary Radix UI components installed.
+- Use prebuilt components from the shadcn/ui library after importing them. Note that these files shouldn't be edited, so make new components if you need to change them.
+===
+role: user
+message: tc=local-agent/compaction-mid-turn
+===
+role: assistant
+message: <dyad-compaction title="Conversation compacted" state="finished">
+## Key Decisions Made
+- Completed initial task as requested
+## Current Task State
+Conversation was compacted to save context space.
+</dyad-compaction>
+If you need to retrieve earlier parts of the conversation history, you can read the backup file at: [[compaction-backup-path]]
+Note: This file may be large. Read only the sections you need or use grep to search for specific content rather than reading the entire file.
+===
+role: assistant
+message: post-compaction step
+===
+role: tool
+message: File does not exist: SOMEFILE.md
+===
+role: assistant
+message: END OF COMPACTED TURN.
+===
+role: user
+message: [dump] hi
\ No newline at end of file
--- a/rules/git-workflow.md
+++ b/rules/git-workflow.md
@@ -29,6 +29,25 @@ When running GitHub Actions with `pull_request_target` on cross-repo PRs (from f
 Actions performed using the default `GITHUB_TOKEN` (including labels added by `github-actions[bot]` via `actions/github-script`) do **not** trigger `pull_request_target` or other workflow events. This is a GitHub limitation to prevent infinite loops. If one workflow adds a label that should trigger another workflow (e.g., `label-rebase-prs.yml` adds `cc:rebase` to trigger `claude-rebase.yml`), the label-adding step must use a **PAT** or **GitHub App token** (e.g., `PR_RW_GITHUB_TOKEN`) instead of `GITHUB_TOKEN`.
+## GitHub API calls with special characters
+When using `gh api` to post comments or replies containing backticks, `$()`, or other shell metacharacters, the security hook will block the command. Instead of passing the body inline with `-f body="..."`, write a JSON file and use `--input`:
+```bash
+# Write JSON body to a file (use the Write tool, not echo/cat)
+# File: .claude/tmp/reply_body.json
+# {"body": "Your comment with `backticks` and special chars"}
+gh api repos/dyad-sh/dyad/pulls/123/comments/456/replies --input .claude/tmp/reply_body.json
+```
+Similarly for GraphQL mutations, write the full query + variables as JSON and use `--input`:
+```bash
+# {"query": "mutation($threadId: ID!) { ... }", "variables": {"threadId": "PRRT_abc123"}}
+gh api graphql --input .claude/tmp/resolve_thread.json
+```
 ## Adding labels to PRs
 `gh pr edit --add-label` fails with a GraphQL "Projects (classic)" deprecation error on repos that had classic projects. Use the REST API instead:
@@ -41,13 +60,28 @@ gh api repos/dyad-sh/dyad/issues/{PR_NUMBER}/labels -f "labels[]=label-name"
 In CI, `claude-code-action` restricts file access to the repo working directory (e.g., `/home/runner/work/dyad/dyad`). Skills that save intermediate files (like PR diffs) must use `./filename` (current working directory), **never** `/tmp/`. Using `/tmp/` causes errors like: `cat in '/tmp/pr_*_diff.patch' was blocked. For security, Claude Code may only concatenate files from the allowed working directories`.
-## Rebase conflict resolution tips
+## Rebase workflow and conflict resolution
+### Handling unstaged changes during rebase
+If `git rebase` fails with "You have unstaged changes" (common with spurious `package-lock.json` changes):
+```bash
+git stash push -m "Stashing changes before rebase"
+git rebase upstream/main
+git stash pop
+```
+The stashed changes will be automatically merged back after the rebase completes.
+### Conflict resolution tips
 - **Before rebasing:** If `npm install` modified `package-lock.json` (common in CI/local), discard changes with `git restore package-lock.json` to avoid "unstaged changes" errors
 - When resolving import conflicts (e.g., `<<<<<<< HEAD` with different imports), keep **both** imports if both are valid and needed by the component
 - When resolving conflicts in i18n-related commits, watch for duplicate constant definitions that conflict with imports from `@/lib/schemas` (e.g., `DEFAULT_ZOOM_LEVEL`)
 - If both sides of a conflict have valid imports/hooks, keep both and remove any duplicate constant redefinitions
 - When rebasing documentation/table conflicts (e.g., workflow README tables), prefer keeping **both** additions from HEAD and upstream - merge new rows/content from both branches rather than choosing one side
+- **Complementary additions**: When both sides added new sections at the end of a file (e.g., both added different documentation tips), keep both sections rather than choosing one — they're not truly conflicting, just different additions
 ## Rebasing with uncommitted changes
@@ -59,3 +93,11 @@ If you need to rebase but have uncommitted changes (e.g., package-lock.json from
 4. Discard spurious changes like package-lock.json (if package.json unchanged): `git restore package-lock.json`
 This prevents rebase conflicts from uncommitted changes while preserving any work in progress.
+## Resolving documentation rebase conflicts
+When rebasing a PR branch that conflicts with upstream documentation changes (e.g., AGENTS.md):
+- If upstream has reorganized content (e.g., moved sections to separate `rules/*.md` files), keep upstream's version
+- Discard the PR's inline content that conflicts with the new organization
+- The PR's documentation changes may need to be re-applied to the new file locations after the rebase
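
For the `gh api --input` pattern described in the "GitHub API calls with special characters" section of this file, the JSON body can be produced with `JSON.stringify` so that backticks, `$()`, quotes, and newlines never pass through the shell. This is a sketch under that assumption: `buildReplyBody` and `buildGraphqlBody` are hypothetical helper names, and writing the resulting string to a file such as `.claude/tmp/reply_body.json` is left to whatever file-writing tool is available.

```typescript
// Illustrative only: hypothetical helpers for preparing `gh api --input` payloads.

// JSON body for a REST comment or reply; the comment text is encoded, not interpolated.
function buildReplyBody(comment: string): string {
  return JSON.stringify({ body: comment });
}

// JSON body for a GraphQL call: the full query plus its variables.
function buildGraphqlBody(
  query: string,
  variables: Record<string, unknown>,
): string {
  return JSON.stringify({ query, variables });
}
```

The string returned by either helper is the complete file content to pass via `--input`, so no part of the comment is ever placed on the command line.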
--- a/src/__tests__/local_agent_handler.test.ts
+++ b/src/__tests__/local_agent_handler.test.ts
 import { describe, it, expect, vi, beforeEach } from "vitest";
 import type { IpcMainInvokeEvent, WebContents } from "electron";
+import { streamText } from "ai";
 // ============================================================================
 // Test Fakes & Builders
@@ -49,6 +50,7 @@ function buildTestChat(
      role: "user" | "assistant";
      content: string;
      aiMessagesJson?: unknown;
+      isCompactionSummary?: boolean | null;
      createdAt?: Date;
    }>;
    supabaseProjectId?: string | null;
@@ -125,17 +127,31 @@ function createFakeStream(
    delta?: string;
    [key: string]: unknown;
  }>,
-) {
+): FakeStreamResult {
  return {
    fullStream: (async function* () {
      for (const part of parts) {
        yield part;
      }
    })(),
-    response: Promise.resolve({ messages: [] }),
+    response: Promise.resolve({ messages: [] as any[] }),
+    steps: Promise.resolve([] as any[]),
  };
 }
+type FakeStreamResult = {
+  fullStream: AsyncGenerator<
+    {
+      type: string;
+      [key: string]: unknown;
+    },
+    void,
+    unknown
+  >;
+  response: Promise<{ messages: any[] }>;
+  steps?: Promise<any[]>;
+};
 // ============================================================================
 // Mocks
 // ============================================================================
@@ -145,6 +161,7 @@ vi.mock("electron-log", () => ({
  default: {
    scope: () => ({
      log: vi.fn(),
+      info: vi.fn(),
      error: vi.fn(),
      warn: vi.fn(),
      debug: vi.fn(),
@@ -207,10 +224,15 @@ vi.mock("@/ipc/utils/safe_sender", () => ({
  }),
 }));
-let mockStreamResult: ReturnType<typeof createFakeStream> | null = null;
+let mockStreamResult: FakeStreamResult | null = null;
+let mockStreamTextImpl:
+  | ((options: Record<string, any>) => FakeStreamResult)
+  | null = null;
 vi.mock("ai", () => ({
-  streamText: vi.fn(() => mockStreamResult),
+  streamText: vi.fn((options: Record<string, any>) =>
+    mockStreamTextImpl ? mockStreamTextImpl(options) : mockStreamResult,
+  ),
  stepCountIs: vi.fn((n: number) => ({ steps: n })),
  hasToolCall: vi.fn((toolName: string) => ({ toolName })),
 }));
@@ -292,9 +314,11 @@ describe("handleLocalAgentStream", () => {
    mockChatData = null;
    mockSettings = buildTestSettings();
    mockStreamResult = null;
+    mockStreamTextImpl = null;
    mockIsChatPendingCompaction.mockResolvedValue(false);
    mockPerformCompaction.mockResolvedValue({ success: true });
    mockCheckAndMarkForCompaction.mockResolvedValue(false);
+    vi.mocked(streamText).mockClear();
  });
  describe("Pro status validation", () => {
@@ -423,6 +447,336 @@ describe("handleLocalAgentStream", () => {
    });
  });
+  describe("Mid-turn compaction", () => {
+    it("should compact between steps when token usage crosses threshold", async () => {
+      // Arrange
+      const { event, getMessagesByChannel } = createFakeEvent();
+      mockSettings = buildTestSettings({ enableDyadPro: true });
+      mockChatData = buildTestChat({
+        messages: [
+          { id: 1, role: "user", content: "old context user" },
+          { id: 2, role: "assistant", content: "old context assistant" },
+          { id: 3, role: "user", content: "current task" },
+          { id: 10, role: "assistant", content: "" }, // placeholder
+        ],
+      });
+      mockIsChatPendingCompaction
+        .mockResolvedValueOnce(false) // pre-turn check
+        .mockResolvedValueOnce(true) // mid-turn check
+        .mockResolvedValue(false);
+      mockCheckAndMarkForCompaction.mockResolvedValue(true);
+      mockPerformCompaction.mockImplementation(async () => {
+        if (!mockChatData) {
+          return { success: false, error: "missing chat" };
+        }
+        mockChatData = {
+          ...mockChatData,
+          messages: [
+            ...mockChatData.messages,
+            {
+              id: 20,
+              role: "assistant",
+              content:
+                '<dyad-compaction title="Conversation compacted" state="finished">mid-turn summary</dyad-compaction>',
+              isCompactionSummary: true,
+            },
+          ],
+        } as any;
+        return {
+          success: true,
+          summary: "mid-turn summary",
+          backupPath: ".dyad/chats/1/compaction-test.md",
+        };
+      });
+      let secondStepPreparedMessages: any[] | undefined;
+      mockStreamTextImpl = (options) => {
+        const firstStepMessages = [
+          { role: "user", content: "old context user" },
+          { role: "assistant", content: "old context assistant" },
+          { role: "user", content: "current task" },
+        ];
+        return {
+          fullStream: (async function* () {
+            await options.prepareStep?.({
+              messages: firstStepMessages,
+              stepNumber: 0,
+              steps: [],
+              model: {},
+              experimental_context: undefined,
+            });
+            yield { type: "text-delta", text: "before-compaction\n" };
+            await options.onStepFinish?.({
+              usage: { totalTokens: 200_000 },
+              toolCalls: [{}],
+            });
+            const secondStepMessages = [
+              ...firstStepMessages,
+              { role: "assistant", content: "tool state assistant" },
+              { role: "assistant", content: "tool state result" },
+            ];
+            const preparedSecondStep = (await options.prepareStep?.({
+              messages: secondStepMessages,
+              stepNumber: 1,
+              steps: [],
+              model: {},
+              experimental_context: undefined,
+            })) ?? { messages: secondStepMessages };
+            secondStepPreparedMessages = preparedSecondStep.messages;
+            yield { type: "text-delta", text: "done" };
+          })(),
+          response: Promise.resolve({ messages: [] }),
+          steps: Promise.resolve([]),
+        };
+      };
+      // Act
+      await handleLocalAgentStream(
+        event,
+        { chatId: 1, prompt: "test" },
+        new AbortController(),
+        {
+          placeholderMessageId: 10,
+          systemPrompt: "You are helpful",
+          dyadRequestId,
+        },
+      );
+      // Assert
+      expect(mockCheckAndMarkForCompaction).toHaveBeenCalledWith(1, 200_000);
+      expect(mockPerformCompaction).toHaveBeenCalledTimes(1);
+      expect(mockPerformCompaction).toHaveBeenCalledWith(
+        expect.anything(),
+        1,
+        "/mock/apps/test-app-path",
+        dyadRequestId,
+        expect.any(Function),
+        { createdAtStrategy: "now" },
+      );
+      expect(secondStepPreparedMessages).toBeDefined();
+      const secondStepContents = (secondStepPreparedMessages ?? []).map(
+        (msg: any) =>
+          typeof msg.content === "string"
+            ? msg.content
+            : JSON.stringify(msg.content),
+      );
+      expect(
+        secondStepContents.some((content: string) =>
+          content.includes("Conversation compacted"),
+        ),
+      ).toBe(true);
+      expect(secondStepContents).not.toContain("old context user");
+      expect(secondStepContents).not.toContain("old context assistant");
+      expect(secondStepContents).toContain("tool state assistant");
+      expect(secondStepContents).toContain("tool state result");
+      const contentUpdates = dbOperations.updates.filter(
+        (u) => u.data.content !== undefined,
+      );
+      const finalContent = contentUpdates[contentUpdates.length - 1].data
+        .content as string;
+      const beforeCompactionIndex = finalContent.indexOf("before-compaction");
+      const compactionIndex = finalContent.indexOf("Conversation compacted");
+      const doneIndex = finalContent.indexOf("done");
+      const backupPathIndex = finalContent.indexOf(
+        ".dyad/chats/1/compaction-test.md",
+      );
+      expect(beforeCompactionIndex).toBeGreaterThanOrEqual(0);
+      expect(compactionIndex).toBeGreaterThan(beforeCompactionIndex);
+      expect(backupPathIndex).toBeGreaterThan(compactionIndex);
+      expect(doneIndex).toBeGreaterThan(compactionIndex);
+      const chunkMessages = getMessagesByChannel("chat:response:chunk");
+      const streamedMessageIds = chunkMessages.flatMap((message) => {
+        const payload = message.args[0] as { messages?: Array<{ id: number }> };
+        return (payload.messages ?? []).map((msg) => msg.id);
+      });
+      expect(streamedMessageIds).not.toContain(20);
+    });
+    it("should persist post-compaction response messages without reshaping", async () => {
+      // Arrange
+      const { event } = createFakeEvent();
+      mockSettings = buildTestSettings({ enableDyadPro: true });
+      mockChatData = buildTestChat({
+        messages: [
+          { id: 1, role: "user", content: "old context user" },
+          { id: 2, role: "assistant", content: "old context assistant" },
+          { id: 3, role: "user", content: "current task" },
+          { id: 10, role: "assistant", content: "" }, // placeholder
+        ],
+      });
+      mockIsChatPendingCompaction
+        .mockResolvedValueOnce(false) // pre-turn check
+        .mockResolvedValueOnce(true) // mid-turn check
+        .mockResolvedValue(false);
+      mockCheckAndMarkForCompaction.mockResolvedValue(true);
+      mockPerformCompaction.mockImplementation(async () => {
+        if (!mockChatData) {
+          return { success: false, error: "missing chat" };
+        }
+        mockChatData = {
+          ...mockChatData,
+          messages: [
+            ...mockChatData.messages,
+            {
+              id: 20,
+              role: "assistant",
+              content:
+                '<dyad-compaction title="Conversation compacted" state="finished">mid-turn summary</dyad-compaction>',
+              isCompactionSummary: true,
+            },
+          ],
+        } as any;
+        return {
+          success: true,
+          summary: "mid-turn summary",
+          backupPath: ".dyad/chats/1/compaction-test.md",
+        };
+      });
+      const preCompactionGenerated = [
+        {
+          role: "assistant",
+          content: [
+            {
+              type: "text",
+              text: "before compaction",
+            },
+          ],
+        },
+        {
+          role: "tool",
+          content: [
+            {
+              type: "tool-result",
+              toolName: "read_file",
+              toolCallId: "call_before",
+              output: "before result",
+            },
+          ],
+        },
+      ];
+      const postCompactionGenerated = [
+        {
+          role: "assistant",
+          content: [
+            {
+              type: "text",
+              text: "post compaction assistant",
+            },
+            {
+              type: "tool-call",
+              toolCallId: "call_after",
+              toolName: "read_file",
+              input: { path: "SOMEFILE.md" },
+            },
+          ],
+        },
+        {
+          role: "tool",
+          content: [
+            {
+              type: "tool-result",
+              toolName: "read_file",
+              toolCallId: "call_after",
+              output: "post result",
+            },
+          ],
+        },
+      ];
+      mockStreamTextImpl = (options) => {
+        const firstStepMessages = [
+          { role: "user", content: "old context user" },
+          { role: "assistant", content: "old context assistant" },
+          { role: "user", content: "current task" },
+        ];
+        return {
+          fullStream: (async function* () {
+            await options.prepareStep?.({
+              messages: firstStepMessages,
+              stepNumber: 0,
+              steps: [],
+              model: {},
+              experimental_context: undefined,
+            });
+            await options.onStepFinish?.({
+              usage: { totalTokens: 200_000 },
+              toolCalls: [{}],
+            });
+            const secondStepMessages = [
+              ...firstStepMessages,
+              ...preCompactionGenerated,
+            ];
+            await options.prepareStep?.({
+              messages: secondStepMessages,
+              stepNumber: 1,
+              steps: [],
+              model: {},
+              experimental_context: undefined,
+            });
+            yield { type: "text-delta", text: "done" };
+          })(),
+          response: Promise.resolve({
+            messages: [...preCompactionGenerated, ...postCompactionGenerated],
+          }),
+          steps: Promise.resolve([
+            {
+              response: {
+                messages: [...preCompactionGenerated],
+              },
+            },
+            {
+              response: {
+                messages: [
+                  ...preCompactionGenerated,
+                  ...postCompactionGenerated,
+                ],
+              },
+            },
+          ]),
+        };
+      };
+      // Act
+      await handleLocalAgentStream(
+        event,
+        { chatId: 1, prompt: "test" },
+        new AbortController(),
+        {
+          placeholderMessageId: 10,
+          systemPrompt: "You are helpful",
+          dyadRequestId,
+        },
+      );
+      // Assert
+      const aiMessagesUpdates = dbOperations.updates.filter(
+        (u) => u.data.aiMessagesJson !== undefined,
+      );
+      expect(aiMessagesUpdates).toHaveLength(1);
+      expect(
+        (aiMessagesUpdates[0].data.aiMessagesJson as { messages: unknown[] })
+          .messages,
+      ).toEqual(postCompactionGenerated);
+    });
+  });
  describe("Stream processing - text content", () => {
    it("should accumulate text-delta parts and update database", async () => {
      // Arrange
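
Review note: the test above (`should persist post-compaction response messages without reshaping`) expects only the messages generated after the compaction boundary to reach `aiMessagesJson`. A minimal sketch of that slicing follows, assuming each step's `response.messages` is cumulative, as it is in the fixture. The `Msg` and `Step` types and the `slicePostCompactionMessages` name are hypothetical stand-ins for illustration, not the handler's actual code.

```typescript
// Hypothetical, simplified shapes; the real handler uses AI SDK message types.
type Msg = { role: string; content: unknown };
type Step = { response: { messages: Msg[] } };

// Returns the response messages produced at or after `postCompactionStartStep`
// (a 0-indexed step number), or all messages if no mid-turn compaction ran.
function slicePostCompactionMessages(
  responseMessages: Msg[],
  steps: Step[],
  postCompactionStartStep: number | null,
): Msg[] {
  if (postCompactionStartStep === null) {
    return responseMessages;
  }
  // The step just before the boundary holds every pre-compaction message,
  // so its length is the number of leading messages to drop.
  const prevStepMessages = steps[postCompactionStartStep - 1]?.response.messages;
  // If that step is missing, fall back to keeping everything.
  return responseMessages.slice(prevStepMessages?.length ?? 0);
}
```

With the fixture's two pre-compaction and two post-compaction messages and a start step of `1`, this keeps only the last two messages, which matches the `toEqual(postCompactionGenerated)` assertion.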

--- a/src/ipc/handlers/compaction/compaction_handler.ts
+++ b/src/ipc/handlers/compaction/compaction_handler.ts
@@ -25,6 +25,7 @@ import {
 } from "./compaction_storage";
 import { getPostCompactionMessages } from "./compaction_utils";
 import { getProviderOptions, getAiHeaders } from "@/ipc/utils/provider_options";
+import { escapeXmlContent } from "../../../../shared/xmlEscape";
 const logger = log.scope("compaction_handler");
@@ -115,6 +116,9 @@ export async function performCompaction(
  appPath: string,
  dyadRequestId: string,
  onSummaryChunk?: (accumulatedText: string) => void,
+  options?: {
+    createdAtStrategy?: "before-latest-user" | "now";
+  },
 ): Promise<CompactionResult> {
  const settings = readSettings();
@@ -197,7 +201,7 @@ export async function performCompaction(
    // Create the compaction indicator message
    // Include relative backup path so the AI can read the full original conversation later
    const compactionMessageContent = `<dyad-compaction title="Conversation compacted" state="finished">
-${summary}
+${escapeXmlContent(summary)}
 </dyad-compaction>
 If you need to retrieve earlier parts of the conversation history, you can read the backup file at: ${backupPath}
@@ -218,8 +222,11 @@ Note: This file may be large. Read only the sections you need or use grep to sea
    const latestUserMessage = [...chatMessages]
      .reverse()
      .find((m) => m.role === "user");
-    const compactionCreatedAt = latestUserMessage
+    const compactionCreatedAt =
-      ? new Date(latestUserMessage.createdAt.getTime() - 1)
+      options?.createdAtStrategy === "now"
+        ? new Date()
+        : latestUserMessage
+          ? new Date(latestUserMessage.createdAt.getTime() - 1000)
          : new Date();
    await db.insert(messages).values({
      chatId,
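
Review note: the hunk above derives the summary row's `createdAt` from the new `createdAtStrategy` option. The selection can be sketched in isolation as below. `pickCompactionCreatedAt` and its injectable `now` parameter are hypothetical and exist only to make the example testable.

```typescript
type CreatedAtStrategy = "before-latest-user" | "now";

// Chooses the timestamp for the compaction summary row.
// - "now": used for mid-turn compaction, so the row sorts after the user message.
// - otherwise: one second before the latest user message, if there is one.
function pickCompactionCreatedAt(
  latestUserCreatedAt: Date | undefined,
  strategy: CreatedAtStrategy | undefined,
  now: Date = new Date(),
): Date {
  if (strategy === "now") {
    return now;
  }
  return latestUserCreatedAt
    ? new Date(latestUserCreatedAt.getTime() - 1000)
    : now;
}
```

Because messages are loaded ordered by `createdAt`, the pre-turn strategy places the summary ahead of the latest user message, while `"now"` places it after that message.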

--- a/src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts
+++ b/src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts
@@ -56,7 +56,10 @@ import {
  type InjectedMessage,
 } from "./prepare_step_utils";
 import { TOOL_DEFINITIONS } from "./tool_definitions";
-import { parseAiMessagesJson } from "@/ipc/utils/ai_messages_utils";
+import {
+  parseAiMessagesJson,
+  type DbMessageForParsing,
+} from "@/ipc/utils/ai_messages_utils";
 import { parseMcpToolKey, sanitizeMcpName } from "@/ipc/utils/mcp_tool_utils";
 import { addIntegrationTool } from "./tools/add_integration";
 import { planningQuestionnaireTool } from "./tools/planning_questionnaire";
@@ -107,6 +110,95 @@ function findToolDefinition(toolName: string) {
  return TOOL_DEFINITIONS.find((t) => t.name === toolName);
 }
+function buildChatMessageHistory(
+  chatMessages: Array<
+    DbMessageForParsing & {
+      isCompactionSummary: boolean | null;
+      createdAt: Date;
+    }
+  >,
+  options?: { excludeMessageIds?: Set<number> },
+): ModelMessage[] {
+  const excludedIds = options?.excludeMessageIds;
+  const relevantMessages = getPostCompactionMessages(chatMessages);
+  const reorderedMessages = [...relevantMessages];
+  // For mid-turn compaction, keep the summary immediately after the triggering
+  // user message so subsequent turns reflect that compaction happened before
+  // post-compaction tool-loop steps.
+  for (const summary of [...reorderedMessages].filter(
+    (message) => message.isCompactionSummary,
+  )) {
+    const summaryIndex = reorderedMessages.findIndex(
+      (m) => m.id === summary.id,
+    );
+    if (summaryIndex < 0) {
+      continue;
+    }
+    const triggeringUser = [...reorderedMessages]
+      .filter((m) => m.role === "user" && m.id < summary.id)
+      .sort((a, b) => b.id - a.id)[0];
+    if (!triggeringUser) {
+      continue;
+    }
+    const triggeringUserIndex = reorderedMessages.findIndex(
+      (m) => m.id === triggeringUser.id,
+    );
+    if (triggeringUserIndex < 0) {
+      continue;
+    }
+    const isMidTurnSummary =
+      summary.createdAt.getTime() >= triggeringUser.createdAt.getTime();
+    if (!isMidTurnSummary || summaryIndex === triggeringUserIndex + 1) {
+      continue;
+    }
+    reorderedMessages.splice(summaryIndex, 1);
+    const targetIndex = Math.min(
+      triggeringUserIndex + 1,
+      reorderedMessages.length,
+    );
+    reorderedMessages.splice(targetIndex, 0, summary);
+  }
+  return reorderedMessages
+    .filter((msg) => !excludedIds?.has(msg.id))
+    .filter((msg) => msg.content || msg.aiMessagesJson)
+    .flatMap((msg) => parseAiMessagesJson(msg));
+}
+function getMidTurnCompactionSummaryIds(
+  chatMessages: Array<{
+    id: number;
+    role: string;
+    createdAt: Date;
+    isCompactionSummary: boolean | null;
+  }>,
+): Set<number> {
+  const hiddenIds = new Set<number>();
+  for (const summary of chatMessages.filter((m) => m.isCompactionSummary)) {
+    const triggeringUserMessage = [...chatMessages]
+      .filter((m) => m.role === "user" && m.id < summary.id)
+      .sort((a, b) => b.id - a.id)[0];
+    if (!triggeringUserMessage) {
+      continue;
+    }
+    if (
+      summary.createdAt.getTime() >= triggeringUserMessage.createdAt.getTime()
+    ) {
+      hiddenIds.add(summary.id);
+    }
+  }
+  return hiddenIds;
+}
 /**
 * Handle a chat stream in local-agent mode
 */
@@ -143,6 +235,30 @@ export async function handleLocalAgentStream(
  },
 ): Promise<boolean> {
  const settings = readSettings();
+  let fullResponse = "";
+  let streamingPreview = ""; // Temporary preview for current tool, not persisted
+  // Mid-turn compaction inserts a DB summary row for LLM history, but we render
+  // the user-facing compaction indicator inline in the active assistant turn.
+  const hiddenMessageIdsForStreaming = new Set<number>();
+  let postMidTurnCompactionStartStep: number | null = null;
+  const appendInlineCompactionToTurn = async (
+    summary?: string,
+    backupPath?: string,
+  ) => {
+    const summaryText =
+      summary && summary.trim().length > 0
+        ? summary
+        : "Conversation compacted.";
+    const inlineCompaction = `<dyad-compaction title="Conversation compacted" state="finished">\n${escapeXmlContent(summaryText)}\n</dyad-compaction>`;
+    const backupPathNote = backupPath
+      ? `\nIf you need to retrieve earlier parts of the conversation history, you can read the backup file at: ${backupPath}\nNote: This file may be large. Read only the sections you need or use grep to search for specific content rather than reading the entire file.`
+      : "";
+    const separator =
+      fullResponse.length > 0 && !fullResponse.endsWith("\n") ? "\n" : "";
+    fullResponse = `${fullResponse}${separator}${inlineCompaction}${backupPathNote}\n`;
+    await updateResponseInDb(placeholderMessageId, fullResponse);
+  };
  // Check Pro status or Basic Agent mode
  // Basic Agent mode allows non-Pro users with quota (quota check is done in chat_stream_handlers)
@@ -156,8 +272,8 @@ export async function handleLocalAgentStream(
    return false;
  }
-  // Get the chat and app — may be re-queried after compaction
+  const loadChat = async () =>
-  let chat = await db.query.chats.findFirst({
+    db.query.chats.findFirst({
      where: eq(chats.id, req.chatId),
      with: {
        messages: {
@@ -167,35 +283,66 @@ export async function handleLocalAgentStream(
      },
    });
-  if (!chat || !chat.app) {
+  // Get the chat and app — may be re-queried after compaction
+  const initialChat = await loadChat();
+  if (!initialChat || !initialChat.app) {
    throw new Error(`Chat not found: ${req.chatId}`);
  }
+  let chat = initialChat;
+  for (const id of getMidTurnCompactionSummaryIds(chat.messages)) {
+    hiddenMessageIdsForStreaming.add(id);
+  }
  const appPath = getDyadAppPath(chat.app.path);
-  // Check if compaction is pending and enabled before processing the message
+  const maybePerformPendingCompaction = async (options?: {
+    showOnTopOfCurrentResponse?: boolean;
+    force?: boolean;
+  }) => {
    if (
-    settings.enableContextCompaction !== false &&
+      settings.enableContextCompaction === false ||
-    (await isChatPendingCompaction(req.chatId))
+      (!options?.force && !(await isChatPendingCompaction(req.chatId)))
    ) {
+      return false;
+    }
    logger.info(`Performing pending compaction for chat ${req.chatId}`);
+    const existingCompactionSummaryIds = new Set(
+      chat.messages
+        .filter((message) => message.isCompactionSummary)
+        .map((message) => message.id),
+    );
    const compactionResult = await performCompaction(
      event,
      req.chatId,
      appPath,
      dyadRequestId,
      (accumulatedSummary: string) => {
-        // Stream compaction summary to the frontend in real-time
+        // Stream compaction summary to the frontend in real-time.
-        // We temporarily set the placeholder content to show compaction progress;
+        // During mid-turn compaction, keep already streamed content visible.
-        // after compaction, the chat is re-queried and the placeholder is reset.
+        const compactionPreview = `<dyad-compaction title="Compacting conversation">\n${escapeXmlContent(accumulatedSummary)}\n</dyad-compaction>`;
+        const previewContent = options?.showOnTopOfCurrentResponse
+          ? `${fullResponse}${streamingPreview}\n${compactionPreview}`
+          : compactionPreview;
        sendResponseChunk(
          event,
          req.chatId,
          chat,
-          `<dyad-compaction title="Compacting conversation">\n${accumulatedSummary}\n</dyad-compaction>`,
+          previewContent,
          placeholderMessageId,
+          hiddenMessageIdsForStreaming,
        );
      },
+      {
+        // Mid-turn compaction should not render as a separate message above the
+        // current turn on subsequent streams, so keep its DB timestamp in turn order.
+        createdAtStrategy: options?.showOnTopOfCurrentResponse
+          ? "now"
+          : "before-latest-user",
+      },
    );
    if (!compactionResult.success) {
      logger.warn(
@@ -203,27 +350,57 @@ export async function handleLocalAgentStream(
      );
      // Continue anyway - compaction failure shouldn't block the conversation
    }
-    // Re-query to pick up the newly inserted compaction summary message
-    chat = (await db.query.chats.findFirst({
+    // Re-query to pick up the newly inserted compaction summary message.
-      where: eq(chats.id, req.chatId),
+    // Only update if compaction succeeded — a failed compaction may have left
-      with: {
+    // partial state that would corrupt subsequent message history.
-        messages: {
+    if (compactionResult.success) {
-          orderBy: (messages, { asc }) => [asc(messages.createdAt)],
+      const refreshedChat = await loadChat();
-        },
+      if (refreshedChat?.app) {
-        app: true,
+        chat = refreshedChat;
-      },
-    }))!;
      }
+      if (options?.showOnTopOfCurrentResponse) {
+        for (const message of chat.messages) {
+          if (
+            message.isCompactionSummary &&
+            !existingCompactionSummaryIds.has(message.id)
+          ) {
+            hiddenMessageIdsForStreaming.add(message.id);
+          }
+        }
+        await appendInlineCompactionToTurn(
+          compactionResult.summary,
+          compactionResult.backupPath,
+        );
+      }
+    }
+    if (options?.showOnTopOfCurrentResponse) {
+      sendResponseChunk(
+        event,
+        req.chatId,
+        chat,
+        fullResponse + streamingPreview,
+        placeholderMessageId,
+        hiddenMessageIdsForStreaming,
+      );
+    }
+    return compactionResult.success;
+  };
+  // Check if compaction is pending and enabled before processing the message
+  await maybePerformPendingCompaction();
  // Send initial message update
  safeSend(event.sender, "chat:response:chunk", {
    chatId: req.chatId,
-    messages: chat.messages,
+    messages: chat.messages.filter(
+      (message) => !hiddenMessageIdsForStreaming.has(message.id),
+    ),
  });
-  let fullResponse = "";
-  let streamingPreview = ""; // Temporary preview for current tool, not persisted
  // Track pending user messages to inject after tool results
  const pendingUserMessages: UserMessageContentPart[][] = [];
  // Store injected messages with their insertion index to re-inject at the same spot each step
@@ -260,6 +437,7 @@ export async function handleLocalAgentStream(
          chat,
          fullResponse + streamingPreview,
          placeholderMessageId,
+          hiddenMessageIdsForStreaming,
        );
      },
      onXmlComplete: (finalXml: string) => {
@@ -273,6 +451,7 @@ export async function handleLocalAgentStream(
          chat,
          fullResponse,
          placeholderMessageId,
+          hiddenMessageIdsForStreaming,
        );
      },
      requireConsent: async (params: {
@@ -311,13 +490,15 @@ export async function handleLocalAgentStream(
    // Use messageOverride if provided (e.g., for summarization)
    // If a compaction summary exists, only include messages from that point onward
    // (pre-compaction messages are preserved in DB for the user but not sent to LLM)
-    const relevantMessages = getPostCompactionMessages(chat.messages);
    const messageHistory: ModelMessage[] = messageOverride
      ? messageOverride
-      : relevantMessages
+      : buildChatMessageHistory(chat.messages);
-          .filter((msg) => msg.content || msg.aiMessagesJson)
-          .flatMap((msg) => parseAiMessagesJson(msg));
+    // Used to swap out pre-compaction history while preserving in-flight turn steps.
+    let baseMessageHistoryCount = messageHistory.length;
+    let compactBeforeNextStep = false;
+    let compactedMidTurn = false;
+    let compactionFailedMidTurn = false;
    // Stream the response
    const streamResult = streamText({
@@ -361,8 +542,95 @@ export async function handleLocalAgentStream(
      // We must re-inject all accumulated messages each step because the AI SDK
      // doesn't persist dynamically injected messages in its internal state.
      // We track the insertion index so messages appear at the same position each step.
-      prepareStep: (options) =>
+      prepareStep: async (options) => {
-        prepareStepMessages(options, pendingUserMessages, allInjectedMessages),
+        let stepOptions = options;
+        if (
+          !messageOverride &&
+          compactBeforeNextStep &&
+          !compactedMidTurn &&
+          settings.enableContextCompaction !== false
+        ) {
+          compactBeforeNextStep = false;
+          const inFlightTailMessages = options.messages.slice(
+            baseMessageHistoryCount,
+          );
+          const compacted = await maybePerformPendingCompaction({
+            showOnTopOfCurrentResponse: true,
+            force: true,
+          });
+          if (compacted) {
+            compactedMidTurn = true;
+            // Preserve only messages generated after this compaction boundary.
+            postMidTurnCompactionStartStep = options.stepNumber;
+            // Clear stale injected messages — their insertAtIndex values are
+            // based on the pre-compaction message array which has been rebuilt
+            // with a different (typically smaller) count. Keeping them would
+            // cause injectMessagesAtPositions to splice at wrong positions.
+            allInjectedMessages.length = 0;
+            const compactedMessageHistory = buildChatMessageHistory(
+              chat.messages,
+              {
+                // Keep the structured in-flight assistant/tool messages from
+                // the current stream instead of the placeholder DB content.
+                excludeMessageIds: new Set([placeholderMessageId]),
+              },
+            );
+            baseMessageHistoryCount = compactedMessageHistory.length;
+            stepOptions = {
+              ...options,
+              // Preserve in-flight turn messages so same-turn tool loops can
+              // continue, while later turns are compacted via persisted history.
+              messages: [...compactedMessageHistory, ...inFlightTailMessages],
+            };
+          } else {
+            // Prevent repeated compaction attempts if the first one fails.
+            compactionFailedMidTurn = true;
+          }
+        }
+        const preparedStep = prepareStepMessages(
+          stepOptions,
+          pendingUserMessages,
+          allInjectedMessages,
+        );
+        // prepareStepMessages returns undefined when it has no additional
+        // injections/cleanups to apply. If we already replaced the base
+        // message history (e.g., after mid-turn compaction), we still need
+        // to return the updated options.
+        if (preparedStep) {
+          return preparedStep;
+        }
+        return stepOptions === options ? undefined : stepOptions;
+      },
+      onStepFinish: async (step) => {
+        if (
+          settings.enableContextCompaction === false ||
+          compactedMidTurn ||
+          typeof step.usage.totalTokens !== "number"
+        ) {
+          return;
+        }
+        const shouldCompact = await checkAndMarkForCompaction(
+          req.chatId,
+          step.usage.totalTokens,
+        );
+        // If this step triggered tool calls, compact before the next step
+        // in this same user turn instead of waiting for the next message.
+        // Only attempt mid-turn compaction once per turn.
+        if (
+          shouldCompact &&
+          step.toolCalls.length > 0 &&
+          !compactionFailedMidTurn
+        ) {
+          compactBeforeNextStep = true;
+        }
+      },
      onFinish: async (response) => {
        const totalTokens = response.usage?.totalTokens;
        const inputTokens = response.usage?.inputTokens;
@@ -383,9 +651,6 @@ export async function handleLocalAgentStream(
            .set({ maxTokensUsed: totalTokens })
            .where(eq(messages.id, placeholderMessageId))
            .catch((err) => logger.error("Failed to save token count", err));
-          // Check if compaction should be triggered for the next message
-          await checkAndMarkForCompaction(req.chatId, totalTokens);
        }
      },
      onError: (error: any) => {
@@ -507,6 +772,7 @@ export async function handleLocalAgentStream(
          chat,
          fullResponse,
          placeholderMessageId,
+          hiddenMessageIdsForStreaming,
        );
      }
    }
@@ -520,7 +786,27 @@ export async function handleLocalAgentStream(
    // Save the AI SDK messages for multi-turn tool call preservation
    try {
      const response = await streamResult.response;
-      const aiMessagesJson = getAiMessagesJsonIfWithinLimit(response.messages);
+      const steps = await streamResult.steps;
+      const aiMessagesForPersistence =
+        compactedMidTurn && postMidTurnCompactionStartStep !== null
+          ? (() => {
+              // stepNumber is 0-indexed (from AI SDK: stepNumber = steps.length),
+              // so index (start step - 1) is the last pre-compaction step. Its
+              // response.messages length is the number of leading response
+              // messages to skip (they belong to pre-compaction context).
+              const prevStepMessages =
+                steps[postMidTurnCompactionStartStep - 1]?.response.messages;
+              if (!prevStepMessages) {
+                logger.warn(
+                  `No step data found at index ${postMidTurnCompactionStartStep - 1} for mid-turn compaction slicing; persisting all messages`,
+                );
+              }
+              return response.messages.slice(prevStepMessages?.length ?? 0);
+            })()
+          : response.messages;
+      const aiMessagesJson = getAiMessagesJsonIfWithinLimit(
+        aiMessagesForPersistence,
+      );
      if (aiMessagesJson) {
        await db
          .update(messages)
@@ -611,8 +897,11 @@ function sendResponseChunk(
  chat: any,
  fullResponse: string,
  placeholderMessageId: number,
+  hiddenMessageIds?: Set<number>,
 ) {
-  const currentMessages = [...chat.messages];
+  const currentMessages = chat.messages.filter(
+    (message) => !hiddenMessageIds?.has(message.id),
+  );
  // Find the placeholder message by ID rather than assuming it's the last
  // assistant message. After compaction, a compaction summary message may
  // exist after the placeholder and we must not overwrite it.