feat: add voice input feature with transcription support (#2344)
closes #1804
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces voice input across chat inputs with transcription via Dyad
Engine and Pro gating.
>
> - Replaces send row with `LexicalVoiceInputRow` in `ChatInput` and
`HomeChatInput`, adding mic control, waveform (`VoiceWaveform`), and
send/cancel integration
> - New `VoiceInputButton` handles Pro-only disabled state,
recording/transcribing states, and tooltips
> - New hooks `useAudioRecorder` and `useVoiceInput` to record via
`MediaRecorder`, visualize with `AnalyserNode`, and call
`ipc.misc.transcribeAudio`
> - IPC: adds `misc.transcribeAudio` contract, registers
`transcription_handlers` that validate input, support E2E mock, and call
`transcribeWithDyadEngine`
> - Dyad Engine util: adds `transcribeWithDyadEngine` with request-id
attempt tracking and multipart upload to `/audio/transcriptions`
> - E2E tests for voice flow and Pro gating; mocks `getUserMedia` and
asserts transcription append
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
7dc1944bf0149a9f88b63a3fdfe0df83e7aa4f9f. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
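The summary above notes that the transcription handler validates its input before calling the engine. As a rough illustration only, the check might look like the sketch below. The payload shape (`audio`, `mimeType`), the function name `validateTranscribePayload`, and the size limit are all assumptions made for this example and are not taken from the PR.

```typescript
// Hypothetical payload shape; the real IPC contract in the PR may differ.
interface TranscribeAudioPayload {
  audio: Uint8Array; // raw recorded bytes
  mimeType: string; // e.g. "audio/webm"
}

// Assumed upper bound for illustration; not a value from the PR.
const MAX_AUDIO_BYTES = 25 * 1024 * 1024;

// Throws on malformed input, otherwise returns a typed payload.
function validateTranscribePayload(input: unknown): TranscribeAudioPayload {
  if (typeof input !== "object" || input === null) {
    throw new Error("Payload must be an object");
  }
  const { audio, mimeType } = input as Record<string, unknown>;
  if (!(audio instanceof Uint8Array) || audio.byteLength === 0) {
    throw new Error("audio must be a non-empty Uint8Array");
  }
  if (audio.byteLength > MAX_AUDIO_BYTES) {
    throw new Error("audio exceeds the maximum allowed size");
  }
  if (typeof mimeType !== "string" || !mimeType.startsWith("audio/")) {
    throw new Error("mimeType must be an audio/* type");
  }
  return { audio, mimeType };
}
```

Rejecting an empty buffer or a non-audio MIME type in the main process keeps obviously bad requests from reaching the network call.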
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Adds voice input with waveform visualization and transcription for chat,
gated to Dyad Pro users. Improves reliability with fixes for recording
setup leaks, analyser state, and proper audio MIME typing so IPC-backed
Dyad Engine transcription consistently appends text; addresses #1804.
- **New Features**
  - Integrated VoiceInputButton and VoiceWaveform via LexicalVoiceInputRow
    in ChatInput and HomeChatInput; appends transcribed text to the input.
  - Added useAudioRecorder/useVoiceInput hooks to record via
    MediaRecorder, visualize with AnalyserNode, and invoke IPC channel
    chat:transcribe.
  - Pro gating with tooltip and disabled state for non-Pro users;
    recording can always be stopped.
  - IPC handler validates payloads and calls Dyad Engine via multipart
    upload; includes E2E mock support.
  - E2E tests mock getUserMedia and verify transcription append and
    Pro-only disabled state.
- **Migration**
  - Provide a Dyad Pro API key (settings or DYAD_PRO_API_KEY) and enable
    Dyad Pro.
  - Ensure microphone permissions are granted.
  - Optionally set DYAD_ENGINE_URL; defaults to https://engine.dyad.sh/v1.
<sup>Written for commit fa71433ae270a7276e5466c6c8df359eab1eb03d.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
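The migration notes mention an optional `DYAD_ENGINE_URL` with a default of `https://engine.dyad.sh/v1`, and the earlier summary mentions an upload to `/audio/transcriptions`. A minimal sketch of how those two pieces could combine is shown below. The helper name `resolveTranscriptionUrl`, the trailing-slash handling, and the assumption that the path is appended directly to the base URL are illustrative and not confirmed by the PR.

```typescript
// Default taken from the migration notes above.
const DEFAULT_ENGINE_URL = "https://engine.dyad.sh/v1";

// Hypothetical helper: builds the transcription endpoint from an env-like map.
// Accepting the map as a parameter (rather than reading process.env directly)
// keeps the function easy to test.
function resolveTranscriptionUrl(
  env: Record<string, string | undefined>,
): string {
  const configured = env.DYAD_ENGINE_URL?.trim();
  // Fall back to the default when the variable is unset or blank,
  // then drop trailing slashes so the joined path has exactly one.
  const base = (configured || DEFAULT_ENGINE_URL).replace(/\/+$/, "");
  return `${base}/audio/transcriptions`;
}
```

In an Electron main process this could be called as `resolveTranscriptionUrl(process.env)`, which would let a locally running engine be targeted without code changes.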
---------
Co-authored-by: Will Chen <willchen90@gmail.com>
Showing changed files:
- e2e-tests/voice_to_text.spec.ts (new file, mode 0 → 100644)
- src/hooks/useVoiceToText.test.ts (new file, mode 0 → 100644)
- src/hooks/useVoiceToText.ts (new file, mode 0 → 100644)
- src/ipc/types/audio.ts (new file, mode 0 → 100644)