• feat: add voice input feature with transcription support (#2344) · 31c1a145
    Committed by Adekunle James Adeniji
    closes #1804
    
    <!-- CURSOR_SUMMARY -->
    ---
    
    > [!NOTE]
    > Introduces voice input across chat inputs with transcription via Dyad
    Engine and Pro gating.
    > 
    > - Replaces send row with `LexicalVoiceInputRow` in `ChatInput` and
    `HomeChatInput`, adding mic control, waveform (`VoiceWaveform`), and
    send/cancel integration
    > - New `VoiceInputButton` handles Pro-only disabled state,
    recording/transcribing states, and tooltips
    > - New hooks `useAudioRecorder` and `useVoiceInput` to record via
    `MediaRecorder`, visualize with `AnalyserNode`, and call
    `ipc.misc.transcribeAudio`
    > - IPC: adds `misc.transcribeAudio` contract, registers
    `transcription_handlers` that validate input, support E2E mock, and call
    `transcribeWithDyadEngine`
    > - Dyad Engine util: adds `transcribeWithDyadEngine` with request-id
    attempt tracking and multipart upload to `/audio/transcriptions`
    > - E2E tests for voice flow and Pro gating; mocks `getUserMedia` and
    asserts transcription append
    > 
    > <sup>Written by [Cursor
    Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
    7dc1944bf0149a9f88b63a3fdfe0df83e7aa4f9f. This will update automatically
    on new commits. Configure
    [here](https://cursor.com/dashboard?tab=bugbot).</sup>
    <!-- /CURSOR_SUMMARY -->
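
The summary above says the waveform is visualized with an `AnalyserNode`. As a rough illustration of how a display level could be derived from its samples, the sketch below computes an RMS value from the byte time-domain data that `AnalyserNode.getByteTimeDomainData` fills in (unsigned bytes, where 128 represents silence). The name `computeRmsLevel` is hypothetical, and the PR's actual `VoiceWaveform` logic is not shown here and may work differently.

```typescript
// Hypothetical helper: not taken from the PR.
// Converts AnalyserNode byte time-domain samples (0..255, 128 = silence)
// into a single RMS level in the range 0..1.
function computeRmsLevel(samples: Uint8Array): number {
  if (samples.length === 0) {
    return 0; // no data yet, treat as silence
  }
  let sumOfSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    const centered = (samples[i] - 128) / 128; // map each byte to roughly -1..1
    sumOfSquares += centered * centered;
  }
  return Math.sqrt(sumOfSquares / samples.length);
}
```

A component could call a helper like this once per animation frame and map the result to a bar height. Keeping it a pure function means it can be tested without a real audio context.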
    
    <!-- This is an auto-generated description by cubic. -->
    ---
    ## Summary by cubic
    Adds voice input with waveform visualization and transcription for chat,
    gated to Dyad Pro users. Also fixes recording setup leaks, analyser state
    handling, and audio MIME typing so that transcription through the
    IPC-backed Dyad Engine consistently appends text. Addresses #1804.
    
    - **New Features**
      - Integrated VoiceInputButton and VoiceWaveform via LexicalVoiceInputRow
        in ChatInput and HomeChatInput; appends transcribed text to the input.
      - Added useAudioRecorder/useVoiceInput hooks to record via
        MediaRecorder, visualize with AnalyserNode, and invoke IPC channel
        chat:transcribe.
      - Pro gating with tooltip and disabled state for non-Pro users;
        recording can always be stopped.
      - IPC handler validates payloads and calls Dyad Engine via multipart
        upload; includes E2E mock support.
      - E2E tests mock getUserMedia and verify transcription append and
        Pro-only disabled state.
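
The feature list says the IPC handler validates payloads before calling Dyad Engine. The sketch below shows one way such a check might look. The payload shape (`audioBase64`, `mimeType`), the size limit, and the function name are all assumptions made for illustration; the PR's real contract is not shown here.

```typescript
// Hypothetical payload shape; the PR's real contract is not shown here.
interface TranscribePayload {
  audioBase64: string; // assumed: audio bytes encoded as base64
  mimeType: string; // e.g. "audio/webm"
}

type ValidationResult =
  | { ok: true; payload: TranscribePayload }
  | { ok: false; error: string };

// Assumed upper bound, chosen only for illustration.
const MAX_BASE64_LENGTH = 25 * 1024 * 1024;

function validateTranscribePayload(input: unknown): ValidationResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, error: "payload must be an object" };
  }
  const { audioBase64, mimeType } = input as Record<string, unknown>;
  if (typeof audioBase64 !== "string" || audioBase64.length === 0) {
    return { ok: false, error: "audio data is missing" };
  }
  if (audioBase64.length > MAX_BASE64_LENGTH) {
    return { ok: false, error: "audio data is too large" };
  }
  // Only accept audio/* MIME types so the upload is typed correctly.
  if (typeof mimeType !== "string" || !/^audio\//.test(mimeType)) {
    return { ok: false, error: "mimeType must be an audio/* type" };
  }
  return { ok: true, payload: { audioBase64, mimeType } };
}
```

Returning a result object instead of throwing lets a handler decide how to report the failure to the renderer.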
    
    - **Migration**
      - Provide a Dyad Pro API key (settings or DYAD_PRO_API_KEY) and enable
        Dyad Pro.
      - Ensure microphone permissions are granted.
      - Optionally set DYAD_ENGINE_URL; defaults to https://engine.dyad.sh/v1.
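
The migration steps above can be read as a small configuration lookup. The sketch below resolves the API key and the transcription endpoint from settings and environment variables. The `ProSettings` shape, the function name, and the rule that a key in settings takes precedence over `DYAD_PRO_API_KEY` are assumptions. The default URL comes from the notes above and the `/audio/transcriptions` path from the PR summary.

```typescript
// Default taken from the migration notes.
const DEFAULT_ENGINE_URL = "https://engine.dyad.sh/v1";

// Hypothetical settings shape, for illustration only.
interface ProSettings {
  enableDyadPro: boolean;
  dyadProApiKey?: string;
}

type Env = Record<string, string | undefined>;

interface TranscriptionConfig {
  endpoint: string;
  apiKey: string;
}

// Returns null when Pro is disabled or no API key is available.
// Assumption: a key stored in settings wins over DYAD_PRO_API_KEY.
function resolveTranscriptionConfig(
  settings: ProSettings,
  env: Env,
): TranscriptionConfig | null {
  const apiKey = settings.dyadProApiKey || env.DYAD_PRO_API_KEY;
  if (!settings.enableDyadPro || !apiKey) {
    return null;
  }
  // Strip trailing slashes so the joined URL has exactly one separator.
  const baseUrl = (env.DYAD_ENGINE_URL || DEFAULT_ENGINE_URL).replace(/\/+$/, "");
  return { endpoint: `${baseUrl}/audio/transcriptions`, apiKey };
}
```

Passing the environment in as an argument, rather than reading it globally, keeps the lookup easy to exercise in tests.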
    
    <sup>Written for commit fa71433ae270a7276e5466c6c8df359eab1eb03d.
    Summary will update on new commits.</sup>
    
    <!-- End of auto-generated description by cubic. -->
    
    ---------
    Co-authored-by: Will Chen <willchen90@gmail.com>
pro_handlers.ts 4.3 KB