ConvertBox

Transcribe audio to text

Use your browser's speech recognition to convert audio or video into SRT, VTT, or plain text. Real-time, private, free.

How to Use

  1. Choose the spoken language of your audio.
  2. Select an audio or video file (up to 100MB).
  3. The audio plays once while the browser transcribes it in real time.
  4. Download the transcript as SRT, VTT, or plain text.
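
To make the export step concrete, here is a minimal TypeScript sketch of how timed phrases could be turned into SRT, VTT, and plain text. It is not ConvertBox's actual code: the `Cue` shape and the function names `formatTimestamp`, `toSrt`, `toVtt`, and `toPlainText` are assumptions made for illustration. The only format facts it relies on are that SRT numbers its cues and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header line and uses a period.

```typescript
// One recognized phrase with start/end times in seconds.
// Hypothetical shape, chosen for this example only.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let out = String(n);
  while (out.length < width) out = "0" + out;
  return out;
}

// Format seconds as HH:MM:SS<sep>mmm.
// SRT uses "," before the milliseconds, WebVTT uses ".".
function formatTimestamp(seconds: number, sep: "," | "."): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + sep + pad(ms, 3);
}

// SRT: numbered cues, each followed by a blank line.
function toSrt(cues: Cue[]): string {
  return cues
    .map((c, i) =>
      (i + 1) + "\n" +
      formatTimestamp(c.start, ",") + " --> " + formatTimestamp(c.end, ",") + "\n" +
      c.text + "\n")
    .join("\n");
}

// WebVTT: "WEBVTT" header, a blank line, then unnumbered cues.
function toVtt(cues: Cue[]): string {
  const body = cues
    .map((c) =>
      formatTimestamp(c.start, ".") + " --> " + formatTimestamp(c.end, ".") + "\n" +
      c.text + "\n")
    .join("\n");
  return "WEBVTT\n\n" + body;
}

// Plain text: one phrase per line, timing discarded.
function toPlainText(cues: Cue[]): string {
  return cues.map((c) => c.text.trim()).join("\n");
}
```

Calling `toSrt` on a list of cues yields a string that can be saved with an `.srt` extension; `toVtt` and `toPlainText` work the same way for the other two formats. Keeping timing in seconds and formatting only at export time means all three outputs can be built from one list.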

ConvertBox never uploads your files to a server; transcription is handled by your browser's speech recognition.

Frequently Asked Questions

Transcription runs in your browser through the built-in Web Speech API. Depending on the browser, the recognition engine itself may run in the cloud. ConvertBox does not upload your files.
The Web Speech API listens to your device's audio output as the file plays. Mute the speakers if you prefer; the recognition still works.
Accuracy is best on clear single-speaker recordings. Background noise, accents, and overlapping speech reduce quality.
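
For readers curious about the Web Speech API mentioned in the answers above, the following TypeScript sketch shows one way a page might configure recognition for a chosen language and collect finished phrases. It is an illustration under stated assumptions, not ConvertBox's implementation: the helper names `finalPhrases` and `createRecognizer` and the small interfaces are hypothetical, declared locally so the sketch compiles without DOM typings. The properties it uses (`lang`, `continuous`, `interimResults`, `onresult`, `resultIndex`, `isFinal`, `transcript`) are part of the Web Speech API, and some browsers expose the constructor under the prefixed name `webkitSpeechRecognition`.

```typescript
// Minimal structural types mirroring the parts of the Web Speech API used here.
interface RecognitionAlternative { transcript: string; }
interface RecognitionResult { isFinal: boolean; 0: RecognitionAlternative; }
interface RecognitionEvent {
  resultIndex: number;
  results: { length: number; [index: number]: RecognitionResult };
}
interface Recognition {
  lang: string;
  continuous: boolean;
  interimResults: boolean;
  onresult: ((event: RecognitionEvent) => void) | null;
  start(): void;
  stop(): void;
}

// Collect the finalized phrases from one result event,
// starting at the first result that changed.
function finalPhrases(event: RecognitionEvent): string[] {
  const phrases: string[] = [];
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    if (result.isFinal) phrases.push(result[0].transcript.trim());
  }
  return phrases;
}

// Create a recognizer for the given language tag (e.g. "en-US").
// Returns null when the environment has no speech recognition support.
function createRecognizer(
  lang: string,
  onPhrase: (text: string) => void
): Recognition | null {
  const g = globalThis as any;
  const Ctor = g.SpeechRecognition || g.webkitSpeechRecognition;
  if (!Ctor) return null;
  const recognizer: Recognition = new Ctor();
  recognizer.lang = lang;            // spoken language chosen by the user
  recognizer.continuous = true;      // keep listening across pauses
  recognizer.interimResults = false; // deliver finalized phrases only
  recognizer.onresult = (event) => finalPhrases(event).forEach(onPhrase);
  return recognizer;
}
```

In a supporting browser, a page would call `createRecognizer("en-US", handlePhrase)`, check for `null`, and then call `start()` when playback begins and `stop()` when it ends. Returning `null` rather than throwing lets the page show a "not supported in this browser" message.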