Drop in a recording, get a clean editable transcript, then download it as TXT, Markdown, SRT, VTT, or Word.
Supported formats: .mp3, .wav, .m4a, .ogg, .webm, .flac, .aac, .opus. Maximum file size: 200 MB. Long recordings work, but transcription takes longer.
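For readers curious how limits like these might be checked before transcription starts, here is a minimal sketch. The function name `validateRecording`, the result type, and the choice of 1 MB = 1024 × 1024 bytes are assumptions made for illustration; the tool's actual checks may differ.

```typescript
// Hypothetical helper: names and the MB definition are illustrative assumptions.
const SUPPORTED_EXTENSIONS: string[] = [
  ".mp3", ".wav", ".m4a", ".ogg", ".webm", ".flac", ".aac", ".opus",
];
const MAX_FILE_BYTES = 200 * 1024 * 1024; // assumes 1 MB = 1024 * 1024 bytes

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateRecording(name: string, sizeBytes: number): ValidationResult {
  // Take everything from the last dot onward and compare case-insensitively.
  const dot = name.lastIndexOf(".");
  const ext = dot >= 0 ? name.slice(dot).toLowerCase() : "";
  if (SUPPORTED_EXTENSIONS.indexOf(ext) === -1) {
    return { ok: false, reason: `Unsupported format: ${ext || "(no extension)"}` };
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return { ok: false, reason: "File is larger than 200 MB" };
  }
  return { ok: true };
}
```

In a browser, this could be called with `file.name` and `file.size` from the dropped `File` object.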
Pick a model, then click Transcribe. The model is downloaded the first time you use it (then cached for offline use), so the first run is slower than subsequent ones.
Click a timestamp to seek the audio. Edit any line directly. Use the × to drop a line.
Pick a format. Plain text and Word strip timestamps; SRT, VTT, and Markdown keep them.
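To illustrate the difference between the formats, here is a rough sketch of how timestamped lines could be written as SRT, VTT, or plain text. The `Segment` shape and the function names are hypothetical and are not taken from the tool's source. The one format detail that matters is that SRT separates milliseconds with a comma, while VTT uses a period and begins with a `WEBVTT` header.

```typescript
// Hypothetical segment shape: start and end are in seconds.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as HH:MM:SS<sep>mmm ("," for SRT, "." for VTT).
function formatTimestamp(seconds: number, msSeparator: "," | "."): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${msSeparator}${pad(ms, 3)}`;
}

// SRT: numbered cues, comma before the milliseconds.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      `${i + 1}\n${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}\n${seg.text}\n`)
    .join("\n");
}

// VTT: "WEBVTT" header, period before the milliseconds, cue numbers optional.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) =>
    `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}\n${seg.text}\n`);
  return "WEBVTT\n\n" + cues.join("\n");
}

// Plain text: timestamps are dropped and only the line text is kept.
function toPlainText(segments: Segment[]): string {
  return segments.map((seg) => seg.text).join("\n");
}
```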
Speech recognition by transformers.js (Apache 2.0) running OpenAI’s Whisper (MIT) models.
Word export via JSZip (MIT).
Open-source acknowledgements