Documentation

Providers

Voiceavel supports multiple AI transcription providers. Switch between them at any time from the menu bar: Settings → Provider.

Soniox Real-time

The default provider for Pro users and trial users. Words appear in your editor as you speak, so there is nothing to wait for after you release the key.

  • Cost: 30 min/day included with Pro (managed by Voiceavel — no key needed). 10 min/day during the free 3-day trial.
  • Privacy: Audio is streamed to Soniox's servers, processed in real time, then discarded
  • Speed: Real-time — words appear within ~200ms of being spoken
  • Accuracy: Excellent — 60+ languages supported
  • Requirements: Pro license or active trial (no API key needed). Bring your own Soniox key for unlimited minutes (Pro required).
  • Fallback: When the daily cap is reached, Voiceavel automatically falls back to Groq batch transcription

How it works: Hold the hotkey and speak; text appears live in your editor. Release the key when you are done.
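The daily cap and fallback described above can be sketched as a small decision function. This is an illustrative sketch only, not Voiceavel's actual implementation: the `Usage` type, the `pick_provider` function, and the provider labels are hypothetical names, and the caps are taken from the Cost bullet above.

```python
from dataclasses import dataclass

# Managed real-time allowance per day, in minutes (from the Cost bullet above).
DAILY_CAP_MINUTES = {"pro": 30.0, "trial": 10.0}


@dataclass
class Usage:
    plan: str                  # "pro" or "trial" (hypothetical labels)
    minutes_used_today: float  # managed real-time minutes consumed today
    has_own_soniox_key: bool = False


def pick_provider(usage: Usage) -> str:
    """Return a hypothetical provider label for the next recording."""
    if usage.plan not in DAILY_CAP_MINUTES:
        raise ValueError(f"no real-time allowance for plan {usage.plan!r}")
    # With a personal Soniox key the managed daily cap does not apply.
    if usage.has_own_soniox_key:
        return "soniox-realtime"
    if usage.minutes_used_today < DAILY_CAP_MINUTES[usage.plan]:
        return "soniox-realtime"
    # Cap reached: fall back to Groq batch transcription.
    return "groq-batch"
```

For example, `pick_provider(Usage("pro", 12.5))` stays on real-time transcription, while `pick_provider(Usage("trial", 10.0))` switches to the batch fallback.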

Cloud (Voiceavel)

Uses Whisper Large v3 hosted by Groq, accessed through Voiceavel's servers. No API key needed.

  • Cost: Included with Pro
  • Privacy: Audio is sent to Voiceavel's servers, processed via Groq, then discarded
  • Speed: Ultra-fast (sub-second after key release)
  • Accuracy: Very good — Whisper Large v3
  • Requirements: Pro license (no API key needed)

Local AI

The default provider for free users. Uses faster-whisper, an optimized implementation of the Whisper model that runs entirely on your Mac.

  • Cost: Free — forever
  • Privacy: 100% offline — your voice never leaves your device
  • Speed: Fast (under 1 second for most recordings)
  • Accuracy: Good for English and major languages
  • Requirements: None — works out of the box

Note: The first transcription after launch may take slightly longer while the model loads. Subsequent recordings are fast.

Groq (BYOK)

Uses Whisper Large v3 on Groq's LPU hardware with your own Groq API key.

  • Cost: Free tier available from Groq, then usage-based
  • Privacy: Audio is sent to Groq's servers
  • Speed: Ultra-fast (sub-second)
  • Accuracy: Very good — comparable to Cloud (Voiceavel)
  • Requirements: Pro license + Groq API key

OpenAI

Uses OpenAI's GPT-4o Transcribe API for the highest accuracy available.

  • Cost: ~$0.006/minute (billed directly by OpenAI)
  • Privacy: Audio is sent to OpenAI's servers
  • Speed: Fast (1–2 seconds)
  • Accuracy: Best available — excellent for all languages including rare ones
  • Requirements: Pro license + OpenAI API key
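To get a feel for what the per-minute rate means in practice, here is a rough worked estimate. The usage pattern (30 minutes a day for 30 days) is an assumption chosen for illustration, `estimate_cost` is a hypothetical helper, and the rate is the approximate figure quoted above; check OpenAI's pricing page for current values.

```python
from decimal import Decimal

# Approximate per-minute rate quoted above (USD); may change over time.
OPENAI_RATE_PER_MINUTE = Decimal("0.006")


def estimate_cost(minutes_per_day: int, days: int, rate_per_minute: Decimal) -> Decimal:
    """Estimated spend in USD for a steady daily dictation volume."""
    total = Decimal(minutes_per_day) * days * rate_per_minute
    return total.quantize(Decimal("0.01"))  # round to whole cents


# Assumed usage: 30 minutes of dictation every day for 30 days.
monthly = estimate_cost(minutes_per_day=30, days=30, rate_per_minute=OPENAI_RATE_PER_MINUTE)
print(f"~${monthly} per month")  # prints "~$5.40 per month"
```

The same helper works for any provider billed per minute; pass that provider's rate as `rate_per_minute`.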

AssemblyAI (Real-time BYOK)

Real-time streaming via AssemblyAI's WebSocket API with your own key.

  • Cost: $0.0025/min (~$0.15/hr, billed by AssemblyAI). $50 free credit on signup.
  • Privacy: Audio is streamed to AssemblyAI's servers
  • Speed: Real-time — ~300ms latency
  • Accuracy: Very good for supported languages
  • Requirements: Pro license + AssemblyAI API key
  • Supported languages (streaming): English, Spanish, French, German, Italian, Portuguese. Other languages fall back to batch.
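The streaming-versus-batch rule in the last bullet can be pictured as a simple language check. This is a sketch under stated assumptions, not Voiceavel's or AssemblyAI's actual code: the function name `transcription_mode` is hypothetical, and representing the six listed languages as ISO 639-1 codes is this example's choice.

```python
# The six streaming languages listed above, written as ISO 639-1 codes
# (the use of these codes is an assumption of this sketch).
STREAMING_LANGUAGES = {"en", "es", "fr", "de", "it", "pt"}


def transcription_mode(language_code: str) -> str:
    """Return "streaming" for a listed language, otherwise "batch"."""
    # Reduce regional variants such as "pt-BR" or "en_US" to the base language.
    base = language_code.strip().lower().replace("_", "-").split("-")[0]
    return "streaming" if base in STREAMING_LANGUAGES else "batch"
```

With this rule, `transcription_mode("pt-BR")` selects streaming and `transcription_mode("ja")` selects batch.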

Plan Requirements Summary

| Provider | Free | Trial (3 days) | Pro |
|----------|------|----------------|-----|
| Local Whisper | ✅ | ✅ | ✅ |
| Soniox Real-time | ❌ | ✅ 10 min/day | ✅ 30 min/day |
| Cloud (Voiceavel / Groq) | ❌ | ✅ | ✅ |
| OpenAI (own key) | ❌ | ✅ | ✅ |
| AssemblyAI (own key) | ❌ | ✅ | ✅ |
| Soniox (own key) | ❌ | ✅ | ✅ unlimited |

After the 3-day trial ends, only Local Whisper (the Local AI provider) remains available. Upgrade to Pro to restore cloud access.
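For readers who prefer to see the summary as data, the table can be expressed as a lookup. This is only an illustration of the rules stated above: the plan labels and the `available_providers` helper are hypothetical, and a user whose trial has ended is treated as being on the free plan.

```python
PLANS = ("free", "trial", "pro")

# Rows copied from the summary table: (free, trial, pro) availability.
AVAILABILITY = {
    "Local Whisper":            (True,  True, True),
    "Soniox Real-time":         (False, True, True),
    "Cloud (Voiceavel / Groq)": (False, True, True),
    "OpenAI (own key)":         (False, True, True),
    "AssemblyAI (own key)":     (False, True, True),
    "Soniox (own key)":         (False, True, True),
}


def available_providers(plan: str) -> list[str]:
    """List the providers available on a plan, in table order."""
    column = PLANS.index(plan)  # raises ValueError for an unknown plan
    return [name for name, row in AVAILABILITY.items() if row[column]]


# After the trial ends, the user is effectively on the free plan.
print(available_providers("free"))  # prints ['Local Whisper']
```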