vocateca
Podcasts, YouTube & Instagram → on-device transcription → searchable Markdown library.
Everything runs on your Mac. No cloud APIs, no telemetry unless you opt in, no account required to transcribe.
Native SwiftUI · Parakeet-TDT (default) + WhisperKit + Qwen3-ASR · Apple Silicon.
What it does
Podcasts
Find feeds by name (Apple Podcasts/iTunes directory) or paste any RSS URL. You can also paste a Spotify episode link: vocateca matches it to the show’s public feed and transcribes from there. Spotify-exclusive shows without a public feed can’t be transcribed. Downloads are resumable.
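To illustrate how a lookup by name can resolve to an RSS feed, here is a minimal sketch against Apple's public iTunes Search API. Whether vocateca queries the directory this way is an assumption. The function names are hypothetical, and only the `term`, `media` and `limit` parameters and the `collectionName` / `feedUrl` response fields come from Apple's API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ITUNES_SEARCH = "https://itunes.apple.com/search"


def build_search_url(show_name: str, limit: int = 5) -> str:
    """Build an iTunes Search API query restricted to podcasts."""
    query = urlencode({"term": show_name, "media": "podcast", "limit": limit})
    return f"{ITUNES_SEARCH}?{query}"


def extract_feeds(payload: dict) -> list[tuple[str, str]]:
    """Return (show title, RSS feed URL) pairs, skipping entries with no feed."""
    return [
        (item.get("collectionName", ""), item["feedUrl"])
        for item in payload.get("results", [])
        if item.get("feedUrl")
    ]


def find_feeds(show_name: str) -> list[tuple[str, str]]:
    """Query the directory over the network and return matching feeds."""
    with urlopen(build_search_url(show_name), timeout=10) as response:
        return extract_feeds(json.load(response))
```

Entries without a `feedUrl` are dropped, which mirrors the limitation above: a show with no public feed has nothing to download.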
YouTube
Subscribe to channels by any URL form. Captions-first; on-device transcription when none are available.
Instagram
Reels, Stories and Posts from accounts you follow, via your own signed-in session — shown by @handle. No third-party API.
Local files & URLs
Drop audio/video files, import a folder, or paste any URL yt-dlp recognises (SoundCloud, Vimeo, and hundreds more).
One-off transcription
Paste a single link to transcribe it immediately — no subscription needed.
Fully on-device
Three on-device engines: Parakeet-TDT (default on Apple Silicon, via FluidAudio/CoreML — roughly 2× faster than Whisper large-v3-turbo at comparable accuracy, multilingual), WhisperKit (universal baseline), and Qwen3-ASR (optional higher-accuracy engine, auto-selected on capable Apple Silicon — Pro/Max/Ultra, 24 GB+). All run on the Apple GPU (CoreML/Metal/MLX); the engine is auto-selected per machine and overridable in Settings. Audio and transcripts never leave your Mac. First run downloads the ~1.3 GB Parakeet model once.
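One way to read the per-machine selection rule is sketched below. The precedence (a Settings override first, then Qwen3-ASR on capable machines, then Parakeet-TDT, with WhisperKit as the fallback) is an assumption drawn from the description, not the app's actual logic. The `Machine` type and `pick_engine` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Machine:
    apple_silicon: bool
    chip_tier: str  # hypothetical label: "base", "Pro", "Max" or "Ultra"
    memory_gb: int


def pick_engine(machine: Machine, override: Optional[str] = None) -> str:
    """Pick a transcription engine; an explicit override always wins."""
    if override:
        return override
    if not machine.apple_silicon:
        return "WhisperKit"  # universal baseline
    if machine.chip_tier in {"Pro", "Max", "Ultra"} and machine.memory_gb >= 24:
        return "Qwen3-ASR"  # higher-accuracy engine on capable machines
    return "Parakeet-TDT"  # default on Apple Silicon
```

For example, `pick_engine(Machine(True, "Max", 36))` returns `"Qwen3-ASR"` under these assumptions, while a base chip with 16 GB gets `"Parakeet-TDT"`.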
Obsidian-ready library
One .md per episode with YAML frontmatter, plus optional .srt, .txt and .html sidecars. Built-in mirroring to an Obsidian vault, plus multi-destination export to copy each transcript to several folders — your own knowledge hub.
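Because each episode is a plain `.md` file with YAML frontmatter, other tools can read it without vocateca. The sketch below splits a file into frontmatter fields and body using only the standard library. It handles flat `key: value` pairs only, and the field names in the example (`title`, `engine`) are assumptions rather than vocateca's documented schema. For nested YAML, a full YAML parser would be the better choice.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a Markdown document into (flat frontmatter fields, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, text  # opening delimiter without a closing one
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        # Skip nested entries, list items and comments; keep top-level scalars.
        if line.startswith((" ", "-", "#")):
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip().strip("\"'")
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return fields, body
```

Usage, with a made-up transcript: `split_frontmatter('---\ntitle: "Episode 1"\n---\n\nHello.')` yields the field `title` with value `Episode 1` and the body `Hello.`.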
Full-text search
Search across your whole library by keyword. Full-text, not semantic. Every transcript records which engine and model produced it.
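To make the "full-text, not semantic" distinction concrete: a keyword search matches literal text, so "cars" will not find a transcript that only says "automobiles". The sketch below scans a folder of `.md` transcripts for a case-insensitive substring. It illustrates the idea only and says nothing about how vocateca's own index is built. `search_library` is a hypothetical name.

```python
from pathlib import Path


def search_library(root: Path, keyword: str) -> list[tuple[Path, int, str]]:
    """Return (file, line number, line text) for every literal keyword match."""
    needle = keyword.casefold()
    hits = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.casefold():
                hits.append((path, number, line.strip()))
    return hits
```

A linear scan like this is fine for a few hundred transcripts. Larger libraries usually want a real index.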
Webhooks (Pro)
Signed (HMAC-SHA256) JSON POSTed to your own URL on events such as an episode being transcribed or a run finishing. Wire vocateca into Zapier, n8n, Home Assistant, or anything else that takes a webhook.
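On the receiving side, a signed webhook is only useful if the signature is checked before the payload is trusted. The sketch below shows a typical HMAC-SHA256 check. It assumes the signature is the hex digest of the raw request body keyed with a shared secret. The header that carries it and the exact encoding are not specified here, so check them against vocateca's webhook settings.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 hex signature over the raw request body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    received = signature_hex.strip().lower()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected.encode(), received.encode())
```

Verify against the raw bytes as received. Parsing the JSON and re-serialising it first can change whitespace or key order and break the signature.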
Menu bar, CLI & MCP server
A native macOS menu bar with full keyboard shortcuts for everyday use. For power users: a headless vocateca-cli that exposes every GUI action as a command with stable --json output, so it scripts cleanly — and vocateca-cli mcp exposes the same commands as tools to Claude Desktop or any Model Context Protocol client.
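Stable `--json` output is what makes the CLI easy to drive from a script. A minimal wrapper might look like the sketch below. It assumes `vocateca-cli` is on your `PATH` and that `--json` is accepted after the subcommand. No real subcommand names are shown, because they are not listed here.

```python
import json
import subprocess


def run_cli(args: list[str], executable: str = "vocateca-cli") -> object:
    """Run one CLI command with --json appended and return the parsed output."""
    completed = subprocess.run(
        [executable, *args, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


# Usage (replace the placeholder with a real subcommand from the CLI's help):
# result = run_cli(["<subcommand>"])
```

Passing the arguments as a list rather than a shell string avoids quoting problems with URLs and file paths.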
Queue & automation
A drag-reorderable Up Next queue with multi-select and “stop after current”, plus a Background vs Power mode so transcription can stay light while you work or run flat-out when you’re away. Per-show retention policies and backfill campaigns work through a show’s back-catalogue. A watchlist tracks shows and keywords you care about. Pro adds a background daemon: a scheduled daily check that auto-downloads and auto-transcribes your subscribed shows with the window closed, then sends you a daily summary.
Free vs. vocateca Pro
The guiding rule: active click = free, runs without you = Pro. Everything you trigger manually is free forever. Pro (€4.90/month or €49/year, incl. VAT, about the price of a coffee, cancel anytime) adds the automation layer.
| Feature | Free | Pro |
|---|---|---|
| Add & browse all sources (Podcasts, YouTube, Instagram) | ✓ | ✓ |
| Manual download + transcription | ✓ | ✓ |
| Searchable Markdown library + .srt export | ✓ | ✓ |
| Scheduler: automatic daily source checks | ✗ | ✓ |
| Folder watch: auto-ingest dropped files | ✗ | ✓ |
| Background daemon: transcribes with the window closed | ✗ | ✓ |
| Webhooks: signed event notifications to your own URL | ✗ | ✓ |
| Per-show auto-download + retention | ✗ | ✓ |
| Watchlist: track shows & keywords | ✗ | ✓ |
What’s next
Deeper export integrations with Notion and Readwise (webhooks already ship today, so both are reachable now via automation). OPML import for bulk podcast subscriptions. Additional UI languages.
Privacy
All three transcription engines — Parakeet-TDT (CoreML, via FluidAudio), WhisperKit (CoreML/Metal)
and Qwen3-ASR (MLX) — run entirely on-device on the Apple GPU. No OpenAI API key, no cloud inference.
Your audio and transcripts stay on your Mac (in the output folder you choose and in
~/Library/Application Support/Vocateca/).
Telemetry requires explicit opt-in. Instagram login uses your own signed-in session, stored in your
macOS Keychain — nothing is sent anywhere except Instagram's own servers.
Open core
The vocateca app core is open source (MIT) on GitHub. The Pro automation layer (background daemon, webhooks, retention/backfill, watchlist) and hosted services (billing, licensing) are proprietary. The core builds on: Parakeet-TDT via FluidAudio (Apache-2.0), WhisperKit (MIT), Qwen3-ASR via speech-swift + MLX (MIT; model Apache-2.0), GRDB.swift (MIT), FeedKit (MIT), Yams (MIT), ffmpeg (LGPL), yt-dlp (Unlicense), gallery-dl (GPL-2.0).