vocateca
Podcasts, YouTube & Instagram → on-device transcription → searchable Markdown library.
Everything runs on your Mac. No cloud APIs, no telemetry unless you explicitly opt in, no account required to transcribe.
Native SwiftUI · Parakeet-TDT (default) + WhisperKit + Qwen3-ASR · Apple Silicon.
What it does
Podcasts
Search the Apple Podcasts directory by name — with the show’s real description and language, not a storefront guess — or paste any RSS URL, even a Spotify episode link, which vocateca matches to the show’s public feed (Spotify-exclusive shows without a public feed can’t be transcribed). Resumable downloads.
YouTube
Subscribe to channels by any URL form. Captions-first; on-device transcription when none are available.
Instagram
Reels, Stories and Posts from accounts you follow, via your own signed-in session — shown by @handle. No third-party API.
Local files & URLs
Drop audio/video files, import a folder, or paste any URL yt-dlp recognises (SoundCloud, Vimeo, and hundreds more).
One-off transcription
Paste a single link to transcribe it immediately — no subscription needed.
Bulk subscribe & OPML import
Move your whole podcast list over in one go: import an OPML file from any other app, or subscribe to many shows at once. Free — with an optional back-catalogue backfill.
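For orientation: an OPML export is plain XML in which each subscribed show is an `outline` element that carries its feed address in an `xmlUrl` attribute. The sketch below reads such a file with Python's standard library. It is not vocateca's importer; the function name `feed_urls_from_opml` and the sample feeds are made up for illustration.

```python
import xml.etree.ElementTree as ET


def feed_urls_from_opml(opml_text: str) -> list[str]:
    """Return every feed URL found in an OPML document, in file order."""
    root = ET.fromstring(opml_text)
    urls = []
    # Podcast apps often nest shows inside folder outlines, so walk the whole tree.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url and url not in urls:  # skip folder outlines and duplicate feeds
            urls.append(url)
    return urls


# Hypothetical sample export with one folder and one duplicated show.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>My podcasts</title></head>
  <body>
    <outline text="Tech">
      <outline type="rss" text="Show A" xmlUrl="https://example.com/a.xml"/>
      <outline type="rss" text="Show B" xmlUrl="https://example.com/b.xml"/>
    </outline>
    <outline type="rss" text="Show A again" xmlUrl="https://example.com/a.xml"/>
  </body>
</opml>"""

if __name__ == "__main__":
    for url in feed_urls_from_opml(SAMPLE_OPML):
        print(url)
```

A list like this is all a bulk subscribe needs: one feed URL per show, with folders and duplicates dropped.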
Watchlist & alerts
Track shows and any number of keywords; get notified in-app when something new matches — a specific guest, a topic, a series. Fully free, no keyword limit.
Fully on-device
Three on-device engines: Parakeet-TDT (default on Apple Silicon, via FluidAudio/CoreML — roughly 2× faster than Whisper large-v3-turbo at comparable accuracy, multilingual), WhisperKit (universal baseline), and Qwen3-ASR (optional higher-accuracy engine, auto-selected on capable Apple Silicon — Pro/Max/Ultra, 24 GB+). All run on the Apple GPU (CoreML/Metal/MLX); the engine is auto-selected per machine and overridable in Settings. Per show you can force “always spoken word” so a music jingle is never mistaken for a skippable track. Audio and transcripts never leave your Mac. First run downloads the ~1.3 GB Parakeet model once.
Obsidian-ready library
One .md per episode with YAML frontmatter, plus optional .srt, .txt and .html sidecars. Built-in mirroring to an Obsidian vault, plus multi-destination export to copy each transcript to several folders — your own knowledge hub.
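To make the file layout concrete, here is a minimal sketch of what "one .md per episode with YAML frontmatter" can look like when generated from a script. The field names (`title`, `show`, `published`, `engine`, `model`) and all values are assumptions for illustration; vocateca's actual frontmatter keys are not documented here.

```python
import json


def render_episode_markdown(meta: dict, transcript: str) -> str:
    """Render one episode as Markdown preceded by a YAML frontmatter block."""
    lines = ["---"]
    for key, value in meta.items():
        # JSON strings and numbers are also valid YAML scalars,
        # so json.dumps gives safe quoting for titles containing ':' or '#'.
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    lines.append("")
    lines.append(transcript.strip())
    return "\n".join(lines) + "\n"


# Hypothetical metadata; none of these keys or values come from vocateca itself.
EXAMPLE_META = {
    "title": "Episode 1: Getting started",
    "show": "Example Show",
    "published": "2025-01-01",
    "engine": "parakeet-tdt",
    "model": "example-model-id",
}

if __name__ == "__main__":
    print(render_episode_markdown(EXAMPLE_META, "Welcome to the show."))
```

Obsidian reads the block between the two `---` lines as note properties, which is what makes such files filterable inside a vault.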
Full-text search
Search across your whole library by keyword. Full-text, not semantic. Every transcript records which engine and model produced it.
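The difference between full-text and semantic search is easy to show on the exported files themselves. The sketch below does a literal, case-insensitive keyword scan over a folder of `.md` transcripts. It only illustrates the matching behaviour and is not how vocateca indexes its library; `search_library` and the sample files are hypothetical.

```python
import tempfile
from pathlib import Path


def search_library(root: Path, keyword: str) -> list[tuple[str, int, str]]:
    """Case-insensitive keyword search over every .md file under root."""
    needle = keyword.casefold()
    hits = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, line in enumerate(text.splitlines(), start=1):
            # Literal substring match: no synonyms, no embeddings.
            if needle in line.casefold():
                hits.append((path.name, line_no, line.strip()))
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        library = Path(tmp)
        (library / "episode-1.md").write_text("We talk about Rust today.\n", encoding="utf-8")
        (library / "episode-2.md").write_text("Nothing relevant here.\n", encoding="utf-8")
        for name, line_no, line in search_library(library, "rust"):
            print(f"{name}:{line_no}: {line}")
```

A query for "automobile" would not find a transcript that only says "car"; that is the trade-off of full-text matching.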
Built to find your way
Screens organised by what you want to do, not by system internals. A short guided tour on first launch, re-runnable any time. Full German and English UI. Master–detail browsing for shows and podcast search.
Webhooks (Pro)
Signed (HMAC-SHA256) JSON POSTed to your own URL on events such as an episode being transcribed or a run finishing. Wire vocateca into Zapier, n8n, Home Assistant, or anything else that takes a webhook.
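On the receiving side, a signed webhook is only useful if the signature is checked. The sketch below shows the usual HMAC-SHA256 verification pattern in Python. It assumes the signature is the hex digest of the raw request body and that you configured a shared secret; the exact header name, signature encoding, and the `episode.transcribed` event name are assumptions, not documented vocateca behaviour.

```python
import hashlib
import hmac
import json


def sign_body(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_authentic(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))


if __name__ == "__main__":
    secret = b"replace-with-your-webhook-secret"
    # Hypothetical payload; the real event schema may differ.
    body = json.dumps({"event": "episode.transcribed"}).encode("utf-8")
    signature = sign_body(secret, body)
    print(is_authentic(secret, body, signature))          # genuine request
    print(is_authentic(secret, body + b" ", signature))   # tampered body
```

Verify against the raw bytes as received, before any JSON parsing, because re-serialising the payload can change whitespace and invalidate the signature.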
Menu bar, CLI & MCP server
A native macOS menu bar with full keyboard shortcuts for everyday use. For power users: a headless vocateca-cli that exposes every GUI action as a command with stable --json output, so it scripts cleanly — and vocateca-cli mcp exposes the same commands as tools to Claude Desktop or any Model Context Protocol client. Both free.
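MCP clients such as Claude Desktop typically launch a local server from an `mcpServers` entry that names a command and its arguments. The sketch below builds such an entry for `vocateca-cli mcp`. The install path and the server label `vocateca` are assumptions; check where the CLI lives on your machine and how your client expects its configuration.

```python
import json


def mcp_server_entry(cli_path: str) -> dict:
    """Build an `mcpServers` entry that launches `vocateca-cli mcp`."""
    return {
        "mcpServers": {
            # "vocateca" is just a label chosen for this example.
            "vocateca": {"command": cli_path, "args": ["mcp"]},
        }
    }


if __name__ == "__main__":
    # Hypothetical install location of the CLI binary.
    entry = mcp_server_entry("/usr/local/bin/vocateca-cli")
    print(json.dumps(entry, indent=2))
```

The printed JSON can be merged into the client's configuration file by hand; the script does not write anything to disk.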
Queue & automation
A drag-reorderable Up Next queue with multi-select and “stop after current”, plus a Background vs Power mode so transcription can stay light while you work or run flat-out when you’re away. A watchlist tracks the shows and keywords you care about and notifies you in-app — all free. Pro adds the hands-off layer: a background daemon runs a scheduled daily check that auto-downloads and auto-transcribes your subscribed shows with the window closed, then sends you a daily summary.
Free vs. vocateca Pro
The guiding rule: active click = free, runs without you = Pro. Everything you trigger yourself (adding sources, bulk-importing, transcribing, exporting, watching keywords, scripting via CLI/MCP) is free forever. Pro (€4.90/month or €49/year, incl. VAT; about the price of a coffee; cancel anytime) adds only the automation layer that works while the app is closed.
| Feature | Free | Pro |
|---|---|---|
| Add & browse all sources (Podcasts, YouTube, Instagram, local) | ✓ | ✓ |
| Manual download + transcription, one-off links | ✓ | ✓ |
| Bulk subscribe + OPML import | ✓ | ✓ |
| Searchable Markdown library + .srt/.txt/.html export | ✓ | ✓ |
| Obsidian mirroring + multi-destination export | ✓ | ✓ |
| Watchlist: track shows & keywords, in-app alerts | ✓ | ✓ |
| CLI + MCP server (scriptable, headless) | ✓ | ✓ |
| Manual Notion push | ✓ | ✓ |
| Scheduler: automatic daily source checks | ✗ | ✓ |
| Background daemon: transcribes with the window closed | ✗ | ✓ |
| Per-show auto-download + auto-start queue | ✗ | ✓ |
| Folder watch: auto-ingest dropped files | ✗ | ✓ |
| Webhooks + Notion auto-push | ✗ | ✓ |
| Daily summary notification | ✗ | ✓ |
What’s next
Deeper Notion integration and a Readwise export are planned (webhooks and manual Notion push already ship today, so both services are reachable now). More UI languages beyond German and English will follow.
Privacy
All three transcription engines — Parakeet-TDT (CoreML, via FluidAudio), WhisperKit (CoreML/Metal)
and Qwen3-ASR (MLX) — run entirely on-device on the Apple GPU. No OpenAI API key, no cloud inference.
Your audio and transcripts stay on your Mac (in the output folder you choose and in
~/Library/Application Support/Vocateca/).
Telemetry requires explicit opt-in. Instagram login uses your own signed-in session, stored in your
macOS Keychain — nothing is sent anywhere except Instagram's own servers.
Open core
The vocateca app core is open source (MIT) on GitHub. The Pro automation layer (background daemon, webhooks, per-show auto-download, folder watch) and hosted services (billing, licensing) are proprietary. The core builds on: Parakeet-TDT via FluidAudio (Apache-2.0), WhisperKit (MIT), Qwen3-ASR via speech-swift + MLX (MIT; model Apache-2.0), GRDB.swift (MIT), FeedKit (MIT), Yams (MIT), ffmpeg (LGPL), yt-dlp (Unlicense), gallery-dl (GPL-2.0).