vocateca

Podcasts, YouTube & Instagram → on-device transcription → searchable Markdown library.
Everything runs on your Mac. No cloud APIs, no telemetry unless you opt in, no account required to transcribe. Native SwiftUI · Parakeet-TDT (default) + WhisperKit + Qwen3-ASR · Apple Silicon.

macOS 15+ · Apple Silicon | Parakeet-TDT · WhisperKit · Qwen3-ASR | German & English | Signed & notarised | Open core
Source on GitHub
vocateca is currently in development and not yet publicly released. Watch the GitHub repo for updates.

What it does

🎧 Podcasts

Search the Apple Podcasts directory by name, with the show’s real description and language rather than a storefront guess. Alternatively, paste any RSS URL or a Spotify episode link, which vocateca matches to the show’s public feed (Spotify-exclusive shows without a public feed can’t be transcribed). Downloads are resumable.

📺 YouTube

Subscribe to channels by any URL form. Captions-first; on-device transcription when none are available.

📸 Instagram

Reels, Stories and Posts from accounts you follow, via your own signed-in session — shown by @handle. No third-party API.

📥 Local files & URLs

Drop audio/video files, import a folder, or paste any URL yt-dlp recognises (SoundCloud, Vimeo, and hundreds more).

⚡ One-off transcription

Paste a single link to transcribe it immediately — no subscription needed.

📚 Bulk subscribe & OPML import

Move your whole podcast list over in one go: import an OPML file from any other app, or subscribe to many shows at once. Free — with an optional back-catalogue backfill.

🔔 Watchlist & alerts

Track shows and any number of keywords; get notified in-app when something new matches — a specific guest, a topic, a series. Fully free, no keyword limit.

🔒 Fully on-device

Three on-device engines: Parakeet-TDT (default on Apple Silicon, via FluidAudio/CoreML — roughly 2× faster than Whisper large-v3-turbo at comparable accuracy, multilingual), WhisperKit (universal baseline), and Qwen3-ASR (optional higher-accuracy engine, auto-selected on capable Apple Silicon — Pro/Max/Ultra, 24 GB+). All run on the Apple GPU (CoreML/Metal/MLX); the engine is auto-selected per machine and overridable in Settings. Per show you can force “always spoken word” so a music jingle is never mistaken for a skippable track. Audio and transcripts never leave your Mac. First run downloads the ~1.3 GB Parakeet model once.
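The per-machine selection described above can be sketched as a small decision function. This is only an illustration of the stated rules, not the app's actual logic: the engine identifiers, the `Machine` fields, and the order in which the checks run are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical identifiers for the three engines named in the text.
ENGINES = ("parakeet-tdt", "whisperkit", "qwen3-asr")


@dataclass(frozen=True)
class Machine:
    apple_silicon: bool
    chip_tier: str   # assumed labels: "base", "pro", "max" or "ultra"
    memory_gb: int


def pick_engine(machine: Machine, override: Optional[str] = None) -> str:
    """Pick an engine for this machine; a Settings override always wins."""
    if override is not None:
        if override not in ENGINES:
            raise ValueError(f"unknown engine: {override}")
        return override
    if not machine.apple_silicon:
        return "whisperkit"  # universal baseline
    # "Capable" per the text: Pro/Max/Ultra chip with 24 GB or more.
    capable = machine.chip_tier in {"pro", "max", "ultra"} and machine.memory_gb >= 24
    return "qwen3-asr" if capable else "parakeet-tdt"


print(pick_engine(Machine(apple_silicon=True, chip_tier="base", memory_gb=16)))
```

Keeping the override as the first check mirrors the statement that the automatic choice can always be overridden in Settings.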

📝 Obsidian-ready library

One .md per episode with YAML frontmatter, plus optional .srt, .txt and .html sidecars. Built-in mirroring to an Obsidian vault, plus multi-destination export to copy each transcript to several folders — your own knowledge hub.
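To make the file layout concrete, here is a minimal sketch of how a note with YAML frontmatter could be assembled. The field names (`title`, `show`, `engine`, `model`) and their values are hypothetical; the text does not specify vocateca's actual frontmatter schema.

```python
import json


def render_episode(meta: dict, transcript: str) -> str:
    """Build Markdown text: YAML frontmatter, a blank line, then the transcript."""
    lines = ["---"]
    for key, value in meta.items():
        # JSON scalars (double-quoted strings, numbers, true/false/null) are
        # also valid YAML, so json.dumps gives safe quoting without a YAML library.
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + transcript.strip() + "\n"


# Hypothetical metadata, for illustration only.
note = render_episode(
    {
        "title": "Episode 12: Hooks",
        "show": "Example Show",
        "engine": "parakeet-tdt",
        "model": "example-model",
    },
    "Hello and welcome.",
)
print(note)
```

Because each note is plain Markdown with a frontmatter block, Obsidian and similar tools can read the metadata as note properties without any plugin specific to vocateca.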

🔎 Full-text search

Search across your whole library by keyword. Full-text, not semantic. Every transcript records which engine and model produced it.
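Because the library is plain Markdown, the difference between full-text and semantic search is easy to illustrate: a full-text query matches the literal term and nothing else. The sketch below scans a folder of `.md` files directly; it is not how the app indexes its library, and the function name is hypothetical.

```python
from pathlib import Path


def search_library(root: Path, keyword: str) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for each literal, case-insensitive hit."""
    needle = keyword.casefold()
    hits = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.casefold():
                hits.append((path.name, number, line.strip()))
    return hits
```

A query for "sourdough" finds every line containing that word, but it will not surface an episode that only says "bread baking"; that would need semantic search.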

🧭 Built to find your way

Screens organised by what you want to do, not by system internals. A short guided tour on first launch, re-runnable any time. Full German and English UI. Master–detail browsing for shows and podcast search.

🪝 Webhooks (Pro)

Signed (HMAC-SHA256) JSON POSTed to your own URL on events such as an episode being transcribed or a run finishing. Wire vocateca into Zapier, n8n, Home Assistant, or anything else that takes a webhook.
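On the receiving side, an HMAC-SHA256 signature is usually checked by recomputing it over the raw request body with the shared secret. The sketch below assumes the signature is a hex digest of the unmodified body; the header that carries it, the exact encoding, and the example event name are not given in the text and are assumptions here.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw body (the encoding is an assumption)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign(secret, body).encode("utf-8")
    return hmac.compare_digest(expected, received_signature.encode("utf-8"))


# Hypothetical payload; the real event names and fields may differ.
secret = b"shared-secret"
body = b'{"event": "episode.transcribed"}'
signature = sign(secret, body)
print(verify(secret, body, signature))
```

Verify against the bytes exactly as received, before any JSON parsing or re-serialisation, since even a whitespace change alters the digest.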

⌨️ Menu bar, CLI & MCP server

A native macOS menu bar with full keyboard shortcuts for everyday use. For power users: a headless vocateca-cli that exposes every GUI action as a command with stable --json output, so it scripts cleanly — and vocateca-cli mcp exposes the same commands as tools to Claude Desktop or any Model Context Protocol client. Both free.
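Since the CLI promises stable `--json` output, a script can wrap it in a small helper that runs a command and parses the result. This is a sketch under assumptions: it presumes `--json` may be passed as the last argument and that standard output then contains only JSON. The text does not list subcommand names, so the helper takes them as parameters.

```python
import json
import subprocess


def run_cli(args: list[str], executable: str = "vocateca-cli") -> object:
    """Run one CLI command with --json appended and return the parsed output."""
    completed = subprocess.run(
        [executable, *args, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

Pass the subcommand and its arguments as `args`. A non-zero exit raises `subprocess.CalledProcessError`, so a failed command cannot be mistaken for an empty result.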

Queue & automation

A drag-reorderable Up Next queue with multi-select and “stop after current”, plus a Background vs Power mode so transcription can stay light while you work or run flat-out when you’re away. A watchlist tracks the shows and keywords you care about and notifies you in-app — all free. Pro adds the hands-off layer: a background daemon runs a scheduled daily check that auto-downloads and auto-transcribes your subscribed shows with the window closed, then sends you a daily summary.

Free vs. vocateca Pro

The guiding rule: active click = free, runs without you = Pro. Everything you trigger yourself — adding sources, bulk-importing, transcribing, exporting, watching keywords, scripting via CLI/MCP — is free forever. Pro (€4.90/month or €49/year, incl. VAT — about the price of a coffee; cancel anytime) adds only the automation layer that works while the app is closed.

Feature | Free | Pro
Add & browse all sources (Podcasts, YouTube, Instagram, local) | ✅ | ✅
Manual download + transcription, one-off links | ✅ | ✅
Bulk subscribe + OPML import | ✅ | ✅
Searchable Markdown library + .srt/.txt/.html export | ✅ | ✅
Obsidian mirroring + multi-destination export | ✅ | ✅
Watchlist: track shows & keywords, in-app alerts | ✅ | ✅
CLI + MCP server (scriptable, headless) | ✅ | ✅
Manual Notion push | ✅ | ✅
Scheduler: automatic daily source checks | – | ✅
Background daemon: transcribes with the window closed | – | ✅
Per-show auto-download + auto-start queue | – | ✅
Folder watch: auto-ingest dropped files | – | ✅
Webhooks + Notion auto-push | – | ✅
Daily summary notification | – | ✅

What’s next

Deeper Notion integration and a Readwise export (webhooks and manual Notion push already ship today, so both services are reachable now). More UI languages beyond German and English.

Privacy

All three transcription engines — Parakeet-TDT (CoreML, via FluidAudio), WhisperKit (CoreML/Metal) and Qwen3-ASR (MLX) — run entirely on-device on the Apple GPU. No OpenAI API key, no cloud inference. Your audio and transcripts stay on your Mac (in the output folder you choose and in ~/Library/Application Support/Vocateca/). Telemetry requires explicit opt-in. Instagram login uses your own signed-in session, stored in your macOS Keychain — nothing is sent anywhere except Instagram's own servers.

Open core

The vocateca app core is open source (MIT) on GitHub. The Pro automation layer (background daemon, webhooks, per-show auto-download, folder watch) and hosted services (billing, licensing) are proprietary. The core builds on: Parakeet-TDT via FluidAudio (Apache-2.0), WhisperKit (MIT), Qwen3-ASR via speech-swift + MLX (MIT; model Apache-2.0), GRDB.swift (MIT), FeedKit (MIT), Yams (MIT), ffmpeg (LGPL), yt-dlp (Unlicense), gallery-dl (GPL-2.0).

YouTube, Instagram, Apple Podcasts and other names are trademarks of their respective owners. vocateca is an independent tool, not affiliated with or endorsed by them.