Meeting Notes

Local-first macOS meeting transcription using Apple's on-device speech recognition.

macOS · LOCAL · Source ready; public repo pending.

The Problem

Meetings generate unstructured audio, but capturing it is friction-heavy. Existing tools require calendar access, inject bots into calls, or stream audio to external services. There's no privacy-first, locally running solution that captures both sides of a meeting and exports clean transcripts.

What I Built

A macOS terminal + menu bar app (Swift 6.2, macOS 26.0+) that captures meeting audio via ScreenCaptureKit and mic input via AVAudioEngine, transcribes both tracks in real time using Apple's Speech framework, and labels speakers as "You" vs. "Meeting Audio." Sessions persist as timestamped JSON; a Python utility exports clean, timestamped, speaker-labeled transcripts. The capture target is auto-detected among common meeting apps (Zoom, FaceTime, Google Meet, Teams, browsers) by a heuristic scorer that weighs app names, bundle IDs, window titles, and focus state.
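To make the export step concrete, here is a minimal sketch of turning a session file into a speaker-labeled transcript. The session schema (a "segments" list with "speaker", "start", "text", and "final" keys) and the function names are assumptions made for illustration; the actual JSON layout and utility are not published yet and may differ.

```python
import json

# Hypothetical session payload; the real schema may differ.
SAMPLE_SESSION = """
{
  "started_at": "2025-01-15T10:00:00Z",
  "segments": [
    {"speaker": "Meeting Audio", "start": 3.2, "text": "Shall we begin?", "final": true},
    {"speaker": "You", "start": 0.0, "text": "Hi everyone.", "final": true},
    {"speaker": "You", "start": 5.0, "text": "Yes, let", "final": false}
  ]
}
"""


def format_offset(seconds: float) -> str:
    """Render a second offset from session start as HH:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def export_transcript(session: dict) -> str:
    """Merge both tracks into one transcript, ordered by start time."""
    # Keep only finalized segments; in-flight (volatile) text is dropped.
    finals = [s for s in session["segments"] if s.get("final", True)]
    finals.sort(key=lambda s: s["start"])
    return "\n".join(
        f"[{format_offset(s['start'])}] {s['speaker']}: {s['text'].strip()}"
        for s in finals
    )


transcript = export_transcript(json.loads(SAMPLE_SESSION))
print(transcript)
```

Sorting by start time interleaves the two tracks into a single conversation, and filtering on the assumed "final" flag keeps half-recognized text out of the export.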

Notable

The two-track architecture (a separate SpeechTranscriber per track) shows both volatile (in-flight) and final results at the same time, so the transcript updates in real time without waiting for finalization. Everything runs on-device; Apple's speech assets download once per locale on first run.
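The volatile/final split can be illustrated with a small, language-neutral sketch, written in Python rather than the app's Swift. It assumes each recognition result arrives as a text string plus a flag saying whether it is final; TrackView and its methods are hypothetical names, not part of Apple's Speech framework or of this project's source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrackView:
    """Display state for one audio track (hypothetical helper)."""
    speaker: str
    finalized: List[str] = field(default_factory=list)
    volatile: str = ""

    def update(self, text: str, is_final: bool) -> None:
        if is_final:
            # A final result commits the text and clears the in-flight guess.
            self.finalized.append(text)
            self.volatile = ""
        else:
            # Each volatile result replaces the previous in-flight guess.
            self.volatile = text

    def render(self) -> str:
        parts = list(self.finalized)
        if self.volatile:
            parts.append(self.volatile + " ...")
        return f"{self.speaker}: " + " ".join(parts)


you = TrackView("You")
you.update("Let's", False)
you.update("Let's get", False)
you.update("Let's get started.", True)
you.update("First", False)
rendered = you.render()
print(rendered)
```

Keeping one such state object per track lets the "You" and "Meeting Audio" lines refresh independently, which is one plausible way to get immediate feedback while still persisting only finalized text.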

Stack

Swift 6.2 · macOS 26.0+ · ScreenCaptureKit · AVFoundation · Apple Speech · AppKit · SwiftUI · Python 3

Status

Source ready; public repo pending.