Pinna detects who-spoke-when and turns your call and meeting audio into clean, speaker-labeled transcripts. Submit by web, email, or API — get back a transcript that knows the difference between voices.
Automatically detects how many speakers are in the audio and labels who-spoke-when — no manual tagging, no guessing.
Whisper-class ASR turns speech into text, time-aligned and segmented per speaker turn, so the transcript reads the way the conversation actually happened.
Hebrew, Arabic, English, and Russian — including right-to-left scripts. Auto-detected, or force a language when you know it.
Upload in the web app, email an attachment, or call the API. Same engine, same speaker-labeled result, whichever fits your workflow.
The core runs without any third-party cloud-API dependency — which is what makes on-prem and airgap deployment possible for regulated environments.
Every speaker turn carries a timestamp, so you can jump straight to the moment in the recording that a line came from.
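As a rough illustration of what timestamped speaker turns make possible, here is a minimal Python sketch. The `SpeakerTurn` shape, its field names, and the helper functions are assumptions made for this example; they are not Pinna's actual output schema or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeakerTurn:
    # Hypothetical fields; the real transcript format may differ.
    speaker: str    # e.g. "Speaker 1"
    start_s: float  # turn start, in seconds from the start of the recording
    end_s: float    # turn end, in seconds
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a second offset as HH:MM:SS (fractions are truncated)."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render(turns: List[SpeakerTurn]) -> str:
    """One line per turn, ordered by start time."""
    ordered = sorted(turns, key=lambda t: t.start_s)
    return "\n".join(
        f"[{format_timestamp(t.start_s)}] {t.speaker}: {t.text}" for t in ordered
    )


def turn_at(turns: List[SpeakerTurn], seconds: float) -> Optional[SpeakerTurn]:
    """Return the turn being spoken at the given offset, if any."""
    for turn in turns:
        if turn.start_s <= seconds < turn.end_s:
            return turn
    return None


turns = [
    SpeakerTurn("Speaker 2", 65.0, 71.5, "Sure, go ahead."),
    SpeakerTurn("Speaker 1", 0.0, 64.2, "Thanks for joining the call."),
]
print(render(turns))
```

With a structure like this, a player can seek to `start_s` for any line, and `turn_at` answers the reverse question of who was speaking at a given moment.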
Drag a file into the app, watch the job progress, and download the transcript when it's done. The simplest way to get started.
Send the audio as an attachment and get the transcript back — no app needed. Perfect for forwarding a recording the moment a call ends.
<you>@in.pinna.im. The sending address has to be on your account allowlist.
Submit jobs programmatically and pull results into your own systems. Built API-first, so automation is a first-class path, not an afterthought.
The core diarization and transcription run on a self-contained engine. Your audio doesn't get handed to a third-party transcription API to do the work.
Because the engine is offline-capable, Pinna can be deployed onto infrastructure you own — including airgapped environments — not only as a hosted service.
Email ingress is allowlisted per account, and the authenticated app manages API keys and job access — so only the people you authorize can submit and read.