Captioning Types Explained: SDH, Live Captions, CART & More

Not all captions are the same — and the difference matters more than most people realize. Standard subtitles transcribe dialogue. SDH tells you a door slammed. CART provides human-generated real-time captions accurate enough for legal proceedings. Auto-captions are fast but imperfect. Knowing which type you're working with — and which is available on which platform — helps you set realistic expectations and find the right tool for each situation.

The Main Types of Captioning

PRE-RECORDED

Subtitles vs SDH (Subtitles for the Deaf and Hard of Hearing)

Standard subtitles transcribe spoken dialogue only — they assume the viewer can hear everything else. SDH goes further: it identifies speakers, describes non-speech audio (music, sound effects, ambient sounds), and conveys audio information that would otherwise be inaccessible. Examples of SDH descriptions include [wind howling], [phone ringing in distance], [crowd cheering], [tense music], [door slams].
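To make the difference concrete, here is a minimal sketch that strips the SDH-only information out of a few caption cues, leaving roughly what a dialogue-only subtitle track would show. The bracket and "NAME:" speaker-label conventions, the sample cues, and the function name `sdh_to_dialogue` are assumptions for illustration; real caption files vary in how they mark these elements.

```python
import re

# Bracketed or parenthesized sound descriptions, e.g. [door slams] or (tense music).
NON_SPEECH = re.compile(r"\[[^\]]*\]|\([^)]*\)")
# Leading all-caps speaker labels such as "MARIA:" (an assumed convention).
SPEAKER = re.compile(r"^[A-Z][A-Z .'-]*:\s*")


def sdh_to_dialogue(cue: str) -> str:
    """Return only the spoken words from one SDH cue, or '' if none remain."""
    text = NON_SPEECH.sub("", cue)
    text = SPEAKER.sub("", text.strip())
    return " ".join(text.split())


# Hypothetical SDH cues for a short scene.
sdh_cues = [
    "[wind howling]",
    "MARIA: Did you hear that?",
    "[door slams] Who's there?",
]

for cue in sdh_cues:
    # Cues that contained only a sound description come back empty.
    print(repr(cue), "->", repr(sdh_to_dialogue(cue)))
```

The first cue disappears entirely and the third loses the door slam, which is exactly the information a viewer relying on plain subtitles never receives.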

For someone with significant hearing loss, the difference between subtitles and SDH is the difference between following the plot and understanding the mood, tension, and context of a scene. A character's reaction to a sound you didn't know happened only makes sense if the caption told you the sound occurred.

SDH is the standard for physical media (Blu-ray, DVD) in the US and is increasingly available on streaming platforms. Look for "SDH" or the CC symbol specifically — not just "subtitles" — when selecting caption tracks.

Best for: Pre-recorded content (streaming, physical media, downloaded video)

Limitation: Only available on content that has been professionally captioned; live content uses different systems

LIVE / AUTO-GENERATED

Automatic Speech Recognition (ASR) Captions

ASR captions are generated in real time by speech recognition software — the same technology behind Siri, Google Assistant, and voice-to-text. They appear with a short delay (typically 1-3 seconds) and accuracy varies significantly based on speaker clarity, accent, background noise, and technical vocabulary.

Major platforms have built ASR captioning into their core products. Microsoft Teams, Zoom, Google Meet, and Apple's FaceTime all offer live auto-captions. Accuracy on clear speech in quiet environments is generally good. Accuracy drops with multiple simultaneous speakers, strong accents, technical terms, or background noise — exactly the conditions common in real meetings.

ASR captions generally capture words only. Some implementations occasionally tag [laughter] or [applause], but environmental sounds are usually not described.

Best for: Live meetings, calls, and real-time conversations where some errors are acceptable

Limitation: Accuracy varies; no sound description; struggles with accents and technical terms

PROFESSIONAL LIVE

CART (Communication Access Realtime Translation)

CART is human-generated real-time captioning provided by a trained stenographer. A CART provider listens to speech and transcribes it on a stenotype machine at speeds that keep pace with natural conversation, typically with 95%+ accuracy even with technical content, accents, and multiple speakers.

CART is the gold standard for accessibility in educational, legal, medical, and professional settings. It's the system used in courtrooms, at conferences, and in university lecture halls. It can describe non-speech audio when the provider chooses to do so.

The practical limitation is cost and availability — CART requires a trained human provider and is priced accordingly. Remote CART (where the provider works off-site via audio feed) has made it more accessible, but it remains a premium service relative to ASR alternatives.

Best for: High-stakes situations (legal, medical, academic, professional conferences)

Limitation: Cost; requires advance scheduling; not practical for casual daily use

BROADCAST

Closed Captions (CC) — Television Standard

Closed captions on broadcast television in the US are federally mandated under the FCC's closed captioning rules, which require nearly all broadcast and cable programming to be captioned, with limited exemptions. The CC standard includes speaker identification and some non-speech audio description, though quality varies significantly between providers.

Live broadcast captions (news, sports, live events) are typically generated by stenographers or by voice writers, who re-speak the content into speech recognition software trained on their own voice. Pre-recorded broadcast content is captioned in post-production and is generally more accurate.

Best for: Television (live and recorded broadcast content)

Limitation: Live broadcast quality varies; streaming services operate under different rules than broadcast

Live Captioning Apps & Tools by Platform

The practical question for most hearing-impaired users isn't which captioning type is best in theory — it's what's available on the device in front of them. Here's what exists across major consumer platforms as of 2026.

Platform	Built-In Tool	Where to Find It	Notes
Windows 11	Live Captions	Settings → Accessibility → Captions	Captions any audio on the device. Works offline. Reasonable accuracy on clear speech.
macOS (Ventura+)	Live Captions	System Settings → Accessibility → Live Captions	Similar to Windows implementation. Captions FaceTime calls and device audio.
iPhone / iPad (iOS 16+)	Live Captions	Settings → Accessibility → Live Captions	Captions phone calls, FaceTime, and media. Also available in Control Center.
Android	Live Transcribe	Accessibility settings or Google Play	Transcribes speech around you in real time. Requires internet connection.
Android	Sound Amplifier	Accessibility settings	Not captioning — amplifies and filters audio. Useful complement to captions.
Microsoft Teams	Live Captions + Transcript	Meeting controls → More → Turn on live captions	Captions during meeting; full transcript available after. Speaker identification included.
Zoom	Live Transcription	Meeting controls → CC → Enable Auto-Transcription	Must be enabled by host. Third-party CART integration also supported.
Google Meet	Captions	Bottom bar → Turn on captions (CC icon)	Available to all participants. English-primary; other languages expanding.

Third-Party Apps Worth Knowing

Otter.ai

AI-powered transcription that works in real time and produces a searchable, shareable transcript. Useful for meetings, interviews, and lectures. Free tier has monthly minute limits; paid tiers remove them. Better than most built-in tools for technical vocabulary if you train it on your domain terminology.

Google Live Transcribe (Android)

Standalone app separate from the built-in Android feature. Designed specifically for face-to-face conversations: you hold the phone between you and the other person, and it transcribes their speech in real time. Useful in restaurants, appointments, and anywhere you'd normally struggle to hear someone across a table.

Apple Live Listen

Not captioning, but functionally related — uses AirPods as a remote microphone, streaming audio from your phone's mic directly to your ears. Point the phone toward a speaker across the room and hear them through your AirPods. Practical for lectures, presentations, and noisy environments. Found under Settings → Accessibility → Hearing Devices.

Sorenson Communications / ZVRS

Video Relay Services (VRS) for ASL users — connects a hearing-impaired caller with an ASL interpreter who voices the call to the hearing party. Free to qualified users under FCC regulations. Different from captioning but worth knowing for those who use ASL.

A practical note on accuracy: no auto-caption system performs well on proper nouns, technical terms, or names it hasn't seen before. If you're in a specialized field — medical, legal, technical — consider supplementing auto-captions with a custom vocabulary list where the platform allows it, or requesting CART for high-stakes situations where accuracy matters.
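To see how much a single misheard term can cost, here is a rough sketch that scores a caption against a correct reference transcript using word-level edit distance. This is one common way to express caption accuracy, not necessarily how any platform or provider measures it; the function name `word_accuracy` and the sample sentences are made up for illustration.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - (word edit distance / reference length), floored at 0."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 1.0 if not hyp else 0.0

    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from caption
            insertion = curr[j - 1] + 1  # extra word in caption
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return max(0.0, 1.0 - prev[-1] / len(ref))


# Hypothetical example: one drug name split into four wrong words.
spoken = "the patient needs ten milligrams of atorvastatin daily"
captioned = "the patient needs ten milligrams of a tour of statin daily"
print(f"{word_accuracy(spoken, captioned):.0%}")
```

One mangled term turns an otherwise perfect eight-word sentence into a 50% score here, which is why a custom vocabulary list or CART matters most where terminology is dense.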

Choosing the Right Tool for Each Situation

Watching TV or streaming content — look for SDH specifically, not just subtitles. CC on broadcast is federally mandated.
Virtual meetings — Teams and Zoom have the most capable built-in tools. Enable transcription in addition to live captions where available — the transcript is searchable after the fact.
Face-to-face conversations — Google Live Transcribe (Android) or Live Captions (iOS) with the phone placed between speakers.
Lectures or presentations — request CART through your institution or employer's accessibility office if ASR accuracy isn't sufficient for the content.
Phone calls — captioned telephone services (CapTel, CaptionCall, InnoCaption) display real-time captions of the other party's speech. Free to qualifying individuals under FCC relay service rules.
High-stakes situations — legal, medical, academic — request CART. ASR accuracy is not reliable enough for content where every word matters.
