1.2.9 Audio-only (Live)
A text alternative must be provided for live audio-only content, delivering equivalent information in real time for users who cannot hear the audio.
What this rule means
WCAG 1.2.9 requires that live audio-only content — such as live radio broadcasts, audio-only podcasts streamed in real time, conference call audio, and emergency audio announcements — provide a real-time text alternative. This is typically delivered as live captions or a real-time text stream that conveys the spoken content as it happens.
This Level AAA criterion extends the concept of live captions (1.2.4, which covers synchronized media) to audio-only live content. The key difference is that there is no video component — only audio being broadcast in real time.
Why it matters
Live audio events exclude deaf and hard-of-hearing users entirely without a text alternative. Unlike prerecorded content where a transcript can be provided after the fact, live audio is time-sensitive: the information has immediate value that diminishes or disappears once the event is over.
Real-time text alternatives are critical for emergency communications, live news audio feeds, and interactive audio events like radio call-in shows where participation depends on understanding the content as it happens.
A post-event transcript is better than nothing, but it does not satisfy this criterion. The requirement is for real-time access during the live broadcast.
Related axe-core rules
No axe-core rules address live audio-only content. Automated tools cannot detect live audio streams or verify whether real-time text alternatives are being provided. This criterion requires testing during actual live events.
How to test
- Identify all live audio-only content on the platform (live radio, audio streams, conference calls).
- During a live audio event, check whether a real-time text alternative is displayed.
- Evaluate the latency — text should appear within a few seconds of the spoken content.
- Verify that the text accurately represents the spoken content, including speaker identification.
- Check that the text alternative is accessible via screen readers and other assistive technologies.
- Confirm that meaningful non-speech sounds are described in the text stream.
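The latency check above can be made concrete by timestamping when a phrase is spoken against when its caption appears. A minimal sketch of such a spot-check follows; the 5-second threshold is an illustrative target for "within a few seconds," not a number specified by WCAG:

```javascript
// Spot-check caption latency. Each sample pairs the time a phrase was
// spoken with the time its caption appeared (both in milliseconds).
// The 5000 ms default threshold is an assumption, not a WCAG value.
function captionLatencyReport(samples, thresholdMs = 5000) {
  const lags = samples.map((s) => s.captionShownAt - s.spokenAt);
  return {
    meanMs: lags.reduce((a, b) => a + b, 0) / lags.length,
    worstMs: Math.max(...lags),
    withinThreshold: Math.max(...lags) <= thresholdMs,
  };
}

// Example: three utterances captioned 2-4 seconds after being spoken.
const report = captionLatencyReport([
  { spokenAt: 0, captionShownAt: 2000 },
  { spokenAt: 10000, captionShownAt: 13000 },
  { spokenAt: 20000, captionShownAt: 24000 },
]);
```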
How to fix
Implement a live text stream alongside the audio player:

```html
<div role="region" aria-label="Live audio broadcast with real-time transcript">
  <audio controls autoplay>
    <source src="/live-stream" type="audio/mpeg" />
    Your browser does not support the audio element.
  </audio>
  <div
    id="live-transcript"
    role="log"
    aria-live="polite"
    aria-label="Real-time transcript"
    class="live-transcript-panel"
  >
    <!-- Transcript lines injected in real time -->
  </div>
</div>
```
Connect to a real-time captioning service via WebSocket:

```javascript
const transcriptEl = document.getElementById("live-transcript");
const ws = new WebSocket("wss://caption-service.example.com/stream");

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  const line = document.createElement("p");
  if (data.speaker) {
    const speaker = document.createElement("strong");
    speaker.textContent = `${data.speaker}: `;
    line.appendChild(speaker);
  }
  line.appendChild(document.createTextNode(data.text));
  transcriptEl.appendChild(line);
  // Auto-scroll to the latest line
  transcriptEl.scrollTop = transcriptEl.scrollHeight;
};
```
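One refinement worth considering: unconditionally scrolling to the bottom on every message yanks the view away from a reader who has scrolled up to reread earlier lines. A common fix is to follow new lines only when the reader is already at the bottom. A sketch of that predicate (the 24-pixel tolerance is an assumption to absorb sub-pixel rounding):

```javascript
// Decide whether to auto-scroll: follow the newest line only when the
// reader is already at (or within slackPx of) the bottom of the panel.
function shouldFollowLatest(scrollTop, clientHeight, scrollHeight, slackPx = 24) {
  return scrollHeight - (scrollTop + clientHeight) <= slackPx;
}

// In the onmessage handler, check before appending, then scroll after:
//   const follow = shouldFollowLatest(
//     transcriptEl.scrollTop, transcriptEl.clientHeight, transcriptEl.scrollHeight
//   );
//   transcriptEl.appendChild(line);
//   if (follow) transcriptEl.scrollTop = transcriptEl.scrollHeight;
```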
As a stopgap, the Web Speech API can produce a rough real-time transcript in supporting browsers. Note that unmonitored automatic recognition is unlikely to be accurate enough on its own (see Common mistakes below):

```javascript
// Browser support varies; Chromium-based browsers expose the
// prefixed constructor, so feature-detect both forms.
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();
recognition.continuous = true;
recognition.interimResults = true;
recognition.lang = "en-US";

const transcriptEl = document.getElementById("live-transcript");
let currentParagraph = null;

recognition.onresult = (event) => {
  if (!currentParagraph) {
    currentParagraph = document.createElement("p");
    transcriptEl.appendChild(currentParagraph);
  }
  // Show the latest (possibly interim) result, replacing its text as
  // recognition firms up, then start a new paragraph once it is final.
  const result = event.results[event.results.length - 1];
  currentParagraph.textContent = result[0].transcript;
  if (result.isFinal) {
    currentParagraph = null;
  }
};
recognition.start();
```
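Recognition sessions also end on their own after silence or transient errors, so the fallback above needs to be restarted to keep the transcript flowing. One hedged approach is to restart from the recognition object's `onend` handler with a capped exponential backoff; the delay values below are illustrative assumptions:

```javascript
// Capped exponential backoff for restarting a dropped caption source
// (speech recognition session or WebSocket). Delays are illustrative.
function restartDelayMs(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Sketch of use in the browser (not runnable outside one):
//   let attempt = 0;
//   recognition.onend = () => {
//     setTimeout(() => recognition.start(), restartDelayMs(attempt++));
//   };
//   recognition.onresult = () => { attempt = 0; /* ...append text... */ };
```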
Common mistakes
- Providing only a post-event transcript instead of real-time text during the live broadcast.
- Using automated speech recognition without monitoring for errors, producing unreliable text.
- Not providing speaker identification in the real-time text stream.
- Placing the text alternative in a location that is difficult to find or not associated with the audio player.
- Failing to describe non-speech audio elements such as music, sound effects, or significant pauses.
- Not making the text stream accessible to assistive technologies (missing ARIA roles or live regions).
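The last two mistakes can be avoided at the rendering layer. A sketch of a line formatter that identifies speakers and conveys non-speech sounds in square brackets, a common captioning convention; the `speaker`/`text`/`sound` field names are assumptions about the caption payload, not a standard schema:

```javascript
// Render one transcript entry as plain text. Non-speech events
// (music, applause, long pauses) are conveyed in square brackets.
// The entry field names are an assumed payload shape, not a standard.
function transcriptLineText(entry) {
  if (entry.sound) return `[${entry.sound}]`;
  return entry.speaker ? `${entry.speaker}: ${entry.text}` : entry.text;
}

const spoken = transcriptLineText({ speaker: "Host", text: "Welcome back." });
// → "Host: Welcome back."
const sound = transcriptLineText({ sound: "station jingle plays" });
// → "[station jingle plays]"
```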