Accessibility isn't like icing on a cake. It's never anything you put on a backlog with the hope of introducing it later. Captions and screen reader descriptions are the only way many users can experience your videos, and in some jurisdictions, they're even required by law or regulation.
To add captions or screen reader descriptions to a web video, add a
<track> tag to a
<video> tag. In addition to captions and screen reader descriptions, tags may also be used for subtitles and chapter titles. The can also help search engines understand what's in a video. Those capabilities are outside the scope of this article.
To add captions to your video add a track element as a child of the video element:
<source src="https://storage.googleapis.com/webfundamentals-assets/videos/chrome.webm" type="video/webm" />
<source src="https://storage.googleapis.com/webfundamentals-assets/videos/chrome.mp4" type="video/mp4" />
<track src="chrome-subtitles-en.vtt" label="English captions"
kind="captions" srclang="en" default>
<track src="chrome-subtitles-zh.vtt" label="中文字幕"
<p>This browser does not support the video element.</p>
<track> tag is similar to the
<source> element in that both have a
src attribute that points to referenced content. For a
<track> tag, it points to a WebVTT file. The
label attribute specifies the how a particular track will be identified in the interface. To provide tracks for multiple languages add a separate
<track> tag for each WebVTT file you're providing and indicate the language using the
srclang attribute. The
default attribute indicates which language track is the default.
Define captions in track file
Below is a hypothetical WebVTT file for the demo linked to above. The file is a text file containing a series of cues. Each cue is a block of text to display on screen and the time range during which it will be displayed.
00:00.000 --> 00:04.999
Man sitting on a tree branch, using a laptop.
00:05.000 --> 00:08.000
The branch breaks, and he starts to fall.
Each item in a track file is called a cue. Each cue has a start time and end time separated by an arrow, with cue text in the line below. Cues can optionally also have IDs:
manuscript in the example below. Cues are separated by an empty line.
00:00:10.000 --> 00:00:12.500
Left uninspired by the crust of railroad earth
00:00:13.200 --> 00:00:16.900
that touched the lead to the pages of your manuscript.
Cue times are in hours:minutes:seconds:milliseconds format. Parsing is strict. Numbers must be zero padded if necessary: hours, minutes, and seconds must have two digits (00 for a zero value) and milliseconds must have three digits (000 for zero). There is an excellent WebVTT validator at Live WebVTT Validator, which checks for errors in time formatting, and problems such as non-sequential times.
You can create a VTT file by hand, thought there are services that will create them for you.