In my recent adventures into indie animated series on YouTube, I've found that some of them have had someone (presumably a fan or a staff member) go to the effort of manually adding closed captions to the video. Which is great, since the automated captions on YouTube are still pretty garbage.
(Why do the automated captions display words one-by-one instead of just printing out the maximum amount that can fill two lines?)
Only problem is that the captions I've seen are pretty amateur. That isn't to discount the work that went into writing and timing them, and they even went to the effort of including audio descriptions for a deaf audience. The issue lies in improper formatting and some sloppy presentation.
So, let's go through some of the common and less-common fumbles. I mainly operate on something approaching the official Netflix subtitle style guide, but I've also been doing this off-and-on for about 10 years.
- A big one first: do not stack multiple exclamation points or question marks in a row. This does not actually convey any useful information to the viewer, and it just looks sloppy. Trust that the body language on-screen will convey the intensity of someone's statement
- This is a real nitpicky thing since it's a very soft rule that's broken all the time, but I'll say it anyway: an exclamation point followed by a question mark ("!?") is not grammatically correct. Games and anime subs constantly ignore this, so it's just a question of how strictly you want to adhere to grammar rules. If you use the interrobang, I will find you
- Try to not use exclamation points too often. A character slightly raising their voice does not necessarily mean a statement is being exclaimed
- Don't have too many resolved sentences in the same line (e.g. "Don't! It's dangerous! Be careful!"). These can usually be combined into fewer statements ("Don't, it's dangerous! Be careful!"). This is a general polish thing
- It's better to use italics for emphasis than putting a word in all-caps. If your given format doesn't support italics, then it's fine to use caps
- Subtitling programs like Aegisub have a useful feature that marks lines with progressive shades of red to indicate if the characters-per-second are above a safe level for a viewer to easily process. Slightly red is usually fine, but if you're approaching solid red, consider combining it into another line to give it more time (though sometimes this is sadly an unavoidable problem)
- Small but major: always try to have a gap of time between subtitles rather than having them bunched together. This helps the viewer process that the caption has switched to a new line
- I don't see this rule broken very often, but I'll point it out anyway: avoid going over two lines on-screen at a given time (you can occasionally go to 3 if several characters are talking in quick succession)
- If you have two different characters' dialogue on-screen in the same caption (a good practice for adjacent statements), put a hyphen at the start of each line to indicate they belong to different speakers. I use hyphen and space, but the rule is usually just a hyphen
- If you want to go hardcore Netflix mode, always point out who the speaker is when they're not visibly speaking. You only need to include the name on the line where a non-visible speaker begins talking; the lines that follow don't need it if they belong to that same speaker
- If a character is repeating a phrase, you don't need to write out every single instance that's used. Instead of something like "No no no no no no no no!" use "No, no, no…!"
- Use an ellipsis to convey a statement trailing off, and a double-hyphen to convey a statement being interrupted/stammered. There's some overlap between an interruption and a trail-off, so use your best judgment. I recommend using the Unicode ellipsis character (Alt + 0133 on Windows) in case you have to do a find-and-replace on any periods. Also avoid having too many ellipses close together; it looks bad
- Do not use two periods in any circumstance. This is really easy to fix if you're using the Unicode ellipsis (find and replace two periods with one)
- If there's a pause longer than about 3-4 seconds in a statement, that's usually grounds to break it up into separate lines
- Instead of individually writing out all the parts of a non-verbal line (e.g. "Ha, haha, ha!"), use descriptors inside of brackets ([laughs derisively])
- Capitalize non-name titles when they're used in place of a name or attached to one, i.e. when someone is being referred to directly by that title
- "My father.", "I'm waiting for Father to pick me up.", "Listen to your father."
- "The prince has arrived.", "The noble Prince Thomas has arrived!"
- Also remember to capitalize titles like Your Highness
- Never have multiple separate captions appear in different places on-screen at the same time. Don't do the thing of having background dialogue at the top of the screen; it creates visual chaos for a viewer and it's shocking that services like Crunchyroll still think it's acceptable. You have to be an absolutist about whether a line is important enough that a deaf/hard-of-hearing viewer needs to know it's being said. Include it as a main line, put it into an audio description like [audience chattering], or just don't include it at all
- Try to keep lines from occupying more than 50% of the screen width, and bias towards natural rests in statements (like after commas) for line breaks to avoid chopping up the dialogue unnecessarily (e.g. avoid things like "He is working on the knight's [line break] weapon"). This is such a bad problem in video games
- Small but very important: when space allows, have a line begin about half a second before the audio begins and also end half a second after the statement is concluded. People often register on-screen text a little slower than their auditory processing, so it helps for ease of viewing
- If two characters' lines butt up against each other (like one character interrupting another), aim to have the first line end a tiny bit into the second statement before changing to the next line. I can't really explain how this helps, but it does
- Avoid trying to modulate a word when a character puts emphasis on a syllable (e.g. "See you guys laterrrrrrrr!"). This creates words that are not words and are thus harder to visually process. Either use punctuation like an ellipsis, put the word into italics, or just trust the viewer to recognize the body language
- Big one aimed primarily at people doing the subs on these indie animated shows: do not try to insert your own comedy into the subtitles. Stuff like emoticons, "goofy" comments/descriptors, etc. It's just distracting and annoying because a viewer's attention is being pulled away from the thing they're actually here to watch
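A few of the rules above are mechanical enough that a script could catch them before a human pass. Here's a rough Python sketch of that idea. It only covers things that can be checked without watching the video: stacked punctuation, two periods, more than two lines, reading speed, and missing gaps between captions. The `Cue` class, the function names, and the thresholds (20 characters per second, an 80 ms gap) are all made up for this example; they are not values taken from the Netflix guide or from Aegisub, so tune them to whatever standard you actually follow.

```python
import re
from dataclasses import dataclass

# Assumed thresholds for illustration only; adjust to your own style guide.
MAX_CPS = 20.0    # characters per second before a cue is "too fast"
MIN_GAP = 0.08    # seconds of blank screen wanted between cues
MAX_LINES = 2     # on-screen lines allowed per cue


@dataclass
class Cue:
    """Hypothetical caption record: times in seconds, lines split by newlines."""
    start: float
    end: float
    text: str


def lint_text(text):
    """Return a list of punctuation/layout problems found in one cue's text."""
    problems = []
    if re.search(r"[!?]{2,}", text):
        problems.append("stacked ! or ? marks")
    # Exactly two periods (three or more are handled by the next check).
    if re.search(r"(?<!\.)\.\.(?!\.)", text):
        problems.append("two periods in a row")
    if "..." in text:
        problems.append("three periods instead of the ellipsis character")
    if text.count("\n") + 1 > MAX_LINES:
        problems.append("more than two lines")
    return problems


def chars_per_second(cue):
    """Characters shown per second of display time (line breaks not counted)."""
    duration = cue.end - cue.start
    if duration <= 0:
        return float("inf")
    return len(cue.text.replace("\n", "")) / duration


def lint_cues(cues):
    """Check every cue; return (index, problems) pairs for cues with issues."""
    report = []
    for i, cue in enumerate(cues):
        problems = lint_text(cue.text)
        if chars_per_second(cue) > MAX_CPS:
            problems.append("reading speed too high")
        if i > 0 and cue.start - cues[i - 1].end < MIN_GAP:
            problems.append("no gap after previous cue")
        if problems:
            report.append((i, problems))
    return report


if __name__ == "__main__":
    cues = [
        Cue(0.0, 2.0, "Don't, it's dangerous!!"),
        Cue(2.0, 2.5, "Wait.. what are you doing over there"),
        Cue(4.0, 6.0, "No, no, no…!"),
    ]
    for index, problems in lint_cues(cues):
        print(index, problems)
    # 0 ['stacked ! or ? marks']
    # 1 ['two periods in a row', 'reading speed too high', 'no gap after previous cue']
```

Note that this flags "!?" along with "!!" and "??", which is stricter than the soft rule described above, and it can't judge anything that depends on context (whether an exclamation point is earned, where a line break reads naturally, who is speaking). Those still need a person.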
Note that, for anyone who's ever seen the work I've done (which is pretty much limited to my circle of friends), I might not always adhere to these rules one-hundred percent, but they're very good guidelines that I always try to follow.
As a little rant session, captioning is one of those very important things for media to have in general, but is seen as low-skill labor because it's technically something anyone can do, to the point of some services trying to make a push towards more automation in the space to save money.
Granted, that automation will almost always adhere to the rules, but it isn't magic and is lacking context from what's happening on-screen. You'll always need people at the end of the day to do this job, and it's labor that does require knowledge of a particular set of rules, standards, and guidelines. Plus, it's incredibly time-consuming depending on the length and line density of the media (when I briefly did timing for Gamecenter CX, each one-hour episode would take twenty hours or more to finish).
For all of Netflix's issues (and their extreme inconsistency about including separate caption tracks for anime dubs), they've been adhering to a certain bar of quality for a very long time on many, many shows and movies. It's a consistency that I wish was more common across the streaming services, and more understood by people doing the (presumed) charity work of captions for shows on YouTube.
