mcc
@mcc

GIRL: I long to be at your side as you journey to Etheria. But I—
BOY: Your place is here in the castle.

Do you see the problem here? No, you don't, because it doesn't show up in writing. The problem is that someone wrote out a script like this, and one line ended with "And I—" or "But—". And then (I'm assuming here, but it must be so) the script is fed into an asset pipeline where each line is independently animated and associated with a voice clip, and the voice actor records each of their lines one by one, probably alone, probably without getting to hear the lines before and after theirs. And so when you listen to this bit in the actual game, it's:

GIRL: "I long to be at your side as you journey to Etheria. But I—" [full-second pause as the game waits for her to finish her "natural" gesturing animation before moving the camera]

BOY: [200ms pause before speaking as he begins to "naturally" gesture] "Your place is here in the castle."

No!! That is not how people speak!! We tolerate this in writing because the emdash— is assumed to indicate an interruption in speech. An interruption! Obviously!! Not someone stopping for no reason in mid sentence and then someone else starting to speak a second or more later! You can't just read it out like that! No other form of theatre or filmic media interprets scripts this way, you associate this kind of line-reading with bad high school theatre, but it's all over video games. I'm watching Christine play Final Fantasy 16 and every third line in the opening section does this. This game has so much advanced technology. They've got some fancy face-scanning acting for facial animation. There's an early cutscene where a character is listlessly pushing leftovers around a plate that just feels like pure showing off. But they've got the same problem where they cannot naturally portray a person interrupting another person that the very earliest voice-acted games had, and it's either because of limitations in their directing or their fricking asset pipeline. In earlier games it was easy to just write this off as "bad voice acting", but the acting in this game is professional and might in fact be very good. Whatever's wrong here is systemic.

What makes this possible? Is it because game dialogue started as a text medium and transitioned to a filmic one? Is it because gamers got frog boiled into accepting bad acting? Did allowing dialogue to advance with the A button just override every other concern?



in reply to @mcc's post:

Anyway, things I like about Final Fantasy 16:

  1. Torgal (a tiny dog)
  2. Torgal is such a good boy
  3. Every time Torgal is not on screen I'm asking "Where's Torgal"
  4. Anytime they render fire it looks really good?

Uh… I guess that's it. Maybe they'll add some additional good elements later. They certainly don't seem to have understood what was interesting about Game of Thrones

What's funny is that the only games made in the past year that I've played this year were FIFA 23 and eFootball, and not only do both of those games not have this issue, FIFA 23 even has special dialogue recorded for when a goal is likely and one of the announcers is currently speaking about something else.

The FIFA 23 commentary engine might be the coolest part of that game, to be honest

Path of Exile does this correctly because the two "characters" are actually one character (programming-wise, they're not like fused together) so there's no cutting audio tracks or anything

the one case i can think of that makes enough sense is like, when you're doing single-dialogue boxes and a confirm-before-next-message system, since you would have to swap from always waiting for the player to advance to just advancing for them.

that in and of itself isn't that rare, but given that video games can be played with audio off and not everyone is a great reader, i do wonder how much of an impact this has over things

Voice actors basically never get to speak over the finished product to get the timing that close, and game cinematics don't usually have the equivalent of an "editor" that snugs the timing once all the pieces are together (distinct from a "director" who plots things out beforehand).

And of course multiple languages make it even harder.

There's also the problem of commercial games' demand for ever greater levels of photorealism straining against the medium's aesthetic baggage - the strict codification of genre, especially as that's entwined with mechanics; the many non-verisimilitudinous modes of representation.

This problem is unsolvable as long as the player has control of when dialog advances. I haven't played FF16 but it is in the lineage of games that traditionally cede control of dialog timing to the player. It makes sense to me that developers and players of JRPGs and visual novels would accept that this is just how life is.

Y'know, it's weird, because in FF15 they solved a similar "unsolvable" problem (live composed music that matches the timing of what the player is doing)

I think the solution is probably not a technological one per se though but a writing/conceptual one, and you don't have to give up "advancing". Like, have you ever noticed in printings of Shakespeare, whenever one character interrupts another they don't start a new iambic pentameter "line", they just finish out the line they interrupted? In printings they usually denote this with a long indent. One solution might be you could treat the interrupted line and its completion as one line for text advance purposes. Or just don't have characters interrupt each other…
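A rough sketch of the "treat the interrupted line and its completion as one line" idea, in Python. Everything here is hypothetical: the `Line` type, the `group_for_advance` helper, and the convention that a trailing em dash marks a cut-off line are assumptions made for illustration, not how any particular engine works.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    speaker: str
    text: str

    @property
    def is_interrupted(self) -> bool:
        # Assumed convention: a trailing em dash (U+2014) marks a cut-off line.
        return self.text.rstrip().endswith("\u2014")


def group_for_advance(lines: List[Line]) -> List[List[Line]]:
    """Group lines so that one button press covers an interrupted line
    together with the line that cuts it off."""
    groups: List[List[Line]] = []
    current: List[Line] = []
    for line in lines:
        current.append(line)
        if not line.is_interrupted:
            # A complete line closes the current group.
            groups.append(current)
            current = []
    if current:
        # The script ended on an interrupted line; keep it as its own group.
        groups.append(current)
    return groups


script = [
    Line("GIRL", "I long to be at your side as you journey to Etheria. But I\u2014"),
    Line("BOY", "Your place is here in the castle."),
]
groups = group_for_advance(script)
```

With the two lines from the post, `groups` holds a single group containing both, so the player would advance past the pair with one press and the game, not the player, would own the timing between them.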

Actually, the fact that (out of the three hours in the demo) the first (chronologically) notional scene has a ton of interruptions and after that I noticed none makes me wonder if they derived this rule on the fly after seeing how the first scene's voice acting was coming off (this is very unlikely because the chances they did the tutorial/chronological first scene first are not good).

based on a false premise and ignoring the reality that everyone used pencils while they had to but once someone developed a pen that worked in space and didn't leave bits of graphite floating around to short circuit the machine that is the only thing separating them from the cold harsh void immediately placed large orders?

iirc, the PS1 Final Fantasy games will show two textboxes on screen at once for any overlapping/interrupting dialog, and you can advance past them both with a single button press.

Once you have to position where each voice line comes from in a 3D space though, that probably gets really dicey if you want to chunk dialog together.

best version of this i've seen is in the Amarantus demo (which is a visual novel not bound by Voice Acting) where it will automatically advance the dialogue in such cases (and has a toggle for auto advancing or not with these specific sorts of cases)

Getting the timing right in an interruption is really difficult, because in reality 'interruptions' in speech are really overlaps – one actor is briefly talking over the other. So the pause between lines is actually supposed to be negative, and even a very brief gap is noticeable.

So to have actually-good interruptions in a video game, you actually need support for crosstalk, which is rare. It's one of those edge case things that often goes unsupported.
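To make the "negative pause" point concrete, here is a minimal sketch of scheduling clips on one timeline, where each clip's start is defined relative to the end of the previous clip. The `Clip` type, the `schedule` function, and every duration below are invented for illustration; a real dialogue system would also need to mix the overlapping audio.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Clip:
    speaker: str
    duration_ms: int
    # Gap between the end of the previous clip and the start of this one.
    # A negative value means this speaker starts while the previous
    # speaker is still talking (crosstalk).
    gap_before_ms: int = 200


def schedule(clips: List[Clip]) -> List[Tuple[str, int, int]]:
    """Return (speaker, start_ms, end_ms) for each clip."""
    timeline: List[Tuple[str, int, int]] = []
    prev_end = 0
    for i, clip in enumerate(clips):
        # The first clip starts at 0; later clips are anchored to the
        # end of the clip before them, and never start before time 0.
        start = 0 if i == 0 else max(0, prev_end + clip.gap_before_ms)
        end = start + clip.duration_ms
        timeline.append((clip.speaker, start, end))
        prev_end = end
    return timeline


# Hypothetical durations: the boy cuts in 150 ms before the girl's clip ends.
timeline = schedule([
    Clip("GIRL", 4000),
    Clip("BOY", 2500, gap_before_ms=-150),
])
# timeline == [("GIRL", 0, 4000), ("BOY", 3850, 6350)]
```

Because the offset is anchored to the end of the previous clip rather than to an absolute timestamp, the overlap survives if the first clip gets re-recorded at a different length.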

Getting the specific timing right is also very fiddly, and the workflow at a lot of studios is:

  • Writer writes line
  • Line gets tested in-game with temp audio ('stubs'), is deemed okay
  • Line gets recorded by the real VA
  • Ooops the exact timing between lines is now fucked up because the VA didn't read the line exactly like the stub
  • This never gets another revision pass to fix dialogue timings
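One cheap way a studio could catch the last two steps is a check that compares stub and final clip lengths and lists the lines whose timing probably needs another pass. This is only a sketch under assumptions: the function name, the line IDs, the durations, and the 100 ms tolerance are all made up.

```python
from typing import Dict, List


def lines_needing_retime(
    stub_ms: Dict[str, int],
    final_ms: Dict[str, int],
    tolerance_ms: int = 100,
) -> List[str]:
    """List line IDs whose final recording differs from the stub by more
    than tolerance_ms, so hand-tuned timing around them can be rechecked."""
    flagged: List[str] = []
    for line_id, stub in stub_ms.items():
        final = final_ms.get(line_id)
        if final is None:
            continue  # not recorded yet, nothing to compare
        if abs(final - stub) > tolerance_ms:
            flagged.append(line_id)
    return flagged


# Hypothetical clip lengths in milliseconds.
stubs = {"girl_01": 4000, "boy_01": 2500}
finals = {"girl_01": 4600, "boy_01": 2520}
flagged = lines_needing_retime(stubs, finals)
# flagged == ["girl_01"]
```

This does not fix anything by itself; it only produces the list that a revision pass would work from.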

Theoretically writers really should avoid writing for affordances the dialog system doesn't have, but people do anyway.

really, that should make it easier. Playing multiple audio files at once isn't hard, people talk over music and environmental noises all the time. And yet the only game I can think of with overlapping dialogue only has it because they didn't retime it when Shadow's English VA took a bit longer than his Japanese one.