Animation Lead on Wanderstop! She/Her & Transgenderrific! Past: Radial Games, Gaslamp Games



dog
@dog

Unicode is pretty cool (waits for anyone less obsessive than me to filter out of the post) unicode is pretty cool, but they've made a bunch of weird decisions over the years, and nothing has ever been weirder than their disastrous choice about encoding CJK characters.

Take a look at these two characters:

Really appreciate the different ways they're written and how seeing one of them implies which language it comes from. Wait a minute, those are the same character, aren't they? They sure are! But now look at these next two:

They sure look different, right? But they're the same bytes, all thanks to the curse of han unification.

See, back in the day Unicode was built with the assumption they'd be encoding a maximum of 65,536 characters total, but there are several languages that use Chinese-derived characters, and there are a lot of those characters. More specifically, between Chinese, Japanese, Korean and Vietnamese, if you counted all the different variations of each character, you were looking at over 100,000 characters. Wasn't going to fit, right? You could make Unicode larger, but what if... what if you made those languages smaller?

Enter han unification: the idea that instead of giving each language's version of a character its own codepoint, you'd define one single unified codepoint for each character and then define different ways that character should be rendered depending on the language of the text. But of course that gives you a new problem: if you just have a bunch of raw characters, you don't know how those characters should be rendered. If it's traditional Chinese, you render it one way. If it's Japanese, you render it another way. You have to have information from outside the text itself to figure out how it's supposed to look.

For example take those two characters above. When I got them to render differently, what I actually did was this:

<div lang="ja">直</div>
<div lang="zh">直</div>

That is, I used the HTML lang attribute to hint to the browser what the rendering should look like, so it knows whether to use the Japanese or Chinese variant of the character. Without that hinting, the browser just picks one - and it might get it wrong!
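If you want to check the "same bytes" claim yourself, here's a minimal sketch in Python. It assumes the character was copied straight out of each of the two divs above; the variable names are just for illustration.

```python
import unicodedata

# The same character, copied from each of the two <div> elements.
japanese_text = "直"  # content of <div lang="ja">
chinese_text = "直"   # content of <div lang="zh">

# Code point and UTF-8 encoding of the character.
code_point = f"U+{ord(japanese_text):04X}"
utf8_hex = japanese_text.encode("utf-8").hex()

print(japanese_text == chinese_text)    # True
print(code_point)                       # U+76F4
print(unicodedata.name(japanese_text))  # CJK UNIFIED IDEOGRAPH-76F4
print(utf8_hex)                         # e79bb4
```

Nothing in the string says "Japanese" or "Chinese". The only thing that differs between the two divs is the lang attribute wrapped around them.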

If you're guessing this might have pissed off a bunch of people, you're completely right. As far as I know, han unification single-handedly pushed back Unicode adoption in Japan by at least a decade, and it didn't really pick up until the iPhone era and international messaging kind of forced the issue. And, best of all: Unicode didn't even end up sticking with the 65,536 limit. Pretty soon after, they ended up swapping to new encoding methods that let them handle more than a million characters, but han unification was already finished, the damage was done, and we have to deal with it for the rest of our lives.
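For a sense of where "more than a million" comes from, here's a rough sketch. It assumes the encoding change in question is the UTF-16 surrogate pair mechanism, which the post doesn't name. U+20000 is just one example of a CJK ideograph that lives outside the original 16-bit range.

```python
# Today's code space runs from U+0000 to U+10FFFF: 17 planes of 65,536.
total_code_points = 0x10FFFF + 1

# A CJK ideograph beyond the original 65,536-character range.
ideograph = "\U00020000"

# In UTF-16 it takes two 16-bit units (a surrogate pair) instead of one.
utf16_units = ideograph.encode("utf-16-be").hex()

print(total_code_points)  # 1114112
print(utf16_units)        # d840dc00
```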



gamedeveloper
@gamedeveloper

After nearly 20 years, the Xbox 360 has been fully retired. Its digital storefront, which opened alongside the console's launch in 2005, is now closed.

Late last year, Microsoft announced that the marketplace would shut down. Players can no longer buy games, DLC, or movies for the console, but anything bought prior to the closure can be re-downloaded to Xbox One or Xbox Series X|S.

The end of the marketplace puts a final bow on the console. Throughout its lifetime, the Xbox 360 helped define the Xbox brand, thanks to Xbox Live, the dashboard (and its many evolutions), and key games like 1 vs. 100 and Gears of War.

Read the full article at Game Developer.