I've typed the first three tags of this post as "ß" (U+00DF LATIN SMALL LETTER SHARP S), "ẞ" (U+1E9E LATIN CAPITAL LETTER SHARP S), and "ss" (two times U+0073 LATIN SMALL LETTER S). Let's see what happens...
Not really surprising, because both "ß" and "ss" already existed as tags. Now let's see which one it gets transformed to (almost certainly "ß" though). I've typed the first tag of this post as "ẞ" (U+1E9E LATIN CAPITAL LETTER SHARP S).
I'm curious what would have happened if the uppercase "ẞ" tag had existed first. Would the lowercase "ß" tag then get merged into it, or would both coexist?
... wait, I can just test this, can't I? I've typed the first tag of this post as "new tag with uppercase ẞ".
So "new tag with uppercase ẞ" now exists with an actual uppercase ẞ in it. And going by the tag search, "new tag with uppercase ß" (with a lowercase "ß") is considered equivalent, but "new tag with uppercase ss" is not.
Not quite sure what that tells us about how Cohost does case normalization, but I think it means that they don't do a proper case-insensitive comparison and instead convert tags to lowercase before comparing them, or something: lowercasing turns "ẞ" into "ß" but leaves "ss" alone, which matches what the tag search shows.
For those unaware what all of this is about: although German has both a lowercase "ß" and an uppercase "ẞ", the uppercase version is not generally used, except in some special cases. Instead, "ß" is normally uppercased as "SS" (because "ß" is kind-of-but-not-completely-equivalent to "ss").
This is reflected in the Unicode uppercasing rules, making "ß" one of the few letters that will not "roundtrip" when converted to uppercase and back[1]. However, "ẞ" converted to lowercase results in "ß" and not "ss". This means that when you want to compare two strings case-insensitively, it's not always sufficient to convert both sides to lowercase! Because of this, Unicode also defines a separate mapping specifically for case-insensitive comparisons, called case folding, which converts both "ß" and "ẞ" to "ss" in a single step.
To demonstrate, in the Python REPL:
>>> "ß".upper()
'SS'
>>> "ß".upper().lower()
'ss'
>>> "ẞ".lower()
'ß'
>>> "ẞ".lower().upper()
'SS'
>>> "ß".casefold()
'ss'
>>> "ẞ".casefold()
'ss'
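To make the consequence for tag matching concrete, here is a minimal sketch comparing a lowercase-based check with a casefold-based one. The helper names `same_tag_lower` and `same_tag_casefold` are made up for illustration, and this is not a claim about how Cohost actually implements its tag lookup.

```python
def same_tag_lower(a, b):
    """Hypothetical helper: compare two tags by lowercasing both sides."""
    return a.lower() == b.lower()


def same_tag_casefold(a, b):
    """Hypothetical helper: compare two tags using Unicode case folding."""
    return a.casefold() == b.casefold()


tags = ["ß", "ẞ", "ss", "SS"]

# Print every pairing so the difference between the two strategies is visible.
for a in tags:
    for b in tags:
        print(a, b, "lower:", same_tag_lower(a, b), "casefold:", same_tag_casefold(a, b))
```

With lowercasing, "ß" and "ẞ" compare equal but neither matches "ss", which is the behaviour the tag search showed. With case folding, all four spellings compare equal.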
-
[1] It's also one of the even fewer letters where converting to uppercase results in a different number of letters!
As expected, #i already exists, so #İ gets normalized to #i, which is also what happens to #I. #ı remains unmolested, though.
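For comparison, this is roughly what Python's default (locale-independent) case mappings do with the four i's. It only shows the Unicode defaults as Python implements them, not necessarily what Cohost does; the `codepoints` helper is just something I made up to make invisible combining characters visible.

```python
def codepoints(s):
    """Hypothetical helper: list the code points of a string as U+XXXX."""
    return [f"U+{ord(c):04X}" for c in s]


letters = ["i", "I", "ı", "İ"]

# Record both the lowercased and the case-folded form of each letter.
lowered = {ch: codepoints(ch.lower()) for ch in letters}
folded = {ch: codepoints(ch.casefold()) for ch in letters}

for ch in letters:
    print(codepoints(ch), "lower:", lowered[ch], "casefold:", folded[ch])
```

With these defaults, "İ" lowercases to "i" followed by U+0307 COMBINING DOT ABOVE rather than to a bare "i", and "ı" is left alone by both `lower()` and `casefold()`.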
