and why they might be better off as 6 bits
wait I thought a byte was always 8 bits?
nope! at its smallest, a byte is just however many bits it takes to store a single "character". on modern systems, this is pretty much always 8 bits, but for a good while, the most common byte size was instead 6 bits. I'll get into why, and why we swapped, as this post goes on.
so we used to use 6 bit bytes?
yes, in the earliest binary computers, and in many later designs until around the mid 1970s, that was the norm. the reason we swapped is partly the fault of the American Standards Association, but also partly the fault of Johannes Gutenberg, and the fault of some monks in the late 700s choosing a new, faster way to write Latin.
let's start in the 1950s. basically all of the pre-electronic, mechanical computers were designed to work with decimal digits. top-of-the-line engineering calculators used ten decimal digits of precision, far more than the typical three significant digits you could get on a slide rule. so, when new, fully electronic computers came out, working in binary, they needed at least enough precision to match these calculators, or engineers would have little interest in them.
it turns out that 34 bits is enough to pull this off (having 17,179,869,184 possible values, more than the 10 billion you need!). adding a sign bit, this gives 35... but both 34 and 35 are rather ugly numbers of bits. the first is 2 x 17, the second is 5 x 7, and neither of these are very nice ways to split up the full word. so, they added one more bit, bumping this up to 36 bits. this allowed them to divide the word into 6 x 6 bit characters, each with enough space to fit all 26 letters and 10 digits used in English, plus punctuation, spaces, and some control characters.
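if you want to sanity check that arithmetic, here's a quick sketch in Python (purely illustrative, obviously nothing a 1950s machine ever ran):

```python
# ten decimal digits of precision means 10**10 distinct values
needed = 10 ** 10

print(2 ** 33 >= needed)  # False: 33 bits falls short
print(2 ** 34 >= needed)  # True: 17,179,869,184 values, just enough

# 34 bits plus a sign bit is 35, rounded up to a 36 bit word,
# which splits evenly into six 6 bit characters
print(36 // 6)   # 6 characters per word
print(2 ** 6)    # 64 codes per character: room for 26 letters, 10 digits, and more
```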
later computers would come out with mostly 18 and 12 bit word sizes, but always, in these early days, with the concept of the 6 bit byte in mind. also, sometimes, they cared about binary-coded decimal, which needs a multiple of 4 bits to fit each decimal digit into binary. this made 12, 24, and 36 bit words preferable.
some early 16 bit computers would come about, such as the MIT Whirlwind in the early 1950s, but these were unconcerned with handling written words, as far as I can tell. they were designed to handle only numbers, so they didn't need a multiple of 6 bits of width.
however, while IBM had its BCD codes for 6 bit characters, these codes weren't standardized, even between different IBM machines. you could never be sure that punch cards or tapes taken from one machine would be readable on another, unless it was the same model of machine. we needed a proper standard, and we'd get one in the 1960s from the American Standards Association, in the form of ASCII.
but first, let's look further back...
something like 770 CE
near as we can tell, the concept of minuscule Latin script, as we know it today, came from an abbey somewhere to the north of Paris, sometime before 778 CE. "Carolingian minuscule" was created to make writing faster and more legible for scribes and readers of Latin within the churches of Europe. the Irish "Insular script" and the widespread "uncial" scripts were beautiful, but not always that legible in comparison. by using majuscule characters for headers and the starts of sentences, but minuscule characters within sentences, a more familiarly modern style of writing started to emerge.
this style would soon be common in official writing, posted bills, and so on, anywhere the Latin script was used. through cultural exchange, some other languages with similar enough alphabets would do the same, so Greek and Cyrillic eventually got their own minuscule forms.
then forward to the 1400s
as Johannes Gutenberg worked on a movable type printing press, it was plainly obvious to him that he needed both majuscule and minuscule letters. after all, proper writing by hand or by woodcut used both! he also needed many other special characters, diacritics, ligatures, and so on. the resulting metal letters and glyphs were arranged into cases, which were organized according to the needs of the printer.
over time, it became the norm to put majuscule letters in an upper case and minuscule letters in a lower case for the printing press. while Gutenberg himself likely didn't come up with this standard, his decision to ensure all of these letters were present lets us draw a direct line to why we have this concept of "upper and lower case" nowadays.
and back to the 1960s
when the ASA decided what to do for a computer character standard, they were only concerned with English letters, which kept the character set they needed small. but, they were also concerned with both upper and lower case letters, due to the centuries-old printing standards! since there isn't room in a mere 64 glyphs for all the upper and lower case characters, plus punctuation, spaces, control characters, and so on, they couldn't fit this into only 6 bits. (26 + 26 + 10 already takes up 62 of the 64 codes, leaving only 2 for everything that isn't a letter or number!)
they apparently briefly considered making the new standard 6 bits, but using shift codes to swap between code pages. two 6-bit sets would be enough to cover all the characters they needed, and would make the encoding very compact, saving both transfer time and storage space. however, data transmission at the time had error rates that were too high for this, and data was often sent with zero error correction. (for example, over the 300 baud frequency-shift keying modems that were the norm for a long time!)
without corrections, one error in a shift code could make long sections of the transmission illegible. for this reason, they gave up on the idea of using a 6 bit encoding, and moved to 7 bits instead. the encoding they made in the end is what we now know as ASCII.
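to get a feel for why shift codes are so fragile, here's a toy two-page 6 bit scheme in Python. to be clear, this is my own made-up illustration, not the encoding the ASA actually considered:

```python
# toy scheme: page 0 holds upper case, page 1 holds lower case,
# and code 63 is reserved as a "shift" that toggles between the two pages.
UPPER = [chr(ord('A') + i) for i in range(26)] + list("0123456789 .,!?")
LOWER = [chr(ord('a') + i) for i in range(26)] + list("0123456789 .,!?")
SHIFT = 63

def decode(codes):
    page, out = 0, []
    for c in codes:
        if c == SHIFT:
            page ^= 1  # hop to the other code page
        else:
            out.append((UPPER if page == 0 else LOWER)[c])
    return "".join(out)

# "Hi there" encoded by hand: H, shift, i, space, t, h, e, r, e
message = [7, SHIFT, 8, 36, 19, 7, 4, 17, 4]
print(decode(message))    # Hi there

# now flip that one shift code into an ordinary character code...
corrupted = [7, 0, 8, 36, 19, 7, 4, 17, 4]
print(decode(corrupted))  # HAI THERE - everything after the error stays in the wrong case
```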
while modern computers mostly use Unicode to send text, ASCII is still embedded in the first 128 code points of Unicode, in the exact same positions. so, as long as UTF-8 text uses no characters outside that range, it's byte-for-byte identical to ASCII, and can still be read by older systems.
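you can see that compatibility for yourself with a couple of lines of Python:

```python
text = "plain old ASCII text!"

# every character here is one of the first 128 Unicode code points...
print(all(ord(c) < 128 for c in text))               # True

# ...so its UTF-8 bytes are identical to its ASCII bytes
print(text.encode("utf-8") == text.encode("ascii"))  # True

# one character outside that range, and the two diverge
print("café".encode("utf-8"))                        # b'caf\xc3\xa9' - the é takes two bytes
```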
microcomputers
as microchips became more common in the late '70s and throughout the '80s, the computers being made at the time tended to use ASCII for characters on screen, or printed onto paper. since 7 bit bytes would be awkward, and 8 bits is a multiple of 4 that fits two binary-coded decimal digits, newer machines moved to an 8 bit byte. as a result, the 1970s saw microchip CPUs emerge with 8 and 16 bit word widths.
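as a rough sketch of how neatly two decimal digits pack into one 8 bit byte (this is generic packed BCD, not any particular machine's format):

```python
def pack_bcd(tens, ones):
    # one decimal digit per 4 bit nibble: tens up high, ones down low
    return (tens << 4) | ones

def unpack_bcd(byte):
    return (byte >> 4) & 0xF, byte & 0xF

b = pack_bcd(4, 2)
print(f"{b:08b}")     # 01000010 - the digits 4 and 2, side by side in one byte
print(unpack_bcd(b))  # (4, 2)
```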
the 1980s saw the rise of microcomputers, computers small enough to sit on a desk, rather than being the desk or the entire room. these, of course, used microchips, and thus saw the use of CPUs like the Intel 8080 and the Zilog Z80 (8 bit), the MOS Technology 6502 (8 bit), and the Motorola 68000 (16/32 bit). as these became widespread, so too did the concept of the 8 bit byte.
and that's why bytes are 8 bits now.
since no standard emerged for a 6 bit byte that could encode the entire ASCII space, and now a modern 6 bit encoding would also need to handle the entire Unicode space, we've pretty much stuck with 8 bit bytes for the past 40 years and change.
this isn't to say we couldn't swap to a 6 bit byte again, if we had the hardware and the encodings to allow for it. maybe at some point in the future, the efficiency that such a tight encoding could offer would drive us in that direction, especially if there are other factors making the hardware attractive.
but, for now, we live in a world of 8 bit bytes, and in great part, it's thanks to English having "upper and lower case" letters.
