i guess follow me @bethposting on bsky or pillowfort


discord username:
bethposting

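all over natural languages--and in plenty of other places--you see the same pattern in how common units such as letters, sounds, or words are: the frequency of a unit is roughly inversely proportional to its rank, where rank 1 is the most common unit, rank 2 the second most common, and so on. this pattern is known as zipf's law.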

here's a simplified example. imagine a language with five letters: ABCDE. let's say A is the most common letter, B is the second most common, and so on. we'd then expect A to occur twice as often as B, three times as often as C, and so on. we might have counts as follows:

  • A: 60
  • B: 30
  • C: 20
  • D: 15
  • E: 12

and as shares of the total (137 letters), rounded to the nearest percent, that gives:

  • A: 44%
  • B: 22%
  • C: 15%
  • D: 11%
  • E: 9%
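
if you want to play with the numbers yourself, here's a minimal python sketch of the toy example. it assumes an idealized language where frequency is exactly proportional to 1/rank; the function name `zipf_shares` and the starting count of 60 are just illustrative choices, not anything standard.

```python
from fractions import Fraction

def zipf_shares(n_units):
    """Ideal share of each rank when frequency is proportional to 1/rank."""
    # weight of rank r is 1/r; exact fractions avoid float rounding noise
    weights = [Fraction(1, rank) for rank in range(1, n_units + 1)]
    total = sum(weights)
    return [w / total for w in weights]

letters = "ABCDE"
shares = zipf_shares(len(letters))
top_count = 60  # assumed count for the most common letter, as in the example

for rank, (letter, share) in enumerate(zip(letters, shares), start=1):
    count = top_count / rank  # rank 2 gets half of rank 1, rank 3 a third, ...
    print(f"{letter}: count {count:g}, share {float(share):.1%}")
```

changing `letters` to a longer string shows how the top unit's share shrinks as more units are added, while the 2-to-1 ratio between rank 1 and rank 2 stays the same.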

in english, the most common word, "the", makes up about 7% of some big collections of text, while the second most common word, "of", makes up about 3.5%. that's half as much, just as the pattern predicts.
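
to check this on a text of your own, something like the following rough sketch would work. the function name `rank_frequencies` is made up for illustration, and the word-splitting rule (lowercase runs of letters and apostrophes) is a simplifying assumption; real corpus studies tokenize more carefully.

```python
import re
from collections import Counter

def rank_frequencies(text):
    """Return (rank, word, share of all words) tuples, most common first."""
    # crude tokenizer: lowercase everything, keep runs of a-z and apostrophes
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return [
        (rank, word, count / total)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]

# tiny made-up sample, far too short to show the real pattern
sample = "the cat and the dog and the bird"
for rank, word, share in rank_frequencies(sample):
    print(rank, word, f"{share:.1%}")
```

on a short sample like this the shares won't follow the pattern closely; it tends to show up only in large amounts of text.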

