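all over natural languages (and in plenty of other places) you see the same pattern in how common units such as letters, sounds, or words are: the frequency of a unit is roughly inversely proportional to its rank, where rank 1 is the most common unit, rank 2 is the second most common, and so on.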
here's a simplified example. imagine a language with five letters: ABCDE. let's say A is the most common letter, B is the second most common, and so on. we then expect A to occur twice as often as B, three times as often as C, and so on. we might have counts as follows:
- A: 60
- B: 30
- C: 20
- D: 15
- E: 12
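dividing each count by the total of 137 gives percentages like these (rounded to whole numbers, so they add up to slightly more than 100%):
- A: 44%
- B: 22%
- C: 15%
- D: 11%
- E: 9%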
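
the arithmetic behind the two lists above can be sketched in a few lines of python. this is only an illustration of the "count is proportional to 1/rank" idea, not a real frequency analysis; the function names `zipf_counts` and `shares` are made up for this example, and the top count of 60 is just the number from the toy language.

```python
def zipf_counts(top_count, n_units):
    """Expected count at each rank 1..n_units if count is proportional to 1/rank."""
    return [top_count / rank for rank in range(1, n_units + 1)]


def shares(counts):
    """Fraction of the total that each count makes up."""
    total = sum(counts)
    return [count / total for count in counts]


letters = "ABCDE"                         # the toy five-letter language
counts = zipf_counts(60, len(letters))    # assumes A, the top letter, occurs 60 times

for letter, count, share in zip(letters, counts, shares(counts)):
    # print each letter with its expected count and its share to one decimal place
    print(f"{letter}: count={count:.0f}, share={share:.1%}")
```

the counts come out as 60, 30, 20, 15, and 12, and the shares are those counts divided by their sum. changing the top count scales every count but leaves the shares the same, since only the ratios between ranks matter.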
in english, the most common word "the" makes up about 7% of some big collections of text, while the second most common word "of" makes up about 3.5%, roughly half as much, just as the pattern predicts.