across natural languages (and in plenty of other places) you see the same pattern in how common units like letters, sounds, or words are: if you sort the units from most to least common, the frequency of a unit is roughly inversely proportional to its rank. this is known as zipf's law.
here's a simplified example. imagine a language with five letters: ABCDE. let's say A is the most common letter, B is the second most common, and so on. we then expect A to occur twice as often as B, three times as often as C, and so on. we might have counts as follows:
- A: 60
- B: 30
- C: 20
- D: 15
- E: 12
as percentages of the total (137), that's roughly the following (rounded, so they add up to 101 rather than 100):
- A: 44%
- B: 22%
- C: 15%
- D: 11%
- E: 9%
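the arithmetic above can be sketched in a few lines of python. this is just a minimal illustration: the function names `zipf_counts` and `to_percentages` are made up for this example, and it assumes the idealized case where the count at rank r is exactly the top count divided by r.

```python
def zipf_counts(letters, top_count):
    """Idealized Zipf counts: the letter at rank r gets top_count / r (ranks start at 1)."""
    return {letter: top_count / rank for rank, letter in enumerate(letters, start=1)}


def to_percentages(counts):
    """Convert raw counts into percentages of the total."""
    total = sum(counts.values())
    return {letter: 100 * count / total for letter, count in counts.items()}


# letters are listed from most common to least common
counts = zipf_counts("ABCDE", 60)
percentages = to_percentages(counts)

for letter in counts:
    print(f"{letter}: {counts[letter]:.0f} ({percentages[letter]:.1f}%)")
```

starting from a top count of 60 keeps every count a whole number, since 60 is divisible by 1 through 5.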
in english, the most common word, "the", makes up about 7% of some big collections of text, while the second most common word, "of", makes up about 3.5%. that's half as much, which is what the pattern predicts for rank 2.
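if you want to check this kind of claim on your own text, one rough way is to count words, rank them, and compare each word's share with the top word's share divided by its rank. the sketch below is only an assumption about how you might do it: `rank_frequency` is a hypothetical helper, the tokenizer is a crude regex, and the sample sentence is a toy input that is far too small to show anything about real english.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, observed_share, predicted_share) rows, most common word first.

    predicted_share is the idealized Zipf value: the top word's share divided by the rank.
    """
    words = re.findall(r"[a-z']+", text.lower())  # crude tokenizer, an assumption
    ranked = Counter(words).most_common()
    if not ranked:
        return []
    total = len(words)
    top_share = ranked[0][1] / total
    return [
        (rank, word, count / total, top_share / rank)
        for rank, (word, count) in enumerate(ranked, start=1)
    ]


# toy input just to show the shape of the output; use a large corpus for real checks
sample = "the cat and the dog and the bird saw the fox"
for rank, word, observed, predicted in rank_frequency(sample)[:3]:
    print(f"{rank} {word}: observed {observed:.2f}, predicted {predicted:.2f}")
```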
you can use this to generate made-up words: pick sounds at random, weighted by how common they are. you'd probably need a bit of structure to make sure the results are at least sorta pronounceable, maybe by dealing with consonants and vowels separately and having a few word templates like CV and CVC. you might also have to deal with schwa usually not being the only vowel in a word, except in common function words.
here are some words that could plausibly be english, following this kind of logic:
- rin
- nis
- seeb
- vam
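a generator along these lines could look something like the sketch below. everything in it is an assumption for illustration: the consonant and vowel rankings are made up rather than measured from english, single letters stand in for sounds (so no spellings like "ee"), the two templates are weighted the same zipf-style way, and the schwa rule isn't handled at all.

```python
import random

# assumed rankings, most common first; made up for illustration, not measured
CONSONANTS = ["n", "r", "t", "s", "d", "l", "k", "m", "b", "v"]
VOWELS = ["i", "a", "e", "o", "u"]
TEMPLATES = ["CVC", "CV"]  # C = consonant slot, V = vowel slot


def zipf_weights(n):
    """Weights 1, 1/2, 1/3, ... for ranks 1 through n."""
    return [1 / rank for rank in range(1, n + 1)]


def pick(items, rng):
    """Pick one item from a ranked list, with probability proportional to 1 / rank."""
    return rng.choices(items, weights=zipf_weights(len(items)), k=1)[0]


def make_word(rng):
    """Pick a template, then fill each slot from the matching ranked list."""
    template = pick(TEMPLATES, rng)
    return "".join(
        pick(CONSONANTS if slot == "C" else VOWELS, rng) for slot in template
    )


rng = random.Random(0)  # fixed seed so repeated runs give the same words
print([make_word(rng) for _ in range(5)])
```

keeping consonants and vowels in separate ranked lists is what makes the templates work: each slot only ever draws from sounds that fit there, so every result has at least one vowel.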