This script generates a number of usually pronounceable and frequently amusing fake words loosely based on a specific language. They can be used, for example, to name characters in games and whatnot. Maybe you need to name a town or an NPC in your next campaign?
The words are generated from how frequently each sequence of characters occurs in a language, as measured in a sample text (for English, for example, I used the full text of a public domain novel from Project Gutenberg).
The sample text is broken up into individual words, and then each word is broken up into overlapping 3-letter chunks. Given the word "explain", it would get the chunks "exp", "xpl", "pla", "lai", and "ain". A tally is kept of how often each 3-letter chunk occurs in the text.
To generate a word, the script picks a random 3-letter chunk to start with. Next, it finds all other chunks in which the first 2 characters match the last 2 characters of the initial chunk. If it started out with "maf", it would look for all chunks starting with "af" to see how often each one appears in the original source text:
afe => 11
aff => 43
afi => 12
afl => 5
afo => 5
aft => 25
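The chunking and tallying described above could be sketched roughly like this in Python. This is only an illustration, not the script's actual code: the function names (chunks, tally_chunks, candidates) are made up for the example, and splitting words on runs of the letters a-z is an assumption that only suits plain English text.

```python
import re
from collections import Counter


def chunks(word, size=3):
    """Return the overlapping chunks of `size` letters in `word`."""
    # Words shorter than `size` produce an empty list.
    return [word[i:i + size] for i in range(len(word) - size + 1)]


def tally_chunks(text, size=3):
    """Count how often each chunk occurs across all words in `text`."""
    counts = Counter()
    # Assumption: a "word" is a run of ASCII letters, lowercased.
    for word in re.findall(r"[a-z]+", text.lower()):
        counts.update(chunks(word, size))
    return counts


def candidates(counts, prefix):
    """Return {chunk: count} for every chunk that starts with `prefix`."""
    return {c: n for c, n in counts.items() if c.startswith(prefix)}


print(chunks("explain"))  # ['exp', 'xpl', 'pla', 'lai', 'ain']
```

Calling candidates(tally_chunks(sample_text), "af") on a long enough sample text would produce a table like the one above, with counts that depend on the text used.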
The script selects a chunk randomly, but weights each one by its frequency. In this case, the sum of the frequencies is 101, so "aff", which has a frequency of 43, has a 43 out of 101 chance of being selected, a bit less than 43%. Once it has selected a chunk, it appends the last letter of that chunk to the word it's building. If it started with "maf" and then selected "afi", the word-in-progress would be "mafi". It then repeats the process, looking for chunks that start with "fi", and continues until the word reaches the requested length.
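For the curious, the weighted selection and the word-building loop might look something like the following Python sketch. It is not the script's real code. The names next_chunk and make_word are invented for the example, and two details are assumptions because the description doesn't cover them: the starting chunk is picked uniformly rather than by frequency, and the loop simply stops early if no chunk continues the current last 2 letters.

```python
import random
from collections import Counter


def next_chunk(counts, prefix, rng=random):
    """Pick a chunk starting with `prefix`, weighted by its count."""
    options = [c for c in counts if c.startswith(prefix)]
    if not options:
        return None  # dead end: nothing in the tally continues this prefix
    weights = [counts[c] for c in options]
    return rng.choices(options, weights=weights, k=1)[0]


def make_word(counts, length, rng=random):
    """Build a word, stopping at `length` letters or at a dead end."""
    # Assumption: uniform pick of the starting chunk (always 3 letters).
    word = rng.choice(list(counts))
    while len(word) < length:
        chunk = next_chunk(counts, word[-2:], rng)
        if chunk is None:
            break  # assumption: give up and return the shorter word
        word += chunk[-1]  # append only the last letter of the chunk
    return word


# The "af" tally from the example above.
af_counts = Counter({"afe": 11, "aff": 43, "afi": 12, "afl": 5, "afo": 5, "aft": 25})
total = sum(af_counts.values())
print(total, round(af_counts["aff"] / total, 3))  # 101 0.426
```

Passing a seeded random.Random instance as rng makes the output repeatable, which is handy when comparing different sample texts.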