Fake Word Generator

This script generates a number of usually pronounceable and frequently amusing fake words loosely based on a specific language. They can be used, for example, to name characters in games and whatnot. Maybe you need to name a town or an NPC in your next campaign?

How does this app work?


The words are generated from the frequency with which any given sequence of characters occurs in a language, as measured in a sample text (for English, for example, I used the full text of a public domain novel from Project Gutenberg).

The sample text is broken up into individual words, and then each word is broken up into overlapping 3-letter chunks. The word explain, for example, yields the chunks exp, xpl, pla, lai, and ain. A tally is kept of how often each 3-letter chunk occurs in the text.

To generate a word, the script picks a random 3-letter chunk to start with. Next, it finds all chunks whose first 2 characters match the last 2 characters of the initial chunk. If it started out with maf, it would look up all chunks starting with af to see how often each one appears in the original source text:


	afe => 11
	aff => 43
	afi => 12
	afl => 5
	afo => 5
	aft => 25

The script selects one of these chunks at random, weighting each one by its frequency. In this case, the sum of the frequencies is 101, so aff, which has a frequency of 43, has a 43 out of 101 chance of being selected, a bit less than 43%. Once it has selected a chunk, it appends the last letter of that chunk to the word it's building. If it started with maf and then selected afi, the word-in-progress would be mafi. Then it repeats the process, this time looking for chunks that start with fi, and keeps going until the word reaches the requested length.
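
If you'd like to see the idea in code, here is a minimal sketch of the steps described above, written in Python. It is not the script's actual source. The function names (count_chunks, next_chunks, make_word), the uniform choice of the starting chunk, and the decision to stop early when no chunk continues the word are all assumptions made for this example, and the tiny sample string is just a stand-in for a real sample text.

```python
import random
import re
from collections import Counter


def count_chunks(text, size=3):
    """Tally every overlapping `size`-letter chunk of every word in `text`."""
    counts = Counter()
    # Assumption: a "word" is a run of the letters a-z, case-insensitive.
    for word in re.findall(r"[a-z]+", text.lower()):
        for i in range(len(word) - size + 1):
            counts[word[i:i + size]] += 1
    return counts


def next_chunks(counts, word):
    """Return the chunks whose first 2 letters match the last 2 letters of `word`."""
    tail = word[-2:]
    return {chunk: n for chunk, n in counts.items() if chunk[:2] == tail}


def make_word(counts, length, rng=random):
    """Build one fake word of at most `length` letters from the chunk tally."""
    # Assumption: the starting chunk is drawn uniformly from all known chunks.
    word = rng.choice(list(counts))
    while len(word) < length:
        candidates = next_chunks(counts, word)
        if not candidates:
            break  # Assumption: give up early if nothing continues the word.
        # Weighted pick: a chunk seen 43 times out of 101 wins 43/101 of the time.
        chosen = rng.choices(list(candidates), weights=list(candidates.values()))[0]
        word += chosen[-1]  # only the last letter of the chosen chunk is new
    return word[:length]


# Hypothetical stand-in for a real sample text such as a full novel.
sample = "explain the plain plan after the affair of the raft and the staff"
counts = count_chunks(sample)
print([make_word(counts, 6) for _ in range(5)])
```

With a sample this small the output mostly recombines pieces of the input words. A longer text gives each 2-letter ending many more possible continuations, which is where the amusing results come from.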

Comments or suggestions? Send me an email!