A small tool to generate usernames out of existing words.
I used the sample of the Corpus of Contemporary American English (COCA) from wordfrequency.info,
which you can download here, and save it to corpus/top5000.txt. It contains the 5,000 most frequent words in the
corpus.
Then process it by running extract_pos.py from inside the corpus/ folder:
$ cd corpus
$ python extract_pos.pyIf you'd like to use another corpus, make sure you have a separate file for each part of speech in the corpus/ directory. The files should be named
pos.txt, where "pos" is one of: adjectives, adverbs, conjunctions, determiners, nouns, prepositions, quantifiers, and verbs. The files should contain
one word per line with no delimiters.
Run usernames.py in an interactive python session (e.g., if using IPython, enter %run usernames.py). This will load the list with different parts of speech
into memory. The lists will be called adj, adv, conj, det, noun, prep, and verb.
To generate one username, use the function make_name() and pass the parts of speech you want the username to have in order.
In []: make_name(det, adj, noun) # Make a username with a determiner, adjective, and a noun.
Out[]: 'TheirUniversalWarrior'
In []: make_name(detm adj, adj, noun) # You can repeat parts of speech.
Out[]: 'MyExactUsefulCup'To generate multiple usernames at once, use make_some_names() and pass it the number of names you want, and optionally a list of the sequences of parts of
speech it can choose from. If you do not pass any sequences, it will choose randomly from its pre-defined options.
Some sequences have pre-defined abbreviations. They are:
- "an": adjective, noun
- "aan": adjective, adjective, noun
- "aa": adverb, adjective
- "adn": adverb, determiner, noun
- "vp": verb, preposition
- "dan": determiner, adjective, noun
- "aca": adjective, conjunction, adjective
- "va": verb, adverb
- "vpn": verb, preposition, noun
- "npn": noun, preposition, noun
- "nn": noun, noun
Example:
# Make 10 names, using all pre-defined sequences.
make_some_names(10)
# Make 5 names from the specified sequences.
make_some_names(5, ["nn", "aa", [noun, conj, noun]])