John Dalbey

This project is a random password and passphrase generator written in Python 2.7.

There are already many password generators in existence, but I wanted to build one to meet my specific needs.
I was motivated after reading a Hacker News article about the topic.
In particular, I studied the bitwords.py code presented by Kragen in this article.

As I studied Kragen's code several questions occurred to me:

(I assume when he mentions "12-bit words" he is describing words chosen from a pool of 2^12 (4096) candidate words and his code creates this "pool" by slicing from the word list.)
1. The word file he uses is ordered by word frequency, and the most commonly occurring words tend to be short. So when he generates "6-bit" words, he is choosing from a pool of only 64 words taken from the beginning of the word list, all of which are very short. I was curious why he did this, and concluded that he preferred short words because they are easier to remember and type.
2. Similarly, I noticed that his word list has 21822 words that are fewer than 6 letters long. So conceivably he could generate 14-bit words (2^14 = 16384), which would give more entropy per word. But he didn't, so many of the words in the list are never used. Again, I assumed he limited the pool size to 4096 in an attempt to improve word familiarity or memorability.
3. After some contemplation, I concluded the algorithm would be slightly better if we chose a slice of size 16384 starting at some random place x in the list of 21822 words (where x < 21822-16384). We might generate some words that are slightly less common, but it seems to me that any word of 5 letters or fewer is going to be pretty easy to remember.
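
The slicing idea in point 3 can be sketched roughly as follows. This is only an illustration, not the code in wordlist.py: the function names random_pool and bits_per_word are made up for this example, and the word list is a stand-in of 21822 placeholder strings rather than the real word file.

```python
import math
import random


def random_pool(words, pool_size, rng=None):
    """Return (offset, pool), where pool is pool_size consecutive words
    starting at a randomly chosen offset. Hypothetical helper."""
    if pool_size > len(words):
        raise ValueError("pool_size exceeds the number of available words")
    if rng is None:
        rng = random.SystemRandom()  # draws from the OS entropy source
    # Valid offsets run from 0 through len(words) - pool_size, inclusive.
    offset = rng.randint(0, len(words) - pool_size)
    return offset, words[offset:offset + pool_size]


def bits_per_word(pool_size):
    """Entropy in bits of one word drawn uniformly from a pool of this size."""
    return math.log(pool_size, 2)


# Placeholder for the 21822 short words in the frequency-ordered list.
short_words = ["w%05d" % i for i in range(21822)]

offset, pool = random_pool(short_words, 2 ** 14)
print("offset %d, pool size %d, %.1f bits per word"
      % (offset, len(pool), bits_per_word(len(pool))))
```

With a pool of 4096 words each word carries 12 bits, and with 16384 words it carries 14 bits, so a passphrase of the same strength needs fewer words.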

So I decided to enhance his code to use the entire word list. In the process I converted it to OO style and added many explanatory comments. I'm new to Python, so please forgive me if my code doesn't follow Python culture. Criticism and suggestions are welcome.

You are welcome to assume I use this program to generate my own passwords, but you'll never know for certain :)

Usage

The program runs as a Python script:
python wordlistdriver.py

To specify the number of bits in the randomly generated number:
python wordlistdriver.py 85

To specify the exact number to be used in password generation (for testing; the number of bits is then ignored):
python wordlistdriver.py 1 123456789

To provide a wordlist file:
python wordlistdriver.py 1 5 alphawords
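
For readers wondering what "the number of bits in the randomly generated number" means in practice, here is a minimal sketch of drawing such a number. It is an assumption about the general approach, not a copy of what wordlistdriver.py does; random_number is a hypothetical name.

```python
import random


def random_number(bits):
    """Return a random integer in the range [0, 2**bits), drawn from the
    operating system's entropy source. Hypothetical helper."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    return random.SystemRandom().getrandbits(bits)


number = random_number(85)
print("%d (%d bits requested)" % (number, 85))
```

random.SystemRandom is used here instead of the default generator because the default is deterministic and not intended for secrets.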

Sample output

Number: 4 (2 bits) Numwords: 20 Poolsize: 16 Surplussize: 4 Offset: 0
The 2-bit number 4 can be represented as:
In binary: 100
In octal: 4
In the first half of the alphabet to avoid profanity, like Chrome: e
In lowercase letters, ideal for a cellphone keyboard: e
In lowercase letters and digits: 4
In letters and digits: 4
In letters, digits, hyphen, and underscore, like YouTube: 4
In printable ASCII: %
In 12-bit words: e
 (1) letters has entropy 4.7
In 6-bit words: e
In 12-bit words of 5 letters or less: e
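
The word-based lines in the output above come from writing the number in a base equal to the pool size, with each "digit" replaced by a word. A rough sketch of that idea follows. The function name number_to_words and the 16-entry pool of single letters are made up for illustration, and the actual code in wordlist.py may differ.

```python
def number_to_words(number, pool):
    """Write number in base len(pool), using the pool entries as digits.
    Hypothetical helper."""
    if number < 0:
        raise ValueError("number must be non-negative")
    base = len(pool)
    digits = []
    while True:
        number, remainder = divmod(number, base)
        digits.append(pool[remainder])
        if number == 0:
            break
    # Digits were produced least significant first, so reverse them.
    return " ".join(reversed(digits))


# A made-up pool of 16 single-letter "words".
pool = list("abcdefghijklmnop")

print(number_to_words(4, pool))    # e
print(number_to_words(20, pool))   # b e
```

With a pool of 16 entries the number 4 maps to the fifth entry, which matches the "e" shown in the sample output.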

Code Overview

Filename            Description
wordlist.py         The class that represents the wordlist and is the engine of the application.
unittest.py         A set of unit tests for wordlist.py
wordlistdriver.py   A console application that uses wordlist to display random passwords and passphrases
kragenwords.txt     The wordfile used by Kragen in his article
effwords.txt        The EFF large wordlist
numericwords        A small wordfile for unit testing
alphawords          A small wordfile for unit testing
