With larger lexica (several hundred thousand words), there
are efficiency problems. Profiling information suggests that
the data structures and algorithms could be improved.
It might be worthwhile to read up on efficient data
structures and lookup algorithms for large lexica.
(Sorry that this is rather vague: I didn't profile the
code myself (colleagues did), but I work with the programme
very often, and as a computational linguist I am aware
of some of the issues.)