I am using a custom reg-exp splitter with the following regular expression (= r"(?x)([A-Z]\.)+ | \$?\d+(\.\d+)?%? | \w+([-']\w+)*"), with indexing of stop words allowed. Searching for the phrase '"Ax+By"' calls the PhraseNode method in zopyx.txng3.core.Evaluator with an AndNode that contains two WordNodes (ax and by), so in line 100
words = [n.getValue() for n in node.getValue()]
the words list no longer contains unicode strings; instead it contains a nested list, [[WordNode('ax'), WordNode('by')]]. This raises an error in the WordList.extend method, and sometimes it makes Lexicon.getWordIds() return None ids.
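To see why the single phrase term ends up as two words, here is a minimal sketch that applies the regular expression above with Python's standard re module. It is not the TextIndexNG3 splitter itself: the name split_words is hypothetical, and the lowercasing is an assumption made only to match the 'ax' and 'by' values shown above.

```python
import re

# The splitter expression from this report, compiled in verbose mode.
SPLITTER = re.compile(r"(?x)([A-Z]\.)+ | \$?\d+(\.\d+)?%? | \w+([-']\w+)*")


def split_words(text):
    """Return the lowercased tokens matched by the expression (hypothetical helper)."""
    # finditer + group(0) yields whole matches; findall would return group tuples.
    return [match.group(0).lower() for match in SPLITTER.finditer(text)]


print(split_words("Ax+By"))
```

The '+' matches none of the three alternatives, so the splitter skips it and the phrase is parsed as two separate words.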
Rather than changing anything in the zopyx.txng3.core.parsetree code (for example, the getValue method of the AndNode class), I simply 'fixed' the evaluator (see attached file) so that the items in the word list there are always unicode strings.
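For readers without the attachment, the idea can be sketched roughly as follows. This is not the attached patch: the node classes are simplified stand-ins for the ones in zopyx.txng3.core.parsetree, flatten_words is a hypothetical helper name, and the sketch uses Python 3 str where the original code works with Python 2 unicode objects.

```python
class Node:
    """Simplified stand-in for a parse tree node (assumption, not the real class)."""

    def __init__(self, value):
        self._value = value

    def getValue(self):
        return self._value


class WordNode(Node):
    pass


class AndNode(Node):
    pass


class PhraseNode(Node):
    pass


def flatten_words(value):
    """Recursively collect plain strings from nodes and nested lists (hypothetical helper)."""
    if isinstance(value, str):
        return [value]
    if hasattr(value, "getValue"):
        # A node: descend into whatever it wraps.
        return flatten_words(value.getValue())
    words = []
    for item in value:
        words.extend(flatten_words(item))
    return words


phrase = PhraseNode([AndNode([WordNode("ax"), WordNode("by")])])

# The comprehension from line 100 yields a nested list of nodes ...
nested = [n.getValue() for n in phrase.getValue()]
# ... while the flattened version yields plain strings.
words = flatten_words(phrase)
print(words)
```

With the flattening in place, the word list handed on by the evaluator only ever holds strings, whatever node types the parser nests inside the phrase.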