Confirmed. CRF++ eats up ridiculous amounts of memory when training on a dataset with a large number of decision classes.

My case: running crf_learn with the default arguments on a training file of 1.2 million lines (77k sentences).
The columns are: wordform, POS ambitag (set of possible POS values), case ambitag (set of possible values of grammatical case), gender ambitag, number ambitag, and the tag to be chosen (the decision class).
The decision class (the last column, the tag) takes 925 different values. The second column also has quite a number of distinct values: 241.
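For reference, here is a hypothetical excerpt in CRF++'s column format (one token per line, whitespace-separated columns, the last column being the answer tag, sentences separated by blank lines; the tag values below are made up for illustration):

```
kota	subst:gen|subst:acc	gen|acc	m2	sg	subst:gen
nie	qub	-	-	-	qub
ma	fin|impt	-	-	sg	fin
```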
Observed behaviour: crf_learn reads the training data and, before finishing the first iteration, consumes over 40 GB of RAM and counting. It can't get any further; it's already deep into swap.

Is it the case that the underlying algorithm is prone to combinatorial explosion in such a setting?
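A back-of-the-envelope estimate suggests it is at least linear in the label count. Per the CRF++ documentation, a unigram template expands to L * N feature functions and a bigram template to L * L * N, where L is the number of output labels and N the number of distinct strings the template produces on the training data. The value of N below is a pure guess for a corpus of this size; only L = 925 comes from my data:

```python
# Rough estimate of CRF++ feature-function count and weight-vector size.
L = 925                # decision classes (from my training file)
N_unigram = 2_000_000  # HYPOTHETICAL: distinct strings summed over all unigram templates
N_bigram = 1           # the bare "B" template expands to a single (empty) string

# CRF++ docs: unigram -> L * N functions, bigram -> L * L * N functions.
features = L * N_unigram + L * L * N_bigram

# One double weight per feature function; L-BFGS additionally keeps the
# gradient plus a number of history vectors, so real usage is a multiple
# of this lower bound.
weights_gib = features * 8 / 2**30

print(f"feature functions:  {features:,}")
print(f"weight vector alone: {weights_gib:.1f} GiB")
```

Even with this guessed N, the weight vector alone is on the order of 14 GiB, and the optimizer's working set multiplies that several times over, which would be consistent with what I'm seeing.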