Dear Montse,
I'm sorry to have taken so long to reply; I have been travelling and
have been very busy at work and at home recently.
I don't know whether there is a maximum appropriate number of files for
multifile corpora on different platforms. What I have done in the past
to get similar behaviour is to use a single corpus file split into
separate documents, i.e.
<DOC>
<TEXT>
This is a sentence from the BNC.
</TEXT>
</DOC>
Pretty ugly, I'll grant you: a bad case of markup making the corpus
unnecessarily bigger. Also, there were problems whenever we tried to put
two tags on the same line, which I guess is just a C parsing issue that
we could fix if we tracked it down.
Sorry I can't answer your actual question at the moment, but if you
want a hack that does much the same thing, this should work.
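For what it's worth, here is a rough sketch of that hack in Python. The
directory and file names are placeholders, not anything from your setup;
the only point is wrapping each sentence file in its own <DOC>/<TEXT>
block, with every tag on its own line to sidestep the two-tags-per-line
parsing problem.

```python
import os

def build_single_corpus(sentence_dir, out_path):
    """Concatenate per-sentence files into one corpus file,
    wrapping each sentence in <DOC><TEXT> ... </TEXT></DOC> markup.
    Each tag is written on its own line, since putting two tags on
    one line has caused parsing trouble in the past."""
    with open(out_path, "w") as out:
        for name in sorted(os.listdir(sentence_dir)):
            path = os.path.join(sentence_dir, name)
            with open(path) as f:
                sentence = f.read().strip()
            if not sentence:
                continue  # skip empty sentence files
            out.write("<DOC>\n<TEXT>\n")
            out.write(sentence + "\n")
            out.write("</TEXT>\n</DOC>\n")
```

You would then point the model-building step at the single output file
instead of the directory of five million sentence files.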
Best wishes,
Dominic
On Oct 31, 2005, at 11:20 AM, Montse Cuadros wrote:
> Dear All,
>
> I'm trying to build a model for the BNC corpus, but instead of
> building it per document, I want to build it per sentence. I have the
> whole corpus split sentence-by-sentence into separate files, and when
> I try to construct the model, it fails with this error:
>
>
> Allocating filename memory: Cannot allocate memory
> Couldn't initialize tokenizer.
> make: *** [/corpus/models//BNC_SENTENCE/wordlist] Error 1
>
> The file directory is huge: it now contains 5,009,088 files. I don't
> know whether the problem is the sheer number of files, or that each
> file contains so little data that it doesn't make sense to construct
> such a model, or just the default parameters.
>
> Thanks in advance,
>
> Bests,
>
> Montse
>
>
> _______________________________________________
> infomap-nlp-users mailing list
> inf...@li...
> https://lists.sourceforge.net/lists/listinfo/infomap-nlp-users
>