[Classifier4j-devel] Problem with MySQL and JDBCWordsDataSource
From: Nadja S. <sen...@21...> - 2006-05-02 19:39:18
Hello all,
I am trying out Classifier4J as a possible tool for categorizing news
messages. I have several thousand test files of varying length at the
moment and 12 different categories. With that amount of data I have to
use JDBCWordsDataSource or something similar (I naturally get "out of
memory" errors with SimpleWordDataSource). I chose JDBCWordsDataSource
over JDBMWordsDataSource mostly because I couldn't figure out how to use
JDBMWordsDataSource properly: I can't find its source code, and there
doesn't seem to be much documentation for it either.
Anyway, long story short: while still training texts for my first
category, I keep getting the exception
"net.sf.classifier4J.bayesian.WordsDataSourceException: Problem updating
WordProbability". The underlying problem seems to be another exception,
a java.net.SocketException with the message "java.net.BindException:
Address already in use: connect". The MySQL documentation tells me that
this happens when an application tries to open too many connections
within a short time span.
Code-wise, what I am doing is basically this (the code has been
simplified so that it only includes the necessary information):
    /* list is an ArrayList of filenames to train with for this category */
    Iterator iter = list.iterator();
    while (iter.hasNext()) {
        nextFile = (String) iter.next();
        /* returns the contents of the file as plain text */
        text = TextUtilities.getText(nextFile);
        tokenizedText = this.tokenizer.tokenize(text);
        for (int i = 0; i < tokenizedText.length; i++) {
            jdbcDataSource.addMatch(pool, tokenizedText[i]);
        }
    }
I hope this piece of code will still be readable once I send the email. :)
Some things seem to get entered into the database table before the
exception occurs.
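If every addMatch call ends up opening (and closing) its own database connection, that would fit the MySQL explanation: thousands of tokens would mean thousands of short-lived connections. For illustration only, here is a minimal sketch of the opposite pattern, where one connection is opened once and handed out again for every update. ReusingConnectionSource, ConnectionFactory and stubConnection are hypothetical names made up for this sketch and are not part of Classifier4J or MySQL Connector/J; the stub stands in for a real JDBC connection so the example runs without a database. Whether Classifier4J's connection handling can be swapped for something like this is an assumption that has not been verified here.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

public class ReusingConnectionSource {

    /** Hypothetical factory: stands in for whatever opens the real connection. */
    public interface ConnectionFactory {
        Connection open() throws SQLException;
    }

    private final ConnectionFactory factory;
    private Connection shared;   // the single connection handed out to every caller
    private int opened = 0;      // number of physical connections opened so far

    public ReusingConnectionSource(ConnectionFactory factory) {
        this.factory = factory;
    }

    /** Returns the shared connection, opening it only if it is missing or closed. */
    public synchronized Connection getConnection() throws SQLException {
        if (shared == null || shared.isClosed()) {
            shared = factory.open();
            opened++;
        }
        return shared;
    }

    /** Called after each update; deliberately keeps the connection open for reuse. */
    public synchronized void returnConnection(Connection connection) {
        // intentionally empty
    }

    /** Closes the shared connection once all training is done. */
    public synchronized void close() throws SQLException {
        if (shared != null) {
            shared.close();
            shared = null;
        }
    }

    public synchronized int getOpenedCount() {
        return opened;
    }

    /** Stub connection so the sketch runs without a database (assumption, demo only). */
    public static Connection stubConnection() {
        return (Connection) Proxy.newProxyInstance(
                ReusingConnectionSource.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) ->
                        method.getName().equals("isClosed") ? Boolean.FALSE : null);
    }

    public static void main(String[] args) throws SQLException {
        ReusingConnectionSource source =
                new ReusingConnectionSource(ReusingConnectionSource::stubConnection);
        String[] tokens = { "news", "sports", "news" };
        for (String token : tokens) {
            Connection connection = source.getConnection();
            // the UPDATE/INSERT for this token would run on 'connection' here
            source.returnConnection(connection);
        }
        System.out.println("connections opened: " + source.getOpenedCount());
        source.close();
    }
}
```

With the stub, the loop over three tokens reports a single opened connection. With a real database the factory would wrap something like DriverManager.getConnection, and close() would be called once after the last file has been trained.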
I also tried using the classifier itself, so that I could train an
entire message at once instead of adding every single token. I still got
the same exception, and it seemed like no data at all made it to the
database.
Can anyone help me with this? I just can't figure out how to solve this
problem. Wouldn't surprise me if it was some really stupid mistake on my
part. :)
Regards,
Nadja