[Classifier4j-devel] Update Word Probability Break Down
From: Matt C. <MCo...@my...> - 2003-11-11 21:41:56
Hello All!

I have been working around the clock on various issues relating to my ignorance of Java and the nuances of Classifier4J. Thanks to Nick, and using the latest CVS code, I have succeeded in implementing Classifier4J after only 60 hours!

I have now come upon an interesting problem. My project involves categorizing a large volume of data. That data exists in a blob field in a MySQL (4.0.16) database. I am using this same database to store my word_probability table. I am using MySQL Connector/J 3.0.9 and Java SDK 1.4.2_02.

My project begins by teaching Classifier4J large amounts of already classified data. I provide a category and a string taken from the MySQL blob field. All is well at this point. The Bayesian teachMatch function works fine for about 4000 words (in my environment; results may vary), then:

---
SQL Exception in updateWordProbability : Unable to connect to any hosts due to exception: java.net.BindException: Address already in use: connect
WordsDataSourceException Occurred during teachMatch : Problem updating WordProbability
---

I added a System.out call printing e.getMessage() to the exception handler in the updateWordProbability function to produce the above result. Without it, you simply see an SQL Exception.

Initially I thought this problem was caused by my ignorance and an improper implementation of connection pooling. I wrote the attached test program to eliminate this possibility. The error still occurs and is 100% reproducible on my system. The program simply loops through x calls to teachMatch. On my system it starts generating exceptions at around 4000 iterations, usually somewhere between 3800 and 4900.

Just to make sure I didn't have some environmental problem, I wrote another program that writes x records to MySQL, emulating what updateWordProbability does. No problems there, at least up to 100,000 records.

I hope someone with more knowledge and experience will be able to figure this one out.
Matt Collier
RemoteIT
mco...@my...
877-4-NEW-LAN
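
For readers without the attachment, the kind of test loop described above can be sketched roughly as follows. This is not the attached program. It is a minimal, self-contained illustration in which the Teacher interface, the firstFailure helper and the failing stub are all hypothetical names invented here; in a real run the stub would be replaced by a call to the classifier's teachMatch backed by the MySQL word_probability table.

```java
class TeachLoopSketch {

    // Hypothetical stand-in for the classifier's teachMatch(category, text) call.
    interface Teacher {
        void teachMatch(String category, String text) throws Exception;
    }

    /**
     * Calls teachMatch up to maxIterations times. Returns the 1-based iteration
     * at which the first exception was thrown, or -1 if every call succeeded.
     */
    static int firstFailure(Teacher teacher, int maxIterations) {
        for (int i = 1; i <= maxIterations; i++) {
            try {
                teacher.teachMatch("category", "sample text " + i);
            } catch (Exception e) {
                // Print the underlying message, as done in the post above.
                System.out.println("Iteration " + i + " failed: " + e.getMessage());
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumption: a stub that fails on its 4000th call, used only to
        // exercise the loop. It does not reproduce the real BindException.
        final int[] calls = {0};
        Teacher stub = (category, text) -> {
            if (++calls[0] >= 4000) {
                throw new Exception("Problem updating WordProbability");
            }
        };
        System.out.println("First failure at iteration " + firstFailure(stub, 100000));
    }
}
```

Recording the iteration of the first failure across several runs makes it easier to compare the teachMatch loop with the plain insert loop, which ran to 100,000 records without error.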