Re: [Classifier4j-devel] New Stop Words Provider
From: moedusa <mo...@in...> - 2003-11-16 10:21:24
Matt Collier wrote:
> Attached is an alternate stop words provider for classifier4J. I simply
> copied the whole of DefaultStopWordsProvider.java and renamed it to
> AlphaStopWordsProvider.java.

May I suggest one thing? I think it would be better not to hard-code the stop list, or to pass in a string or URL with the location of the stop-words file, but to find the file automatically on the classpath.

The idea comes from an article at onjava.com (http://www.onjava.com/pub/a/onjava/excerpt/jebp_3/index1.html?page=3). Here is a short quotation explaining what should be done:

"Example 3-4 [http://www.onjava.com/pub/a/onjava/excerpt/jebp_3/index1.html?page=3#ex3-4] demonstrates the search technique with a class called Resource. Given a resource name, the Resource constructor searches the class path and resource path attempting to locate the resource. When the resource is found, it makes available the resource contents as well as its directory location and last modified time (if those are available). The last modified time helps an application know, for example, when to reload the configuration data. The class uses special code to convert file: URL resources to File objects. This proves handy because URLs, even file: URLs, often don't expose special features such as a modified time. By searching both the class path and the resource path this class can find server-wide resources and per-application resources."

You can find the code for that class here: http://www.onjava.com/pub/a/onjava/excerpt/jebp_3/index1.html?page=3

It seems that this would be a better solution. Well, I hope.

Sorry for my poor English; it is not my native language. I am also sorry for only making suggestions and not doing any coding, but I have three deadlines right now and simply have no time for that. Still, I want to help this project become more useful somehow.
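
To make the suggestion a bit more concrete, here is a rough sketch of what a classpath lookup for the stop list might look like. It is only an illustration: the class name ClasspathStopWordsProvider, the one-word-per-line file format, and the "#" comment convention are all my assumptions, not existing Classifier4J code. It searches the class path only and does not cover the resource path or the last-modified handling from the article's Resource class.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical class for illustration; not part of Classifier4J.
public class ClasspathStopWordsProvider {

    private final Set<String> stopWords;

    // resourceName is a classpath-relative name, e.g. "stopwords.txt" (assumed name).
    public ClasspathStopWordsProvider(String resourceName) throws IOException {
        // Prefer the context class loader so per-application resources are found
        // when the library itself is loaded by a parent class loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathStopWordsProvider.class.getClassLoader();
        }
        InputStream in = loader.getResourceAsStream(resourceName);
        if (in == null) {
            throw new IOException("Stop words resource not found on classpath: " + resourceName);
        }
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            stopWords = readWords(reader);
        }
    }

    // Assumed format: one word per line; blank lines and lines starting with '#' are skipped.
    public static Set<String> readWords(Reader reader) throws IOException {
        Set<String> words = new HashSet<>();
        BufferedReader lines = new BufferedReader(reader);
        String line;
        while ((line = lines.readLine()) != null) {
            String word = line.trim().toLowerCase(Locale.ROOT);
            if (!word.isEmpty() && !word.startsWith("#")) {
                words.add(word);
            }
        }
        return words;
    }

    public boolean isStopWord(String word) {
        return word != null && stopWords.contains(word.trim().toLowerCase(Locale.ROOT));
    }
}
```

With something like this, a user would only need to drop a stop-words file into a directory or jar on the classpath and call new ClasspathStopWordsProvider("stopwords.txt"); no path or URL has to be configured.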