RE: [Classifier4j-devel] Project Improvements
From: Nick L. <nl...@es...> - 2003-11-13 00:25:24
> Bayesian tokenizer. It was reported that the tokenizer
> improperly handles a number of strings including possessive
> pronouns and others. Anybody working on this?

I don't remember this discussion. Could you post a reference?

> HTML tokenizer for Bayesian system. Idea was to be able to
> "ignore" xml in a classification string. This happens to be
> required for my current project. I've either got to remove HTML
> from my source documents or get C4J to ignore it.

Yes, this would be nice. If you want to do it in C4J then you need to implement the net.sf.classifier4J.ITokenizer interface.

> Connection pooling. What ARE we going to do about connection pooling.

Still looking at this.

> Documentation. We need some. I would like to help with this.
> How do we do it? What framework are we using for documentation.

Cool. I'm using Maven to build the website (which contains the docs, such as they are). The docs themselves are in CVS (see <http://cvs.sourceforge.net/viewcvs.py/classifier4j/Classifier4J/xdocs/>) in xdoc format. The xdoc format is (kind of) documented at <http://jakarta.apache.org/site/jakarta-site-tags.html>.

Patches, new docs, whatever: all are gratefully accepted.
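
For the HTML tokenizer item above, here is a rough sketch of what an ITokenizer implementation that ignores markup might look like. It is only an illustration. The local ITokenizer interface stands in for net.sf.classifier4J.ITokenizer, and its single tokenize(String) method returning String[] is an assumption about that interface, so check it against the real source. The class name HtmlStrippingTokenizer is made up. The tag removal is a naive regular expression, not a real HTML parser, so it does not decode entities such as &amp; and it will misbehave on stray "<" characters in the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Stand-in for net.sf.classifier4J.ITokenizer so the sketch compiles alone.
// ASSUMPTION: the real interface has this single-method shape.
interface ITokenizer {
    String[] tokenize(String input);
}

// Hypothetical tokenizer: drops anything that looks like a tag, then
// splits the remaining text on runs of non-word characters.
class HtmlStrippingTokenizer implements ITokenizer {
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern NON_WORD = Pattern.compile("\\W+");

    public String[] tokenize(String input) {
        if (input == null) {
            return new String[0];
        }
        // Replace tags with a space so adjacent words do not merge.
        String text = TAG.matcher(input).replaceAll(" ");
        List<String> tokens = new ArrayList<>();
        for (String token : NON_WORD.split(text)) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens.toArray(new String[0]);
    }
}
```

With the real interface on the classpath, the local stand-in would be deleted and the class would implement net.sf.classifier4J.ITokenizer directly. How the tokenizer then gets handed to the Bayesian classifier depends on the C4J constructors, which this sketch does not cover.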