-
Sorry, I made a mistake:
the class that makes that assumption is
AbstractStopWordFilter.
2009-05-05 21:55:49 UTC by dvdface
-
Hi, a few days ago I found this framework for creating the VSM.
These days I have been working on a simple Chinese text classification (TC) system.
I used WVTool in my system. However, I found a problem.
When I used WVTool for TC in English, it worked fine,
but when it comes to Chinese, there is a problem.
You assume that the length of a word should be greater than 2.
That's ok for...
2009-05-05 21:50:20 UTC by dvdface
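To illustrate the length assumption described in this post, here is a minimal sketch of a token filter with a configurable minimum length. The class `MinLengthFilterSketch` and its `filter` method are hypothetical and are not taken from WVTool's `AbstractStopWordFilter`; they only show why a fixed threshold that suits English can discard one- and two-character Chinese words.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not WVTool code.
class MinLengthFilterSketch {

    // Keeps the tokens that contain at least minLength Unicode code points.
    static List<String> filter(int minLength, String... tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // Count code points rather than chars so that characters outside
            // the Basic Multilingual Plane are counted once.
            if (token.codePointCount(0, token.length()) >= minLength) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // "Greater than 2" means a minimum length of 3: both Chinese words are dropped.
        System.out.println(filter(3, "a", "of", "word", "中", "中文"));
        // A minimum length of 1 keeps every token.
        System.out.println(filter(1, "a", "of", "word", "中", "中文"));
    }
}
```

Making the threshold a parameter, rather than a constant, would let a Chinese pipeline choose 1 while an English one keeps 3.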
-
When converting a UTF-8 encoded string to an InputStream, the encoding of the string is not taken into account, which causes all letters to be converted to question marks. The attached Rapid Miner process illustrates the problem.
I can fix the problem by giving the encoding as an argument to the getBytes() method called in the loadDocument() method of the SourceAsTextLoader class:
/**...
2009-04-01 08:17:52 UTC by kochan
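The effect reported in this post can be reproduced outside WVTool with a short sketch. The class `EncodingSketch` and its `viaStream` method are hypothetical and are not the `SourceAsTextLoader` code; US-ASCII stands in for a platform default charset that cannot represent the text, which is an assumption about the reporter's environment.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not WVTool code.
class EncodingSketch {

    // Encodes the text with the given charset, wraps the bytes in a stream,
    // then reads the stream back and decodes it as UTF-8.
    static String viaStream(String text, Charset encodeWith) {
        ByteArrayInputStream in = new ByteArrayInputStream(text.getBytes(encodeWith));
        return new String(in.readAllBytes(), StandardCharsets.UTF_8); // Java 9+
    }

    public static void main(String[] args) {
        // Passing the encoding explicitly preserves the text.
        System.out.println(viaStream("中文", StandardCharsets.UTF_8));
        // A charset that cannot represent the characters replaces each with '?'.
        System.out.println(viaStream("中文", StandardCharsets.US_ASCII));
    }
}
```

Calling `getBytes()` with no argument uses the platform default charset, so the result depends on the machine the code runs on; naming the charset removes that dependency.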
-
ingomierswa committed patchset 69 of module wvtool to the Word Vector Tool CVS repository, changing 4 files.
2008-08-04 10:05:30 UTC by ingomierswa
-
mjwurst committed patchset 68 of module wvtool to the Word Vector Tool CVS repository, changing 13 files.
2008-08-03 17:59:57 UTC by mjwurst
-
mjwurst committed patchset 67 of module wvtool to the Word Vector Tool CVS repository, changing 4 files.
2007-11-19 09:46:53 UTC by mjwurst
-
mjwurst committed patchset 66 of module wvtool to the Word Vector Tool CVS repository, changing 2 files.
2007-07-19 13:51:43 UTC by mjwurst
-
mjwurst committed patchset 65 of module wvtool to the Word Vector Tool CVS repository, changing 1 file.
2007-07-19 13:51:01 UTC by mjwurst
-
mjwurst committed patchset 64 of module wvtool to the Word Vector Tool CVS repository, changing 6 files.
2007-07-02 15:10:20 UTC by mjwurst
-
mjwurst committed patchset 63 of module wvtool to the Word Vector Tool CVS repository, changing 1 file.
2007-05-25 15:02:08 UTC by mjwurst