[marf-cvs] marf/src/marf/Stats StatisticalObject.java,1.15,1.16
From: Serguei A. M. <mo...@us...> - 2006-09-03 20:56:22
Update of /cvsroot/marf/marf/src/marf/Stats
In directory sc8-pr-cvs7.sourceforge.net:/tmp/cvs-serv24170/marf/src/marf/Stats

Modified Files:
	StatisticalObject.java
Log Message:
Fix two bugs in the Zipf's Law implementation, namely #1459461 and #1551592.
In the latter, the case when just a corpus filename is supplied was
overlooked and was giving an error of invalid options. The former, the
irregular ArrayIndexOutOfBounds, was due to the fact that for large
corpora (and depending on the tokenizer settings) there exist words with an
occurrence frequency greater than 100, but we were collecting up to the
hundred only, so this was for now fixed by placing a warning that beyond
the page boundary, no C(f,w) is computed. This ought to change for the
general case, but in some near future.

Clean up some comments in the StatisticalObject.

Index: StatisticalObject.java
===================================================================
RCS file: /cvsroot/marf/marf/src/marf/Stats/StatisticalObject.java,v
retrieving revision 1.15
retrieving revision 1.16
diff -C2 -d -r1.15 -r1.16
*** StatisticalObject.java	17 Jan 2006 22:41:14 -0000	1.15
--- StatisticalObject.java	3 Sep 2006 20:56:02 -0000	1.16
***************
*** 17,26 ****
  {
  	/**
! 	 * Word's frequency in a given corpus. Default <code>0</code>.
  	 */
  	protected int iFrequency = 0;
  
  	/**
! 	 * Word's rank in the corpus. The rank of 1 indicates the top frequent word.
  	 * Default is <code>-1</code>, i.e. unset.
  	 */
--- 17,29 ----
  {
  	/**
! 	 * Number of occurences of this object basic data in a given document
! 	 * (for example a corpus or a WAVE file).
! 	 * Default <code>0</code>.
  	 */
  	protected int iFrequency = 0;
  
  	/**
! 	 * Rank of the object in the document.
! 	 * The rank of 1 indicates the most frequent.
  	 * Default is <code>-1</code>, i.e. unset.
  	 */
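
For readers following the log message, here is a minimal sketch of the kind of guard it describes for #1459461: a table that collects frequencies only up to 100, with a warning instead of an out-of-bounds access when a word occurs more often. The Zipf's Law code itself is not part of this diff, so everything here is an assumption made for illustration: the class name FrequencyTable, the method countByFrequency, the constant MAX_FREQUENCY, and the idea that the table is indexed directly by frequency. It is not MARF's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical illustration only; not part of MARF.
 */
final class FrequencyTable
{
	/** Assumed cap, mirroring the "up to the hundred only" limit in the log message. */
	static final int MAX_FREQUENCY = 100;

	/**
	 * Counts how many distinct words occur exactly f times, for 1 <= f <= MAX_FREQUENCY.
	 * Words outside that range are skipped with a warning rather than
	 * indexing past the end of the array.
	 *
	 * @param poWordFrequencies word to occurrence-count mapping
	 * @return array where index f holds the number of words with frequency f; index 0 is unused
	 */
	static int[] countByFrequency(Map<String, Integer> poWordFrequencies)
	{
		int[] aiCounts = new int[MAX_FREQUENCY + 1];

		for(Map.Entry<String, Integer> oEntry : poWordFrequencies.entrySet())
		{
			int iFrequency = oEntry.getValue();

			if(iFrequency < 1 || iFrequency > MAX_FREQUENCY)
			{
				System.err.println
				(
					"WARNING: frequency " + iFrequency + " of \"" + oEntry.getKey()
					+ "\" is outside 1.." + MAX_FREQUENCY + "; not counted."
				);
				continue;
			}

			aiCounts[iFrequency]++;
		}

		return aiCounts;
	}

	public static void main(String[] argv)
	{
		Map<String, Integer> oFrequencies = new HashMap<>();
		oFrequencies.put("the", 250);   // beyond the cap: warned about and skipped
		oFrequencies.put("corpus", 3);
		oFrequencies.put("rank", 3);

		int[] aiCounts = countByFrequency(oFrequencies);
		System.out.println("Words occurring 3 times: " + aiCounts[3]);
	}
}
```

As the log message notes, a fixed cap is a stopgap; a general fix would size the table from the largest observed frequency or use a map keyed by frequency.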