From: Joern K. <joe...@us...> - 2010-09-06 07:48:34
|
Update of /cvsroot/maxent/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv10024

Modified Files:
	README
Log Message:
Added instructions on how to build with maven.

Index: README
===================================================================
RCS file: /cvsroot/maxent/maxent/README,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -C2 -d -r1.1.1.1 -r1.2
*** README 23 Oct 2001 14:06:52 -0000 1.1.1.1
--- README 6 Sep 2010 07:48:25 -0000 1.2
***************
*** 4,9 ****
  See the web site http://maxent.sf.net


! Installing the build tools
  ==========================

--- 4,30 ----
  See the web site http://maxent.sf.net

+ The maxent package can be build with ant and maven.

! Building with maven
! ==========================
! To build the package make sure maven is installed
! on your system.
! The current version and installation instructions
! can be found here:
! http://maven.apache.org/download.html
!
! Go into the maxent source directory
! and type:
! mvn install
!
! The maxent jar file will then be installed into your
! local maven repository.
!
! To build the source distribution from the source
! type:
! mvn assembly:assembly
!
!
! Building with ant
  ==========================

***************
*** 30,39 ****
  (ksh, bash)

- That's it!
-
-
- Building instructions
- =====================
-
  Ok, let's build the code. First, make sure your current
  working directory is where the build.xml file is located. Then type
--- 51,54 ----
|
From: Joern K. <joe...@us...> - 2010-09-06 07:23:57
|
Update of /cvsroot/maxent/maxent/lib
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv5073/lib

Modified Files:
	LIBNOTES
Removed Files:
	LGPL java-getopt.jar
Log Message:
Removed get-opt jar file, because it is not longer needed.

Index: LIBNOTES
===================================================================
RCS file: /cvsroot/maxent/maxent/lib/LIBNOTES,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -d -r1.12 -r1.13
*** LIBNOTES 28 Sep 2008 18:04:35 -0000 1.12
--- LIBNOTES 6 Sep 2010 07:23:44 -0000 1.13
***************
*** 14,27 ****
  A Java based build tool. It is supported by crimson.jar and jaxp.jar, and
  license information for those is included in this directory.
-
-
- ------------------------------------------------------------------------
- java-getopt.jar
-
- The GNU Java Getopt Package, version 1.0.8
- Homepage: http://www.urbanophile.com/arenn/hacking/download.html
- License: LGPL
-
- A Java command line option parser.
-
- ------------------------------------------------------------------------
--- 14,15 ----

--- LGPL DELETED ---

--- java-getopt.jar DELETED ---
|
From: Joern K. <joe...@us...> - 2010-09-05 19:38:45
|
Update of /cvsroot/maxent/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv813

Modified Files:
	CHANGES
Log Message:
Mentioned perceptron support.

Index: CHANGES
===================================================================
RCS file: /cvsroot/maxent/maxent/CHANGES,v
retrieving revision 1.24
retrieving revision 1.25
diff -C2 -d -r1.24 -r1.25
*** CHANGES 28 Sep 2008 18:04:20 -0000 1.24
--- CHANGES 5 Sep 2010 19:38:36 -0000 1.25
***************
*** 3,7 ****
  Removed trove dependency.
  Changed license to ASL.
! Re-organized package structure to support upcomming work.

  2.5.1
--- 3,8 ----
  Removed trove dependency.
  Changed license to ASL.
! Re-organized package structure to support up coming work.
! Added perceptron classifier.

  2.5.1
|
From: Joern K. <joe...@us...> - 2010-08-12 07:56:53
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv4073/src/main/java/opennlp/maxent

Modified Files:
	GISModel.java GISTrainer.java
Log Message:
[ maxent-Bugs-3040940 ] GISModel has been rolled back to version 1.2 and
GISTrainer has been updated to always write one as correction constant
into the model. For the training itself the real correction constant is used.

Index: GISModel.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/GISModel.java,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** GISModel.java 10 Aug 2010 03:25:06 -0000 1.4
--- GISModel.java 12 Aug 2010 07:56:43 -0000 1.5
***************
*** 169,176 ****
      for (int oid = 0; oid < model.getNumOutcomes(); oid++) {
        if (model.getCorrectionParam() != 0) {
!         prior[oid] = Math.exp(prior[oid]+((1.0 - ((double) numfeats[oid] / model.getCorrectionConstant()))
              * model.getCorrectionParam()));
        } else {
!         prior[oid] = Math.exp(prior[oid]);
        }
        normal += prior[oid];
--- 169,176 ----
      for (int oid = 0; oid < model.getNumOutcomes(); oid++) {
        if (model.getCorrectionParam() != 0) {
!         prior[oid] = Math.exp(prior[oid]*model.getConstantInverse()+((1.0 - ((double) numfeats[oid] / model.getCorrectionConstant()))
              * model.getCorrectionParam()));
        } else {
!         prior[oid] = Math.exp(prior[oid]*model.getConstantInverse());
        }
        normal += prior[oid];

Index: GISTrainer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/GISTrainer.java,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** GISTrainer.java 10 Aug 2010 14:33:22 -0000 1.5
--- GISTrainer.java 12 Aug 2010 07:56:43 -0000 1.6
***************
*** 300,304 ****
      modelExpects = new MutableContext[numPreds];
      observedExpects = new MutableContext[numPreds];
!     evalParams = new EvalParameters(params,0,correctionConstant,numOutcomes);
      int[] activeOutcomes = new int[numOutcomes];
      int[] outcomePattern;
--- 300,309 ----
      modelExpects = new MutableContext[numPreds];
      observedExpects = new MutableContext[numPreds];
!
!     // The model does need the correction constant and the correction feature. The correction constant
!     // is only needed during training, and the correction feature is not necessary.
!     // For compatibility reasons the model contains form now on a correction constant of 1,
!     // and a correction param 0.
!     evalParams = new EvalParameters(params,0,1,numOutcomes);
      int[] activeOutcomes = new int[numOutcomes];
      int[] outcomePattern;
***************
*** 375,387 ****
      /***************** Find the parameters ************************/
      display("Computing model parameters...\n");
!     findParameters(iterations);

      /*************** Create and return the model ******************/
!     return new GISModel(params, predLabels, outcomeLabels, correctionConstant, evalParams.getCorrectionParam());

    }

    /* Estimate and return the model parameters. */
!   private void findParameters(int iterations) {
      double prevLL = 0.0;
      double currLL = 0.0;
--- 380,393 ----
      /***************** Find the parameters ************************/
      display("Computing model parameters...\n");
!     findParameters(iterations, correctionConstant);

      /*************** Create and return the model ******************/
!     // To be compatible with old models the correction constant is always 1
!     return new GISModel(params, predLabels, outcomeLabels, 1, evalParams.getCorrectionParam());

    }

    /* Estimate and return the model parameters. */
!   private void findParameters(int iterations, int correctionConstant) {
      double prevLL = 0.0;
      double currLL = 0.0;
***************
*** 394,398 ****
        else
          display(i + ": ");
!       currLL = nextIteration();
        if (i > 1) {
          if (prevLL > currLL) {
--- 400,404 ----
        else
          display(i + ": ");
!       currLL = nextIteration(correctionConstant);
        if (i > 1) {
          if (prevLL > currLL) {
***************
*** 442,446 ****

    /* Compute one iteration of GIS and retutn log-likelihood.*/
!   private double nextIteration() {
      // compute contribution of p(a|b_i) for each feature and the new
      // correction parameter
--- 448,452 ----

    /* Compute one iteration of GIS and retutn log-likelihood.*/
!   private double nextIteration(int correctionConstant) {
      // compute contribution of p(a|b_i) for each feature and the new
      // correction parameter
***************
*** 481,485 ****
        }
        if (useSlackParameter)
!         CFMOD += (evalParams.getCorrectionConstant() - contexts[ei].length) * numTimesEventsSeen[ei];

        loglikelihood += Math.log(modelDistribution[outcomeList[ei]]) * numTimesEventsSeen[ei];
--- 487,491 ----
        }
        if (useSlackParameter)
!         CFMOD += (correctionConstant - contexts[ei].length) * numTimesEventsSeen[ei];

        loglikelihood += Math.log(modelDistribution[outcomeList[ei]]) * numTimesEventsSeen[ei];
***************
*** 507,511 ****
        for (int aoi=0;aoi<activeOutcomes.length;aoi++) {
          if (useGaussianSmoothing) {
!           params[pi].updateParameter(aoi,gaussianUpdate(pi,aoi,numEvents,evalParams.getCorrectionConstant()));
          }
          else {
--- 513,517 ----
        for (int aoi=0;aoi<activeOutcomes.length;aoi++) {
          if (useGaussianSmoothing) {
!           params[pi].updateParameter(aoi,gaussianUpdate(pi,aoi,numEvents,correctionConstant));
          }
          else {
***************
*** 514,518 ****
            }
            //params[pi].updateParameter(aoi,(Math.log(observed[aoi]) - Math.log(model[aoi])));
!           params[pi].updateParameter(aoi,((Math.log(observed[aoi]) - Math.log(model[aoi]))/evalParams.getCorrectionConstant()));
          }
          modelExpects[pi].setParameter(aoi,0.0); // re-initialize to 0.0's
--- 520,524 ----
            }
            //params[pi].updateParameter(aoi,(Math.log(observed[aoi]) - Math.log(model[aoi])));
!           params[pi].updateParameter(aoi,((Math.log(observed[aoi]) - Math.log(model[aoi]))/correctionConstant));
          }
          modelExpects[pi].setParameter(aoi,0.0); // re-initialize to 0.0's
|
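The arithmetic behind the correction-constant change in this commit can be sketched in a few lines. This is a minimal illustration that assumes only what the diff shows: during training each parameter is updated by (log observed - log model) / C with the real constant C, while evaluation multiplies the summed parameters by the stored constant's inverse, which is 1 once the model is written with a constant of 1. The class `GisCorrectionConstantSketch` and its methods are hypothetical and not part of the maxent API, and the correction-parameter term is left out (the commit stores a correction param of 0).

```java
import java.util.Arrays;

// Hypothetical sketch, not maxent code: shows the two places the
// correction constant appears in the diff above.
class GisCorrectionConstantSketch {

  // One GIS update for a single parameter, as in GISTrainer:
  // delta = (log(observed) - log(model)) / C, with C the real training constant.
  static double updateStep(double observed, double model, int correctionConstant) {
    return (Math.log(observed) - Math.log(model)) / correctionConstant;
  }

  // Turns per-outcome parameter sums into a probability distribution.
  // constantInverse is 1/C for the constant stored in the model;
  // with a stored constant of 1 the multiplication changes nothing.
  static double[] eval(double[] sums, double constantInverse) {
    double[] probs = new double[sums.length];
    double normal = 0.0;
    for (int i = 0; i < sums.length; i++) {
      probs[i] = Math.exp(sums[i] * constantInverse);
      normal += probs[i];
    }
    for (int i = 0; i < probs.length; i++) {
      probs[i] /= normal;
    }
    return probs;
  }

  public static void main(String[] args) {
    int realConstant = 4;                              // used only while training
    double delta = updateStep(8.0, 2.0, realConstant); // ln(4) / 4
    double[] sums = {delta, 0.0};
    // The model is stored with constant 1, so the inverse passed here is 1.0.
    System.out.println(Arrays.toString(eval(sums, 1.0)));
  }
}
```

Because the division by C has already been applied to every update, a reader of the stored model needs no knowledge of C, which is consistent with the log message's statement that the real constant is only used for training.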
From: Joern K. <joe...@us...> - 2010-08-10 14:33:32
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv28662/src/main/java/opennlp/maxent

Modified Files:
	GIS.java TrainEval.java GISTrainer.java
Log Message:
[ maxent-Bugs-3042561 ] EventStream should throw IOException

Index: GIS.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/GIS.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** GIS.java 5 Aug 2010 17:42:27 -0000 1.3
--- GIS.java 10 Aug 2010 14:33:22 -0000 1.4
***************
*** 18,22 ****
  package opennlp.maxent;

! import opennlp.model.AbstractModel;
  import opennlp.model.DataIndexer;
  import opennlp.model.EventStream;
--- 18,23 ----
  package opennlp.maxent;

! import java.io.IOException;
!
  import opennlp.model.DataIndexer;
  import opennlp.model.EventStream;
***************
*** 53,57 ****
     * to disk using an opennlp.maxent.io.GISModelWriter object.
     */
!   public static GISModel trainModel(EventStream eventStream) {
      return trainModel(eventStream, 100, 0, false, PRINT_MESSAGES);
    }
--- 54,58 ----
     * to disk using an opennlp.maxent.io.GISModelWriter object.
     */
!   public static GISModel trainModel(EventStream eventStream) throws IOException {
      return trainModel(eventStream, 100, 0, false, PRINT_MESSAGES);
    }
***************
*** 68,72 ****
     * to disk using an opennlp.maxent.io.GISModelWriter object.
     */
!   public static GISModel trainModel(EventStream eventStream, boolean smoothing) {
      return trainModel(eventStream, 100, 0, smoothing,PRINT_MESSAGES);
    }
--- 69,73 ----
     * to disk using an opennlp.maxent.io.GISModelWriter object.
     */
!   public static GISModel trainModel(EventStream eventStream, boolean smoothing) throws IOException {
      return trainModel(eventStream, 100, 0, smoothing,PRINT_MESSAGES);
    }
***************
*** 85,89 ****
    public static GISModel trainModel(EventStream eventStream,
        int iterations,
!       int cutoff) {
      return trainModel(eventStream, iterations, cutoff, false,PRINT_MESSAGES);
    }
--- 86,90 ----
    public static GISModel trainModel(EventStream eventStream,
        int iterations,
!       int cutoff) throws IOException {
      return trainModel(eventStream, iterations, cutoff, false,PRINT_MESSAGES);
    }
***************
*** 105,109 ****
        int iterations,
        int cutoff,
!       boolean smoothing,boolean printMessagesWhileTraining) {
      GISTrainer trainer = new GISTrainer(printMessagesWhileTraining);
      trainer.setSmoothing(smoothing);
--- 106,110 ----
        int iterations,
        int cutoff,
!       boolean smoothing,boolean printMessagesWhileTraining) throws IOException {
      GISTrainer trainer = new GISTrainer(printMessagesWhileTraining);
      trainer.setSmoothing(smoothing);
***************
*** 126,130 ****
        int iterations,
        int cutoff,
!       double sigma) {
      GISTrainer trainer = new GISTrainer(PRINT_MESSAGES);
      if (sigma > 0)
--- 127,131 ----
        int iterations,
        int cutoff,
!       double sigma) throws IOException {
      GISTrainer trainer = new GISTrainer(PRINT_MESSAGES);
      if (sigma > 0)

Index: TrainEval.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/TrainEval.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** TrainEval.java 22 Jan 2009 23:23:34 -0000 1.1
--- TrainEval.java 10 Aug 2010 14:33:22 -0000 1.2
***************
*** 18,31 ****
  package opennlp.maxent;

- import java.io.File;
- import java.io.FileReader;
  import java.io.IOException;
  import java.io.Reader;

- import opennlp.maxent.io.SuffixSensitiveGISModelReader;
- import opennlp.maxent.io.SuffixSensitiveGISModelWriter;
- import opennlp.model.AbstractModel;
  import opennlp.model.Event;
- import opennlp.model.EventCollectorAsStream;
  import opennlp.model.EventStream;
  import opennlp.model.MaxentModel;
--- 18,25 ----
***************
*** 69,73 ****
    }

!   public static MaxentModel train(EventStream events, int cutoff) {
      return GIS.trainModel(events, 100, cutoff);
    }
--- 63,67 ----
    }

!   public static MaxentModel train(EventStream events, int cutoff) throws IOException {
      return GIS.trainModel(events, 100, cutoff);
    }

Index: GISTrainer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/GISTrainer.java,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** GISTrainer.java 10 Aug 2010 03:25:06 -0000 1.4
--- GISTrainer.java 10 Aug 2010 14:33:22 -0000 1.5
***************
*** 18,21 ****
--- 18,23 ----
  package opennlp.maxent;

+ import java.io.IOException;
+
  import opennlp.model.DataIndexer;
  import opennlp.model.EvalParameters;
***************
*** 199,203 ****
     * @return A GIS model trained with specified
     */
!   public GISModel trainModel(EventStream eventStream, int iterations, int cutoff) {
      return trainModel(iterations, new OnePassDataIndexer(eventStream,cutoff),cutoff);
    }
--- 201,205 ----
     * @return A GIS model trained with specified
     */
!   public GISModel trainModel(EventStream eventStream, int iterations, int cutoff) throws IOException {
      return trainModel(iterations, new OnePassDataIndexer(eventStream,cutoff),cutoff);
    }
|
From: Joern K. <joe...@us...> - 2010-08-10 14:33:32
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/model
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv28662/src/main/java/opennlp/model

Modified Files:
	OnePassRealValueDataIndexer.java TwoPassDataIndexer.java
	EventStream.java OnePassDataIndexer.java
Log Message:
[ maxent-Bugs-3042561 ] EventStream should throw IOException

Index: OnePassRealValueDataIndexer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/model/OnePassRealValueDataIndexer.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** OnePassRealValueDataIndexer.java 22 Jan 2009 23:23:33 -0000 1.1
--- OnePassRealValueDataIndexer.java 10 Aug 2010 14:33:22 -0000 1.2
***************
*** 18,21 ****
--- 18,22 ----
  package opennlp.model;

+ import java.io.IOException;
  import java.util.ArrayList;
  import java.util.Arrays;
***************
*** 36,40 ****
    float[][] values;

!   public OnePassRealValueDataIndexer(EventStream eventStream, int cutoff, boolean sort) {
      super(eventStream,cutoff,sort);
    }
--- 37,41 ----
    float[][] values;

!   public OnePassRealValueDataIndexer(EventStream eventStream, int cutoff, boolean sort) throws IOException {
      super(eventStream,cutoff,sort);
    }
***************
*** 47,51 ****
     * observed in order to be included in the model.
     */
!   public OnePassRealValueDataIndexer(EventStream eventStream, int cutoff) {
      super(eventStream,cutoff);
    }
--- 48,52 ----
     * observed in order to be included in the model.
     */
!   public OnePassRealValueDataIndexer(EventStream eventStream, int cutoff) throws IOException {
      super(eventStream,cutoff);
    }

Index: TwoPassDataIndexer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/model/TwoPassDataIndexer.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** TwoPassDataIndexer.java 15 Mar 2009 03:25:05 -0000 1.2
--- TwoPassDataIndexer.java 10 Aug 2010 14:33:22 -0000 1.3
***************
*** 135,139 ****
    }

!   private List index(int numEvents, EventStream es, Map<String,Integer> predicateIndex) {
      Map<String,Integer> omap = new HashMap<String,Integer>();
      int outcomeCount = 0;
--- 135,139 ----
    }

!   private List index(int numEvents, EventStream es, Map<String,Integer> predicateIndex) throws IOException {
      Map<String,Integer> omap = new HashMap<String,Integer>();
      int outcomeCount = 0;

Index: EventStream.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/model/EventStream.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** EventStream.java 22 Jan 2009 23:23:33 -0000 1.1
--- EventStream.java 10 Aug 2010 14:33:22 -0000 1.2
***************
*** 18,50 ****
  package opennlp.model;

! import java.util.Iterator;

  /**
!  * A object which can deliver a stream of training events for the GIS
!  * procedure (or others such as IIS if and when they are implemented).
!  * EventStreams don't need to use opennlp.maxent.DataStreams, but doing so
!  * would provide greater flexibility for producing events from data stored in
!  * different formats.
!  *
!  * @author Jason Baldridge
!  * @version $Revision$, $Date$
!  *
   */
! public interface EventStream extends Iterator<Event>{

!   /**
!    * Returns the next Event object held in this EventStream.
!    *
!    * @return the Event object which is next in this EventStream
!    */
!   public Event next ();
!
!   /**
!    * Test whether there are any Events remaining in this EventStream.
!    *
!    * @return true if this EventStream has more Events
!    */
!   public boolean hasNext ();
!
! }
--- 18,48 ----
  package opennlp.model;

! import java.io.IOException;

  /**
!  * A object which can deliver a stream of training events for the GIS procedure
!  * (or others such as IIS if and when they are implemented). EventStreams don't
!  * need to use opennlp.maxent.DataStreams, but doing so would provide greater
!  * flexibility for producing events from data stored in different formats.
!  *
!  * @author Jason Baldridge
!  * @version $Revision$, $Date$
!  *
   */
! public interface EventStream {

!   /**
!    * Returns the next Event object held in this EventStream.
!    *
!    * @return the Event object which is next in this EventStream
!    */
!   public Event next() throws IOException;
!
!   /**
!    * Test whether there are any Events remaining in this EventStream.
!    *
!    * @return true if this EventStream has more Events
!    */
!   public boolean hasNext() throws IOException;

+ }

Index: OnePassDataIndexer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/model/OnePassDataIndexer.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** OnePassDataIndexer.java 22 Jan 2009 23:23:33 -0000 1.1
--- OnePassDataIndexer.java 10 Aug 2010 14:33:22 -0000 1.2
***************
*** 18,21 ****
--- 18,22 ----
  package opennlp.model;

+ import java.io.IOException;
  import java.util.ArrayList;
  import java.util.Arrays;
***************
*** 46,54 ****
     * seen in the training data.
     */
!   public OnePassDataIndexer(EventStream eventStream) {
      this(eventStream, 0);
    }

!   public OnePassDataIndexer(EventStream eventStream, int cutoff) {
      this(eventStream,cutoff,true);
    }
--- 47,55 ----
     * seen in the training data.
     */
!   public OnePassDataIndexer(EventStream eventStream) throws IOException {
      this(eventStream, 0);
    }

!   public OnePassDataIndexer(EventStream eventStream, int cutoff) throws IOException {
      this(eventStream,cutoff,true);
    }
***************
*** 61,65 ****
     * observed in order to be included in the model.
     */
!   public OnePassDataIndexer(EventStream eventStream, int cutoff, boolean sort) {
      Map<String,Integer> predicateIndex = new HashMap<String,Integer>();
      LinkedList<Event> events;
--- 62,66 ----
     * observed in order to be included in the model.
     */
!   public OnePassDataIndexer(EventStream eventStream, int cutoff, boolean sort) throws IOException {
      Map<String,Integer> predicateIndex = new HashMap<String,Integer>();
      LinkedList<Event> events;
***************
*** 100,104 ****
     */
    private LinkedList<Event> computeEventCounts(EventStream eventStream,Map<String,Integer> predicatesInOut,
!       int cutoff) {
      Set predicateSet = new HashSet();
      Map<String,Integer> counter = new HashMap<String,Integer>();
--- 101,105 ----
     */
    private LinkedList<Event> computeEventCounts(EventStream eventStream,Map<String,Integer> predicatesInOut,
!       int cutoff) throws IOException {
      Set predicateSet = new HashSet();
      Map<String,Integer> counter = new HashMap<String,Integer>();
|
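The effect of this change on implementers and callers can be illustrated with a small sketch. It assumes only what the diff shows: `next()` and `hasNext()` now declare `IOException`, and the interface no longer extends `Iterator`. The nested `EventStream` interface below is a local stand-in that returns `String` rather than `opennlp.model.Event`, and `LineEventStream` and `countEvents` are hypothetical names introduced for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.NoSuchElementException;

// Hypothetical sketch, not maxent code.
class IoEventStreamSketch {

  // Local stand-in for the changed interface; the real one returns
  // opennlp.model.Event instead of String.
  interface EventStream {
    String next() throws IOException;
    boolean hasNext() throws IOException;
  }

  // Reads one event per line. Read failures now propagate as IOException
  // instead of having to be swallowed or wrapped inside the stream.
  static final class LineEventStream implements EventStream {
    private final BufferedReader reader;
    private String pending; // look-ahead line; null if none has been buffered

    LineEventStream(BufferedReader reader) {
      this.reader = reader;
    }

    public boolean hasNext() throws IOException {
      if (pending == null) {
        pending = reader.readLine();
      }
      return pending != null;
    }

    public String next() throws IOException {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      String line = pending;
      pending = null;
      return line;
    }
  }

  // A caller in the style of the data indexers: it must now declare
  // or handle IOException itself.
  static int countEvents(EventStream es) throws IOException {
    int n = 0;
    while (es.hasNext()) {
      es.next();
      n++;
    }
    return n;
  }

  public static void main(String[] args) throws IOException {
    EventStream es = new LineEventStream(
        new BufferedReader(new StringReader("a b x\nc d y\n")));
    System.out.println(countEvents(es));
  }
}
```

Since the interface no longer extends `Iterator`, an enhanced `for` loop cannot be used over such a stream; the explicit `hasNext()`/`next()` loop is the pattern the diffs above rely on.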
From: Joern K. <joe...@us...> - 2010-08-10 14:33:32
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/perceptron
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv28662/src/main/java/opennlp/perceptron

Modified Files:
	SimplePerceptronSequenceTrainer.java
Log Message:
[ maxent-Bugs-3042561 ] EventStream should throw IOException

Index: SimplePerceptronSequenceTrainer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/perceptron/SimplePerceptronSequenceTrainer.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** SimplePerceptronSequenceTrainer.java 25 Mar 2009 23:15:34 -0000 1.3
--- SimplePerceptronSequenceTrainer.java 10 Aug 2010 14:33:22 -0000 1.4
***************
*** 18,21 ****
--- 18,22 ----
  package opennlp.perceptron;

+ import java.io.IOException;
  import java.util.HashMap;
  import java.util.Map;
***************
*** 78,82 ****
    int numSequences;

!   public AbstractModel trainModel(int iterations, SequenceStream sequenceStream, int cutoff, boolean useAverage) {
      this.iterations = iterations;
      this.sequenceStream = sequenceStream;
--- 79,83 ----
    int numSequences;

!   public AbstractModel trainModel(int iterations, SequenceStream sequenceStream, int cutoff, boolean useAverage) throws IOException {
      this.iterations = iterations;
      this.sequenceStream = sequenceStream;
|
From: Joern K. <joe...@us...> - 2010-08-10 09:27:04
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/io
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv11976/src/main/java/opennlp/maxent/io

Modified Files:
	BinaryGISModelWriter.java
Log Message:
Organized imports.

Index: BinaryGISModelWriter.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/io/BinaryGISModelWriter.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** BinaryGISModelWriter.java 22 Jan 2009 23:23:33 -0000 1.1
--- BinaryGISModelWriter.java 10 Aug 2010 09:26:55 -0000 1.2
***************
*** 18,26 ****
  package opennlp.maxent.io;

! import opennlp.maxent.*;
! import opennlp.model.AbstractModel;

! import java.io.*;
! import java.util.zip.*;

  /**
--- 18,28 ----
  package opennlp.maxent.io;

! import java.io.DataOutputStream;
! import java.io.File;
! import java.io.FileOutputStream;
! import java.io.IOException;
! import java.util.zip.GZIPOutputStream;

! import opennlp.model.AbstractModel;

  /**
|
From: Joern K. <joe...@us...> - 2010-08-10 07:38:33
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv27395/src/main/java/opennlp/maxent Modified Files: ModelTrainer.java ModelApplier.java Log Message: Formated to apply to maxent code style. Index: ModelTrainer.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/ModelTrainer.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** ModelTrainer.java 5 Aug 2010 17:42:27 -0000 1.1 --- ModelTrainer.java 10 Aug 2010 07:38:24 -0000 1.2 *************** *** 30,37 **** /** ! * Main class which calls the GIS procedure after building the EventStream ! * from the data. ! * ! * @author Chieu Hai Leong and Jason Baldridge * @version $Revision$, $Date$ */ --- 30,37 ---- /** ! * Main class which calls the GIS procedure after building the EventStream from ! * the data. ! * ! * @author Chieu Hai Leong and Jason Baldridge * @version $Revision$, $Date$ */ *************** *** 39,55 **** // some parameters if you want to play around with the smoothing option ! // for model training. This can improve model accuracy, though training ! // will potentially take longer and use more memory. Model size will also ! // be larger. Initial testing indicates improvements for models built on // small data sets and few outcomes, but performance degradation for those // with large data sets and lots of outcomes. public static boolean USE_SMOOTHING = false; public static double SMOOTHING_OBSERVATION = 0.1; ! private static void usage() { System.err.println("java ModelTrainer [-real] dataFile modelFile"); System.exit(1); } ! /** * Main method. Call as follows: --- 39,55 ---- // some parameters if you want to play around with the smoothing option ! // for model training. This can improve model accuracy, though training ! // will potentially take longer and use more memory. Model size will also ! // be larger. 
Initial testing indicates improvements for models built on // small data sets and few outcomes, but performance degradation for those // with large data sets and lots of outcomes. public static boolean USE_SMOOTHING = false; public static double SMOOTHING_OBSERVATION = 0.1; ! private static void usage() { System.err.println("java ModelTrainer [-real] dataFile modelFile"); System.exit(1); } ! /** * Main method. Call as follows: *************** *** 57,61 **** * java ModelTrainer dataFile modelFile */ ! public static void main (String[] args) { int ai = 0; boolean real = false; --- 57,61 ---- * java ModelTrainer dataFile modelFile */ ! public static void main(String[] args) { int ai = 0; boolean real = false; *************** *** 65,90 **** double sigma = 1.0; ! if(args.length == 0) { usage(); } while (args[ai].startsWith("-")) { if (args[ai].equals("-real")) { ! real = true; ! } ! else if (args[ai].equals("-perceptron")) { ! type = "perceptron"; ! } ! else if (args[ai].equals("-maxit")) { ! maxit = Integer.parseInt(args[++ai]); ! } ! else if (args[ai].equals("-cutoff")) { ! cutoff = Integer.parseInt(args[++ai]); ! } ! else if (args[ai].equals("-sigma")) { ! sigma = Double.parseDouble(args[++ai]); ! } ! else { ! System.err.println("Unknown option: "+args[ai]); ! usage(); } ai++; --- 65,85 ---- double sigma = 1.0; ! if (args.length == 0) { usage(); } while (args[ai].startsWith("-")) { if (args[ai].equals("-real")) { ! real = true; ! } else if (args[ai].equals("-perceptron")) { ! type = "perceptron"; ! } else if (args[ai].equals("-maxit")) { ! maxit = Integer.parseInt(args[++ai]); ! } else if (args[ai].equals("-cutoff")) { ! cutoff = Integer.parseInt(args[++ai]); ! } else if (args[ai].equals("-sigma")) { ! sigma = Double.parseDouble(args[++ai]); ! } else { ! System.err.println("Unknown option: " + args[ai]); ! usage(); } ai++; *************** *** 95,126 **** FileReader datafr = new FileReader(new File(dataFileName)); EventStream es; ! if (!real) { ! 
es = new BasicEventStream(new PlainTextByLineDataStream(datafr),","); ! } ! else { ! es = new RealBasicEventStream(new PlainTextByLineDataStream(datafr)); } GIS.SMOOTHING_OBSERVATION = SMOOTHING_OBSERVATION; AbstractModel model; if (type.equals("maxent")) { ! ! if (!real) { ! model = GIS.trainModel(es,maxit,cutoff,sigma); ! } ! else { ! model = GIS.trainModel(maxit, new OnePassRealValueDataIndexer(es,0), USE_SMOOTHING); ! } ! } ! else if (type.equals("perceptron")){ ! System.err.println("Perceptron training"); ! model = new PerceptronTrainer().trainModel(10, new OnePassDataIndexer(es,0),0); ! } ! else { ! System.err.println("Unknown model type: "+type); ! model = null; } ! File outputFile = new File(modelFileName); ! GISModelWriter writer = new SuffixSensitiveGISModelWriter(model, outputFile); writer.persist(); } catch (Exception e) { --- 90,120 ---- FileReader datafr = new FileReader(new File(dataFileName)); EventStream es; ! if (!real) { ! es = new BasicEventStream(new PlainTextByLineDataStream(datafr), ","); ! } else { ! es = new RealBasicEventStream(new PlainTextByLineDataStream(datafr)); } GIS.SMOOTHING_OBSERVATION = SMOOTHING_OBSERVATION; AbstractModel model; if (type.equals("maxent")) { ! ! if (!real) { ! model = GIS.trainModel(es, maxit, cutoff, sigma); ! } else { ! model = GIS.trainModel(maxit, new OnePassRealValueDataIndexer(es, 0), ! USE_SMOOTHING); ! } ! } else if (type.equals("perceptron")) { ! System.err.println("Perceptron training"); ! model = new PerceptronTrainer().trainModel(10, new OnePassDataIndexer( ! es, 0), 0); ! } else { ! System.err.println("Unknown model type: " + type); ! model = null; } ! File outputFile = new File(modelFileName); ! GISModelWriter writer = new SuffixSensitiveGISModelWriter(model, ! 
outputFile); writer.persist(); } catch (Exception e) { *************** *** 132,134 **** } - --- 126,127 ---- Index: ModelApplier.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/ModelApplier.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** ModelApplier.java 9 Aug 2010 18:58:42 -0000 1.2 --- ModelApplier.java 10 Aug 2010 07:38:24 -0000 1.3 *************** *** 23,31 **** import opennlp.model.Event; - import opennlp.maxent.BasicContextGenerator; - import opennlp.maxent.ContextGenerator; - import opennlp.maxent.DataStream; import opennlp.model.EventStream; - import opennlp.maxent.PlainTextByLineDataStream; import opennlp.model.GenericModelReader; import opennlp.model.MaxentModel; --- 23,27 ---- *************** *** 34,39 **** /** * Test the model on some input. ! * ! * @author Jason Baldridge * @version $Revision$, $Date$ */ --- 30,35 ---- /** * Test the model on some input. ! * ! * @author Jason Baldridge * @version $Revision$, $Date$ */ *************** *** 46,58 **** public static final DecimalFormat ROUNDED_FORMAT = new DecimalFormat("0.000"); ! public ModelApplier (MaxentModel m) { _model = m; } ! ! private void eval (Event event) { ! eval(event,false); } ! ! private void eval (Event event, boolean real) { String outcome = event.getOutcome(); --- 42,54 ---- public static final DecimalFormat ROUNDED_FORMAT = new DecimalFormat("0.000"); ! public ModelApplier(MaxentModel m) { _model = m; } ! ! private void eval(Event event) { ! eval(event, false); } ! ! private void eval(Event event, boolean real) { String outcome = event.getOutcome(); *************** *** 64,84 **** } else { float[] values = RealValueFileEventStream.parseContexts(context); ! ocs = _model.eval(context,values); } int best = 0; ! for (int i = 1; i<ocs.length; i++) ! if (ocs[i] > ocs[best]) best = i; String predictedLabel = _model.getOutcome(best); String madeError = "+"; ! 
if (predictedLabel.equals(outcome)) madeError = ""; ! System.out.println(counter + "\t0:"+outcome+"\t0:" + _model.getOutcome(best) + "\t"+madeError+"\t" + ROUNDED_FORMAT.format(ocs[best])); counter++; ! } ! private static void usage() { System.err.println("java ModelApplier [-real] modelFile dataFile"); --- 60,83 ---- } else { float[] values = RealValueFileEventStream.parseContexts(context); ! ocs = _model.eval(context, values); } int best = 0; ! for (int i = 1; i < ocs.length; i++) ! if (ocs[i] > ocs[best]) ! best = i; String predictedLabel = _model.getOutcome(best); String madeError = "+"; ! if (predictedLabel.equals(outcome)) madeError = ""; ! System.out.println(counter + "\t0:" + outcome + "\t0:" ! + _model.getOutcome(best) + "\t" + madeError + "\t" ! + ROUNDED_FORMAT.format(ocs[best])); counter++; ! } ! private static void usage() { System.err.println("java ModelApplier [-real] modelFile dataFile"); *************** *** 98,112 **** if (args.length > 0) { while (args[ai].startsWith("-")) { ! if (args[ai].equals("-real")) { ! real = true; ! } ! else if (args[ai].equals("-perceptron")) { ! type = "perceptron"; ! } ! else { ! usage(); ! } ! ai++; ! } modelFileName = args[ai++]; dataFileName = args[ai++]; --- 97,109 ---- if (args.length > 0) { while (args[ai].startsWith("-")) { ! if (args[ai].equals("-real")) { ! real = true; ! } else if (args[ai].equals("-perceptron")) { ! type = "perceptron"; ! } else { ! usage(); ! } ! ai++; ! } modelFileName = args[ai++]; dataFileName = args[ai++]; *************** *** 114,122 **** ModelApplier predictor = null; try { ! MaxentModel m = new GenericModelReader(new File(modelFileName)).getModel(); ! predictor = new ModelApplier(m); } catch (Exception e) { ! e.printStackTrace(); ! System.exit(0); } --- 111,120 ---- ModelApplier predictor = null; try { ! MaxentModel m = new GenericModelReader(new File(modelFileName)) ! .getModel(); ! predictor = new ModelApplier(m); } catch (Exception e) { ! e.printStackTrace(); ! 
System.exit(0); } *************** *** 124,141 **** System.out.println(" inst# actual predicted error prediction"); try { ! EventStream es = ! new BasicEventStream(new PlainTextByLineDataStream( ! new FileReader(new File(dataFileName))), ","); ! ! while (es.hasNext()) { ! predictor.eval(es.next(),real); ! } ! ! return; ! } ! catch (Exception e) { ! System.out.println("Unable to read from specified file: "+modelFileName); ! System.out.println(); ! e.printStackTrace(); } } --- 122,138 ---- System.out.println(" inst# actual predicted error prediction"); try { ! EventStream es = new BasicEventStream(new PlainTextByLineDataStream( ! new FileReader(new File(dataFileName))), ","); ! ! while (es.hasNext()) { ! predictor.eval(es.next(), real); ! } ! ! return; ! } catch (Exception e) { ! System.out.println("Unable to read from specified file: " ! + modelFileName); ! System.out.println(); ! e.printStackTrace(); } } |
From: Jason B. <jas...@us...> - 2010-08-10 03:25:15
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv20282/src/main/java/opennlp/maxent

Modified Files:
    GISModel.java GISTrainer.java
Log Message:
Fixed bug in which correction constant was being used incorrectly in GISModel. It is now used in GISTrainer for parameter updates.

Index: GISModel.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/GISModel.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** GISModel.java 5 Aug 2010 17:42:27 -0000 1.3
--- GISModel.java 10 Aug 2010 03:25:06 -0000 1.4
***************
*** 169,173 ****
      for (int oid = 0; oid < model.getNumOutcomes(); oid++) {
        if (model.getCorrectionParam() != 0) {
!         prior[oid] = Math.exp(prior[oid]*model.getConstantInverse()+((1.0 - ((double) numfeats[oid] / model.getCorrectionConstant())) * model.getCorrectionParam()));
        }
        else {
--- 169,173 ----
      for (int oid = 0; oid < model.getNumOutcomes(); oid++) {
        if (model.getCorrectionParam() != 0) {
!         prior[oid] = Math.exp(prior[oid]+((1.0 - ((double) numfeats[oid] / model.getCorrectionConstant())) * model.getCorrectionParam()));
        }
        else {

Index: GISTrainer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/GISTrainer.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** GISTrainer.java 5 Aug 2010 17:42:27 -0000 1.3
--- GISTrainer.java 10 Aug 2010 03:25:06 -0000 1.4
***************
*** 511,515 ****
              System.err.println("Model expects == 0 for "+predLabels[pi]+" "+outcomeLabels[aoi]);
            }
!           params[pi].updateParameter(aoi,(Math.log(observed[aoi]) - Math.log(model[aoi])));
          }
          modelExpects[pi].setParameter(aoi,0.0); // re-initialize to 0.0's
--- 511,516 ----
              System.err.println("Model expects == 0 for "+predLabels[pi]+" "+outcomeLabels[aoi]);
            }
!           //params[pi].updateParameter(aoi,(Math.log(observed[aoi]) - Math.log(model[aoi])));
!           params[pi].updateParameter(aoi,((Math.log(observed[aoi]) - Math.log(model[aoi]))/evalParams.getCorrectionConstant()));
          }
          modelExpects[pi].setParameter(aoi,0.0); // re-initialize to 0.0's
|
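The GISTrainer change above scales each parameter update by the inverse of the correction constant, which matches the usual GIS step of (1/C) times the log ratio of observed to model expectations. A minimal standalone sketch of that arithmetic follows; the class and method names are hypothetical and are not part of the maxent package.

```java
// Illustrative sketch only: GisUpdateSketch and gisDelta are hypothetical names,
// not classes or methods from the maxent package.
final class GisUpdateSketch {

    /**
     * One GIS parameter increment:
     * delta = (ln(observed) - ln(modelExpected)) / correctionConstant.
     */
    static double gisDelta(double observed, double modelExpected, double correctionConstant) {
        return (Math.log(observed) - Math.log(modelExpected)) / correctionConstant;
    }

    public static void main(String[] args) {
        // The feature was observed twice as often as the model expects, with C = 2.
        System.out.println(gisDelta(4.0, 2.0, 2.0));
    }
}
```

With a correction constant of 1 the increment reduces to the plain log ratio that the commented-out line computed.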
From: Jason B. <jas...@us...> - 2010-08-09 18:58:52
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv5183 Modified Files: ModelApplier.java Log Message: Fixed tabs -> spaces. Index: ModelApplier.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/ModelApplier.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** ModelApplier.java 5 Aug 2010 17:42:27 -0000 1.1 --- ModelApplier.java 9 Aug 2010 18:58:42 -0000 1.2 *************** *** 39,143 **** */ public class ModelApplier { ! MaxentModel _model; ! ContextGenerator _cg = new BasicContextGenerator(","); ! int counter = 1; ! // The format for printing percentages ! public static final DecimalFormat ROUNDED_FORMAT = new DecimalFormat("0.000"); ! public ModelApplier (MaxentModel m) { ! _model = m; ! } ! private void eval (Event event) { ! eval(event,false); ! } ! private void eval (Event event, boolean real) { ! String outcome = event.getOutcome(); ! String[] context = event.getContext(); ! double[] ocs; ! if (!real) { ! ocs = _model.eval(context); ! } else { ! float[] values = RealValueFileEventStream.parseContexts(context); ! ocs = _model.eval(context,values); ! } ! int best = 0; ! for (int i = 1; i<ocs.length; i++) ! if (ocs[i] > ocs[best]) best = i; ! String predictedLabel = _model.getOutcome(best); ! String madeError = "+"; ! if (predictedLabel.equals(outcome)) ! madeError = ""; ! System.out.println(counter + "\t0:"+outcome+"\t0:" + _model.getOutcome(best) + "\t"+madeError+"\t" + ROUNDED_FORMAT.format(ocs[best])); ! counter++; ! } ! private static void usage() { ! System.err.println("java ModelApplier [-real] modelFile dataFile"); ! System.exit(1); ! } ! /** ! * Main method. Call as follows: ! * <p> ! * java ModelApplier modelFile dataFile ! */ ! public static void main(String[] args) { ! String dataFileName, modelFileName; ! boolean real = false; ! String type = "maxent"; ! 
int ai = 0; ! if (args.length > 0) { ! while (args[ai].startsWith("-")) { ! if (args[ai].equals("-real")) { ! real = true; ! } ! else if (args[ai].equals("-perceptron")) { ! type = "perceptron"; ! } ! else { ! usage(); ! } ! ai++; ! } ! modelFileName = args[ai++]; ! dataFileName = args[ai++]; ! ModelApplier predictor = null; ! try { ! MaxentModel m = new GenericModelReader(new File(modelFileName)).getModel(); ! predictor = new ModelApplier(m); ! } catch (Exception e) { ! e.printStackTrace(); ! System.exit(0); ! } ! System.out.println("=== Predictions on test data ===\n"); ! System.out.println(" inst# actual predicted error prediction"); ! try { ! EventStream es = ! new BasicEventStream(new PlainTextByLineDataStream( ! new FileReader(new File(dataFileName))), ","); ! while (es.hasNext()) { ! predictor.eval(es.next(),real); ! } ! ! return; ! } ! catch (Exception e) { ! System.out.println("Unable to read from specified file: "+modelFileName); ! System.out.println(); ! e.printStackTrace(); ! } } } } --- 39,143 ---- */ public class ModelApplier { ! MaxentModel _model; ! ContextGenerator _cg = new BasicContextGenerator(","); ! int counter = 1; ! // The format for printing percentages ! public static final DecimalFormat ROUNDED_FORMAT = new DecimalFormat("0.000"); ! public ModelApplier (MaxentModel m) { ! _model = m; ! } ! private void eval (Event event) { ! eval(event,false); ! } ! private void eval (Event event, boolean real) { ! String outcome = event.getOutcome(); ! String[] context = event.getContext(); ! double[] ocs; ! if (!real) { ! ocs = _model.eval(context); ! } else { ! float[] values = RealValueFileEventStream.parseContexts(context); ! ocs = _model.eval(context,values); ! } ! int best = 0; ! for (int i = 1; i<ocs.length; i++) ! if (ocs[i] > ocs[best]) best = i; ! String predictedLabel = _model.getOutcome(best); ! String madeError = "+"; ! if (predictedLabel.equals(outcome)) ! madeError = ""; ! 
System.out.println(counter + "\t0:"+outcome+"\t0:" + _model.getOutcome(best) + "\t"+madeError+"\t" + ROUNDED_FORMAT.format(ocs[best])); ! counter++; ! } ! private static void usage() { ! System.err.println("java ModelApplier [-real] modelFile dataFile"); ! System.exit(1); ! } ! /** ! * Main method. Call as follows: ! * <p> ! * java ModelApplier modelFile dataFile ! */ ! public static void main(String[] args) { ! String dataFileName, modelFileName; ! boolean real = false; ! String type = "maxent"; ! int ai = 0; ! if (args.length > 0) { ! while (args[ai].startsWith("-")) { ! if (args[ai].equals("-real")) { ! real = true; ! } ! else if (args[ai].equals("-perceptron")) { ! type = "perceptron"; ! } ! else { ! usage(); ! } ! ai++; ! } ! modelFileName = args[ai++]; ! dataFileName = args[ai++]; ! ModelApplier predictor = null; ! try { ! MaxentModel m = new GenericModelReader(new File(modelFileName)).getModel(); ! predictor = new ModelApplier(m); ! } catch (Exception e) { ! e.printStackTrace(); ! System.exit(0); ! } ! System.out.println("=== Predictions on test data ===\n"); ! System.out.println(" inst# actual predicted error prediction"); ! try { ! EventStream es = ! new BasicEventStream(new PlainTextByLineDataStream( ! new FileReader(new File(dataFileName))), ","); ! while (es.hasNext()) { ! predictor.eval(es.next(),real); } + + return; + } + catch (Exception e) { + System.out.println("Unable to read from specified file: "+modelFileName); + System.out.println(); + e.printStackTrace(); + } } + } } |
From: Jason B. <jas...@us...> - 2010-08-09 18:44:32
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv2252

Modified Files:
    BasicEventStream.java
Log Message:
Made separator variable private. Constructor which takes only a datastream calls constructor with datastream and separator string, with a single space as the separator.

Index: BasicEventStream.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/BasicEventStream.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** BasicEventStream.java 5 Aug 2010 17:42:27 -0000 1.3
--- BasicEventStream.java 9 Aug 2010 18:44:23 -0000 1.4
***************
*** 39,52 ****
    Event next;

!   String separator = " ";
!
!   public BasicEventStream (DataStream ds) {
!     this.ds = ds;
!     cg = new BasicContextGenerator();
!     if (this.ds.hasNext())
!       next = createEvent((String)this.ds.nextToken());
!   }

!   public BasicEventStream (DataStream ds, String sep) {
      separator = sep;
      cg = new BasicContextGenerator(separator);
--- 39,45 ----
    Event next;

!   private String separator = " ";

!   public BasicEventStream (DataStream ds, String sep) {
      separator = sep;
      cg = new BasicContextGenerator(separator);
***************
*** 56,60 ****
    }

!   /**
     * Returns the next Event object held in this EventStream. Each call to nextEvent advances the EventStream.
--- 49,56 ----
    }

!   public BasicEventStream (DataStream ds) {
!     this(ds, " ");
!   }
!   /**
     * Returns the next Event object held in this EventStream. Each call to nextEvent advances the EventStream.
|
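The constructor delegation in this commit, together with the separator-based event format (cp_1,cp_2,...,cp_n,outcome), can be sketched as below. This is a simplified stand-in with a hypothetical class name, not the BasicEventStream code. One assumption worth flagging: the sketch quotes the separator with Pattern.quote because String.split treats its argument as a regular expression, whereas the committed BasicContextGenerator passes the separator to split directly.

```java
import java.util.regex.Pattern;

// Illustrative sketch only: SeparatedEventSketch is a hypothetical class,
// not part of the maxent package.
final class SeparatedEventSketch {
    final String[] context;
    final String outcome;

    /** Default separator is a single space; delegates to the two-argument constructor. */
    SeparatedEventSketch(String line) {
        this(line, " ");
    }

    SeparatedEventSketch(String line, String separator) {
        // The outcome is whatever follows the last separator on the line.
        int last = line.lastIndexOf(separator);
        if (last == -1) {
            throw new IllegalArgumentException("No separator found in: " + line);
        }
        outcome = line.substring(last + separator.length());
        // Quote the separator so characters such as "|" or "." are matched literally.
        context = line.substring(0, last).split(Pattern.quote(separator));
    }

    public static void main(String[] args) {
        SeparatedEventSketch e = new SeparatedEventSketch("sunny,warm,play", ",");
        System.out.println(e.outcome + " <- " + String.join(" ", e.context));
    }
}
```

Keeping all initialization in the two-argument constructor means the default-separator path cannot drift out of sync with it, which is the point of the change above.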
From: Jason B. <jas...@us...> - 2010-08-09 18:43:27
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv2053

Modified Files:
    BasicContextGenerator.java
Log Message:
Made separator variable private.

Index: BasicContextGenerator.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/BasicContextGenerator.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** BasicContextGenerator.java 5 Aug 2010 17:42:27 -0000 1.2
--- BasicContextGenerator.java 9 Aug 2010 18:43:17 -0000 1.3
***************
*** 34,38 ****
  public class BasicContextGenerator implements ContextGenerator {

!   String separator = " ";

    public BasicContextGenerator () {}
--- 34,38 ----
  public class BasicContextGenerator implements ContextGenerator {

!   private String separator = " ";

    public BasicContextGenerator () {}
|
From: Jason B. <jba...@ma...> - 2010-08-05 17:48:59
|
Oops -- that was only supposed to be a commit of GISModel.java. The other changes are there to support easy command line training and use of models, supporting the following feature request I opened:

https://sourceforge.net/tracker/?func=detail&aid=3039649&group_id=5961&atid=355961

Jason

On Thu, Aug 5, 2010 at 12:42 PM, Jason Baldridge <jas...@us...> wrote:

> Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent
> In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv7201
>
> Modified Files:
>        BasicContextGenerator.java BasicEventStream.java GIS.java
>        GISModel.java GISTrainer.java
> Added Files:
>        ModelApplier.java ModelTrainer.java
> Log Message:
> Fixed bug in which the prior value was multiplied with the constant inverse
> when correction constant was not being used.
>

--
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://comp.ling.utexas.edu/people/jason_baldridge
|
From: Jason B. <jas...@us...> - 2010-08-05 17:42:36
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv7201 Modified Files: BasicContextGenerator.java BasicEventStream.java GIS.java GISModel.java GISTrainer.java Added Files: ModelApplier.java ModelTrainer.java Log Message: Fixed bug in which the prior value was multiplied with the constant inverse when correction constant was not being used. --- NEW FILE: ModelApplier.java --- /* * Licensed to the Apache Software Foundation (ASF) under one or more * contributor license agreements. See the NOTICE file distributed with * this work for additional information regarding copyright ownership. * The ASF licenses this file to You under the Apache License, Version 2.0 * (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. */ package opennlp.maxent; import java.io.File; import java.io.FileReader; import java.text.DecimalFormat; import opennlp.model.Event; import opennlp.maxent.BasicContextGenerator; import opennlp.maxent.ContextGenerator; import opennlp.maxent.DataStream; import opennlp.model.EventStream; import opennlp.maxent.PlainTextByLineDataStream; import opennlp.model.GenericModelReader; import opennlp.model.MaxentModel; import opennlp.model.RealValueFileEventStream; /** * Test the model on some input. 
* * @author Jason Baldridge * @version $Revision: 1.1 $, $Date: 2010/08/05 17:42:27 $ */ public class ModelApplier { MaxentModel _model; ContextGenerator _cg = new BasicContextGenerator(","); int counter = 1; // The format for printing percentages public static final DecimalFormat ROUNDED_FORMAT = new DecimalFormat("0.000"); public ModelApplier (MaxentModel m) { _model = m; } private void eval (Event event) { eval(event,false); } private void eval (Event event, boolean real) { String outcome = event.getOutcome(); String[] context = event.getContext(); double[] ocs; if (!real) { ocs = _model.eval(context); } else { float[] values = RealValueFileEventStream.parseContexts(context); ocs = _model.eval(context,values); } int best = 0; for (int i = 1; i<ocs.length; i++) if (ocs[i] > ocs[best]) best = i; String predictedLabel = _model.getOutcome(best); String madeError = "+"; if (predictedLabel.equals(outcome)) madeError = ""; System.out.println(counter + "\t0:"+outcome+"\t0:" + _model.getOutcome(best) + "\t"+madeError+"\t" + ROUNDED_FORMAT.format(ocs[best])); counter++; } private static void usage() { System.err.println("java ModelApplier [-real] modelFile dataFile"); System.exit(1); } /** * Main method. 
Call as follows: * <p> * java ModelApplier modelFile dataFile */ public static void main(String[] args) { String dataFileName, modelFileName; boolean real = false; String type = "maxent"; int ai = 0; if (args.length > 0) { while (args[ai].startsWith("-")) { if (args[ai].equals("-real")) { real = true; } else if (args[ai].equals("-perceptron")) { type = "perceptron"; } else { usage(); } ai++; } modelFileName = args[ai++]; dataFileName = args[ai++]; ModelApplier predictor = null; try { MaxentModel m = new GenericModelReader(new File(modelFileName)).getModel(); predictor = new ModelApplier(m); } catch (Exception e) { e.printStackTrace(); System.exit(0); } System.out.println("=== Predictions on test data ===\n"); System.out.println(" inst# actual predicted error prediction"); try { EventStream es = new BasicEventStream(new PlainTextByLineDataStream( new FileReader(new File(dataFileName))), ","); while (es.hasNext()) { predictor.eval(es.next(),real); } return; } catch (Exception e) { System.out.println("Unable to read from specified file: "+modelFileName); System.out.println(); e.printStackTrace(); } } } } --- NEW FILE: ModelTrainer.java --- /* * Licensed to the Apache Software Foundation (ASF) under one or more * contributor license agreements. See the NOTICE file distributed with * this work for additional information regarding copyright ownership. * The ASF licenses this file to You under the Apache License, Version 2.0 * (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. 
*/ package opennlp.maxent; import java.io.File; import java.io.FileReader; import opennlp.maxent.io.GISModelWriter; import opennlp.maxent.io.SuffixSensitiveGISModelWriter; import opennlp.model.AbstractModel; import opennlp.model.EventStream; import opennlp.model.OnePassDataIndexer; import opennlp.model.OnePassRealValueDataIndexer; import opennlp.perceptron.PerceptronTrainer; /** * Main class which calls the GIS procedure after building the EventStream * from the data. * * @author Chieu Hai Leong and Jason Baldridge * @version $Revision: 1.1 $, $Date: 2010/08/05 17:42:27 $ */ public class ModelTrainer { // some parameters if you want to play around with the smoothing option // for model training. This can improve model accuracy, though training // will potentially take longer and use more memory. Model size will also // be larger. Initial testing indicates improvements for models built on // small data sets and few outcomes, but performance degradation for those // with large data sets and lots of outcomes. public static boolean USE_SMOOTHING = false; public static double SMOOTHING_OBSERVATION = 0.1; private static void usage() { System.err.println("java ModelTrainer [-real] dataFile modelFile"); System.exit(1); } /** * Main method. 
Call as follows: * <p> * java ModelTrainer dataFile modelFile */ public static void main (String[] args) { int ai = 0; boolean real = false; String type = "maxent"; int maxit = 100; int cutoff = 1; double sigma = 1.0; if(args.length == 0) { usage(); } while (args[ai].startsWith("-")) { if (args[ai].equals("-real")) { real = true; } else if (args[ai].equals("-perceptron")) { type = "perceptron"; } else if (args[ai].equals("-maxit")) { maxit = Integer.parseInt(args[++ai]); } else if (args[ai].equals("-cutoff")) { cutoff = Integer.parseInt(args[++ai]); } else if (args[ai].equals("-sigma")) { sigma = Double.parseDouble(args[++ai]); } else { System.err.println("Unknown option: "+args[ai]); usage(); } ai++; } String dataFileName = new String(args[ai++]); String modelFileName = new String(args[ai]); try { FileReader datafr = new FileReader(new File(dataFileName)); EventStream es; if (!real) { es = new BasicEventStream(new PlainTextByLineDataStream(datafr),","); } else { es = new RealBasicEventStream(new PlainTextByLineDataStream(datafr)); } GIS.SMOOTHING_OBSERVATION = SMOOTHING_OBSERVATION; AbstractModel model; if (type.equals("maxent")) { if (!real) { model = GIS.trainModel(es,maxit,cutoff,sigma); } else { model = GIS.trainModel(maxit, new OnePassRealValueDataIndexer(es,0), USE_SMOOTHING); } } else if (type.equals("perceptron")){ System.err.println("Perceptron training"); model = new PerceptronTrainer().trainModel(10, new OnePassDataIndexer(es,0),0); } else { System.err.println("Unknown model type: "+type); model = null; } File outputFile = new File(modelFileName); GISModelWriter writer = new SuffixSensitiveGISModelWriter(model, outputFile); writer.persist(); } catch (Exception e) { System.out.print("Unable to create model due to exception: "); System.out.println(e); e.printStackTrace(); } } } Index: BasicContextGenerator.java =================================================================== RCS file: 
/cvsroot/maxent/maxent/src/main/java/opennlp/maxent/BasicContextGenerator.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** BasicContextGenerator.java 22 Jan 2009 23:23:34 -0000 1.1 --- BasicContextGenerator.java 5 Aug 2010 17:42:27 -0000 1.2 *************** *** 34,37 **** --- 34,45 ---- public class BasicContextGenerator implements ContextGenerator { + String separator = " "; + + public BasicContextGenerator () {} + + public BasicContextGenerator (String sep) { + separator = sep; + } + /** * Builds up the list of contextual predicates given a String. *************** *** 39,43 **** public String[] getContext(Object o) { String s = (String) o; ! return (String[]) s.split(" "); } --- 47,51 ---- public String[] getContext(Object o) { String s = (String) o; ! return (String[]) s.split(separator); } Index: BasicEventStream.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/BasicEventStream.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** BasicEventStream.java 15 Mar 2009 03:24:00 -0000 1.2 --- BasicEventStream.java 5 Aug 2010 17:42:27 -0000 1.3 *************** *** 23,32 **** /** * A object which can deliver a stream of training events assuming ! * that each event is represented as a space separated list containing * all the contextual predicates, with the last item being the ! * outcome. * e.g.: * * <p> cp_1 cp_2 ... cp_n outcome * * @author Jason Baldridge --- 23,33 ---- /** * A object which can deliver a stream of training events assuming ! * that each event is represented as a separated list containing * all the contextual predicates, with the last item being the ! * outcome. The default separator is the space " ". * e.g.: * * <p> cp_1 cp_2 ... cp_n outcome + * <p> cp_1,cp_2,...,cp_n,outcome * * @author Jason Baldridge *************** *** 34,47 **** */ public class BasicEventStream extends AbstractEventStream { ! 
ContextGenerator cg = new BasicContextGenerator(); DataStream ds; Event next; public BasicEventStream (DataStream ds) { this.ds = ds; if (this.ds.hasNext()) next = createEvent((String)this.ds.nextToken()); } /** * Returns the next Event object held in this EventStream. Each call to nextEvent advances the EventStream. --- 35,60 ---- */ public class BasicEventStream extends AbstractEventStream { ! ContextGenerator cg; DataStream ds; Event next; + + String separator = " "; public BasicEventStream (DataStream ds) { this.ds = ds; + cg = new BasicContextGenerator(); if (this.ds.hasNext()) next = createEvent((String)this.ds.nextToken()); } + public BasicEventStream (DataStream ds, String sep) { + separator = sep; + cg = new BasicContextGenerator(separator); + this.ds = ds; + if (this.ds.hasNext()) + next = createEvent((String)this.ds.nextToken()); + } + + /** * Returns the next Event object held in this EventStream. Each call to nextEvent advances the EventStream. *************** *** 75,79 **** private Event createEvent(String obs) { ! int lastSpace = obs.lastIndexOf(' '); if (lastSpace == -1) return null; --- 88,92 ---- private Event createEvent(String obs) { ! int lastSpace = obs.lastIndexOf(separator); if (lastSpace == -1) return null; Index: GIS.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/GIS.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** GIS.java 15 Mar 2009 03:09:05 -0000 1.2 --- GIS.java 5 Aug 2010 17:42:27 -0000 1.3 *************** *** 112,115 **** --- 112,136 ---- } + /** + * Train a model using the GIS algorithm. + * @param eventStream The EventStream holding the data on which this model + * will be trained. + * @param iterations The number of GIS iterations to perform. + * @param cutoff The number of times a feature must be seen in order + * to be relevant for training. + * @param sigma The standard deviation for the gaussian smoother. 
+ * @return The newly trained model, which can be used immediately or saved + * to disk using an opennlp.maxent.io.GISModelWriter object. + */ + public static GISModel trainModel(EventStream eventStream, + int iterations, + int cutoff, + double sigma) { + GISTrainer trainer = new GISTrainer(PRINT_MESSAGES); + if (sigma > 0) + trainer.setGaussianSigma(sigma); + return trainer.trainModel(eventStream, iterations, cutoff); + } + /** * Train a model using the GIS algorithm. Index: GISModel.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/GISModel.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** GISModel.java 19 Mar 2009 13:04:31 -0000 1.2 --- GISModel.java 5 Aug 2010 17:42:27 -0000 1.3 *************** *** 172,176 **** } else { ! prior[oid] = Math.exp(prior[oid]*model.getConstantInverse()); } normal += prior[oid]; --- 172,176 ---- } else { ! prior[oid] = Math.exp(prior[oid]); } normal += prior[oid]; Index: GISTrainer.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/GISTrainer.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** GISTrainer.java 15 Mar 2009 03:09:47 -0000 1.2 --- GISTrainer.java 5 Aug 2010 17:42:27 -0000 1.3 *************** *** 179,182 **** --- 179,193 ---- } + /** + * Sets whether this trainer will use smoothing while training the model. + * This can improve model accuracy, though training will potentially take + * longer and use more memory. Model size will also be larger. + * + * @param smooth true if smoothing is desired, false if not + */ + public void setGaussianSigma(double sigmaValue) { + useGaussianSmoothing = true; + sigma = sigmaValue; + } /** |
From: Joern K. <joe...@us...> - 2010-08-04 13:16:23
|
Update of /cvsroot/maxent/maxent/.settings
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv2420/.settings

Added Files:
      Tag: v2_5_release_branch
    .cvsignore
Log Message:
Added eclipse files to ignore list.

--- NEW FILE: .cvsignore ---
org.eclipse.jdt.core.prefs
|
From: Joern K. <joe...@us...> - 2010-08-04 09:10:21
|
Update of /cvsroot/maxent/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv9646

Modified Files:
      Tag: v2_5_release_branch
    CHANGES
Log Message:
Updated for 2.5.3 release.

Index: CHANGES
===================================================================
RCS file: /cvsroot/maxent/maxent/CHANGES,v
retrieving revision 1.23.2.1
retrieving revision 1.23.2.2
diff -C2 -d -r1.23.2.1 -r1.23.2.2
*** CHANGES 28 Nov 2008 13:55:35 -0000 1.23.2.1
--- CHANGES 4 Aug 2010 09:10:12 -0000 1.23.2.2
***************
*** 1,2 ****
--- 1,7 ----
+ 2.5.3
+ -----
+ Fixed bug in GIS class, a cutoff value was not passed on to the GISTrainer
+ Fixed bug in GISTrainer, a cutoff value compare was incorrect
+
  2.5.2
  -----
|
From: Joern K. <joe...@us...> - 2010-08-04 09:08:48
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv9420/src/java/opennlp/maxent

Modified Files:
      Tag: v2_5_release_branch
    GIS.java
Log Message:
Fixed bug where cutoff was not passed.

Index: GIS.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/Attic/GIS.java,v
retrieving revision 1.9
retrieving revision 1.9.2.1
diff -C2 -d -r1.9 -r1.9.2.1
*** GIS.java 13 Apr 2007 16:13:35 -0000 1.9
--- GIS.java 4 Aug 2010 09:08:40 -0000 1.9.2.1
***************
*** 162,166 ****
      }
      else {
!       return trainer.trainModel(iterations, indexer,0);
      }
    }
--- 162,166 ----
      }
      else {
!       return trainer.trainModel(iterations, indexer, cutoff);
      }
    }
|
From: Joern K. <joe...@us...> - 2010-08-04 08:03:52
|
Update of /cvsroot/maxent/maxent
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv1352

Modified Files:
      Tag: v2_5_release_branch
    build.xml
Log Message:
Updated version to 2.5.3

Index: build.xml
===================================================================
RCS file: /cvsroot/maxent/maxent/build.xml,v
retrieving revision 1.26.2.1
retrieving revision 1.26.2.2
diff -C2 -d -r1.26.2.1 -r1.26.2.2
*** build.xml 28 Nov 2008 13:55:35 -0000 1.26.2.1
--- build.xml 4 Aug 2010 08:03:41 -0000 1.26.2.2
***************
*** 10,15 ****
      <property name="Name" value="Maxent" />
      <property name="name" value="maxent" />
!     <property name="version" value="2.5.2" />
!     <property name="year" value="2008"/>

      <echo message="----------- ${Name} ${version} [${year}] ------------"/>
--- 10,15 ----
      <property name="Name" value="Maxent" />
      <property name="name" value="maxent" />
!     <property name="version" value="2.5.3" />
!     <property name="year" value="2010"/>

      <echo message="----------- ${Name} ${version} [${year}] ------------"/>
|
From: Joern K. <joe...@us...> - 2010-08-02 13:31:31
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/model
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv7055/src/main/java/opennlp/model

Modified Files:
    IndexHashTable.java
Log Message:
toArray now return passed array.

Index: IndexHashTable.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/model/IndexHashTable.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** IndexHashTable.java 23 Jun 2010 11:12:36 -0000 1.2
--- IndexHashTable.java 2 Aug 2010 13:31:22 -0000 1.3
***************
*** 133,141 ****

    @SuppressWarnings("unchecked")
!   public void toArray(T array[]) {
      for (int i = 0; i < keys.length; i++) {
        if (keys[i] != null)
          array[values[i]] = (T) keys[i];
      }
    }
  }
--- 133,143 ----

    @SuppressWarnings("unchecked")
!   public T[] toArray(T array[]) {
      for (int i = 0; i < keys.length; i++) {
        if (keys[i] != null)
          array[values[i]] = (T) keys[i];
      }
+
+     return array;
    }
  }
|
From: Joern K. <joe...@us...> - 2010-06-23 11:12:45
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/model
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv28134/src/main/java/opennlp/model

Modified Files:
  IndexHashTable.java
Log Message:
Updated for loop, condition was always true.

Index: IndexHashTable.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/model/IndexHashTable.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** IndexHashTable.java 23 Jun 2010 08:34:02 -0000 1.1
--- IndexHashTable.java 23 Jun 2010 11:12:36 -0000 1.2
***************
*** 82,92 ****
    private int searchKey(int startIndex, Object key, boolean insert) {
!     for (int index = startIndex; index < startIndex ?
!         index < startIndex : true; index = (index+1) % keys.length) {
        if (keys[index] == null) {
          if (insert)
!           return index;
!         else
!           return -1;
        }
--- 82,95 ----
    private int searchKey(int startIndex, Object key, boolean insert) {
!
!     for (int index = startIndex; true; index = (index+1) % keys.length) {
!
!       // The keys array contains at least one null element, which guarantees
!       // termination of the loop
        if (keys[index] == null) {
          if (insert)
!           return index;
!         else
!           return -1;
        }
***************
*** 98,103 ****
        }
      }
-
-     return -1;
    }
--- 101,104 ---- |
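The loop in this commit only terminates because the table always keeps at least one empty slot. A minimal sketch of that idea follows. It is not the committed IndexHashTable source: the class name LinearProbeSketch, the capacity rule (one slot more than the number of items) and the hashing in startIndex are assumptions chosen for illustration.

```java
// Hypothetical sketch of linear probing over a table that always has a free slot.
final class LinearProbeSketch {
    private final Object[] keys;
    private final int[] values;

    LinearProbeSketch(Object[] items) {
        // Assumption: one spare slot, so at least one entry of keys stays null.
        keys = new Object[items.length + 1];
        values = new int[keys.length];
        for (int i = 0; i < items.length; i++) {
            int slot = searchKey(startIndex(items[i]), items[i], true);
            if (slot == -1) {
                throw new IllegalArgumentException("duplicate key: " + items[i]);
            }
            keys[slot] = items[i];
            values[slot] = i; // remember the array index of the item
        }
    }

    private int startIndex(Object key) {
        return (key.hashCode() & 0x7fffffff) % keys.length;
    }

    // With insert == true: returns a free slot, or -1 if the key is already stored.
    // With insert == false: returns the slot holding the key, or -1 if absent.
    private int searchKey(int startIndex, Object key, boolean insert) {
        for (int index = startIndex; true; index = (index + 1) % keys.length) {
            // A null slot is always reachable, so the loop ends here at the latest.
            if (keys[index] == null) {
                return insert ? index : -1;
            }
            if (keys[index].equals(key)) {
                return insert ? -1 : index;
            }
        }
    }

    int get(Object key) {
        int slot = searchKey(startIndex(key), key, false);
        return slot == -1 ? -1 : values[slot];
    }

    public static void main(String[] args) {
        LinearProbeSketch table = new LinearProbeSketch(new Object[] {"7", "21", "0"});
        System.out.println(table.get("21"));
        System.out.println(table.get("4"));
    }
}
```

Because the loop condition is the constant true, the compiler accepts the method without a trailing return statement, which matches the removal of the final return in the diff.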
From: Joern K. <joe...@us...> - 2010-06-23 08:34:15
|
Update of /cvsroot/maxent/maxent/src/test/java/opennlp/model
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv28030/src/test/java/opennlp/model

Added Files:
  IndexHashTableTest.java
Log Message:
Replaced the java.util.HashMap with a custom hash table which can index an array.

--- NEW FILE: IndexHashTableTest.java ---
package opennlp.model;

import junit.framework.TestCase;

public class IndexHashTableTest extends TestCase {

  public void testWithoutCollision() {
    String array[] = new String[3];
    array[0] = "4";
    array[1] = "7";
    array[2] = "5";
    IndexHashTable<String> arrayIndex = new IndexHashTable<String>(array, 1d);
    for (int i = 0; i < array.length; i++)
      assertEquals(i, arrayIndex.get(array[i]));
  }

  public void testWitCollision() {
    String array[] = new String[3];
    array[0] = "7";
    array[1] = "21";
    array[2] = "0";
    IndexHashTable<String> arrayIndex = new IndexHashTable<String>(array, 1d);
    for (int i = 0; i < array.length; i++)
      assertEquals(i, arrayIndex.get(array[i]));
    // has the same slot as as ""
    assertEquals(-1, arrayIndex.get("4"));
  }
} |
From: Joern K. <joe...@us...> - 2010-06-23 08:34:11
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/io
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv28030/src/main/java/opennlp/maxent/io

Modified Files:
  GISModelWriter.java
Log Message:
Replaced the java.util.HashMap with a custom hash table which can index an array.

Index: GISModelWriter.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/io/GISModelWriter.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** GISModelWriter.java 22 Jan 2009 23:23:33 -0000 1.1
--- GISModelWriter.java 23 Jun 2010 08:34:02 -0000 1.2
***************
*** 28,31 ****
--- 28,32 ----
  import opennlp.model.ComparablePredicate;
  import opennlp.model.Context;
+ import opennlp.model.IndexHashTable;
  /**
***************
*** 49,61 ****
      PARAMS = (Context[]) data[0];
!     Map<String,Integer> pmap = (Map<String,Integer>)data[1];
!     OUTCOME_LABELS = (String[])data[2];
!     CORRECTION_CONSTANT = ((Integer)data[3]).intValue();
!     CORRECTION_PARAM = ((Double)data[4]).doubleValue();
      PRED_LABELS = new String[pmap.size()];
!     for (String pred : pmap.keySet()) {
!       PRED_LABELS[pmap.get(pred)] = pred;
!     }
    }
--- 50,60 ----
      PARAMS = (Context[]) data[0];
!     IndexHashTable<String> pmap = (IndexHashTable<String>) data[1];
!     OUTCOME_LABELS = (String[]) data[2];
!     CORRECTION_CONSTANT = ((Integer) data[3]).intValue();
!     CORRECTION_PARAM = ((Double) data[4]).doubleValue();
      PRED_LABELS = new String[pmap.size()];
!     pmap.toArray(PRED_LABELS);
    } |
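The change to GISModelWriter replaces a loop that inverted a label-to-index map with a single toArray call. The sketch below shows that inversion using java.util.Map as a stand-in, since IndexHashTable itself is not reproduced here. The class name IndexToArraySketch and the sample predicate labels are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the toArray idea: write each key into the
// array position given by its stored index, then hand the array back.
final class IndexToArraySketch {

    static <T> T[] toArray(Map<T, Integer> index, T[] array) {
        for (Map.Entry<T, Integer> entry : index.entrySet()) {
            array[entry.getValue()] = entry.getKey();
        }
        return array; // returning the passed array allows call chaining
    }

    public static void main(String[] args) {
        Map<String, Integer> pmap = new LinkedHashMap<>();
        pmap.put("suffix=ing", 1); // sample labels, not taken from a real model
        pmap.put("prev=the", 0);
        pmap.put("cap=true", 2);

        String[] predLabels = toArray(pmap, new String[pmap.size()]);
        System.out.println(String.join(",", predLabels));
    }
}
```

Returning the array, as the later 1.3 revision of IndexHashTable does, lets the allocation and the fill be written as one expression.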
From: Joern K. <joe...@us...> - 2010-06-23 08:34:05
|
Update of /cvsroot/maxent/maxent/src/test/java/opennlp/model
In directory sfp-cvsdas-4.v30.ch3.sourceforge.com:/tmp/cvs-serv27995/src/test/java/opennlp/model

Log Message:
Directory /cvsroot/maxent/maxent/src/test/java/opennlp/model added to the repository |
From: Thomas M. <tsm...@us...> - 2009-03-25 23:15:56
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/perceptron
In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv9955/src/main/java/opennlp/perceptron

Modified Files:
  SimplePerceptronSequenceTrainer.java
Log Message:
Fixed to properly do sequence updating.

Index: SimplePerceptronSequenceTrainer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/main/java/opennlp/perceptron/SimplePerceptronSequenceTrainer.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** SimplePerceptronSequenceTrainer.java 19 Mar 2009 13:04:31 -0000 1.2
--- SimplePerceptronSequenceTrainer.java 25 Mar 2009 23:15:34 -0000 1.3
***************
*** 49,53 ****
    /** Number of predicates. */
    private int numPreds;
-
    /** Number of outcomes. */
    private int numOutcomes;
--- 49,52 ----
***************
*** 77,80 ****
--- 76,80 ----
    private int[] allOutcomesPattern;
    private String[] predLabels;
+   int numSequences;
    public AbstractModel trainModel(int iterations, SequenceStream sequenceStream, int cutoff, boolean useAverage) {
***************
*** 82,85 ****
--- 82,89 ----
      this.sequenceStream = sequenceStream;
      DataIndexer di = new OnePassDataIndexer(new SequenceStreamEventStream(sequenceStream),cutoff,false);
+     numSequences = 0;
+     for (Sequence s : sequenceStream) {
+       numSequences++;
+     }
      outcomeList = di.getOutcomeList();
      predLabels = di.getPredLabels();
***************
*** 180,252 ****
  int oei=0;
  int si=0;
  for (Sequence sequence : sequenceStream) {
!   Event[] taggerEvents = sequenceStream.updateContext(sequence, new PerceptronModel(params,predLabels,pmap,outcomeLabels));
    Event[] events = sequence.getEvents();
    for (int ei=0;ei<events.length;ei++,oei++) {
!     String[] contextStrings = events[ei].getContext();
!     int[] contexts = new int[contextStrings.length];
!     float values[] = events[ei].getValues();
!     for (int ci=0;ci<contexts.length;ci++) {
!       Integer cmi = pmap.get(contextStrings[ci]);
!       contexts[ci] = cmi;
      }
!     int max = omap.get(taggerEvents[ei].getOutcome());
!     boolean correct = max == outcomeList[oei];
!     if (correct) {
!       numCorrect ++;
      }
!     for (int oi = 0;oi<numOutcomes;oi++) {
!       if (oi == outcomeList[oei]) {
!         if (!correct) {
!           for (int ci = 0; ci < contexts.length; ci++) {
!             int pi = contexts[ci];
!             if (values == null) {
!               params[pi].updateParameter(oi, 1);
!             }
!             else {
!               params[pi].updateParameter(oi, values[ci]);
!             }
!             if (useAverage) {
!               if (updates[pi][oi][VALUE] != 0) {
!                 averageParams[pi].updateParameter(oi,updates[pi][oi][VALUE]*(numEvents*(iteration-updates[pi][oi][ITER])+(oei-updates[pi][oi][EVENT])));
!                 //System.err.println("p avp["+pi+"]."+oi+"="+averageParams[pi].getParameters()[oi]);
!               }
!               //System.err.println("p updates["+pi+"]["+oi+"]=("+updates[pi][oi][ITER]+","+updates[pi][oi][EVENT]+","+updates[pi][oi][VALUE]+") + ("+iteration+","+oei+","+params[pi].getParameters()[oi]+") -> "+averageParams[pi].getParameters()[oi]);
!               updates[pi][oi][VALUE] = (int) params[pi].getParameters()[oi];
!               updates[pi][oi][ITER] = iteration;
!               updates[pi][oi][EVENT] = oei;
!             }
!           }
          }
        }
!       else {
!         if (oi == max) { //case where it correct is taken by above if.
!           for (int ci = 0; ci < contexts.length; ci++) {
!             int pi = contexts[ci];
!             if (values == null) {
!               params[pi].updateParameter(oi,-1);
!             }
!             else {
!               params[pi].updateParameter(oi, values[ci]*-1);
!             }
!             if (useAverage) {
!               if (updates[pi][oi][VALUE] != 0) {
!                 averageParams[pi].updateParameter(oi,updates[pi][oi][VALUE]*(numEvents*(iteration-updates[pi][oi][ITER])+(oei-updates[pi][oi][EVENT])));
!                 //System.err.println(oei+" d avp["+pi+"]."+oi+"="+averageParams[pi].getParameters()[oi]);
!               }
!               //System.err.println(oei+" d updates["+pi+"]["+oi+"]=("+updates[pi][oi][ITER]+","+updates[pi][oi][EVENT]+","+updates[pi][oi][VALUE]+") + ("+iteration+","+oei+","+params[pi].getParameters()[oi]+") -> "+averageParams[pi].getParameters()[oi]);
!               updates[pi][oi][VALUE] = (int) params[pi].getParameters()[oi];
!               updates[pi][oi][ITER] = iteration;
!               updates[pi][oi][EVENT] = oei;
              }
            }
          }
        }
      }
    }
    si++;
  }
  //finish average computation
! double totIterations = (double) iterations*numEvents;
  if (useAverage && iteration == iterations-1) {
    for (int pi = 0; pi < numPreds; pi++) {
--- 184,281 ----
  int oei=0;
  int si=0;
+ Map<String,Float>[] featureCounts = new Map[numOutcomes];
+ for (int oi=0;oi<numOutcomes;oi++) {
+   featureCounts[oi] = new HashMap<String,Float>();
+ }
+ PerceptronModel model = new PerceptronModel(params,predLabels,pmap,outcomeLabels);
  for (Sequence sequence : sequenceStream) {
!   Event[] taggerEvents = sequenceStream.updateContext(sequence, model);
    Event[] events = sequence.getEvents();
+   boolean update = false;
    for (int ei=0;ei<events.length;ei++,oei++) {
!     if (!taggerEvents[ei].getOutcome().equals(events[ei].getOutcome())) {
!       update = true;
!       //break;
      }
!     else {
!       numCorrect++;
      }
!   }
!   if (update) {
!     for (int oi=0;oi<numOutcomes;oi++) {
!       featureCounts[oi].clear();
!     }
!     //System.err.print("train:");for (int ei=0;ei<events.length;ei++) {System.err.print(" "+events[ei].getOutcome());} System.err.println();
!     //training feature count computation
!     for (int ei=0;ei<events.length;ei++,oei++) {
!       String[] contextStrings = events[ei].getContext();
!       float values[] = events[ei].getValues();
!       int oi = omap.get(events[ei].getOutcome());
!       for (int ci=0;ci<contextStrings.length;ci++) {
!         float value = 1;
!         if (values != null) {
!           value = values[ci];
!         }
!         Float c = featureCounts[oi].get(contextStrings[ci]);
!         if (c == null) {
!           c = value;
          }
+         else {
+           c+=value;
+         }
+         featureCounts[oi].put(contextStrings[ci], c);
        }
!     }
!     //evaluation feature count computation
!     //System.err.print("test: ");for (int ei=0;ei<taggerEvents.length;ei++) {System.err.print(" "+taggerEvents[ei].getOutcome());} System.err.println();
!     for (int ei=0;ei<taggerEvents.length;ei++) {
!       String[] contextStrings = taggerEvents[ei].getContext();
!       float values[] =taggerEvents[ei].getValues();
!       int oi = omap.get(taggerEvents[ei].getOutcome());
!       for (int ci=0;ci<contextStrings.length;ci++) {
!         float value = 1;
!         if (values != null) {
!           value = values[ci];
!         }
!         Float c = featureCounts[oi].get(contextStrings[ci]);
!         if (c == null) {
!           c = -1*value;
!         }
!         else {
!           c-=value;
!         }
!         if (c == 0f) {
!           featureCounts[oi].remove(contextStrings[ci]);
!         }
!         else {
!           featureCounts[oi].put(contextStrings[ci], c);
!         }
!       }
!     }
!     for (int oi=0;oi<numOutcomes;oi++) {
!       for (String feature : featureCounts[oi].keySet()) {
!         Integer pi = pmap.get(feature);
!         if (pi != null) {
!           //System.err.println(si+" "+outcomeLabels[oi]+" "+feature+" "+featureCounts[oi].get(feature));
!           params[pi].updateParameter(oi, featureCounts[oi].get(feature));
!           if (useAverage) {
!             if (updates[pi][oi][VALUE] != 0) {
!               averageParams[pi].updateParameter(oi,updates[pi][oi][VALUE]*(numSequences*(iteration-updates[pi][oi][ITER])+(si-updates[pi][oi][EVENT])));
!               //System.err.println("p avp["+pi+"]."+oi+"="+averageParams[pi].getParameters()[oi]);
              }
+             //System.err.println("p updates["+pi+"]["+oi+"]=("+updates[pi][oi][ITER]+","+updates[pi][oi][EVENT]+","+updates[pi][oi][VALUE]+") + ("+iteration+","+oei+","+params[pi].getParameters()[oi]+") -> "+averageParams[pi].getParameters()[oi]);
+             updates[pi][oi][VALUE] = (int) params[pi].getParameters()[oi];
+             updates[pi][oi][ITER] = iteration;
+             updates[pi][oi][EVENT] = si;
            }
          }
        }
      }
+     model = new PerceptronModel(params,predLabels,pmap,outcomeLabels);
    }
    si++;
  }
  //finish average computation
! double totIterations = (double) iterations*si;
  if (useAverage && iteration == iterations-1) {
    for (int pi = 0; pi < numPreds; pi++) {
***************
*** 254,258 ****
  for (int oi = 0;oi<numOutcomes;oi++) {
    if (updates[pi][oi][VALUE] != 0) {
!     predParams[oi] += updates[pi][oi][VALUE]*(numEvents*(iterations-updates[pi][oi][ITER])-updates[pi][oi][EVENT]);
    }
    if (predParams[oi] != 0) {
--- 283,287 ----
  for (int oi = 0;oi<numOutcomes;oi++) {
    if (updates[pi][oi][VALUE] != 0) {
!     predParams[oi] += updates[pi][oi][VALUE]*(numSequences*(iterations-updates[pi][oi][ITER])-updates[pi][oi][EVENT]);
    }
    if (predParams[oi] != 0) { |
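The revised trainer updates once per mis-tagged sequence: it counts features under the reference outcomes, subtracts the counts under the tagger's outcomes, and applies only the non-zero differences. The sketch below isolates that counting step. It is a simplified illustration, not the committed code: the class SequenceUpdateSketch, its method names and the string-keyed maps are assumptions, all feature values are taken to be 1, and the averaging bookkeeping is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the per-sequence feature count difference.
final class SequenceUpdateSketch {

    // Returns delta[outcome][feature] = count under the reference outcomes
    // minus count under the predicted outcomes; zero entries are dropped.
    static Map<String, Map<String, Float>> featureDeltas(
            String[][] goldContexts, String[] goldOutcomes,
            String[][] predContexts, String[] predOutcomes) {
        Map<String, Map<String, Float>> deltas = new HashMap<>();
        for (int ei = 0; ei < goldContexts.length; ei++) {
            for (String feature : goldContexts[ei]) {
                add(deltas, goldOutcomes[ei], feature, 1f);
            }
        }
        for (int ei = 0; ei < predContexts.length; ei++) {
            for (String feature : predContexts[ei]) {
                add(deltas, predOutcomes[ei], feature, -1f);
            }
        }
        return deltas;
    }

    private static void add(Map<String, Map<String, Float>> deltas,
                            String outcome, String feature, float amount) {
        Map<String, Float> counts = deltas.computeIfAbsent(outcome, k -> new HashMap<>());
        float c = counts.getOrDefault(feature, 0f) + amount;
        if (c == 0f) {
            counts.remove(feature); // agreement between reference and prediction cancels out
        } else {
            counts.put(feature, c);
        }
    }

    public static void main(String[] args) {
        String[][] contexts = {{"w=dogs"}, {"w=bark"}};
        String[] gold = {"N", "V"};
        String[] predicted = {"N", "N"};
        System.out.println(featureDeltas(contexts, gold, contexts, predicted));
    }
}
```

In the usage example the first token is tagged correctly, so its feature cancels, and only the feature of the mis-tagged second token produces an update: positive for the reference outcome and negative for the predicted one.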