From: Joern K. <joe...@us...> - 2009-01-22 23:23:35
|
Update of /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/io In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv20839/src/main/java/opennlp/maxent/io Log Message: Directory /cvsroot/maxent/maxent/src/main/java/opennlp/maxent/io added to the repository |
From: Joern K. <joe...@us...> - 2009-01-22 23:23:35
|
Update of /cvsroot/maxent/maxent/src/main In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv20839/src/main Log Message: Directory /cvsroot/maxent/maxent/src/main added to the repository |
From: Joern K. <joe...@us...> - 2009-01-22 23:23:35
|
Update of /cvsroot/maxent/maxent/src/test/java In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv20839/src/test/java Log Message: Directory /cvsroot/maxent/maxent/src/test/java added to the repository |
From: Joern K. <joe...@us...> - 2009-01-22 23:23:35
|
Update of /cvsroot/maxent/maxent/src/test/java/opennlp/maxent In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv20839/src/test/java/opennlp/maxent Log Message: Directory /cvsroot/maxent/maxent/src/test/java/opennlp/maxent added to the repository |
From: Joern K. <joe...@us...> - 2009-01-22 23:23:34
|
Update of /cvsroot/maxent/maxent/src/test/resources/data/opennlp/maxent In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv20839/src/test/resources/data/opennlp/maxent Log Message: Directory /cvsroot/maxent/maxent/src/test/resources/data/opennlp/maxent added to the repository |
From: Joern K. <joe...@us...> - 2009-01-22 23:23:32
|
Update of /cvsroot/maxent/maxent/src/test/resources/data/opennlp In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv20839/src/test/resources/data/opennlp Log Message: Directory /cvsroot/maxent/maxent/src/test/resources/data/opennlp added to the repository |
From: Joern K. <joe...@us...> - 2009-01-22 23:23:31
|
Update of /cvsroot/maxent/maxent/src/test/resources/data In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv20839/src/test/resources/data Log Message: Directory /cvsroot/maxent/maxent/src/test/resources/data added to the repository |
From: Joern K. <joe...@us...> - 2009-01-22 23:23:31
|
Update of /cvsroot/maxent/maxent/src/main/java In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv20839/src/main/java Log Message: Directory /cvsroot/maxent/maxent/src/main/java added to the repository |
From: Joern K. <joe...@us...> - 2009-01-22 23:23:30
|
Update of /cvsroot/maxent/maxent/src/test/resources In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv20839/src/test/resources Log Message: Directory /cvsroot/maxent/maxent/src/test/resources added to the repository |
From: Thomas M. <tsm...@us...> - 2009-01-02 04:08:38
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent
In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv22250/src/java/opennlp/maxent

Modified Files:
    GISTrainer.java
Log Message:
fix to cutoff comparison

Index: GISTrainer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISTrainer.java,v
retrieving revision 1.31
retrieving revision 1.32
diff -C2 -d -r1.31 -r1.32
*** GISTrainer.java 6 Nov 2008 19:59:44 -0000 1.31
--- GISTrainer.java 2 Jan 2009 04:08:25 -0000 1.32
***************
*** 303,307 ****
        else { //determine active outcomes
          for (int oi = 0; oi < numOutcomes; oi++) {
!           if (predCount[pi][oi] > 0 && predicateCounts[pi] > cutoff) {
              activeOutcomes[numActiveOutcomes] = oi;
              numActiveOutcomes++;
--- 303,307 ----
        else { //determine active outcomes
          for (int oi = 0; oi < numOutcomes; oi++) {
!           if (predCount[pi][oi] > 0 && predicateCounts[pi] >= cutoff) {
              activeOutcomes[numActiveOutcomes] = oi;
              numActiveOutcomes++;
|
From: Thomas M. <tsm...@us...> - 2009-01-02 04:08:06
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent
In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv22185/src/java/opennlp/maxent

Modified Files:
    Tag: v2_5_release_branch
    GISTrainer.java
Log Message:
fix to cutoff comparison

Index: GISTrainer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISTrainer.java,v
retrieving revision 1.29
retrieving revision 1.29.2.1
diff -C2 -d -r1.29 -r1.29.2.1
*** GISTrainer.java 21 Sep 2008 03:20:15 -0000 1.29
--- GISTrainer.java 2 Jan 2009 04:07:56 -0000 1.29.2.1
***************
*** 295,299 ****
        else { //determine active outcomes
          for (int oi = 0; oi < numOutcomes; oi++) {
!           if (predCount[pi][oi] > 0 && predicateCounts[pi] > cutoff) {
              activeOutcomes[numActiveOutcomes] = oi;
              numActiveOutcomes++;
--- 295,299 ----
        else { //determine active outcomes
          for (int oi = 0; oi < numOutcomes; oi++) {
!           if (predCount[pi][oi] > 0 && predicateCounts[pi] >= cutoff) {
              activeOutcomes[numActiveOutcomes] = oi;
              numActiveOutcomes++;
|
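Editorial note on the two GISTrainer commits above: the change replaces `>` with `>=`, so a predicate whose count is exactly equal to the cutoff now stays active instead of being dropped. The following is only a minimal sketch of that boundary case. The class and method names (`CutoffDemo`, `activeBefore`, `activeAfter`) are illustrative and not part of maxent; the parameters stand in for `predCount[pi][oi]`, `predicateCounts[pi]` and `cutoff` from the diff.

```java
// Illustrative only: isolates the comparison changed in GISTrainer.java.
class CutoffDemo {

    // Comparison before the fix: the predicate count must exceed the cutoff.
    static boolean activeBefore(int outcomeCount, int predicateCount, int cutoff) {
        return outcomeCount > 0 && predicateCount > cutoff;
    }

    // Comparison after the fix: a count equal to the cutoff is accepted too.
    static boolean activeAfter(int outcomeCount, int predicateCount, int cutoff) {
        return outcomeCount > 0 && predicateCount >= cutoff;
    }

    public static void main(String[] args) {
        int cutoff = 5;
        // Walk across the boundary: one below, exactly at, and one above the cutoff.
        for (int count = 4; count <= 6; count++) {
            System.out.println("count=" + count
                + " before=" + activeBefore(1, count, cutoff)
                + " after=" + activeAfter(1, count, cutoff));
        }
    }
}
```

With a cutoff of 5, the two versions disagree only for a count of exactly 5; counts of 4 and 6 are treated the same by both.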
From: Thomas M. <tsm...@us...> - 2008-12-02 02:47:05
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent
In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv8042/src/java/opennlp/maxent

Modified Files:
    GISModel.java
Log Message:
Updates to makes multi-threaded usage and extension safer.

Index: GISModel.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISModel.java,v
retrieving revision 1.25
retrieving revision 1.26
diff -C2 -d -r1.25 -r1.26
*** GISModel.java 10 Nov 2008 14:51:40 -0000 1.25
--- GISModel.java 2 Dec 2008 02:46:58 -0000 1.26
***************
*** 59,64 ****
     */
    public GISModel (Context[] params, String[] predLabels, String[] outcomeNames, int correctionConstant,double correctionParam, Prior prior) {
!     super(params,predLabels,outcomeNames);
!     this.evalParams = new EvalParameters(params,correctionParam,correctionConstant,ocNames.length);
      this.prior = prior;
      prior.setLabels(ocNames, predLabels);
--- 59,63 ----
     */
    public GISModel (Context[] params, String[] predLabels, String[] outcomeNames, int correctionConstant,double correctionParam, Prior prior) {
!     super(params,predLabels,outcomeNames,correctionConstant,correctionParam);
      this.prior = prior;
      prior.setLabels(ocNames, predLabels);
|
From: Thomas M. <tsm...@us...> - 2008-12-02 02:47:05
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/model
In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv8042/src/java/opennlp/model

Modified Files:
    EvalParameters.java AbstractModel.java
Log Message:
Updates to makes multi-threaded usage and extension safer.

Index: EvalParameters.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/model/EvalParameters.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** EvalParameters.java 6 Nov 2008 19:59:44 -0000 1.2
--- EvalParameters.java 2 Dec 2008 02:46:58 -0000 1.3
***************
*** 20,26 ****
  /**
   * This class encapsulates the varibales used in producing probabilities from a model
!  * and facilitaes passing these variables to the eval method. Variables are declared
!  * non-private so that they may be accessed and updated without a method call for efficiency
!  * reasons.
   * @author Tom Morton
   *
--- 20,24 ----
  /**
   * This class encapsulates the varibales used in producing probabilities from a model
!  * and facilitaes passing these variables to the eval method.
   * @author Tom Morton
   *

Index: AbstractModel.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/model/AbstractModel.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** AbstractModel.java 10 Nov 2008 14:51:40 -0000 1.3
--- AbstractModel.java 2 Dec 2008 02:46:58 -0000 1.4
***************
*** 28,32 ****
    /** The names of the outcomes. */
    protected String[] ocNames;
-   private DecimalFormat df;
    protected EvalParameters evalParams;
    protected Prior prior;
--- 28,31 ----
***************
*** 89,101 ****
      }
      else {
!       if (df == null) { //lazy initilazation
!         df = new DecimalFormat("0.0000");
!       }
!       StringBuffer sb = new StringBuffer(ocs.length*2);
!       sb.append(ocNames[0]).append("[").append(df.format(ocs[0])).append("]");
!       for (int i = 1; i<ocs.length; i++) {
!         sb.append(" ").append(ocNames[i]).append("[").append(df.format(ocs[i])).append("]");
!       }
!       return sb.toString();
      }
    }
--- 88,98 ----
      }
      else {
!       DecimalFormat df = new DecimalFormat("0.0000");
!       StringBuffer sb = new StringBuffer(ocs.length*2);
!       sb.append(ocNames[0]).append("[").append(df.format(ocs[0])).append("]");
!       for (int i = 1; i<ocs.length; i++) {
!         sb.append(" ").append(ocNames[i]).append("[").append(df.format(ocs[i])).append("]");
!       }
!       return sb.toString();
      }
    }
|
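Editorial note on the AbstractModel change above: it drops a lazily initialized, shared `DecimalFormat` field in favor of a local instance created on each call. `java.text.DecimalFormat` is documented as not synchronized, so a shared instance can produce corrupted output when several threads format at once, which is presumably what "multi-threaded usage ... safer" refers to. The sketch below shows the per-call pattern under stated assumptions: `OutcomeFormatter` and `format` are hypothetical names, and unlike the committed code it pins the format symbols to `Locale.ROOT` so the output does not depend on the default locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of maxent: formats outcome probabilities
// with a DecimalFormat that is local to each call, so no state is shared.
class OutcomeFormatter {

    static String format(String[] names, double[] probs) {
        // A fresh instance per call; DecimalFormat itself is not thread-safe.
        DecimalFormat df =
            new DecimalFormat("0.0000", DecimalFormatSymbols.getInstance(Locale.ROOT));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < probs.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(names[i]).append('[').append(df.format(probs[i])).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        final String[] names = {"yes", "no"};
        final double[] probs = {0.25, 0.75};
        final String expected = format(names, probs);
        final AtomicInteger mismatches = new AtomicInteger();

        // Several threads format concurrently; each call uses its own formatter.
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    if (!expected.equals(format(names, probs))) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        System.out.println(expected + " mismatches=" + mismatches.get());
    }
}
```

The trade-off is one small allocation per call in exchange for not needing any locking; a `ThreadLocal` formatter would be another option if the allocation ever mattered.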
From: Thomas M. <tsm...@us...> - 2008-11-28 23:38:17
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/model
In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv17694/src/java/opennlp/model

Modified Files:
    BinaryFileDataReader.java PlainTextFileDataReader.java
Log Message:
ported buffered input stream optimization to trunk.

Index: BinaryFileDataReader.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/model/BinaryFileDataReader.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** BinaryFileDataReader.java 6 Nov 2008 19:59:44 -0000 1.1
--- BinaryFileDataReader.java 28 Nov 2008 23:38:10 -0000 1.2
***************
*** 1,4 ****
--- 1,5 ----
  package opennlp.model;

+ import java.io.BufferedInputStream;
  import java.io.DataInputStream;
  import java.io.File;
***************
*** 14,22 ****
    public BinaryFileDataReader(File f) throws IOException {
      if (f.getName().endsWith(".gz")) {
!       input = new DataInputStream(
!           new GZIPInputStream(new FileInputStream(f)));
      }
      else {
!       input = new DataInputStream(new FileInputStream(f));
      }
    }
--- 15,23 ----
    public BinaryFileDataReader(File f) throws IOException {
      if (f.getName().endsWith(".gz")) {
!       input = new DataInputStream(new BufferedInputStream(
!           new GZIPInputStream(new BufferedInputStream(new FileInputStream(f)))));
      }
      else {
!       input = new DataInputStream(new BufferedInputStream(new FileInputStream(f)));
      }
    }

Index: PlainTextFileDataReader.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/model/PlainTextFileDataReader.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** PlainTextFileDataReader.java 6 Nov 2008 19:59:44 -0000 1.1
--- PlainTextFileDataReader.java 28 Nov 2008 23:38:10 -0000 1.2
***************
*** 1,4 ****
--- 1,5 ----
  package opennlp.model;

+ import java.io.BufferedInputStream;
  import java.io.BufferedReader;
  import java.io.File;
***************
*** 15,22 ****
    public PlainTextFileDataReader(File f) throws IOException {
      if (f.getName().endsWith(".gz")) {
!       input = new BufferedReader(new InputStreamReader(new GZIPInputStream(new FileInputStream(f))));
      }
      else {
!       input = new BufferedReader(new InputStreamReader(new FileInputStream(f)));
      }
    }
--- 16,23 ----
    public PlainTextFileDataReader(File f) throws IOException {
      if (f.getName().endsWith(".gz")) {
!       input = new BufferedReader(new InputStreamReader(new BufferedInputStream(new GZIPInputStream(new BufferedInputStream(new FileInputStream(f))))));
      }
      else {
!       input = new BufferedReader(new InputStreamReader(new BufferedInputStream(new FileInputStream(f))));
      }
    }
|
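Editorial note on the reader changes above: for gzipped files the stream is wrapped in a `BufferedInputStream` twice. Presumably the inner buffer lets compressed bytes be read from disk in larger chunks, and the outer one keeps small reads from going to the inflater one at a time. The following is a self-contained sketch of the same wrapping order, not the project's code; `BufferedOpenDemo`, `open` and `roundTrip` are hypothetical names, and the temporary file exists only to make the example runnable.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper illustrating the wrapping order used in the diffs.
class BufferedOpenDemo {

    // Opens a file with buffering; adds gzip decoding when the name ends in ".gz".
    static InputStream open(File f) throws IOException {
        InputStream raw = new BufferedInputStream(new FileInputStream(f));
        if (!f.getName().endsWith(".gz")) {
            return raw;
        }
        try {
            // Buffer below the inflater (disk reads) and above it (caller reads).
            return new BufferedInputStream(new GZIPInputStream(raw));
        } catch (IOException e) {
            raw.close(); // do not leak the file handle if the gzip header is bad
            throw e;
        }
    }

    // Writes one line to a temporary .gz file and reads it back through open().
    static String roundTrip(String line) {
        try {
            File f = File.createTempFile("model", ".txt.gz");
            try {
                try (Writer w = new OutputStreamWriter(
                        new GZIPOutputStream(new FileOutputStream(f)), StandardCharsets.UTF_8)) {
                    w.write(line);
                }
                try (BufferedReader r = new BufferedReader(
                        new InputStreamReader(open(f), StandardCharsets.UTF_8))) {
                    return r.readLine();
                }
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("GIS"));
    }
}
```

Centralizing the suffix check in one helper is a design choice of this sketch; the commits above repeat the wrapping in each reader's constructor.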
From: Thomas M. <tsm...@us...> - 2008-11-28 13:55:48
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent/io
In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv22323/src/java/opennlp/maxent/io

Modified Files:
    Tag: v2_5_release_branch
    PlainTextGISModelReader.java SuffixSensitiveGISModelReader.java BinaryGISModelReader.java
Log Message:
Updated model reading to use BufferedInputStream for improved load times.

Index: PlainTextGISModelReader.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/PlainTextGISModelReader.java,v
retrieving revision 1.1.1.1
retrieving revision 1.1.1.1.4.1
diff -C2 -d -r1.1.1.1 -r1.1.1.1.4.1
*** PlainTextGISModelReader.java 23 Oct 2001 14:06:53 -0000 1.1.1.1
--- PlainTextGISModelReader.java 28 Nov 2008 13:55:35 -0000 1.1.1.1.4.1
***************
*** 50,55 ****

      if (f.getName().endsWith(".gz")) {
!       input = new BufferedReader(new InputStreamReader(
!           new GZIPInputStream(new FileInputStream(f))));
      }
      else {
--- 50,55 ----

      if (f.getName().endsWith(".gz")) {
!       input = new BufferedReader(new InputStreamReader(new BufferedInputStream(
!           new GZIPInputStream(new BufferedInputStream(new FileInputStream(f))))));
      }
      else {

Index: SuffixSensitiveGISModelReader.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/SuffixSensitiveGISModelReader.java,v
retrieving revision 1.3
retrieving revision 1.3.2.1
diff -C2 -d -r1.3 -r1.3.2.1
*** SuffixSensitiveGISModelReader.java 22 Aug 2008 01:17:04 -0000 1.3
--- SuffixSensitiveGISModelReader.java 28 Nov 2008 13:55:35 -0000 1.3.2.1
***************
*** 48,56 ****
      // handle the zipped/not zipped distinction
      if (filename.endsWith(".gz")) {
!       input = new GZIPInputStream(new FileInputStream(f));
        filename = filename.substring(0,filename.length()-3);
      }
      else {
!       input = new FileInputStream(f);
      }

--- 48,56 ----
      // handle the zipped/not zipped distinction
      if (filename.endsWith(".gz")) {
!       input = new BufferedInputStream(new GZIPInputStream(new BufferedInputStream(new FileInputStream(f))));
        filename = filename.substring(0,filename.length()-3);
      }
      else {
!       input = new BufferedInputStream(new FileInputStream(f));
      }

Index: BinaryGISModelReader.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/BinaryGISModelReader.java,v
retrieving revision 1.1.1.1
retrieving revision 1.1.1.1.4.1
diff -C2 -d -r1.1.1.1 -r1.1.1.1.4.1
*** BinaryGISModelReader.java 23 Oct 2001 14:06:53 -0000 1.1.1.1
--- BinaryGISModelReader.java 28 Nov 2008 13:55:35 -0000 1.1.1.1.4.1
***************
*** 51,58 ****
      if (f.getName().endsWith(".gz")) {
        input = new DataInputStream(
!           new GZIPInputStream(new FileInputStream(f)));
      }
      else {
!       input = new DataInputStream(new FileInputStream(f));
      }

--- 51,58 ----
      if (f.getName().endsWith(".gz")) {
        input = new DataInputStream(
!           new BufferedInputStream(new GZIPInputStream(new BufferedInputStream(new FileInputStream(f)))));
      }
      else {
!       input = new DataInputStream(new BufferedInputStream(new FileInputStream(f)));
      }

|
From: Thomas M. <tsm...@us...> - 2008-11-28 13:55:42
|
Update of /cvsroot/maxent/maxent
In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv22323

Modified Files:
    Tag: v2_5_release_branch
    build.xml CHANGES
Log Message:
Updated model reading to use BufferedInputStream for improved load times.

Index: build.xml
===================================================================
RCS file: /cvsroot/maxent/maxent/build.xml,v
retrieving revision 1.26
retrieving revision 1.26.2.1
diff -C2 -d -r1.26 -r1.26.2.1
*** build.xml 28 Sep 2008 16:57:01 -0000 1.26
--- build.xml 28 Nov 2008 13:55:35 -0000 1.26.2.1
***************
*** 10,14 ****
    <property name="Name" value="Maxent" />
    <property name="name" value="maxent" />
!   <property name="version" value="2.5.1" />
    <property name="year" value="2008"/>

--- 10,14 ----
    <property name="Name" value="Maxent" />
    <property name="name" value="maxent" />
!   <property name="version" value="2.5.2" />
    <property name="year" value="2008"/>

***************
*** 91,95 ****
             debug="${debug}"
             classpathref="build.classpath"
!            optimize="${optimize}"/>
    </target>

--- 91,96 ----
             debug="${debug}"
             classpathref="build.classpath"
!            optimize="${optimize}"
!            source="1.4" />
    </target>


Index: CHANGES
===================================================================
RCS file: /cvsroot/maxent/maxent/CHANGES,v
retrieving revision 1.23
retrieving revision 1.23.2.1
diff -C2 -d -r1.23 -r1.23.2.1
*** CHANGES 26 Sep 2008 03:54:39 -0000 1.23
--- CHANGES 28 Nov 2008 13:55:35 -0000 1.23.2.1
***************
*** 1,2 ****
--- 1,6 ----
+ 2.5.2
+ -----
+ Wrapped model reading input streams with BufferedInputStream for performance gain.
+
  2.5.1
  -----
|
From: Thomas M. <tsm...@us...> - 2008-11-10 14:51:55
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/perceptron In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv11499/src/java/opennlp/perceptron Modified Files: BinaryPerceptronModelWriter.java PlainTextPerceptronModelWriter.java PerceptronModel.java SuffixSensitivePerceptronModelWriter.java PerceptronModelWriter.java Log Message: Updates to better support generic model writting of different model types. Index: BinaryPerceptronModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/perceptron/BinaryPerceptronModelWriter.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** BinaryPerceptronModelWriter.java 6 Nov 2008 19:59:44 -0000 1.1 --- BinaryPerceptronModelWriter.java 10 Nov 2008 14:51:40 -0000 1.2 *************** *** 68,84 **** } ! protected void writeUTF (String s) throws java.io.IOException { output.writeUTF(s); } ! protected void writeInt (int i) throws java.io.IOException { output.writeInt(i); } ! protected void writeDouble (double d) throws java.io.IOException { output.writeDouble(d); } ! protected void close () throws java.io.IOException { output.flush(); output.close(); --- 68,84 ---- } ! public void writeUTF (String s) throws java.io.IOException { output.writeUTF(s); } ! public void writeInt (int i) throws java.io.IOException { output.writeInt(i); } ! public void writeDouble (double d) throws java.io.IOException { output.writeDouble(d); } ! 
public void close () throws java.io.IOException { output.flush(); output.close(); Index: PlainTextPerceptronModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/perceptron/PlainTextPerceptronModelWriter.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** PlainTextPerceptronModelWriter.java 6 Nov 2008 19:59:44 -0000 1.1 --- PlainTextPerceptronModelWriter.java 10 Nov 2008 14:51:40 -0000 1.2 *************** *** 71,90 **** } ! protected void writeUTF (String s) throws java.io.IOException { output.write(s); output.newLine(); } ! protected void writeInt (int i) throws java.io.IOException { output.write(Integer.toString(i)); output.newLine(); } ! protected void writeDouble (double d) throws java.io.IOException { output.write(Double.toString(d)); output.newLine(); } ! protected void close () throws java.io.IOException { output.flush(); output.close(); --- 71,90 ---- } ! public void writeUTF (String s) throws java.io.IOException { output.write(s); output.newLine(); } ! public void writeInt (int i) throws java.io.IOException { output.write(Integer.toString(i)); output.newLine(); } ! public void writeDouble (double d) throws java.io.IOException { output.write(Double.toString(d)); output.newLine(); } ! 
public void close () throws java.io.IOException { output.flush(); output.close(); Index: PerceptronModel.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/perceptron/PerceptronModel.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** PerceptronModel.java 6 Nov 2008 19:59:44 -0000 1.1 --- PerceptronModel.java 10 Nov 2008 14:51:40 -0000 1.2 *************** *** 15,18 **** --- 15,19 ---- public PerceptronModel(Context[] params, String[] predLabels, String[] outcomeNames) { super(params,predLabels,outcomeNames); + modelType = ModelType.Perceptron; } *************** *** 87,90 **** --- 88,92 ---- } } + //System.err.println("Perceptron Model: "+java.util.Arrays.asList(prior)); return prior; } Index: SuffixSensitivePerceptronModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/perceptron/SuffixSensitivePerceptronModelWriter.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** SuffixSensitivePerceptronModelWriter.java 6 Nov 2008 19:59:44 -0000 1.1 --- SuffixSensitivePerceptronModelWriter.java 10 Nov 2008 14:51:40 -0000 1.2 *************** *** 28,31 **** --- 28,32 ---- import opennlp.model.AbstractModel; + import opennlp.model.AbstractModelWriter; /** *************** *** 42,46 **** */ public class SuffixSensitivePerceptronModelWriter extends PerceptronModelWriter { ! private final PerceptronModelWriter suffixAppropriateWriter; /** --- 43,47 ---- */ public class SuffixSensitivePerceptronModelWriter extends PerceptronModelWriter { ! private final AbstractModelWriter suffixAppropriateWriter; /** *************** *** 81,98 **** } ! protected void writeUTF (String s) throws java.io.IOException { ! suffixAppropriateWriter.writeUTF(s); } ! protected void writeInt (int i) throws java.io.IOException { ! suffixAppropriateWriter.writeInt(i); } ! 
protected void writeDouble (double d) throws java.io.IOException { ! suffixAppropriateWriter.writeDouble(d); } ! protected void close () throws java.io.IOException { ! suffixAppropriateWriter.close(); } --- 82,99 ---- } ! public void writeUTF (String s) throws java.io.IOException { ! suffixAppropriateWriter.writeUTF(s); } ! public void writeInt (int i) throws java.io.IOException { ! suffixAppropriateWriter.writeInt(i); } ! public void writeDouble (double d) throws java.io.IOException { ! suffixAppropriateWriter.writeDouble(d); } ! public void close () throws java.io.IOException { ! suffixAppropriateWriter.close(); } Index: PerceptronModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/perceptron/PerceptronModelWriter.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** PerceptronModelWriter.java 6 Nov 2008 19:59:44 -0000 1.1 --- PerceptronModelWriter.java 10 Nov 2008 14:51:40 -0000 1.2 *************** *** 8,11 **** --- 8,12 ---- import opennlp.model.AbstractModel; + import opennlp.model.AbstractModelWriter; import opennlp.model.ComparablePredicate; import opennlp.model.Context; *************** *** 19,23 **** * @version $Revision$, $Date$ */ ! public abstract class PerceptronModelWriter { protected Context[] PARAMS; protected String[] OUTCOME_LABELS; --- 20,24 ---- * @version $Revision$, $Date$ */ ! public abstract class PerceptronModelWriter extends AbstractModelWriter { protected Context[] PARAMS; protected String[] OUTCOME_LABELS; *************** *** 39,94 **** } - protected abstract void writeUTF (String s) throws java.io.IOException; - protected abstract void writeInt (int i) throws java.io.IOException; - protected abstract void writeDouble (double d) throws java.io.IOException; - protected abstract void close () throws java.io.IOException; - - /** - * Writes the model to disk, using the <code>writeX()</code> methods - * provided by extending classes. 
- * - * <p>If you wish to create a PerceptronModelWriter which uses a different - * structure, it will be necessary to override the persist method in - * addition to implementing the <code>writeX()</code> methods. - */ - public void persist() throws IOException { - - // the type of model (GIS) - writeUTF("Perceptron"); - - // the mapping from outcomes to their integer indexes - writeInt(OUTCOME_LABELS.length); - - for (int i=0; i<OUTCOME_LABELS.length; i++) - writeUTF(OUTCOME_LABELS[i]); - - // the mapping from predicates to the outcomes they contributed to. - // The sorting is done so that we actually can write this out more - // compactly than as the entire list. - ComparablePredicate[] sorted = sortValues(); - List compressed = compressOutcomes(sorted); - - writeInt(compressed.size()); - - for (int i=0; i<compressed.size(); i++) { - List a = (List)compressed.get(i); - writeUTF(a.size() - + ((ComparablePredicate)a.get(0)).toString()); - } - - // the mapping from predicate names to their integer indexes - writeInt(sorted.length); - - for (int i=0; i<sorted.length; i++) - writeUTF(sorted[i].name); - - // write out the parameters - for (int i=0; i<sorted.length; i++) - for (int j=0; j<sorted[i].params.length; j++) - writeDouble(sorted[i].params[j]); - - close(); - } - protected ComparablePredicate[] sortValues () { ComparablePredicate[] sortPreds; --- 40,43 ---- *************** *** 148,151 **** } ! } --- 97,145 ---- } ! /** ! * Writes the model to disk, using the <code>writeX()</code> methods ! * provided by extending classes. ! * ! * <p>If you wish to create a PerceptronModelWriter which uses a different ! * structure, it will be necessary to override the persist method in ! * addition to implementing the <code>writeX()</code> methods. ! */ ! public void persist() throws IOException { ! ! // the type of model (Perceptron) ! writeUTF("Perceptron"); ! ! // the mapping from outcomes to their integer indexes ! writeInt(OUTCOME_LABELS.length); ! ! 
for (int i=0; i<OUTCOME_LABELS.length; i++) ! writeUTF(OUTCOME_LABELS[i]); ! ! // the mapping from predicates to the outcomes they contributed to. ! // The sorting is done so that we actually can write this out more ! // compactly than as the entire list. ! ComparablePredicate[] sorted = sortValues(); ! List compressed = compressOutcomes(sorted); ! ! writeInt(compressed.size()); ! ! for (int i=0; i<compressed.size(); i++) { ! List a = (List)compressed.get(i); ! writeUTF(a.size() ! + ((ComparablePredicate)a.get(0)).toString()); ! } ! ! // the mapping from predicate names to their integer indexes ! writeInt(sorted.length); ! ! for (int i=0; i<sorted.length; i++) ! writeUTF(sorted[i].name); ! ! // write out the parameters ! for (int i=0; i<sorted.length; i++) ! for (int j=0; j<sorted[i].params.length; j++) ! writeDouble(sorted[i].params[j]); ! ! close(); ! } } |
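Editorial note on the perceptron writer changes above: the `writeX()` and `close()` methods move from `protected` to `public`, and `SuffixSensitivePerceptronModelWriter` now delegates to a field typed as `AbstractModelWriter` from another package. A likely reason for the visibility change is that Java does not allow a subclass in a different package to call a `protected` method through a reference whose static type is the superclass. The sketch below illustrates the delegation pattern only. All class names are hypothetical stand-ins; in particular, `SketchModelWriter` is an assumption about what `AbstractModelWriter` might declare, since its source is not shown in this message.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for AbstractModelWriter (its real contents are assumed).
abstract class SketchModelWriter {
    public abstract void writeUTF(String s) throws IOException;
    public abstract void writeInt(int i) throws IOException;
    public abstract void writeDouble(double d) throws IOException;
    public abstract void close() throws IOException;
    public abstract void persist() throws IOException;
}

// Plain-text writer: one value per line, as in the plain-text writers above.
class PlainTextSketchWriter extends SketchModelWriter {
    private final Writer out;
    private final String[] outcomes;

    PlainTextSketchWriter(Writer out, String[] outcomes) {
        this.out = out;
        this.outcomes = outcomes;
    }

    public void writeUTF(String s) throws IOException { out.write(s); out.write('\n'); }
    public void writeInt(int i) throws IOException { writeUTF(Integer.toString(i)); }
    public void writeDouble(double d) throws IOException { writeUTF(Double.toString(d)); }
    public void close() throws IOException { out.flush(); out.close(); }

    // Writes only a header and the outcome labels; the real persist() writes more.
    public void persist() throws IOException {
        writeUTF("Perceptron");
        writeInt(outcomes.length);
        for (String outcome : outcomes) {
            writeUTF(outcome);
        }
        close();
    }
}

// Delegating writer: forwards every call to whichever writer was chosen.
class DelegatingSketchWriter extends SketchModelWriter {
    private final SketchModelWriter delegate;

    DelegatingSketchWriter(SketchModelWriter delegate) { this.delegate = delegate; }

    public void writeUTF(String s) throws IOException { delegate.writeUTF(s); }
    public void writeInt(int i) throws IOException { delegate.writeInt(i); }
    public void writeDouble(double d) throws IOException { delegate.writeDouble(d); }
    public void close() throws IOException { delegate.close(); }
    public void persist() throws IOException { delegate.persist(); }
}

class WriterSketchDemo {
    // Persists a two-outcome model header into memory and returns the text.
    static String demo() {
        StringWriter sink = new StringWriter();
        SketchModelWriter writer = new DelegatingSketchWriter(
            new PlainTextSketchWriter(sink, new String[] {"yes", "no"}));
        try {
            writer.persist();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Because all classes in this sketch share one package, it would also compile with `protected` methods; the cross-package restriction described above only shows up when the base class and the delegating subclass live in different packages, as they do in the commit.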
From: Thomas M. <tsm...@us...> - 2008-11-10 14:51:53
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent
In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv11499/src/java/opennlp/maxent

Modified Files:
    GISModel.java
Log Message:
Updates to better support generic model writting of different model types.

Index: GISModel.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISModel.java,v
retrieving revision 1.24
retrieving revision 1.25
diff -C2 -d -r1.24 -r1.25
*** GISModel.java 6 Nov 2008 19:59:44 -0000 1.24
--- GISModel.java 10 Nov 2008 14:51:40 -0000 1.25
***************
*** 63,66 ****
--- 63,67 ----
      this.prior = prior;
      prior.setLabels(ocNames, predLabels);
+     modelType = ModelType.Maxent;
    }

|
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent/io In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv11499/src/java/opennlp/maxent/io Modified Files: GISModelWriter.java PlainTextGISModelWriter.java BinaryGISModelWriter.java SuffixSensitiveGISModelWriter.java ObjectGISModelWriter.java Log Message: Updates to better support generic model writting of different model types. Index: GISModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/GISModelWriter.java,v retrieving revision 1.8 retrieving revision 1.9 diff -C2 -d -r1.8 -r1.9 *** GISModelWriter.java 28 Sep 2008 18:04:29 -0000 1.8 --- GISModelWriter.java 10 Nov 2008 14:51:39 -0000 1.9 *************** *** 25,28 **** --- 25,29 ---- import opennlp.model.AbstractModel; + import opennlp.model.AbstractModelWriter; import opennlp.model.ComparablePredicate; import opennlp.model.Context; *************** *** 36,40 **** * @version $Revision$, $Date$ */ ! public abstract class GISModelWriter { protected Context[] PARAMS; protected String[] OUTCOME_LABELS; --- 37,41 ---- * @version $Revision$, $Date$ */ ! 
public abstract class GISModelWriter extends AbstractModelWriter { protected Context[] PARAMS; protected String[] OUTCOME_LABELS; *************** *** 59,66 **** } - protected abstract void writeUTF (String s) throws java.io.IOException; - protected abstract void writeInt (int i) throws java.io.IOException; - protected abstract void writeDouble (double d) throws java.io.IOException; - protected abstract void close () throws java.io.IOException; /** --- 60,63 ---- Index: PlainTextGISModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/PlainTextGISModelWriter.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** PlainTextGISModelWriter.java 28 Sep 2008 18:04:25 -0000 1.2 --- PlainTextGISModelWriter.java 10 Nov 2008 14:51:39 -0000 1.3 *************** *** 18,26 **** package opennlp.maxent.io; ! import opennlp.maxent.*; ! import opennlp.model.AbstractModel; ! import java.io.*; ! import java.util.zip.*; /** --- 18,31 ---- package opennlp.maxent.io; ! import java.io.BufferedWriter; ! import java.io.File; ! import java.io.FileNotFoundException; ! import java.io.FileOutputStream; ! import java.io.FileWriter; ! import java.io.IOException; ! import java.io.OutputStreamWriter; ! import java.util.zip.GZIPOutputStream; ! import opennlp.model.AbstractModel; /** *************** *** 31,88 **** */ public class PlainTextGISModelWriter extends GISModelWriter { ! BufferedWriter output; ! ! /** ! * Constructor which takes a GISModel and a File and prepares itself to ! * write the model to that file. Detects whether the file is gzipped or not ! * based on whether the suffix contains ".gz". ! * ! * @param model The GISModel which is to be persisted. ! * @param f The File in which the model is to be persisted. ! */ ! public PlainTextGISModelWriter (AbstractModel model, File f) ! throws IOException, FileNotFoundException { ! super(model); ! 
if (f.getName().endsWith(".gz")) { ! output = new BufferedWriter(new OutputStreamWriter( ! new GZIPOutputStream(new FileOutputStream(f)))); ! } ! else { ! output = new BufferedWriter(new FileWriter(f)); ! } ! } ! /** ! * Constructor which takes a GISModel and a BufferedWriter and prepares ! * itself to write the model to that writer. ! * ! * @param model The GISModel which is to be persisted. ! * @param bw The BufferedWriter which will be used to persist the model. ! */ ! public PlainTextGISModelWriter (AbstractModel model, BufferedWriter bw) { ! super(model); ! output = bw; } ! ! protected void writeUTF (String s) throws java.io.IOException { ! output.write(s); ! output.newLine(); } ! protected void writeInt (int i) throws java.io.IOException { ! output.write(Integer.toString(i)); ! output.newLine(); ! } ! ! protected void writeDouble (double d) throws java.io.IOException { ! output.write(Double.toString(d)); ! output.newLine(); ! } ! protected void close () throws java.io.IOException { ! output.flush(); ! output.close(); ! } ! } --- 36,92 ---- */ public class PlainTextGISModelWriter extends GISModelWriter { ! BufferedWriter output; ! /** ! * Constructor which takes a GISModel and a File and prepares itself to ! * write the model to that file. Detects whether the file is gzipped or not ! * based on whether the suffix contains ".gz". ! * ! * @param model The GISModel which is to be persisted. ! * @param f The File in which the model is to be persisted. ! */ ! public PlainTextGISModelWriter (AbstractModel model, File f) ! throws IOException, FileNotFoundException { ! super(model); ! if (f.getName().endsWith(".gz")) { ! output = new BufferedWriter(new OutputStreamWriter( ! new GZIPOutputStream(new FileOutputStream(f)))); } ! else { ! output = new BufferedWriter(new FileWriter(f)); } + } ! /** ! * Constructor which takes a GISModel and a BufferedWriter and prepares ! * itself to write the model to that writer. ! * ! 
* @param model The GISModel which is to be persisted. ! * @param bw The BufferedWriter which will be used to persist the model. ! */ ! public PlainTextGISModelWriter (AbstractModel model, BufferedWriter bw) { ! super(model); ! output = bw; ! } ! public void writeUTF (String s) throws java.io.IOException { ! output.write(s); ! output.newLine(); ! } ! ! public void writeInt (int i) throws java.io.IOException { ! output.write(Integer.toString(i)); ! output.newLine(); ! } ! ! public void writeDouble (double d) throws java.io.IOException { ! output.write(Double.toString(d)); ! output.newLine(); ! } ! ! public void close () throws java.io.IOException { ! output.flush(); ! output.close(); ! } } Index: BinaryGISModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/BinaryGISModelWriter.java,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** BinaryGISModelWriter.java 28 Sep 2008 18:04:30 -0000 1.3 --- BinaryGISModelWriter.java 10 Nov 2008 14:51:39 -0000 1.4 *************** *** 54,58 **** } ! /** * Constructor which takes a GISModel and a DataOutputStream and prepares * itself to write the model to that stream. --- 54,58 ---- } ! /** * Constructor which takes a GISModel and a DataOutputStream and prepares * itself to write the model to that stream. *************** *** 62,84 **** */ public BinaryGISModelWriter (AbstractModel model, DataOutputStream dos) { ! super(model); ! output = dos; } ! protected void writeUTF (String s) throws java.io.IOException { ! output.writeUTF(s); } ! protected void writeInt (int i) throws java.io.IOException { ! output.writeInt(i); } ! ! protected void writeDouble (double d) throws java.io.IOException { ! output.writeDouble(d); } ! protected void close () throws java.io.IOException { ! output.flush(); ! output.close(); } --- 62,84 ---- */ public BinaryGISModelWriter (AbstractModel model, DataOutputStream dos) { ! super(model); ! 
output = dos; } ! public void writeUTF (String s) throws java.io.IOException { ! output.writeUTF(s); } ! public void writeInt (int i) throws java.io.IOException { ! output.writeInt(i); } ! ! public void writeDouble (double d) throws java.io.IOException { ! output.writeDouble(d); } ! public void close () throws java.io.IOException { ! output.flush(); ! output.close(); } Index: SuffixSensitiveGISModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/SuffixSensitiveGISModelWriter.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** SuffixSensitiveGISModelWriter.java 28 Sep 2008 18:04:22 -0000 1.2 --- SuffixSensitiveGISModelWriter.java 10 Nov 2008 14:51:39 -0000 1.3 *************** *** 18,26 **** package opennlp.maxent.io; ! import opennlp.maxent.*; ! import opennlp.model.AbstractModel; ! import java.io.*; ! import java.util.zip.*; /** --- 18,31 ---- package opennlp.maxent.io; ! import java.io.BufferedWriter; ! import java.io.DataOutputStream; ! import java.io.File; ! import java.io.FileOutputStream; ! import java.io.IOException; ! import java.io.OutputStream; ! import java.io.OutputStreamWriter; ! import java.util.zip.GZIPOutputStream; ! import opennlp.model.AbstractModel; /** *************** *** 37,94 **** */ public class SuffixSensitiveGISModelWriter extends GISModelWriter { ! private final GISModelWriter suffixAppropriateWriter; ! /** ! * Constructor which takes a GISModel and a File and invokes the ! * GISModelWriter appropriate for the suffix. ! * ! * @param model The GISModel which is to be persisted. ! * @param f The File in which the model is to be stored. ! */ ! public SuffixSensitiveGISModelWriter (AbstractModel model, File f) ! throws IOException { ! super (model); ! ! OutputStream output; ! String filename = f.getName(); ! // handle the zipped/not zipped distinction ! if (filename.endsWith(".gz")) { ! 
output = new GZIPOutputStream(new FileOutputStream(f)); ! filename = filename.substring(0,filename.length()-3); ! } ! else { ! output = new DataOutputStream(new FileOutputStream(f)); ! } ! // handle the different formats ! if (filename.endsWith(".bin")) { ! suffixAppropriateWriter = ! new BinaryGISModelWriter(model, ! new DataOutputStream(output)); ! } ! else { // default is ".txt" ! suffixAppropriateWriter = ! new PlainTextGISModelWriter(model, ! new BufferedWriter(new OutputStreamWriter(output))); ! } } ! ! protected void writeUTF (String s) throws java.io.IOException { ! suffixAppropriateWriter.writeUTF(s); } ! protected void writeInt (int i) throws java.io.IOException { ! suffixAppropriateWriter.writeInt(i); ! } ! ! protected void writeDouble (double d) throws java.io.IOException { ! suffixAppropriateWriter.writeDouble(d); } ! protected void close () throws java.io.IOException { ! suffixAppropriateWriter.close(); ! } } --- 42,98 ---- */ public class SuffixSensitiveGISModelWriter extends GISModelWriter { ! private final GISModelWriter suffixAppropriateWriter; ! /** ! * Constructor which takes a GISModel and a File and invokes the ! * GISModelWriter appropriate for the suffix. ! * ! * @param model The GISModel which is to be persisted. ! * @param f The File in which the model is to be stored. ! */ ! public SuffixSensitiveGISModelWriter (AbstractModel model, File f) ! throws IOException { ! super (model); ! OutputStream output; ! String filename = f.getName(); ! // handle the zipped/not zipped distinction ! if (filename.endsWith(".gz")) { ! output = new GZIPOutputStream(new FileOutputStream(f)); ! filename = filename.substring(0,filename.length()-3); } ! else { ! output = new DataOutputStream(new FileOutputStream(f)); } ! // handle the different formats ! if (filename.endsWith(".bin")) { ! suffixAppropriateWriter = ! new BinaryGISModelWriter(model, ! 
new DataOutputStream(output)); } + else { // default is ".txt" + suffixAppropriateWriter = + new PlainTextGISModelWriter(model, + new BufferedWriter(new OutputStreamWriter(output))); + } + } ! public void writeUTF (String s) throws java.io.IOException { ! suffixAppropriateWriter.writeUTF(s); ! } ! ! public void writeInt (int i) throws java.io.IOException { ! suffixAppropriateWriter.writeInt(i); ! } ! ! public void writeDouble (double d) throws java.io.IOException { ! suffixAppropriateWriter.writeDouble(d); ! } + public void close () throws java.io.IOException { + suffixAppropriateWriter.close(); + } } Index: ObjectGISModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/ObjectGISModelWriter.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** ObjectGISModelWriter.java 28 Sep 2008 18:04:25 -0000 1.2 --- ObjectGISModelWriter.java 10 Nov 2008 14:51:39 -0000 1.3 *************** *** 40,56 **** ! protected void writeUTF(String s) throws IOException { output.writeUTF(s); } ! protected void writeInt(int i) throws IOException { output.writeInt(i); } ! protected void writeDouble(double d) throws IOException { output.writeDouble(d); } ! protected void close() throws IOException { output.flush(); output.close(); --- 40,56 ---- ! public void writeUTF(String s) throws IOException { output.writeUTF(s); } ! public void writeInt(int i) throws IOException { output.writeInt(i); } ! public void writeDouble(double d) throws IOException { output.writeDouble(d); } ! public void close() throws IOException { output.flush(); output.close(); |
From: Thomas M. <tsm...@us...> - 2008-11-10 14:51:46
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/model In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv11499/src/java/opennlp/model Modified Files: AbstractModel.java GenericModelReader.java Added Files: AbstractModelWriter.java GenericModelWriter.java Log Message: Updates to better support generic model writing of different model types. --- NEW FILE: AbstractModelWriter.java --- package opennlp.model; public abstract class AbstractModelWriter { public AbstractModelWriter() { super(); } public abstract void writeUTF(String s) throws java.io.IOException; public abstract void writeInt(int i) throws java.io.IOException; public abstract void writeDouble(double d) throws java.io.IOException; public abstract void close() throws java.io.IOException; public abstract void persist() throws java.io.IOException; } --- NEW FILE: GenericModelWriter.java --- package opennlp.model; import java.io.BufferedWriter; import java.io.DataOutputStream; import java.io.File; import java.io.FileOutputStream; import java.io.IOException; import java.io.OutputStream; import java.io.OutputStreamWriter; import java.util.zip.GZIPOutputStream; import opennlp.maxent.io.BinaryGISModelWriter; import opennlp.maxent.io.PlainTextGISModelWriter; import opennlp.model.AbstractModel.ModelType; import opennlp.perceptron.BinaryPerceptronModelWriter; import opennlp.perceptron.PlainTextPerceptronModelWriter; public class GenericModelWriter extends AbstractModelWriter { private AbstractModelWriter delegateWriter; public GenericModelWriter(AbstractModel model, File file) throws IOException { String filename = file.getName(); OutputStream os; // handle the zipped/not zipped distinction if (filename.endsWith(".gz")) { os = new GZIPOutputStream(new FileOutputStream(file)); filename = filename.substring(0,filename.length()-3); } else { os = new FileOutputStream(file); } // handle the different formats if (filename.endsWith(".bin")) { init(model,new DataOutputStream(os)); } else { // filename ends with ".txt" 
init(model,new BufferedWriter(new OutputStreamWriter(os))); } } public GenericModelWriter(AbstractModel model, DataOutputStream dos) { init(model,dos); } private void init(AbstractModel model, DataOutputStream dos) { if (model.getModelType() == ModelType.Perceptron) { delegateWriter = new BinaryPerceptronModelWriter(model,dos); } else if (model.getModelType() == ModelType.Maxent) { delegateWriter = new BinaryGISModelWriter(model,dos); } } private void init(AbstractModel model, BufferedWriter bw) { if (model.getModelType() == ModelType.Perceptron) { delegateWriter = new PlainTextPerceptronModelWriter(model,bw); } else if (model.getModelType() == ModelType.Maxent) { delegateWriter = new PlainTextGISModelWriter(model,bw); } } @Override public void close() throws IOException { delegateWriter.close(); } @Override public void persist() throws IOException { delegateWriter.persist(); } @Override public void writeDouble(double d) throws IOException { delegateWriter.writeDouble(d); } @Override public void writeInt(int i) throws IOException { delegateWriter.writeInt(i); } @Override public void writeUTF(String s) throws IOException { delegateWriter.writeUTF(s); } } Index: AbstractModel.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/model/AbstractModel.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** AbstractModel.java 6 Nov 2008 19:59:44 -0000 1.2 --- AbstractModel.java 10 Nov 2008 14:51:40 -0000 1.3 *************** *** 31,34 **** --- 31,36 ---- protected EvalParameters evalParams; protected Prior prior; + public enum ModelType {Maxent,Perceptron}; + protected ModelType modelType; public AbstractModel(Context[] params, String[] predLabels, String[] outcomeNames) { *************** *** 65,68 **** --- 67,74 ---- return ocNames[best]; } + + public ModelType getModelType(){ + return modelType; + } /** Index: GenericModelReader.java 
=================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/model/GenericModelReader.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** GenericModelReader.java 9 Nov 2008 00:06:42 -0000 1.2 --- GenericModelReader.java 10 Nov 2008 14:51:40 -0000 1.3 *************** *** 15,18 **** --- 15,22 ---- } + public GenericModelReader(DataReader dataReader) { + super(dataReader); + } + public void checkModelType() throws IOException { String modelType = readUTF(); *************** *** 24,28 **** } else { ! System.err.println("Unknown model type: "+modelType); } } --- 28,32 ---- } else { ! throw new IOException("Unknown model format: "+modelType); } } |
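The GenericModelWriter constructor above picks its output stream in two steps: it first strips a trailing ".gz", then checks whether what remains ends in ".bin", and treats anything else as plain text. A minimal sketch of that decision as a pure function might look like the following. The class name ModelFileFormat and its fields are hypothetical, introduced only for illustration; they are not part of the commit.

```java
// Hypothetical helper (not in the commit): mirrors the suffix checks made in
// the GenericModelWriter(AbstractModel, File) constructor shown above.
final class ModelFileFormat {
    final boolean gzipped; // file name ended in ".gz"
    final boolean binary;  // remaining name ended in ".bin"; otherwise plain text

    private ModelFileFormat(boolean gzipped, boolean binary) {
        this.gzipped = gzipped;
        this.binary = binary;
    }

    static ModelFileFormat fromFileName(String filename) {
        boolean gz = filename.endsWith(".gz");
        if (gz) {
            // drop the three characters of ".gz" before looking at the format suffix
            filename = filename.substring(0, filename.length() - 3);
        }
        // as in the commit, anything that is not ".bin" falls through to plain text
        boolean bin = filename.endsWith(".bin");
        return new ModelFileFormat(gz, bin);
    }
}
```

Under this sketch, "model.bin.gz" would be treated as gzipped binary, and "model.txt" as uncompressed plain text.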
From: Thomas M. <tsm...@us...> - 2008-11-09 00:06:48
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/model In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv10061/src/java/opennlp/model Modified Files: GenericModelReader.java Log Message: Fixed bug in model name checks. Index: GenericModelReader.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/model/GenericModelReader.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** GenericModelReader.java 6 Nov 2008 19:59:44 -0000 1.1 --- GenericModelReader.java 9 Nov 2008 00:06:42 -0000 1.2 *************** *** 20,26 **** delegateModelReader = new PerceptronModelReader(this.dataReader); } ! else if (modelType.equals("Maxent")) { delegateModelReader = new GISModelReader(this.dataReader); } } --- 20,29 ---- delegateModelReader = new PerceptronModelReader(this.dataReader); } ! else if (modelType.equals("GIS")) { delegateModelReader = new GISModelReader(this.dataReader); } + else { + System.err.println("Unknown model type: "+modelType); + } } |
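The fix above changes the tag that selects the GIS reader from "Maxent" to "GIS" and adds a fallback for unrecognised tags. A rough, self-contained sketch of that dispatch follows. The ModelTypeTag class and its Kind enum are hypothetical names, and the "Perceptron" tag string is an assumption, because the diff context does not show the first branch's condition. The sketch throws an IOException for unknown tags, as revision 1.3 of GenericModelReader does, rather than printing to System.err as in this revision.

```java
import java.io.IOException;

// Hypothetical sketch (not in the commit) of the model-type check.
final class ModelTypeTag {
    enum Kind { GIS, PERCEPTRON }

    static Kind parse(String tag) throws IOException {
        if ("Perceptron".equals(tag)) { // assumed tag value; not visible in the diff
            return Kind.PERCEPTRON;
        }
        if ("GIS".equals(tag)) {        // tag value used after this fix
            return Kind.GIS;
        }
        // unknown tags are rejected, following the later revision 1.3 behaviour
        throw new IOException("Unknown model format: " + tag);
    }
}
```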
From: Thomas M. <tsm...@us...> - 2008-11-06 20:00:48
|
Update of /cvsroot/maxent/maxent/samples/sports In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv1172/samples/sports Modified Files: CreateModel.java Predict.java Log Message: updates to docs for perceptron code. Index: CreateModel.java =================================================================== RCS file: /cvsroot/maxent/maxent/samples/sports/CreateModel.java,v retrieving revision 1.6 retrieving revision 1.7 diff -C2 -d -r1.6 -r1.7 *** CreateModel.java 13 Apr 2007 16:24:06 -0000 1.6 --- CreateModel.java 6 Nov 2008 20:00:34 -0000 1.7 *************** *** 21,32 **** import opennlp.maxent.BasicEventStream; - import opennlp.maxent.EventStream; import opennlp.maxent.GIS; - import opennlp.maxent.GISModel; - import opennlp.maxent.OnePassRealValueDataIndexer; import opennlp.maxent.PlainTextByLineDataStream; import opennlp.maxent.RealBasicEventStream; import opennlp.maxent.io.GISModelWriter; import opennlp.maxent.io.SuffixSensitiveGISModelWriter; /** --- 21,34 ---- import opennlp.maxent.BasicEventStream; import opennlp.maxent.GIS; import opennlp.maxent.PlainTextByLineDataStream; import opennlp.maxent.RealBasicEventStream; import opennlp.maxent.io.GISModelWriter; import opennlp.maxent.io.SuffixSensitiveGISModelWriter; + import opennlp.model.AbstractModel; + import opennlp.model.EventStream; + import opennlp.model.OnePassDataIndexer; + import opennlp.model.OnePassRealValueDataIndexer; + import opennlp.perceptron.PerceptronTrainer; /** *************** *** 61,68 **** --- 63,77 ---- int ai = 0; boolean real = false; + String type = "maxent"; + if(args.length == 0) { + usage(); + } while (args[ai].startsWith("-")) { if (args[ai].equals("-real")) { real = true; } + else if (args[ai].equals("-perceptron")) { + type = "perceptron"; + } else { System.err.println("Unknown option: "+args[ai]); *************** *** 85,99 **** } GIS.SMOOTHING_OBSERVATION = SMOOTHING_OBSERVATION; ! GISModel model; ! if (!real) { ! model = GIS.trainModel(es,USE_SMOOTHING); } else { ! 
model = GIS.trainModel(100, new OnePassRealValueDataIndexer(es,0), USE_SMOOTHING); } File outputFile = new File(modelFileName); ! GISModelWriter writer = ! new SuffixSensitiveGISModelWriter(model, outputFile); writer.persist(); } catch (Exception e) { --- 94,118 ---- } GIS.SMOOTHING_OBSERVATION = SMOOTHING_OBSERVATION; ! AbstractModel model; ! if (type.equals("maxent")) { ! ! if (!real) { ! model = GIS.trainModel(es,USE_SMOOTHING); ! } ! else { ! model = GIS.trainModel(100, new OnePassRealValueDataIndexer(es,0), USE_SMOOTHING); ! } ! } ! else if (type.equals("perceptron")){ ! System.err.println("Perceptron training"); ! model = new PerceptronTrainer().trainModel(10, new OnePassDataIndexer(es,0),0); } else { ! System.err.println("Unknown model type: "+type); ! model = null; } File outputFile = new File(modelFileName); ! GISModelWriter writer = new SuffixSensitiveGISModelWriter(model, outputFile); writer.persist(); } catch (Exception e) { Index: Predict.java =================================================================== RCS file: /cvsroot/maxent/maxent/samples/sports/Predict.java,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** Predict.java 13 Apr 2007 16:24:06 -0000 1.3 --- Predict.java 6 Nov 2008 20:00:34 -0000 1.4 *************** *** 17,23 **** ////////////////////////////////////////////////////////////////////////////// ! import opennlp.maxent.*; ! import opennlp.maxent.io.*; ! import java.io.*; /** --- 17,30 ---- ////////////////////////////////////////////////////////////////////////////// ! import java.io.File; ! import java.io.FileReader; ! ! import opennlp.maxent.BasicContextGenerator; ! import opennlp.maxent.ContextGenerator; ! import opennlp.maxent.DataStream; ! import opennlp.maxent.PlainTextByLineDataStream; ! import opennlp.model.GenericModelReader; ! import opennlp.model.MaxentModel; ! 
import opennlp.model.RealValueFileEventStream; /** *************** *** 65,68 **** --- 72,76 ---- String dataFileName, modelFileName; boolean real = false; + String type = "maxent"; int ai = 0; if (args.length > 0) { *************** *** 71,74 **** --- 79,85 ---- real = true; } + else if (args[ai].equals("-perceptron")) { + type = "perceptron"; + } else { usage(); *************** *** 90,97 **** Predict predictor = null; try { ! GISModel m = ! new SuffixSensitiveGISModelReader( ! new File(modelFileName)).getModel(); ! predictor = new Predict(m); } catch (Exception e) { e.printStackTrace(); --- 101,106 ---- Predict predictor = null; try { ! MaxentModel m = new GenericModelReader(new File(modelFileName)).getModel(); ! predictor = new Predict(m); } catch (Exception e) { e.printStackTrace(); |
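Both sample programs above scan leading "-" options ("-real", "-perceptron") before reading positional arguments. The sketch below shows one way to express that loop in isolation. TrainerOptions is a hypothetical class, not code from the samples. Unlike the samples, it bounds-checks the index so that an argument list made up only of flags does not run off the end of the array, and it throws on an unknown option instead of printing a message.

```java
// Hypothetical sketch (not in the commit) of the option scanning done in
// CreateModel.main and Predict.main.
final class TrainerOptions {
    boolean real = false;     // set by "-real"
    String type = "maxent";   // "maxent" by default, "perceptron" after "-perceptron"
    int firstPositional = 0;  // index of the first non-option argument

    static TrainerOptions parse(String[] args) {
        TrainerOptions opts = new TrainerOptions();
        int ai = 0;
        // stop at the first argument that is not an option, or at the end of the array
        while (ai < args.length && args[ai].startsWith("-")) {
            if (args[ai].equals("-real")) {
                opts.real = true;
            } else if (args[ai].equals("-perceptron")) {
                opts.type = "perceptron";
            } else {
                // assumption: fail fast here; the samples print an error message instead
                throw new IllegalArgumentException("Unknown option: " + args[ai]);
            }
            ai++;
        }
        opts.firstPositional = ai;
        return opts;
    }
}
```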
From: Thomas M. <tsm...@us...> - 2008-11-06 20:00:39
|
Update of /cvsroot/maxent/maxent/docs In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv1172/docs Modified Files: about.html Log Message: updates to docs for perceptron code. Index: about.html =================================================================== RCS file: /cvsroot/maxent/maxent/docs/about.html,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** about.html 30 Aug 2008 17:46:59 -0000 1.3 --- about.html 6 Nov 2008 20:00:34 -0000 1.4 *************** *** 126,130 **** package, EventStream and ContextGenerator. These have fairly simple specifications, and example implementations can be found in the ! <a href="http://grok.sourceforge.net">OpenNLP Grok Library</a> preprocessing components. More details are given in the opennlp.maxent <a href="howto.html">HOWTO</a>. --- 126,130 ---- package, EventStream and ContextGenerator. These have fairly simple specifications, and example implementations can be found in the ! <a href="http://opennlp.sourceforge.net">OpenNLP Tools</a> preprocessing components. More details are given in the opennlp.maxent <a href="howto.html">HOWTO</a>. *************** *** 189,193 **** <h3> ! Email: <a href="mailto:tsm...@us...">tsm...@us...</a> <br> <script language="JavaScript"> --- 189,193 ---- <h3> ! Email: <a href="mailto:tsm...@us...">tsm...@us...</a> <br> <script language="JavaScript"> |
From: Thomas M. <tsm...@us...> - 2008-11-06 19:59:58
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv1009/src/java/opennlp/maxent Modified Files: RealBasicEventStream.java GISTrainer.java GISModel.java BasicEventStream.java Log Message: Updates to support perceptron models. Index: RealBasicEventStream.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/RealBasicEventStream.java,v retrieving revision 1.4 retrieving revision 1.5 diff -C2 -d -r1.4 -r1.5 *** RealBasicEventStream.java 28 Sep 2008 18:03:49 -0000 1.4 --- RealBasicEventStream.java 6 Nov 2008 19:59:44 -0000 1.5 *************** *** 18,26 **** package opennlp.maxent; import opennlp.model.Event; import opennlp.model.EventStream; import opennlp.model.RealValueFileEventStream; ! public class RealBasicEventStream implements EventStream { ContextGenerator cg = new BasicContextGenerator(); DataStream ds; --- 18,27 ---- package opennlp.maxent; + import opennlp.model.AbstractEventStream; import opennlp.model.Event; import opennlp.model.EventStream; import opennlp.model.RealValueFileEventStream; ! public class RealBasicEventStream extends AbstractEventStream { ContextGenerator cg = new BasicContextGenerator(); DataStream ds; *************** *** 34,38 **** } ! public Event nextEvent() { while (next == null && this.ds.hasNext()) next = createEvent((String)this.ds.nextToken()); --- 35,39 ---- } ! public Event next() { while (next == null && this.ds.hasNext()) next = createEvent((String)this.ds.nextToken()); *************** *** 68,72 **** EventStream es = new RealBasicEventStream(new PlainTextByLineDataStream(new java.io.FileReader(args[0]))); while (es.hasNext()) { ! System.out.println(es.nextEvent()); } } --- 69,73 ---- EventStream es = new RealBasicEventStream(new PlainTextByLineDataStream(new java.io.FileReader(args[0]))); while (es.hasNext()) { ! 
System.out.println(es.next()); } } Index: GISTrainer.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISTrainer.java,v retrieving revision 1.30 retrieving revision 1.31 diff -C2 -d -r1.30 -r1.31 *** GISTrainer.java 28 Sep 2008 18:03:38 -0000 1.30 --- GISTrainer.java 6 Nov 2008 19:59:44 -0000 1.31 *************** *** 365,369 **** /*************** Create and return the model ******************/ ! return new GISModel(params, predLabels, outcomeLabels, correctionConstant, evalParams.correctionParam); } --- 365,369 ---- /*************** Create and return the model ******************/ ! return new GISModel(params, predLabels, outcomeLabels, correctionConstant, evalParams.getCorrectionParam()); } *************** *** 468,472 **** } if (useSlackParameter) ! CFMOD += (evalParams.correctionConstant - contexts[ei].length) * numTimesEventsSeen[ei]; loglikelihood += Math.log(modelDistribution[outcomeList[ei]]) * numTimesEventsSeen[ei]; --- 468,472 ---- } if (useSlackParameter) ! CFMOD += (evalParams.getCorrectionConstant() - contexts[ei].length) * numTimesEventsSeen[ei]; loglikelihood += Math.log(modelDistribution[outcomeList[ei]]) * numTimesEventsSeen[ei]; *************** *** 494,498 **** for (int aoi=0;aoi<activeOutcomes.length;aoi++) { if (useGaussianSmoothing) { ! params[pi].updateParameter(aoi,gaussianUpdate(pi,aoi,numEvents,evalParams.correctionConstant)); } else { --- 494,498 ---- for (int aoi=0;aoi<activeOutcomes.length;aoi++) { if (useGaussianSmoothing) { ! params[pi].updateParameter(aoi,gaussianUpdate(pi,aoi,numEvents,evalParams.getCorrectionConstant())); } else { *************** *** 506,510 **** } if (CFMOD > 0.0 && useSlackParameter) ! evalParams.correctionParam += (cfObservedExpect - Math.log(CFMOD)); display(". loglikelihood=" + loglikelihood + "\t" + ((double) numCorrect / numEvents) + "\n"); --- 506,510 ---- } if (CFMOD > 0.0 && useSlackParameter) ! 
evalParams.setCorrectionParam(evalParams.getCorrectionParam() + (cfObservedExpect - Math.log(CFMOD))); display(". loglikelihood=" + loglikelihood + "\t" + ((double) numCorrect / numEvents) + "\n"); Index: GISModel.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISModel.java,v retrieving revision 1.23 retrieving revision 1.24 diff -C2 -d -r1.23 -r1.24 *** GISModel.java 28 Sep 2008 18:03:50 -0000 1.23 --- GISModel.java 6 Nov 2008 19:59:44 -0000 1.24 *************** *** 78,86 **** */ public final double[] eval(String[] context) { ! return(eval(context,new double[evalParams.numOutcomes])); } public final double[] eval(String[] context, float[] values) { ! return(eval(context,values,new double[evalParams.numOutcomes])); } --- 78,86 ---- */ public final double[] eval(String[] context) { ! return(eval(context,new double[evalParams.getNumOutcomes()])); } public final double[] eval(String[] context, float[] values) { ! return(eval(context,values,new double[evalParams.getNumOutcomes()])); } *************** *** 145,150 **** */ public static double[] eval(int[] context, float[] values, double[] prior, EvalParameters model) { ! Context[] params = model.params; ! int numfeats[] = new int[model.numOutcomes]; int[] activeOutcomes; double[] activeParameters; --- 145,150 ---- */ public static double[] eval(int[] context, float[] values, double[] prior, EvalParameters model) { ! Context[] params = model.getParams(); ! int numfeats[] = new int[model.getNumOutcomes()]; int[] activeOutcomes; double[] activeParameters; *************** *** 167,181 **** double normal = 0.0; ! for (int oid = 0; oid < model.numOutcomes; oid++) { ! if (model.correctionParam != 0) { ! prior[oid] = Math.exp(prior[oid]*model.constantInverse+((1.0 - ((double) numfeats[oid] / model.correctionConstant)) * model.correctionParam)); } else { ! prior[oid] = Math.exp(prior[oid]*model.constantInverse); } normal += prior[oid]; } ! 
for (int oid = 0; oid < model.numOutcomes; oid++) { prior[oid] /= normal; } --- 167,181 ---- double normal = 0.0; ! for (int oid = 0; oid < model.getNumOutcomes(); oid++) { ! if (model.getCorrectionParam() != 0) { ! prior[oid] = Math.exp(prior[oid]*model.getConstantInverse()+((1.0 - ((double) numfeats[oid] / model.getCorrectionConstant())) * model.getCorrectionParam())); } else { ! prior[oid] = Math.exp(prior[oid]*model.getConstantInverse()); } normal += prior[oid]; } ! for (int oid = 0; oid < model.getNumOutcomes(); oid++) { prior[oid] /= normal; } Index: BasicEventStream.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/BasicEventStream.java,v retrieving revision 1.4 retrieving revision 1.5 diff -C2 -d -r1.4 -r1.5 *** BasicEventStream.java 28 Sep 2008 18:03:47 -0000 1.4 --- BasicEventStream.java 6 Nov 2008 19:59:44 -0000 1.5 *************** *** 18,23 **** package opennlp.maxent; import opennlp.model.Event; - import opennlp.model.EventStream; /** --- 18,23 ---- package opennlp.maxent; + import opennlp.model.AbstractEventStream; import opennlp.model.Event; /** *************** *** 33,37 **** * @version $Revision$, $Date$ */ ! public class BasicEventStream implements EventStream { ContextGenerator cg = new BasicContextGenerator(); DataStream ds; --- 33,37 ---- * @version $Revision$, $Date$ */ ! public class BasicEventStream extends AbstractEventStream { ContextGenerator cg = new BasicContextGenerator(); DataStream ds; *************** *** 49,53 **** * @return the Event object which is next in this EventStream */ ! public Event nextEvent () { while (next == null && this.ds.hasNext()) next = createEvent((String)this.ds.nextToken()); --- 49,53 ---- * @return the Event object which is next in this EventStream */ ! public Event next () { while (next == null && this.ds.hasNext()) next = createEvent((String)this.ds.nextToken()); |
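The GISModel.eval change above only swaps field access for getters; the arithmetic is unchanged. For readers following the diff, the final step it touches can be sketched on its own: each outcome's summed parameter value is scaled by the constant inverse, optionally adjusted by the correction term, exponentiated, and then divided by the total. GisNormalization and its parameter names are hypothetical; the formula is taken from the lines shown in the diff.

```java
// Hypothetical sketch (not in the commit) of the normalisation at the end of
// GISModel.eval, using plain arguments in place of the EvalParameters getters.
final class GisNormalization {
    static double[] normalize(double[] sums, int[] numFeats, double constantInverse,
                              double correctionConstant, double correctionParam) {
        double[] probs = new double[sums.length];
        double normal = 0.0;
        for (int oid = 0; oid < sums.length; oid++) {
            double exponent = sums[oid] * constantInverse;
            if (correctionParam != 0) {
                // correction term from the diff: (1 - numfeats/correctionConstant) * correctionParam
                exponent += (1.0 - (numFeats[oid] / correctionConstant)) * correctionParam;
            }
            probs[oid] = Math.exp(exponent);
            normal += probs[oid];
        }
        for (int oid = 0; oid < probs.length; oid++) {
            probs[oid] /= normal; // values now sum to 1
        }
        return probs;
    }
}
```

With a zero correction parameter and a constant inverse of 1, this reduces to exponentiating the sums and normalising, so equal sums give a uniform distribution.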
From: Thomas M. <tsm...@us...> - 2008-11-06 19:59:58
|
Update of /cvsroot/maxent/maxent/src/java/opennlp/perceptron In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv1009/src/java/opennlp/perceptron Added Files: BinaryPerceptronModelWriter.java PerceptronModel.java PlainTextPerceptronModelReader.java BinaryPerceptronModelReader.java SuffixSensitivePerceptronModelWriter.java PerceptronModelWriter.java PerceptronTrainer.java PlainTextPerceptronModelWriter.java PerceptronModelReader.java Log Message: Updates to support perceptron models. --- NEW FILE: BinaryPerceptronModelWriter.java --- /////////////////////////////////////////////////////////////////////////////// //Copyright (C) 2001 Jason Baldridge and Gann Bierner //This library is free software; you can redistribute it and/or //modify it under the terms of the GNU Lesser General Public //License as published by the Free Software Foundation; either //version 2.1 of the License, or (at your option) any later version. //This library is distributed in the hope that it will be useful, //but WITHOUT ANY WARRANTY; without even the implied warranty of //MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the //GNU General Public License for more details. //You should have received a copy of the GNU Lesser General Public //License along with this program; if not, write to the Free Software //Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. package opennlp.perceptron; import java.io.DataOutputStream; import java.io.File; import java.io.FileOutputStream; import java.io.IOException; import java.util.zip.GZIPOutputStream; import opennlp.model.AbstractModel; /** * Model writer that saves models in binary format. * * @author Jason Baldridge * @version $Revision: 1.1 $, $Date: 2008/11/06 19:59:44 $ */ public class BinaryPerceptronModelWriter extends PerceptronModelWriter { DataOutputStream output; /** * Constructor which takes a GISModel and a File and prepares itself to * write the model to that file. 
Detects whether the file is gzipped or not * based on whether the suffix contains ".gz". * * @param model The GISModel which is to be persisted. * @param f The File in which the model is to be persisted. */ public BinaryPerceptronModelWriter (AbstractModel model, File f) throws IOException { super(model); if (f.getName().endsWith(".gz")) { output = new DataOutputStream( new GZIPOutputStream(new FileOutputStream(f))); } else { output = new DataOutputStream(new FileOutputStream(f)); } } /** * Constructor which takes a GISModel and a DataOutputStream and prepares * itself to write the model to that stream. * * @param model The GISModel which is to be persisted. * @param dos The stream which will be used to persist the model. */ public BinaryPerceptronModelWriter (AbstractModel model, DataOutputStream dos) { super(model); output = dos; } protected void writeUTF (String s) throws java.io.IOException { output.writeUTF(s); } protected void writeInt (int i) throws java.io.IOException { output.writeInt(i); } protected void writeDouble (double d) throws java.io.IOException { output.writeDouble(d); } protected void close () throws java.io.IOException { output.flush(); output.close(); } } --- NEW FILE: PerceptronModel.java --- package opennlp.perceptron; import java.io.BufferedReader; import java.io.File; import java.io.InputStreamReader; import java.text.DecimalFormat; import opennlp.model.AbstractModel; import opennlp.model.Context; import opennlp.model.EvalParameters; public class PerceptronModel extends AbstractModel { public PerceptronModel(Context[] params, String[] predLabels, String[] outcomeNames) { super(params,predLabels,outcomeNames); } public double[] eval(String[] context) { return eval(context,new double[evalParams.getNumOutcomes()]); } public double[] eval(String[] context, float[] values) { return eval(context,values,new double[evalParams.getNumOutcomes()]); } public double[] eval(String[] context, double[] probs) { return eval(context,null,probs); } public 
double[] eval(String[] context, float[] values,double[] outsums) { int[] scontexts = new int[context.length]; java.util.Arrays.fill(outsums, 0); for (int i=0; i<context.length; i++) { Integer ci = pmap.get(context[i]); scontexts[i] = ci == null ? -1 : ci; } return eval(scontexts,values,outsums,evalParams,true); } public static double[] eval(int[] context, double[] prior, EvalParameters model) { return eval(context,null,prior,model,true); } public static double[] eval(int[] context, float[] values, double[] prior, EvalParameters model, boolean normalize) { Context[] params = model.getParams(); double[] activeParameters; int[] activeOutcomes; double value = 1; for (int ci = 0; ci < context.length; ci++) { if (context[ci] >= 0) { Context predParams = params[context[ci]]; activeOutcomes = predParams.getOutcomes(); activeParameters = predParams.getParameters(); if (values != null) { value = values[ci]; } for (int ai = 0; ai < activeOutcomes.length; ai++) { int oid = activeOutcomes[ai]; prior[oid] += activeParameters[ai] * value; } } } if (normalize) { double normal = 0.0; double min = prior[0]; for (int oid = 0; oid < model.getNumOutcomes(); oid++) { if (prior[oid] < min) { min = prior[oid]; } } for (int oid = 0; oid < model.getNumOutcomes(); oid++) { if (min < 0) { prior[oid]+=(-1*min); } normal += prior[oid]; } if (normal == 0.0) { for (int oid = 0; oid < model.getNumOutcomes(); oid++) { prior[oid] = (double) 1/model.getNumOutcomes(); } } else { for (int oid = 0; oid < model.getNumOutcomes(); oid++) { prior[oid] /= normal; } } } return prior; } public static void main(String[] args) throws java.io.IOException { if (args.length == 0) { System.err.println("Usage: PerceptronModel modelname < contexts"); System.exit(1); } AbstractModel m = new PerceptronModelReader(new File(args[0])).getModel(); BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); DecimalFormat df = new java.text.DecimalFormat(".###"); for (String line = in.readLine(); line != null; 
line = in.readLine()) { String[] context = line.split(" "); double[] dist = m.eval(context); for (int oi=0;oi<dist.length;oi++) { System.out.print("["+m.getOutcome(oi)+" "+df.format(dist[oi])+"] "); } System.out.println(); } } } --- NEW FILE: PlainTextPerceptronModelReader.java --- package opennlp.perceptron; import java.io.BufferedReader; import java.io.File; import java.io.IOException; import opennlp.model.PlainTextFileDataReader; public class PlainTextPerceptronModelReader extends PerceptronModelReader { /** * Constructor which directly instantiates the BufferedReader containing * the model contents. * * @param br The BufferedReader containing the model information. */ public PlainTextPerceptronModelReader(BufferedReader br) { super(new PlainTextFileDataReader(br)); } /** * Constructor which takes a File and creates a reader for it. Detects * whether the file is gzipped or not based on whether the suffix contains * ".gz". * * @param f The File in which the model is stored. */ public PlainTextPerceptronModelReader (File f) throws IOException { super(f); } } --- NEW FILE: BinaryPerceptronModelReader.java --- package opennlp.perceptron; import java.io.DataInputStream; import java.io.File; import java.io.IOException; import opennlp.model.BinaryFileDataReader; public class BinaryPerceptronModelReader extends PerceptronModelReader { /** * Constructor which directly instantiates the DataInputStream containing * the model contents. * * @param dis The DataInputStream containing the model information. */ public BinaryPerceptronModelReader(DataInputStream dis) { super(new BinaryFileDataReader(dis)); } /** * Constructor which takes a File and creates a reader for it. Detects * whether the file is gzipped or not based on whether the suffix contains * ".gz" * * @param f The File in which the model is stored. 
*/ public BinaryPerceptronModelReader (File f) throws IOException { super(f); } } --- NEW FILE: SuffixSensitivePerceptronModelWriter.java --- /////////////////////////////////////////////////////////////////////////////// // Copyright (C) 2001 Jason Baldridge and Gann Bierner // // This library is free software; you can redistribute it and/or // modify it under the terms of the GNU Lesser General Public // License as published by the Free Software Foundation; either // version 2.1 of the License, or (at your option) any later version. // // This library is distributed in the hope that it will be useful, // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // GNU General Public License for more details. // // You should have received a copy of the GNU Lesser General Public // License along with this program; if not, write to the Free Software // Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. ////////////////////////////////////////////////////////////////////////////// package opennlp.perceptron; import java.io.BufferedWriter; import java.io.DataOutputStream; import java.io.File; import java.io.FileOutputStream; import java.io.IOException; import java.io.OutputStream; import java.io.OutputStreamWriter; import java.util.zip.GZIPOutputStream; import opennlp.model.AbstractModel; /** * A writer for GIS models which inspects the filename and invokes the * appropriate GISModelWriter depending on the filename's suffixes. 
* * <p>The following assumptions are made about suffixes: * <li>.gz --> the file is gzipped (must be the last suffix) * <li>.txt --> the file is plain text * <li>.bin --> the file is binary * * @author Jason Baldridge * @version $Revision: 1.1 $, $Date: 2008/11/06 19:59:44 $ */ public class SuffixSensitivePerceptronModelWriter extends PerceptronModelWriter { private final PerceptronModelWriter suffixAppropriateWriter; /** * Constructor which takes a model and a File and invokes the * PerceptronModelWriter appropriate for the suffix. * * @param model The model which is to be persisted. * @param f The File in which the model is to be stored. */ public SuffixSensitivePerceptronModelWriter (AbstractModel model, File f) throws IOException { super (model); OutputStream output; String filename = f.getName(); // handle the zipped/not zipped distinction if (filename.endsWith(".gz")) { output = new GZIPOutputStream(new FileOutputStream(f)); filename = filename.substring(0,filename.length()-3); } else { output = new DataOutputStream(new FileOutputStream(f)); } // handle the different formats if (filename.endsWith(".bin")) { suffixAppropriateWriter = new BinaryPerceptronModelWriter(model, new DataOutputStream(output)); } else { // default is ".txt" suffixAppropriateWriter = new PlainTextPerceptronModelWriter(model, new BufferedWriter(new OutputStreamWriter(output))); } } protected void writeUTF (String s) throws java.io.IOException { suffixAppropriateWriter.writeUTF(s); } protected void writeInt (int i) throws java.io.IOException { suffixAppropriateWriter.writeInt(i); } protected void writeDouble (double d) throws java.io.IOException { suffixAppropriateWriter.writeDouble(d); } protected void close () throws java.io.IOException { suffixAppropriateWriter.close(); } } --- NEW FILE: PerceptronModelWriter.java --- package opennlp.perceptron; import java.io.IOException; import java.util.ArrayList; import java.util.Arrays; import java.util.List; import java.util.Map; import
opennlp.model.AbstractModel; import opennlp.model.ComparablePredicate; import opennlp.model.Context; /** * Abstract parent class for Perceptron writers. It provides the persist method * which takes care of the structure of a stored document, and requires an * extending class to define precisely how the data should be stored. * * @author Jason Baldridge * @version $Revision: 1.1 $, $Date: 2008/11/06 19:59:44 $ */ public abstract class PerceptronModelWriter { protected Context[] PARAMS; protected String[] OUTCOME_LABELS; protected String[] PRED_LABELS; int numOutcomes; public PerceptronModelWriter (AbstractModel model) { Object[] data = model.getDataStructures(); this.numOutcomes = model.getNumOutcomes(); PARAMS = (Context[]) data[0]; Map<String,Integer> pmap = (Map<String,Integer>)data[1]; OUTCOME_LABELS = (String[])data[2]; PRED_LABELS = new String[pmap.size()]; for (String pred : pmap.keySet()) { PRED_LABELS[pmap.get(pred)] = pred; } } protected abstract void writeUTF (String s) throws java.io.IOException; protected abstract void writeInt (int i) throws java.io.IOException; protected abstract void writeDouble (double d) throws java.io.IOException; protected abstract void close () throws java.io.IOException; /** * Writes the model to disk, using the <code>writeX()</code> methods * provided by extending classes. * * <p>If you wish to create a PerceptronModelWriter which uses a different * structure, it will be necessary to override the persist method in * addition to implementing the <code>writeX()</code> methods. */ public void persist() throws IOException { // the type of model (GIS) writeUTF("Perceptron"); // the mapping from outcomes to their integer indexes writeInt(OUTCOME_LABELS.length); for (int i=0; i<OUTCOME_LABELS.length; i++) writeUTF(OUTCOME_LABELS[i]); // the mapping from predicates to the outcomes they contributed to. // The sorting is done so that we actually can write this out more // compactly than as the entire list. 
ComparablePredicate[] sorted = sortValues(); List compressed = compressOutcomes(sorted); writeInt(compressed.size()); for (int i=0; i<compressed.size(); i++) { List a = (List)compressed.get(i); writeUTF(a.size() + ((ComparablePredicate)a.get(0)).toString()); } // the mapping from predicate names to their integer indexes writeInt(sorted.length); for (int i=0; i<sorted.length; i++) writeUTF(sorted[i].name); // write out the parameters for (int i=0; i<sorted.length; i++) for (int j=0; j<sorted[i].params.length; j++) writeDouble(sorted[i].params[j]); close(); } protected ComparablePredicate[] sortValues () { ComparablePredicate[] sortPreds; ComparablePredicate[] tmpPreds = new ComparablePredicate[PARAMS.length]; int[] tmpOutcomes = new int[numOutcomes]; double[] tmpParams = new double[numOutcomes]; int numPreds = 0; //remove parameters with 0 weight and predicates with no parameters for (int pid=0; pid<PARAMS.length; pid++) { int numParams = 0; double[] predParams = PARAMS[pid].getParameters(); for (int pi=0;pi<predParams.length;pi++) { if (predParams[pi] != 0d) { tmpOutcomes[numParams]=pi; tmpParams[numParams]=predParams[pi]; numParams++; } } int[] activeOutcomes = new int[numParams]; double[] activeParams = new double[numParams]; for (int pi=0;pi<numParams;pi++) { activeOutcomes[pi] = tmpOutcomes[pi]; activeParams[pi] = tmpParams[pi]; } if (numParams != 0) { tmpPreds[numPreds] = new ComparablePredicate(PRED_LABELS[pid],activeOutcomes,activeParams); numPreds++; } } sortPreds = new ComparablePredicate[numPreds]; for (int pid=0;pid<numPreds;pid++) { sortPreds[pid] = tmpPreds[pid]; } Arrays.sort(sortPreds); return sortPreds; } protected List compressOutcomes (ComparablePredicate[] sorted) { ComparablePredicate cp = sorted[0]; List outcomePatterns = new ArrayList(); List newGroup = new ArrayList(); for (int i=0; i<sorted.length; i++) { if (cp.compareTo(sorted[i]) == 0) { newGroup.add(sorted[i]); } else { cp = sorted[i]; outcomePatterns.add(newGroup); newGroup = new 
ArrayList(); newGroup.add(sorted[i]); } } outcomePatterns.add(newGroup); return outcomePatterns; } } --- NEW FILE: PerceptronTrainer.java --- package opennlp.perceptron; import opennlp.model.AbstractModel; import opennlp.model.DataIndexer; import opennlp.model.EvalParameters; import opennlp.model.MutableContext; public class PerceptronTrainer { /** Number of unique events which occurred in the event set. */ private int numUniqueEvents; /** Number of events in the event set. */ private int numEvents; /** Number of predicates. */ private int numPreds; /** Number of outcomes. */ private int numOutcomes; /** Records the array of predicates seen in each event. */ private int[][] contexts; /** The value associated with each context. If null then context values are assumed to be 1. */ private float[][] values; /** List of outcomes for each event i, in context[i]. */ private int[] outcomeList; /** Records the number of times an event has been seen for each event i, in context[i]. */ private int[] numTimesEventsSeen; /** Stores the String names of the outcomes. The trainer only tracks outcomes as ints, and so this array is needed to save the model to disk and thereby allow users to know what the outcome was in human understandable terms. */ private String[] outcomeLabels; /** Stores the String names of the predicates. The trainer only tracks predicates as ints, and so this array is needed to save the model to disk and thereby allow users to know what the predicate was in human understandable terms. */ private String[] predLabels; /** Stores the estimated parameter value of each predicate during iteration. */ private MutableContext[] params; private int[][][] updates; private int VALUE = 0; private int ITER = 1; private int EVENT = 2; /** Stores the average parameter values of each predicate during iteration.
*/ private MutableContext[] averageParams; private EvalParameters evalParams; private boolean printMessages = true; double[] modelDistribution; private int iterations; private boolean useAverage; public AbstractModel trainModel(int iterations, DataIndexer di, int cutoff) { this.iterations = iterations; return trainModel(iterations,di,cutoff,true); } public AbstractModel trainModel(int iterations, DataIndexer di, int cutoff, boolean useAverage) { display("Incorporating indexed data for training... \n"); this.useAverage = useAverage; contexts = di.getContexts(); values = di.getValues(); numTimesEventsSeen = di.getNumTimesEventsSeen(); numEvents = di.getNumEvents(); numUniqueEvents = contexts.length; this.iterations = iterations; outcomeLabels = di.getOutcomeLabels(); outcomeList = di.getOutcomeList(); predLabels = di.getPredLabels(); numPreds = predLabels.length; numOutcomes = outcomeLabels.length; if (useAverage) updates = new int[numPreds][numOutcomes][3]; display("done.\n"); display("\tNumber of Event Tokens: " + numUniqueEvents + "\n"); display("\t Number of Outcomes: " + numOutcomes + "\n"); display("\t Number of Predicates: " + numPreds + "\n"); params = new MutableContext[numPreds]; if (useAverage) averageParams = new MutableContext[numPreds]; evalParams = new EvalParameters(params,numOutcomes); int[] allOutcomesPattern= new int[numOutcomes]; for (int oi = 0; oi < numOutcomes; oi++) { allOutcomesPattern[oi] = oi; } int numActiveOutcomes = numOutcomes; for (int pi = 0; pi < numPreds; pi++) { params[pi] = new MutableContext(allOutcomesPattern,new double[numActiveOutcomes]); if (useAverage) averageParams[pi] = new MutableContext(allOutcomesPattern,new double[numActiveOutcomes]); for (int aoi=0;aoi<numActiveOutcomes;aoi++) { params[pi].setParameter(aoi, 0.0); if (useAverage) averageParams[pi].setParameter(aoi, 0.0); } } modelDistribution = new double[numOutcomes]; display("Computing model parameters...\n"); findParameters(iterations); display("...done.\n"); 
/*************** Create and return the model ******************/ if (useAverage) { return new PerceptronModel(averageParams, predLabels, outcomeLabels); } else { return new PerceptronModel(params, predLabels, outcomeLabels); } } private void display(String s) { if (printMessages) System.out.print(s); } private void findParameters(int iterations) { display("Performing " + iterations + " iterations.\n"); for (int i = 1; i <= iterations; i++) { if (i < 10) display(" " + i + ": "); else if (i < 100) display(" " + i + ": "); else display(i + ": "); nextIteration(i); } // kill a bunch of these big objects now that we don't need them numTimesEventsSeen = null; contexts = null; } /* Compute one iteration of Perceptron training and display the training accuracy.*/ private void nextIteration(int iteration) { iteration--; //move to 0-based index int numCorrect = 0; for (int ei = 0; ei < numUniqueEvents; ei++) { for (int ni=0;ni<this.numTimesEventsSeen[ei];ni++) { for (int oi = 0; oi < numOutcomes; oi++) { modelDistribution[oi] = 0; } if (values != null) { PerceptronModel.eval(contexts[ei], values[ei], modelDistribution, evalParams,false); } else { PerceptronModel.eval(contexts[ei], null, modelDistribution, evalParams, false); } int max = 0; for (int oi = 1; oi < numOutcomes; oi++) { if (modelDistribution[oi] > modelDistribution[max]) { max = oi; } } if (max == outcomeList[ei]) { numCorrect += numTimesEventsSeen[ei]; } for (int oi = 0;oi<numOutcomes;oi++) { if (oi == outcomeList[ei]) { if (modelDistribution[oi] <= 0) { for (int ci = 0; ci < contexts[ei].length; ci++) { int pi = contexts[ei][ci]; if (values == null) { params[pi].updateParameter(oi, 1); } else { params[pi].updateParameter(oi, values[ei][ci]); } if (useAverage) { if (updates[pi][oi][VALUE] != 0) { averageParams[pi].updateParameter(oi,updates[pi][oi][VALUE]*(numEvents*(iteration-updates[pi][oi][ITER])+(ei-updates[pi][oi][EVENT]))); }
//System.err.println("updates["+pi+"]["+oi+"]=("+updates[pi][oi][ITER]+","+updates[pi][oi][EVENT]+","+updates[pi][oi][VALUE]+") + ("+iteration+","+ei+","+params[pi].getParameters()[oi]+") -> "+averageParams[pi].getParameters()[oi]); updates[pi][oi][VALUE] = (int) params[pi].getParameters()[oi]; updates[pi][oi][ITER] = iteration; updates[pi][oi][EVENT] = ei; } } } } else { if (modelDistribution[oi] > 0) { for (int ci = 0; ci < contexts[ei].length; ci++) { int pi = contexts[ei][ci]; if (values == null) { params[pi].updateParameter(oi,-1); } else { params[pi].updateParameter(oi, values[ei][ci]*-1); } if (useAverage) { if (updates[pi][oi][VALUE] != 0) { averageParams[pi].updateParameter(oi,updates[pi][oi][VALUE]*(numEvents*(iteration-updates[pi][oi][ITER])+(ei-updates[pi][oi][EVENT]))); } //System.err.println("updates["+pi+"]["+oi+"]=("+updates[pi][oi][ITER]+","+updates[pi][oi][EVENT]+","+updates[pi][oi][VALUE]+") + ("+iteration+","+ei+","+params[pi].getParameters()[oi]+") -> "+averageParams[pi].getParameters()[oi]); updates[pi][oi][VALUE] = (int) params[pi].getParameters()[oi]; updates[pi][oi][ITER] = iteration; updates[pi][oi][EVENT] = ei; } } } } } } } //finish average computation double totIterations = (double) iterations*numEvents; if (useAverage && iteration == iterations-1) { for (int pi = 0; pi < numPreds; pi++) { double[] predParams = averageParams[pi].getParameters(); for (int oi = 0;oi<numOutcomes;oi++) { if (updates[pi][oi][VALUE] != 0) { predParams[oi] += updates[pi][oi][VALUE]*(numEvents*(iterations-updates[pi][oi][ITER])-updates[pi][oi][EVENT]); } if (predParams[oi] != 0) { predParams[oi] /=totIterations; averageParams[pi].setParameter(oi, predParams[oi]); //System.err.println("updates["+pi+"]["+oi+"]=("+updates[pi][oi][ITER]+","+updates[pi][oi][EVENT]+","+updates[pi][oi][VALUE]+") + ("+iterations+","+0+","+params[pi].getParameters()[oi]+") -> "+averageParams[pi].getParameters()[oi]); } } } } display(". 
"+((double) numCorrect / numEvents) + "\n"); } } --- NEW FILE: PlainTextPerceptronModelWriter.java --- /////////////////////////////////////////////////////////////////////////////// // Copyright (C) 2001 Jason Baldridge and Gann Bierner // // This library is free software; you can redistribute it and/or // modify it under the terms of the GNU Lesser General Public // License as published by the Free Software Foundation; either // version 2.1 of the License, or (at your option) any later version. // // This library is distributed in the hope that it will be useful, // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // GNU General Public License for more details. // // You should have received a copy of the GNU Lesser General Public // License along with this program; if not, write to the Free Software // Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. ////////////////////////////////////////////////////////////////////////////// package opennlp.perceptron; import java.io.BufferedWriter; import java.io.File; import java.io.FileNotFoundException; import java.io.FileOutputStream; import java.io.FileWriter; import java.io.IOException; import java.io.OutputStreamWriter; import java.util.zip.GZIPOutputStream; import opennlp.model.AbstractModel; /** * Model writer that saves models in plain text format. * * @author Jason Baldridge * @version $Revision: 1.1 $, $Date: 2008/11/06 19:59:44 $ */ public class PlainTextPerceptronModelWriter extends PerceptronModelWriter { BufferedWriter output; /** * Constructor which takes a PerceptronModel and a File and prepares itself to * write the model to that file. Detects whether the file is gzipped or not * based on whether the suffix contains ".gz". * * @param model The PerceptronModel which is to be persisted. * @param f The File in which the model is to be persisted. 
*/ public PlainTextPerceptronModelWriter (AbstractModel model, File f) throws IOException, FileNotFoundException { super(model); if (f.getName().endsWith(".gz")) { output = new BufferedWriter(new OutputStreamWriter( new GZIPOutputStream(new FileOutputStream(f)))); } else { output = new BufferedWriter(new FileWriter(f)); } } /** * Constructor which takes a PerceptronModel and a BufferedWriter and prepares * itself to write the model to that writer. * * @param model The PerceptronModel which is to be persisted. * @param bw The BufferedWriter which will be used to persist the model. */ public PlainTextPerceptronModelWriter (AbstractModel model, BufferedWriter bw) { super(model); output = bw; } protected void writeUTF (String s) throws java.io.IOException { output.write(s); output.newLine(); } protected void writeInt (int i) throws java.io.IOException { output.write(Integer.toString(i)); output.newLine(); } protected void writeDouble (double d) throws java.io.IOException { output.write(Double.toString(d)); output.newLine(); } protected void close () throws java.io.IOException { output.flush(); output.close(); } } --- NEW FILE: PerceptronModelReader.java --- package opennlp.perceptron; import java.io.File; import java.io.IOException; import opennlp.model.AbstractModel; import opennlp.model.AbstractModelReader; import opennlp.model.Context; import opennlp.model.DataReader; /** * Abstract parent class for readers of GISModels. * * @author Jason Baldridge * @version $Revision: 1.1 $, $Date: 2008/11/06 19:59:44 $ */ public class PerceptronModelReader extends AbstractModelReader { public PerceptronModelReader(File file) throws IOException { super(file); } public PerceptronModelReader(DataReader dataReader) { super(dataReader); } /** * Retrieve a model from disk. It assumes that models are saved in the * following sequence: * * <br>Perceptron (model type identifier) * <br>1. # of parameters (int) * <br>2. # of outcomes (int) * <br> * list of outcome names (String) * <br>3. 
# of different types of outcome patterns (int) * <br> * list of (int int[]) * <br> [# of predicates for which outcome pattern is true] [outcome pattern] * <br>4. # of predicates (int) * <br> * list of predicate names (String) * * <p>If you are creating a reader for a format which won't work with this * (perhaps a database or XML file), override this method and ignore the * other methods provided in this abstract class. * * @return The PerceptronModel stored in the format and location specified to * this PerceptronModelReader (usually via its constructor). */ public AbstractModel constructModel() throws IOException { String[] outcomeLabels = getOutcomes(); int[][] outcomePatterns = getOutcomePatterns(); String[] predLabels = getPredicates(); Context[] params = getParameters(outcomePatterns); return new PerceptronModel(params, predLabels, outcomeLabels); } public void checkModelType() throws java.io.IOException { String modelType = readUTF(); if (!modelType.equals("Perceptron")) System.out.println("Error: attempting to load a "+modelType+ " model as a Perceptron model."+ " You should expect problems."); } } |
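For illustration only, here is a minimal sketch of reading the start of a plain-text model file as persist() writes it through PlainTextPerceptronModelWriter: the type string "Perceptron", then the number of outcomes, then one outcome label per line. The class name PlainTextHeaderSketch and the method readOutcomeLabels are hypothetical and are not part of opennlp.perceptron. Unlike checkModelType() above, which only prints a warning, this sketch assumes that rejecting a mismatched type with an exception is acceptable.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of opennlp.perceptron.
public class PlainTextHeaderSketch {

    /**
     * Reads the model type line, the outcome count, and the outcome labels,
     * each of which the plain text writer emits on its own line.
     */
    public static List<String> readOutcomeLabels(BufferedReader in) throws IOException {
        String modelType = in.readLine();
        if (!"Perceptron".equals(modelType)) {
            // Assumption: fail fast instead of printing a warning.
            throw new IOException("Expected a Perceptron model but found: " + modelType);
        }
        String countLine = in.readLine();
        if (countLine == null) {
            throw new IOException("Missing outcome count");
        }
        int numOutcomes = Integer.parseInt(countLine.trim());
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < numOutcomes; i++) {
            String label = in.readLine();
            if (label == null) {
                throw new IOException("Expected " + numOutcomes + " outcome labels, found " + i);
            }
            labels.add(label);
        }
        return labels;
    }

    public static void main(String[] args) throws IOException {
        // A hand-written header with two made-up outcome labels.
        String header = "Perceptron\n2\nyes\nno\n";
        List<String> labels =
            readOutcomeLabels(new BufferedReader(new StringReader(header)));
        System.out.println(labels); // prints [yes, no]
    }
}
```

The outcome patterns, predicate names, and parameters that follow in the file are left out of this sketch; in the committed code they are read by the inherited AbstractModelReader methods called from constructModel().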