From: Eric F. <er...@us...> - 2002-01-03 14:34:34
Update of /cvsroot/maxent/maxent
In directory usw-pr-cvs1:/tmp/cvs-serv12842

Modified Files:
	CHANGES
Log Message:
indexer now drops events with 0 active features

Index: CHANGES
===================================================================
RCS file: /cvsroot/maxent/maxent/CHANGES,v
retrieving revision 1.7
retrieving revision 1.8
diff -C2 -d -r1.7 -r1.8
*** CHANGES	2002/01/02 20:00:39	1.7
--- CHANGES	2002/01/03 14:34:29	1.8
***************
*** 2,5 ****
--- 2,9 ----
  _____
+ 
+ (opennlp.maxent.DataIndexer) Do not index events with 0 active features.
+ (Eric)
+ 
  Upgraded trove dependency to 0.1.1 (includes TIntArrayList, with reset())
  (Eric)
From: Eric F. <er...@us...> - 2002-01-03 14:34:34
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent
In directory usw-pr-cvs1:/tmp/cvs-serv12842/src/java/opennlp/maxent

Modified Files:
	DataIndexer.java
Log Message:
indexer now drops events with 0 active features

Index: DataIndexer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/DataIndexer.java,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -d -r1.6 -r1.7
*** DataIndexer.java	2002/01/02 20:00:39	1.6
--- DataIndexer.java	2002/01/03 14:34:29	1.7
***************
*** 198,203 ****
          }
      }
!     eventsToCompare[eventIndex] =
!         new ComparableEvent(ocID, indexedContext.toNativeArray());
      // recycle the TIntArrayList
      indexedContext.resetQuick();
--- 198,207 ----
          }
      }
! 
!     // drop events with no active features
!     if (indexedContext.size() > 0) {
!         eventsToCompare[eventIndex] =
!             new ComparableEvent(ocID, indexedContext.toNativeArray());
!     }
      // recycle the TIntArrayList
      indexedContext.resetQuick();
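The effect of the change above can be illustrated with a small standalone sketch. This is not the project's code: `java.util` collections stand in for GNU Trove, and `indexContext` is a hypothetical helper name; only the drop-events-with-no-active-features rule comes from the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class ZeroFeatureFilter {

    // Maps a context to its surviving predicate indexes; returns null when
    // no predicate survived the cutoff, i.e. the event has 0 active
    // features and should be dropped rather than indexed.
    static int[] indexContext(String[] context, Map<String, Integer> predicateIndex) {
        List<Integer> indexed = new ArrayList<>();
        for (String pred : context) {
            Integer id = predicateIndex.get(pred);
            if (id != null) {
                indexed.add(id);
            }
        }
        if (indexed.isEmpty()) {
            return null; // drop events with no active features
        }
        int[] result = new int[indexed.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = indexed.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> predIndex = Map.of("a", 0, "b", 1);
        // "x" is unknown, so only "a" contributes an index
        System.out.println(Arrays.toString(indexContext(new String[] {"a", "x"}, predIndex))); // [0]
        // no known predicates at all: the event is dropped
        System.out.println(indexContext(new String[] {"x", "y"}, predIndex)); // null
    }
}
```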
From: Eric F. <er...@us...> - 2002-01-02 20:00:44
Update of /cvsroot/maxent/maxent/lib
In directory usw-pr-cvs1:/tmp/cvs-serv11395/lib

Modified Files:
	LIBNOTES trove.jar
Log Message:
[copied from CHANGES file]

Upgraded trove dependency to 0.1.1 (includes TIntArrayList, with reset())
(Eric)

(opennlp.maxent.DataIndexer)
Refactored event count computation so that the cutoff can be applied while
events are read. This obviates the need for a separate pass over the
predicates between event count computation and indexing. It also saves
memory by reducing the amount of temporary data needed and by avoiding
creation of instances of the Counter class. the applyCutoff() method
was no longer needed and so is gone. (Eric)

(opennlp.maxent.DataIndexer)
Made the event count computation + cutoff application also handle the
assignment of unique indexes to predicates that "make the cut." This
saves a fair amount of time in the indexing process. (Eric)

(opennlp.maxent.DataIndexer)
Refactored the indexing implementation so that TIntArrayLists are
(re-)used for constructing the array of predicate references associated
with each ComparableEvent. Using the TIntArrayList instead of an
ArrayList of Integers dramatically reduces the amount of garbage
produced during indexing; it's also smaller. (Eric)

(opennlp.maxent.DataIndexer)
removed toIntArray() method, since TIntArrayList provides the same
behavior without the cost of a loop over a List of Integers (Eric)

(opennlp.maxent.DataIndexer)
changed indexing Maps to TObjectIntHashMaps to save space in several
places. (Eric)

Index: LIBNOTES
===================================================================
RCS file: /cvsroot/maxent/maxent/lib/LIBNOTES,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -d -r1.6 -r1.7
*** LIBNOTES	2002/01/02 11:20:22	1.6
--- LIBNOTES	2002/01/02 20:00:39	1.7
***************
*** 29,33 ****
  trove.jar
! GNU Trove, version 0.1.0
  Homepage: http://trove4j.sf.net
  License: LGPL
--- 29,33 ----
  trove.jar
! GNU Trove, version 0.1.1
  Homepage: http://trove4j.sf.net
  License: LGPL

Index: trove.jar
===================================================================
RCS file: /cvsroot/maxent/maxent/lib/trove.jar,v
retrieving revision 1.7
retrieving revision 1.8
diff -C2 -d -r1.7 -r1.8
Binary files /tmp/cvsgbBB3W and /tmp/cvs8e1VSO differ
From: Eric F. <er...@us...> - 2002-01-02 20:00:44
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent
In directory usw-pr-cvs1:/tmp/cvs-serv11395/src/java/opennlp/maxent

Modified Files:
	DataIndexer.java
Log Message:
[copied from CHANGES file]

Upgraded trove dependency to 0.1.1 (includes TIntArrayList, with reset())
(Eric)

(opennlp.maxent.DataIndexer)
Refactored event count computation so that the cutoff can be applied while
events are read. This obviates the need for a separate pass over the
predicates between event count computation and indexing. It also saves
memory by reducing the amount of temporary data needed and by avoiding
creation of instances of the Counter class. the applyCutoff() method
was no longer needed and so is gone. (Eric)

(opennlp.maxent.DataIndexer)
Made the event count computation + cutoff application also handle the
assignment of unique indexes to predicates that "make the cut." This
saves a fair amount of time in the indexing process. (Eric)

(opennlp.maxent.DataIndexer)
Refactored the indexing implementation so that TIntArrayLists are
(re-)used for constructing the array of predicate references associated
with each ComparableEvent. Using the TIntArrayList instead of an
ArrayList of Integers dramatically reduces the amount of garbage
produced during indexing; it's also smaller. (Eric)

(opennlp.maxent.DataIndexer)
removed toIntArray() method, since TIntArrayList provides the same
behavior without the cost of a loop over a List of Integers (Eric)

(opennlp.maxent.DataIndexer)
changed indexing Maps to TObjectIntHashMaps to save space in several
places. (Eric)

Index: DataIndexer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/DataIndexer.java,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** DataIndexer.java	2001/12/27 19:20:26	1.5
--- DataIndexer.java	2002/01/02 20:00:39	1.6
***************
*** 36,40 ****
  public String[] predLabels;
  public String[] outcomeLabels;
- private static final IntegerPool intPool = new IntegerPool(50);

  /**
--- 36,39 ----
***************
*** 58,82 ****
   */
  public DataIndexer(EventStream eventStream, int cutoff) {
!     Map count;
      TLinkedList events;

!     System.out.println("Indexing events");
      System.out.print("\tComputing event counts... ");
!     count = new THashMap();
!     events = computeEventCounts(eventStream,count);
!     //for(int tid=0; tid<events.length; tid++) {
      System.out.println("done.");
- 
-     System.out.print("\tPerforming cutoff of " + cutoff + "... ");
-     applyCutoff(count, cutoff);
-     System.out.println("done.");
- 
      System.out.print("\tIndexing... ");
!     ComparableEvent[] eventsToCompare = index(events,count);
      // done with event list
      events = null;
!     // done with predicate counts
!     count = null;
      System.out.println("done.");
--- 57,77 ----
   */
  public DataIndexer(EventStream eventStream, int cutoff) {
!     TObjectIntHashMap predicateIndex;
      TLinkedList events;
+     ComparableEvent[] eventsToCompare;

!     predicateIndex = new TObjectIntHashMap();
!     System.out.println("Indexing events using cutoff of " + cutoff + "\n");
      System.out.print("\tComputing event counts... ");
!     events = computeEventCounts(eventStream,predicateIndex,cutoff);
      System.out.println("done.");
      System.out.print("\tIndexing... ");
!     eventsToCompare = index(events,predicateIndex);
      // done with event list
      events = null;
!     // done with predicates
!     predicateIndex = null;
      System.out.println("done.");
***************
*** 135,141 ****
  private TLinkedList computeEventCounts(EventStream eventStream,
!                                        Map count) {
      TLinkedList events = new TLinkedList();
      while (eventStream.hasNext()) {
          Event ev = eventStream.nextEvent();
--- 130,151 ----
+ /**
+  * Reads events from <tt>eventStream</tt> into a linked list. The
+  * predicates associated with each event are counted and any which
+  * occur at least <tt>cutoff</tt> times are added to the
+  * <tt>predicatesInOut</tt> map along with a unique integer index.
+  *
+  * @param eventStream an <code>EventStream</code> value
+  * @param predicatesInOut a <code>TObjectIntHashMap</code> value
+  * @param cutoff an <code>int</code> value
+  * @return a <code>TLinkedList</code> value
+  */
  private TLinkedList computeEventCounts(EventStream eventStream,
!                                        TObjectIntHashMap predicatesInOut,
!                                        int cutoff) {
!     TObjectIntHashMap counter = new TObjectIntHashMap();
      TLinkedList events = new TLinkedList();
+     int predicateIndex = 0;
+ 
      while (eventStream.hasNext()) {
          Event ev = eventStream.nextEvent();
***************
*** 143,173 ****
          String[] ec = ev.getContext();
          for (int j=0; j<ec.length; j++) {
!             Counter counter = (Counter)count.get(ec[j]);
!             if (counter!=null) {
!                 counter.increment();
!             } else {
!                 count.put(ec[j], new Counter());
              }
          }
      }
      return events;
  }

- private void applyCutoff(Map count, int cutoff) {
-     if (cutoff == 0) {
-         return; // nothing to do
-     }
- 
-     for (Iterator cit=count.keySet().iterator(); cit.hasNext();) {
-         String pred = (String)cit.next();
-         if (! ((Counter)count.get(pred)).passesCutoff(cutoff)) {
-             cit.remove();
-         }
-     }
- }
- 
  private ComparableEvent[] index(TLinkedList events,
!                                 Map count) {
!     Map omap = new THashMap(), pmap = new THashMap();
      int numEvents = events.size();
--- 153,174 ----
          String[] ec = ev.getContext();
          for (int j=0; j<ec.length; j++) {
!             if (! predicatesInOut.containsKey(ec[j])) {
!                 int count = counter.get(ec[j]) + 1;
!                 if (count >= cutoff) {
!                     predicatesInOut.put(ec[j], predicateIndex++);
!                     counter.remove(ec[j]);
!                 } else {
!                     counter.put(ec[j], count);
!                 }
              }
          }
      }
+     predicatesInOut.trimToSize();
      return events;
  }

  private ComparableEvent[] index(TLinkedList events,
!                                 TObjectIntHashMap predicateIndex) {
!     TObjectIntHashMap omap = new TObjectIntHashMap();
      int numEvents = events.size();
***************
*** 175,178 ****
--- 176,180 ----
      int predCount = 0;
      ComparableEvent[] eventsToCompare = new ComparableEvent[numEvents];
+     TIntArrayList indexedContext = new TIntArrayList();

      for (int eventIndex=0; eventIndex<numEvents; eventIndex++) {
***************
*** 180,212 ****
          String[] econtext = ev.getContext();
!         Integer predID, ocID;
          String oc = ev.getOutcome();

          if (omap.containsKey(oc)) {
!             ocID = (Integer)omap.get(oc);
          } else {
!             ocID = intPool.get(outcomeCount++);
              omap.put(oc, ocID);
          }
- 
-         List indexedContext = new ArrayList();
          for (int i=0; i<econtext.length; i++) {
              String pred = econtext[i];
!             if (count.containsKey(pred)) {
!                 if (pmap.containsKey(pred)) {
!                     predID = (Integer)pmap.get(pred);
!                 } else {
!                     predID = intPool.get(predCount++);
!                     pmap.put(pred, predID);
!                 }
!                 indexedContext.add(predID);
              }
          }
          eventsToCompare[eventIndex] =
!             new ComparableEvent(ocID.intValue(),
!                                 toIntArray(indexedContext));
      }
      outcomeLabels = toIndexedStringArray(omap);
!     predLabels = toIndexedStringArray(pmap);
      return eventsToCompare;
  }
--- 182,208 ----
          String[] econtext = ev.getContext();
!         int predID, ocID;
          String oc = ev.getOutcome();

          if (omap.containsKey(oc)) {
!             ocID = omap.get(oc);
          } else {
!             ocID = outcomeCount++;
              omap.put(oc, ocID);
          }
          for (int i=0; i<econtext.length; i++) {
              String pred = econtext[i];
!             if (predicateIndex.containsKey(pred)) {
!                 indexedContext.add(predicateIndex.get(pred));
              }
          }
          eventsToCompare[eventIndex] =
!             new ComparableEvent(ocID, indexedContext.toNativeArray());
!         // recycle the TIntArrayList
!         indexedContext.resetQuick();
      }
      outcomeLabels = toIndexedStringArray(omap);
!     predLabels = toIndexedStringArray(predicateIndex);
      return eventsToCompare;
  }
***************
*** 218,250 ****
   * labels should be inserted.
   *
!  * @param labelToIndexMap a <code>Map</code> value
   * @return a <code>String[]</code> value
   * @since maxent 1.2.6
   */
! static String[] toIndexedStringArray(Map labelToIndexMap) {
!     String[] array = new String[labelToIndexMap.size()];
!     for (Iterator i = labelToIndexMap.keySet().iterator(); i.hasNext();) {
!         String label = (String)i.next();
!         int index = ((Integer)labelToIndexMap.get(label)).intValue();
!         array[index] = label;
!     }
      return array;
- }
- 
- /**
-  * Utility method for turning a list of Integer objects into a
-  * native array of primitive ints.
-  *
-  * @param integers a <code>List</code> value
-  * @return an <code>int[]</code> value
-  * @since maxent 1.2.6
-  */
- static final int[] toIntArray(List integers) {
-     int[] rv = new int[integers.size()];
-     int i = 0;
-     for (Iterator it = integers.iterator(); it.hasNext();) {
-         rv[i++] = ((Integer)it.next()).intValue();
-     }
-     return rv;
  }
--- 214,230 ----
   * labels should be inserted.
   *
!  * @param labelToIndexMap a <code>TObjectIntHashMap</code> value
   * @return a <code>String[]</code> value
   * @since maxent 1.2.6
   */
! static String[] toIndexedStringArray(TObjectIntHashMap labelToIndexMap) {
!     final String[] array = new String[labelToIndexMap.size()];
!     labelToIndexMap.forEachEntry(new TObjectIntProcedure() {
!         public boolean execute(Object str, int index) {
!             array[index] = (String)str;
!             return true;
!         }
!     });
      return array;
  }
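The single-pass count-and-cutoff scheme described in the log message above can be sketched in isolation. This is a simplified illustration, not the project's code: `java.util.HashMap` stands in for Trove's `TObjectIntHashMap`, and `indexWithCutoff` is a hypothetical method name. A predicate is counted as events stream by; the moment its count reaches the cutoff it is promoted into the index map with the next free index, so no separate cutoff pass over the counts is needed.

```java
import java.util.HashMap;
import java.util.Map;

public class StreamingCutoff {

    // Returns the predicate -> index map for all predicates occurring at
    // least 'cutoff' times across the given event contexts, assigning
    // indexes in the order predicates reach the cutoff.
    static Map<String, Integer> indexWithCutoff(String[][] contexts, int cutoff) {
        Map<String, Integer> predicatesInOut = new HashMap<>();
        Map<String, Integer> counter = new HashMap<>();
        int nextIndex = 0;
        for (String[] context : contexts) {
            for (String pred : context) {
                if (!predicatesInOut.containsKey(pred)) {
                    int count = counter.getOrDefault(pred, 0) + 1;
                    if (count >= cutoff) {
                        // promote: assign a unique index, drop the count
                        predicatesInOut.put(pred, nextIndex++);
                        counter.remove(pred);
                    } else {
                        counter.put(pred, count);
                    }
                }
            }
        }
        return predicatesInOut;
    }

    public static void main(String[] args) {
        String[][] events = { {"a", "b"}, {"a", "c"}, {"a", "b"} };
        // "a" and "b" occur twice and make the cut; "c" does not
        System.out.println(indexWithCutoff(events, 2));
    }
}
```

Note the space saving the log message mentions: once a predicate is promoted, its count entry is removed, so at any moment each predicate lives in at most one of the two maps.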
From: Eric F. <er...@us...> - 2002-01-02 20:00:43
Update of /cvsroot/maxent/maxent
In directory usw-pr-cvs1:/tmp/cvs-serv11395

Modified Files:
	CHANGES
Log Message:
[copied from CHANGES file]

Upgraded trove dependency to 0.1.1 (includes TIntArrayList, with reset())
(Eric)

(opennlp.maxent.DataIndexer)
Refactored event count computation so that the cutoff can be applied while
events are read. This obviates the need for a separate pass over the
predicates between event count computation and indexing. It also saves
memory by reducing the amount of temporary data needed and by avoiding
creation of instances of the Counter class. the applyCutoff() method
was no longer needed and so is gone. (Eric)

(opennlp.maxent.DataIndexer)
Made the event count computation + cutoff application also handle the
assignment of unique indexes to predicates that "make the cut." This
saves a fair amount of time in the indexing process. (Eric)

(opennlp.maxent.DataIndexer)
Refactored the indexing implementation so that TIntArrayLists are
(re-)used for constructing the array of predicate references associated
with each ComparableEvent. Using the TIntArrayList instead of an
ArrayList of Integers dramatically reduces the amount of garbage
produced during indexing; it's also smaller. (Eric)

(opennlp.maxent.DataIndexer)
removed toIntArray() method, since TIntArrayList provides the same
behavior without the cost of a loop over a List of Integers (Eric)

(opennlp.maxent.DataIndexer)
changed indexing Maps to TObjectIntHashMaps to save space in several
places. (Eric)

Index: CHANGES
===================================================================
RCS file: /cvsroot/maxent/maxent/CHANGES,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -d -r1.6 -r1.7
*** CHANGES	2002/01/02 11:31:30	1.6
--- CHANGES	2002/01/02 20:00:39	1.7
***************
*** 1,2 ****
--- 1,36 ----
+ 1.2.7
+ _____
+ 
+ Upgraded trove dependency to 0.1.1 (includes TIntArrayList, with reset())
+ (Eric)
+ 
+ (opennlp.maxent.DataIndexer)
+ Refactored event count computation so that the cutoff can be applied while
+ events are read. This obviates the need for a separate pass over the
+ predicates between event count computation and indexing. It also saves
+ memory by reducing the amount of temporary data needed and by avoiding
+ creation of instances of the Counter class. the applyCutoff() method
+ was no longer needed and so is gone. (Eric)
+ 
+ (opennlp.maxent.DataIndexer)
+ Made the event count computation + cutoff application also handle the
+ assignment of unique indexes to predicates that "make the cut." This
+ saves a fair amount of time in the indexing process. (Eric)
+ 
+ (opennlp.maxent.DataIndexer)
+ Refactored the indexing implementation so that TIntArrayLists are
+ (re-)used for constructing the array of predicate references associated
+ with each ComparableEvent. Using the TIntArrayList instead of an
+ ArrayList of Integers dramatically reduces the amount of garbage
+ produced during indexing; it's also smaller. (Eric)
+ 
+ (opennlp.maxent.DataIndexer)
+ removed toIntArray() method, since TIntArrayList provides the same
+ behavior without the cost of a loop over a List of Integers (Eric)
+ 
+ (opennlp.maxent.DataIndexer)
+ changed indexing Maps to TObjectIntHashMaps to save space in several
+ places. (Eric)
+ 
  1.2.6
  -----
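One small idiom from this round of changes is inverting a label-to-index map into a plain `String[]`, as DataIndexer's `toIndexedStringArray()` does with Trove's `forEachEntry()`. Here is a hedged standalone sketch of the same idea using only `java.util` (`Map.forEach` plays the role of `forEachEntry`; the class name is illustrative).

```java
import java.util.Map;

public class LabelArray {

    // Inverts a label -> index map into an array indexed by position.
    // Assumes the indexes form exactly 0 .. size-1, as they do when
    // they were assigned from an incrementing counter.
    static String[] toIndexedStringArray(Map<String, Integer> labelToIndexMap) {
        final String[] array = new String[labelToIndexMap.size()];
        labelToIndexMap.forEach((label, index) -> array[index] = label);
        return array;
    }

    public static void main(String[] args) {
        String[] labels = toIndexedStringArray(Map.of("NOUN", 0, "VERB", 1));
        System.out.println(String.join(",", labels)); // NOUN,VERB
    }
}
```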
From: Jason B. <jas...@us...> - 2002-01-02 11:31:33
Update of /cvsroot/maxent/maxent
In directory usw-pr-cvs1:/tmp/cvs-serv7724

Modified Files:
	.cvsignore CHANGES build.xml
Log Message:
The output directory of the build structure is now 'output' instead of 'build'.

Index: .cvsignore
===================================================================
RCS file: /cvsroot/maxent/maxent/.cvsignore,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** .cvsignore	2001/10/28 01:25:56	1.1
--- .cvsignore	2002/01/02 11:31:29	1.2
***************
*** 1 ****
! build
--- 1 ----
! output

Index: CHANGES
===================================================================
RCS file: /cvsroot/maxent/maxent/CHANGES,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** CHANGES	2001/12/27 19:20:26	1.5
--- CHANGES	2002/01/02 11:31:30	1.6
***************
*** 14,17 ****
--- 14,19 ----
  There is still more to be done in this department, however. (Eric)
+ The output directory is now "output" instead of "build". (Jason)
+ 
  1.2.4
  -----

Index: build.xml
===================================================================
RCS file: /cvsroot/maxent/maxent/build.xml,v
retrieving revision 1.13
retrieving revision 1.14
diff -C2 -d -r1.13 -r1.14
*** build.xml	2002/01/02 11:20:00	1.13
--- build.xml	2002/01/02 11:31:30	1.14
***************
*** 24,30 ****
  <property name="packages" value="opennlp.maxent.*"/>
! <property name="build.dir" value="./build"/>
! <property name="build.src" value="./build/src"/>
! <property name="build.dest" value="./build/classes"/>
  <property name="build.javadocs" value="./docs/api"/>
--- 24,29 ----
  <property name="packages" value="opennlp.maxent.*"/>
! <property name="build.dir" value="./output"/>
! <property name="build.dest" value="${build.dir}/classes"/>
  <property name="build.javadocs" value="./docs/api"/>
***************
*** 81,92 ****
  <target name="prepare-src" depends="prepare">
      <!-- create directories -->
-     <mkdir dir="${build.src}"/>
      <mkdir dir="${build.dest}"/>
- 
-     <!-- copy src files -->
-     <copy todir="${build.src}" >
-         <fileset dir="${src.dir}"/>
-     </copy>
- 
  </target>
--- 80,84 ----
***************
*** 96,100 ****
  <!-- =================================================================== -->
  <target name="compile" depends="prepare-src">
!     <javac srcdir="${build.src}"
          destdir="${build.dest}"
          debug="${debug}"
--- 88,92 ----
  <!-- =================================================================== -->
  <target name="compile" depends="prepare-src">
!     <javac srcdir="${src.dir}"
          destdir="${build.dest}"
          debug="${debug}"
***************
*** 170,174 ****
  <mkdir dir="${build.javadocs}"/>
  <javadoc packagenames="${packages}"
!     sourcepath="${build.src}"
      destdir="${build.javadocs}"
      author="true"
--- 162,166 ----
  <mkdir dir="${build.javadocs}"/>
  <javadoc packagenames="${packages}"
!     sourcepath="${src.dir}"
      destdir="${build.javadocs}"
      author="true"
From: Jason B. <jas...@us...> - 2002-01-02 11:20:25
Update of /cvsroot/maxent/maxent/lib
In directory usw-pr-cvs1:/tmp/cvs-serv5964/lib

Modified Files:
	LIBNOTES trove.jar
Log Message:
Added new version of Trove.

Index: LIBNOTES
===================================================================
RCS file: /cvsroot/maxent/maxent/lib/LIBNOTES,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** LIBNOTES	2001/11/27 17:11:40	1.5
--- LIBNOTES	2002/01/02 11:20:22	1.6
***************
*** 17,32 ****
  ------------------------------------------------------------------------
- colt.jar
- 
- The CERN Colt Distribution, version 1.0.1 (all patches before 2001 Sep
- 7 have been applied)
- Homepage: http://tilde-hoschek.home.cern.ch/~hoschek/colt/
- License: see colt.license
- 
- This distribution provides an infrastructure for scalable scientific
- and technical computing in Java.
- 
- 
- ------------------------------------------------------------------------
  java-getopt.jar
--- 17,20 ----
***************
*** 41,45 ****
  trove.jar
! GNU Trove, version 0.0.8
  Homepage: http://trove4j.sf.net
  License: LGPL
--- 29,33 ----
  trove.jar
! GNU Trove, version 0.1.0
  Homepage: http://trove4j.sf.net
  License: LGPL

Index: trove.jar
===================================================================
RCS file: /cvsroot/maxent/maxent/lib/trove.jar,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -d -r1.6 -r1.7
Binary files /tmp/cvspeUrVS and /tmp/cvsoYthhD differ
From: Jason B. <jas...@us...> - 2002-01-02 11:20:04
Update of /cvsroot/maxent/maxent
In directory usw-pr-cvs1:/tmp/cvs-serv5854

Modified Files:
	build.xml
Log Message:
Updated version number

Index: build.xml
===================================================================
RCS file: /cvsroot/maxent/maxent/build.xml,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -d -r1.12 -r1.13
*** build.xml	2001/12/27 19:20:26	1.12
--- build.xml	2002/01/02 11:20:00	1.13
***************
*** 10,14 ****
  <property name="Name" value="Maxent"/>
  <property name="name" value="maxent"/>
! <property name="version" value="1.2.5"/>
  <property name="year" value="2001"/>
--- 10,14 ----
  <property name="Name" value="Maxent"/>
  <property name="name" value="maxent"/>
! <property name="version" value="1.2.6"/>
  <property name="year" value="2001"/>
From: Eric F. <er...@us...> - 2001-12-27 19:20:29
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent/io
In directory usw-pr-cvs1:/tmp/cvs-serv11903/src/java/opennlp/maxent/io

Modified Files:
	GISModelReader.java GISModelWriter.java OldFormatGISModelReader.java
Log Message:
This is the merge of the no_colt branch -> head. The following notes are
copied from the head of the CHANGES file.

Removed Colt dependency in favor of GNU Trove. (Eric)

Refactored index() method in DataIndexer so that only one pass over the
list of events is needed. This saves time (of course) and also space,
since it's no longer necessary to allocate temporary data structures to
share data between two loops. (Eric)

Refactored sorting/merging algorithm for ComparableEvents so that merging
can be done in place. This makes it possible to merge without copying
duplicate events into sublists and so improves the indexer's ability to
work on large data sets with a reasonable amount of memory. There is
still more to be done in this department, however. (Eric)

Index: GISModelReader.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/GISModelReader.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** GISModelReader.java	2001/11/15 15:42:14	1.3
--- GISModelReader.java	2001/12/27 19:20:26	1.4
***************
*** 18,23 ****
  package opennlp.maxent.io;
  import opennlp.maxent.*;
- import cern.colt.map.*;
  import java.util.StringTokenizer;
--- 18,23 ----
  package opennlp.maxent.io;
+ import gnu.trove.*;
  import opennlp.maxent.*;
  import java.util.StringTokenizer;
***************
*** 79,83 ****
  int[][] outcomePatterns = getOutcomePatterns();
  String[] predLabels = getPredicates();
! OpenIntDoubleHashMap[] params = getParameters(outcomePatterns);
  return new GISModel(params,
--- 79,83 ----
  int[][] outcomePatterns = getOutcomePatterns();
  String[] predLabels = getPredicates();
! TIntDoubleHashMap[] params = getParameters(outcomePatterns);
  return new GISModel(params,
***************
*** 134,151 ****
  }

! protected OpenIntDoubleHashMap[] getParameters (int[][] outcomePatterns)
      throws java.io.IOException {
! OpenIntDoubleHashMap[] params = new OpenIntDoubleHashMap[NUM_PREDS];
  int pid=0;
  for (int i=0; i<outcomePatterns.length; i++) {
      for (int j=0; j<outcomePatterns[i][0]; j++) {
!         params[pid] = new OpenIntDoubleHashMap();
          for (int k=1; k<outcomePatterns[i].length; k++) {
              double d = readDouble();
              params[pid].put(outcomePatterns[i][k], d);
          }
!         params[pid].trimToSize();
          pid++;
      }
--- 134,151 ----
  }

! protected TIntDoubleHashMap[] getParameters (int[][] outcomePatterns)
      throws java.io.IOException {
! TIntDoubleHashMap[] params = new TIntDoubleHashMap[NUM_PREDS];
  int pid=0;
  for (int i=0; i<outcomePatterns.length; i++) {
      for (int j=0; j<outcomePatterns[i][0]; j++) {
!         params[pid] = new TIntDoubleHashMap();
          for (int k=1; k<outcomePatterns[i].length; k++) {
              double d = readDouble();
              params[pid].put(outcomePatterns[i][k], d);
          }
!         params[pid].compact();
          pid++;
      }

Index: GISModelWriter.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/GISModelWriter.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** GISModelWriter.java	2001/11/15 16:18:40	1.3
--- GISModelWriter.java	2001/12/27 19:20:26	1.4
***************
*** 20,25 ****
  import opennlp.maxent.*;
  import gnu.trove.*;
- import cern.colt.list.*;
- import cern.colt.map.*;
  import java.io.*;
  import java.util.*;
--- 20,23 ----
***************
*** 34,38 ****
   */
  public abstract class GISModelWriter {
!     protected OpenIntDoubleHashMap[] PARAMS;
      protected String[] OUTCOME_LABELS;
      protected int CORRECTION_CONSTANT;
--- 32,36 ----
   */
  public abstract class GISModelWriter {
!     protected TIntDoubleHashMap[] PARAMS;
      protected String[] OUTCOME_LABELS;
      protected int CORRECTION_CONSTANT;
***************
*** 44,48 ****
      Object[] data = model.getDataStructures();
!     PARAMS = (OpenIntDoubleHashMap[])data[0];
      TObjectIntHashMap pmap = (TObjectIntHashMap)data[1];
      OUTCOME_LABELS = (String[])data[2];
--- 42,46 ----
      Object[] data = model.getDataStructures();
!     PARAMS = (TIntDoubleHashMap[])data[0];
      TObjectIntHashMap pmap = (TObjectIntHashMap)data[1];
      OUTCOME_LABELS = (String[])data[2];
***************
*** 121,153 ****
  protected ComparablePredicate[] sortValues () {
!     ComparablePredicate[] sortPreds =
!         new ComparablePredicate[PARAMS.length];
!     int numParams = 0;
!     for (int pid=0; pid<PARAMS.length; pid++) {
!         IntArrayList predkeys = PARAMS[pid].keys();
!         predkeys.sort();
!         int numActive = predkeys.size();
!         numParams += numActive;
!         int[] activeOCs = new int[numActive];
!         double[] activeParams = new double[numActive];
!         int id = 0;
!         for (int i=0; i<predkeys.size(); i++) {
!             int oid = predkeys.get(i);
!             activeOCs[id] = oid;
!             activeParams[id] = PARAMS[pid].get(oid);
!             id++;
!         }
!         sortPreds[pid] = new ComparablePredicate(PRED_LABELS[pid],
!                                                  activeOCs,
!                                                  activeParams);
!     }
!     Arrays.sort(sortPreds);
!     return sortPreds;
  }
--- 119,151 ----
  protected ComparablePredicate[] sortValues () {
!     ComparablePredicate[] sortPreds =
!         new ComparablePredicate[PARAMS.length];
!     int numParams = 0;
!     for (int pid=0; pid<PARAMS.length; pid++) {
!         int[] predkeys = PARAMS[pid].keys();
!         Arrays.sort(predkeys);
!         int numActive = predkeys.length;
!         numParams += numActive;
!         int[] activeOCs = new int[numActive];
!         double[] activeParams = new double[numActive];
!         int id = 0;
!         for (int i=0; i < predkeys.length; i++) {
!             int oid = predkeys[i];
!             activeOCs[id] = oid;
!             activeParams[id] = PARAMS[pid].get(oid);
!             id++;
!         }
!         sortPreds[pid] = new ComparablePredicate(PRED_LABELS[pid],
!                                                  activeOCs,
!                                                  activeParams);
!     }
!     Arrays.sort(sortPreds);
!     return sortPreds;
  }

Index: OldFormatGISModelReader.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/OldFormatGISModelReader.java,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -C2 -d -r1.1.1.1 -r1.2
*** OldFormatGISModelReader.java	2001/10/23 14:06:53	1.1.1.1
--- OldFormatGISModelReader.java	2001/12/27 19:20:26	1.2
***************
*** 18,22 ****
  package opennlp.maxent.io;
! import cern.colt.map.*;
  import java.io.*;
  import java.util.zip.*;
--- 18,22 ----
  package opennlp.maxent.io;
! import gnu.trove.*;
  import java.io.*;
  import java.util.zip.*;
***************
*** 45,66 ****
  }

! protected OpenIntDoubleHashMap[] getParameters (int[][] outcomePatterns)
!     throws java.io.IOException {
! 
!     OpenIntDoubleHashMap[] params = new OpenIntDoubleHashMap[NUM_PREDS];
!     int pid=0;
!     for (int i=0; i<outcomePatterns.length; i++) {
!         for (int j=0; j<outcomePatterns[i][0]; j++) {
!             params[pid] = new OpenIntDoubleHashMap();
!             for (int k=1; k<outcomePatterns[i].length; k++) {
!                 double d = paramsInput.readDouble();
!                 params[pid].put(outcomePatterns[i][k], d);
!             }
!             params[pid].trimToSize();
!             pid++;
!         }
!     }
!     return params;
  }
--- 45,66 ----
  }

! protected TIntDoubleHashMap[] getParameters (int[][] outcomePatterns)
!     throws java.io.IOException {
! 
!     TIntDoubleHashMap[] params = new TIntDoubleHashMap[NUM_PREDS];
!     int pid=0;
!     for (int i=0; i<outcomePatterns.length; i++) {
!         for (int j=0; j<outcomePatterns[i][0]; j++) {
!             params[pid] = new TIntDoubleHashMap();
!             for (int k=1; k<outcomePatterns[i].length; k++) {
!                 double d = paramsInput.readDouble();
!                 params[pid].put(outcomePatterns[i][k], d);
!             }
!             params[pid].compact();
!             pid++;
!         }
!     }
!     return params;
  }
From: Eric F. <er...@us...> - 2001-12-27 19:20:29
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent
In directory usw-pr-cvs1:/tmp/cvs-serv11903/src/java/opennlp/maxent

Modified Files:
	ComparableEvent.java DataIndexer.java GISModel.java GISTrainer.java
Log Message:
This is the merge of the no_colt branch -> head. The following notes are
copied from the head of the CHANGES file.

Removed Colt dependency in favor of GNU Trove. (Eric)

Refactored index() method in DataIndexer so that only one pass over the
list of events is needed. This saves time (of course) and also space,
since it's no longer necessary to allocate temporary data structures to
share data between two loops. (Eric)

Refactored sorting/merging algorithm for ComparableEvents so that merging
can be done in place. This makes it possible to merge without copying
duplicate events into sublists and so improves the indexer's ability to
work on large data sets with a reasonable amount of memory. There is
still more to be done in this department, however. (Eric)

Index: ComparableEvent.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/ComparableEvent.java,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -C2 -d -r1.1.1.1 -r1.2
*** ComparableEvent.java	2001/10/23 14:06:53	1.1.1.1
--- ComparableEvent.java	2001/12/27 19:20:26	1.2
***************
*** 30,67 ****
  public int outcome;
  public int[] predIndexes;

  public ComparableEvent(int oc, int[] pids) {
!     outcome = oc;
!     Arrays.sort(pids);
!     predIndexes = pids;
  }

  public int compareTo(Object o) {
!     ComparableEvent ce = (ComparableEvent)o;
!     if (outcome < ce.outcome) return -1;
!     else if (outcome > ce.outcome) return 1;
! 
!     int smallerLength = (predIndexes.length > ce.predIndexes.length?
!                          ce.predIndexes.length : predIndexes.length);
! 
!     for (int i=0; i<smallerLength; i++) {
!         if (predIndexes[i] < ce.predIndexes[i]) return -1;
!         else if (predIndexes[i] > ce.predIndexes[i]) return 1;
!     }
! 
!     if (predIndexes.length < ce.predIndexes.length) return -1;
!     else if (predIndexes.length > ce.predIndexes.length) return 1;
! 
!     return 0;
  }

  public String toString() {
!     String s = "";
!     for (int i=0; i<predIndexes.length; i++) s+= " "+predIndexes[i];
!     return s;
  }
- 
  }
--- 30,68 ----
  public int outcome;
  public int[] predIndexes;
+ public int seen = 1;   // the number of times this event
+                        // has been seen.

  public ComparableEvent(int oc, int[] pids) {
!     outcome = oc;
!     Arrays.sort(pids);
!     predIndexes = pids;
  }

  public int compareTo(Object o) {
!     ComparableEvent ce = (ComparableEvent)o;
!     if (outcome < ce.outcome) return -1;
!     else if (outcome > ce.outcome) return 1;
! 
!     int smallerLength = (predIndexes.length > ce.predIndexes.length?
!                          ce.predIndexes.length : predIndexes.length);
! 
!     for (int i=0; i<smallerLength; i++) {
!         if (predIndexes[i] < ce.predIndexes[i]) return -1;
!         else if (predIndexes[i] > ce.predIndexes[i]) return 1;
!     }
! 
!     if (predIndexes.length < ce.predIndexes.length) return -1;
!     else if (predIndexes.length > ce.predIndexes.length) return 1;
! 
!     return 0;
  }

  public String toString() {
!     String s = "";
!     for (int i=0; i<predIndexes.length; i++) s+= " "+predIndexes[i];
!     return s;
  }
  }

Index: DataIndexer.java
===================================================================
RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/DataIndexer.java,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** DataIndexer.java	2001/11/15 18:08:20	1.4
--- DataIndexer.java	2001/12/27 19:20:26	1.5
***************
*** 83,106 ****
  System.out.print("Sorting and merging events... ");
  Arrays.sort(eventsToCompare);

  ComparableEvent ce = eventsToCompare[0];
! List uniqueEvents = new ArrayList();
! List newGroup = new ArrayList();
! int numEvents = eventsToCompare.length;
! for (int i=0; i<numEvents; i++) {
      if (ce.compareTo(eventsToCompare[i]) == 0) {
!         newGroup.add(eventsToCompare[i]);
!     } else {
!         ce = eventsToCompare[i];
!         uniqueEvents.add(newGroup);
!         newGroup = new ArrayList();
!         newGroup.add(eventsToCompare[i]);
      }
  }
- uniqueEvents.add(newGroup);
- int numUniqueEvents = uniqueEvents.size();
- 
  System.out.println("done. Reduced " + eventsToCompare.length
                     + " events to " + numUniqueEvents + ".");
--- 83,118 ----
  System.out.print("Sorting and merging events... ");
+ sortAndMerge(eventsToCompare);
+ System.out.println("Done indexing.");
+ }
+ 
+ /**
+  * Sorts and uniques the array of comparable events. This method
+  * will alter the eventsToCompare array -- it does an in place
+  * sort, followed by an in place edit to remove duplicates.
+  *
+  * @param eventsToCompare a <code>ComparableEvent[]</code> value
+  * @since maxent 1.2.6
+  */
+ private void sortAndMerge(ComparableEvent[] eventsToCompare) {
  Arrays.sort(eventsToCompare);
+ int numEvents = eventsToCompare.length;
+ int numUniqueEvents = 1; // assertion: eventsToCompare.length >= 1
+ 
+ if (eventsToCompare.length <= 1) {
+     return; // nothing to do; edge case (see assertion)
+ }

  ComparableEvent ce = eventsToCompare[0];
! for (int i=1; i<numEvents; i++) {
      if (ce.compareTo(eventsToCompare[i]) == 0) {
!         ce.seen++;                 // increment the seen count
!         eventsToCompare[i] = null; // kill the duplicate
!     } else {
!         ce = eventsToCompare[i];   // a new champion emerges...
!         numUniqueEvents++;         // increment the # of unique events
      }
  }
  System.out.println("done. Reduced " + eventsToCompare.length
                     + " events to " + numUniqueEvents + ".");
***************
*** 110,122 ****
  numTimesEventsSeen = new int[numUniqueEvents];

! for (int i=0; i<numUniqueEvents; i++) {
!     List group = (List)uniqueEvents.get(i);
!     numTimesEventsSeen[i] = group.size();
!     ComparableEvent nextCE = (ComparableEvent)group.get(0);
!     outcomeList[i] = nextCE.outcome;
!     contexts[i] = nextCE.predIndexes;
  }
- 
- System.out.println("Done indexing.");
  }
--- 122,135 ----
  numTimesEventsSeen = new int[numUniqueEvents];

! for (int i = 0, j = 0; i<numEvents; i++) {
!     ComparableEvent evt = eventsToCompare[i];
!     if (null == evt) {
continue; // this was a dupe, skip over it. ! } ! numTimesEventsSeen[j] = evt.seen; ! outcomeList[j] = evt.outcome; ! contexts[j] = evt.predIndexes; ! ++j; } } *************** *** 161,167 **** int outcomeCount = 0; int predCount = 0; ! int[] uncompressedOutcomeList = new int[numEvents]; ! List uncompressedContexts = new ArrayList(); ! for (int eventIndex=0; eventIndex<numEvents; eventIndex++) { Event ev = (Event)events.removeFirst(); --- 174,179 ---- int outcomeCount = 0; int predCount = 0; ! ComparableEvent[] eventsToCompare = new ComparableEvent[numEvents]; ! for (int eventIndex=0; eventIndex<numEvents; eventIndex++) { Event ev = (Event)events.removeFirst(); *************** *** 191,225 **** } } ! uncompressedContexts.add(indexedContext); ! uncompressedOutcomeList[eventIndex] = ocID.intValue(); ! } ! outcomeLabels = new String[omap.size()]; ! for (Iterator i=omap.keySet().iterator(); i.hasNext();) { ! String oc = (String)i.next(); ! outcomeLabels[((Integer)omap.get(oc)).intValue()] = oc; ! } ! omap = null; ! ! predLabels = new String[pmap.size()]; ! for (Iterator i = pmap.keySet().iterator(); i.hasNext();) { ! String n = (String)i.next(); ! predLabels[((Integer)pmap.get(n)).intValue()] = n; } ! pmap = null; ! ! ComparableEvent[] eventsToCompare = new ComparableEvent[numEvents]; ! for (int i=0; i<numEvents; i++) { ! List ecLL = (List)uncompressedContexts.get(i); ! int[] ecInts = new int[ecLL.size()]; ! for (int j=0; j<ecInts.length; j++) { ! ecInts[j] = ((Integer)ecLL.get(j)).intValue(); ! } ! eventsToCompare[i] = ! new ComparableEvent(uncompressedOutcomeList[i], ecInts); } ! return eventsToCompare; } - } --- 203,250 ---- } } ! eventsToCompare[eventIndex] = ! new ComparableEvent(ocID.intValue(), ! toIntArray(indexedContext)); } ! outcomeLabels = toIndexedStringArray(omap); ! predLabels = toIndexedStringArray(pmap); ! return eventsToCompare; ! } ! /** ! * Utility method for creating a String[] array from a map whose ! 
* keys are labels (Strings) to be stored in the array and whose ! * values are the indices (Integers) at which the corresponding ! * labels should be inserted. ! * ! * @param labelToIndexMap a <code>Map</code> value ! * @return a <code>String[]</code> value ! * @since maxent 1.2.6 ! */ ! static String[] toIndexedStringArray(Map labelToIndexMap) { ! String[] array = new String[labelToIndexMap.size()]; ! for (Iterator i = labelToIndexMap.keySet().iterator(); i.hasNext();) { ! String label = (String)i.next(); ! int index = ((Integer)labelToIndexMap.get(label)).intValue(); ! array[index] = label; } + return array; + } ! /** ! * Utility method for turning a list of Integer objects into a ! * native array of primitive ints. ! * ! * @param integers a <code>List</code> value ! * @return an <code>int[]</code> value ! * @since maxent 1.2.6 ! */ ! static final int[] toIntArray(List integers) { ! int[] rv = new int[integers.size()]; ! int i = 0; ! for (Iterator it = integers.iterator(); it.hasNext();) { ! rv[i++] = ((Integer)it.next()).intValue(); ! } ! return rv; } } Index: GISModel.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISModel.java,v retrieving revision 1.5 retrieving revision 1.6 diff -C2 -d -r1.5 -r1.6 *** GISModel.java 2001/11/30 14:33:28 1.5 --- GISModel.java 2001/12/27 19:20:26 1.6 *************** *** 19,24 **** import gnu.trove.*; - import cern.colt.list.*; - import cern.colt.map.*; import java.util.*; --- 19,22 ---- *************** *** 31,35 **** */ public final class GISModel implements MaxentModel { ! private final OpenIntDoubleHashMap[] params; private final TObjectIntHashMap pmap; private final String[] ocNames; --- 29,33 ---- */ public final class GISModel implements MaxentModel { ! private final TIntDoubleHashMap[] params; private final TObjectIntHashMap pmap; private final String[] ocNames; *************** *** 41,62 **** private final double fval; ! 
public GISModel (OpenIntDoubleHashMap[] _params, ! String[] predLabels, ! String[] _ocNames, ! int _correctionConstant, ! double _correctionParam) { ! pmap = new TObjectIntHashMap(predLabels.length); ! for (int i=0; i<predLabels.length; i++) ! pmap.put(predLabels[i], i); ! params = _params; ! ocNames = _ocNames; ! correctionConstant = _correctionConstant; ! correctionParam = _correctionParam; ! numOutcomes = ocNames.length; ! iprob = Math.log(1.0/numOutcomes); ! fval = 1.0/correctionConstant; } --- 39,60 ---- private final double fval; ! public GISModel (TIntDoubleHashMap[] _params, ! String[] predLabels, ! String[] _ocNames, ! int _correctionConstant, ! double _correctionParam) { ! pmap = new TObjectIntHashMap(predLabels.length); ! for (int i=0; i<predLabels.length; i++) ! pmap.put(predLabels[i], i); ! params = _params; ! ocNames = _ocNames; ! correctionConstant = _correctionConstant; ! correctionParam = _correctionParam; ! numOutcomes = ocNames.length; ! iprob = Math.log(1.0/numOutcomes); ! fval = 1.0/correctionConstant; } *************** *** 77,115 **** */ public final double[] eval(String[] context) { ! double[] outsums = new double[numOutcomes]; ! int[] numfeats = new int[numOutcomes]; ! for (int oid=0; oid<numOutcomes; oid++) { ! outsums[oid] = iprob; ! numfeats[oid] = 0; ! } ! IntArrayList activeOutcomes = new IntArrayList(0); ! for (int i=0; i<context.length; i++) { ! if (pmap.containsKey(context[i])) { ! OpenIntDoubleHashMap predParams = ! params[pmap.get(context[i])]; ! predParams.keys(activeOutcomes); ! for (int j=0; j<activeOutcomes.size(); j++) { ! int oid = activeOutcomes.getQuick(j); ! numfeats[oid]++; ! outsums[oid] += fval * predParams.get(oid); ! } ! } ! } ! double normal = 0.0; ! for (int oid=0; oid<numOutcomes; oid++) { ! outsums[oid] = Math.exp(outsums[oid] ! + ((1.0 - ! (numfeats[oid]/correctionConstant)) ! * correctionParam)); ! normal += outsums[oid]; ! } ! for (int oid=0; oid<numOutcomes; oid++) ! outsums[oid] /= normal; ! 
return outsums; } --- 75,113 ---- */ public final double[] eval(String[] context) { ! double[] outsums = new double[numOutcomes]; ! int[] numfeats = new int[numOutcomes]; ! for (int oid=0; oid<numOutcomes; oid++) { ! outsums[oid] = iprob; ! numfeats[oid] = 0; ! } ! int[] activeOutcomes; ! for (int i=0; i<context.length; i++) { ! if (pmap.containsKey(context[i])) { ! TIntDoubleHashMap predParams = ! params[pmap.get(context[i])]; ! activeOutcomes = predParams.keys(); ! for (int j=0; j<activeOutcomes.length; j++) { ! int oid = activeOutcomes[j]; ! numfeats[oid]++; ! outsums[oid] += fval * predParams.get(oid); ! } ! } ! } ! double normal = 0.0; ! for (int oid=0; oid<numOutcomes; oid++) { ! outsums[oid] = Math.exp(outsums[oid] ! + ((1.0 - ! (numfeats[oid]/correctionConstant)) ! * correctionParam)); ! normal += outsums[oid]; ! } ! for (int oid=0; oid<numOutcomes; oid++) ! outsums[oid] /= normal; ! return outsums; } *************** *** 124,131 **** */ public final String getBestOutcome(double[] ocs) { ! int best = 0; ! for (int i = 1; i<ocs.length; i++) ! if (ocs[i] > ocs[best]) best = i; ! return ocNames[best]; } --- 122,129 ---- */ public final String getBestOutcome(double[] ocs) { ! int best = 0; ! for (int i = 1; i<ocs.length; i++) ! if (ocs[i] > ocs[best]) best = i; ! return ocNames[best]; } *************** *** 144,164 **** */ public final String getAllOutcomes (double[] ocs) { ! if (ocs.length != ocNames.length) { ! return "The double array sent as a parameter to GISModel.getAllOutcomes() must not have been produced by this model."; ! } ! else { ! StringBuffer sb = new StringBuffer(ocs.length*2); ! String d = Double.toString(ocs[0]); ! if (d.length() > 6) ! d = d.substring(0,7); ! sb.append(ocNames[0]).append("[").append(d).append("]"); ! for (int i = 1; i<ocs.length; i++) { ! d = Double.toString(ocs[i]); ! if (d.length() > 6) ! d = d.substring(0,7); ! sb.append(" ").append(ocNames[i]).append("[").append(d).append("]"); ! } ! return sb.toString(); ! 
} } --- 142,162 ---- */ public final String getAllOutcomes (double[] ocs) { ! if (ocs.length != ocNames.length) { ! return "The double array sent as a parameter to GISModel.getAllOutcomes() must not have been produced by this model."; ! } ! else { ! StringBuffer sb = new StringBuffer(ocs.length*2); ! String d = Double.toString(ocs[0]); ! if (d.length() > 6) ! d = d.substring(0,7); ! sb.append(ocNames[0]).append("[").append(d).append("]"); ! for (int i = 1; i<ocs.length; i++) { ! d = Double.toString(ocs[i]); ! if (d.length() > 6) ! d = d.substring(0,7); ! sb.append(" ").append(ocNames[i]).append("[").append(d).append("]"); ! } ! return sb.toString(); ! } } *************** *** 171,175 **** */ public final String getOutcome(int i) { ! return ocNames[i]; } --- 169,173 ---- */ public final String getOutcome(int i) { ! return ocNames[i]; } *************** *** 183,191 **** **/ public int getIndex (String outcome) { ! for (int i=0; i<ocNames.length; i++) { ! if (ocNames[i].equals(outcome)) ! return i; ! } ! return -1; } --- 181,189 ---- **/ public int getIndex (String outcome) { ! for (int i=0; i<ocNames.length; i++) { ! if (ocNames[i].equals(outcome)) ! return i; ! } ! return -1; } *************** *** 197,201 **** * which is returned by this method: * ! * <li>index 0: cern.colt.map.OpenIntDoubleHashMap[] containing the model * parameters * <li>index 1: java.util.Map containing the mapping of model predicates --- 195,199 ---- * which is returned by this method: * ! * <li>index 0: gnu.trove.TIntDoubleHashMap[] containing the model * parameters * <li>index 1: java.util.Map containing the mapping of model predicates *************** *** 212,225 **** */ public final Object[] getDataStructures () { ! Object[] data = new Object[5]; ! data[0] = params; ! data[1] = pmap; ! data[2] = ocNames; ! data[3] = new Integer(correctionConstant); ! data[4] = new Double(correctionParam); ! return data; } - - - } --- 210,220 ---- */ public final Object[] getDataStructures () { ! 
Object[] data = new Object[5]; ! data[0] = params; ! data[1] = pmap; ! data[2] = ocNames; ! data[3] = new Integer(correctionConstant); ! data[4] = new Double(correctionParam); ! return data; } } Index: GISTrainer.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISTrainer.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** GISTrainer.java 2001/11/16 10:37:43 1.2 --- GISTrainer.java 2001/12/27 19:20:26 1.3 *************** *** 18,24 **** package opennlp.maxent; ! import cern.colt.function.*; ! import cern.colt.list.*; ! import cern.colt.map.*; import java.io.*; --- 18,22 ---- package opennlp.maxent; ! import gnu.trove.*; import java.io.*; *************** *** 82,95 **** // stores the observed expections of each of the events ! private OpenIntDoubleHashMap[] observedExpects; // stores the estimated parameter value of each predicate during iteration ! private OpenIntDoubleHashMap[] params; // stores the modifiers of the parameter values, paired to params ! private OpenIntDoubleHashMap[] modifiers; // a helper object for storing predicate indexes ! private IntArrayList predkeys; // a boolean to track if all events have same number of active features --- 80,93 ---- // stores the observed expections of each of the events ! private TIntDoubleHashMap[] observedExpects; // stores the estimated parameter value of each predicate during iteration ! private TIntDoubleHashMap[] params; // stores the modifiers of the parameter values, paired to params ! private TIntDoubleHashMap[] modifiers; // a helper object for storing predicate indexes ! private int[] predkeys; // a boolean to track if all events have same number of active features *************** *** 109,137 **** // stores the value of corrections feature for each event's predicate list, // expanded to include all outcomes which might come from those predicates. ! 
private OpenIntIntHashMap[] cfvals; // Normalized Probabilities Of Outcomes Given Context: p(a|b_i) // Stores the computation of each iterations for the update to the // modifiers (and therefore the params) ! private OpenIntDoubleHashMap[] pabi; ! // make all values in an OpenIntDoubleHashMap return to 0.0 ! private DoubleFunction backToZeros = ! new DoubleFunction() { ! public double apply(double arg) { return 0.0; } }; ! // divide all values in the OpenIntDoubleHashMap pabi[TID] by the sum of // all values in the map. ! private DoubleFunction normalizePABI = ! new DoubleFunction() { ! public double apply(double arg) { return arg / PABISUM; } }; // add the previous iteration's parameters to the computation of the // modifiers of this iteration. ! private IntDoubleProcedure addParamsToPABI = ! new IntDoubleProcedure() { ! public boolean apply(int oid, double arg) { pabi[TID].put(oid, pabi[TID].get(oid) + arg); return true; --- 107,135 ---- // stores the value of corrections feature for each event's predicate list, // expanded to include all outcomes which might come from those predicates. ! private TIntIntHashMap[] cfvals; // Normalized Probabilities Of Outcomes Given Context: p(a|b_i) // Stores the computation of each iterations for the update to the // modifiers (and therefore the params) ! private TIntDoubleHashMap[] pabi; ! // make all values in an TIntDoubleHashMap return to 0.0 ! private TDoubleFunction backToZeros = ! new TDoubleFunction() { ! public double execute(double arg) { return 0.0; } }; ! // divide all values in the TIntDoubleHashMap pabi[TID] by the sum of // all values in the map. ! private TDoubleFunction normalizePABI = ! new TDoubleFunction() { ! public double execute(double arg) { return arg / PABISUM; } }; // add the previous iteration's parameters to the computation of the // modifiers of this iteration. ! private TIntDoubleProcedure addParamsToPABI = ! new TIntDoubleProcedure() { ! 
public boolean execute(int oid, double arg) { pabi[TID].put(oid, pabi[TID].get(oid) + arg); return true; *************** *** 140,146 **** // add the correction parameter and exponentiate it ! private IntDoubleProcedure addCorrectionToPABIandExponentiate = ! new IntDoubleProcedure() { ! public boolean apply(int oid, double arg) { if (needCorrection) arg = arg + (correctionParam * cfvals[TID].get(oid)); --- 138,144 ---- // add the correction parameter and exponentiate it ! private TIntDoubleProcedure addCorrectionToPABIandExponentiate = ! new TIntDoubleProcedure() { ! public boolean execute(int oid, double arg) { if (needCorrection) arg = arg + (correctionParam * cfvals[TID].get(oid)); *************** *** 153,159 **** // update the modifiers based on the new pabi values ! private IntDoubleProcedure updateModifiers = ! new IntDoubleProcedure() { ! public boolean apply(int oid, double arg) { modifiers[PID].put(oid, arg --- 151,157 ---- // update the modifiers based on the new pabi values ! private TIntDoubleProcedure updateModifiers = ! new TIntDoubleProcedure() { ! public boolean execute(int oid, double arg) { modifiers[PID].put(oid, arg *************** *** 165,171 **** // update the params based on the newly computed modifiers ! private IntDoubleProcedure updateParams = ! new IntDoubleProcedure() { ! public boolean apply(int oid, double arg) { params[PID].put(oid, arg --- 163,169 ---- // update the params based on the newly computed modifiers ! private TIntDoubleProcedure updateParams = ! new TIntDoubleProcedure() { ! public boolean execute(int oid, double arg) { params[PID].put(oid, arg *************** *** 179,185 **** // update the correction feature modifier, which will then be used to // updated the correction parameter ! private IntDoubleProcedure updateCorrectionFeatureModifier = ! new IntDoubleProcedure() { ! 
public boolean apply(int oid, double arg) { CFMOD += arg * cfvals[TID].get(oid) * numTimesEventsSeen[TID]; return true; --- 177,183 ---- // update the correction feature modifier, which will then be used to // updated the correction parameter ! private TIntDoubleProcedure updateCorrectionFeatureModifier = ! new TIntDoubleProcedure() { ! public boolean execute(int oid, double arg) { CFMOD += arg * cfvals[TID].get(oid) * numTimesEventsSeen[TID]; return true; *************** *** 304,315 **** // implementation, this is cancelled out when we compute the next // iteration of a parameter, making the extra divisions wasteful. ! params = new OpenIntDoubleHashMap[numPreds]; ! modifiers = new OpenIntDoubleHashMap[numPreds]; ! observedExpects = new OpenIntDoubleHashMap[numPreds]; for (PID=0; PID<numPreds; PID++) { ! params[PID] = new OpenIntDoubleHashMap(); ! modifiers[PID] = new OpenIntDoubleHashMap(); ! observedExpects[PID] = new OpenIntDoubleHashMap(); for (OID=0; OID<numOutcomes; OID++) { if (predCount[PID][OID] > 0) { --- 302,313 ---- // implementation, this is cancelled out when we compute the next // iteration of a parameter, making the extra divisions wasteful. ! params = new TIntDoubleHashMap[numPreds]; ! modifiers = new TIntDoubleHashMap[numPreds]; ! observedExpects = new TIntDoubleHashMap[numPreds]; for (PID=0; PID<numPreds; PID++) { ! params[PID] = new TIntDoubleHashMap(); ! modifiers[PID] = new TIntDoubleHashMap(); ! observedExpects[PID] = new TIntDoubleHashMap(); for (OID=0; OID<numOutcomes; OID++) { if (predCount[PID][OID] > 0) { *************** *** 324,330 **** } } ! params[PID].trimToSize(); ! modifiers[PID].trimToSize(); ! observedExpects[PID].trimToSize(); } --- 322,328 ---- } } ! params[PID].compact(); ! modifiers[PID].compact(); ! observedExpects[PID].compact(); } *************** *** 333,337 **** display("...done.\n"); ! pabi = new OpenIntDoubleHashMap[numTokens]; if (needCorrection) { --- 331,335 ---- display("...done.\n"); ! 
pabi = new TIntDoubleHashMap[numTokens]; if (needCorrection) { *************** *** 339,351 **** display("Computing correction feature matrix... "); ! cfvals = new OpenIntIntHashMap[numTokens]; for (TID=0; TID<numTokens; TID++) { ! cfvals[TID] = new OpenIntIntHashMap(); ! pabi[TID] = new OpenIntDoubleHashMap(); for (int j=0; j<contexts[TID].length; j++) { PID = contexts[TID][j]; predkeys = params[PID].keys(); ! for (int i=0; i<predkeys.size(); i++) { ! OID = predkeys.get(i); if (cfvals[TID].containsKey(OID)) { cfvals[TID].put(OID, cfvals[TID].get(OID) + 1); --- 337,349 ---- display("Computing correction feature matrix... "); ! cfvals = new TIntIntHashMap[numTokens]; for (TID=0; TID<numTokens; TID++) { ! cfvals[TID] = new TIntIntHashMap(); ! pabi[TID] = new TIntDoubleHashMap(); for (int j=0; j<contexts[TID].length; j++) { PID = contexts[TID][j]; predkeys = params[PID].keys(); ! for (int i=0; i<predkeys.length; i++) { ! OID = predkeys[i]; if (cfvals[TID].containsKey(OID)) { cfvals[TID].put(OID, cfvals[TID].get(OID) + 1); *************** *** 356,367 **** } } ! cfvals[TID].trimToSize(); ! pabi[TID].trimToSize(); } for (TID=0; TID<numTokens; TID++) { predkeys = cfvals[TID].keys(); ! for (int i=0; i<predkeys.size(); i++) { ! OID = predkeys.get(i); cfvals[TID].put(OID, constant - cfvals[TID].get(OID)); } --- 354,365 ---- } } ! cfvals[TID].compact(); ! pabi[TID].compact(); } for (TID=0; TID<numTokens; TID++) { predkeys = cfvals[TID].keys(); ! for (int i=0; i<predkeys.length; i++) { ! OID = predkeys[i]; cfvals[TID].put(OID, constant - cfvals[TID].get(OID)); } *************** *** 381,394 **** else { // initialize just the pabi table ! pabi = new OpenIntDoubleHashMap[numTokens]; for (TID=0; TID<numTokens; TID++) { ! pabi[TID] = new OpenIntDoubleHashMap(); for (int j=0; j<contexts[TID].length; j++) { PID = contexts[TID][j]; predkeys = params[PID].keys(); ! for (int i=0; i<predkeys.size(); i++) ! pabi[TID].put(predkeys.get(i), 0.0); } ! 
pabi[TID].trimToSize(); } } --- 379,392 ---- else { // initialize just the pabi table ! pabi = new TIntDoubleHashMap[numTokens]; for (TID=0; TID<numTokens; TID++) { ! pabi[TID] = new TIntDoubleHashMap(); for (int j=0; j<contexts[TID].length; j++) { PID = contexts[TID][j]; predkeys = params[PID].keys(); ! for (int i=0; i<predkeys.length; i++) ! pabi[TID].put(predkeys[i], 0.0); } ! pabi[TID].compact(); } } *************** *** 434,448 **** CFMOD = 0.0; for (TID=0; TID<numTokens; TID++) { ! pabi[TID].assign(backToZeros); for (int j=0; j<contexts[TID].length; j++) ! params[contexts[TID][j]].forEachPair(addParamsToPABI); PABISUM = 0.0; // PABISUM is computed in the next line's procedure ! pabi[TID].forEachPair(addCorrectionToPABIandExponentiate); ! if (PABISUM > 0.0) pabi[TID].assign(normalizePABI); if (needCorrection) ! pabi[TID].forEachPair(updateCorrectionFeatureModifier); } display("."); --- 432,446 ---- CFMOD = 0.0; for (TID=0; TID<numTokens; TID++) { ! pabi[TID].transformValues(backToZeros); for (int j=0; j<contexts[TID].length; j++) ! params[contexts[TID][j]].forEachEntry(addParamsToPABI); PABISUM = 0.0; // PABISUM is computed in the next line's procedure ! pabi[TID].forEachEntry(addCorrectionToPABIandExponentiate); ! if (PABISUM > 0.0) pabi[TID].transformValues(normalizePABI); if (needCorrection) ! pabi[TID].forEachEntry(updateCorrectionFeatureModifier); } display("."); *************** *** 455,459 **** // globally for the updateModifiers procedure used after it PID = contexts[TID][j]; ! modifiers[PID].forEachPair(updateModifiers); } } --- 453,457 ---- // globally for the updateModifiers procedure used after it PID = contexts[TID][j]; ! modifiers[PID].forEachEntry(updateModifiers); } } *************** *** 462,467 **** // compute the new parameter values for (PID=0; PID<numPreds; PID++) { ! params[PID].forEachPair(updateParams); ! 
modifiers[PID].assign(backToZeros); // re-initialize to 0.0's } --- 460,465 ---- // compute the new parameter values for (PID=0; PID<numPreds; PID++) { ! params[PID].forEachEntry(updateParams); ! modifiers[PID].transformValues(backToZeros); // re-initialize to 0.0's } |
From: Eric F. <er...@us...> - 2001-12-27 19:20:29
Update of /cvsroot/maxent/maxent/lib In directory usw-pr-cvs1:/tmp/cvs-serv11903/lib Modified Files: trove.jar Removed Files: colt.jar colt.license Log Message: This is the merge of the no_colt branch -> head. The following notes are copied from the head of the CHANGES file. Removed Colt dependency in favor of GNU Trove. (Eric) Refactored index() method in DataIndexer so that only one pass over the list of events is needed. This saves time (of course) and also space, since it's no longer necessary to allocate temporary data structures to share data between two loops. (Eric) Refactored sorting/merging algorithm for ComparableEvents so that merging can be done in place. This makes it possible to merge without copying duplicate events into sublists and so improves the indexer's ability to work on large data sets with a reasonable amount of memory. There is still more to be done in this department, however. (Eric) Index: trove.jar =================================================================== RCS file: /cvsroot/maxent/maxent/lib/trove.jar,v retrieving revision 1.5 retrieving revision 1.6 diff -C2 -d -r1.5 -r1.6 Binary files /tmp/cvsBdQ5R4 and /tmp/cvsAiczo1 differ --- colt.jar DELETED --- --- colt.license DELETED --- |
From: Eric F. <er...@us...> - 2001-12-27 19:20:28
Update of /cvsroot/maxent/maxent In directory usw-pr-cvs1:/tmp/cvs-serv11903 Modified Files: CHANGES build.xml Log Message: This is the merge of the no_colt branch -> head. The following notes are copied from the head of the CHANGES file. Removed Colt dependency in favor of GNU Trove. (Eric) Refactored index() method in DataIndexer so that only one pass over the list of events is needed. This saves time (of course) and also space, since it's no longer necessary to allocate temporary data structures to share data between two loops. (Eric) Refactored sorting/merging algorithm for ComparableEvents so that merging can be done in place. This makes it possible to merge without copying duplicate events into sublists and so improves the indexer's ability to work on large data sets with a reasonable amount of memory. There is still more to be done in this department, however. (Eric) Index: CHANGES =================================================================== RCS file: /cvsroot/maxent/maxent/CHANGES,v retrieving revision 1.4 retrieving revision 1.5 diff -C2 -d -r1.4 -r1.5 *** CHANGES 2001/11/21 10:15:54 1.4 --- CHANGES 2001/12/27 19:20:26 1.5 *************** *** 1,2 **** --- 1,17 ---- + 1.2.6 + ----- + Removed Colt dependency in favor of GNU Trove. (Eric) + + Refactored index() method in DataIndexer so that only one pass over the + list of events is needed. This saves time (of course) and also space, + since it's no longer necessary to allocate temporary data structures to + share data between two loops. (Eric) + + Refactored sorting/merging algorithm for ComparableEvents so that + merging can be done in place. This makes it possible to merge without + copying duplicate events into sublists and so improves the indexer's + ability to work on large data sets with a reasonable amount of memory. + There is still more to be done in this department, however. (Eric) + 1.2.4 ----- *************** *** 139,141 **** 0.2.0 _____ ! 
Initial release of fully functional maxent package. \ No newline at end of file --- 154,156 ---- 0.2.0 _____ ! Initial release of fully functional maxent package. Index: build.xml =================================================================== RCS file: /cvsroot/maxent/maxent/build.xml,v retrieving revision 1.11 retrieving revision 1.12 diff -C2 -d -r1.11 -r1.12 *** build.xml 2001/11/22 15:07:05 1.11 --- build.xml 2001/12/27 19:20:26 1.12 *************** *** 17,21 **** <property name="build.compiler" value="classic"/> <property name="debug" value="on"/> ! <property name="optimize" value="on"/> <property name="deprecation" value="on"/> --- 17,21 ---- <property name="build.compiler" value="classic"/> <property name="debug" value="on"/> ! <property name="optimize" value="off"/> <property name="deprecation" value="on"/> *************** *** 42,46 **** <path id="build.classpath"> <pathelement location="${lib.dir}/java-getopt.jar"/> - <pathelement location="${lib.dir}/colt.jar"/> <pathelement location="${lib.dir}/trove.jar"/> </path> --- 42,45 ---- *************** *** 125,129 **** <pathelement path="${build.dir}/${name}-${DSTAMP}.jar"/> <pathelement location="${lib.dir}/java-getopt.jar"/> - <pathelement location="${lib.dir}/colt.jar"/> <pathelement location="${lib.dir}/trove.jar"/> </mergefiles> --- 124,127 ---- |
From: Eric F. <er...@us...> - 2001-12-27 04:32:40
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent In directory usw-pr-cvs1:/tmp/cvs-serv24656/src/java/opennlp/maxent Modified Files: Tag: no_colt ComparableEvent.java DataIndexer.java Log Message: refactored sorting/merging of events so that the events do not have to be copied into lists of lists for merging. Instead, ComparableEvent has a `seen' field (default 1) which can be used to track the number of times an event has been seen. After sorting the events, the first loop over the set increments this value when duplicates are found and simply nulls them out of the list. The second loop can then skip over null slots in the list, mining only the "unique" events. This saves a lot of memory, because it's no longer necessary to build lists of lists. I also switched optimize to "off" in build.xml because -O doesn't do anything in modern java compilers (HotSpot does all of the optimization work, at runtime), but it does cause jikes to drop the line numbers from stack traces. Index: ComparableEvent.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/ComparableEvent.java,v retrieving revision 1.1.1.1 retrieving revision 1.1.1.1.2.1 diff -C2 -d -r1.1.1.1 -r1.1.1.1.2.1 *** ComparableEvent.java 2001/10/23 14:06:53 1.1.1.1 --- ComparableEvent.java 2001/12/27 04:32:37 1.1.1.1.2.1 *************** *** 30,67 **** public int outcome; public int[] predIndexes; public ComparableEvent(int oc, int[] pids) { ! outcome = oc; ! Arrays.sort(pids); ! predIndexes = pids; } public int compareTo(Object o) { ! ComparableEvent ce = (ComparableEvent)o; ! if (outcome < ce.outcome) return -1; ! else if (outcome > ce.outcome) return 1; ! int smallerLength = (predIndexes.length > ce.predIndexes.length? ! ce.predIndexes.length : predIndexes.length); ! for (int i=0; i<smallerLength; i++) { ! if (predIndexes[i] < ce.predIndexes[i]) return -1; ! else if (predIndexes[i] > ce.predIndexes[i]) return 1; ! } ! 
if (predIndexes.length < ce.predIndexes.length) return -1; ! else if (predIndexes.length > ce.predIndexes.length) return 1; ! return 0; } public String toString() { ! String s = ""; ! for (int i=0; i<predIndexes.length; i++) s+= " "+predIndexes[i]; ! return s; } - } --- 30,68 ---- public int outcome; public int[] predIndexes; + public int seen = 1; // the number of times this event + // has been seen. public ComparableEvent(int oc, int[] pids) { ! outcome = oc; ! Arrays.sort(pids); ! predIndexes = pids; } public int compareTo(Object o) { ! ComparableEvent ce = (ComparableEvent)o; ! if (outcome < ce.outcome) return -1; ! else if (outcome > ce.outcome) return 1; ! int smallerLength = (predIndexes.length > ce.predIndexes.length? ! ce.predIndexes.length : predIndexes.length); ! for (int i=0; i<smallerLength; i++) { ! if (predIndexes[i] < ce.predIndexes[i]) return -1; ! else if (predIndexes[i] > ce.predIndexes[i]) return 1; ! } ! if (predIndexes.length < ce.predIndexes.length) return -1; ! else if (predIndexes.length > ce.predIndexes.length) return 1; ! return 0; } public String toString() { ! String s = ""; ! for (int i=0; i<predIndexes.length; i++) s+= " "+predIndexes[i]; ! return s; } } Index: DataIndexer.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/DataIndexer.java,v retrieving revision 1.4.2.3 retrieving revision 1.4.2.4 diff -C2 -d -r1.4.2.3 -r1.4.2.4 *** DataIndexer.java 2001/12/27 03:51:52 1.4.2.3 --- DataIndexer.java 2001/12/27 04:32:37 1.4.2.4 *************** *** 83,105 **** System.out.print("Sorting and merging events... "); Arrays.sort(eventsToCompare); ComparableEvent ce = eventsToCompare[0]; ! List uniqueEvents = new ArrayList(); ! List newGroup = new ArrayList(); ! int numEvents = eventsToCompare.length; ! for (int i=0; i<numEvents; i++) { if (ce.compareTo(eventsToCompare[i]) == 0) { ! newGroup.add(eventsToCompare[i]); ! } else { ! ce = eventsToCompare[i]; ! 
uniqueEvents.add(newGroup); ! newGroup = new ArrayList(); ! newGroup.add(eventsToCompare[i]); } } - uniqueEvents.add(newGroup); - - int numUniqueEvents = uniqueEvents.size(); System.out.println("done. Reduced " + eventsToCompare.length --- 83,117 ---- System.out.print("Sorting and merging events... "); + sortAndMerge(eventsToCompare); + System.out.println("Done indexing."); + } + + /** + * Sorts and uniques the array of comparable events. This method + * will alter the eventsToCompare array -- it does an in place + * sort, followed by an in place edit to remove duplicates. + * + * @param eventsToCompare a <code>ComparableEvent[]</code> value + * @since maxent 1.2.6 + */ + private void sortAndMerge(ComparableEvent[] eventsToCompare) { Arrays.sort(eventsToCompare); + int numEvents = eventsToCompare.length; + int numUniqueEvents = 1; // assertion: eventsToCompare.length >= 1 + if (eventsToCompare.length <= 1) { + return; // nothing to do; edge case (see assertion) + } + ComparableEvent ce = eventsToCompare[0]; ! for (int i=1; i<numEvents; i++) { if (ce.compareTo(eventsToCompare[i]) == 0) { ! ce.seen++; // increment the seen count ! eventsToCompare[i] = null; // kill the duplicate ! } else { ! ce = eventsToCompare[i]; // a new champion emerges... ! numUniqueEvents++; // increment the # of unique events } } System.out.println("done. Reduced " + eventsToCompare.length *************** *** 110,122 **** numTimesEventsSeen = new int[numUniqueEvents]; ! for (int i=0; i<numUniqueEvents; i++) { ! List group = (List)uniqueEvents.get(i); ! numTimesEventsSeen[i] = group.size(); ! ComparableEvent nextCE = (ComparableEvent)group.get(0); ! outcomeList[i] = nextCE.outcome; ! contexts[i] = nextCE.predIndexes; } - - System.out.println("Done indexing."); } --- 122,135 ---- numTimesEventsSeen = new int[numUniqueEvents]; ! for (int i = 0, j = 0; i<numEvents; i++) { ! ComparableEvent evt = eventsToCompare[i]; ! if (null == evt) { ! continue; // this was a dupe, skip over it. ! } ! 
numTimesEventsSeen[j] = evt.seen; ! outcomeList[j] = evt.outcome; ! contexts[j] = evt.predIndexes; ! ++j; } } |
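The in-place sort-and-merge described in the commit above can be sketched in isolation. This is an illustrative reconstruction, not the actual maxent classes: `Item` stands in for `ComparableEvent`, with the same `seen`-count-and-null-out-duplicates strategy that avoids building lists of lists.

```java
import java.util.Arrays;

// Sketch of the in-place sort-and-merge from the log message above:
// duplicates are counted into the first occurrence's 'seen' field and
// nulled out of the array, so no auxiliary collections are needed.
public class SortMergeSketch {
    public static final class Item implements Comparable<Item> {
        public final int key;
        public int seen = 1; // number of times this item has been seen
        public Item(int key) { this.key = key; }
        public int compareTo(Item o) { return Integer.compare(key, o.key); }
    }

    // Sorts in place, merges duplicates into the first occurrence's
    // 'seen' count, and returns the number of unique items.
    public static int sortAndMerge(Item[] items) {
        if (items.length <= 1) return items.length; // edge case
        Arrays.sort(items);
        int unique = 1;
        Item current = items[0];
        for (int i = 1; i < items.length; i++) {
            if (current.compareTo(items[i]) == 0) {
                current.seen++;      // count the duplicate...
                items[i] = null;     // ...and drop it from the array
            } else {
                current = items[i];  // a new unique item
                unique++;
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        Item[] items = { new Item(3), new Item(1), new Item(3),
                         new Item(1), new Item(2) };
        System.out.println(sortAndMerge(items)); // 3 unique keys
    }
}
```

A second pass can then skip the `null` slots while copying out the unique items, as the commit's `DataIndexer` loop does with its separate read/write indices.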
From: Eric F. <er...@us...> - 2001-12-27 04:32:40
Update of /cvsroot/maxent/maxent In directory usw-pr-cvs1:/tmp/cvs-serv24656 Modified Files: Tag: no_colt build.xml Log Message: refactored sorting/merging of events so that the events do not have to be copied into lists of lists for merging. Instead, ComparableEvent has a `seen' field (default 1) which can be used to track the number of times an event has been seen. After sorting the events, the first loop over the set increments this value when duplicates are found and simply nulls them out of the list. The second loop can then skip over null slots in the list, mining only the "unique" events. This saves a lot of memory, because it's no longer necessary to build lists of lists. I also switched optimize to "off" in build.xml because -O doesn't do anything in modern java compilers (HotSpot does all of the optimization work, at runtime), but it does cause jikes to drop the line numbers from stack traces. Index: build.xml =================================================================== RCS file: /cvsroot/maxent/maxent/build.xml,v retrieving revision 1.11.2.1 retrieving revision 1.11.2.2 diff -C2 -d -r1.11.2.1 -r1.11.2.2 *** build.xml 2001/12/14 14:38:10 1.11.2.1 --- build.xml 2001/12/27 04:32:37 1.11.2.2 *************** *** 17,21 **** <property name="build.compiler" value="classic"/> <property name="debug" value="on"/> ! <property name="optimize" value="on"/> <property name="deprecation" value="on"/> --- 17,21 ---- <property name="build.compiler" value="classic"/> <property name="debug" value="on"/> ! <property name="optimize" value="off"/> <property name="deprecation" value="on"/> |
From: Eric F. <er...@us...> - 2001-12-27 03:51:55
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent In directory usw-pr-cvs1:/tmp/cvs-serv19977/src/java/opennlp/maxent Modified Files: Tag: no_colt DataIndexer.java Log Message: refactored code that converts a Map of String->Integer relations into a String[] array whose indices are the values of the Integers from the map (and whose values are the corresponding Strings). This used to be duplicated code; it is now encapsulated in a method. Index: DataIndexer.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/DataIndexer.java,v retrieving revision 1.4.2.2 retrieving revision 1.4.2.3 diff -C2 -d -r1.4.2.2 -r1.4.2.3 *** DataIndexer.java 2001/12/27 03:33:30 1.4.2.2 --- DataIndexer.java 2001/12/27 03:51:52 1.4.2.3 *************** *** 194,212 **** toIntArray(indexedContext)); } ! outcomeLabels = new String[omap.size()]; ! for (Iterator i=omap.keySet().iterator(); i.hasNext();) { ! String oc = (String)i.next(); ! outcomeLabels[((Integer)omap.get(oc)).intValue()] = oc; ! } ! omap = null; ! ! predLabels = new String[pmap.size()]; ! for (Iterator i = pmap.keySet().iterator(); i.hasNext();) { ! String n = (String)i.next(); ! predLabels[((Integer)pmap.get(n)).intValue()] = n; ! } ! pmap = null; ! return eventsToCompare; } --- 194,220 ---- toIntArray(indexedContext)); } ! outcomeLabels = toIndexedStringArray(omap); ! predLabels = toIndexedStringArray(pmap); return eventsToCompare; + } + + /** + * Utility method for creating a String[] array from a map whose + * keys are labels (Strings) to be stored in the array and whose + * values are the indices (Integers) at which the corresponding + * labels should be inserted. 
+ * + * @param labelToIndexMap a <code>Map</code> value + * @return a <code>String[]</code> value + * @since maxent 1.2.6 + */ + static String[] toIndexedStringArray(Map labelToIndexMap) { + String[] array = new String[labelToIndexMap.size()]; + for (Iterator i = labelToIndexMap.keySet().iterator(); i.hasNext();) { + String label = (String)i.next(); + int index = ((Integer)labelToIndexMap.get(label)).intValue(); + array[index] = label; + } + return array; } |
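The `toIndexedStringArray` utility added in this commit inverts a label-to-index map into an array. The same logic, written here with generics and an entry-set loop purely for illustration (the original predates generics and iterates keys):

```java
import java.util.HashMap;
import java.util.Map;

// Generic sketch of toIndexedStringArray: keys are labels, values are
// the array slots the labels belong in.
public class LabelArray {
    public static String[] toIndexedStringArray(Map<String, Integer> labelToIndexMap) {
        String[] array = new String[labelToIndexMap.size()];
        for (Map.Entry<String, Integer> e : labelToIndexMap.entrySet()) {
            array[e.getValue()] = e.getKey(); // place label at its index
        }
        return array;
    }

    public static void main(String[] args) {
        Map<String, Integer> omap = new HashMap<>();
        omap.put("NOUN", 1);
        omap.put("VERB", 0);
        String[] labels = toIndexedStringArray(omap);
        System.out.println(labels[0] + " " + labels[1]); // VERB NOUN
    }
}
```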
From: Eric F. <er...@us...> - 2001-12-27 03:33:33
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent In directory usw-pr-cvs1:/tmp/cvs-serv17805/src/java/opennlp/maxent Modified Files: Tag: no_colt DataIndexer.java Log Message: part two of index() refactoring: removed a temporary int[] array which was as large as the number of events, since it is not needed in a single pass implementation of the indexing loop. Index: DataIndexer.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/DataIndexer.java,v retrieving revision 1.4.2.1 retrieving revision 1.4.2.2 diff -C2 -d -r1.4.2.1 -r1.4.2.2 *** DataIndexer.java 2001/12/27 03:19:01 1.4.2.1 --- DataIndexer.java 2001/12/27 03:33:30 1.4.2.2 *************** *** 161,165 **** int outcomeCount = 0; int predCount = 0; - int[] uncompressedOutcomeList = new int[numEvents]; ComparableEvent[] eventsToCompare = new ComparableEvent[numEvents]; --- 161,164 ---- |
From: Eric F. <er...@us...> - 2001-12-27 03:19:04
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent In directory usw-pr-cvs1:/tmp/cvs-serv15572/java/opennlp/maxent Modified Files: Tag: no_colt DataIndexer.java Log Message: Refactored the index() method so that it only loops once over the list of events. The previous implementation looped twice for the same functionality. Note that this change is on the no_colt branch, since we haven't merged those changes into the trunk, yet. Index: DataIndexer.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/DataIndexer.java,v retrieving revision 1.4 retrieving revision 1.4.2.1 diff -C2 -d -r1.4 -r1.4.2.1 *** DataIndexer.java 2001/11/15 18:08:20 1.4 --- DataIndexer.java 2001/12/27 03:19:01 1.4.2.1 *************** *** 162,167 **** int predCount = 0; int[] uncompressedOutcomeList = new int[numEvents]; ! List uncompressedContexts = new ArrayList(); ! for (int eventIndex=0; eventIndex<numEvents; eventIndex++) { Event ev = (Event)events.removeFirst(); --- 162,167 ---- int predCount = 0; int[] uncompressedOutcomeList = new int[numEvents]; ! ComparableEvent[] eventsToCompare = new ComparableEvent[numEvents]; ! for (int eventIndex=0; eventIndex<numEvents; eventIndex++) { Event ev = (Event)events.removeFirst(); *************** *** 191,196 **** } } ! uncompressedContexts.add(indexedContext); ! uncompressedOutcomeList[eventIndex] = ocID.intValue(); } outcomeLabels = new String[omap.size()]; --- 191,197 ---- } } ! eventsToCompare[eventIndex] = ! new ComparableEvent(ocID.intValue(), ! toIntArray(indexedContext)); } outcomeLabels = new String[omap.size()]; *************** *** 208,225 **** pmap = null; ! ComparableEvent[] eventsToCompare = new ComparableEvent[numEvents]; ! for (int i=0; i<numEvents; i++) { ! List ecLL = (List)uncompressedContexts.get(i); ! int[] ecInts = new int[ecLL.size()]; ! for (int j=0; j<ecInts.length; j++) { ! ecInts[j] = ((Integer)ecLL.get(j)).intValue(); ! } ! eventsToCompare[i] = ! 
new ComparableEvent(uncompressedOutcomeList[i], ecInts); } ! ! return eventsToCompare; } - } --- 209,230 ---- pmap = null; ! return eventsToCompare; ! } ! /** ! * Utility method for turning a list of Integer objects into a ! * native array of primitive ints. ! * ! * @param integers a <code>List</code> value ! * @return an <code>int[]</code> value ! * @since maxent 1.2.6 ! */ ! static final int[] toIntArray(List integers) { ! int[] rv = new int[integers.size()]; ! int i = 0; ! for (Iterator it = integers.iterator(); it.hasNext();) { ! rv[i++] = ((Integer)it.next()).intValue(); } ! return rv; } } |
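The `toIntArray` helper introduced in this commit unboxes a `List` of `Integer`s into a primitive `int[]`. An equivalent sketch with generics (illustrative only; the original uses a raw `List` and explicit casts):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the toIntArray utility: unbox a List<Integer> into int[].
public class IntArrays {
    public static int[] toIntArray(List<Integer> integers) {
        int[] rv = new int[integers.size()];
        int i = 0;
        for (Integer n : integers) {
            rv[i++] = n; // auto-unboxing replaces the intValue() call
        }
        return rv;
    }

    public static void main(String[] args) {
        int[] rv = toIntArray(Arrays.asList(4, 8, 15));
        System.out.println(Arrays.toString(rv)); // [4, 8, 15]
    }
}
```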
From: Eric F. <er...@us...> - 2001-12-14 14:38:15
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent/io In directory usw-pr-cvs1:/tmp/cvs-serv16092/src/java/opennlp/maxent/io Modified Files: Tag: no_colt GISModelReader.java GISModelWriter.java OldFormatGISModelReader.java Log Message: [note that this is a commit to a branch, not the HEAD] Removed all colt dependencies Removed colt Upgraded trove.jar to version 0.0.8 Index: GISModelReader.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/GISModelReader.java,v retrieving revision 1.3 retrieving revision 1.3.2.1 diff -C2 -d -r1.3 -r1.3.2.1 *** GISModelReader.java 2001/11/15 15:42:14 1.3 --- GISModelReader.java 2001/12/14 14:38:11 1.3.2.1 *************** *** 18,23 **** package opennlp.maxent.io; import opennlp.maxent.*; - import cern.colt.map.*; import java.util.StringTokenizer; --- 18,23 ---- package opennlp.maxent.io; + import gnu.trove.*; import opennlp.maxent.*; import java.util.StringTokenizer; *************** *** 79,83 **** int[][] outcomePatterns = getOutcomePatterns(); String[] predLabels = getPredicates(); ! OpenIntDoubleHashMap[] params = getParameters(outcomePatterns); return new GISModel(params, --- 79,83 ---- int[][] outcomePatterns = getOutcomePatterns(); String[] predLabels = getPredicates(); ! TIntDoubleHashMap[] params = getParameters(outcomePatterns); return new GISModel(params, *************** *** 134,151 **** } ! protected OpenIntDoubleHashMap[] getParameters (int[][] outcomePatterns) throws java.io.IOException { ! OpenIntDoubleHashMap[] params = new OpenIntDoubleHashMap[NUM_PREDS]; int pid=0; for (int i=0; i<outcomePatterns.length; i++) { for (int j=0; j<outcomePatterns[i][0]; j++) { ! params[pid] = new OpenIntDoubleHashMap(); for (int k=1; k<outcomePatterns[i].length; k++) { double d = readDouble(); params[pid].put(outcomePatterns[i][k], d); } ! params[pid].trimToSize(); pid++; } --- 134,151 ---- } ! 
protected TIntDoubleHashMap[] getParameters (int[][] outcomePatterns) throws java.io.IOException { ! TIntDoubleHashMap[] params = new TIntDoubleHashMap[NUM_PREDS]; int pid=0; for (int i=0; i<outcomePatterns.length; i++) { for (int j=0; j<outcomePatterns[i][0]; j++) { ! params[pid] = new TIntDoubleHashMap(); for (int k=1; k<outcomePatterns[i].length; k++) { double d = readDouble(); params[pid].put(outcomePatterns[i][k], d); } ! params[pid].compact(); pid++; } Index: GISModelWriter.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/GISModelWriter.java,v retrieving revision 1.3 retrieving revision 1.3.2.1 diff -C2 -d -r1.3 -r1.3.2.1 *** GISModelWriter.java 2001/11/15 16:18:40 1.3 --- GISModelWriter.java 2001/12/14 14:38:11 1.3.2.1 *************** *** 20,25 **** import opennlp.maxent.*; import gnu.trove.*; - import cern.colt.list.*; - import cern.colt.map.*; import java.io.*; import java.util.*; --- 20,23 ---- *************** *** 34,38 **** */ public abstract class GISModelWriter { ! protected OpenIntDoubleHashMap[] PARAMS; protected String[] OUTCOME_LABELS; protected int CORRECTION_CONSTANT; --- 32,36 ---- */ public abstract class GISModelWriter { ! protected TIntDoubleHashMap[] PARAMS; protected String[] OUTCOME_LABELS; protected int CORRECTION_CONSTANT; *************** *** 44,48 **** Object[] data = model.getDataStructures(); ! PARAMS = (OpenIntDoubleHashMap[])data[0]; TObjectIntHashMap pmap = (TObjectIntHashMap)data[1]; OUTCOME_LABELS = (String[])data[2]; --- 42,46 ---- Object[] data = model.getDataStructures(); ! PARAMS = (TIntDoubleHashMap[])data[0]; TObjectIntHashMap pmap = (TObjectIntHashMap)data[1]; OUTCOME_LABELS = (String[])data[2]; *************** *** 121,153 **** protected ComparablePredicate[] sortValues () { ! ComparablePredicate[] sortPreds = ! new ComparablePredicate[PARAMS.length]; ! int numParams = 0; ! for (int pid=0; pid<PARAMS.length; pid++) { ! 
IntArrayList predkeys = PARAMS[pid].keys(); ! predkeys.sort(); ! int numActive = predkeys.size(); ! numParams += numActive; ! int[] activeOCs = new int[numActive]; ! double[] activeParams = new double[numActive]; ! int id = 0; ! for (int i=0; i<predkeys.size(); i++) { ! int oid = predkeys.get(i); ! activeOCs[id] = oid; ! activeParams[id] = PARAMS[pid].get(oid); ! id++; ! } ! sortPreds[pid] = new ComparablePredicate(PRED_LABELS[pid], ! activeOCs, ! activeParams); ! } ! Arrays.sort(sortPreds); ! return sortPreds; } --- 119,151 ---- protected ComparablePredicate[] sortValues () { ! ComparablePredicate[] sortPreds = ! new ComparablePredicate[PARAMS.length]; ! int numParams = 0; ! for (int pid=0; pid<PARAMS.length; pid++) { ! int[] predkeys = PARAMS[pid].keys(); ! Arrays.sort(predkeys); ! int numActive = predkeys.length; ! numParams += numActive; ! int[] activeOCs = new int[numActive]; ! double[] activeParams = new double[numActive]; ! int id = 0; ! for (int i=0; i < predkeys.length; i++) { ! int oid = predkeys[i]; ! activeOCs[id] = oid; ! activeParams[id] = PARAMS[pid].get(oid); ! id++; ! } ! sortPreds[pid] = new ComparablePredicate(PRED_LABELS[pid], ! activeOCs, ! activeParams); ! } ! Arrays.sort(sortPreds); ! return sortPreds; } Index: OldFormatGISModelReader.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/io/OldFormatGISModelReader.java,v retrieving revision 1.1.1.1 retrieving revision 1.1.1.1.2.1 diff -C2 -d -r1.1.1.1 -r1.1.1.1.2.1 *** OldFormatGISModelReader.java 2001/10/23 14:06:53 1.1.1.1 --- OldFormatGISModelReader.java 2001/12/14 14:38:11 1.1.1.1.2.1 *************** *** 18,22 **** package opennlp.maxent.io; ! import cern.colt.map.*; import java.io.*; import java.util.zip.*; --- 18,22 ---- package opennlp.maxent.io; ! import gnu.trove.*; import java.io.*; import java.util.zip.*; *************** *** 45,66 **** } ! 
protected OpenIntDoubleHashMap[] getParameters (int[][] outcomePatterns) ! throws java.io.IOException { ! OpenIntDoubleHashMap[] params = new OpenIntDoubleHashMap[NUM_PREDS]; ! int pid=0; ! for (int i=0; i<outcomePatterns.length; i++) { ! for (int j=0; j<outcomePatterns[i][0]; j++) { ! params[pid] = new OpenIntDoubleHashMap(); ! for (int k=1; k<outcomePatterns[i].length; k++) { ! double d = paramsInput.readDouble(); ! params[pid].put(outcomePatterns[i][k], d); ! } ! params[pid].trimToSize(); ! pid++; ! } ! } ! return params; } --- 45,66 ---- } ! protected TIntDoubleHashMap[] getParameters (int[][] outcomePatterns) ! throws java.io.IOException { ! TIntDoubleHashMap[] params = new TIntDoubleHashMap[NUM_PREDS]; ! int pid=0; ! for (int i=0; i<outcomePatterns.length; i++) { ! for (int j=0; j<outcomePatterns[i][0]; j++) { ! params[pid] = new TIntDoubleHashMap(); ! for (int k=1; k<outcomePatterns[i].length; k++) { ! double d = paramsInput.readDouble(); ! params[pid].put(outcomePatterns[i][k], d); ! } ! params[pid].compact(); ! pid++; ! } ! } ! return params; } |
From: Eric F. <er...@us...> - 2001-12-14 14:38:14
Update of /cvsroot/maxent/maxent/lib In directory usw-pr-cvs1:/tmp/cvs-serv16092/lib Modified Files: Tag: no_colt trove.jar Removed Files: Tag: no_colt colt.jar colt.license Log Message: [note that this is a commit to a branch, not the HEAD] Removed all colt dependencies Removed colt Upgraded trove.jar to version 0.0.8 Index: trove.jar =================================================================== RCS file: /cvsroot/maxent/maxent/lib/trove.jar,v retrieving revision 1.5 retrieving revision 1.5.2.1 diff -C2 -d -r1.5 -r1.5.2.1 Binary files /tmp/cvsEYMWat and /tmp/cvsGIsQZL differ --- colt.jar DELETED --- --- colt.license DELETED --- |
From: Eric F. <er...@us...> - 2001-12-14 14:38:14
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent In directory usw-pr-cvs1:/tmp/cvs-serv16092/src/java/opennlp/maxent Modified Files: Tag: no_colt GISModel.java GISTrainer.java Log Message: [note that this is a commit to a branch, not the HEAD] Removed all colt dependencies Removed colt Upgraded trove.jar to version 0.0.8 Index: GISModel.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISModel.java,v retrieving revision 1.5 retrieving revision 1.5.2.1 diff -C2 -d -r1.5 -r1.5.2.1 *** GISModel.java 2001/11/30 14:33:28 1.5 --- GISModel.java 2001/12/14 14:38:11 1.5.2.1 *************** *** 19,24 **** import gnu.trove.*; - import cern.colt.list.*; - import cern.colt.map.*; import java.util.*; --- 19,22 ---- *************** *** 31,35 **** */ public final class GISModel implements MaxentModel { ! private final OpenIntDoubleHashMap[] params; private final TObjectIntHashMap pmap; private final String[] ocNames; --- 29,33 ---- */ public final class GISModel implements MaxentModel { ! private final TIntDoubleHashMap[] params; private final TObjectIntHashMap pmap; private final String[] ocNames; *************** *** 41,62 **** private final double fval; ! public GISModel (OpenIntDoubleHashMap[] _params, ! String[] predLabels, ! String[] _ocNames, ! int _correctionConstant, ! double _correctionParam) { ! pmap = new TObjectIntHashMap(predLabels.length); ! for (int i=0; i<predLabels.length; i++) ! pmap.put(predLabels[i], i); ! params = _params; ! ocNames = _ocNames; ! correctionConstant = _correctionConstant; ! correctionParam = _correctionParam; ! numOutcomes = ocNames.length; ! iprob = Math.log(1.0/numOutcomes); ! fval = 1.0/correctionConstant; } --- 39,60 ---- private final double fval; ! public GISModel (TIntDoubleHashMap[] _params, ! String[] predLabels, ! String[] _ocNames, ! int _correctionConstant, ! double _correctionParam) { ! pmap = new TObjectIntHashMap(predLabels.length); ! 
for (int i=0; i<predLabels.length; i++) ! pmap.put(predLabels[i], i); ! params = _params; ! ocNames = _ocNames; ! correctionConstant = _correctionConstant; ! correctionParam = _correctionParam; ! numOutcomes = ocNames.length; ! iprob = Math.log(1.0/numOutcomes); ! fval = 1.0/correctionConstant; } *************** *** 77,115 **** */ public final double[] eval(String[] context) { ! double[] outsums = new double[numOutcomes]; ! int[] numfeats = new int[numOutcomes]; ! for (int oid=0; oid<numOutcomes; oid++) { ! outsums[oid] = iprob; ! numfeats[oid] = 0; ! } ! IntArrayList activeOutcomes = new IntArrayList(0); ! for (int i=0; i<context.length; i++) { ! if (pmap.containsKey(context[i])) { ! OpenIntDoubleHashMap predParams = ! params[pmap.get(context[i])]; ! predParams.keys(activeOutcomes); ! for (int j=0; j<activeOutcomes.size(); j++) { ! int oid = activeOutcomes.getQuick(j); ! numfeats[oid]++; ! outsums[oid] += fval * predParams.get(oid); ! } ! } ! } ! double normal = 0.0; ! for (int oid=0; oid<numOutcomes; oid++) { ! outsums[oid] = Math.exp(outsums[oid] ! + ((1.0 - ! (numfeats[oid]/correctionConstant)) ! * correctionParam)); ! normal += outsums[oid]; ! } ! for (int oid=0; oid<numOutcomes; oid++) ! outsums[oid] /= normal; ! return outsums; } --- 75,113 ---- */ public final double[] eval(String[] context) { ! double[] outsums = new double[numOutcomes]; ! int[] numfeats = new int[numOutcomes]; ! for (int oid=0; oid<numOutcomes; oid++) { ! outsums[oid] = iprob; ! numfeats[oid] = 0; ! } ! int[] activeOutcomes; ! for (int i=0; i<context.length; i++) { ! if (pmap.containsKey(context[i])) { ! TIntDoubleHashMap predParams = ! params[pmap.get(context[i])]; ! activeOutcomes = predParams.keys(); ! for (int j=0; j<activeOutcomes.length; j++) { ! int oid = activeOutcomes[j]; ! numfeats[oid]++; ! outsums[oid] += fval * predParams.get(oid); ! } ! } ! } ! double normal = 0.0; ! for (int oid=0; oid<numOutcomes; oid++) { ! outsums[oid] = Math.exp(outsums[oid] ! + ((1.0 - ! 
(numfeats[oid]/correctionConstant)) ! * correctionParam)); ! normal += outsums[oid]; ! } ! for (int oid=0; oid<numOutcomes; oid++) ! outsums[oid] /= normal; ! return outsums; } *************** *** 124,131 **** */ public final String getBestOutcome(double[] ocs) { ! int best = 0; ! for (int i = 1; i<ocs.length; i++) ! if (ocs[i] > ocs[best]) best = i; ! return ocNames[best]; } --- 122,129 ---- */ public final String getBestOutcome(double[] ocs) { ! int best = 0; ! for (int i = 1; i<ocs.length; i++) ! if (ocs[i] > ocs[best]) best = i; ! return ocNames[best]; } *************** *** 144,164 **** */ public final String getAllOutcomes (double[] ocs) { ! if (ocs.length != ocNames.length) { ! return "The double array sent as a parameter to GISModel.getAllOutcomes() must not have been produced by this model."; ! } ! else { ! StringBuffer sb = new StringBuffer(ocs.length*2); ! String d = Double.toString(ocs[0]); ! if (d.length() > 6) ! d = d.substring(0,7); ! sb.append(ocNames[0]).append("[").append(d).append("]"); ! for (int i = 1; i<ocs.length; i++) { ! d = Double.toString(ocs[i]); ! if (d.length() > 6) ! d = d.substring(0,7); ! sb.append(" ").append(ocNames[i]).append("[").append(d).append("]"); ! } ! return sb.toString(); ! } } --- 142,162 ---- */ public final String getAllOutcomes (double[] ocs) { ! if (ocs.length != ocNames.length) { ! return "The double array sent as a parameter to GISModel.getAllOutcomes() must not have been produced by this model."; ! } ! else { ! StringBuffer sb = new StringBuffer(ocs.length*2); ! String d = Double.toString(ocs[0]); ! if (d.length() > 6) ! d = d.substring(0,7); ! sb.append(ocNames[0]).append("[").append(d).append("]"); ! for (int i = 1; i<ocs.length; i++) { ! d = Double.toString(ocs[i]); ! if (d.length() > 6) ! d = d.substring(0,7); ! sb.append(" ").append(ocNames[i]).append("[").append(d).append("]"); ! } ! return sb.toString(); ! } } *************** *** 171,175 **** */ public final String getOutcome(int i) { ! 
return ocNames[i]; } --- 169,173 ---- */ public final String getOutcome(int i) { ! return ocNames[i]; } *************** *** 183,191 **** **/ public int getIndex (String outcome) { ! for (int i=0; i<ocNames.length; i++) { ! if (ocNames[i].equals(outcome)) ! return i; ! } ! return -1; } --- 181,189 ---- **/ public int getIndex (String outcome) { ! for (int i=0; i<ocNames.length; i++) { ! if (ocNames[i].equals(outcome)) ! return i; ! } ! return -1; } *************** *** 197,201 **** * which is returned by this method: * ! * <li>index 0: cern.colt.map.OpenIntDoubleHashMap[] containing the model * parameters * <li>index 1: java.util.Map containing the mapping of model predicates --- 195,199 ---- * which is returned by this method: * ! * <li>index 0: gnu.trove.TIntDoubleHashMap[] containing the model * parameters * <li>index 1: java.util.Map containing the mapping of model predicates *************** *** 212,225 **** */ public final Object[] getDataStructures () { ! Object[] data = new Object[5]; ! data[0] = params; ! data[1] = pmap; ! data[2] = ocNames; ! data[3] = new Integer(correctionConstant); ! data[4] = new Double(correctionParam); ! return data; } - - - } --- 210,220 ---- */ public final Object[] getDataStructures () { ! Object[] data = new Object[5]; ! data[0] = params; ! data[1] = pmap; ! data[2] = ocNames; ! data[3] = new Integer(correctionConstant); ! data[4] = new Double(correctionParam); ! return data; } } Index: GISTrainer.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISTrainer.java,v retrieving revision 1.2 retrieving revision 1.2.2.1 diff -C2 -d -r1.2 -r1.2.2.1 *** GISTrainer.java 2001/11/16 10:37:43 1.2 --- GISTrainer.java 2001/12/14 14:38:11 1.2.2.1 *************** *** 18,24 **** package opennlp.maxent; ! import cern.colt.function.*; ! import cern.colt.list.*; ! import cern.colt.map.*; import java.io.*; --- 18,22 ---- package opennlp.maxent; ! 
import gnu.trove.*; import java.io.*; *************** *** 82,95 **** // stores the observed expections of each of the events ! private OpenIntDoubleHashMap[] observedExpects; // stores the estimated parameter value of each predicate during iteration ! private OpenIntDoubleHashMap[] params; // stores the modifiers of the parameter values, paired to params ! private OpenIntDoubleHashMap[] modifiers; // a helper object for storing predicate indexes ! private IntArrayList predkeys; // a boolean to track if all events have same number of active features --- 80,93 ---- // stores the observed expections of each of the events ! private TIntDoubleHashMap[] observedExpects; // stores the estimated parameter value of each predicate during iteration ! private TIntDoubleHashMap[] params; // stores the modifiers of the parameter values, paired to params ! private TIntDoubleHashMap[] modifiers; // a helper object for storing predicate indexes ! private int[] predkeys; // a boolean to track if all events have same number of active features *************** *** 109,137 **** // stores the value of corrections feature for each event's predicate list, // expanded to include all outcomes which might come from those predicates. ! private OpenIntIntHashMap[] cfvals; // Normalized Probabilities Of Outcomes Given Context: p(a|b_i) // Stores the computation of each iterations for the update to the // modifiers (and therefore the params) ! private OpenIntDoubleHashMap[] pabi; ! // make all values in an OpenIntDoubleHashMap return to 0.0 ! private DoubleFunction backToZeros = ! new DoubleFunction() { ! public double apply(double arg) { return 0.0; } }; ! // divide all values in the OpenIntDoubleHashMap pabi[TID] by the sum of // all values in the map. ! private DoubleFunction normalizePABI = ! new DoubleFunction() { ! public double apply(double arg) { return arg / PABISUM; } }; // add the previous iteration's parameters to the computation of the // modifiers of this iteration. ! 
private IntDoubleProcedure addParamsToPABI = ! new IntDoubleProcedure() { ! public boolean apply(int oid, double arg) { pabi[TID].put(oid, pabi[TID].get(oid) + arg); return true; --- 107,135 ---- // stores the value of corrections feature for each event's predicate list, // expanded to include all outcomes which might come from those predicates. ! private TIntIntHashMap[] cfvals; // Normalized Probabilities Of Outcomes Given Context: p(a|b_i) // Stores the computation of each iterations for the update to the // modifiers (and therefore the params) ! private TIntDoubleHashMap[] pabi; ! // make all values in an TIntDoubleHashMap return to 0.0 ! private TDoubleFunction backToZeros = ! new TDoubleFunction() { ! public double execute(double arg) { return 0.0; } }; ! // divide all values in the TIntDoubleHashMap pabi[TID] by the sum of // all values in the map. ! private TDoubleFunction normalizePABI = ! new TDoubleFunction() { ! public double execute(double arg) { return arg / PABISUM; } }; // add the previous iteration's parameters to the computation of the // modifiers of this iteration. ! private TIntDoubleProcedure addParamsToPABI = ! new TIntDoubleProcedure() { ! public boolean execute(int oid, double arg) { pabi[TID].put(oid, pabi[TID].get(oid) + arg); return true; *************** *** 140,146 **** // add the correction parameter and exponentiate it ! private IntDoubleProcedure addCorrectionToPABIandExponentiate = ! new IntDoubleProcedure() { ! public boolean apply(int oid, double arg) { if (needCorrection) arg = arg + (correctionParam * cfvals[TID].get(oid)); --- 138,144 ---- // add the correction parameter and exponentiate it ! private TIntDoubleProcedure addCorrectionToPABIandExponentiate = ! new TIntDoubleProcedure() { ! public boolean execute(int oid, double arg) { if (needCorrection) arg = arg + (correctionParam * cfvals[TID].get(oid)); *************** *** 153,159 **** // update the modifiers based on the new pabi values ! 
private IntDoubleProcedure updateModifiers = ! new IntDoubleProcedure() { ! public boolean apply(int oid, double arg) { modifiers[PID].put(oid, arg --- 151,157 ---- // update the modifiers based on the new pabi values ! private TIntDoubleProcedure updateModifiers = ! new TIntDoubleProcedure() { ! public boolean execute(int oid, double arg) { modifiers[PID].put(oid, arg *************** *** 165,171 **** // update the params based on the newly computed modifiers ! private IntDoubleProcedure updateParams = ! new IntDoubleProcedure() { ! public boolean apply(int oid, double arg) { params[PID].put(oid, arg --- 163,169 ---- // update the params based on the newly computed modifiers ! private TIntDoubleProcedure updateParams = ! new TIntDoubleProcedure() { ! public boolean execute(int oid, double arg) { params[PID].put(oid, arg *************** *** 179,185 **** // update the correction feature modifier, which will then be used to // updated the correction parameter ! private IntDoubleProcedure updateCorrectionFeatureModifier = ! new IntDoubleProcedure() { ! public boolean apply(int oid, double arg) { CFMOD += arg * cfvals[TID].get(oid) * numTimesEventsSeen[TID]; return true; --- 177,183 ---- // update the correction feature modifier, which will then be used to // updated the correction parameter ! private TIntDoubleProcedure updateCorrectionFeatureModifier = ! new TIntDoubleProcedure() { ! public boolean execute(int oid, double arg) { CFMOD += arg * cfvals[TID].get(oid) * numTimesEventsSeen[TID]; return true; *************** *** 304,315 **** // implementation, this is cancelled out when we compute the next // iteration of a parameter, making the extra divisions wasteful. ! params = new OpenIntDoubleHashMap[numPreds]; ! modifiers = new OpenIntDoubleHashMap[numPreds]; ! observedExpects = new OpenIntDoubleHashMap[numPreds]; for (PID=0; PID<numPreds; PID++) { ! params[PID] = new OpenIntDoubleHashMap(); ! modifiers[PID] = new OpenIntDoubleHashMap(); ! 
observedExpects[PID] = new OpenIntDoubleHashMap(); for (OID=0; OID<numOutcomes; OID++) { if (predCount[PID][OID] > 0) { --- 302,313 ---- // implementation, this is cancelled out when we compute the next // iteration of a parameter, making the extra divisions wasteful. ! params = new TIntDoubleHashMap[numPreds]; ! modifiers = new TIntDoubleHashMap[numPreds]; ! observedExpects = new TIntDoubleHashMap[numPreds]; for (PID=0; PID<numPreds; PID++) { ! params[PID] = new TIntDoubleHashMap(); ! modifiers[PID] = new TIntDoubleHashMap(); ! observedExpects[PID] = new TIntDoubleHashMap(); for (OID=0; OID<numOutcomes; OID++) { if (predCount[PID][OID] > 0) { *************** *** 324,330 **** } } ! params[PID].trimToSize(); ! modifiers[PID].trimToSize(); ! observedExpects[PID].trimToSize(); } --- 322,328 ---- } } ! params[PID].compact(); ! modifiers[PID].compact(); ! observedExpects[PID].compact(); } *************** *** 333,337 **** display("...done.\n"); ! pabi = new OpenIntDoubleHashMap[numTokens]; if (needCorrection) { --- 331,335 ---- display("...done.\n"); ! pabi = new TIntDoubleHashMap[numTokens]; if (needCorrection) { *************** *** 339,351 **** display("Computing correction feature matrix... "); ! cfvals = new OpenIntIntHashMap[numTokens]; for (TID=0; TID<numTokens; TID++) { ! cfvals[TID] = new OpenIntIntHashMap(); ! pabi[TID] = new OpenIntDoubleHashMap(); for (int j=0; j<contexts[TID].length; j++) { PID = contexts[TID][j]; predkeys = params[PID].keys(); ! for (int i=0; i<predkeys.size(); i++) { ! OID = predkeys.get(i); if (cfvals[TID].containsKey(OID)) { cfvals[TID].put(OID, cfvals[TID].get(OID) + 1); --- 337,349 ---- display("Computing correction feature matrix... "); ! cfvals = new TIntIntHashMap[numTokens]; for (TID=0; TID<numTokens; TID++) { ! cfvals[TID] = new TIntIntHashMap(); ! pabi[TID] = new TIntDoubleHashMap(); for (int j=0; j<contexts[TID].length; j++) { PID = contexts[TID][j]; predkeys = params[PID].keys(); ! for (int i=0; i<predkeys.length; i++) { ! 
OID = predkeys[i]; if (cfvals[TID].containsKey(OID)) { cfvals[TID].put(OID, cfvals[TID].get(OID) + 1); *************** *** 356,367 **** } } ! cfvals[TID].trimToSize(); ! pabi[TID].trimToSize(); } for (TID=0; TID<numTokens; TID++) { predkeys = cfvals[TID].keys(); ! for (int i=0; i<predkeys.size(); i++) { ! OID = predkeys.get(i); cfvals[TID].put(OID, constant - cfvals[TID].get(OID)); } --- 354,365 ---- } } ! cfvals[TID].compact(); ! pabi[TID].compact(); } for (TID=0; TID<numTokens; TID++) { predkeys = cfvals[TID].keys(); ! for (int i=0; i<predkeys.length; i++) { ! OID = predkeys[i]; cfvals[TID].put(OID, constant - cfvals[TID].get(OID)); } *************** *** 381,394 **** else { // initialize just the pabi table ! pabi = new OpenIntDoubleHashMap[numTokens]; for (TID=0; TID<numTokens; TID++) { ! pabi[TID] = new OpenIntDoubleHashMap(); for (int j=0; j<contexts[TID].length; j++) { PID = contexts[TID][j]; predkeys = params[PID].keys(); ! for (int i=0; i<predkeys.size(); i++) ! pabi[TID].put(predkeys.get(i), 0.0); } ! pabi[TID].trimToSize(); } } --- 379,392 ---- else { // initialize just the pabi table ! pabi = new TIntDoubleHashMap[numTokens]; for (TID=0; TID<numTokens; TID++) { ! pabi[TID] = new TIntDoubleHashMap(); for (int j=0; j<contexts[TID].length; j++) { PID = contexts[TID][j]; predkeys = params[PID].keys(); ! for (int i=0; i<predkeys.length; i++) ! pabi[TID].put(predkeys[i], 0.0); } ! pabi[TID].compact(); } } *************** *** 434,448 **** CFMOD = 0.0; for (TID=0; TID<numTokens; TID++) { ! pabi[TID].assign(backToZeros); for (int j=0; j<contexts[TID].length; j++) ! params[contexts[TID][j]].forEachPair(addParamsToPABI); PABISUM = 0.0; // PABISUM is computed in the next line's procedure ! pabi[TID].forEachPair(addCorrectionToPABIandExponentiate); ! if (PABISUM > 0.0) pabi[TID].assign(normalizePABI); if (needCorrection) ! pabi[TID].forEachPair(updateCorrectionFeatureModifier); } display("."); --- 432,446 ---- CFMOD = 0.0; for (TID=0; TID<numTokens; TID++) { ! 
pabi[TID].transformValues(backToZeros); for (int j=0; j<contexts[TID].length; j++) ! params[contexts[TID][j]].forEachEntry(addParamsToPABI); PABISUM = 0.0; // PABISUM is computed in the next line's procedure ! pabi[TID].forEachEntry(addCorrectionToPABIandExponentiate); ! if (PABISUM > 0.0) pabi[TID].transformValues(normalizePABI); if (needCorrection) ! pabi[TID].forEachEntry(updateCorrectionFeatureModifier); } display("."); *************** *** 455,459 **** // globally for the updateModifiers procedure used after it PID = contexts[TID][j]; ! modifiers[PID].forEachPair(updateModifiers); } } --- 453,457 ---- // globally for the updateModifiers procedure used after it PID = contexts[TID][j]; ! modifiers[PID].forEachEntry(updateModifiers); } } *************** *** 462,467 **** // compute the new parameter values for (PID=0; PID<numPreds; PID++) { ! params[PID].forEachPair(updateParams); ! modifiers[PID].assign(backToZeros); // re-initialize to 0.0's } --- 460,465 ---- // compute the new parameter values for (PID=0; PID<numPreds; PID++) { ! params[PID].forEachEntry(updateParams); ! modifiers[PID].transformValues(backToZeros); // re-initialize to 0.0's } |
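The hunks above mechanically port the colt API to its GNU Trove equivalents: `OpenIntDoubleHashMap` becomes `TIntDoubleHashMap`, `forEachPair` becomes `forEachEntry`, `assign` becomes `transformValues`, `trimToSize` becomes `compact`, and the procedure callback method is renamed from `apply` to `execute`. A minimal, self-contained sketch of the same callback pattern using only JDK collections (the class and method names below are illustrative, not part of the maxent code base; trove itself would use primitive `int`/`double` keys and values to avoid boxing):

```java
import java.util.HashMap;
import java.util.Map;

public class PabiSketch {
    // Mirrors pabi[TID].transformValues(normalizePABI): divide every value
    // in the map by the sum of all values, so they form a distribution.
    // Map.replaceAll plays the role of trove's transformValues here.
    static void normalize(Map<Integer, Double> pabi) {
        double sum = pabi.values().stream().mapToDouble(Double::doubleValue).sum();
        if (sum > 0.0) {
            pabi.replaceAll((oid, v) -> v / sum);
        }
    }

    public static void main(String[] args) {
        Map<Integer, Double> pabi = new HashMap<>();
        pabi.put(0, 2.0);
        pabi.put(1, 6.0);
        normalize(pabi);
        System.out.println(pabi.get(0) + " " + pabi.get(1)); // 0.25 0.75
    }
}
```

The pattern matters for GIS because the per-token normalization runs once per event per iteration; trove's primitive-keyed maps and reusable procedure objects avoid allocating a boxed `Integer`/`Double` pair for every entry visited.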
From: Eric F. <er...@us...> - 2001-12-14 14:38:13

Update of /cvsroot/maxent/maxent In directory usw-pr-cvs1:/tmp/cvs-serv16092 Modified Files: Tag: no_colt build.xml Log Message: [note that this is a commit to a branch, not the HEAD] Removed all colt dependencies Removed colt Upgraded trove.jar to version 0.0.8 Index: build.xml =================================================================== RCS file: /cvsroot/maxent/maxent/build.xml,v retrieving revision 1.11 retrieving revision 1.11.2.1 diff -C2 -d -r1.11 -r1.11.2.1 *** build.xml 2001/11/22 15:07:05 1.11 --- build.xml 2001/12/14 14:38:10 1.11.2.1 *************** *** 42,46 **** <path id="build.classpath"> <pathelement location="${lib.dir}/java-getopt.jar"/> - <pathelement location="${lib.dir}/colt.jar"/> <pathelement location="${lib.dir}/trove.jar"/> </path> --- 42,45 ---- *************** *** 125,129 **** <pathelement path="${build.dir}/${name}-${DSTAMP}.jar"/> <pathelement location="${lib.dir}/java-getopt.jar"/> - <pathelement location="${lib.dir}/colt.jar"/> <pathelement location="${lib.dir}/trove.jar"/> </mergefiles> --- 124,127 ---- |
From: Jason B. <jas...@us...> - 2001-11-30 14:33:33
Update of /cvsroot/maxent/maxent/src/java/opennlp/maxent In directory usw-pr-cvs1:/tmp/cvs-serv27109/src/java/opennlp/maxent Modified Files: GISModel.java MaxentModel.java Log Message: Added getIndex() method to MaxentModel. Index: GISModel.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/GISModel.java,v retrieving revision 1.4 retrieving revision 1.5 diff -C2 -d -r1.4 -r1.5 *** GISModel.java 2001/11/15 16:18:40 1.4 --- GISModel.java 2001/11/30 14:33:28 1.5 *************** *** 174,177 **** --- 174,192 ---- } + /** + * Gets the index associated with the String name of the given outcome. + * + * @param outcome the String name of the outcome for which the + * index is desired + * @return the index if the given outcome label exists for this + * model, -1 if it does not. + **/ + public int getIndex (String outcome) { + for (int i=0; i<ocNames.length; i++) { + if (ocNames[i].equals(outcome)) + return i; + } + return -1; + } Index: MaxentModel.java =================================================================== RCS file: /cvsroot/maxent/maxent/src/java/opennlp/maxent/MaxentModel.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** MaxentModel.java 2001/11/06 15:03:49 1.2 --- MaxentModel.java 2001/11/30 14:33:29 1.3 *************** *** 10,14 **** // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ! // GNU General Public License for more details. // // You should have received a copy of the GNU Lesser General Public --- 10,14 ---- // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ! // GNU Lesser General Public License for more details. // // You should have received a copy of the GNU Lesser General Public *************** *** 23,27 **** * @author Jason Baldridge * @version $Revision$, $Date$ ! 
*/ public interface MaxentModel { --- 23,27 ---- * @author Jason Baldridge * @version $Revision$, $Date$ ! **/ public interface MaxentModel { *************** *** 34,38 **** * outcomes, all of which sum to 1. * ! */ public double[] eval (String[] context); --- 34,38 ---- * outcomes, all of which sum to 1. * ! **/ public double[] eval (String[] context); *************** *** 45,50 **** * method. * @return the String name of the best outcome ! * ! */ public String getBestOutcome (double[] outcomes); --- 45,49 ---- * method. * @return the String name of the best outcome ! **/ public String getBestOutcome (double[] outcomes); *************** *** 52,57 **** /** * Return a string matching all the outcome names with all the ! * probabilities produced by the <code>eval(String[] context)</code> ! * method. * * @param outcomes A <code>double[]</code> as returned by the --- 51,56 ---- /** * Return a string matching all the outcome names with all the ! * probabilities produced by the <code>eval(String[] ! * context)</code> method. * * @param outcomes A <code>double[]</code> as returned by the *************** *** 61,77 **** * probability (contained in the <code>double[] ocs</code>) * for each one. ! */ public String getAllOutcomes (double[] outcomes); /** ! * Gets the String name of the outcome associated with the index i. * * @param i the index for which the name of the associated outcome is * desired. * @return the String name of the outcome ! */ public String getOutcome (int i); public Object[] getDataStructures (); --- 60,93 ---- * probability (contained in the <code>double[] ocs</code>) * for each one. ! **/ public String getAllOutcomes (double[] outcomes); /** ! * Gets the String name of the outcome associated with the index ! * i. * * @param i the index for which the name of the associated outcome is * desired. * @return the String name of the outcome ! 
**/ public String getOutcome (int i); + + /** + * Gets the index associated with the String name of the given + * outcome. + * + * @param outcome the String name of the outcome for which the + * index is desired + * @return the index if the given outcome label exists for this + * model, -1 if it does not. + **/ + public int getIndex (String outcome); + + + /** + * Returns the data structures relevant to storing the model. + **/ public Object[] getDataStructures (); |
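The `getIndex()` contract added to `MaxentModel` in this commit is a linear scan over the outcome-name array, returning -1 when the label is unknown. A standalone sketch of that behavior (the wrapper class below is illustrative; in the real code the loop lives in `GISModel` over its `ocNames` field):

```java
public class OutcomeIndex {
    private final String[] ocNames;

    OutcomeIndex(String[] ocNames) {
        this.ocNames = ocNames;
    }

    // Mirrors GISModel.getIndex(String outcome): return the index of the
    // outcome label, or -1 if this model has no such outcome.
    public int getIndex(String outcome) {
        for (int i = 0; i < ocNames.length; i++) {
            if (ocNames[i].equals(outcome)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        OutcomeIndex m = new OutcomeIndex(new String[] {"NEG", "POS"});
        System.out.println(m.getIndex("POS")); // 1
        System.out.println(m.getIndex("???")); // -1
    }
}
```

The -1 sentinel lets callers map an outcome label to a column of the `double[]` returned by `eval()` without catching an exception for unseen labels; outcome sets are small, so the linear scan is a reasonable trade against keeping a second reverse-lookup map in the model.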
From: Gann B. <ga...@us...> - 2001-11-28 20:52:28
Update of /cvsroot/maxent/maxentc In directory usw-pr-cvs1:/tmp/cvs-serv13350 Modified Files: HashMap.h HashSet.h MaxentModel.h Log Message: Fixed some memory problems Index: HashMap.h =================================================================== RCS file: /cvsroot/maxent/maxentc/HashMap.h,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** HashMap.h 2001/08/09 18:29:49 1.2 --- HashMap.h 2001/11/28 20:52:25 1.3 *************** *** 78,82 **** ~HashMap() { delete set; } void put(K key, V value) { set->put(new HashMapEntry(HF::copy(key), HF::copyVal(value)), false); } ! int nTotalEntries() { return set->nTotalEntries; } bool bContains(const K key) { HashMapEntry n(key, NULL); return set->bContains(&n); } HashMapEntry* pGetNextEntry() { return set->pGetNextEntry(); } --- 78,82 ---- ~HashMap() { delete set; } void put(K key, V value) { set->put(new HashMapEntry(HF::copy(key), HF::copyVal(value)), false); } ! int nTotalEntries() { return set->nTotalEntries(); } bool bContains(const K key) { HashMapEntry n(key, NULL); return set->bContains(&n); } HashMapEntry* pGetNextEntry() { return set->pGetNextEntry(); } *************** *** 84,96 **** void Clear() { set->Clear(); } V get(K key) { ! HashMapEntry n(key, NULL); HashMapEntry* newn = set->get(&n); if(newn==NULL) ! return NULL; else return newn->value; } void remove(const K key) { ! HashMapEntry n(key, NULL); HashMapEntry* newn = set->get(&n); if(newn) --- 84,96 ---- void Clear() { set->Clear(); } V get(K key) { ! HashMapEntry n(key, 0); HashMapEntry* newn = set->get(&n); if(newn==NULL) ! return 0; else return newn->value; } void remove(const K key) { ! 
HashMapEntry n(key, 0); HashMapEntry* newn = set->get(&n); if(newn) *************** *** 109,112 **** --- 109,126 ---- typedef HashMap<char*, char*, szszHashMapFunctions> StringStringMap; + + class nnHashMapFunctions : public nHashFunctions { + public: + static int copyVal(int n) { return n; } + static void delVal(int n) { } + }; + typedef HashMap<int, int, nnHashMapFunctions> IntIntMap; + + template<class V> + class szgenHashMapFunctions : public szHashFunctions { + public: + static V copyVal(V n) { return n; } + static void delVal(V n) { } + }; #endif Index: HashSet.h =================================================================== RCS file: /cvsroot/maxent/maxentc/HashSet.h,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** HashSet.h 2001/08/09 18:29:49 1.2 --- HashSet.h 2001/11/28 20:52:25 1.3 *************** *** 24,27 **** --- 24,33 ---- const float MAX_LOAD = .75; + ////////////////////////////////////////////////////////////////////////////////// + // HashSet.h: A generic HashSet template + // + // + ////////////////////////////////////////////////////////////////////////////////// + #define HASH_SET_TEMPLATE template<class T, class HF> #define HASH_SET HashSet<T, HF> *************** *** 49,57 **** T put(const T e, bool bAlloc=true); void remove(const T e); ! int nTotalEntries() { return nEntries; } bool bContains(const T e); T get(T e); T pGetNextEntry(); void Reset() { m_nCurRow = -1; m_pCurEntry = NULL; } private: --- 55,65 ---- T put(const T e, bool bAlloc=true); void remove(const T e); ! int nTotalEntries() { return m_nEntries; } bool bContains(const T e); T get(T e); T pGetNextEntry(); void Reset() { m_nCurRow = -1; m_pCurEntry = NULL; } + int nGetSize() { return(m_nSize); } + int nGetEntries() { return(m_nEntries); } private: *************** *** 116,120 **** return m_pCurEntry->data; else ! return NULL; } --- 124,128 ---- return m_pCurEntry->data; else ! 
return 0; } *************** *** 148,153 **** void HASH_SET::push(const T e, int nPos) { hash_entry* new_entry = new hash_entry(e); ! hash_entry* temp = m_aEntries[nPos]; ! m_aEntries[nPos] = new_entry; new_entry->pNext = temp; m_nEntries++; --- 156,161 ---- void HASH_SET::push(const T e, int nPos) { hash_entry* new_entry = new hash_entry(e); ! hash_entry* temp = m_aEntries[nPos]; ! m_aEntries[nPos] = new_entry; new_entry->pNext = temp; m_nEntries++; *************** *** 173,180 **** // insert a new entry push(e, nPos); ! } else ! return(temp->pNext->data); ! } ! return(e); } --- 181,190 ---- // insert a new entry push(e, nPos); ! } else { ! if(bAlloc) HF::del(e); ! return(temp->pNext->data); ! } ! } ! return(e); } *************** *** 231,235 **** ////////////////////////////////////////////////////////////////////////////////// #include <math.h> ! #include <string> class szHashFunctions { public: --- 241,245 ---- ////////////////////////////////////////////////////////////////////////////////// #include <math.h> ! #include <string.h> class szHashFunctions { public: *************** *** 257,260 **** --- 267,272 ---- static bool bComp(const int a, const int b) { return a==b; } static int nHash(int e, int nSize){ + if(e < 0) + e*=-1; return e%nSize; } Index: MaxentModel.h =================================================================== RCS file: /cvsroot/maxent/maxentc/MaxentModel.h,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** MaxentModel.h 2001/05/01 10:17:26 1.3 --- MaxentModel.h 2001/11/28 20:52:25 1.4 *************** *** 29,32 **** --- 29,35 ---- public: + + virtual ~MaxentModel() {} + /** * Evaluates a context. |
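Among the memory and correctness fixes above, the `nHash` change guards against negative keys before the modulo (`if(e < 0) e*=-1;`). The same hazard exists in Java, where `%` takes the sign of the dividend, so a negative key would produce a negative bucket index. A small sketch of the bug and the fix (method names here are illustrative):

```java
public class BucketIndex {
    // Naive bucket computation: wrong for negative keys, because the
    // remainder operator preserves the dividend's sign.
    static int naiveBucket(int key, int nBuckets) {
        return key % nBuckets; // e.g. -7 % 4 == -3: an invalid array index
    }

    // The fix applied to nHashFunctions::nHash in HashSet.h: flip negative
    // keys before taking the modulo. (Integer.MIN_VALUE would still
    // overflow on negation, a corner case the C++ original shares.)
    static int safeBucket(int key, int nBuckets) {
        if (key < 0) {
            key = -key;
        }
        return key % nBuckets;
    }

    public static void main(String[] args) {
        System.out.println(naiveBucket(-7, 4)); // -3
        System.out.println(safeBucket(-7, 4));  // 3
    }
}
```

The other notable fix in the same commit addresses a leak in `HashSet::put`: when an equal entry already exists and the set had allocated a copy of the incoming element, that copy is now freed (`if(bAlloc) HF::del(e);`) instead of being silently dropped.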
From: Jason B. <jas...@us...> - 2001-11-27 17:11:44
Update of /cvsroot/maxent/maxent/lib In directory usw-pr-cvs1:/tmp/cvs-serv32100/lib Modified Files: LIBNOTES trove.jar Log Message: v0.0.8 of trove added. Index: LIBNOTES =================================================================== RCS file: /cvsroot/maxent/maxent/lib/LIBNOTES,v retrieving revision 1.4 retrieving revision 1.5 diff -C2 -d -r1.4 -r1.5 *** LIBNOTES 2001/11/26 10:14:55 1.4 --- LIBNOTES 2001/11/27 17:11:40 1.5 *************** *** 41,45 **** trove.jar ! GNU Trove, version 0.0.7 Homepage: http://trove4j.sf.net License: LGPL --- 41,45 ---- trove.jar ! GNU Trove, version 0.0.8 Homepage: http://trove4j.sf.net License: LGPL Index: trove.jar =================================================================== RCS file: /cvsroot/maxent/maxent/lib/trove.jar,v retrieving revision 1.4 retrieving revision 1.5 diff -C2 -d -r1.4 -r1.5 Binary files /tmp/cvsVXXmqI and /tmp/cvs2azBwi differ |
From: Jason B. <jas...@us...> - 2001-11-26 10:14:59
Update of /cvsroot/maxent/maxent/lib In directory usw-pr-cvs1:/tmp/cvs-serv28086/lib Modified Files: LIBNOTES trove.jar Log Message: Changed to version 0.0.7 of trove. Index: LIBNOTES =================================================================== RCS file: /cvsroot/maxent/maxent/lib/LIBNOTES,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** LIBNOTES 2001/11/20 17:06:19 1.3 --- LIBNOTES 2001/11/26 10:14:55 1.4 *************** *** 41,45 **** trove.jar ! GNU Trove, version 0.0.6 Homepage: http://trove4j.sf.net License: LGPL --- 41,45 ---- trove.jar ! GNU Trove, version 0.0.7 Homepage: http://trove4j.sf.net License: LGPL Index: trove.jar =================================================================== RCS file: /cvsroot/maxent/maxent/lib/trove.jar,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 Binary files /tmp/cvs5eIzJc and /tmp/cvsupDT8g differ |