tadm problem with a large data set

  • Francis Bond - 2007-01-16

    G'day,

    We are trying to run tadm/evaluate (with a patch from Stephan Oepen) on a fairly large data set (875809 features).  The data is enhanced semantic dependencies for parse ranking with HPSG.

    On a 32-bit machine, we can't even load the event file without running out of memory (I have added the error message at the end).

    On a 64-bit machine, it runs and produces a parameter file, but when we try to evaluate a test set, evaluate core dumps.

    In fact we have three files (train, devel and test).  Training on test or devel produces parameter files that we can use with evaluate on test, devel or train, so we think the event files themselves are well formed.

    Has anyone seen behavior like this before?  If not, any ideas as to what might be causing it?
    We would be happy to provide the data sets for debugging; they are around 80 MB in all.

    Francis Bond, NICT

    P.S. Error message on the 32-bit machine:
    bond@knut:~/sprg$ tadm -events_in trains_df.eve.gz -params_out para_df-smooth.par -monitor -fatol 1e-32 -frtol 1e-7 -variances variances -malloc_log

    Maximum Entropy Parameter Estimation
      version 0.99.5 (07 August 2005)

    Start: Tue Jan 16 23:34:19 2007

    Events in  = trains_df.eve.gz
    --------------------------------------------------------------------------
    Petsc Release Version 2.3.0, Patch 44, April, 26, 2005
    See docs/changes/index.html for recent updates.
    See docs/faq.html for hints about trouble shooting.
    See docs/index.html for manual pages.
    -----------------------------------------------------------------------
    /home/bond/logon/sdsu/bin/linux.x86.32/tadm on a linux-gnu named knut by bond Tue Jan 16 23:34:19 2007
    Libraries linked from /home/oe/src/logon/anl/petsc-2.3.0/lib/linux-gnu
    Configure run at Sat Feb 11 14:53:06 2006
    Configure options --download-f-blas-lapack=1 --with-default-optimization=O --with-mpi=0 --with-dynamic=0 --with-clanguage=C++ --with-shared=0
    -----------------------------------------------------------------------
    [0]PETSC ERROR: PetscMallocAlign() line 62 in src/sys/src/memory/mal.c
    [0]PETSC ERROR: Out of memory. This could be due to allocating
    [0]PETSC ERROR: too large an object or bleeding by not properly
    [0]PETSC ERROR: destroying unneeded objects.
    [0] Maximum memory PetscMalloc()ed 1213346708 maximum size of entire process 0
    [0] Memory usage sorted by function
    [0] 14060 ClassPerfLogCreate()
    [0] 812 ClassRegLogCreate()
    [0] 1213296340 Dataset::readEvents()
    [0] 30060 EventPerfLogCreate()
    [0] 812 EventRegLogCreate()
    [0] 720 PetscFListAdd()
    [0] 32 PetscPushSignalHandler()
    [0] 244 PetscStackCreate()
    [0] 2336 PetscStrallocpy()
    [0] 524 StackCreate()
    [0] 788 StageLogCreate()
    [0]PETSC ERROR: Memory requested 2402076528!
    [0]PETSC ERROR: PetscTrMallocDefault() line 188 in src/sys/src/memory/mtr.c
    [0]PETSC ERROR: Dataset::readEvents() line 248 in dataset.cc
    [0]PETSC ERROR: VecDuplicate() line 1586 in src/vec/interface/vector.c
    [0]PETSC ERROR: Invalid argument!
    [0]PETSC ERROR: Wrong type of object: Parameter # 1!
    [0]PETSC ERROR: initializeDataset() line 563 in dataset.cc
    Params out = para_df-smooth.par
    [0]PETSC ERROR: VecDuplicate() line 1586 in src/vec/interface/vector.c
    [0]PETSC ERROR: Null argument, when expecting valid pointer!
    [0]PETSC ERROR: Null Object: Parameter # 1!
    [0]PETSC ERROR: VecSet() line 945 in src/vec/interface/vector.c
    [0]PETSC ERROR: Invalid argument!
    [0]PETSC ERROR: Wrong type of object: Parameter # 1!
    Segmentation fault (core dumped)
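
    P.P.S. The log itself seems to explain the 32-bit failure: Dataset::readEvents() has already allocated about 1.2 GB when it requests another 2.4 GB, and the two together exceed what a single 32-bit process can address.  A back-of-the-envelope check (the ~3 GB user-space limit below is an assumption about a typical 32-bit Linux kernel, not something from the log):

        #include <cstdint>
        #include <cstdio>

        int main() {
            // Figures copied from the PETSc log above.
            const std::uint64_t alreadyAllocated = 1213296340ULL; // Dataset::readEvents()
            const std::uint64_t requested        = 2402076528ULL; // "Memory requested"
            // Roughly 3 GiB of user address space on a typical 32-bit Linux
            // kernel (an assumption; the exact split is a kernel config option).
            const std::uint64_t limit = 3ULL << 30;

            const std::uint64_t total = alreadyAllocated + requested; // ~3.4 GiB
            std::printf("needed %.2f GiB, 32-bit limit ~%.2f GiB\n",
                        total / double(1ULL << 30), limit / double(1ULL << 30));
            return total > limit; // non-zero exit: cannot fit in a 32-bit process
        }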

     
    • Jason Baldridge - 2007-01-26

      How much memory do your machines have? I've trained models with 1M features on a 2GB machine. I ran into a similar cap a while back, and had to scale back the number of features I used in order for it to work. Now I have bigger machines, and things have been just fine.

      I wonder if updating to the latest PETSc and TAO would help out here. I won't have a chance to do that for a while, but you might be able to give it a try. There is usually a small amount of function renaming needed to get TADM to compile again.

      BTW, what's in Stephan's patch?

      Jason

       
    • Miles Osborne - 2007-02-14

      Another possible reason is that you have too many active features on each data point.

      How many features does a data point typically have?

      This may cause problems for "evaluate", as it has a buffer-size cap, if I recall correctly.
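
      A hypothetical sketch of that failure mode, assuming evaluate reads each event line into a fixed-size buffer (the buffer size and parsing loop below are illustrative, not the actual evaluate.cc code):

          #include <cstdio>

          int main() {
              // Hypothetical fixed cap on the length of one event line.
              enum { BUFSIZE = 65536 };
              char line[BUFSIZE];

              FILE* f = std::fopen("test.eve", "r");
              if (!f) return 1;
              // fgets() silently truncates lines longer than BUFSIZE - 1
              // characters; an event with very many feature:value pairs can
              // exceed that, and the leftover tail of the line is then
              // misparsed as the start of the next event.
              while (std::fgets(line, BUFSIZE, f)) {
                  // ... parse frequency, feature count, feature/value pairs ...
              }
              std::fclose(f);
              return 0;
          }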

      Miles

       
    • Erik Velldal - 2007-06-28

      Jason: the patch is just a simple check in `evaluate.cc' to ensure that the indexes of the features we read from the event file are not out of range with respect to the parameter vector.
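
      In code terms it amounts to something like the guard below (a sketch only; the function and variable names are not the ones in the actual patch):

          #include <cstdio>
          #include <cstdlib>
          #include <vector>

          // Reject feature indexes from the event file that fall outside the
          // parameter vector, instead of silently indexing past its end (the
          // likely source of the segfault reported above).
          double checkedWeight(const std::vector<double>& params, long index) {
              if (index < 0 || index >= static_cast<long>(params.size())) {
                  std::fprintf(stderr,
                               "evaluate: feature index %ld out of range [0, %zu)\n",
                               index, params.size());
                  std::exit(1);
              }
              return params[index];
          }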

      -erik

       
