Segmentation Fault

Retrieval
A K
2011-09-14
2012-09-27
  • A K

    A K - 2011-09-14

    I am using Lemur 4.12 on a Fedora system. I am getting a segmentation fault
    when I try to run "makeprior".

    1. I have a shell script to embed priors for number of terms: called "makep.sh"

    It is of the following format:

    OBJ=/path-to-lemur/lemur-4.12/app/obj
    $OBJ/makeprior prior_param_A
    $OBJ/makeprior prior_param_B
    $OBJ/makeprior prior_param_C
    ..
    ...

    The prior_param_A looks like:

    <parameters>
    <index>/path-to-index/indri_index</index>
    <name>A</name>
    <input>/path-to-priors/priors/A</input>
    <memory>500M</memory>
    </parameters>

    When I run makep.sh, I get errors of the following form:

    /makep.sh: line 2: 24956 Segmentation fault (core dumped) $OBJ/makeprior
    param_prior/prior_param_A
    ./makep.sh: line 3: 24959 Segmentation fault (core dumped) $OBJ/makeprior
    param_prior/prior_param_B
    ./makep.sh: line 4: 24961 Segmentation fault (core dumped) $OBJ/makeprior
    param_prior/prior_param_C
    ./makep2.sh: line 5: 24963 Segmentation fault (core dumped) $OBJ/makeprior
    param_prior/prior_param_D
    ../src/IndriFile.cpp(75): Couldn't create: /path-to-index/indri_index/prior/E

    The folder in /path-to-index/indri_index/prior/ remains unchanged. (I do have
    500M memory)

    gdb core.24956 tells me this:

    On Failed to read a valid object file image from memory.
    Core was generated by `/path-to-lemur/lemur-4.12/app/obj/makeprior prior_'.
    Program terminated with signal 11, Segmentation fault.

    #0  0x0000003faf865c54 in ?? ()

    Looking into makeprior.cpp in /lemur-4.12/app/src (by inserting a few print
    statements) I think the following function does not work:

    (Line 276)

    std::vector<lemur::api::DOCID_T> result = env.documentIDsFromMetadata(
    docnoName, docnoValues );

    2. When I tried to insert just 1 document into the prior from 1 sample prior file, I got the following error:

    Bad file in open_segment
    Bad file in open_segment
    Bad file in read_page
    Bad file in read_page
    Bad file in read_page
    Bad file in read_page
    Segmentation fault (core dumped)

    Question:

    Most likely I am making a mistake somewhere, or there is some problem with
    file access. I would be grateful for any hint on what I should try to fix
    or what I should look into.

    Let me know if you require more information.

    Many thanks in advance,

    AK

     
  • David Fisher

    David Fisher - 2011-09-14

    makeprior does not perform conversion on the memory parameter; its value must
    be a number of bytes. The default value of 50MB is sufficient for most
    applications.

    Try removing that parameter, and also verify that the path in the index
    parameter actually points to a valid indri repository (e.g., check that you
    can use IndriRunQuery to retrieve documents from it). The error that occurs
    in the second case makes it seem likely that your indri repository is
    corrupted in some fashion, or that you don't have adequate permissions to
    read and write it.
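
    The permissions half of that advice can be checked from the shell before
    running makeprior. The following is only a minimal sketch: check_repo is a
    hypothetical helper name, the INDEX variable is an assumption, and the path
    is the placeholder used in this thread. It does not detect a corrupted
    repository; for that, running a query against the index is still needed.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory exists and can be read,
# written, and entered by the current user. It says nothing about whether
# the directory holds a valid (uncorrupted) indri repository.
check_repo() {
    repo="$1"
    if [ ! -d "$repo" ]; then
        echo "missing: $repo"
        return 1
    fi
    if [ ! -r "$repo" ] || [ ! -w "$repo" ] || [ ! -x "$repo" ]; then
        echo "no access: $repo"
        return 1
    fi
    echo "ok: $repo"
}

# Placeholder path from this thread; set INDEX to the real repository.
check_repo "${INDEX:-/path-to-index/indri_index}" || true
```

    The same check can be pointed at the prior subdirectory of the repository,
    since that is where the "Couldn't create" message above was reported.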

     
  • David Fisher

    David Fisher - 2011-09-14

    I missed reading this:

        1. I have a shell script to embed priors for number of terms: called "makep.sh"

    If you mean that the entries in your priors file consist of terms and log
    probabilities, then your data is in error.

    Referring to the makeprior documentation,
    http://lemur.sourceforge.net/indri/makeprior.html, the data must be of the
    form:

    document-id log-probability

    Separately, I misread the makeprior source: the memory parameter may use the
    M/G suffix on the numbers, so there is nothing wrong with that parameter.
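
    A quick way to sanity-check a priors file against the
    "document-id log-probability" form is sketched here. check_priors is a
    hypothetical helper, not part of Lemur or Indri, and the rule that the
    value must not exceed 0 is an assumption based on it being the log of a
    probability rather than something makeprior is known to enforce.

```shell
#!/bin/sh
# Hypothetical checker: every non-blank line must have exactly two
# whitespace-separated fields, and the second must be a number <= 0.
# Exits non-zero if any line fails.
# Usage: check_priors /path-to-priors/priors/A
check_priors() {
    awk '
        BEGIN { bad = 0 }
        NF == 0 { next }    # ignore blank lines
        NF != 2 {
            printf "line %d: expected 2 fields, got %d\n", NR, NF
            bad = 1
            next
        }
        $2 !~ /^-?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$/ {
            printf "line %d: not a number: %s\n", NR, $2
            bad = 1
            next
        }
        $2 + 0 > 0 {
            printf "line %d: log probability above 0: %s\n", NR, $2
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

    Lines holding only a term, or a term and a count, would be reported by
    this check, which is the kind of data error described above.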

     
  • A K

    A K - 2011-09-14

    My prior files are indeed of the format:

    "document-id logprobability"

    The prior_param_A looks like:

    <parameters>
    <index>/path-to-index/indri_index</index>
    <name>A</name>
    <input>/path-to-priors/priors/A</input>
    <memory>500M</memory>
    </parameters>

    Here "A" is a file of "document-id logprobability". What could be wrong? Is it
    a file access or permissions issue?

    Thank You,

    AK

     
  • A K

    A K - 2011-09-15

    You were right. The repository seemed to be corrupted. I rebuilt the index and
    for now everything seems to be working again. Thanks for that!
    Is there a limit to how many priors can be embedded into the index? Can I put
    more than 1000 priors? What does this depend on?

     
  • David Fisher

    David Fisher - 2011-09-15

    Each prior that you add adds a file to the index. All prior files are opened
    once when the repository is opened. Most of the supported OSes have a hard
    limit on the number of open files; OS X and Linux each use 1024. There are
    roughly 30 open files for a given repository, so the maximum number of priors
    is somewhere less than 1000.
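
    The arithmetic above can be repeated against the limit of the current
    shell. This is a rough sketch only: max_priors is a hypothetical helper,
    and the overhead of 30 files is the approximate figure quoted above, not a
    measured value.

```shell
#!/bin/sh
# Hypothetical helper: subtract the repository's own open files from the
# per-process open-file limit to estimate how many prior files fit.
max_priors() {
    limit="$1"            # per-process open-file limit
    overhead="${2:-30}"   # approximate files a repository keeps open
    if [ "$limit" = "unlimited" ]; then
        echo "unlimited"
    else
        echo $((limit - overhead))
    fi
}

# ulimit -n reports the current limit for this shell.
echo "open-file limit: $(ulimit -n)"
echo "approximate room for priors: $(max_priors "$(ulimit -n)")"
```

    With a limit of 1024 this gives 994, in line with "somewhere less than
    1000".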

    Separately, if you are trying to model combinations of 1000s of priors to
    effect change in the retrieval, you would probably do better to use fewer
    priors consisting of aggregates of the larger set.

     
