#2356 Lilypond segfaults

Status: Verified
Owner: nobody
Priority: Critical
Updated: 2015-09-19
Created: 2012-02-26
Creator: Anonymous
Private: No

Originally created by: PhilEHol...@googlemail.com

There have been two reports of LilyPond segfaulting on complex scores.  Jay Anderson's report is here: http://lists.gnu.org/archive/html/bug-lilypond/2011-12/msg01167.html and Hu Haipeng's is here: http://old.nabble.com/problem-still-exists-in-the-latest-2.15-version-td33382315.html

Haipeng's complex score crashes LilyPond on my Vista 64-bit machine after 18 seconds of compilation time, using about 350 MB of memory, which is nowhere near the machine's limit.

Neil suggested adding

\context {
   \GrandStaff
   \remove "Span_bar_stub_engraver"
}

to the score's layout block, but that does not fix the crash.

Haipeng says this does not crash on a previous version of LilyPond, but I can't check this because the file provided uses new syntax that older versions don't accept.

Related

Issues: #2272
Issues: #2356

Discussion

  • Google Importer

    Google Importer - 2012-02-27

    Originally posted by: n.putt...@gmail.com

    Sorry, that was the wrong suggestion.  It's actually the Span_bar_stub_engraver in the StaffGroup context which causes the crash.

    Adding

    \context {
       \StaffGroup
       \remove "Span_bar_stub_engraver"
    }

    works for me.

    The first bad commit is the one which adds the Span_bar_stub_engraver: [r20670d51f8d97fd390210dd239b3b2427f071e7c]

    It produces a different segfault though, since there was another bug shadowing the current one in Grob::get_vertical_axis_group () (which Mike fixed recently with [r70fd22ce9b84f9d3c1d44ffd79baafd370a389fb]).  That said, with the previous commit Haipeng's file seems to enter an infinite loop on my system, and Jay's has an assertion failure related to TupletNumber offsets.

     
  • Google Importer

    Google Importer - 2012-03-01

    Originally posted by: dak@gnu.org

    Mike, any pointers of where to look here regarding the Span_bar_stub_engraver?  The commit is rather humongous.

    Cc: mts...@gmail.com

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: mts...@gmail.com

    Valgrinding it gives different results every time - sometimes it segfaults in the interpreting stage, sometimes it makes it through to a later stage.

    I know that span-bar-stub-engraver.cc explicitly checks for null pointers, but I have a feeling that this is not necessarily done in the functions it uses.  Another problem may be that it is working with dead grobs.

    What would really help is a minimal example, as it is difficult to isolate the problem with such a large score.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: mts...@gmail.com

    More strangeness - I am getting a consistent segfault in scm-hash.cc.

    Scheme_hash_table::try_retrieve (SCM k, SCM *v)

    It's on the line:

    SCM handle = scm_hashq_get_handle (hash_tab_, k);

    This doesn't seem like the type of thing that'd usually crash.  I'm guessing that something got garbage collected that shouldn't have.  But it seems like the mark_smob method in context protects all of its scheme variables (save daddy_context_, but when I include it in mark_smob that doesn't change anything).

    What's difficult, as I said before, is not having a small example.  Trying to debug with printfs on something like this is near-impossible.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: dak@gnu.org

    SCM *v is a recipe for trouble with regard to garbage collection.  It is not something that the Scheme garbage collector sees as Scheme.  So the variable under it needs to be protected separately.  I'll look some more.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: dak@gnu.org

    Oh, and Mike?  In my last segfault hunt, I remarked:

    Anyway, one thing that has been useful is figuring out "target record" in gdb, which lets you step backwards from a segfault.  Since various other optimizations made the stack backtrace less than useful (because of tail jump optimizations, the bad function is not actually present in the backtrace), this was quite helpful.

    Could be useful for stepping backwards from your segfault to the actual cause.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: mts...@gmail.com

    I read a bit on the gdb website about this but I'm not quite sure how it works.
    If I have the file foo.ly that I want to compile with LilyPond, what would I need to do to target record and then step backwards?

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: dak@gnu.org

    Well, you do something like
    gdb out/bin/lilypond
    break some_subroutine_likely_called_not_all_too_much_before_the_segfault
    run foo.ly
    target record
    continue
    [wait for a long time]
    [segfault occurs]
    reverse-step
    [repeat until you get somewhere where the data and debugging make sense again]

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: mts...@gmail.com

    My computer doesn't like record :(

    Process record doesn't support instruction 0xfef at address 0x699778.
    Process record: failed to record execution log.

    [Thread 0xb7fe76d0 (LWP 1985)] #1 stopped.
    __strlen_sse2 () at ../sysdeps/i386/i686/multiarch/strlen.S:75
    75    ../sysdeps/i386/i686/multiarch/strlen.S: No such file or directory.
        in ../sysdeps/i386/i686/multiarch/strlen.S

    Is anyone else able to do this with Haipeng's score?

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: n.putt...@gmail.com

    I can take a look after dinner.

    Jay's score is much easier to work with - segfaults almost immediately here.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: n.putt...@gmail.com

    I can't find a useful breakpoint unfortunately; getting too many continues.  Jay's score segfaults in the 104th bar.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: dak@gnu.org

    Something like

    \applyContext #tanh

    before the crash might help.  Put a breakpoint on tanh, change to the
    target record and then just do

    return

    before tanh discovers it has been taken for a ride.  I doubt it gets
    used in LilyPond for anything else.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: n.putt...@gmail.com

    There's a pair of cresc/decresc hairpins in the horn part in bars 103-104.  If I remove both dynamics, compilation continues.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: dak@gnu.org

    Regarding comment 12: the breakpoint will need to be on scm_tanh with that kind of call.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: n.putt...@gmail.com

    I tried

    \applyContext #atanh

    since Guile uses the standard maths library for tanh, but I can't seem to trigger the breakpoint:

    Interpreting music... [8][16][24][32][40][48][56][64][72][80][88][96]<unnamed port>: In procedure + in expression (+ 1 z):
    <unnamed port>: Wrong type: #<Context Voice () >
    [Inferior 1 (process 10660) exited with code 01]

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: dak@gnu.org

    Breakpoint on scm_atanh?  Obviously, Scheme would not let this through to the real math routine.

     
  • Google Importer

    Google Importer - 2012-03-02

    Originally posted by: mts...@gmail.com

    I know it's kludgy, but you could set a property for the BarLine in that measure to something like:

    \override BarLine #'foo = ##t

    Then, in span_bar_stub_engraver, have a line that checks for this property and, if it is set, calls some exotic function (like scm_tanh or whatever).
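    A rough sketch of that hack in C++ - the helper name, the 'foo' property and the place it would be called from inside span-bar-stub-engraver.cc are all placeholders, not existing code:

    #include "grob.hh"       // LilyPond's Grob and its property accessors
    #include <cmath>         // std::tanh

    // Debugging aid only: when the engraver sees the BarLine carrying the
    // marker property, call an easily breakpointable libm function so gdb
    // can be switched to "target record" just before the crash.
    static void
    maybe_trap_marked_bar (Grob *bar)
    {
      if (scm_is_true (bar->get_property ("foo")))
        {
          volatile double x = 0.5;   // volatile keeps the call from being folded away
          (void) std::tanh (x);      // gdb: break tanh, then "target record", continue
        }
    }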

     
  • Google Importer

    Google Importer - 2012-03-03

    Originally posted by: mts...@gmail.com

    I'm marking this as critical just because it is a regression, and I don't think a stable release should go out that causes this sort of problem.  Any luck with gdb?

    Labels: -Type-Crash Type-Critical

     
  • Google Importer

    Google Importer - 2012-03-03

    Originally posted by: mts...@gmail.com

    Protects contexts in Span_bar_stub_engraver

    http://codereview.appspot.com/5727050

    Labels: Patch-new

     
  • Google Importer

    Google Importer - 2012-03-03

    Originally posted by: mts...@gmail.com

    Hey all,

    I don't have time to run regtests on the proposed patch, so my apologies if it doesn't work.  Both of the problematic files make it through compilation with this fix, although I have no clue how/why it does what it does, or whether there is a better/safer/smarter way to do it.

     
  • Google Importer

    Google Importer - 2012-03-03

    Originally posted by: dak@gnu.org

    "protecting" is an operation with permanent performance and memory impact during the life time of protection, so it is a bad idea to do it in situations where the pairing is not guaranteed (like in constructor/destructor pairs).  If contexts need to be kept alive for some engraver or other entity, the way to do that is to mark them during the gc mark phase.

    So this "fix" definitely looks wrong.  If you can trace the problem to premature collection of a context, that is where we need to look.
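    The guaranteed pairing meant above, as a small C++ sketch (a hypothetical RAII guard, not existing LilyPond code): protection acquired in the constructor is always released in the destructor, so the cost of protection cannot outlive the object that needed it.

    #include <libguile.h>   // SCM, scm_gc_protect_object, scm_gc_unprotect_object

    // Hypothetical RAII guard: scm_gc_protect_object in the constructor is
    // always paired with scm_gc_unprotect_object in the destructor.
    class Scm_protect_guard
    {
      SCM obj_;
    public:
      explicit Scm_protect_guard (SCM obj) : obj_ (obj)
      {
        scm_gc_protect_object (obj_);
      }
      ~Scm_protect_guard ()
      {
        scm_gc_unprotect_object (obj_);
      }
    private:
      // non-copyable: a copy would unbalance the protect/unprotect pairing
      Scm_protect_guard (Scm_protect_guard const &);
      Scm_protect_guard &operator= (Scm_protect_guard const &);
    };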

     
  • Google Importer

    Google Importer - 2012-03-03

    Originally posted by: dak@gnu.org

    Mike, we have the following:
    class Span_bar_stub_engraver : public Engraver
    {
      vector<Grob *> spanbars_;
      map<Grob *, Context *> axis_groups_;

    _None_ of all that, as far as I can see, is getting marked _anywhere_.  This is a garbage collection disaster waiting to happen.  Wait, it already happens.  Which is what this issue is about.

    Now one can mark all this, sure.  But walking through a map is effort.  Is there a reason you are using a C++ map here instead of a Scheme hashtable?  A Scheme hashtable only needs to get marked on its own and will keep its contents alive (or, if it is a weak hashtable, deal with their demise on its own).

    If you don't want to rewrite things, just create a derived_mark member function (it is called from translator.cc as a virtual function) for your engraver, and let it call scm_gc_mark on all values in your map.

    That's the way to do this sort of protection thing.
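    A minimal sketch of such a derived_mark, assuming the C++ map above is kept as-is (illustrative only, not a reviewed patch; the definition would go into span-bar-stub-engraver.cc with a matching declaration in the class body):

    // During the GC mark phase, mark every Context stored in axis_groups_ so
    // the garbage collector keeps those contexts alive for as long as the
    // engraver itself is.  derived_mark () is the virtual hook called from
    // translator.cc when the translator smob is marked.
    void
    Span_bar_stub_engraver::derived_mark () const
    {
      for (map<Grob *, Context *>::const_iterator i = axis_groups_.begin ();
           i != axis_groups_.end (); ++i)
        scm_gc_mark (i->second->self_scm ());
    }

    With a Scheme hashtable in place of the C++ map, derived_mark would only need to mark the table itself.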

    Labels: -Patch-new Patch-needs_work

     
  • Google Importer

    Google Importer - 2012-03-03

    Originally posted by: dak@gnu.org

    Issue 2356: Segfault in spanbars.

    This protects the contexts in the internal data structure
    axis_groups_.  Incidentally, it looks like this data structure grows
    indefinitely and is never cleaned up again.  What's up with _that_?

    http://codereview.appspot.com/5732054

    Labels: Patch-new

     

  • Google Importer

    Google Importer - 2012-03-03

    Originally posted by: dak@gnu.org

    And the whole process_acknowledged is one steaming heap of undocumented incomprehensible contorted...

    And I repeat: where is axis_groups_ ever cleared out again?

     
  • Google Importer

    Google Importer - 2012-03-03

    Originally posted by: dak@gnu.org

    Patchy the autobot says: LGTM.  Passes basic tests on a 32-bit system (like before the fix).  Feedback from 64-bit testers is required.  And due to the total lack of documentation, including what this engraver is actually supposed to do in detail, the original author (namely Mike) should check whether it is intended that axis_groups_ is never cleared out again.

    Labels: Patch-review

     