Originally created by: *anonymous
Originally created by: v.villenave
The following code (which basically amounts to what showLastLength does, as inspired by issue 1173) produces a segfault:
\version "2.13.36"
<<
{
\time 3/4
\set Score.skipBars = ##t
% permuting the two following lines fixes the problem.
a4 a a
[r2].*2
}
\\
{
\set Score.skipTypesetting = ##t
s2. s4
\set Score.skipTypesetting = ##f
}
>>
This looks like a serious problem, and probably requires to modify the Timing translator (possibly introducing a new property) or the Score engraver.
Originally posted by: percival.music.ca@gmail.com
why a new property? Can't you have a check in the Timing translator that says "if Score.skipBars is set, give the user an error message and then quit" ?
The goal is not to produce output. The goal is just to quit gracefully.
Originally posted by: pnorcks@gmail.com
I don't know how to fix this right now, but I'll attach the backtrace for the segfault.
Originally posted by: percival.music.ca@gmail.com
Here's a not-serious patch to avoid the crash. Note that I haven't used the proper way to quit, and probably shouldn't be using programmer_error() anyway.
The point of this is just to move development to the next stage:
- either figure out why centered_on_object(NULL, a) is called with a null pointer, or
- decide we don't care, and just write a nice "bailing" message (and quit with the appropriate function, which is probably not exit(1)).
diff --git a/lily/self-alignment-interface.cc b/lily/self-alignment-interface.cc
index e339e80..615c732 100644
--- a/lily/self-alignment-interface.cc
+++ b/lily/self-alignment-interface.cc
@@ -66,6 +66,11 @@ Self_alignment_interface::aligned_on_self (Grob *me, Axis a,
SCM
Self_alignment_interface::centered_on_object (Grob *him, Axis a)
{
+ if (him == NULL) {
+ programming_error ("Do not have an object to center on!");
+ programming_error ("(quitting, despite the next message you see. :)");
+ exit (1);
+ }
return scm_from_double (robust_relative_extent (him, him, a).center ());
}
Originally posted by: percival.music.ca@gmail.com
Another option would be to add a null pointer check to
lily/grob.cc
line 786
robust_relative_extent (Grob *me, Grob *refpoint, Axis a)
and return an empty Interval if me == 0. Given the "robust" name, it might make sense to put the check there, rather than self_alignment_interface.
Originally posted by: percival.music.ca@gmail.com
A bit more data: in the crash-case, grob-property.cc internal_get_property() calls internal_get_property_data () with minimum-X-extent, and receives the answer of
()
I'm guessing that at some other place, it tries to add up these values, and can't handle adding 0.0 and ().
I see two options:
1) make
Grob::internal_get_property_data (SCM sym) const
not return a () when looking up minimum-X-extent in this case.
2) robustify the addition such that it automatically treats () as a 0.
I'm guessing that #1 is the preferable option, but I have no real feeling for this stuff. I won't be working on this for another week, so if anybody else wants to jump in now, go right ahead.
Originally posted by: n.putt...@gmail.com
> A bit more data: in the crash-case, grob-property.cc internal_get_property() calls internal_get_property_data () with minimum-X-extent, and receives the answer of
()
Are you sure that's the root cause though? It looks to me like the MultiMeasureRest has invalid bounds when cloned, so the cloned object is NULL when the MultiMeasureRestNumber is aligned (the parent object no longer exists).
I can see several workarounds for this, but ideally the MutiMeasureRestNumber would be suicided if the parent object's invalid. Unfortunately, it doesn't look as if this is possible in the engraver (at that point the original rest appears to have valid bounds).
Originally posted by: percival.music.ca@gmail.com
Depends what you mean by "root cause". :)
There's two ways of debugging this kind of thing: the intelligent way, and the stupid way.
Intelligent: think about the lilypond internals, the data structures, the architecture, etc. Figure out which object/engraver/translator is misbehaving.
Stupid: this segfault comes from trying to use a NULL pointer. So figure out where that NULL pointer is coming from, by tracing back through the backtrace, ading printf() statements where appropriate, and comparing the text output from a successful program run (with a slightly modified input file), and the segfault program run.
You're thinking about it in the intelligent way. I've been working through it in the stupid way. Partly because I have relatively little knowledge of our internals, but partly because I've been working on this when my brain is fried from other stuff.
Originally posted by: percival.music.ca@gmail.com
For the sake of shared investigation, here's a patch showing how I reached the conclusion I did, and two files (bad.ly is the initial report; good.ly is the tweaked version, so that I can compare the output in the good and bad cases).
Originally posted by: Carl.D.S...@gmail.com
Here's a patch that seems to avoid the segfault in what I think is a reasonable way. It checks for a null grob to align on and returns an offset of zero if it's found, along with setting a programming error.
diff --git a/lily/self-alignment-interface.cc b/lily/self-alignment-interface.cc
index e339e80..23a89a0 100644
--- a/lily/self-alignment-interface.cc
+++ b/lily/self-alignment-interface.cc
@@ -66,6 +66,11 @@ Self_alignment_interface::aligned_on_self (Grob *me, Axis a, bool
SCM
Self_alignment_interface::centered_on_object (Grob *him, Axis a)
{
+ if (!him)
+ {
+ programming_error ("cannot center: null grob");
+ return scm_from_int (0);
+ }
return scm_from_double (robust_relative_extent (him, him, a).center ());
}
It's probably not as "correct" as suiciding the part of the grob that goes away with the earlier programming error, but I think it fixes the critical issue.
Patch is also available on Rietveld:
http://codereview.appspot.com/3308043
Originally posted by: percival.music.ca@gmail.com
Other than my use of exit(0) and printf, that's the same as my patch in comment 3 on 1 Nov.
I'm not saying this to be snarky, or to object to this patch, but rather to highlight our inefficient use of resources. If this is an acceptable way to fix lilypond, and if we'd had a few more people discussing it, then this critical issue could have been closed almost a month ago.
As far as the patch goes, it looks fine to me -- but I'm notably ignorant about the internals and our general "architecture style".
Originally posted by: v.villenave
I may be completely wrong, but is there a particular reason why the MultiMeasureRest has to be printed as a Spanner *at all*? I mean, it's not breakable, it isn't really extended horizontally, it doesn't have a left or right bound...
Perhaps implementing it differently could also help us solve syntax inconsistencies such as [r1]\fermata vs [r1]-\fermataMarkup? At any cost, it would certainly solve this specific issue.
(I know, this isn't very helpful. :-)
Originally posted by: v.villenave
Carl has been investigating a bit more: see the discussion on http://www.mail-archive.com/lilypond-devel@gnu.org/msg32506.html
Originally posted by: v.villenave
(Answering my own suggestion: that's silly, I didn't remember that MultiMeasureRest could print something else than church rests. Of course we need a Spanner.)
(Sorry for the noise. Have a nice game of volleyball :)
Originally posted by: Carl.D.S...@gmail.com
Re Graham (comment 10):
No snarkiness taken.
Neil's earlier comment indicated this might not be the right way to solve things, so I was waiting for him.
After our recent discussion, I decided to play volleyball, instead of waiting. So I put together a patch that used appropriate lilypond functions and would allow the layout to continue to completion, albeit with an error message. It's probably not optimal, but it's the best I've been able to come up with so far.
In Neil's comment on the -devel thread,
<http://thread.gmane.org/gmane.comp.gnu.lilypond.devel/32472>
he proposed another possible idea:
> See spanner.cc, lines 113-121.
>
> If this code doesn't exit the loop early, the crash goes away (since
> it allows the cloned spanner to exist outside its parent's bounds).
> Cheers,
> Neil
Perhaps we could bracket the continue statement in a test for Score.SkipBars, and only execute it if SkipBars were false.
Originally posted by: Carl.D.S...@gmail.com
OK, so I've commented out the continue statement on line 120 of lily/spanner.cc
I get the output shown in 1336-no-continue.png. The multimeasure rest shows up before the time signature with the multimeasure rest number above it.
If I leave the continue in, and the patch I proposed above, I get the output of 1336-continue.png. The multimeasure rest is gone, and the multimeasure rest number shows up at the left hand margin of the page.
I'm not sure which behavior I prefer. I'd really prefer to suicide the multimeasure rest number, but I haven't been able to work out how to do that yet.
Originally posted by: Carl.D.S...@gmail.com
I tried following up in ly/grob.cc where the offset calls start from. At the time of the call, the parent is not null. But at the time of the delayed evaluation, it apparently is null. I assume this is what Neil meant when he said it wasn't available from the engraver.
I suspect that one of the above patches is the best we can do, at least for now.
Originally posted by: percival.music.ca@gmail.com
Valentin: that wasn't a silly suggestion. We're brainstorming here, so any contribution is worthwhile.
Carl: I've been caught up doing oral exams for the 4th-year engineering programming course I'm TAing, but I'll have a look at the spanner stuff soon. If I can't see any promising leads in an hour or two, then I'll do a regtest check on your "return 0" patch, and then we'll put it forward as an official fix. The goal isn't to produce good output; it's just to avoid a segfault. Somebody else can work on producing good output for this particular input, later on.
Originally posted by: v.villenave
Okay. Then here's another one: how come that MultiMeasureRestNumber (as well as MultiMeasureRestText, for that matter) have to be spanners? I've tried replacing it with an item (by doing a mashup with code pasted from bar-number-engraver.cc), but I (obviously) couldn't make it work.
BTW: guys, have you noticed that
\override Score.MultiMeasureRestNumber #'stencil = ##f
prevents the segfault from happening, even without Carl's patch?
(Which is to say, a quick-and-dirty fix would be to remove MultiMeasureRestNumbers whenever skipTypesetting is set. Granted, that is a bit like playing volleyball with your feet :-)
Originally posted by: n.putt...@gmail.com
I was reminded of the following comment in paper-column-engraver.cc,
268 TODO: don't make columns when skipTypesetting is true.
thinking it might be another angle to try, and funnily enough it actually seems to work without any problems (in fact, it improves the appearance of two of the skipTypesetting regtests).
Surely there must be a catch somewhere? :)
Originally posted by: v.villenave
Wow, that's simply awesome! I'm still trying to get my head around it, but this looks very much like the way to go... (plus, it could solve other potential undetected-as-of-yet problems with skipTypesetting, couldn't it?)
Originally posted by: n.putt...@gmail.com
> I'm still trying to get my head around it
Me too. :)
I've looked at that TODO before, but hadn't realized how simple it might be to implement. Frankly, I'm surprised there are no side-effects with the regression checking.
> plus, it could solve other potential undetected-as-of-yet problems with skipTypesetting, couldn't it?
I hope we don't discover any more. ;)
Originally posted by: percival.music.ca@gmail.com
Valentin: woah, that #stencil = ##f was a good find!
Neil: I'll do a regtest comparison and full rebuild from scratch to test your patch in a few hours, but at the moment it certainly looks like a great solution.
Originally posted by: percival.music.ca@gmail.com
There's (IMO acceptable/good) changes to:
skiptypesetting-show-last.ly
skiptypesetting-tuplet.ly
In particular, the "extra" bar line that's between the clef and time signature is removed after applying Neil's patch. The output of those two regtests just looks like normal music.
This seems like a quite acceptable change. Neil, could you make a new patch which removes the TODO comment, and then put it up on reitveld for an official 24-hour review?
Originally posted by: n.putt...@gmail.com
> Neil, could you make a new patch which removes the TODO comment, and then put it up on reitveld for an official 24-hour review?
Unfortunately, there does appear to be a problem with the patch, which I'm unsure how to fix without more investigation: setting skipTypesetting = ##t for a full score (or doing the equivalent using showLastLength = s1*0) causes a segfault.
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff5b89103 in __dynamic_cast () from /usr/lib/libstdc++.so.6
(gdb) bt
#0 0x00007ffff5b89103 in __dynamic_cast () from /usr/lib/libstdc++.so.6
#1 0x000000000062c344 in get_column_description (cols=..., col_index=0, line_starter=true) at simple-spacer.cc:367
#2 0x000000000062c9b0 in get_line_forces (columns=..., line_len=89.341653543307075, indent=0, ragged=false) at simple-spacer.cc:421
#3 0x0000000000488b48 in Constrained_breaking::initialize (this=0x7fffffff8530) at constrained-breaking.cc:444
#4 0x0000000000487eb0 in Constrained_breaking::Constrained_breaking (this=0x7fffffff8530, ps=0xcb24e0, start=...) at constrained-breaking.cc:360
There's only one pair of paper columns in the score (created in Paper_column::initialize () before skipTypsetting can be read) which seems to cause all sorts of problems for the breaking algorithm.
Originally posted by: percival.music.ca@gmail.com
Attached is a minimal example to reproduce the crash.
Without much testing or thought, here's one way to avoid the crash. If I comment out the skipTypesetting line, I see sizes of 3, 2 instead of 1, 2. This is probably just treating the symptoms, though.