From: James H. <bi...@de...> - 2012-09-30 08:35:07
|
My first post... bear with me.

I have an application-crash issue that isn't behaving like other crashes
I've seen. (I searched the list for "msvcrt!_abnormal_termination" and the
only matching post wasn't relevant.)

The application is SuperCollider [1], a programming language for music and
audio. The backend is in C++ and implements its own byte code interpreter
for the object-oriented language. Typically, code is run interactively
using one of several editor interfaces, but this issue reproduces when
using the generic readline interface, so we don't have to worry about other
front ends.

The issue occurs when something (in the sc-language) is scheduled on the
most commonly used clock, BUT -- only in Windows XP. There's no problem in
Windows 7. Also, it isn't a general problem with scheduling in the
language. Another type of clock (SystemClock, which always runs in seconds)
doesn't show the problem.

// this should wait half a beat, then print "hello\n" in the console
// 1 beat usually = 1 second, but the tempo can be changed (hence TempoClock)
TempoClock.sched(0.5, { "hello".postln });

Running sclang in gdb turned up a surprisingly short stack trace. There are
no stack frames before the abnormal_termination.

#0 0x77c3554a in msvcrt!_abnormal_termination ()
   from C:\WINDOWS\system32\msvcrt.dll

The results of "thread apply all bt" can be found at [2], about halfway
down the page.

I also wanted to "git bisect," but I can't find a "good" commit in my XP
build environment. This is strange because of the issue's history:

- SC3.4.x was compiled using MSVC, with a Python front-end. No problem with
  the above code.

- SC3.5 - 3.5.4 was compiled using mingw, and can be run from the command
  line (readline interface) or using gedit as a front-end. In the binary
  releases (built in win7), there is no problem with the above code in
  either win7 or XP.

- SC3.6 is also compiled using mingw, and the binary (alpha) releases so
  far have been built in win7. No problems with the test code in win7, but
  it crashes in all XP environments.

- At first we thought it might be an issue with using a binary in XP, which
  was built in win7, so I set up a mingw environment on one computer that
  is still running XP, using the same dependent libraries that were used
  for the win7 build. In my builds, the issue occurs going all the way back
  to the 3.5.0 source code -- whereas the problem doesn't happen in the
  3.5.x builds that were made by my colleague running win7 (but it does
  occur in his 3.6 builds).

Oh, and the problem goes away in my builds if I switch my CMAKE_BUILD_TYPE
to "Debug."

What's making this especially difficult to troubleshoot is that none of the
SuperCollider developers knows Windows development very well: enough to
configure mingw but not enough to know what to do with mysterious problems
that seem to involve the "fine print" of Windows system DLLs.

So I thought I would ask if anyone has seen anything like this, or has any
ideas how to get more information when gdb doesn't say much and git bisect
can't be used. (Or, if there's a better place to ask the question,
recommendations would be welcome.)

Thanks,
hjh

[1] http://supercollider.sourceforge.net
[2] https://github.com/supercollider/supercollider/issues/547
|
From: James H. <bi...@de...> - 2012-09-30 08:52:33
|
James Harkins <biz@...> writes:

> The issue occurs when something (in the sc-language) is scheduled on the most
> commonly used clock...

Sorry, I forgot mingw version = 4.4.0 (to match Qt 4.8.0, which is used in SC).

hjh
|
From: LRN <lr...@gm...> - 2012-09-30 11:57:58
|
On 30.09.2012 12:32, James Harkins wrote:
>
> Oh, and the problem goes away in my builds if I switch my
> CMAKE_BUILD_TYPE to "Debug."
>
That's a good pointer. Try reading gcc documentation, fish out the
optimization flags that -O, -O2 and -O3 expand to, then apply them one
by one on top of -O0, until you hit the bug.
|
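The one-by-one approach suggested above could be sketched roughly as follows. This is only an illustration in Python, not something posted in the thread: `first_failing_flag` and the `crashes` callback are hypothetical names, and `crashes` stands in for "rebuild with these flags and run the TempoClock test on XP". One caveat worth hedging on: as far as the GCC documentation describes it, most optimizations are only enabled when an -O level is set, so individual -f flags added on top of -O0 may have no effect; the sketch therefore takes the base level as a parameter.

```python
from typing import Callable, List, Optional


def first_failing_flag(base: str,
                       flags: List[str],
                       crashes: Callable[[List[str]], bool]) -> Optional[str]:
    """Add flags one at a time on top of `base`.

    Returns the flag whose addition first makes `crashes` report a
    failure, or None if the crash never appears.
    """
    active = [base]
    for flag in flags:
        active.append(flag)
        if crashes(active):
            return flag
    return None


# Stand-in predicate for illustration only: pretend the crash appears as
# soon as -fstrict-aliasing is active. A real predicate would rebuild the
# project and run the failing test.
def pretend_crash(active: List[str]) -> bool:
    return "-fstrict-aliasing" in active


print(first_failing_flag("-O1",
                         ["-fgcse", "-fstrict-aliasing", "-fpeephole2"],
                         pretend_crash))  # prints -fstrict-aliasing
```

Because flags accumulate, the flag returned is only the last one added before the crash appeared; the crash may still depend on earlier flags in the list.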
From: Earnie B. <ea...@us...> - 2012-09-30 16:27:57
|
On Sun, Sep 30, 2012 at 4:32 AM, James Harkins wrote:
> My first post... bear with me. I have an application-crash issue that isn't
> behaving like other crashes I've seen. (I searched the list for
> "msvcrt!_abnormal_termination" and the only matching post wasn't relevant.)
> --8<--
>
> #0 0x77c3554a in msvcrt!_abnormal_termination ()
> from C:\WINDOWS\system32\msvcrt.dll
> --8<--
>
> Oh, and the problem goes away in my builds if I switch my CMAKE_BUILD_TYPE to
> "Debug."
>

This all sounds like a memory out of bounds issue in the program. The
fact that you execute fine with no optimization but abort with
optimization is what gives me a clue.

HMM... But this interesting post
http://markmail.org/message/hlg5atswqkijgt6e#query:+page:1+mid:v45o2vwi36iemyh7+state:results
points to a bad implementation of setjmp/longjmp so we need to follow
up further if this is a GCC issue or not.

--
Earnie
-- https://sites.google.com/site/earnieboyd
|
From: James H. <bi...@de...> - 2012-10-01 02:13:22
|
LRN <lrn1986@...> writes:

> On 30.09.2012 12:32, James Harkins wrote:
> >
> > Oh, and the problem goes away in my builds if I switch my
> > CMAKE_BUILD_TYPE to "Debug."
> >
> That's a good pointer. Try reading gcc documentation, fish out the
> optimization flags that -O, -O2 and -O3 expand to, then apply them one
> by one on top of -O0, until you hit the bug.

This is a good idea. That will take some time, but worth it for the new
information.

That's in line with Earnie's guess, so I'll report back when I know more.
The particular optimization might help narrow down whether it is or isn't
a gcc bug.

Thanks to both of you!
James
|
From: James H. <bi...@de...> - 2012-10-14 12:31:31
|
LRN <lrn1986@...> writes:

> On 30.09.2012 12:32, James Harkins wrote:
> >
> > Oh, and the problem goes away in my builds if I switch my
> > CMAKE_BUILD_TYPE to "Debug."
> >
> That's a good pointer. Try reading gcc documentation, fish out the
> optimization flags that -O, -O2 and -O3 expand to, then apply them one
> by one on top of -O0, until you hit the bug.

Coming back to this thread a few weeks later... it's taken that long to set
up the build environment and do enough testing to discover that... I'm more
confused than ever :)

First I thought, a good way to reduce the amount of testing would be to try
-O1, then -O2 until it breaks. -O1 was fine. -O2 failed. OK.

Someone on the SuperCollider list suggested a binary search (for 24 flags,
that would cut the number of tests in about half: ceil(log2(24)) is 5, and
I would want to test both halves, because there's no guarantee that a
failed test (with crash) means that the other half would pass).

So I tested half the -O2 flags, no crash, and then the other half and... no
crash. But, I can reproduce the crash on demand using -O2 without listing
the individual flags. I also tried listing *all* the O2 flags after -O1...
no crash.

So, what I've learned so far is that -O2 does something different from the
individual flags listed one by one on the command line. But I don't know
what is different, only that it crashes with the aggregate specification
-O2, but doesn't crash when I try to drill into more detail. That places a
severe limit on my ability to obtain more information about the problem :-)

Does anyone have some insight what could cause this?

Thanks.
hjh
|
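The binary search described above, including the "test both halves" precaution, might look something like the sketch below. It is a hypothetical illustration: `bisect_flags` and the `crashes` callback are invented names, the `-fhypothetical-N` flags are placeholders rather than real GCC options, and `crashes` stands in for a full rebuild plus test run. The sketch assumes the crash can be reproduced from the explicit flag list at all, which is exactly what did not hold in the report above.

```python
import math
from typing import Callable, List, Optional


def bisect_flags(flags: List[str],
                 crashes: Callable[[List[str]], bool]) -> Optional[List[str]]:
    """Narrow a crashing flag set by repeatedly testing both halves.

    Returns None if the full explicit list does not crash (the cause is
    then outside the list). Otherwise returns the smallest set reached:
    a single flag, or a larger set if neither half crashes on its own
    (which suggests that flags from both halves are needed together).
    """
    if not crashes(list(flags)):
        return None
    current = list(flags)
    while len(current) > 1:
        mid = len(current) // 2
        left, right = current[:mid], current[mid:]
        if crashes(left):
            current = left
        elif crashes(right):
            current = right
        else:
            return current
    return current


# Placeholder flag names, for illustration only.
flags = ["-fhypothetical-%d" % i for i in range(24)]

print(math.ceil(math.log2(24)))  # prints 5 (rounds needed for 24 flags)

# Case 1: a single flag is responsible.
print(bisect_flags(flags, lambda active: "-fhypothetical-17" in active))
# prints ['-fhypothetical-17']

# Case 2: the explicit list never crashes, as reported above.
print(bisect_flags(flags, lambda active: False))  # prints None
```

The `None` result corresponds to the situation in this message: the full explicit list passes while plain -O2 crashes, so the search cannot even start and the difference has to lie in something the flag list does not capture.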
From: JonY <jo...@us...> - 2012-10-14 13:04:09
Attachments:
signature.asc
|
On 10/14/2012 20:31, James Harkins wrote:
> LRN <lrn1986@...> writes:
>
>>
>> On 30.09.2012 12:32, James Harkins wrote:
>>>
>>> Oh, and the problem goes away in my builds if I switch my
>>> CMAKE_BUILD_TYPE to "Debug."
>>>
>> That's a good pointer. Try reading gcc documentation, fish out the
>> optimization flags that -O, -O2 and -O3 expand to, then apply them one
>> by one on top of -O0, until you hit the bug.
>
> Coming back to this thread a few weeks later... it's taken that long to set up
> the build environment and do enough testing to discover that... I'm more
> confused than ever :)
>
> First I thought, a good way to reduce the amount of testing would be to try -O1,
> then -O2 until it breaks. -O1 was fine. -O2 failed. OK.
>
> Someone on the SuperCollider list suggested a binary search (for 24 flags, that
> would cut the number of tests in about half: ceil(log2(24)) is 5, and I would
> want to test both halves, because there's no guarantee that a failed test (with
> crash) means that the other half would pass.
>
> So I tested half the -O2 flags, no crash, and then the other half and... no
> crash. But, I can reproduce the crash on demand using -O2 without listing the
> individual flags. I also tried listing *all* the O2 flags after -O1... no crash.
>
> So, what I've learned so far is that -O2 does something different from the
> individual flags listed individually on the command line. But I don't know what
> is different, only that it crashes with the aggregate specification -O2, but
> doesn't crash when I try to drill into more detail. That places a severe limit
> on my ability to obtain more information about the problem :-)
>
> Does anyone have some insight what could cause this?

-O2 activates some unnamed optimization routines. Your best bet is to
look for where in your program it starts calling out to the Windows
system code that causes the crash, and step debug from there.

It is likely there is some bad code; try recompiling with -Wextra -Wall
-O2 and inspect the warnings carefully.
|
From: Earnie B. <ea...@us...> - 2012-10-14 16:58:06
|
On Sun, Oct 14, 2012 at 9:03 AM, JonY wrote:
> On 10/14/2012 20:31, James Harkins wrote:
>> LRN <lrn1986@...> writes:
>>
>>>
>>> On 30.09.2012 12:32, James Harkins wrote:
>>>>
>>>> Oh, and the problem goes away in my builds if I switch my
>>>> CMAKE_BUILD_TYPE to "Debug."
>>>>
>>> That's a good pointer. Try reading gcc documentation, fish out the
>>> optimization flags that -O, -O2 and -O3 expand to, then apply them one
>>> by one on top of -O0, until you hit the bug.
>>
>> Coming back to this thread a few weeks later... it's taken that long to set up
>> the build environment and do enough testing to discover that... I'm more
>> confused than ever :)
>>
>> First I thought, a good way to reduce the amount of testing would be to try -O1,
>> then -O2 until it breaks. -O1 was fine. -O2 failed. OK.
>>
>> Someone on the SuperCollider list suggested a binary search (for 24 flags, that
>> would cut the number of tests in about half: ceil(log2(24)) is 5, and I would
>> want to test both halves, because there's no guarantee that a failed test (with
>> crash) means that the other half would pass.
>>
>> So I tested half the -O2 flags, no crash, and then the other half and... no
>> crash. But, I can reproduce the crash on demand using -O2 without listing the
>> individual flags. I also tried listing *all* the O2 flags after -O1... no crash.
>>
>> So, what I've learned so far is that -O2 does something different from the
>> individual flags listed individually on the command line. But I don't know what
>> is different, only that it crashes with the aggregate specification -O2, but
>> doesn't crash when I try to drill into more detail. That places a severe limit
>> on my ability to obtain more information about the problem :-)
>>
>> Does anyone have some insight what could cause this?
>
> -O2 activates some unnamed optimization routines, your best bet is to
> look for where in your program it starts calling out to the windows
> system code that causes the crashing and step debug from there.
>
> It is likely there are some bad code, try recompiling with -Wextra -Wall
> -O2 and inspect the warnings carefully.

Also use the -v option to see what is actually used when -O2 is given.

--
Earnie
-- https://sites.google.com/site/earnieboyd
|
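A related way to see what a given -O level turns on, different from the -v option mentioned above, is GCC's `-Q --help=optimizers` query, which GCC's documentation describes for comparing optimization levels. The Python sketch below is a hypothetical helper built around that query: the function names are invented, and the parser assumes the usual output layout of one `-fflag` per line followed by `[enabled]` or `[disabled]`.

```python
import subprocess
from typing import Dict, Set


def parse_optimizers(text: str) -> Dict[str, str]:
    """Map each -f flag in `gcc -Q --help=optimizers` output to its state."""
    states = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only lines of the form: -fsome-flag ... [enabled|disabled]
        if (len(parts) >= 2 and parts[0].startswith("-f")
                and parts[-1].startswith("[")):
            states[parts[0]] = parts[-1].strip("[]")
    return states


def query_gcc(level: str, gcc: str = "gcc") -> Dict[str, str]:
    """Run gcc (assumed to be on PATH) and parse its optimizer report."""
    result = subprocess.run([gcc, "-Q", level, "--help=optimizers"],
                            capture_output=True, text=True, check=True)
    return parse_optimizers(result.stdout)


def newly_enabled(lower: Dict[str, str], higher: Dict[str, str]) -> Set[str]:
    """Flags enabled in `higher` that are not enabled in `lower`."""
    return {flag for flag, state in higher.items()
            if state == "enabled" and lower.get(flag) != "enabled"}
```

With a compiler installed, `sorted(newly_enabled(query_gcc("-O1"), query_gcc("-O2")))` would list the named flags that differ between the two levels. As noted in the thread, -O2 may also change behaviour that has no named flag, so a flag list obtained this way is not guaranteed to reproduce -O2 exactly.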