|
From: Julian S. <ju...@va...> - 2005-07-19 13:27:22
|
Over the past year, a tremendous amount of development effort has gone into the Valgrind 3 line. It now runs on x86 and amd64 quite usably, and ppc32 (Linux) is looking promising. There have been a large number of bug fixes, functionality improvements, and restructuring of the source code to enhance accessibility and maintainability. A GUI (Valkyrie) is also under development, and we plan to make coordinated releases of that along with Valgrind in the future.

The time for a Valgrind-3.0 release draws near. I propose to have a release candidate (feature freeze) by next Monday 25 July, with a possible release on Monday 1 Aug.

Valgrind-3.0 is already stable and usable on x86 and amd64. If you haven't already done so, please check out and build it (easy: see http://www.valgrind.org/devel/cvs_svn.html), test it on whatever applications are critical for you, and let us know of any critical breakage. The more people who do this now, the better quality 3.0 release we will have.

Outstanding issues which I'm aware of, and which need to be fixed, are:

- decide on a version numbering / branch management scheme
- decide about how to handle dependency on libvex
- minor tweaks to XML output and to logfile naming (me)
- fix:
    #88116 (x86, "enter" variant causes assertion)
    #96542 (x86, possible assertions with push variants)
    #87263 (x86, segment stuff)
    #103594 (x86, FICOM)
    INT/INT3 insns (x86)
    Missing 0xA3 insn (amd64)
  All of these I'll chase.
- Update documentation (me, + ??)

I'm sure there are other things I've forgotten/am not aware of.

Much effort recently has gone into making ppc32-linux work well, but that target is not yet really usable. In particular there are problems with getting a low noise level from Memcheck, and some difficulties with floating point. Work to resolve these is in progress, but I do not know if it will be successful in the limited time before the release. We are already overdue for a 3.0 release and I am reluctant to delay it further in order to have ppc32 support. Therefore I propose to present 3.0 as a production-quality release for x86 and amd64 only, and if ppc32 happens to be usable, well that's an extra bonus. Obviously it would be wonderful to have ppc32 usable too, and I will endeavour to cause that to be the case.

So:

- pls download, build, test, report critical bugs
- JosefW: what is the calltree status for the 3 line? It would be good to ensure that calltree/kcachegrind works well with 3.0.

J |
|
From: Arndt M. <amu...@is...> - 2005-07-19 15:40:52
Attachments:
amuehlen.vcf
|
It's great to see such good work go on!

What about helgrind?

After dropping it in 2.4.0, are there any plans to get it up and running again in the 3.x line?

I'm asking because I'm currently doing some research on finding bugs in multi-threaded applications, and helgrind was/would be a good start, being the only usable tool in this area that is free and with source available.

BTW, is anybody actually using helgrind? Or are there just too many false reports for most applications?

Greetings,
Arndt |
|
From: Duncan S. <bal...@fr...> - 2005-07-19 16:16:25
|
> What about helgrind? > > After dropping it in 2.4.0 are there any plans to get it up > and running again in the 3.x line? > > I'm asking, because I'm currently doing some research on > finding bugs in multi-threaded applications and helgrind > was/would be a good start, being the only usable tool in > this area that is free and with source available. > > BTW, is anybody using helgrind actually? > Or are there just too many false reports for most applications? I am also interested in helgrind, though I haven't tried to use it seriously. My impression from a few quick tests was that it does catch real mistakes. Though it produced many false positives, the ratio of true to false positives didn't seem too bad. This was some time ago. All the best, Duncan. |
|
From: Julian S. <js...@ac...> - 2005-07-19 16:59:11
|
I must say I find helgrind perplexing and frustrating, because it's potentially such an incredibly useful tool, yet we have never really made it work as convincingly as I'd like, and now we seem to be slipping away from being able to support it. I'd love to be able to make a good showing with Helgrind in the future, but it strikes me that the difficulties with it put it right at the edge of the state-of-the-art. The key problem is getting observability on all the lock/unlock events that the client program does. We at least had a handle on that <= 2.2.0, but it all went south in 2.4.0. Jeremy tried valiantly to get it to go in 2.4.0, but that didn't work well enough to ship. Now I'm wondering if we could use our general redirect/intercept mechanism to catch all entries into libpthread. In 2.4.0 Jeremy tried one way of doing function wrapping, but it involved some pretty intrusive changes to the JIT which I wasn't happy about. However, we could surely do the relevant function wrapping with the mechanism we have in place in the valgrind-3 line already; we don't need fully general function wrapping to make this work. Urk. It's all complicated and nasty. J On Tuesday 19 July 2005 17:16, Duncan Sands wrote: > > What about helgrind? > > > > After dropping it in 2.4.0 are there any plans to get it up > > and running again in the 3.x line? > > > > I'm asking, because I'm currently doing some research on > > finding bugs in multi-threaded applications and helgrind > > was/would be a good start, being the only usable tool in > > this area that is free and with source available. > > > > BTW, is anybody using helgrind actually? > > Or are there just too many false reports for most applications? > > I am also interested in helgrind, though I haven't tried to use > it seriously. My impression from a few quick tests was that it > does catch real mistakes. Though it produced many false > positives, the ratio of true to false positives didn't seem too > bad. This was some time ago. 
> > All the best, > > Duncan. |
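The observability problem described above (seeing every lock/unlock event the client performs) can be illustrated by analogy outside Valgrind. The following is only a sketch in Python, not Valgrind's redirect/intercept mechanism: `RecordingLock`, `demo` and the event tuple layout are hypothetical names chosen for illustration. It wraps a lock so that each acquire and release is appended to a log that a race detector could consume later.

```python
import threading


class RecordingLock:
    """Wrap a threading.Lock and record every acquire/release (illustrative only)."""

    def __init__(self, name, log):
        self._lock = threading.Lock()
        self._name = name
        self._log = log  # shared list of (thread, event, lock name) tuples

    def acquire(self):
        self._lock.acquire()
        # Logged while the lock is held, so events for one lock never interleave.
        self._log.append((threading.current_thread().name, "lock", self._name))

    def release(self):
        self._log.append((threading.current_thread().name, "unlock", self._name))
        self._lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc_info):
        self.release()
        return False


def demo():
    """Run two threads through one wrapped lock and return the event log."""
    log = []
    m = RecordingLock("m", log)

    def worker():
        with m:
            pass

    threads = [threading.Thread(target=worker, name=f"T{i}") for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

The point of the wrapper is that the client code (`worker`) does not change; only the lock implementation it reaches is swapped, which is roughly what intercepting entries into libpthread would achieve for a real program.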
|
From: Nicholas N. <nj...@cs...> - 2005-07-19 22:22:00
|
On Tue, 19 Jul 2005, Julian Seward wrote: > I must say I find helgrind perplexing and frustrating, because it's > potentially such an incredibly useful tool, yet we have never really > made it work as convincingly as I'd like, and now we seem to be > slipping away from being able to support it. I'd love to be able to > make a good showing with Helgrind in the future, but it strikes me > that the difficulties with it put it right at the edge of the > state-of-the-art. I think data race detection is definitely not a solved problem. As Arndt pointed out, being able to use Valgrind/Helgrind as a platform for research in this area is a worthy goal. N |
|
From: Arndt M. <amu...@is...> - 2005-07-20 13:41:30
Attachments:
amuehlen.vcf
|
Nicholas Nethercote wrote:
> On Tue, 19 Jul 2005, Julian Seward wrote:
>
>> I must say I find helgrind perplexing and frustrating, because it's
>> potentially such an incredibly useful tool, yet we have never really
>> made it work as convincingly as I'd like, and now we seem to be
>> slipping away from being able to support it. I'd love to be able to
>> make a good showing with Helgrind in the future, but it strikes me
>> that the difficulties with it put it right at the edge of the
>> state-of-the-art.
>
> I think data race detection is definitely not a solved problem. As
> Arndt pointed out, being able to use Valgrind/Helgrind as a platform
> for research in this area is a worthy goal.

That's the point, indeed.

Roughly how much work is it to make Helgrind work in 3.0 again? I would like to volunteer if it's not too difficult and I get some help. In any case it would be good for me to understand Valgrind and Helgrind, because afterwards it would be easier to experiment with this platform.

I suppose on-the-fly methods are the only practical way to hunt for concurrency bugs; I found static or model checking techniques impractical for most large, real-world applications.

Concerning the performance issues: did you try to implement a logging mechanism that lets you do the analyses offline? In most cases the amount of data that has to be stored would be too much, but sometimes it might be a useful feature.

Arndt |
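To make the offline-analysis idea concrete, here is a minimal sketch that runs a lockset-style check over a recorded event trace. It assumes an Eraser-style lockset rule (a location is suspicious once two threads have accessed it with no lock common to all accesses); that is an assumption about the general approach, not a description of Helgrind's code, and `find_races` and the `(thread, op, target)` trace format are hypothetical.

```python
def find_races(trace):
    """Offline lockset check over (thread, op, target) events.

    op is "lock", "unlock" or "access"; target is a lock name or a
    memory location name. Returns the locations flagged as possible races.
    """
    held = {}        # thread -> set of locks it currently holds
    candidates = {}  # location -> locks held at every access seen so far
    accessors = {}   # location -> set of threads that accessed it
    reported = []
    for thread, op, target in trace:
        locks = held.setdefault(thread, set())
        if op == "lock":
            locks.add(target)
        elif op == "unlock":
            locks.discard(target)
        elif op == "access":
            accessors.setdefault(target, set()).add(thread)
            if target in candidates:
                candidates[target] &= locks  # refine the candidate lockset
            else:
                candidates[target] = set(locks)
            shared = len(accessors[target]) > 1
            if shared and not candidates[target] and target not in reported:
                reported.append(target)
    return reported


# x is always accessed under lock m: nothing is reported.
guarded = [
    ("T1", "lock", "m"), ("T1", "access", "x"), ("T1", "unlock", "m"),
    ("T2", "lock", "m"), ("T2", "access", "x"), ("T2", "unlock", "m"),
]

# y is handed from T1 to T2 by program flow, with no lock involved.
# The check cannot see the ordering, so y is flagged: a false positive.
handoff = [("T1", "access", "y"), ("T2", "access", "y")]
```

Calling `find_races(guarded)` yields an empty list, while `find_races(handoff)` flags `y` even though the two accesses never overlap in time. The trace itself is the storage cost mentioned above: one record per lock operation and memory access.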
|
From: Dennis L. <pla...@in...> - 2005-07-19 17:12:19
|
At 18:47 19.07.2005, Nicholas Nethercote wrote:
>On Tue, 19 Jul 2005, Duncan Sands wrote:
>
>Jeremy wrote some code to intercept pthread functions, which would allow
>Helgrind to be reinstated. It's currently commented out in the 3.0
>version, and I'm not sure how well it was working.
>
>Helgrind doesn't seem very popular in general. The December 2003 survey
>told us that it accounted for less than 1% of all Valgrind use, and the
>feedback since then hasn't given me any indication that things have changed.
>
>Nonetheless, we would like to get Helgrind working again. The
>pthread-interception stuff is also necessary to get the basic pthread
>error checking working again.

I personally used helgrind in the "old days" quite often. Although it gave tons of (more or less) false positives (sometimes program flow prevents concurrent access, which of course can't be detected by helgrind), it was really very useful and I found many bugs.

I check rather often whether anything is happening on the helgrind front, since there is no similar free tool. I think a big reason why not many people use helgrind is that they know valgrind as a "memory overflow, leak check and uninitialized value detection tool" but not as the powerful framework for many other tools (massif is of great use sometimes too) that it actually is. Perhaps advertising valgrind's other tools more would increase awareness and thus usage too. Perhaps the next survey should include (if the old one didn't already) a "have you heard about tool xxx previously" question before asking for its usage...

greets
Dennis

Carpe quod tibi datum est |
|
From: Dennis L. <pla...@in...> - 2005-07-19 22:54:54
|
At 20:37 19.07.2005, Nicholas Nethercote wrote:
>On Tue, 19 Jul 2005, Dennis Lubert wrote:
>
>This is good to know... very few others have said things like this which
>is why Helgrind has always been a lower priority thing. Generally,
>feedback indicates to us that people really hate false positives, and
>Helgrind's high number of them is a big problem for most people.

Yeah, false positives are always nasty. The problem with multithreading is that locking is a really complex field, and even more complex is efficient programming where sometimes no locking is needed because program flow prevents concurrent access. On the one hand, I doubt that there will ever be an effective algorithm that eliminates all those false positives. On the other hand, I don't think that will be necessary, since people who write multithreaded programs should be able to identify false positives.

One thing that comes to my mind here is some kind of online auto-suppression. In the current suppression mechanism, you ask for a suppression to be generated, copy/paste it into some suppression file, and pass that file on the command line when re-running the program. Better would be a mechanism where the currently generated suppression can be chosen to be suppressed for the rest of the program run, and automagically entered into some program-specific suppression file. "Disabling" the false positives would then be rather fast, and filtering out the really bad situations should be faster and more straightforward...

>Maybe a brief description of each one in the usage question would be useful.

Yeah, great idea... of every installed tool if this is possible, so additionally installed tools can advertise themselves too...

Carpe quod tibi datum est |
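The auto-suppression idea above can be sketched as follows. This is a hypothetical design, not an existing Valgrind feature: `SuppressionStore`, `filter_reports` and the one-line "signature" strings are invented for illustration and do not use Valgrind's real suppression file format. A report marked as a false positive is suppressed for the rest of the run and also appended to a per-program file, so the next run starts with it already disabled.

```python
import os
import tempfile


class SuppressionStore:
    """Per-program suppression file that can be extended during a run."""

    def __init__(self, path):
        self.path = path
        self.known = set()
        if os.path.exists(path):
            with open(path) as f:
                self.known = {line.strip() for line in f if line.strip()}

    def is_suppressed(self, signature):
        return signature in self.known

    def suppress(self, signature):
        """Suppress now and persist for later runs."""
        if signature not in self.known:
            self.known.add(signature)
            with open(self.path, "a") as f:
                f.write(signature + "\n")


def filter_reports(reports, store, is_false_positive):
    """Return the reports worth showing; remember the ones judged false."""
    shown = []
    for signature in reports:
        if store.is_suppressed(signature):
            continue
        if is_false_positive(signature):  # stands in for the user's choice
            store.suppress(signature)
        else:
            shown.append(signature)
    return shown


def demo():
    """Two simulated runs sharing one suppression file."""
    path = os.path.join(tempfile.mkdtemp(), "prog.supp")
    first = filter_reports(["race:a", "race:b", "race:a"],
                           SuppressionStore(path),
                           lambda sig: sig == "race:a")
    second = filter_reports(["race:a", "race:b"],
                            SuppressionStore(path),
                            lambda sig: False)
    return first, second
```

In `demo`, the second run never asks about `race:a` again because it was written to the file during the first run, which is the copy/paste step the proposal wants to eliminate.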
|
From: Nicholas N. <nj...@cs...> - 2005-07-20 05:32:43
|
On Wed, 20 Jul 2005, Dennis Lubert wrote:

> In the current suppression mechanism, you ask for generating suppression,
> copy/paste it into some suppression file, and use this at the command line to
> re-run the program. Better would be some mechanism, where the currently
> generated suppression can be chosen to be suppressed for the further program
> run, and automagically entered into some program-specific suppression file.
> So "disabling" the false positives would be rather fast and filtering the
> real bad situations should be faster and more straightforward...

Yes. People have written patches for this before, but it's never got into the repository, either because the patches were no good or none of the developers had the time or inclination...

N |
|
From: Nicholas N. <nj...@cs...> - 2005-07-19 16:47:32
|
On Tue, 19 Jul 2005, Duncan Sands wrote: >> What about helgrind? >> >> After dropping it in 2.4.0 are there any plans to get it up >> and running again in the 3.x line? >> >> I'm asking, because I'm currently doing some research on >> finding bugs in multi-threaded applications and helgrind >> was/would be a good start, being the only usable tool in >> this area that is free and with source available. >> >> BTW, is anybody using helgrind actually? >> Or are there just too many false reports for most applications? > > I am also interested in helgrind, though I haven't tried to use > it seriously. My impression from a few quick tests was that it > does catch real mistakes. Though it produced many false > positives, the ratio of true to false positives didn't seem too > bad. This was some time ago. Jeremy wrote some code to intercept pthread functions, which would allow Helgrind to be reinstated. It's currently commented out in the 3.0 version, and I'm not sure how well it was working. Helgrind doesn't seem very popular in general. The December 2003 survey told us that it accounted for less than 1% of all Valgrind use, and the feedback since then hasn't given me any indication that things have changed. Nonetheless, we would like to get Helgrind working again. The pthread-interception stuff is also necessary to get the basic pthread error checking working again. N |
|
From: Crispin F. <val...@fl...> - 2005-07-19 16:54:14
|
On Tue, 2005-07-19 at 14:27 +0100, Julian Seward wrote:
> - pls download, build, test, report critical bugs

I downloaded this, and gave it a test with the Zeus Web Server (www.zeus.com):

- From basic tests, it appears to me that it is quite a bit slower (2.4 SVN starts up the server in 11 seconds, whereas 3.0 SVN takes 23 seconds).

- More seriously, 3.0 seems to have lots of:

    ==903== Use of uninitialised value of size 4
    ==903==    at 0x81AC95E: (within zeus.web)

  and these don't have any backtrace at all, whereas when it complains that the read was inside a malloc, I get a backtrace just fine (this is with debug builds).

- Interestingly, 2.4 doesn't report _any_ of these uninitialised values, and when running under 3.0 the server segfaults whereas it doesn't under 2.4 or without valgrind.

I am wondering if 3.0 and 2.4 report different cpu capabilities (such as mmx, sse etc) and one of the libraries we use is using different code paths. That might explain the uninitialised values, although I don't see how that could explain the lack of backtraces.

I realise that this isn't 100% helpful without a testcase, but I can't extract one, because we don't have the source for the library I believe is causing the problems :-( (Linux/x86)

Crispin |
|
From: Julian S. <ju...@va...> - 2005-07-19 17:22:00
|
> - From basic tests, it appears to me that it is quite a bit slower (2.4
> SVN starts up the server in 11 seconds, whereas 3.0 SVN takes 23
> seconds).

The new JIT is a lot slower than the 2.4 one, unfortunately. Most of the startup delay is likely to be translation time.

> - Interestingly, 2.4 doesn't report _any_ of these uninitialised values,
> and when running under 3.0 the server segfaults whereas it doesn't under
> 2.4 or without valgrind.

Euh, that sucks. Do you get any useful logging output just before the crash? Does it still crash with --tool=none? What about with --tool=none --vex-iropt-level=0 ?

> I am wondering if 3.0 and 2.4 report different cpu capabilities (such as
> mmx, sse etc)

Quite possibly. Rerun with -v -v and look for a line like this:

--5075-- Host CPU: arch = X86, subarch = x86-sse2

J |
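When comparing several verbose logs, the "Host CPU" line can be pulled out mechanically. The sketch below is a small helper, assuming only the line format quoted above; `host_cpu` is a hypothetical name, and whether every Valgrind version prints this exact line is an assumption.

```python
import re

# Matches e.g. "--5075-- Host CPU: arch = X86, subarch = x86-sse2"
HOST_CPU = re.compile(r"Host CPU:\s*arch\s*=\s*(\S+),\s*subarch\s*=\s*(\S+)")


def host_cpu(log_text):
    """Return (arch, subarch) from the first Host CPU line, or None."""
    for line in log_text.splitlines():
        match = HOST_CPU.search(line)
        if match:
            return match.group(1), match.group(2)
    return None


sample = "--5075-- Host CPU: arch = X86, subarch = x86-sse2"
```

Running `host_cpu` on the saved `-v -v` output of each run and comparing the results would show directly whether the two runs detected different CPU capabilities.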
|
From: Nicholas N. <nj...@cs...> - 2005-07-19 18:37:19
|
On Tue, 19 Jul 2005, Dennis Lubert wrote:

> I personally used helgrind in the "old days" quite often. Although it gave
> tons of (more or less) false positives (sometimes program flow does prevent
> concurrent access which can't be detected by helgrind of course) it was really
> very useful and I found many bugs.

This is good to know... very few others have said things like this, which is why Helgrind has always been a lower priority thing. Generally, feedback indicates to us that people really hate false positives, and Helgrind's high number of them is a big problem for most people.

> I check rather often if something is done at the helgrind front since there
> is no similar free tool. I think a big problem why not many people use
> helgrind is because they know valgrind as "memory overflow, leak check and
> uninitialized value detection tool" but not as the powerful framework for
> many other tools (massif is of great use sometimes too) that it actually is.
> Perhaps more advertising other feature tools of valgrind will increase
> knowledge and thus usage too. Perhaps the next survey should include (if the
> old didn't already) a "have you heard about tool xxx previously" question
> before asking for its usage...

Maybe a brief description of each one in the usage question would be useful.

N |