From: <sv...@va...> - 2005-08-29 22:30:36
Author: njn
Date: 2005-08-29 23:30:32 +0100 (Mon, 29 Aug 2005)
New Revision: 185

Log:
Added a "project suggestions" page.

Added:
   trunk/devel/projects.html
Modified:
   trunk/php/menu.php
   trunk/support/contributing.html

Added: trunk/devel/projects.html
===================================================================
--- trunk/devel/projects.html 2005-08-29 22:30:12 UTC (rev 184)
+++ trunk/devel/projects.html 2005-08-29 22:30:32 UTC (rev 185)
@@ -0,0 +1,417 @@
+
+<h1>Project Suggestions</h1>
+
+<p>This page gives a list of Valgrind projects that people might like to
+try. They range from quite easy hacking projects to research-level
+problems. If you plan to try one of these projects, you should
+subscribe to <?php echo vglink( 'vgdevel' ); ?> list, and also write to
+it to let us know you are doing the project, and you can ask questions
+there also.</p>
+
+<p>Please note that we are very conservative about adding new features.
+Only features that are useful to many users, and that do not adversely
+affect the maintainability or correctness of the code base in adverse
+ways are likely to be accepted. If you want to implement a feature not
+mentioned on the following list, please ask on the valgrind-developers
+list if it is likely to be incorporated before starting.</p>
+
+<p>Also, please understand that there is no guarantee that code you
+write will be incorporated into Valgrind. It depends on a number of
+factors: how well written it is, how important are the issues it
+addresses, how does it affect the code base's structure, and so on.
+Such is the nature of all free software projects. However, if you
+consistently submit high quality patches, you may be granted write
+access to the repository.
This is how most of the current developers got
+involved with Valgrind.</p>
+
+<h2>Software Infrastructure</h2>
+
+<h3>Profiling Valgrind</h3>
+<p>We haven't had a good way to profile Valgrind for something like 2 years.
+If we could find a way to profile Valgrind, there would certainly be some
+easy speed-ups to find. That would make everybody happier!</p>
+
+<p>Valgrind used to have a tick-based profiler. A timer was set up to send
+SIGALRM 100 times per second, and every time it was caught Valgrind would
+record where in its code it was. This gave a good view of which parts of
+the code were responsible for the most amount of runtime. The code for this
+is still there (see coregrind/m_profile.c) but it hasn't worked for a long
+time. If you activate it (set VG_DO_PROFILING in include/pub_tool_profile.h,
+recompile, and use --profile=yes) Valgrind bombs due to signal issues.
+Also, this relies on the glibc function setitimer(), and we are in the
+process of removing all dependencies on glibc.</p>
+
+<p>Neither gcov nor gprof work with Valgrind, basically because Valgrind does
+weird stuff that they cannot handle. Perhaps another tool (OProfile?) would
+be appropriate.</p>
+
+<p>Using Cachegrind is a possibility, although Valgrind's support for running
+itself is very flaky at the moment. Improving this support would also let
+us run Valgrind itself under Memcheck.</p>
+
+<p>Or if anyone knows how to get the tick-based profiler working again
+without relying on glibc functions, that would be a good start. (Added
+August 27, 2005)</p>
+
+
+<h3>Performance regression testing</h3>
+<p>We currently have some scripts to run the regression tests nightly on
+a range of machines. This is very useful for spotting correctness
+regressions.
Equally useful would be a system for spotting performance
+regressions (or improvements).</p>
+
+<p>This would involve running the Valgrind tools on a given suite of
+programs, and recording how long they take to run. Or, perhaps better
+would be recording how much slower than normal the programs run under
+Valgrind; that metric would be more robust if the compiler or system
+libraries on the test machine changed.</p>
+
+<p>The nightly measurements should be kept and ideally there would be a
+system for producing graphs that show the performance changes over time.
+You'd have to specify somehow where the previous measurements would be
+stored, perhaps that would be a command line argument to the script.</p>
+
+<p>Choosing the programs for the test suite would be challenging.
+Ideally we'd have a mix of two kinds of programs:</p>
+
+<ol>
+<li><p>Real programs. Ones like the SPEC2000 benchmarks would be ideal,
+ but they are not free.</p></li>
+
+<li><p>Artificial programs that stress performance-critical subsystems.
+ For example http://bugs.kde.org/show_bug.cgi?id=105039 has an example
+ program that does many malloc and free calls, and has many heap blocks
+ live at one time. This exposed a performance bug in Valgrind's heap
+ allocator.</p></li>
+</ol>
+
+<p>The scripts in nightly/ for doing the nightly regression tests would be the
+right place to start on this. (Added August 27, 2005)</p>
+
+
+<h3>Regression test brittleness</h3>
+<p>Valgrind's regression test suite (run with "make regtest") is extremely
+useful. The scripts in nightly/ are used on various test machines to
+determine if regressions are introduced. Unfortunately, some of the tests
+are too brittle -- they fail on some machines because of slight
+configuration differences.
On the eight test machines we use, we see up to
+about 10 or 15 failures that should not happen due to these
+differences.</p>
+
+<p>Improving things will require either additional expected output files (the
+*.stderr.exp* and *.stdout.exp* files in the tests/ directories), or more
+clever output filters (the filters have names like filter_stderr). If you
+attempt to improve the filters, you should be careful not to remove so much
+information that the test becomes weaker. (Added August 27, 2005)</p>
+
+
+<h3>Regression test gaps</h3>
+<p>The regression tests are great, but they have some gaps. For
+example:</p>
+
+<ul>
+<li><p>The auto-generated insn*.c tests in none/tests/x86/ are great:
+ they test almost all the x86 instructions. It would be great to have
+ equally comprehensive tests for the other architectures supported
+ (AMD64, PPC32).</p></li>
+
+<li><p> The test memcheck/tests/x86/scalar.c is a very thorough test for
+ Memcheck's checking of system call arguments. We would like similar
+ tests for the other platforms (AMD64, PPC32).</p></li>
+
+<li><p> Memcheck and Nulgrind (aka "none") have a good number of tests
+ each. The other tools have very few. Adding more salient tests would
+ be useful.</p></li>
+</ul>
+
+<p>Fixing these gaps is not very hard, just tedious. (Added August 27,
+2005)<p>
+
+<h3>Unit regression tests</h3>
+<p>The regression tests are good system-level tests, but we have almost no unit
+testing, which is bad. We would like to pull out individual Valgrind
+modules into test harnesses. These can then be tested like normal programs,
+using normal testing tools, such as gcov (for test coverage) and Valgrind
+itself.</p>
+
+<p>The test memcheck/tests/oset_test.c is one unit test we have. It tests the
+m_oset module. It uses some preprocessing hacks to replace calls to
+Valgrind-internal functions with calls to the standard versions, eg. calling
+printf() instead of VG_(printf)().
memcheck/tests/vgtest_ume.c is another
+one, although it has some oddities that make it not such a good
+example.</p>
+
+<p>(Note that when this test runs, the Valgrind built in the tree is actually
+running and testing part of its own code! Which is quirky but fine in
+practice.)</p>
+
+<p>Some modules will be more amenable to this approach than others; the fewer
+other modules a module depends on, the easier it is. m_oset is a case in
+point, as it only imports 5 pub_core_* header files. Other files in
+coregrind/ that would be good candidates include:</p>
+
+<ul>
+<li>m_debuglog.c</li>
+<li>m_execontext.c</li>
+<li>m_hashtable.c (the test would be very similar to
+ m_oset.c's),</li>
+<li>m_libc{assert,base,file,mman,print,proc,signal}.c</li>
+<li>m_mallocfree.c (a test for this would be particularly
+ helpful)</li>
+<li>m_stacktrace.c</li>
+<li>m_syscall.c</li>
+</ul>
+
+<p>The following would be more challenging, but perhaps still
+doable:</p>
+
+<ul>
+<li>m_stacks.c</li>
+<li>m_translate.c</li>
+<li>m_transtab.c</li>
+<li>m_ume.c (maybe use vgtest_ume.c as a starting point, but beware
+ that this file will change signficantly soon)</li>
+</ul>
+
+<p>As well as redirecting Valgrind-internal functions to glibc
+equivalents, stubs for various functions would need to be written for
+many of these, as is standard for unit tests.</p>
+
+<p>The coverage (as measured by gcov) should be as high as possible.
+The coverage for m_oset.c's test is over 99%.</p>
+
+<p>Just fitting these tests into the existing regression test framework
+means that they will only be run under Valgrind. It might also be
+worthwhile to introduce a new type of regression test that should also
+be run natively; this native run could use gcov to determine the test
+coverage. (Added August 27, 2005)</p>
+
+
+<h2>Coding</h2>
+
+<h3>Bug fixes</h3>
+<p>Bug fixes are always welcome. Please consult the Bugzilla page for the
+current bug list.
Bear in mind that choosing the right bugs to fix is an
+art, and it may be worth consulting the developers list before throwing a
+lot of effort at fixing something very obscure. Patches should be submitted
+to the relevant Bugzilla page. (Added August 27, 2005)</p>
+
+
+<h3>Printing floating point values</h3>
+<p>Valgrind's VG_(printf)() function (in coregrind/m_debuglog.c) does
+not support the %f qualifier for printing floating point numbers. In
+several places (eg. in VG_(percentify)() in coregrind/m_libcprint.c) the
+code prints floating point numbers in an ad hoc way. Support for the %f
+qualifier would simplify things. The support probably wouldn't need to
+worry about a lot of the obscure corner cases (eg. NaNs, infinities,
+denormals) that complicate all things related to floating point
+numbers. (Added August 27, 2005)</p>
+
+
+<h3>Updating the C++ demangler</h3>
+<p>Valgrind's C++ demangler was copied almost verbatim from GNU binutils
+about 4 years ago. We have heard of one or two cases where it is
+failing to demangle some symbols; it's possible that new demangling
+cases have been introduced since then. It would be useful if someone
+checked that the demangler is up-to-date, and fixed it if not. The
+relevant Valgrind files are coregrind/m_demangle/*.c. This would be a
+relatively easy project, at the same time being of benefit to a large
+proportion of the Valgrind user base. (Added August 27, 2005)</p>
+
+
+<h3>Core dumping</h3>
+<p>Valgrind 2.4.0 could produce useful core dumps. This functionality
+was disabled in the transition to 3.0.0. It would be useful to have it
+back. This requires factoring out the x86/Linux-specific parts
+suitably. The old code is in m_coredump.c, entirely commented out.
+This would be fairly easy, it just requires looking up the ELF
+documentation to understand how core dumps are structured.
(Added August
+27, 2005)</p>
+
+
+<h3>Supporting custom allocators</h3>
+<p>Valgrind has two client requests, VALGRIND_MALLOCLIKE_BLOCK and
+VALGRIND_FREELIKE_BLOCK that are intended to support custom allocators.
+But they don't work very well. In particular, if you try to hand out
+pieces of memory that came from a malloc() call, they don't work. You
+could write new requests (give them different names) that avoid these
+problems. You should test it with one or more real custom allocators to
+make sure it suffices; the problems with the existing requests stem from
+the fact that they weren't tested in this way. The *MEMPOOL* client
+requests are there for pool-based custom allocators. Looking at them
+may be instructive. This is a fairly straightforward project. (Added
+August 27, 2005)</p>
+
+
+<h3>Addrcheck and/or compressed V bit representation</h3>
+<p>Memcheck more than doubles the amount of memory a program uses, due
+to its V bits. Addrcheck only increases memory by 1/8th. However,
+Addrcheck has not yet been converted from the 2.x to the 3.x line. The
+main issue is converting it to be 64-bit clean, particularly the shadow
+memory. Also, Valgrind's structure has been rearranged a lot since 2.x
+and Addrcheck has bit-rotted somewhat because of this. Getting
+Addrcheck working is fairly straightforward, although it would require a
+good understanding of how shadow memory works.</p>
+
+<p>An alternative is to make Memcheck use less memory. It shadows each
+byte of memory with 8 V bits and 1 A bit. However, the 8 V bits are
+almost always either all 0 or all 1, so there is great potential for
+compressing the representation.
Each byte could be shadowed by 2 bits,
+with the following meanings:</p>
+
+<ul>
+<li>00: unaddressable (but treated as defined; see the Memcheck
+ USENIX paper for details why);</li>
+<li>01: addressable but all 8 bits are undefined;</li>
+<li>10: addressable and all 8 bits are defined;</li>
+<li>11: something else.</li>
+</ul>
+
+<p>A secondary table would be needed for the "something else" case; it
+would map memory addresses to V-bit/A-bit values. The OSet structure
+(see include/pub_tool_oset.h) would be good for this.</p>
+
+<p>Hopefully this can be implemented with a negligible slow-down, since
+the "something else" case is so rare. And the big advantage is the
+large reduction in the amount of address space used, which is
+particularly important on 32-bit machines such as x86. If compressed V
+bits work well, we could probably get rid of Addrcheck altogether, since
+the reduced memory usage is the main advantage it has over Memcheck. (Added
+August 27, 2005)</p>
+
+
+<h3>Preserving debugging information</h3>
+<p>Currently, Valgrind unloads debugging information for shared objects
+when they are unloaded with dlclose(). If the shared object has a
+memory leak, the stack trace for Memcheck's error message at termination
+will be missing entries, which makes fixing the leak difficult. This is
+described in <a href="http://bugs.kde.org/show_bug.cgi?id=79362">bug
+report #79362</a>.</p>
+
+<p>One way to fix this would be to change things so that Valgrind
+records source-level information for stack traces, rather than code
+addresses. It is not entirely clear how to do this in a way that avoids
+unnecessary debug information lookups, and nor how to avoid increasing
+the amount of memory used. An implementation would help clarify the
+problem and show if this approach is feasible. This project is of
+intermediate difficulty.
(Added August 27, 2005)</p>
+
+
+<h3>Ports to new platforms</h3>
+<p>If you are interested in porting Valgrind to a new platform, please
+read <a href="/devel/platforms.html#porting_plans">porting priorities
+statement</a>. Note that porting is a big task and requires a great
+deal of knowledge about the targeted operating system and architecture.
+(Added August 27, 2005)</p>
+
+
+<h2>Research</h2>
+
+<h3>Memcheck V-bit verification</h3>
+<p>Nobody has ever properly tested how reliably Memcheck tracks
+definedness (V bits) through complicated integer operations, nor whether
+all shadow memory load/store operations work right when you take into
+account all permutations of operation size, alignment and endianness.
+It would be useful to have a more formal idea of the properties of the
+scheme. (An interesting case: there are two forms of V bit computation
+for addition, an exact one that Memcheck uses in certain crucial cases,
+and an approximation one that is used most of the time.)</p>
+
+<p>Cryptographic algorithms -- which do a lot of bit twiddling and have
+very long chains of dependent computations -- might be a good starting
+point (you could run a crypto algorithm with completely defined input,
+then run it gain with one undefined bit in the input, etc). The
+Memcheck USENIX paper describes Memcheck's operations in some detail.
+This could be a relatively easy, yet interesting, starter project,
+suitable for an advanced project for an undergraduate student. (Added
+August 27, 2005)</p>
+
+
+<h3>Characterising the kernel interface</h3>
+<p>The interface between Valgrind and the OS kernel is complex and
+important. System calls, signals, threads and memory layout are all
+involved. A document that carefully described all the interactions
+would be very useful, and might lead to improvements in the
+implementation. This would not be easy, but would be a great way to
+learn about OS kernels and Valgrind's internals.
(Added August 27, 2005.)</p>
+
+
+<h3>Identifying root causes of undefined value errors</h3>
+<p>Memcheck's undefined value errors report the use of undefined values
+at the point at which they could affect the program's behaviour. This
+is often far from the root cause of the error, which often makes fixing
+these errors challenging. The Memcheck USENIX paper has lots more
+information about this.</p>
+
+<p>Typically the root cause is forgetting to initialise a piece of
+memory such as a field in a struct. Some errors have more than one root
+cause. If you could associate extra metadata with each undefined value
+that identifies one of its root causes, better undefined error messages
+would be possible. The hard part is maintaining that extra metadata for
+all undefined value errors without taking a large performance hit.
+We're not convinced it's even possible, but we'd love to be proven
+wrong. A solution to this problem would be publishable. (Added August 27,
+2005)</p>
+
+
+<h3>Cryptographic snooping</h3>
+<p>Since a Valgrind tool can see every operation performed by a program,
+it is conceivable that a tool might be able to analyse some kind of
+cryptographic program as it runs to extract certain secret information,
+such as a key. This may not be true at all, but it's an intriguing
+thought. Assuming there is some truth to it, this might make a good
+research project for an honours undergraduate or Masters student, and
+could well be publishable if done well. (Added August 27, 2005)</p>
+
+
+<h3>Detecting dangerous floating point inaccuracies</h3>
+<p>Floating point arithmetic is difficult to get right. It is easy to
+write programs whose outcome depend on incorrect assumptions about
+levels of precision. One could imagine a tool that tracks this
+precision somehow, eg.
by tracking each floating point value with a
+plus/minus, and then complains if the program does an operation that
+relies on more precision than is present.</p>
+
+<p>The exact details of how this would work remain unclear. It would
+require a good knowledge of how floating point arithmetic works. Also,
+Vex only provides 64-bit floating point accuracy, when x86 machines
+provide 80-bit floating point values; this would complicate things
+further.</p>
+
+<p>This is a challenging project that might be suitable for part of a
+Masters or PhD project. It would definitely be publishable if done
+well. (Added August 27, 2005)</p>
+
+
+<h3>A better memory profiler</h3>
+<p>Many memory profilers exist that can tell you how much memory your
+program uses; the Valgrind tool Massif is one example. Other profilers
+can tell you about the cache utilisation of a program; Cachegrind is an
+example.</p>
+
+<p>What other kinds of information about memory use might be useful?
+How else does memory use affect program performance? Perhaps measuring
+memory bandwidth use in some way would be useful -- does the program
+access memory in an even fashion, or is it bursty? When one part of a
+program writes values in memory and another part reads them, that area
+of memory can be thought of as a communication channel between the two
+fragments of code. Is it possible to construct a tool which measures
+these through-memory communication rates between parts of a program?</p>
+
+<p>Can a tool identify inefficient uses of memory, such as copying
+values around unnecessarily? Perhaps a tool that measures page faults
+and helps programmers avoid them would be useful.</p>
+
+<p>The first step is to identify what information a programmer would
+like to know to improve a program's memory usage. This is harder than
+it sounds. Analysing large programs that use memory intensively would
+be a good starting point.
The next step is to work out how a tool can
+provide this information, or a good approximation of it, reasonably
+efficiently.</p>
+
+<p>This is all quite speculative, but I think there is a new kind of
+memory profiler waiting to be invented, and Valgrind would provide an
+excellent platform for developing it. These questions could form part
+of a Masters or PhD project. It would certainly be publishable if done
+well. (Added August 27, 2005)</p>
+

Modified: trunk/php/menu.php
===================================================================
--- trunk/php/menu.php 2005-08-29 22:30:12 UTC (rev 184)
+++ trunk/php/menu.php 2005-08-29 22:30:32 UTC (rev 185)
@@ -36,6 +36,7 @@
  array( 'url'=>'platforms.html', 'tag'=>'Supported Platforms' ),
  array( 'url'=>'cvs_svn.html', 'tag'=>'SVN Repos' ),
  array( 'url'=>'guis.html', 'tag'=>'Front Ends / GUIs' )
+ array( 'url'=>'projects.html', 'tag'=>'Project Suggestions' )*/
  /*array( 'url'=>'consultants.html', 'tag'=>'Commercial Support' )*/
 );
 

Modified: trunk/support/contributing.html
===================================================================
--- trunk/support/contributing.html 2005-08-29 22:30:12 UTC (rev 184)
+++ trunk/support/contributing.html 2005-08-29 22:30:32 UTC (rev 185)
@@ -10,48 +10,16 @@
 about writing good bug reports. Small sample programs that exhibit a bug
 are particularly helpful.</p>
 
-<h3>Testing</h3>
-<p>If you can regularly test Valgrind on a range of different systems, that
-can be very helpful. Please contact the <?php echo vglink( 'vgdevel' ); ?>
-list for more information.</p>
-
 <h3>Documentation</h3>
 </p>Valgrind's documentation is not always kept up to date.
Any documentation
 patches that help in this respect are welcome. Please send them to the
 <?php echo vglink( 'vgdevel' ); ?> list.</p>
 
-<h3>Code</h3>
-<p>If you want to contribute new code to Valgrind, you should subscribe to
-the <?php echo vglink( 'vgdevel' ); ?> list.</p>
+<h3>Software Infrastructure and Code</h3>
+If you are interested in writing code for Valgrind itself or its
+infrastructure, or doing research with it, please consult our
+<a href="/devel/projects.html">project suggestions</a> page.
 
-<p>There are various kinds of code you can contribute.</p>
-<ul>
-<li>Bug fixes are always welcome. Please consult the Bugzilla page for the
- current bug list. Patches should be submitted to the relevant Bugzilla
- page.</li>
-
-<li>Ports to new platforms are also welcome. Note that porting is a big
- task, and so it is worth asking on the
- <?php echo vglink( 'vgdevel' ); ?> list
- first if anyone else is working on a port to your chosen platform.
- Ports require a very good knowledge of the platform you are porting to.
- Ports to widely used platforms are preferable.</li>
-
-<li>We are very conservative about adding new features. Only features
- that are useful to many users, and that do not affect the code base in
- adverse ways are likely to be accepted. If you want to add a feature,
- it is worth asking on the <?php echo vglink( 'vgdevel' ); ?> list if it
- is likely to be incorporated before starting.</li>
-</ul>
-
-<p>Please understand that there is no guarantee that code you write will be
-incorporated into Valgrind. It depends on a number of factors: how well
-written it is, how important are the issues it addresses, how does it affect
-the code base's structure, and so on. Such is the nature of all free
-software projects. However, if you consistently submit high quality
-patches, you may be granted write access to the repository.
This is how
-most of the current developers got involved with Valgrind.</p>
-
 <h3>Money</h3>
 <p>Donations are welcome, large or small. They help pay day-to-day
 running costs, such as bandwidth, web-hosting, electricity, and hardware