From: Tony R. <ton...@bu...> - 2006-09-06 08:11:10
On Tuesday 05 September 2006 at 06:56 +0200, Bart Van Assche wrote:
> Hello Tony,
>
> Three questions:
> - Is the paper about PTT available online ?

Yes: http://www.linuxsymposium.org/2005/linuxsymposium_procv2.pdf
Page 111
The PTT web site provides explanations updated in June 2006.

> - Does the PTT tool report false positives ?

PTT does not analyse where problems may be, or whether the program obeys the POSIX thread rules. PTT simply traces the use of POSIX NPTL objects (threads, mutexes, ...) plus very important stages inside NPTL. A new feature is to provide information about contention between threads.

Analysis must be done AFTER the traces have been recorded. So one could build a tool that analyzes the traces and checks if and when something goes wrong, such as two threads accessing an unprotected variable at the same time, or more subtle programming mistakes. Based on the trace information, someone could also build a tool for understanding how the application can be sped up, by finding where the bottlenecks are (many threads waiting for one thread, for example).

> - You wrote about an impact on behavior. Can you explain this
> further ?

Yes. If you run a multi-threaded application in different environments:
- different POSIX thread libraries
- processors with different speeds
- different operating systems
- machines with different numbers of processors
things will be executed in different ways. So bugs may stay hidden in one environment and suddenly appear when you change a very simple element of that environment.

The same holds for a multi-threading debugging tool: the more the tool modifies the way the application runs its threads, the more new bugs you will be able to find, but the less you will be able to catch the one that appears at the customer's site, where the debugging tool is not at work. In fact, in many cases you cannot debug a multi-threaded application, since debugging it makes the problem disappear ...
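To make the idea of post-recording analysis mentioned earlier more concrete, here is a minimal sketch of a bottleneck report computed from an already recorded trace. The record layout (timestamp, thread, event, mutex) and the event names are assumptions made up for this illustration; they are not PTT's actual trace format, and `contention_report` is a hypothetical helper, not part of PTT.

```python
from collections import defaultdict

# Hypothetical trace records: (timestamp_us, thread_id, event, mutex_id).
# This layout is an assumption for illustration, NOT PTT's real format.
TRACE = [
    (0,  "T1", "lock_request",  "m"),
    (1,  "T1", "lock_acquired", "m"),
    (5,  "T2", "lock_request",  "m"),
    (6,  "T3", "lock_request",  "m"),
    (20, "T1", "unlock",        "m"),
    (21, "T2", "lock_acquired", "m"),
    (30, "T2", "unlock",        "m"),
    (31, "T3", "lock_acquired", "m"),
    (35, "T3", "unlock",        "m"),
]


def contention_report(trace):
    """Summarize, per mutex, the time spent waiting and the peak queue length.

    Returns {mutex_id: {"total_wait": int, "max_waiters": int}} where
    "max_waiters" is the largest number of threads that had an outstanding
    lock request on that mutex at the same moment.
    """
    pending = {}                  # (thread, mutex) -> time of lock_request
    waiters = defaultdict(int)    # mutex -> current outstanding requests
    report = defaultdict(lambda: {"total_wait": 0, "max_waiters": 0})

    # Process the records in timestamp order, as an offline tool would.
    for ts, thread, event, mutex in sorted(trace, key=lambda rec: rec[0]):
        if event == "lock_request":
            pending[(thread, mutex)] = ts
            waiters[mutex] += 1
            entry = report[mutex]
            entry["max_waiters"] = max(entry["max_waiters"], waiters[mutex])
        elif event == "lock_acquired":
            start = pending.pop((thread, mutex), None)
            if start is not None:
                report[mutex]["total_wait"] += ts - start
                waiters[mutex] -= 1
        # "unlock" events carry no waiting information and are skipped here.

    return dict(report)


if __name__ == "__main__":
    for mutex, stats in contention_report(TRACE).items():
        print(mutex, stats)
```

With the sample records above, mutex "m" accumulates 42 microseconds of waiting, with at most 2 requests pending at once. A mutex whose totals dominate the report is a candidate for the "many threads waiting for one thread" kind of bottleneck.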

PTT was designed to have the lowest possible impact on performance, so that it modifies the scheduling order of the threads as little as possible. No calls to routines or to the kernel are made. That way the bug can be traced rather than worked around or hidden. This has been tested with a Java benchmark (Volano) and an HPC program (GLucas) on different architectures (ia32, x86_64, ppc, ia64). We know that scalability is not perfect with many CPUs (8, 16, ...), but the impact of PTT is less than 2 or 3 % on very pthread-intensive applications.

Also, PTT can be used to understand whether the problem is in the application or in NPTL, since it can trace two levels: the use of NPTL, and the internal operations inside NPTL.

Tony