From: <sv...@va...> - 2007-09-17 22:28:19
Author: njn
Date: 2007-09-17 23:28:21 +0100 (Mon, 17 Sep 2007)
New Revision: 6853

Log:
Add a section to the cachegrind manual suggesting how to act on the results.

Modified:
   trunk/cachegrind/docs/cg-manual.xml

Modified: trunk/cachegrind/docs/cg-manual.xml
===================================================================
--- trunk/cachegrind/docs/cg-manual.xml 2007-09-17 22:19:01 UTC (rev 6852)
+++ trunk/cachegrind/docs/cg-manual.xml 2007-09-17 22:28:21 UTC (rev 6853)
@@ -1226,28 +1226,31 @@
 <para>
 So, you've managed to profile your program with Cachegrind. Now what?
 What's the best way to actually act on the information it provides to speed
-up your program?</para>
+up your program? Here are some rules of thumb that we have found to be
+useful.</para>
 
 <para>
 First of all, the global hit/miss rate numbers are not that useful. If you
 have multiple programs or multiple runs of a program, comparing the numbers
-might identify if any are outliers. Otherwise, they're not enough to act
-on.</para>
+might identify if any are outliers and worthy of closer investigation.
+Otherwise, they're not enough to act on.</para>
 
 <para>
-The source code annotations are much more useful. In our experience, the
-best place to start is by looking at the <computeroutput>Ir</computeroutput>
-numbers. They simply measure how many instructions were executed for each
-line, and don't include any cache information, but they can still be very
-useful for identifying bottlenecks.</para>
+The line-by-line source code annotations are much more useful. In our
+experience, the best place to start is by looking at the
+<computeroutput>Ir</computeroutput> numbers. They simply measure how many
+instructions were executed for each line, and don't include any cache
+information, but they can still be very useful for identifying
+bottlenecks.</para>
 
 <para>
 After that, we have found that L2 misses are typically a much bigger source
 of slow-downs than L1 misses. So it's worth looking for any snippets of
-code that cause a lot of L2 misses. If you find any, it's still not always
-easy to work out how to improve things. You need to have a reasonable
-understanding of how caches work, the principles of locality, and your
-program's data access patterns. </para>
+code that cause a high proportion of the L2 misses. If you find any, it's
+still not always easy to work out how to improve things. You need to have a
+reasonable understanding of how caches work, the principles of locality, and
+your program's data access patterns. Improving things may require
+redesigning a data structure, for example.</para>
 
 <para>
 In short, Cachegrind can tell you where some of the bottlenecks in your code