From: Jed Brown <jed@59...>  2009-06-16 02:11:25

Tim Kroeger wrote:
> Thank you for your detailed hints about profiling tools.  I'm somehow
> always very reluctant towards them for a number of reasons.  Scaling the
> whole application down such that it is faster by a factor of 1000 could
> result in hard work, and could also spoil the results.

My experience is that this always pays off.  You aren't running the small
cases for the results they produce, you are running them to check that the
code is correct.  My top three guidelines for developing parallel code are:

1. Have a single runtime parameter for the size of the problem; it should
   scale between <1 second of runtime and a full production run.  (This can
   be an input mesh or some other parameter controlling problem size.)

2. Make sure the code works correctly in serial before trying it in
   parallel.

3. Run in parallel on your workstation (small problem size) to check
   correctness before moving to the cluster.

Also, I think this one is critical for any PDE solver:

* Manufacture solutions so that you have exact solutions to compare to.
  This is really easy, even for very complex codes, if you write the
  problem as F(u) = 0 or F(u',u,t) = 0 and your code can accommodate
  arbitrary forcing terms.  Choose the solution u(x,t) *before* choosing
  the domain, boundary conditions, or forcing.  The only requirement is
  that it have sufficiently rich derivatives; products of transcendental
  functions like tanh are good.  Then, using a symbolic algebra package
  (Mathematica, Maple, Maxima, SymPy), apply your nonlinear differential
  operators to manufacture a forcing term and print it as C code (these
  packages can do this; don't worry if the expressions are pages long,
  you never have to read them).  Paste the exact solutions and forcing
  terms into your code and use the exact solutions for inhomogeneous
  boundary conditions.  Now you can compare to highly nontrivial exact
  solutions.
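As a minimal sketch of that workflow in SymPy (the 1-D nonlinear heat
operator and the particular solution below are illustrative choices, not
anything from this thread):

```python
# Manufacture a forcing term f so that the chosen u(x,t) exactly solves
#   u_t - (u * u_x)_x = f
# The exact solution is picked first, with rich derivatives (a product
# of transcendental functions, as suggested above).
import sympy as sp

x, t = sp.symbols('x t')

# 1. Choose the exact solution, independent of domain/BCs/forcing.
u = sp.tanh(3*x - 1) * sp.cos(2*t)

# 2. Apply the nonlinear differential operator to manufacture the forcing.
f = sp.diff(u, t) - sp.diff(u * sp.diff(u, x), x)

# 3. Emit C code to paste into the solver; `u` itself also supplies the
#    inhomogeneous boundary values and the reference for error norms.
print('double exact(double x, double t)   { return %s; }' % sp.ccode(u))
print('double forcing(double x, double t) { return %s; }' % sp.ccode(sp.simplify(f)))
```

The emitted expressions may be long, but they only need to be pasted, never
read; the solver is then tested by checking that the computed solution
converges to `exact` at the expected rate.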
Don't worry that they don't look anything like your real solutions
because the forcing terms are highly nonphysical; they *will* test that
your code is correct (converges to highly nontrivial exact solutions at
the correct rate).  This is far more useful than the nearly degenerate
(physical) exact solutions that are commonly used for testing
correctness.  [/pulpit]

> Being root will be practically impossible.  The admin might be willing
> to help me, but he is located quite far away, so any help offered is
> based on email (or possibly telephone), and he won't supply the root
> password to me.

It's definitely worth asking him what profiling tools are available.
Gprof is also an option.

> Well, at least I have now enabled the already existing option of
> stopping the simulation after a small number of steps.  The problem is
> that even 10 steps already take about 5 hours.

This doesn't make sense to me.  The full run was 576 steps in 18 hours
(ghosted case), which works out to about 2 minutes per step.  (Well, that
many assemblies; are there many assemblies per step?)

> I've added quite fine-grained START_LOG()/STOP_LOG() pairs now inside
> one of my assembly functions (that is, the one that is most probably
> responsible for the poor performance)

This is the poor man's profiler.  It's a bit clumsy, but it should work.

Jed