From: Albert G. <Dr....@t-...> - 2008-07-07 21:41:26
|
Albert Graef wrote: >>> In return I promise to investigate why the dict and set modules take >>> so long to compile. :) >> It would be nice, it is really slow. > > Ok, I did some profiling now, and it seems that the lion's share (over > 70%) of the time is spent within LLVM, only a small fraction (not > exceeding 15%) goes to the parser and all my own semantic routines. I think I pinned that one down now. Noticing that only set.pure and dict.pure are affected, but not the other container data structures whose code is in principle quite similar, I took a closer look at the generated pattern matching code of these modules. Lo and behold, there are in fact some functions in those two modules which have a lot of overlapping rules producing automata with some 260-300 states. So I removed some the special case rules in those definitions, and that alone made the modules load thrice as fast. Mind you, 260-300 states isn't big at all, my construction algorithm can easily deal with much bigger automata, but of course the automaton then also yields a routine in LLVM IR which includes a decision tree (nested switch statements) with a total of 260-300 target labels. My guess is that the LLVM IR builder gets really slow when dealing with "big" assembler routines like this (maybe some badly coded algorithms with quadratic complexity in there, I haven't looked at the code). So now I have to get Pure to compile with LLVM 2.3 to see whether they fixed those horrible inefficiencies with the new builder class. If not then as an intermediate measure some refactoring of those overlapping rules in dict.pure and set.pure will help. Cheers, Albert -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |