Attached are two profiler outputs and the test program used
to generate them (-Xrunhprof:cpu=samples was the profiler
option used). They are viewable with PerfAnal or another
profiler output analyzer of your choice. On my machine this
takes 25-44 seconds. One profiler output is with
strip_whitespace=true, the other with
strip_whitespace=false. It looks like the whitespace stripping code
is the main cause of the slowdown, even with
strip_whitespace=false (explicit <#lt/>, <#rt/>, and <#t/>
directives are still looked up, and that lookup is time consuming as well).
I have no clear idea how to remedy this; one idea would be to
avoid having each and every TextBlock scan its surroundings
looking for explicit trim directives, and instead have the parser
propagate this info to the surrounding text blocks when it
encounters an explicit trim directive. That would mean one
line scan per trim directive instead of one line scan per text block,
as it is implemented now. If you have 5 text blocks on a line (as
now), you currently get 5 scans of that line regardless
of whether there's a deliberate trim on it or not.
Anyway, I don't dare touch that code; I don't grok it.
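The propagation idea above could be sketched roughly as follows. This is not FreeMarker's actual code; the class shape and names (effectiveText, applyLeftTrim, the boolean flags) are illustrative assumptions. The point is that the parser marks the affected blocks once per trim directive, instead of every text block later re-scanning its line for directives.

```java
import java.util.ArrayList;
import java.util.List;

public class TrimPropagationSketch {

    // Stand-in for FreeMarker's TextBlock; fields and names are
    // illustrative only, not the real API.
    static final class TextBlock {
        final String text;
        boolean leftTrimmed;
        boolean rightTrimmed;

        TextBlock(String text) { this.text = text; }

        // Applies whatever trims the parser has propagated to this block.
        String effectiveText() {
            String s = text;
            if (leftTrimmed)  s = s.replaceAll("^\\s+", "");
            if (rightTrimmed) s = s.replaceAll("\\s+$", "");
            return s;
        }
    }

    // Called by the parser exactly once per explicit trim directive
    // (e.g. <#lt/>): it flags the text blocks on the same line, so no
    // block ever has to scan the line looking for directives itself.
    static void applyLeftTrim(List<TextBlock> blocksOnSameLine) {
        for (TextBlock b : blocksOnSameLine) {
            b.leftTrimmed = true;
        }
    }

    public static void main(String[] args) {
        List<TextBlock> line = new ArrayList<>();
        line.add(new TextBlock("   foo"));
        line.add(new TextBlock("   bar"));
        applyLeftTrim(line); // one pass per directive, not one per block
        System.out.println(line.get(0).effectiveText()); // prints "foo"
    }
}
```

With this shape, the per-line work is driven by the number of explicit trim directives, not by the number of text blocks.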
Well, I just tried this against SVN head, and a 64K template of this nature takes a bit under a second (800 ms) to parse on a 2.4 GHz iMac with JDK 1.5. (Rendering (to /dev/null, of course) is only 58 ms, and would probably get faster in a long-running server app once HotSpot got cranked up.)
All this still seems a bit slow to me, to be honest. This new toy of mine is a killer machine, and 800 ms is a lot of time on this box. But OTOH, I'm sure it's not 20x faster than the 1600 MHz AMD box; it might be 3x or even 4x. I think that, under the most pessimistic assumptions, this is about a 5x speedup on this test, though possibly closer to 10x.
Between 5x and 10x improvement, I think, in parsing that file, not in rendering, which is probably the same as it was before.
Probably this bug could be closed. It's not as if parsing speed was generating many complaints before; it's rendering speed that matters, at least for most people. So a significant improvement over a situation that was good enough for just about anybody seems good enough.
In very flat templates, TextBlock.deliberate{Left,Right}Trim dominates the CPU time, slowing down parsing. A very flat template is one where thousands of TemplateElement-s are siblings in the AST, as in the example template.
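A rough cost model (my own illustration, not FreeMarker code) shows why flat templates hit this so hard: if every one of n sibling text blocks scans across its n siblings looking for trim directives, the total work grows quadratically, while a single parser-driven pass per directive stays linear.

```java
public class ScanCostSketch {

    // Each of n sibling blocks scans all n siblings: n * n lookups total.
    static long perBlockScans(int n) {
        return (long) n * n;
    }

    // The parser walks the siblings once per trim directive: n lookups.
    static long singlePassScans(int n) {
        return n;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 1_000, 100_000}) {
            System.out.printf("n=%d  per-block=%d  single-pass=%d%n",
                    n, perBlockScans(n), singlePassScans(n));
        }
    }
}
```

At 100,000 sibling elements the per-block strategy does 10^10 lookups versus 10^5 for a single pass, which matches the "orders of magnitude" difference described for flat templates.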
Now it scales linearly, so for templates with a lot of sibling AST nodes it's orders of magnitude faster.
Last edit: Dániel Dékány 2015-07-05