Thread: [htmltmpl] Odd slowdown in HTML::Template with INCL and LOOP
Brought to you by:
samtregar
From: Hakim C. <hak...@da...> - 2007-07-03 14:02:40
|
Hi, We're having an "interesting" problem with a significant slowdown in processing of some HTML::Template templates. We are using: - using H::T 2.8 and 2.9 - an house framework that feeds params from a CGI object to H::T - CGI 2.81 and 3.15 - Perl 5.8.0 and 5.8.4 If the following conditions hold: - template X is <TMPL_INCL>uded - the parameter "region" is set (for <TMPL_LOOP NAME="REGION">) then processing time for the template is approx 1 second longer. However - if either the TMPL_INCL or the TMPL_LOOPs are commented, processing time is reduced - the TMPL_INCL does *not* refer to the TMPL_LOOP, and neither does the one template it in turn includes. The DProf runs include the following (with some entries for BEGIN on inhouse modules removed) : --------------- without including TMPL_LOOP super@adamo102:/SEVA/dada.net/beta2/cgi-bin/sn_test$ dprofpp -rI province_noprov.out Total Elapsed Time = 1.346560 Seconds Real Time = 1.346560 Seconds Inclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 65.7 0.030 0.885 5 0.0060 0.1770 main::BEGIN 43.8 0.008 0.590 1 0.0083 0.5899 Our::Application::pageOutput 43.8 0.000 0.590 1 0.0000 0.5897 Our::Application::printPageOutput 9.43 0.050 0.127 78 0.0006 0.0016 HTML::Template::BEGIN 8.69 0.000 0.117 539 0.0000 0.0002 JSON::Parser::value 8.24 0.120 0.111 9267 0.0000 0.0000 HTML::Template::param 7.28 0.000 0.098 2 0.0000 0.0488 HTML::Template::_init --------------- with the TMPL_LOOP super@adamo102:/SEVA/dada.net/beta2/cgi-bin/sn_test$ dprofpp -rI province_prov.out Total Elapsed Time = 2.443450 Seconds Real Time = 2.443450 Seconds Inclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 109. 1.149 2.676 387 0.0030 0.0069 HTML::Template::BEGIN 76.8 0.000 1.877 1 0.0000 1.8765 Our::Application::printPageOutput 76.4 0.027 1.867 1 0.0275 1.8668 Our::Application::pageOutput 55.5 0.010 1.356 8 0.0012 0.1696 HTML::Template::LOOP::output 34.9 0.020 0.853 5 0.0040 0.1705 main::BEGIN 10.4 0.300 0.256 28736 0.0000 0.0000 HTML::Template::param e.g. the odd things include - "HTML::Template::BEGIN" is called a large number of times. I don't see that many BEGIN { } or 'use' blocks in the H::T code, nor do I understand why they would be called because of a TMPL_INCL - HTML::Template::LOOP::output takes proportionally much longer. I'm sorry that this isn't a minimal test case - I'm working with our HTML editor to try to identify what it is! But if anyone has any suggestions on where to start trying to debug this, I'd be interested, and grateful, to see what you suggest. Regards, Hakim (osfameron) |
From: Hakim C. <hak...@da...> - 2007-07-04 12:40:26
|
Hakim Cassimally wrote: > Hi, > > We're having an "interesting" problem with a significant slowdown in > processing of some HTML::Template templates. The current suspicion, after HTML bod and $boss worked on this, by a process of elimination is that the problem is when the same data structure is used twice in the same ->output() call. So, in some cases, the loop "REGION" below, was used twice in two separate TMPL_LOOP cases. This seems to add as much as 0.5 seconds every time the parameter is reused. The workaround seems to be to clone the data structure and create two separate parameters, which seems very wrong. Has anyone else experienced this? (It might be a feature of our framework code rather than H::T, I'll work on the minimal test case on Friday after a release...) Cheerio, Hakim > We are using: > - using H::T 2.8 and 2.9 > - an house framework that feeds params from a CGI object to H::T > - CGI 2.81 and 3.15 > - Perl 5.8.0 and 5.8.4 > > If the following conditions hold: > > - template X is <TMPL_INCL>uded > - the parameter "region" is set (for <TMPL_LOOP NAME="REGION">) > > then processing time for the template is approx 1 second longer. > > However > > - if either the TMPL_INCL or the TMPL_LOOPs are commented, processing > time is reduced > - the TMPL_INCL does *not* refer to the TMPL_LOOP, and neither does the > one template it in turn includes. > > The DProf runs include the following (with some entries for BEGIN on > inhouse modules removed) : > > --------------- without including TMPL_LOOP > > $ dprofpp -rI > province_noprov.out > Total Elapsed Time = 1.346560 Seconds > Real Time = 1.346560 Seconds > Inclusive Times > %Time ExclSec CumulS #Calls sec/call Csec/c Name > 65.7 0.030 0.885 5 0.0060 0.1770 main::BEGIN > 43.8 0.008 0.590 1 0.0083 0.5899 Our::Application::pageOutput > 43.8 0.000 0.590 1 0.0000 0.5897 > Our::Application::printPageOutput > 9.43 0.050 0.127 78 0.0006 0.0016 HTML::Template::BEGIN > 8.69 0.000 0.117 539 0.0000 0.0002 JSON::Parser::value > 8.24 0.120 0.111 9267 0.0000 0.0000 HTML::Template::param > 7.28 0.000 0.098 2 0.0000 0.0488 HTML::Template::_init > > --------------- with the TMPL_LOOP > > $ dprofpp -rI > province_prov.out > Total Elapsed Time = 2.443450 Seconds > Real Time = 2.443450 Seconds > Inclusive Times > %Time ExclSec CumulS #Calls sec/call Csec/c Name > 109. 1.149 2.676 387 0.0030 0.0069 HTML::Template::BEGIN > 76.8 0.000 1.877 1 0.0000 1.8765 > Our::Application::printPageOutput > 76.4 0.027 1.867 1 0.0275 1.8668 Our::Application::pageOutput > 55.5 0.010 1.356 8 0.0012 0.1696 HTML::Template::LOOP::output > 34.9 0.020 0.853 5 0.0040 0.1705 main::BEGIN > 10.4 0.300 0.256 28736 0.0000 0.0000 HTML::Template::param > > e.g. the odd things include > > - "HTML::Template::BEGIN" is called a large number of times. I don't > see that many BEGIN { } or 'use' blocks in the H::T code, nor do I > understand why they would be called because of a TMPL_INCL > > - HTML::Template::LOOP::output takes proportionally much longer. > > I'm sorry that this isn't a minimal test case - I'm working with our > HTML editor to try to identify what it is! But if anyone has any > suggestions on where to start trying to debug this, I'd be interested, > and grateful, to see what you suggest. > > Regards, > > Hakim (osfameron) |
From: Mathew R. <mat...@ne...> - 2007-07-05 00:43:00
|
Hi Hakim, I think you might be over analysing the problem. :-) When a TMPL_LOOP is encountered, H::T needs to create (for want of a better description) a child context, that contains only the template variables for the child. Each child context is effectively a new H::T object, so that the very top H::T object accumulates the output from each child. By removing the LOOP, you are effectively completely avoiding the child context thus saving quite a significant amount of processing. If you are looking for performance increases, you should start simply by printing a hi-res time from various points within "output()". I have done this previously and found that I was able to find a dramatic increase in performance, just by removing foreach() loops. ** regards, Mathew Robertson ** If you interested, I can send you a patch which ("works for me") speed's up the template-variable symbol resolution. This patch has achieved close to 20x speed up under some circumstances. Most templates are speed up by about 1.5x. Hakim Cassimally wrote: > Hakim Cassimally wrote: > >> Hi, >> >> We're having an "interesting" problem with a significant slowdown in >> processing of some HTML::Template templates. >> > > The current suspicion, after HTML bod and $boss worked on this, > by a process of elimination is that the problem is when the same > data structure is used twice in the same ->output() call. > > So, in some cases, the loop "REGION" below, was used twice in > two separate TMPL_LOOP cases. This seems to add as much as > 0.5 seconds every time the parameter is reused. > > The workaround seems to be to clone the data structure and > create two separate parameters, which seems very wrong. Has > anyone else experienced this? (It might be a feature of our > framework code rather than H::T, I'll work on the minimal test > case on Friday after a release...) > > Cheerio, > Hakim > > >> We are using: >> - using H::T 2.8 and 2.9 >> - an house framework that feeds params from a CGI object to H::T >> - CGI 2.81 and 3.15 >> - Perl 5.8.0 and 5.8.4 >> >> If the following conditions hold: >> >> - template X is <TMPL_INCL>uded >> - the parameter "region" is set (for <TMPL_LOOP NAME="REGION">) >> >> then processing time for the template is approx 1 second longer. >> >> However >> >> - if either the TMPL_INCL or the TMPL_LOOPs are commented, processing >> time is reduced >> - the TMPL_INCL does *not* refer to the TMPL_LOOP, and neither does the >> one template it in turn includes. >> >> The DProf runs include the following (with some entries for BEGIN on >> inhouse modules removed) : >> >> --------------- without including TMPL_LOOP >> >> $ dprofpp -rI >> province_noprov.out >> Total Elapsed Time = 1.346560 Seconds >> Real Time = 1.346560 Seconds >> Inclusive Times >> %Time ExclSec CumulS #Calls sec/call Csec/c Name >> 65.7 0.030 0.885 5 0.0060 0.1770 main::BEGIN >> 43.8 0.008 0.590 1 0.0083 0.5899 Our::Application::pageOutput >> 43.8 0.000 0.590 1 0.0000 0.5897 >> Our::Application::printPageOutput >> 9.43 0.050 0.127 78 0.0006 0.0016 HTML::Template::BEGIN >> 8.69 0.000 0.117 539 0.0000 0.0002 JSON::Parser::value >> 8.24 0.120 0.111 9267 0.0000 0.0000 HTML::Template::param >> 7.28 0.000 0.098 2 0.0000 0.0488 HTML::Template::_init >> >> --------------- with the TMPL_LOOP >> >> $ dprofpp -rI >> province_prov.out >> Total Elapsed Time = 2.443450 Seconds >> Real Time = 2.443450 Seconds >> Inclusive Times >> %Time ExclSec CumulS #Calls sec/call Csec/c Name >> 109. 1.149 2.676 387 0.0030 0.0069 HTML::Template::BEGIN >> 76.8 0.000 1.877 1 0.0000 1.8765 >> Our::Application::printPageOutput >> 76.4 0.027 1.867 1 0.0275 1.8668 Our::Application::pageOutput >> 55.5 0.010 1.356 8 0.0012 0.1696 HTML::Template::LOOP::output >> 34.9 0.020 0.853 5 0.0040 0.1705 main::BEGIN >> 10.4 0.300 0.256 28736 0.0000 0.0000 HTML::Template::param >> >> e.g. the odd things include >> >> - "HTML::Template::BEGIN" is called a large number of times. I don't >> see that many BEGIN { } or 'use' blocks in the H::T code, nor do I >> understand why they would be called because of a TMPL_INCL >> >> - HTML::Template::LOOP::output takes proportionally much longer. >> >> I'm sorry that this isn't a minimal test case - I'm working with our >> HTML editor to try to identify what it is! But if anyone has any >> suggestions on where to start trying to debug this, I'd be interested, >> and grateful, to see what you suggest. >> >> Regards, >> >> Hakim (osfameron) >> > > ------------------------------------------------------------------------- > This SF.net email is sponsored by DB2 Express > Download DB2 Express C - the FREE version of DB2 express and take > control of your XML. No limits. Just data. Click to get it now. > http://sourceforge.net/powerbar/db2/ > _______________________________________________ > Html-template-users mailing list > Htm...@li... > https://lists.sourceforge.net/lists/listinfo/html-template-users > |
From: Hakim C. <hak...@da...> - 2007-07-09 09:22:24
|
Mathew Robertson ha scritto: > Hi Hakim, > > I think you might be over analysing the problem. :-) Heh, that is possible :-) > > When a TMPL_LOOP is encountered, H::T needs to create (for want of a > better description) a child context, that contains only the template > variables for the child. Each child context is effectively a new H::T > object, so that the very top H::T object accumulates the output from > each child. OK, that makes sense. I don't understand why DProf is showing more BEGIN blocks in this case though. As far as I could see from a scan of the code it should be just instantiating ->new objects rather than re-evalling, so I don't understand where this is coming from. > By removing the LOOP, you are effectively completely avoiding the child > context thus saving quite a significant amount of processing. Ah! Even if the child process is used to process a loop of 20 items, if the parent context is large, the whole context is copied non-lazily to the child? We have tried running the template output with the largest context object (dictionary for localisation) turned off, without noticing any impact on the pathological case. > If you are looking for performance increases, you should start simply by > printing a hi-res time from various points within "output()". I have > done this previously and found that I was able to find a dramatic > increase in performance, just by removing foreach() loops. ** That's a possibility, though I think loops are fairly basic functionality that I'd like not to have to prune from template code... > regards, > Mathew Robertson > > ** If you interested, I can send you a patch which ("works for me") > speed's up the template-variable symbol resolution. This patch has > achieved close to 20x speed up under some circumstances. Most templates > are speed up by about 1.5x. Shiny! I'd be interested to take a look, Thanks and regards, Hakim > > > Hakim Cassimally wrote: >> Hakim Cassimally wrote: >> >>> Hi, >>> >>> We're having an "interesting" problem with a significant slowdown in >>> processing of some HTML::Template templates. >>> >> >> The current suspicion, after HTML bod and $boss worked on this, >> by a process of elimination is that the problem is when the same >> data structure is used twice in the same ->output() call. >> >> So, in some cases, the loop "REGION" below, was used twice in >> two separate TMPL_LOOP cases. This seems to add as much as >> 0.5 seconds every time the parameter is reused. >> >> The workaround seems to be to clone the data structure and >> create two separate parameters, which seems very wrong. Has >> anyone else experienced this? (It might be a feature of our >> framework code rather than H::T, I'll work on the minimal test >> case on Friday after a release...) >> >> Cheerio, >> Hakim >> >> >>> We are using: >>> - using H::T 2.8 and 2.9 >>> - an house framework that feeds params from a CGI object to H::T >>> - CGI 2.81 and 3.15 >>> - Perl 5.8.0 and 5.8.4 >>> >>> If the following conditions hold: >>> >>> - template X is <TMPL_INCL>uded >>> - the parameter "region" is set (for <TMPL_LOOP NAME="REGION">) >>> >>> then processing time for the template is approx 1 second longer. >>> >>> However >>> >>> - if either the TMPL_INCL or the TMPL_LOOPs are commented, processing >>> time is reduced >>> - the TMPL_INCL does *not* refer to the TMPL_LOOP, and neither does the >>> one template it in turn includes. >>> >>> The DProf runs include the following (with some entries for BEGIN on >>> inhouse modules removed) : >>> >>> --------------- without including TMPL_LOOP >>> >>> $ dprofpp -rI >>> province_noprov.out >>> Total Elapsed Time = 1.346560 Seconds >>> Real Time = 1.346560 Seconds >>> Inclusive Times >>> %Time ExclSec CumulS #Calls sec/call Csec/c Name >>> 65.7 0.030 0.885 5 0.0060 0.1770 main::BEGIN >>> 43.8 0.008 0.590 1 0.0083 0.5899 Our::Application::pageOutput >>> 43.8 0.000 0.590 1 0.0000 0.5897 >>> Our::Application::printPageOutput >>> 9.43 0.050 0.127 78 0.0006 0.0016 HTML::Template::BEGIN >>> 8.69 0.000 0.117 539 0.0000 0.0002 JSON::Parser::value >>> 8.24 0.120 0.111 9267 0.0000 0.0000 HTML::Template::param >>> 7.28 0.000 0.098 2 0.0000 0.0488 HTML::Template::_init >>> >>> --------------- with the TMPL_LOOP >>> >>> $ dprofpp -rI >>> province_prov.out >>> Total Elapsed Time = 2.443450 Seconds >>> Real Time = 2.443450 Seconds >>> Inclusive Times >>> %Time ExclSec CumulS #Calls sec/call Csec/c Name >>> 109. 1.149 2.676 387 0.0030 0.0069 HTML::Template::BEGIN >>> 76.8 0.000 1.877 1 0.0000 1.8765 >>> Our::Application::printPageOutput >>> 76.4 0.027 1.867 1 0.0275 1.8668 Our::Application::pageOutput >>> 55.5 0.010 1.356 8 0.0012 0.1696 HTML::Template::LOOP::output >>> 34.9 0.020 0.853 5 0.0040 0.1705 main::BEGIN >>> 10.4 0.300 0.256 28736 0.0000 0.0000 HTML::Template::param >>> >>> e.g. the odd things include >>> >>> - "HTML::Template::BEGIN" is called a large number of times. I don't >>> see that many BEGIN { } or 'use' blocks in the H::T code, nor do I >>> understand why they would be called because of a TMPL_INCL >>> >>> - HTML::Template::LOOP::output takes proportionally much longer. >>> >>> I'm sorry that this isn't a minimal test case - I'm working with our >>> HTML editor to try to identify what it is! But if anyone has any >>> suggestions on where to start trying to debug this, I'd be interested, >>> and grateful, to see what you suggest. >>> >>> Regards, >>> >>> Hakim (osfameron) >>> >> >> ------------------------------------------------------------------------- >> This SF.net email is sponsored by DB2 Express >> Download DB2 Express C - the FREE version of DB2 express and take >> control of your XML. No limits. Just data. Click to get it now. >> http://sourceforge.net/powerbar/db2/ >> _______________________________________________ >> Html-template-users mailing list >> Htm...@li... >> https://lists.sourceforge.net/lists/listinfo/html-template-users >> |
From: Mathew R. <mat...@ne...> - 2007-07-09 23:41:02
|
>> When a TMPL_LOOP is encountered, H::T needs to create (for want of a >> better description) a child context, that contains only the template >> variables for the child. Each child context is effectively a new >> H::T object, so that the very top H::T object accumulates the output >> from each child. > > OK, that makes sense. I don't understand why DProf is showing more > BEGIN blocks in this case though. As far as I could see from a scan > of the code it should be just instantiating ->new objects rather than > re-evalling, so I don't understand where this is coming from. My experience with DProf has been good and bad; I find the output a little confusing.... Usually "print nanosecond-timer" gives me the good oil. > >> By removing the LOOP, you are effectively completely avoiding the >> child context thus saving quite a significant amount of processing. > > Ah! Even if the child process is used to process a loop of 20 items, > if the parent context is large, the whole context is copied non-lazily > to the child? Depends if "global_vars" is set; although, that option doesn't immediately cause too much of a slow down. > > We have tried running the template output with the largest context > object (dictionary for localisation) turned off, without noticing > any impact on the pathological case. did you mean "largest object 'associate'd with H::T"? In any case, most of the code is O(n) for template size and O(1) or similar for param-value lookups. > >> If you are looking for performance increases, you should start simply >> by printing a hi-res time from various points within "output()". I >> have done this previously and found that I was able to find a >> dramatic increase in performance, just by removing foreach() loops. ** > > That's a possibility, though I think loops are fairly basic > functionality that I'd like not to have to prune from template code... Past programming experience tells me that "for { for { ... }}" is O(n^2)... so that is always the first thing I look for. H::T has this case at the very start of the output() code block -> so its a good candidate for profiling. > >> regards, >> Mathew Robertson >> >> ** If you interested, I can send you a patch which ("works for me") >> speed's up the template-variable symbol resolution. This patch has >> achieved close to 20x speed up under some circumstances. Most >> templates are speed up by about 1.5x. > > Shiny! I'd be interested to take a look, No sweat. This replaces the "associate magic" code block lines 2658 to 2686 (H::T 2.9, 2.8 is similar): # support the associate magic, searching for undefined params and # attempting to fill them from the associated objects. if (scalar(@{$options->{associate}})) { my @undef_params; foreach my $param (keys %{$self->{param_map}}) { next if (defined $self->param($param)); push @undef_params, $param; } if (scalar(@undef_params)) { my $value; # if case sensitive mode or no CGI objects, we can use the fast path if ($options->{case_sensitive} or (grep { !/^1/ } map { UNIVERSAL::isa($_,'HTML::Template') } @{$options->{associate}}) == 0) { foreach my $param (@undef_params) { foreach my $associated_object (reverse @{$options->{associate}}) { $value = $associated_object->param($param); next unless (defined $value); $self->param($param, scalar $value); last; } } } else { my %case_map; foreach my $associated_object (@{$options->{associate}}) { map { $case_map{$associated_object}{lc($_)} = $_ } $associated_object->param(); } my $associated_param; foreach my $param (@undef_params) { foreach my $associated_object (reverse @{$options->{associate}}) { $associated_param = $case_map{$associated_object}{$param}; next unless (defined $associated_param); $value = $associated_object->param($associated_param); next unless (defined $value); $self->param($param, scalar $value); last; } } } } } You probably want to do a timing check using the old code vs this new code - just to ensure that it isn't making the problem worse. cheers, Mathew |