Thread: [Parseperl-discuss] Performance hacks
Brought to you by:
adamkennedy
From: Chris D. <ch...@ch...> - 2006-07-20 05:13:47
|
The Perl::Critic team has been working on some performance improvements, some of which rely on caching PPI information. If you have created any PPI performance hacks, we'd love to hear about them, either on this list or on de...@pe... (subscribe via http://perlcritic.tigris.org/servlets/ProjectMailingListList) To see our accomplishments to date, take a look at (for example): PPI::Document::find() caching http://perlcritic.tigris.org/source/browse/perlcritic/trunk/Perl- Critic/lib/Perl/Critic/Document.pm?rev=527&view=markup PPI::Node::content() caching http://perlcritic.tigris.org/servlets/ReadMsg?list=dev&msgNo=464 Chris -- Chris Dolan, Software Developer, http://www.chrisdolan.net/ Public key: http://www.chrisdolan.net/public.key vCard: http://www.chrisdolan.net/ChrisDolan.vcf |
From: Adam K. <ad...@ph...> - 2006-07-20 05:59:25
|
Both of those look like fairly straight forward CPU vs memory tradeoffs. Both assume the document is readonly (which PPI itself doesn't do) although that's fine in your case. The second one is dangerous, because it assumes that ALL PPI document loaded in the entire process will always be readonly. It would be unsafe if Perl::Critic was ever used in the same process as something that manipulated documents. I think we might be able to make some trivial changes to make what you are doing a little safer though. Firstly, we could add a ->readonly flag to PPI::Document, which you would set with my $doc = PPI::Document->new( $something, readonly => 1, ); I've had this document syntax planned for some time now, so we could do things like tabwidth => 8 and so on. What that would then allow you to do is to have some sort of a module that aggressively tweaks PPI's internals for the read-only case. I'd be cool with you guys maintaining it, and then bundling it into the main PPI distribution. When this mode was enabled, the document constructor would refuse to create any document that didn't have ->readonly set to true, so that the assumptions set by your optimisation hacks would be forced to hold true. This way, if you are running a dedicated perlcritic process, you can turn on ReadOnly mode and make perlcritic go faster, but in the case you need to mix critic into some other program like an editor, it will still work, just slower. A simple technique is already used by PPI::XS, which overlays faster versions of several functions over the top of the default PPI ones. Further, if you happen to know a decent XS coder, get them to do some hacking on PPI::XS. The framework and backwards+forwards compatibility capability is already in place, we just need people to write faster versions of single functions. Between XS improvements, and a formalised ReadOnly mode, you should have much much greater scope for writing more and more-varied optimisations. How does that sound? Adam K Chris Dolan wrote: > The Perl::Critic team has been working on some performance > improvements, some of which rely on caching PPI information. If you > have created any PPI performance hacks, we'd love to hear about them, > either on this list or on de...@pe... (subscribe via > http://perlcritic.tigris.org/servlets/ProjectMailingListList) > > To see our accomplishments to date, take a look at (for example): > > PPI::Document::find() caching > http://perlcritic.tigris.org/source/browse/perlcritic/trunk/Perl- > Critic/lib/Perl/Critic/Document.pm?rev=527&view=markup > > PPI::Node::content() caching > http://perlcritic.tigris.org/servlets/ReadMsg?list=dev&msgNo=464 > > Chris > > -- > Chris Dolan, Software Developer, http://www.chrisdolan.net/ > Public key: http://www.chrisdolan.net/public.key > vCard: http://www.chrisdolan.net/ChrisDolan.vcf > > > > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys -- and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > Parseperl-discuss mailing list > Par...@li... > https://lists.sourceforge.net/lists/listinfo/parseperl-discuss |
From: Chris D. <ch...@ch...> - 2006-07-20 07:24:56
|
On Jul 20, 2006, at 12:57 AM, Adam Kennedy wrote: > Both of those look like fairly straight forward CPU vs memory > tradeoffs. > > Both assume the document is readonly (which PPI itself doesn't do) > although that's fine in your case. That's right. > The second one is dangerous, because it assumes that ALL PPI document > loaded in the entire process will always be readonly. It would be > unsafe > if Perl::Critic was ever used in the same process as something that > manipulated documents. Agreed, which is why I didn't commit it to Perl::Critic. > I think we might be able to make some trivial changes to make what you > are doing a little safer though. > > Firstly, we could add a ->readonly flag to PPI::Document, which you > would set with > > my $doc = PPI::Document->new( $something, > readonly => 1, > ); I was thinking something quite similar earlier today, but I couldn't think of a good way to implement it without either 1) sprinkling "if ($self->top->readonly)" everywhere, or 2) multiple inheritance or mixin of a ::Mutable class for the non-readonly case. > I've had this document syntax planned for some time now, so we > could do > things like tabwidth => 8 and so on. > > What that would then allow you to do is to have some sort of a module > that aggressively tweaks PPI's internals for the read-only case. > I'd be > cool with you guys maintaining it, and then bundling it into the main > PPI distribution. > > When this mode was enabled, the document constructor would refuse to > create any document that didn't have ->readonly set to true, so > that the > assumptions set by your optimisation hacks would be forced to hold > true. Wait, are you suggesting a global readonly flag, or a per-instance readonly? It sounds like the former. I was instead thinking of the latter. > This way, if you are running a dedicated perlcritic process, you can > turn on ReadOnly mode and make perlcritic go faster, but in the > case you > need to mix critic into some other program like an editor, it will > still > work, just slower. That certainly makes sense, whether global or per-instance. > A simple technique is already used by PPI::XS, which overlays faster > versions of several functions over the top of the default PPI ones. > > Further, if you happen to know a decent XS coder, get them to do some > hacking on PPI::XS. The framework and backwards+forwards compatibility > capability is already in place, we just need people to write faster > versions of single functions. > > Between XS improvements, and a formalised ReadOnly mode, you should > have > much much greater scope for writing more and more-varied > optimisations. > > How does that sound? Great. I'll play with PPI::XS. Chris -- Chris Dolan, Software Developer, http://www.chrisdolan.net/ Public key: http://www.chrisdolan.net/public.key vCard: http://www.chrisdolan.net/ChrisDolan.vcf |
From: Adam K. <ad...@ph...> - 2006-07-20 07:44:40
|
>> my $doc = PPI::Document->new( $something, >> readonly => 1, >> ); > > I was thinking something quite similar earlier today, but I couldn't > think of a good way to implement it without either 1) sprinkling "if > ($self->top->readonly)" everywhere, or 2) multiple inheritance or mixin > of a ::Mutable class for the non-readonly case. Agreed, actually checking it in every ->prune call would be onerous and slow things down, and doing some form of mutation or mixin for non-read-only seems infeasible without having entire seperate trees, tripling the number of classes. :/ > Wait, are you suggesting a global readonly flag, or a per-instance > readonly? It sounds like the former. I was instead thinking of the > latter. What I'm thinking is that we make ->readonly an advisory flag. If you create with ->readonly, then at least we (or anyone else) can know that the document as a whole is _intended_ to be readonly. We might not actually enforce it at the add_before level, but it can be checked in all sorts of places like PPI::Transform and so on, that deal with entire documents. It means at least anything that cares about readonly has the option to look for it. It also means we can then IN ADDITION to the advisory ->readonly attribute, we can implement a global ReadOnly mode to modify the function undef PPI::... Once that global ReadOnly is enabled, PPI::Document->new would simply die or return undef, and refuse to allow any document to be created that _doesn't_ have the ->readonly flag. So it would be a partly-cooperative, partly-enforced concept. Enforced at the per-document level, but cooperative at the per-element level. Because indeed doing ->top->readonly all over the place would be fairly hideous and slow things down (although we can look at adding it down the line). Adam K |
From: Chris D. <ch...@ch...> - 2006-07-20 14:52:25
|
On Jul 20, 2006, at 2:42 AM, Adam Kennedy wrote: >>> my $doc = PPI::Document->new( $something, >>> readonly => 1, >>> ); >> I was thinking something quite similar earlier today, but I >> couldn't think of a good way to implement it without either 1) >> sprinkling "if ($self->top->readonly)" everywhere, or 2) multiple >> inheritance or mixin of a ::Mutable class for the non-readonly case. > > Agreed, actually checking it in every ->prune call would be onerous > and slow things down, and doing some form of mutation or mixin for > non-read-only seems infeasible without having entire seperate > trees, tripling the number of classes. :/ Like you suggested, I'll play with PPI-XS before actually trying to optimize PPI, but this is an interesting topic. How about using caches with epoch number? It might still be too much overhead (especially in $self->top), but here's some noodling: package PPI::Node; sub remove { ... $self->top->inc_epoch; ... } sub inc_epoch { $self->{epoch}++; # maybe delete some caches pre-emptively? } sub epoch { $self->{epoch} || 0; } sub content { my ($self) = @_; my $epoch = $self->top->epoch; $self->{cache} = {epoch => $epoch} if (!$self->{cache} || $self- >{cache}->{epoch} != $epoch); return $self->{cache}->{content} ||= join q{}, map { $_- >content } @{$self->{children}}; } If the lexer can be exempt from caching and the ratio of reads to writes is large (like Perl::Critic), this could be a win, but I'm not sure. Chris -- Chris Dolan, Software Developer, Clotho Advanced Media Inc. 608-294-7900, fax 294-7025, 1435 E Main St, Madison WI 53703 vCard: http://www.chrisdolan.net/ChrisDolan.vcf Clotho Advanced Media, Inc. - Creators of MediaLandscape Software (http://www.media-landscape.com/) and partners in the revolutionary Croquet project (http://www.opencroquet.org/) |