[Lxr-commits] CVS: lxr/lib/LXR Files.pm, 1.22, 1.23 Index.pm, 1.23, 1.24 Lang.pm, 1.50, 1.51 Markup
Brought to you by:
ajlittoz
From: Andre-Littoz <ajl...@us...> - 2013-09-21 12:54:55
|
Update of /cvsroot/lxr/lxr/lib/LXR In directory sfp-cvs-1.v30.ch3.sourceforge.com:/tmp/cvs-serv4903/lib/LXR Modified Files: Files.pm Index.pm Lang.pm Markup.pm SimpleParse.pm Log Message: Files.pm, Index.pm, Lang.pm, Markup.pm, SimpleParse.pm, Files/*, Index/*, Lang/*: better comments, source code improvement & optilmisation Index: Files.pm =================================================================== RCS file: /cvsroot/lxr/lxr/lib/LXR/Files.pm,v retrieving revision 1.22 retrieving revision 1.23 diff -u -d -r1.22 -r1.23 --- Files.pm 18 Jan 2013 17:48:50 -0000 1.22 +++ Files.pm 21 Sep 2013 12:54:52 -0000 1.23 @@ -48,7 +48,7 @@ =item 1 C<$params> -an optional I<hash> from lxr.conf used to pass extra information +an optional I<hash> reference from lxr.conf used to pass extra information to the real constructor =back @@ -124,7 +124,7 @@ return @dircontents; } -=head2 C<getfile ($pathname, $releaseid, $withannot)> +=head2 C<getfile ($pathname, $releaseid)> C<getfile> returns a file content in a string. @@ -139,10 +139,6 @@ the release (or version) in which C<$pathname> is expected to be found -=item 1 C<$withannot> - -optional, if defined request an annotated file - =back Function result is C<undef> if file does not exist in this version. @@ -192,6 +188,8 @@ =over +=item + Starting with release 1.1, this method should only be used for internal needs of the derived classes because annotation editing has been drastically changed in script I<source>. @@ -277,7 +275,7 @@ return ++$len; } -=head2 C<getauthor ($pathname, $annotation)> +=head2 C<getauthor ($pathname, $releaseid, $annotation)> C<getauthor> returns the author of the designated revision. @@ -461,7 +459,7 @@ eventually adding C</> suffix. Afterwards, all is needed is test for the trailing slash. -This sub is used when the existence must be confirmed, such as +This method is used when the existence must be confirmed, such as when processing an include link since it is independent from the currently displayed file. @@ -492,7 +490,7 @@ =back -This sub is used when the existence must be confirmed, such as +This method is used when the existence must be confirmed, such as when processing an include link since it is independent from the currently displayed file. @@ -502,7 +500,7 @@ =item -I<< When the file is subsequently accessed, it is much simpler and +I<When the file is subsequently accessed, it is much simpler and efficient to use C<getfilehandle>, since a handle will be required anyway.> @@ -535,7 +533,7 @@ =back -Extract content of the path from repository and stuf it into a +Extract content of the path from repository and stuff it into a temporary file whose name is returned. B<Note:> @@ -558,7 +556,7 @@ $fileh = $self->getfilehandle ($filename, $releaseid); return undef unless defined($fileh); - $tmp = $config->tmpdir + $tmp = $config->{'tmpdir'} . '/lxrtmp.' . time . '.' . $$ @@ -614,9 +612,7 @@ C<_ignoredirs> is an internal (as indicated by _ prefix) filter utility to exclude directories containing any partial path defined in configuration -parameter C<'ignoredirs'>. - -The filter is to be called from C<getdir()>. +parameters C<'ignoredirs'> and C<'filterdirs'>. =over @@ -630,22 +626,28 @@ =back -Only the last part is tested since the parent is supposed to have been -scanned by a previous step of the recursive directory tree traversal. +Only the last part is tested for C<'ignoredirs'> since the parent +is supposed to have been scanned by a previous step of the recursive +directory tree traversal. If a higher element matched one of the C<'ignoredirs'> strings, that path part was filtered out and no further part is presented to this function. +C<'filterdirs'> operates on the full path, +I<i.e.> C<$path> concatenated with C<$node>. + B<Note:> =over +=item + The filter is to be called from C<getdir()>. -I<<This usage choice leaves the possibility to override the filter through +I<This usage choice leaves the possibility to override the filter through manually entering the path in the URL. Since it does not go through C<getdir()>, the "forbidden" path subdirectory is transmitted unaltered -to the source display script.>> +to the source display script.> =back @@ -654,7 +656,7 @@ sub _ignoredirs { my ($self, $path, $node) = @_; - return 1 if $node =~ m/^\./; # ignore "dot" dirs + return 1 if substr($node, 0, 1) eq '.'; # ignore "dot" dirs foreach my $ignoredir (@{$config->{'ignoredirs'}}) { return 1 if $node eq $ignoredir; } @@ -668,9 +670,7 @@ C<_ignorefiles> is an internal (as indicated by _ prefix) filter utility to exclude files containing patterns defined in configuration -parameter C<'ignorefiles'>. - -The filter is to be called from C<getdir()>. +parameters C<'ignorefiles'> and C<'filterfiles'>. =over @@ -684,24 +684,26 @@ =back -Presently, only filename filtering is done, i.e. the same filter is -applied in every directory. +Only filename filtering is done for C<'ignorefiles'>, +i.e. the same filter is applied in every directory. Usually, it screens off "dot" files, editor backups, binaries, ... -A more specific filtering could be implemented taking into account -both the parent directory and the filename. -But this extended feature will be added only on user request due to -its time-cost on huge trees such as Linux kernel. + +C<'filterfiles'> operates on the full path, +I<i.e.> concatenation of the parent directory C<$path> +and the filename C<$node>. B<Note:> =over +=item + The filter is to be called from C<getdir()>. -I<<This usage choice leaves the possibility to override the filter through +I<This usage choice leaves the possibility to override the filter through manually entering the path in the URL. Since it does not go through C<getdir()>, the "forbidden" filename is transmitted unaltered -to the source display script.>> +to the source display script.> =back Index: Index.pm =================================================================== RCS file: /cvsroot/lxr/lxr/lib/LXR/Index.pm,v retrieving revision 1.23 retrieving revision 1.24 diff -u -d -r1.23 -r1.24 --- Index.pm 1 Dec 2012 15:03:19 -0000 1.23 +++ Index.pm 21 Sep 2013 12:54:52 -0000 1.24 @@ -30,7 +30,6 @@ $CVSID = '$Id$ '; -use LXR::Common; use strict; # @@ -59,10 +58,10 @@ =item -I<<There used to be a second C<@args> argument which passed file +I<There used to be a second C<@args> argument which passed file open-attributes (such as C<O_RDWR> or C<O_CREAT>) when the DB -made of a set of files. -This is no longer used.> +was made of a set of files. +This is no longer used with DB engines.> =back @@ -91,18 +90,15 @@ # only once and do not contribute to the running time behaviour. sub new { - my ($self, $dbname) = @_; + my ($self, $dbname, $prefix) = @_; my $index; %files = (); %symcache = (); %cntcache = (); - my $prefix; - if (defined($config->{'dbprefix'})) { - $prefix = $config->{'dbprefix'}; - } else { - $prefix = "lxr_"; + if (!defined($prefix)) { + $prefix = 'lxr_'; } if ($dbname =~ m/^DBI:/i) { @@ -135,19 +131,19 @@ $index->{'files_select'} = $index->{dbh}->prepare ( "select fileid from ${prefix}files" - . " where filename = ? and revision = ?" + . ' where filename = ? and revision = ?' ); } if (!exists($index->{'allfiles_select'})) { $index->{'allfiles_select'} = $index->{dbh}->prepare - ( "select f.fileid, f.filename, f.revision, t.relcount" + ( 'select f.fileid, f.filename, f.revision, t.relcount' . " from ${prefix}files f, ${prefix}status t" . ", ${prefix}releases r" - . " where r.releaseid = ?" - . " and f.fileid = r.fileid" - . " and t.fileid = r.fileid" - . " order by f.filename, f.revision" + . ' where r.releaseid = ?' + . ' and f.fileid = r.fileid' + . ' and t.fileid = r.fileid' + . ' order by f.filename, f.revision' ); } @@ -156,38 +152,38 @@ $index->{'symbols_byname'} = $index->{dbh}->prepare ( "select symid, symcount from ${prefix}symbols" - . " where symname = ?" + . ' where symname = ?' ); } if (!exists($index->{'symbols_byid'})) { $index->{'symbols_byid'} = $index->{dbh}->prepare ( "select symname from ${prefix}symbols" - . " where symid = ?" + . ' where symid = ?' ); } if (!exists($index->{'symbols_setref'})) { $index->{'symbols_setref'} = $index->{dbh}->prepare ( "update ${prefix}symbols" - . " set symcount = ?" - . " where symid = ?" + . ' set symcount = ?' + . ' where symid = ?' ); } if (!exists($index->{'related_symbols_select'})) { $index->{'related_symbols_select'} = $index->{dbh}->prepare - ( "select s.symid, s.symcount, s.symname" + ( 'select s.symid, s.symcount, s.symname' . " from ${prefix}symbols s, ${prefix}definitions d" - . " where d.fileid = ?" - . " and s.symid = d.relid" + . ' where d.fileid = ?' + . ' and s.symid = d.relid' ); } if (!exists($index->{'delete_symbols'})) { $index->{'delete_symbols'} = $index->{dbh}->prepare ( "delete from ${prefix}symbols" - . " where symcount = 0" + . ' where symcount = 0' ); } @@ -195,32 +191,32 @@ $index->{'definitions_insert'} = $index->{dbh}->prepare ( "insert into ${prefix}definitions" - . " (symid, fileid, line, langid, typeid, relid)" - . " values (?, ?, ?, ?, ?, ?)" + . ' (symid, fileid, line, langid, typeid, relid)' + . ' values (?, ?, ?, ?, ?, ?)' ); } if (!exists($index->{'definitions_select'})) { $index->{'definitions_select'} = $index->{dbh}->prepare - ( "select f.filename, d.line, l.declaration, d.relid" + ( 'select f.filename, d.line, l.declaration, d.relid' . " from ${prefix}symbols s, ${prefix}definitions d" . ", ${prefix}files f, ${prefix}releases r" . ", ${prefix}langtypes l" - . " where s.symname = ?" - . " and r.releaseid = ?" - . " and d.fileid = r.fileid" - . " and d.symid = s.symid" - . " and d.langid = l.langid" - . " and d.typeid = l.typeid" - . " and f.fileid = r.fileid" - . " order by f.filename, d.line, l.declaration" + . ' where s.symname = ?' + . ' and r.releaseid = ?' + . ' and d.fileid = r.fileid' + . ' and d.symid = s.symid' + . ' and d.langid = l.langid' + . ' and d.typeid = l.typeid' + . ' and f.fileid = r.fileid' + . ' order by f.filename, d.line, l.declaration' ); } if (!exists($index->{'delete_file_definitions'})) { $index->{'delete_file_definitions'} = $index->{dbh}->prepare ( "delete from ${prefix}definitions" - . " where fileid = ?" + . ' where fileid = ?' ); } # 'delete_definitions' mandatory but syntax varies @@ -228,13 +224,13 @@ $index->{'delete_definitions'} = $index->{dbh}->prepare ( "delete from ${prefix}definitions" - . " where fileid in" - . " (select r.fileid" + . ' where fileid in' + . ' (select r.fileid' . " from ${prefix}releases r, ${prefix}status t" - . " where r.releaseid = ?" - . " and t.fileid = r.fileid" - . " and t.relcount = 1" - . " )" + . ' where r.releaseid = ?' + . ' and t.fileid = r.fileid' + . ' and t.relcount = 1' + . ' )' ); } @@ -242,31 +238,31 @@ $index->{'releases_insert'} = $index->{dbh}->prepare ( "insert into ${prefix}releases" - . " (fileid, releaseid)" - . " values (?, ?)" + . ' (fileid, releaseid)' + . ' values (?, ?)' ); } if (!exists($index->{'releases_select'})) { $index->{'releases_select'} = $index->{dbh}->prepare ( "select fileid from ${prefix}releases" - . " where fileid = ?" - . " and releaseid = ?" + . ' where fileid = ?' + . ' and releaseid = ?' ); } if (!exists($index->{'delete_one_release'})) { $index->{'delete_one_release'} = $index->{dbh}->prepare ( "delete from ${prefix}releases" - . " where fileid = ?" - . " and releaseid = ?" + . ' where fileid = ?' + . ' and releaseid = ?' ); } if (!exists($index->{'delete_releases'})) { $index->{'delete_releases'} = $index->{dbh}->prepare ( "delete from ${prefix}releases" - . " where releaseid = ?" + . ' where releaseid = ?' ); } @@ -274,45 +270,45 @@ $index->{'status_insert'} = $index->{dbh}->prepare ( "insert into ${prefix}status" - . " (fileid, relcount, indextime, status)" - . " values (?, 0, 0, ?)" + . ' (fileid, relcount, indextime, status)' + . ' values (?, 0, 0, ?)' ); } if (!exists($index->{'status_select'})) { $index->{'status_select'} = $index->{dbh}->prepare ( "select status from ${prefix}status" - . " where fileid = ?" + . ' where fileid = ?' ); } if (!exists($index->{'status_update'})) { $index->{'status_update'} = $index->{dbh}->prepare ( "update ${prefix}status" - . " set status = ?" - . " where fileid = ?" + . ' set status = ?' + . ' where fileid = ?' ); } if (!exists($index->{'status_timestamp'})) { $index->{'status_timestamp'} = $index->{dbh}->prepare ( "select indextime from ${prefix}status" - . " where fileid = ?" + . ' where fileid = ?' ); } if (!exists($index->{'status_update_timestamp'})) { $index->{'status_update_timestamp'} = $index->{dbh}->prepare ( "update ${prefix}status" - . " set indextime = ?" - . " where fileid = ?" + . ' set indextime = ?' + . ' where fileid = ?' ); } if (!exists($index->{'delete_unused_status'})) { $index->{'delete_unused_status'} = $index->{dbh}->prepare ( "delete from ${prefix}status" - . " where relcount = 0" + . ' where relcount = 0' ); } @@ -320,29 +316,29 @@ $index->{'usages_insert'} = $index->{dbh}->prepare ( "insert into ${prefix}usages" - . " (fileid, line, symid)" - . " values (?, ?, ?)" + . ' (fileid, line, symid)' + . ' values (?, ?, ?)' ); } if (!exists($index->{'usages_select'})) { $index->{'usages_select'} = $index->{dbh}->prepare - ( "select f.filename, u.line" + ( 'select f.filename, u.line' . " from ${prefix}symbols s, ${prefix}files f" . ", ${prefix}releases r, ${prefix}usages u" - . " where s.symname = ?" - . " and r.releaseid = ?" - . " and u.symid = s.symid" - . " and f.fileid = r.fileid" - . " and u.fileid = r.fileid" - . " order by f.filename, u.line" + . ' where s.symname = ?' + . ' and r.releaseid = ?' + . ' and u.symid = s.symid' + . ' and f.fileid = r.fileid' + . ' and u.fileid = r.fileid' + . ' order by f.filename, u.line' ); } if (!exists($index->{'delete_file_usages'})) { $index->{'delete_file_usages'} = $index->{dbh}->prepare ( "delete from ${prefix}usages" - . " where fileid = ?" + . ' where fileid = ?' ); } # 'delete_definitions' mandatory but syntax varies @@ -350,13 +346,13 @@ $index->{'delete_usages'} = $index->{dbh}->prepare ( "delete from ${prefix}usages" - . " where fileid in" - . " (select r.fileid" + . ' where fileid in' + . ' (select r.fileid' . " from ${prefix}releases r, ${prefix}status t" - . " where r.releaseid = ?" - . " and t.fileid = r.fileid" - . " and t.relcount = 1" - . " )" + . ' where r.releaseid = ?' + . ' and t.fileid = r.fileid' + . ' and t.relcount = 1' + . ' )' ); } @@ -365,8 +361,8 @@ $index->{'langtypes_select'} = $index->{dbh}->prepare ( "select typeid from ${prefix}langtypes" - . " where langid = ?" - . " and declaration = ?" + . ' where langid = ?' + . ' and declaration = ?' ); } if (!exists($index->{'langtypes_count'})) { @@ -383,7 +379,7 @@ . ", ${prefix}usages, ${prefix}langtypes" . ", ${prefix}symbols, ${prefix}releases" . ", ${prefix}status, ${prefix}files" - . " cascade" + . ' cascade' ); } return $index; @@ -394,11 +390,12 @@ # Generic implementation of this interface # -=head2 C<fileidifexists ($filename, $revision)> - =head2 C<fileid ($filename, $revision)> -C<fileid> returns a unique id for a file with a given revision. +=head2 C<fileidifexists ($filename, $revision)> + +C<fileid> returns a unique id for a file with a given revision, +creating it if it does not exist. C<fileidifexists> is similar, but returns C<undef> if the given revision is unknown, which can happen if the revision was created @@ -414,8 +411,8 @@ the revision for the file -CAUTION: this is not a release id! -It is computed by method filerev in the Files classes. +B<CAUTION:> this is not a release id! +It is computed by method C<filerev> in the I<Files> classes. =back @@ -426,9 +423,12 @@ =over -=item C<files_select> +=item * C<files_select> -=item C<files_insert> +=item * C<files_insert> + +=item * C<status_insert> +B<I<(>C<fileid> I<only)>> =back @@ -438,12 +438,12 @@ my ($self, $filename, $revision) = @_; my $fileid; - unless (defined($fileid = $files{"$filename\t$revision"})) { +# unless (defined($fileid = $files{"$filename\t$revision"})) { $self->{'files_select'}->execute($filename, $revision); ($fileid) = $self->{'files_select'}->fetchrow_array(); # opt $self->{'files_select'}->finish(); - $files{"$filename\t$revision"} = $fileid; - } +# $files{"$filename\t$revision"} = $fileid; +# } return $fileid } @@ -458,7 +458,7 @@ ($fileid) = $self->{'files_select'}->fetchrow_array(); $self->{'status_insert'}->execute($fileid, 0); # opt $self->{'files_select'}->finish(); - $files{"$filename\t$revision"} = $fileid; +# $files{"$filename\t$revision"} = $fileid; } return $fileid; } @@ -471,7 +471,7 @@ =item 1 C<$releaseid> -the release (or version) for which all recorded file should be returned +the release (or version) for which all recorded files should be returned =back @@ -482,7 +482,7 @@ =over -=item C<allfiles_select> +=item * C<allfiles_select> =back @@ -506,7 +506,7 @@ =over -=item Previous initialisation by C<getallfilesinit> +=item * Previous initialisation by C<getallfilesinit> =back @@ -540,9 +540,9 @@ =over -=item C<releases_select> +=item * C<releases_select> -=item C<releases_insert> +=item * C<releases_insert> =back @@ -592,7 +592,9 @@ =over -=item C<delete_one_release> +=item * C<delete_one_release> + +=back =cut @@ -657,11 +659,11 @@ =over -=item C<status_select> +=item * C<status_select> -=item C<status_insert> +=item * C<status_insert> -=item C<status_update> +=item * C<status_update> =back @@ -696,23 +698,11 @@ =back -B<Note:> - -=over - -=item - -I<A file must> always I<<be indexed before being parsed for -reference. Calling C<setfilereferenced> implicitly sets -C<fileindexed> as well.> - -=back - B<Requires:> =over -=item C<status_select> +=item * C<status_select> =back @@ -744,17 +734,28 @@ =back +B<Note:> + +=over + +=item + +I<A file must> always I<be indexed before being parsed for +references.> + +=back + B<Requires:> =over -=item C<status_select> +=item * C<status_select> -=item C<status_insert> +=item * C<status_insert> -=item C<status_update> +=item * C<status_update> -=item C<status_update_timestamp> +=item * C<status_update_timestamp> =back @@ -793,13 +794,7 @@ =over -=item C<status_select> - -=item C<status_insert> - -=item C<status_update> - -=item C<status_update_timestamp> +=item * C<status_timestamp> =back @@ -839,7 +834,7 @@ =over -=item C<definitions_select> +=item * C<definitions_select> =back @@ -895,7 +890,7 @@ =over -=item C<definitions_insert> +=item * C<definitions_insert> =back @@ -947,7 +942,7 @@ =over -=item C<usages_select> +=item * C<usages_select> =back @@ -991,9 +986,9 @@ =over -=item C<symbols_byname> +=item * C<symbols_byname> -=item C<usages_insert> +=item * C<usages_insert> =back @@ -1038,8 +1033,9 @@ =head2 C<issymbol ($symname, $releaseid)> -C<issymbol> returns a unique id for a symbol in a given release -if it exists in the DB, C<undef> otherwise. +C<issymbol> returns I<true> (1) for an existing symbol in a given release +according to the DB, +0 otherwise. =over @@ -1057,7 +1053,7 @@ =over -=item C<symbols_byname> +=item * C<symbols_byname> =back @@ -1105,7 +1101,7 @@ C<symid> returns a unique id for a symbol. If symbol is unknown, insert it into the DB with a zero reference count. -The reference count is adjusted by the method which add definition +The reference count is adjusted by the methods which add definition or usage. Decrementing the reference count is only done when purging the database. @@ -1121,9 +1117,9 @@ =over -=item C<symbols_byname> +=item * C<symbols_byname> -=item C<symbols_insert> +=item * C<symbols_insert> =back @@ -1169,7 +1165,7 @@ =over -=item C<symbols_byid> +=item * C<symbols_byid> =back @@ -1207,9 +1203,9 @@ =over -=item C<langtypes_select> +=item * C<langtypes_select> -=item C<langtypes_insert> +=item * C<langtypes_insert> =back @@ -1222,6 +1218,8 @@ =over +=item + I<This implementation is valid for DB engines with auto-incrementing fields. It must be overridden when the auto-incrementation feature is missing (e.g. PostgreSQL and SQLite).> @@ -1248,7 +1246,7 @@ =head2 C<deccount ()> -C<deccount> retrieves the number of declaration types in the database. +C<deccount> retrieves the number of type declarations in the database. It is used as a check to see if the database has been initialised. The previous mechanism based on a package variable in F<Generic.pm> @@ -1258,7 +1256,7 @@ =over -=item C<langtypes_count> +=item * C<langtypes_count> =back @@ -1316,6 +1314,8 @@ =over +=item + I<With the implementation of> C<flushcache>I<, this function is no longer necessary since the cache is also emptied in that subroutine.> @@ -1340,7 +1340,7 @@ optional argument to force 0-count write back (When creating the database, reference counts are incremented. -Consequently, if the final count is still zero, the symbols has not +Consequently, if the final count is still zero, the symbol has not been referenced and there is no need to overwrite the record. On the contrary, when purging the database, reference counts may decrement to zero and it is then mandatory to update the record @@ -1359,6 +1359,14 @@ The cache is then emptied +B<Requires:> + +=over + +=item * C<symbols_setref> + +=back + =cut sub flushcache { @@ -1400,11 +1408,11 @@ =over -=item C<related_symbols_select> +=item * C<related_symbols_select> -=item C<delete_file_definitions> +=item * C<delete_file_definitions> -=item C<delete_file_usages> +=item * C<delete_file_usages> =back @@ -1495,15 +1503,15 @@ =over -=item C<delete_definitions> +=item * C<delete_definitions> -=item C<delete_usages> +=item * C<delete_usages> -=item C<delete_symbolss> +=item * C<delete_symbolss> -=item C<delete_releases> +=item * C<delete_releases> -=item C<delete_unused_status> +=item * C<delete_unused_status> which should also delete I<files> table @@ -1513,6 +1521,8 @@ =over +=item + DBD C<commit()> is explicitly called to bypass possible disabling caused by private overriding method C<commit>. @@ -1522,6 +1532,8 @@ =over +=item + Manage the I<relid> relationship in I<definitions> =back @@ -1553,39 +1565,19 @@ $self->{dbh}{'AutoCommit'} = $oldcommitmode; } -=head2 C<purgeall> +=head2 C<purgeall ()> C<purgeall> deletes all data in the DB. -This is a more extensive version of C<purge> aimed at -C<--reindexall --allversions> with VCSes -which do not manage versions very well (e.g. CVS). - -=over - -=item 1 C<$releaseid> - -the target release (or version) - -=back +This is a brutal way of erasing everything, I<e.g.> for +C<--reindexall --allversions>. +It is much more efficient than a sequence of C<purge> on every version. B<Requires:> =over -=item C<purge_langtypes> - -=item C<purge_files> - -=item C<purge_definitions> - -=item C<purge_releases> - -=item C<purge_status> - -=item C<purge_symbols> - -=item C<purge_usages> +=item * C<purge_all> =back @@ -1597,9 +1589,9 @@ $self->{'purge_all'}->execute(); } -=head2 C<final_cleanup> +=head2 C<final_cleanup ()> -C<final_cleanup> allows to execute last actions on the database +C<final_cleanup> allows to execute last-minute actions on the database and disconnects. Must be called before C<Index> object disappears. Index: Lang.pm =================================================================== RCS file: /cvsroot/lxr/lxr/lib/LXR/Lang.pm,v retrieving revision 1.50 retrieving revision 1.51 diff -u -d -r1.50 -r1.51 --- Lang.pm 19 Apr 2013 12:42:14 -0000 1.50 +++ Lang.pm 21 Sep 2013 12:54:52 -0000 1.51 @@ -25,6 +25,12 @@ It is responsible for creating the parser and handling the specific categories editing. +A language parser is an I<object> with associated methods. +I<Lang.pm> creates the base object, to be augmented/overriden by +specific parsers in directory I<Lang>. +The methods described below are generally only dummy declarations +to capture missing specific implementations. + =cut package LXR::Lang; @@ -49,14 +55,39 @@ a I<string> containing the release (version) of the file to parse +B<Note:> + +=over + +=item + +I<Considering all call locations, +this argument is not really necessary and we could as well +use the global variable.> + +=back + =item 1 C<@itag> -an I<array> of 3 elements used to generate an C<<E<lt>AE<gt> >> link +an I<array> of 3 elements used to generate an C<E<lt>AE<gt>> link for the identifiers found in the file (just insert the identifier name between the array elements) =back +Creation of a specific parser is attempted first based on the file name +and information from configuration parameter C<'filetype'>. +If the file type is unknown, +the first line of the file is read to tentatively extract a I<shebang> +processed through configuration parameter C<'interpreters'>. +In case there is no I<shebang>, +an emacs-style C<mode:> is looked for. + +If all fail, C<undef> is returned. + +The LXR language name and argument C<@itag> are recorded in +the created parser which is then returned. + =cut sub new { @@ -185,10 +216,10 @@ =back -The fragment is surrounded with C<<E<lt>spanE<gt> >> and C<<E<lt>/spanE<gt> >> +The fragment is surrounded with C<E<lt>spanE<gt>> and C<E<lt>/spanE<gt>> tags. Special care is taken to repeat these tags at ends of line, so that the fragment can be correctly displayed on several lines without -disturbing other highlighting (suv as line numbers or difference marks). +disturbing other highlighting (such as line numbers or difference marks). =cut @@ -196,7 +227,7 @@ my ($frag, $css) = @_; $$frag = "<span class=\"$css\">$$frag</span>"; $$frag =~ s!\n!</span>\n<span class="$css">!g; - $$frag =~ s!<span class="comment"></span>$!! ; #remove excess marking + $$frag =~ s!<span class=".+?"></span>$!!; #remove excess marking } @@ -319,7 +350,7 @@ my $tail; if (!defined($link)) { - if ($path !~ m!/!) { + if (index($path, '/') < 0) { $tail = $file; } elsif (substr($path, -1) eq '/') { # Path ends with /: it may be a directory or an HTTP request. @@ -327,13 +358,13 @@ chop($path); $tail = $sep; $file = substr($file, 0, rindex($file, $sep)); - $link = &LXR::Common::incdirref($file, "include", $path, $dir); + $link = incdirref($file, 'include', $path, $dir); } } - # If incref or incdiref did not return a link to the file, + # If incref or incdirref did not return a link to the file, # explore however the path to see if directories are # known along the way. - while ( $path =~ m!/! + while ( index($path, '/') >= 0 && substr($link, 0, 1) ne '<' ) { # NOTE: the following rindex never returns -1, because @@ -343,12 +374,12 @@ $tail = substr($file, $sp) . $tail; $file = substr($file, 0, $sp); $path =~ s!/[^/]+$!!; - $link = &LXR::Common::incdirref($file, "include", $path, $dir); + $link = incdirref($file, 'include', $path, $dir); } # A known directory (at least) has been found. # Build links to higher path elements if (substr($link, 0, 1) eq '<') { - while ($path =~ m!/!) { + while (index($path, '/') >= 0) { # NOTE: see note above about rindex $l = index ($link, '>'); $r = rindex ($link, '<'); @@ -359,7 +390,7 @@ $sp = rindex ($file, $sep); $file = substr($file, 0, $sp); $path =~ s!/[^/]+$!!; - $link = &LXR::Common::incdirref($file, "include", $path, $dir); + $link = incdirref($file, 'include', $path, $dir); } } return $link . $tail; @@ -403,8 +434,12 @@ =over -I<This method is nowhere invoked. It corresponds to no category. It is -thus candidate for removal. +=item + +I<This method is nowhere invoked because keywords are processed in +C<processcode> simultaneouly with identifiers. +It corresponds to no category. It is +thus candidate for removal.> =back @@ -442,7 +477,7 @@ a I<reference> to the index (DB) object -=itm 1 C<$config> +=item 1 C<$config> a I<reference> to the configuration objet @@ -482,7 +517,7 @@ a I<reference> to the index (DB) object -=itm 1 C<$config> +=item 1 C<$config> a I<reference> to the configuration objet @@ -500,7 +535,7 @@ =head2 C<language ()> Method C<language> is usually a shorthand notation for -C<<$lang-E<gt>{'language'}>>. +C<$lang-E<gt>{'language'}>. =cut Index: Markup.pm =================================================================== RCS file: /cvsroot/lxr/lxr/lib/LXR/Markup.pm,v retrieving revision 1.8 retrieving revision 1.9 diff -u -d -r1.8 -r1.9 --- Markup.pm 21 Nov 2012 13:39:28 -0000 1.8 +++ Markup.pm 21 Sep 2013 12:54:52 -0000 1.9 @@ -21,7 +21,7 @@ # =encoding utf8 Not recognised?? -=head1 Template module +=head1 Markup module This module is the markup engine in charge of highlighting the syntactic components or otherwise interesting elements of a block. @@ -85,8 +85,8 @@ # Look for identifiers and create links with identifier search query. # TODO: Is there a performance problem with this? - $string =~ s#(^|\s)([a-zA-Z_~][a-zA-Z0-9_]*)\b# - $1.(is_linkworthy($2) ? &idref($2, "", $2) : $2)#ge; + $string =~ s/(^|\s)([a-zA-Z_~][a-zA-Z0-9_]*)\b/ + $1.(is_linkworthy($2) ? &idref($2, '', $2) : $2)/ge; # HTMLify the special characters we marked earlier, # but not the ones in the recently added xref html links. @@ -95,8 +95,8 @@ $string =~ s/\0>/>/g; # HTMLify email addresses and urls. - $string =~ - s#((ftp|http|nntp|snews|news)://(\w|\w\.\w|\~|\-|\/|\#)+(?!\.\b))#<a href=\"$1\">$1</a>#g; + $string =~ s{((ftp|http|nntp|snews|news)://(\w|\w\.\w|\~|\-|\/|\#)+(?!\.\b))} + {<a href=\"$1\">$1</a>}g; # htmlify certain addresses which aren't surrounded by <> $string =~ s/([\w\-\_]*\@netscape.com)(?!>)/<a class='offshore' href=\"mailto:$1\">$1<\/a>/g; @@ -108,8 +108,8 @@ $string =~ s/(<)(.*@.*)(>)/$1<a class='offshore' href=\"mailto:$2\">$2<\/a>$3/g; # HTMLify file names, assuming file is in the directory defined by $virtp. - $string =~ - s#\b(([\w\-_\/]+\.(c|h|cc|cp|hpp|cpp|java))|README)\b#{fileref($1, '', $virtp . $1);}#ge; + $string =~ s{\b(([\w\-_\/]+\.(c|h|cc|cp|hpp|cpp|java))|README)\b} + {fileref($1, '', $virtp . $1);}ge; return ($string); } @@ -142,25 +142,27 @@ =over -Presently, the DB check is not implemented. -It could be through C<index->symreferences($string, $releaseid)> -or C<$index->symdeclarations($string, $releaseid)> +=item + +DB check is not implemented. +It could be through C<index-E<gt>symreferences($string, $releaseid)> +or C<$index-E<gt>symdeclarations($string, $releaseid)> if we want to consider only declared identifiers. +=back + =cut sub is_linkworthy { my ($string) = @_; - if ($string =~ m/....../ - && ($string =~ m/_/ || $string =~ m/.[A-Z]/) - && $string !~ m/README/ -# && defined($xref{$string}) FIXME - ) { - return (1); - } else { - return (0); - } + return ( 5 < length($string) + && ( 0 <= index($string, '_') + || $string =~ m/.[A-Z]/ + ) + && 0 > index($string, 'README') + # && defined($xref{$string}) FIXME + ); } @@ -177,7 +179,7 @@ =back -This sub is called before editing (highlighting) its content +This sub is called before editing (highlighting) the string argument so that we can later distinguish between original litteral HTML special characters and those added as part of HTML tags. @@ -219,9 +221,9 @@ } -=head2 C<htmlquote ($string)> +=head2 C<freetextmarkup ($string)> -Function C<htmlquote> creates links in its argument for URLs and e-mail addresses. +Function C<freetextmarkup> creates links in its argument for URLs and e-mail addresses. =over @@ -249,18 +251,18 @@ =over -=item 1 C<$sfileh> +=item 1 C<$fileh> a I<filehandle> for the source file -=item 1 C<$sfileh> +=item 1 C<$outfun> a reference to a I<sub> which outputs the HTML stream =back This sub calls the parser to split the source file into homogeneous -fragments which are highlited by various specialized support routines. +fragments which are highlighted by various specialized support routines. Sub C<&outfun> is called to output the HTML stream. Use of a subroutine allows to do the highlighting with C<markupfile> in @@ -284,20 +286,20 @@ # 2: '</a>' # Later, it only needs to insert line numbers betwwen 0-1 and 1-2 to # have the correct anchor. - &fileref(1, "fline", $pathname, 1) =~ m/^(<a.*?)href.*\#(\d+)(\">)\d+(<\/a>)$/; + &fileref(1, 'fline', $pathname, 1) =~ m/^(<a.*?)href.*\#(\d+)(\">)\d+(<\/a>)$/; my @ltag; $ltag[0] = $1 . 'name="'; my $line = $2; $ltag[1] = $3; - $ltag[2] = $4 . " "; + $ltag[2] = $4 . ' '; # As an optimisation, the skeleton of the <A> link marking for an # identifier will be cached in the $lang object. # To guard against any modification of the <A> link structure by # sub idref, a very specific (and improbable) identifier is used. # This allows to make no assumption on idref result. - my $itagtarget = "!!!"; - my @itag = &idref("$itagtarget", "fid", $itagtarget) =~ m/^(.*)$itagtarget(.*)$itagtarget(.*)$/; + my $itagtarget = '---'; + my @itag = &idref($itagtarget, 'fid', $itagtarget) =~ m/^(.*)$itagtarget(.*)$itagtarget(.*)$/; my $lang = LXR::Lang->new($pathname, $releaseid, @itag); if ($lang) { @@ -343,13 +345,13 @@ } elsif ($pathname =~ m/\.($graphic)$/) { # Graphic files are detected by their extension - &$outfun("<b>Image: </b>"); - &$outfun("<img src=\"" - . $config->{'sourceaccess'} - . "/" . $config->variable('v') - . $pathname - . "\" border=\"0\"" - . " alt=\"No access to $pathname or browser cannot display this format\">"); + &$outfun('<b>Image: </b>'); + &$outfun('<img src="' + . $config->{'sourceaccess'} + . '/' . $config->variable('v') + . $pathname + . '" border="0"' + . " alt=\"No access to $pathname or browser cannot display this format\">"); } elsif ($pathname =~ m|/CREDITS$|) { # Special case while (defined($_ = $fileh->getline)) { @@ -366,19 +368,27 @@ # If it's not a script or something with an Emacs spec header and # the first line is very long or containts control characters... - if ( m/^#!/ - && m/-\*-.*-\*-/i - && (length($_) > 132 || m/[\x00-\x08\x0B\x0C\x0E-\x1F\x80-\x9F]/) + if ( substr($_, 0, 2) ne '#!' + && ! m/-\*-.*-\*-/ + && ( length($_) > 132 + || m/[\x00-\x08\x0B\x0C\x0E-\x1F\x80-\x9F]/ + ) ) { # We postulate that it's a binary file. - &$outfun("<ul><b>Binary File: "); + &$outfun('<ul><b>Binary File: '); # jwz: URL-quote any special characters. my $uname = $pathname; $uname =~ s|([^-a-zA-Z0-9.\@/_\r\n])|sprintf("%%%02X", ord($1))|ge; - &$outfun("<a href=\"$config->{virtroot}/source" . $uname . &urlargs("_raw=1") . "\">"); + &$outfun ( '<a href="' + . $config->{'virtroot'} + . 'source' + . $uname + . &urlargs('_raw=1') + . '">' + ); &$outfun("$pathname</a></b>"); - &$outfun("</ul>"); + &$outfun('</ul>'); } else { # Unqualified text file, do minimal work Index: SimpleParse.pm =================================================================== RCS file: /cvsroot/lxr/lxr/lib/LXR/SimpleParse.pm,v retrieving revision 1.21 retrieving revision 1.22 diff -u -d -r1.21 -r1.22 --- SimpleParse.pm 18 Aug 2012 15:47:21 -0000 1.21 +++ SimpleParse.pm 21 Sep 2013 12:54:52 -0000 1.22 @@ -88,26 +88,25 @@ sub init { my @blksep; - $fileh = ""; @frags = (); @bodyid = (); @open = (); @term = (); @stay = (); - $split = ""; - $open = ""; - $continue = ""; + $split = ''; + $open = ''; + $continue = ''; $tabwidth = 8; my $tabhint; ($fileh, $tabhint, @blksep) = @_; - $tabwidth = $tabhint || $tabwidth; + $tabwidth = $tabhint // $tabwidth; # Consider every specification in the order given foreach my $s (@blksep) { # $k is category name (e.g; comment, string, ...) my $k = (keys(%$s))[0]; - if ($k eq "atom") { # special case for uncategorised fragments + if ($k eq 'atom') { # special case for uncategorised fragments $continue = $$s{$k}; } else { @@ -129,7 +128,7 @@ } # Replace the anchors with a Start_of_Line marker - # The markers are removed by sub C<markupfile before + # The markers are removed by sub markupfile before # emiting HTML code foreach (@open) { $_ =~ s/^\^/\xFF/; @@ -172,7 +171,7 @@ Note that this sub is presently only used by sub C<markupfile> when no specific parser definition could be found. -No attempt is made to interpret an Emacs-style tab specification. +No attempt is made to interpret an emacs-style tab specification. Consequently, tab width can be erroneous. =cut @@ -216,12 +215,16 @@ =over +=item + I<Speed is acceptable when displaying a file (since time here is dominated by HTML editing).> -I<<Raw speed can be seen during C<genxref> where the full tree is +=item + +I<Raw speed can be seen during C<genxref> where the full tree is parsed. It could be worth to replace the parser by a compiled -deterministic FSA version.>> +deterministic FSA version.> =back @@ -233,7 +236,7 @@ my $term = undef; # closing delim pattern my $stay = $continue; # lock pattern my $line = ''; # line buffer - # These initial values sets the state for the "anonymous" + # These initial values set the state for the "anonymous" # default category (i.e. code). It is switched to another # state if $next (the following characters to process) # begins with a starting delimiter. @@ -302,7 +305,7 @@ } my $opos = undef; # Look for "term" or any "open delim" if not defined - my $change = $term || $split; + my $change = $term // $split; if ($next =~ m/$change/) { # Compute the position of the "end" delimiter $next =~ m/^(.*?)($change)/s; @@ -324,7 +327,7 @@ # Is it a named category? # Add to output buffer till we find a closing delimiter. # Remember that "stay" constructs have been processed above. - if (defined($btype)) { + if (defined($btype) && defined($term)) { if ($next =~ m/$term/) { # A close delim in this fragment? # Next instruction group is 5.8 compatible but does # not allow capture parenthesis in regexps @@ -452,9 +455,11 @@ =over +=item + When using this sub, pay special attention to the order of requests so that you do not create permutations of source -sequences: it is a LIFO! +sequences: it is a stack (LIFO)! =back |