Help on non-regression check on modifications

Help
2011-03-04
2013-01-02
  • Andre-Littoz
    2011-03-04

    Hello to you all,

    I installed 0.9.8 and stumbled into all the mentioned problems. I tackled them and came to a solution. While doing this bug chasing, I added a few features to LXR. All this is summarized in the text below.

    By the way, does anyone have an idea how to attach a text file to a topic? This would save the pain of retyping.

    Sorry for the mix below, the corrections and additions are sorted by filename instead of by topic.

    I am now working on documentation: explaining the parameters in lxr.conf and generic.conf. Interested? If yes, e-mail me on SourceForge.

    Regards,

    Pat

    LXR 0.9.8-PG

    Base: LXR 0.9.8

    Common.pm

    110226 markupfile
      All lines of a file can be the target of a link. For that, they are tagged
      with an anchor <a name=#line>#line</a>. But in the standard implementation
      they also contain href=(to itself), which does not make sense since we'll
      never need to jump from "here" to the "same place".
      Getting rid of the href= also reduces the amount of data transferred.
      The result of &fileref(1,"fline",$pathname,1) is split into @ltag so that
      the tag can be generated easily later. The split is modified to "forget" the href:
      instead of /^(<a)(.*\#)001(\">)1(<\/a>)$/
      use /^(<a.*?)(?:href.*\#)001(\">)1(<\/a>)$/
      and change @ltag accordingly
      @ltag BEFORE:
        0 - <a (with name= appended)
        1 - class … href="…#
        2 - ">
        3 - </a>
      @ltag AFTER (the href group no longer captures, so only three elements remain):
        0 - <a class… (with name=" appended) - double quote needed
        1 - ">
        2 - </a>
      which sums up to:
    !-263,265 110226
    !    my @ltag = &fileref(1, "fline", $pathname, 1) =~ /^(<a.*?)(?:href.*\#)001(\">)1(<\/a>)$/;
    !    $ltag[0] .= 'name="';
    !    $ltag .= " ";
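
The effect of the modified split can be tried outside LXR with the minimal sketch below. The sample anchor string, the helper names split_anchor and line_anchor, and the reassembly with join are assumptions made for illustration; the real output of fileref depends on the LXR configuration.

```perl
use strict;
use warnings;

# Hypothetical sample of what fileref() might return for line 1.
my $sample = '<a class="fline" href="/lxr/source/foo.c#001">1</a>';

# Split the anchor into its pieces; the href part sits in a
# non-capturing group, so it is dropped from the result.
sub split_anchor {
    my ($anchor) = @_;
    my @ltag = $anchor =~ /^(<a.*?)(?:href.*\#)001(\">)1(<\/a>)$/;
    return @ltag;
}

# Rebuild an anchor for a given line number by joining the pieces
# with that number (an assumed way of reassembling them).
sub line_anchor {
    my ($line, @ltag) = @_;
    $ltag[0] .= 'name="';
    return join($line, @ltag);
}

my @parts = split_anchor($sample);
print scalar(@parts), " pieces\n";          # prints "3 pieces"
print line_anchor('042', @parts), "\n";     # <a class="fline" name="042">042</a>
```

With three capturing groups the list has three elements, and the generated anchor carries only name=, not href=.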

    110227 printhttp
      Some trees may not be written with iso-8859-1 but use other encodings.
      Add a new config parameter to tell LXR which charset has been used.
      Then issue a header with the charset defined in parameter 'encoding'
    !-448,448 110227
    !        print("Content-Type: text/html; charset=", $config->{'encoding'}, "\n");
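
As a small illustration of the idea, the sketch below builds the header from a plain hash reference standing in for $config. The helper name content_type_header is hypothetical, and the iso-8859-1 fallback mirrors the default described for Config.pm in this post.

```perl
use strict;
use warnings;

# Build the Content-Type header from a config hash reference.
# Falls back to iso-8859-1 when 'encoding' is not set.
sub content_type_header {
    my ($config) = @_;
    my $charset = exists $config->{'encoding'}
                ? $config->{'encoding'}
                : 'iso-8859-1';
    return "Content-Type: text/html; charset=$charset\n";
}

print content_type_header({ 'encoding' => 'utf-8' });
print content_type_header({});
```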

    110226 httpinit
      In case something goes wrong with initialization (config read in), LXR
      silently dies (from the user's point of view).
      Send an error page if it is not too bad (only case presently: no baseurl found)
      If Config.pm::initialize detected an error, it created parameter 'configerror'.
      NOTE: if need arises, detail could be passed through the value of this parameter.
    !-496,496 110226
    !    if (exists $config->{'configerror'}) {
    !        makeerrorpage('htmlfatal');
    !        die "Can't find config for " . $HTTP->{'this_url'};
    !    };

    110227 makeheader
      Add a new variable substitution, so that 'encoding' could be copied
      in a <meta > tag
    !-852 110227
    !                'encoding'   => sub { return $config->{'encoding'}; },

    110226 new sub makeerrorpage
      Issue an error page instead of 'Die'ing silently.
      Since LXR is badly initialized, use as few features as possible.
      For instance, if you don't trust a smart lxr.conf, don't use
      'stylesheet' in your error page template. Anyway, if 'stylesheet'
      does not exist, the page is still built without substitution; it
      might do funny things to the UA but anyhow the page is displayed
      with default styles.
      Added a 'treeextract' parameter to provide a pattern (regexp) to
      extract the source tree name from URL since where to put it is a
      matter of personal taste.
      If 'treeextract' does not exist, extract the second to last part of
      SCRIPT_NAME.
    !-895 110226 & 110301
    !sub makeerrorpage {
    !    my $who = shift;
    !    my $tmplname;
    !    my $template = "<html><body><hr>\n"
    !        . "<div align='center'>\n"
    !        . "<h1>Unrecoverable Error</h1><br>\n"
    !        . "\$tree unknown\n"
    !        . "</div>\n</body></html>\n";
    !
    !    $tmplname = $who;
    !
    !    if ($config->value($tmplname)) {
    !        if (open(TEMPL, $config->value($tmplname))) {
    !            local ($/) = undef;
    !            $template = <TEMPL>;
    !            close(TEMPL);
    !        } else {
    !            warning("Template " . $config->value($tmplname) . " does not exist in ".`pwd`);
    !        }
    !    }
    !
    !    print("Content-Type: text/html; charset=iso-8859-1\n");
    !    print("\n");
    !
    !    my $treeextract = '(*)/*$'; # default: capture before-last fragment
    !    if (exists ($config->{'treeextract'})) {
    !        $treeextract = $config->treeextract;
    !    }
    !
    !    print(
    !        expandtemplate(
    !            $template,
    !            (
    !                'tree'       => sub { $_ = $ENV{'SCRIPT_NAME'}; m!$treeextract!; return $1; },
    !                'stylesheet' => sub { stylesheet(@_) },
    !            )
    !        )
    !    );
    !    $config = undef;
    !    $files  = undef;
    !    $index  = undef;
    !}
    !
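
The extraction of the tree name from SCRIPT_NAME can be sketched as follows. The default pattern used here is an assumption chosen to capture the second to last path fragment; it is not necessarily the pattern used in the patch, and the helper name tree_from_script_name is hypothetical.

```perl
use strict;
use warnings;

# Extract the tree name from a SCRIPT_NAME-like path.
# $treeextract is a regexp whose first capture is the tree name;
# the default below (an assumption) captures the second to last fragment.
sub tree_from_script_name {
    my ($script_name, $treeextract) = @_;
    $treeextract = '([^/]*)/[^/]*$' unless defined $treeextract;
    my ($tree) = $script_name =~ m!$treeextract!;
    return $tree;    # undef when the pattern does not match
}

print tree_from_script_name('/lxr/mytree/source'), "\n";              # mytree
print tree_from_script_name('/lxr/mytree/source', '^/([^/]+)'), "\n"; # lxr
```

Passing a different pattern as the second argument plays the role of the 'treeextract' parameter.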

    Config.pm

    110226 initialize
      Make sure parameter 'encoding' has a reasonable default value
    !-109 110227
    !
    !    $$self{'encoding'} = "iso-8859-1" unless (exists $self->{'encoding'});

    110227 initialize
      In case something goes wrong with initialization (config read in), LXR
      silently dies (from the user's point of view).
      Create 'configerror' to tell caller something went screwy. Value explains why.
      Don't do it for genxref, since it is executed from the console and STDERR is
      displayed normally. There, it is better to stop the script.
    !-111,112 110226
    !    if(!exists $self->{baseurl}) {
    !        if("genxref" ne ($0 =~ /(*)$/)) {
    !            $$self{'configerror'} = "nobaseurl"; return 1;
    !        }  elsif($url =~ m!http://.+\.!) {
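
The flag-instead-of-die idea can be sketched in isolation as below. The helper name check_baseurl, the plain hash standing in for the Config object, and the decision to die for genxref are assumptions. The sketch uses File::Basename to obtain the script name: a regexp match used directly as an operand of `ne` is evaluated in scalar context and yields a success flag rather than the captured text, so a capture would need a list assignment first.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Returns 0 when 'baseurl' is present. Otherwise flags the error for
# web scripts (returns 1) and stops for the console indexer.
# $script stands in for $0.
sub check_baseurl {
    my ($self, $script) = @_;
    return 0 if exists $self->{'baseurl'};
    my $name = basename($script);
    die "No baseurl configured\n" if $name eq 'genxref';
    $self->{'configerror'} = 'nobaseurl';
    return 1;
}

my %config = ();
if (check_baseurl(\%config, '/var/www/lxr/source')) {
    print "configerror = $config{'configerror'}\n";
}
```

The caller (httpinit in the patch) can then test for 'configerror' and send an error page instead of dying silently.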

    ident

    110226 refexpand
    110226 usesexpand
      When the API for the DBs was changed, it seems that the implementor
      forgot to propagate the changes to ident.
        Change $release to $releaseid
        Change getindex to symdeclarations
        Change getreference to symreferences
    !-52,52
    !    my @refs = $index->symdeclarations($identifier, $releaseid);
    !-89,89
    !    my @uses = $index->symreferences($identifier, $releaseid);

    Generic.pm

    110304 new
      Feature request: make recognition of identifiers language dependent
      A new parameter has been defined in the 'langmap' for a language:
      'identdef' => partial pattern
      The partial pattern contains only the regexp for an identifier without
      pattern delimiters. It will be used by processcode to tag identifiers
      AND reserved words. It MUST then encompass these 2 classes of symbols.
      processcode adds a \b to the right of this pattern.
      Since it is a critical part of the process, ensure that a sensible
      default always exists (here, the pattern used in version 0.9.8).
      Second part of patch in processcode
    !-51 110304
    !    $$self{'langmap'}{$lang}{'identdef'} = '*'
    !        unless defined $self->langinfo('identdef');
    !

    110303 processcode
      This sub HTML-marks the sequences of characters equal to a reserved word,
      then does the same for identifiers. If it happens that one identifier looks
      the same as an HTML tag, attribute or attribute value, the HTML marking
      gets marked itself. W3C says HTML sentences cannot be embedded inside a
      <…>. The net result is the browser gets confused and displays garbage.
      HTML-markings must be protected from rescan of file fragment. The only
      way to do it safely, whatever the other processings, is to do all
      replacements simultaneously, so that they are mutually exclusive.
      It has been chosen to scan the fragment from left to right without
      backtracking: candidate prefixes are removed from the fragment and
      processed for replacement. The replacement result is then appended to
      a string. This string is returned as the value of the sub when scanning
      is complete.
      NOTE: lines containing $identdef are related to the language-dependent
      recognition of identifiers (patch dated 110304)
    !155,155 110303 remove processreserved
    !    my $source = $$code;
    !    my $answer = '';
    !    my $identdef = $self->langinfo('identdef');
    !-158,165 110303 replace regexp substitution
    !    while ( $source =~ s/^(.*?)($identdef)\b//s){
    !      $answer .= "$1" .
    !                 ( $self->isreserved($2)
    !                 ? "<span class='reserved'>$2</span>"
    !                 :
    !                   ( $index->issymbol($2, $$self{'releaseid'})
    !                   ? join($2, @{$$self{'itag'}})
    !                   : $2
    !                   )
    !                 );
    !    }
    !    $$code = $answer . $source;
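
The single-pass scan can be tried outside LXR with the sketch below. The reserved-word list, the symbol table, the link markup, and the identifier pattern passed as $identdef are all hypothetical stand-ins for isreserved, issymbol, 'itag' and the 'identdef' parameter.

```perl
use strict;
use warnings;

my %reserved = map { $_ => 1 } qw(if else while return);  # hypothetical keywords
my %symbols  = map { $_ => 1 } qw(span count);            # hypothetical index content

# Scan $source left to right; each identifier is removed from the front,
# marked up once, and appended to $answer, which is never rescanned.
sub mark_code {
    my ($source, $identdef) = @_;
    my $answer = '';
    while ($source =~ s/^(.*?)($identdef)\b//s) {
        my ($prefix, $word) = ($1, $2);
        $answer .= $prefix
            . ( $reserved{$word} ? "<span class='reserved'>$word</span>"
              : $symbols{$word}  ? "<a href='ident?i=$word'>$word</a>"
              :                    $word );
    }
    return $answer . $source;    # append the unmatched tail
}

print mark_code('if (span > 1) return count;', '[a-zA-Z_]\w*'), "\n";
```

The identifier named span is linked, while the span tags already emitted for reserved words are left alone because they live in $answer, not in the text still being scanned.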

    Lang.pm

    110227 new
      When trying to determine script language, test is for the beginning
      of the last part of a path, i.e. /interp. Matches if key of 'interpreters'
      is a prefix of the script interpreter.
      Unfortunately, this prevents LXR scripts from being recognised, as they
      do not include a path.
      Test changed to match 'interpreters' name at end of string (without /)
      Potential problem: it is no longer a prefix test; are there cases where
      it matters?
    !-53,53 110227
    !            if ($shebang =~ /$patt$/) {
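
The consequence of anchoring at the end of the string can be checked with the sketch below. The %interpreters table and the helper name language_for_shebang are hypothetical; only the /$patt$/ test is taken from the patch.

```perl
use strict;
use warnings;

my %interpreters = ('perl' => 'Perl', 'python' => 'Python');  # hypothetical table

# Return the language whose interpreter name ends the shebang line,
# or undef when none matches.
sub language_for_shebang {
    my ($shebang) = @_;
    foreach my $patt (sort keys %interpreters) {
        return $interpreters{$patt} if $shebang =~ /$patt$/;
    }
    return;
}

for my $line ('#!/usr/bin/perl', '#!perl', '#!/usr/bin/perl -w') {
    my $lang = language_for_shebang($line);
    print "$line => ", (defined $lang ? $lang : 'not recognised'), "\n";
}
```

A shebang without a path is now recognised, but one carrying a version suffix or an option after the interpreter name is not, which is the kind of case the question above is about.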

    110304 processinclude
      Same design flaw as in processcode (Generic.pm): all replacement/markings
      should be processed simultaneously. Since we are supposed to find a single
      include in the fragment, it can be dealt with without loop.
      Beware! It has been very tedious to debug.
    !-77,80 110304
    !    my $source = $$frag;
    !
    !    $source =~ s/^                  # reminder: no initial space in the grammar
    !        (\s**)                      # reserved keyword for include construct
    !        (\s+)                       # space
    !        (?| (\")(.+?)(\")           # C syntax
    !        |   (\0<)(.+?)(\0>)         # C alternate syntax
    !        |   ()(+)(\b)               # Perl and others
    !        )
    !        //sx ;
    !    $$frag = ( $self->isreserved($1)
    !             ? "<span class='reserved'>$1</span>"
    !             : "$1"
    !             )
    !           . "$2$3"
    !           . &LXR::Common::incref($4, "include" ,$4 ,$dir)
    !           . "$5"
    !           . $source; # tail if any (e.g. in Perl)
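
The reason $3, $4 and $5 can be used whichever alternative matched is the (?|…) branch-reset group, which restarts group numbering in each alternative and requires Perl 5.10 or later. The sketch below shows this on a simplified matcher; the keyword and module-name patterns and the helper name parse_include are assumptions, and plain < > are used instead of LXR's internal \0< \0> markers.

```perl
use strict;
use warnings;
use 5.010;    # the branch-reset group needs Perl 5.10 or later

# Simplified include matcher: returns (keyword, target) or an empty list.
sub parse_include {
    my ($frag) = @_;
    if ($frag =~ /^(\#?\s*\w+)          # keyword, simplified assumed pattern
                   (\s+)                # space
                   (?| (\")(.+?)(\")    # quoted file name
                     | (<)(.+?)(>)      # bracketed file name
                     | ()([\w:]+)(\b)   # bare module name
                   )
                 /sx) {
        return ($1, $4);    # the target is always group 4
    }
    return;
}

for my $line ('#include <stdio.h>', '#include "local.h"', 'use File::Basename;') {
    my ($keyword, $target) = parse_include($line);
    print "$keyword -> $target\n";
}
```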

    110304 processcomment
      Leaves an extra <span class="comment"></span> with an empty comment range
      at end of fragment. Removed with following regexp.
    !-87 110304
    ! $$frag =~ s#<span class=\"comment\"></span>$## ; #remove excess marking

    source

    110225 direxpand
      Cosmetic change: to be more XML compliant, close opened <p> with </p>
    !-154,154
    !              . " does not exist.</i>\n</p>\n");
    !-156,156
    !            "<p align=\"center\">\n<i>This directory might exist in other versions, try 'Show attic files' or select a different Version.</i>\n</p>\n"

    110225 printfile
    !-293,293
    !                "<p align=\"center\">\n<i>The file $pathname does not exist.</i>\n</p>\n"
    !-296,296
    !                "<p align=\"center\">\n<i>This file might exist in other versions, try 'Show attic files' or select a different Version.</i>\n</p>\n"

    110225 source init code
    !-308,308
    !    print("<p align=\"center\">\n<i>The file $pathname does not exist.</i>\n</p>\n");

    Template files

    html-head.html

    With the introduction of the 'encoding' config parameter to advertise the
    character set used in the content, add the following line after <title>:
    !-4
    !<meta http-equiv="content-type" content="text/html; charset=$encoding">
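
How a $encoding placeholder in a template ends up as a concrete charset can be illustrated with the sketch below. The function expand_vars is a hypothetical, much simplified stand-in and not LXR's expandtemplate; it only shows the principle of a substitution table of code references, with unknown variables left untouched.

```perl
use strict;
use warnings;

# Replace each $name in the template by the result of the matching
# code reference; names without an entry are left as they are.
sub expand_vars {
    my ($template, %subs) = @_;
    $template =~ s/\$(\w+)/exists $subs{$1} ? $subs{$1}->() : "\$$1"/ge;
    return $template;
}

my $config = { 'encoding' => 'utf-8' };    # stand-in for the LXR config object
my $line   = '<meta http-equiv="content-type" content="text/html; charset=$encoding">';

print expand_vars($line, 'encoding' => sub { return $config->{'encoding'}; }), "\n";
```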

     
  •  tarela_v
    2011-04-05

    Hello,

    Interesting article and a lot of useful information. I tried to implement your changes, but it fails at the line

    (?| (\")(.+?)(\") # C syntax

    with error message:

    Sequence (?|…) not recognized in regex; marked by <- HERE in m/^                                  # reminder: no initial space in the grammar
                                    (\s**)        # reserved keyword for include construct
                                    (\s+)                   # space
                                    (?| <- HERE      (")(.+?)(")   # C syntax
                                    |       (\0<)(.+?)(\0>) # C alternate syntax
                                    |       ()(+)(\b)  # Perl and others
                                    )
                                    / at lib/LXR/Lang.pm line 91.

    Any idea what can be wrong? I am not an expert in Perl. My RedHat Linux install has Perl 5.8.8.

    Thanks,
    Val

     
  • Andre-Littoz
    2011-04-05

    Val,

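    I forgot to mention that all the fixes above, and much more, are in the 0.9.9 release, but it is written in Perl 5.10 flavour. The (?|…) branch-reset construct was introduced in Perl 5.10, which is why Perl 5.8.8 rejects it.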
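
For anyone who has to stay on Perl 5.8, the same three alternatives can in principle be matched without a branch-reset group by giving each alternative its own capture and picking the one that is defined. The sketch below is an untested-against-LXR illustration with assumed keyword and module-name patterns and a hypothetical helper name; it is not code from either release.

```perl
use strict;
use warnings;

# Match an include-like construct without the (?|...) construct.
# Each alternative has its own group; only one of them is defined.
sub parse_include_58 {
    my ($frag) = @_;
    if ($frag =~ /^(\#?\s*\w+)(\s+)(?:"(.+?)"|<(.+?)>|([\w:]+)\b)/s) {
        my $keyword  = $1;
        my ($target) = grep { defined $_ } ($3, $4, $5);
        return ($keyword, $target);
    }
    return;
}

for my $line ('#include <stdio.h>', '#include "local.h"', 'use File::Basename;') {
    my ($keyword, $target) = parse_include_58($line);
    print "$keyword -> $target\n";
}
```

The price is that the code following the match must cope with a varying group number, which is what the branch-reset group avoids.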

     
  •  tarela_v
    2011-04-05

    Thanks for the detailed answer. I will probably upgrade to Perl 5.10.
    I did not know that LXR 0.9.9 had already been released.

    Thanks,
    Val