lxr-developer Mailing List for LXR Cross Referencer

Brought to you by: ajlittoz

lxr-developer — Don't use, active developers only

[Lxr-dev] [ lxr-Bugs-3594514 ] Template 'htmlfatal' not found

From: SourceForge.net <no...@so...> - 2012-12-10 15:38:25

Bugs item #3594514, was opened at 2012-12-10 07:38
Message generated for change (Tracker Item Submitted) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3594514&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Browsing
Group: current cvs
Status: Open
Resolution: None
Priority: 3
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Nobody/Anonymous (nobody)
Summary: Template 'htmlfatal' not found

Initial Comment:
LXR: all versions

When an incorrect URL is submitted to LXR, it tries to display some meaningful message with template 'htmlfatal'. In this context, no matching 'virtroot' has been found and Config::new returned undef. This means $config references nothing, not even the global parameter group where 'htmlfatal' is defined.

This ends up with 1) an error about 'htmlfatal' not found, 2) using the default built-in minimal template.

By the way, this built-in template uses the substitution marker $tree instead of $target (it was not updated when the name changed!)

To fix this malfunction, find a way to load the global parameter group into $config in all circumstances.
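
One possible shape for such a fix, as a minimal sketch only: it assumes the parsed configuration is a list of parameter groups with the global group first, and it does not reproduce the real Config::new. The name select_config and the sample values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parsed lxr.conf: the first group holds the
# global parameters, each following group describes one tree.
my @groups = (
    { htmlfatal => 'templates/html-fatal.html' },
    { virtroot  => '/lxr', sourceroot => '/srv/src' },
);

# Return the merged parameters of the tree whose 'virtroot' prefixes $path.
# When no tree matches, fall back to the global group alone instead of
# undef, so that a template such as 'htmlfatal' stays reachable.
sub select_config {
    my ($groups, $path) = @_;
    my ($global, @trees) = @$groups;
    foreach my $tree (@trees) {
        my $root = $tree->{virtroot};
        return { %$global, %$tree } if $path =~ m{^\Q$root\E(?:/|\z)};
    }
    return { %$global };
}

my $config = select_config(\@groups, '/bogus/source');
printf "htmlfatal %s\n", defined $config->{htmlfatal} ? 'available' : 'missing';
```

With this fallback the caller can tell "no tree selected" from the absence of 'virtroot' in the returned hash, while global templates are still found.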



----------------------------------------------------------------------


[Lxr-dev] [ lxr-Bugs-3588471 ] Missing identifier occurrences in case-insensitive mode

From: SourceForge.net <no...@so...> - 2012-11-21 15:10:34

Bugs item #3588471, was opened at 2012-11-19 02:47
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3588471&group_id=27350

Category: Browsing
Group: v1.1
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Missing identifier occurrences in case-insensitive mode

Initial Comment:
LXR 1.0 and 1.1-beta

When querying for an identifier in 'ident', occurrences are not reported if they differ in case from the query key. This matters for case-insensitive languages, where the identifier is recorded in uppercase in the database.

Suggested fix: query the database a second time with the identifier folded to uppercase, then merge the two lists, eliminating duplicates. To warn the user, uppercase matches (i.e. approximate matches) could be flagged with a distinctive character.
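
The suggested two-pass lookup could look roughly like the sketch below. It replaces the database with an in-memory hash, and the names %index and find_occurrences as well as the '~' flag are assumptions made for illustration, not the code that went into LXR.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the identifier table: the key is the symbol as
# recorded at indexing time, the value a list of "file:line" occurrences.
my %index = (
    'counter' => [ 'main.c:12' ],
    'COUNTER' => [ 'calc.f:7', 'calc.f:30' ],
);

# Look the identifier up as typed, then folded to uppercase, and merge the
# two lists without duplicates. Hits found only through the uppercase key
# are approximate matches and carry a flag.
sub find_occurrences {
    my ($ident) = @_;
    my (%seen, @results);
    foreach my $where (@{ $index{$ident} || [] }) {
        next if $seen{$where}++;
        push @results, { where => $where, approximate => 0 };
    }
    foreach my $where (@{ $index{ uc $ident } || [] }) {
        next if $seen{$where}++;
        push @results, { where => $where, approximate => 1 };
    }
    return @results;
}

foreach my $hit (find_occurrences('counter')) {
    printf "%s%s\n", ($hit->{approximate} ? '~' : ' '), $hit->{where};
}
```

When the query is already uppercase, the second pass returns the same occurrences, which the %seen hash drops, so nothing is flagged.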

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-21 07:10

Message:
Feature implemented as suggested

----------------------------------------------------------------------


[Lxr-dev] [ lxr-Bugs-3587475 ] Files sometimes mistaken for graphic files

From: SourceForge.net <no...@so...> - 2012-11-21 13:41:44

Bugs item #3587475, was opened at 2012-11-15 03:14
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3587475&group_id=27350

Category: Browsing
Group: v1.0
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Files sometimes mistaken for graphic files

Initial Comment:
Release 1.0, but affects all releases since 0.9.9

A pair of parentheses is missing around variable $graphicfile (containing the pattern from lxr.conf) in the pattern matching for graphic files at line 344 in Markup.pm. As a result, the leading dot applies only to the first alternative and the trailing dollar only to the last. Consequently, the inner alternatives may match any part of the filename, erroneously classifying the file as graphics if it has no parser associated with it.

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-21 05:41

Message:
Fixed by surrounding configuration parameter with parentheses in pattern.
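
The precedence problem can be reproduced in isolation. The sketch below is not the Markup.pm code; the value of $graphicfile and the two helper names are assumed for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical value of the lxr.conf parameter: a list of alternatives.
my $graphicfile = 'png|jpg|gif';

# Without grouping, the pattern reads as: \.png  OR  jpg  OR  gif$
sub is_graphic_buggy {
    my ($name) = @_;
    return $name =~ /\.$graphicfile$/ ? 1 : 0;
}

# With grouping, the dot and the end anchor apply to every alternative.
sub is_graphic_fixed {
    my ($name) = @_;
    return $name =~ /\.($graphicfile)$/ ? 1 : 0;
}

foreach my $name ('logo.png', 'jpgtools.txt', 'modgif') {
    printf "%-14s buggy=%d fixed=%d\n",
        $name, is_graphic_buggy($name), is_graphic_fixed($name);
}
```

Only 'logo.png' is a graphic file by extension, yet the ungrouped pattern also accepts the other two names because 'jpg' matches anywhere and 'gif' only needs to end the name.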

----------------------------------------------------------------------


[Lxr-dev] [ lxr-Bugs-3588471 ] Missing identifier occurrences in case-insensitive mode

From: SourceForge.net <no...@so...> - 2012-11-19 10:47:24

Bugs item #3588471, was opened at 2012-11-19 02:47
Message generated for change (Tracker Item Submitted) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3588471&group_id=27350

Category: Browsing
Group: v1.1
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Missing identifier occurrences in case-insensitive mode

Initial Comment:
LXR 1.0 and 1.1-beta

When querying for an identifier in 'ident', occurrences are not reported if they differ in case from the query key. This matters for case-insensitive languages, where the identifier is recorded in uppercase in the database.

Suggested fix: query the database a second time with the identifier folded to uppercase, then merge the two lists, eliminating duplicates. To warn the user, uppercase matches (i.e. approximate matches) could be flagged with a distinctive character.

----------------------------------------------------------------------


[Lxr-dev] [ lxr-Bugs-3587475 ] Files sometimes mistaken for graphic files

From: SourceForge.net <no...@so...> - 2012-11-15 11:14:31

Bugs item #3587475, was opened at 2012-11-15 03:14
Message generated for change (Tracker Item Submitted) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3587475&group_id=27350

Category: Browsing
Group: v1.0
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Files sometimes mistaken for graphic files

Initial Comment:
Release 1.0, but affects all releases since 0.9.9

A pair of parentheses is missing around variable $graphicfile (containing the pattern from lxr.conf) in the pattern matching for graphic files at line 344 in Markup.pm. As a result, the leading dot applies only to the first alternative and the trailing dollar only to the last. Consequently, the inner alternatives may match any part of the filename, erroneously classifying the file as graphics if it has no parser associated with it.

----------------------------------------------------------------------


[Lxr-dev] [ lxr-Bugs-3586369 ] General search for files errors out

From: SourceForge.net <no...@so...> - 2012-11-15 11:04:13

Bugs item #3586369, was opened at 2012-11-12 07:42
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3586369&group_id=27350

Category: Browsing
Group: v1.0
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Andre-Littoz (ajlittoz)
Summary: General search for files errors out

Initial Comment:
Release 1.0, but bug present since at least 0.9.7

search script

When looking for files ONLY (without general text), matching filenames are stuffed into the @results array as plain strings. However, in the other case, when a text is searched for, each result is an array of several elements, of which the first is the filename. To go through the generic results processing, file results must look like one-element arrays instead of mere strings.

Fix: at line 423 (line number for release 1.0), enclose $_ in square brackets as [ $_ ]
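
The effect of the brackets can be seen in a small stand-alone sketch. It does not reproduce the search script: format_results is a hypothetical stand-in for the generic results processing, and the file names are made up.

```perl
use strict;
use warnings;

# Hypothetical generic processing: every result is an array reference whose
# first element is the file name; text hits carry extra elements.
sub format_results {
    my (@results) = @_;
    return map {
        my ($file, @rest) = @$_;
        @rest ? "$file: @rest" : $file;
    } @results;
}

my @filenames = ('search', 'lib/LXR/Common.pm');

# File-only search: wrap each name as [ $_ ] so it looks like a text hit.
my @results = map { [ $_ ] } @filenames;

print "$_\n" for format_results(@results);
```

Passing the bare strings instead would die under "use strict" when the processing dereferences them as arrays.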

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-15 03:04

Message:
Fixed as suggested in CVS

----------------------------------------------------------------------


[Lxr-dev] [ lxr-Feature Requests-3578666 ] Add ignorefiles and extend ignoredirs

From: SourceForge.net <no...@so...> - 2012-11-15 08:11:49

Feature Requests item #3578666, was opened at 2012-10-20 04:14
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578666&group_id=27350

Category: General
Group: None
>Status: Closed
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add ignorefiles and extend ignoredirs

Initial Comment:
It would be nice to have an ignorefiles option for files, just like ignoredirs for directories. I have added this to my lxr and it's just 3 lines of code.
Would it also be possible for the ignoredirs option to handle regexps? I would like to exclude the /include dir from indexing (as header files are also within libs).

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-15 00:11

Message:
Extension implemented as 3 new configuration parameters while retaining the
present simple and fast 'ignoredirs'.

a/ 'ignorefiles' is a regexp against the final path segment (aka.
filename). If it matches, file is skipped.
b/ 'filterdirs' is an array of regexp against the full path. If one
matches, directory is skipped.
c/ 'filterfiles' is an array of regexp against the full path. If one
matches, file is skipped.

These exclusion rules are tested inside the getdir() function. This function
is a method of the storage engines (Files/*.pm). It lists the content of a
directory, handling sub-directories and files separately. Rules are applied
in this order: 'ignore***' first, then 'filter***' if the first rule did not
exclude the candidate directory/file.

The exclusion rules are checked only in getdir(). This makes it possible to
bypass them by typing an otherwise forbidden path as a URL in the browser
address bar. Of course, the locally declared variables or functions will
not be highlighted since they have not been indexed by genxref. There's no
free meal!
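
The order of the tests described above can be sketched as follows. This is not the getdir() code: the %config values and the keep_entry helper are assumptions chosen to mirror the /include versus /somelib/include case from this thread.

```perl
use strict;
use warnings;

# Hypothetical configuration values; parameter names follow the comment above.
my %config = (
    ignoredirs  => [ 'CVS', '.git' ],        # exact last-segment names
    ignorefiles => qr/(?:\.[oa]|^core)$/,    # regexp on the file name
    filterdirs  => [ qr{^/include/$} ],      # regexps on the full path
    filterfiles => [ qr{^/lib/.*\.bak$} ],   # regexps on the full path
);

# Decide whether an entry of directory $dir (with trailing slash) is kept.
# The cheap 'ignore***' test runs first; 'filter***' runs only if it passed.
sub keep_entry {
    my ($dir, $node, $is_dir) = @_;
    if ($is_dir) {
        return 0 if grep { $node eq $_ } @{ $config{ignoredirs} };
        my $path = "$dir$node/";
        return 0 if grep { $path =~ $_ } @{ $config{filterdirs} };
    } else {
        return 0 if $node =~ $config{ignorefiles};
        my $path = "$dir$node";
        return 0 if grep { $path =~ $_ } @{ $config{filterfiles} };
    }
    return 1;
}

foreach my $dir ('/', '/somelib/') {
    printf "%sinclude/ -> %s\n", $dir,
        keep_entry($dir, 'include', 1) ? 'keep' : 'skip';
}
```

Leaving 'filterdirs' and 'filterfiles' empty reduces the check to the last-segment test, which is the fast path the existing 'ignoredirs' provides.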

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-05 06:45

Message:
I was thinking of a new pair of parameters.

In your specification proposal, you want to be able to filter the full
path.

Presently, 'ignoredirs' and the new 'ignorefiles' are activated in sub
getdir() when scanning the "current" directory. It is thus very fast to
check only the last segment of the path. I could extend 'ignoredirs' to be
a mixed list of strings and regexps (if I can find an efficient Perl way to
discriminate between them) but still on the last path segment.

The new set (or may be a single parameter, a path is a string after all)
would be an indication that full path filtering is wanted. The reason why
I'd like to have both sets separate is I fear the cost of repetitively
regexp-testing the full path when genxref'ing the kernel (38'000 files and
hundreds of directories with an average path length over 60 characters,
max. around 110 characters). Presently, my best indexing time on my
high-end computer (3.4GHz) is 2 hours 40 minutes on a 3.1 kernel. I had a
hard time to squeeze it from 3:50 to 2:40 (this was through DB requests
restructuring, but directory tree traversal seems also expensive -- I know
the worst step is reference collecting because the parser is written in
Perl [interpretation not execution!!] with regexp instead of a good LR
finite state automaton).

If the set does not exist, I can quickly skip the test. If it exists, I can
launch a "long" test on the full path.

In the single set solution, I don't see how I can keep the fast
last-segment test and switch to the long full-path regexp test.

On what kind of tree do you need such detailed exclusion control? (number
of files/directories, any conventional pattern in names?, mixture of
languages, ...) This information could give me leads in better
understanding your needs.

ajl

PS I've uploaded a beta version of the User Manual with a description of
'ignorefiles'. You can download it through a link in
http://lxr.sf.net/en/index.html. Please give me your feedback.

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-05 02:58

Message:
Would it be possible to leave 'ignoredirs'? The list could be extended to
handle something like r'abc', where abc would be regexp. If just name got
given then it would work as it worked before.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-02 06:04

Message:
I rearchitected the "storage" backend by factoring out the common
'ignoredirs' and file filtering processing. They are now located in a single Files.pm
method which can be referenced from the specific classes.

dirs: I can add a new parameter to filter out based on full path instead of
last segment. It is preferentially a regexp to allow accurate exclusion.
However, I fear performance impact on kernel indexing (more than 38'000
files which would trigger the regexp -- mostly to tell "go ahead")

What would you suggest for the name of the global directory-excluding
parameter?

files: I replaced the various hard-coded regexp in the storage backends by
a call to the new method which uses regexp contained in 'ignorefiles'. I
also removed the filter in source's direxpand since the regexp already
excludes the previously discarded files (and it is more efficient since the
removal is done when enumerating the directory).

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-01 01:21

Message:
Transferred from "support request" to "feature request"

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-24 07:55

Message:
Mmmh! Your "specification" is hard to twist into the present
implementation. It was designed to be rather efficient: 'ignoredirs' is
taken into consideration when function getdir() is invoked to enumerate the
content of a directory. 'ignoredirs' subdirectories are filtered here. This
is also where 'ignorefiles' could be filtered. But, only this very "local"
path element is compared, not the whole absolute path.

This is very good for large projects such as the Linux kernel (~37 000
files and hundreds, maybe thousands, of directories). I want to keep
performance on such projects.

'ignoredirs' is also scanned in toreal() function with pattern matching.
This is compatible with a longer path fragment (i.e. containing path
separators). But this function exists only in Plain.pm and CVS.pm, not in
GIT.pm nor Subversion.pm. Consequently, this is not the place for
implementation.

While I think about an angle of attack, what about the following strategy
since your concern is to prevent duplicates from entering into the DB:
- before genxref step, disable (or remove) the links (ln) causing the
duplicates,
- launch genxref to create the DB without duplicates,
- recreate the links.

This could temporarily solve your problem. If there are too many links, you
can design a small script so that you only type a short command to do the
removal/creation.

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-10-21 08:48

Message:
Exactly, I have duplicate files in /include folder. Due to that any search
results in duplicate results. I also cannot add this folder to ignoredirs
because I have some other include dirs in some libs. So really I would like
to ignore only /include folder and not /somelib/include.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-21 08:28

Message:
I experimented with 'filter' and finally got it right. To include only Perl
files for instance, add in lxr.conf:

, 'filter' => '(\\/$|\\.pm$)'

The first alternative keeps directories (they have a canonical trailing
slash as fixed in LXR::Common::httpinit); the second keeps only .pm files;
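
To see what this pattern lets through, here is a small check. It assumes the value is applied to the path as an ordinary regexp match, which is how the comment describes it; the sample paths are made up.

```perl
use strict;
use warnings;

# Same value as in the lxr.conf line above; in a single-quoted Perl string
# each \\ yields one backslash, so the regexp is (\/$|\.pm$).
my $filter = '(\\/$|\\.pm$)';

# Assumption: the pattern is applied to the path as a plain regexp match.
foreach my $path ('lib/', 'lib/LXR/Config.pm', 'README.txt') {
    printf "%-20s %s\n", $path, ($path =~ /$filter/ ? 'shown' : 'hidden');
}
```

Directories pass through the first alternative thanks to their trailing slash, which is why they can still be listed.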

I admit that this INCLUDE rule is probably less flexible than an EXCLUDE
rule. Second, it does not prevent genxref from indexing. I'll add an
'ignorefiles' parameter for both genxref and source.

Could you better explain your "exclude /include dir from indexing (as
header files are also within libs)". Do you mean there is a link resulting
in duplicate files: one set accessed through /include and another accessed
through /libs? I'll see if the "already indexed" feature can cope with
this. Otherwise, add one of the sets to 'ignoredirs'.

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-10-21 02:00

Message:
Actually, I do not mind displaying the file/directory. I would rather it
not be indexed. 
For ignoring files (from indexing) I just modified the following:

LXR/Files/Plain.pm

                # Check directories to ignore
                if (-d $dir . $node) {
                        foreach my $ignoredir (@{$config->{'ignoredirs'}}) {
                                next FILE if $node eq $ignoredir;
                        }
                        # Directory to keep: suffix name with a slash
                        push(@dirs, $node . '/');
                } else {
-->                        foreach my $ignorefile ($config->ignorefiles) {
-->                                next FILE if $node eq $ignorefile;
-->                        }
                        # File: don't change the name
                        push(@files, $node);
                }

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-20 07:12

Message:
Well, this is a new feature which could be included in the next release.

1/ regexp for 'ignoredirs'
I had a quick look at the code sections related to 'ignoredirs'. The change
seems to involve the 'getdir' sub in the various Files/ handlers.

2/ files exclusion
I wonder if it is not already there. Look at the end of 'source' script
(lines 401-406 in release 1.0). There is an undocumented call to lxr.conf's
parameter 'filter'. It looks like it should be a regexp SELECTING (not
excluding) which file or directory is displayed. This has been lurking in
'source' for ages and I really never succeeded in setting it up correctly.
The main difficulty for the regexp is to be valid both for directories
(otherwise they can't be listed) and for wanted files. All failures end up
with 'file does not exist' (which is what I always got!). You might
experiment with it.

Note that this does not exclude files from indexing, meaning you have no
speed improvement in genxref.

I suppose your 3-line solution is something equivalent to line 242 (release
1.0) which excludes *.o, *.a, core files (and also the index files of the
initial LXR implementation). Can you send your patch?

Best regards
ajl

----------------------------------------------------------------------


[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-15 07:58:16

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Category: General
Group: None
>Status: Closed
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run lxr indexation on a regular basis. I would like to know whether a run was successful or not. Would it be possible to add the date/time of the last successful run somewhere on the page? Or have a separate page with the history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-14 23:58

Message:
Implemented in CVS - will be available in release 1.1

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-08 05:50

Message:
I've done the kernel test. Reported time by 'time ./genxref ...' is 2:40:12
instead of 2:39:40, i.e. it is the same within the uncertainty of the
measurement. It is quite fun to monitor indexing progress in real time in
directory display with a browser.

I proceed with other tests, then I'll implement the warning flag in
identifier search.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-08 02:53

Message:
1- I understand that your main concern is determining if genxref is still
running or not. I needed it also. That's why I changed genxref log so that
work on a file is only one line long, with colours indicating status (or
type of indexing). To quickly have an idea of where genxref is, directory
name is repeated every so often (the purple lines) so that it remains
visible even with scrolling.

Of course, if genxref is launched in the background (or on a remote
computer as a batch or cron job), there is no associated terminal and you
have no progress output. In that case, you can try 'top' utility (or at
least 'ps', but it is not very user-friendly). On Windows, I think your
choice is limited to the 'Task manager'.

2- In my test implementation, I record the current time when reference
collection terminates. Definition collection is supposed to have been done
in a previous pass (the one handled by ctags). Since this previous pass
with ctags is quite fast, I thought it was not necessary to log its
completion time.

This time-stamp is an individual file property. Consequently, you can use
your browser to monitor indexing progress by refreshing a directory page.
You'll see the changes in the "Last indexed" column.

As presently implemented, different versions for a file have their own
timestamp which are shown if they are OK. With a small smart test, I can
make the difference between a file which cannot be indexed because there is
no scanner for its language (marked as - in the 'last indexed' column) and
a file modified since last genxref (marked as 'Not valid' since displaying
the date would need a careful visual comparison between last modification
and last indexed dates -- I chose this indication because the date is not
important for me if it is stale).

Visually, it is very fast to see which files will give questionable
cross-reference results.
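
The choice of what goes into the "Last indexed" column could be sketched like this. It is not the implemented code: last_indexed_label and its three arguments are hypothetical, and treating a parsable but never indexed file as 'Not valid' is an assumption of the sketch.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper choosing the text of the "Last indexed" column.
#   $has_parser  : true when a scanner exists for the file's language
#   $indexed_at  : epoch time recorded when reference collection ended,
#                  undef if the file was never indexed
#   $modified_at : epoch time of the last modification of the file
sub last_indexed_label {
    my ($has_parser, $indexed_at, $modified_at) = @_;
    return '-' unless $has_parser;
    # Assumption: a never-indexed file is reported like a stale one.
    return 'Not valid' if !defined $indexed_at || $indexed_at < $modified_at;
    return strftime('%Y-%m-%d %H:%M:%S', gmtime($indexed_at));
}

printf "%-10s %s\n", 'notes.txt', last_indexed_label(0, undef, 1_352_000_000);
printf "%-10s %s\n", 'stale.c',   last_indexed_label(1, 1_351_000_000, 1_352_000_000);
printf "%-10s %s\n", 'fresh.c',   last_indexed_label(1, 1_352_000_500, 1_352_000_000);
```

The date is formatted in UTC here only to keep the sketch independent of the local time zone.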

3- As soon as I've checked the feature reliability, I'll also implement
some flagging in identifier results where it is much more important to know
indexing might be wrong when clicking on a line number (which might not
jump on the identifier line!).

4- I have not yet reindexed a kernel to measure the performance impact. On
my small test cases, it seems negligible but duration is too short to
reveal a real nuisance.

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-08 01:43

Message:
Actually "Or maybe warning on directory
listing page, that not all files are indexed?" might not be so easy to do.
Forget it. :)

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-08 01:29

Message:
I guess it's hard to determine successful run. I guess for me it is when
genxref indexed/tried to index all the files. I think it could be even just
finishing genxref (with or without error) because some servers may get
overloaded and operation may not finish overnight. I think I would like to
know that it is still running after 12h. :)
For point 2, I think it could help. I have doubts about that background
color - maybe just warning would be enough. Or maybe warning on directory
listing page, that not all files are indexed? And I guess you will be able
to determine per version if file needs indexing or not?


----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-07 06:13

Message:
What is the definition of "successful run"? or a "failed run"?

Presently, there is nearly no status information returned. There are too
many independent software layers with their own ways of reporting what they
consider an error.

For example, DB errors are "displayed" with an error message, but when it
returns back to Perl, then to LXR, this information is lost!

Globally, genxref is a dispatcher. If spawning an operation is OK, it
considers it a "success", no matter how the dispatched task behaves.

I have already implemented part of the "indexing time" column. It can give
2 pieces of information:
1- whether the file really went through a parser (if not, column is blank;
e.g. .txt files)
2- whether the DB state is up-to-date with file state (if indexing time is
BEFORE file revision time, then cross-references are not valid).

For item 2, I'm having some trouble with svn. In the end, I could give a
background color (light red) in directory listing for files whose
cross-references are not valid. I could also add a warning at the top of
file content. And maybe, why not, have a background in identifier search
for references in files whose cross-refs are invalid.

Would that be of any help for your concern?

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-07 03:27

Message:
Well, it's not really what I was thinking about. I was hoping to have last
date of successful run of genxref (for each tree and version). Or last
couple of runs - successful and failed ones. But it's OK not to have this
feature. :) Maybe later.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-05 06:57

Message:
Oh! I missed that point. Following your remark, I can add a field in
lxr_status (which is associated with each base version of a file). With
this implementation there would be a date with each file. There is no DB
table associated with a "version" (because "versions" are rather virtual
with some storage, such as CVS or Git).

If date is in lxr_status, its retrieval is for free. The drawback is I
can't provide it in identifier search (at least until I get a result).

Also, when you index your tree incrementally, date for "file already
indexed" does not change; date is updated only for changed files.

Proposal:
- directory listing: a new column with date of last indexation
- file display: a line with date of last indexation

What do you think?

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-05 02:53

Message:
I think it would be ok. Would there be a date per each version for given
tree?

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-02 05:47

Message:
Would it be right if the date of last indexation appears in the showconfig
page?
This way, the DB would be queried for this date only when that page is
requested. Otherwise, if the date is printed in header/footer area, DB is
queried for every page displayed. Note this is relatively unimportant
because of the numerous other DB accesses. This could also be displayed in
ident page where it is important to be aware of result relevancy.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-01 01:22

Message:
Transferred from "support request" to "feature request"

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-21 08:33

Message:
The structure of the database is not compatible with an indexing history
(in my opinion): there is provision for only one state of the references.
Once genxref has begun its work, it should go to the end (unless something
is wrong with the access rights). The only case of inconsistent DB is that
latter error.

I will experiment with a new table to contain a time stamp and put it
somewhere in the header or footer.

----------------------------------------------------------------------


[Lxr-dev] [ lxr-Bugs-3586369 ] General search for files errors out

From: SourceForge.net <no...@so...> - 2012-11-12 15:42:49

Bugs item #3586369, was opened at 2012-11-12 07:42
Message generated for change (Tracker Item Submitted) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3586369&group_id=27350

Category: Browsing
Group: v1.0
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Andre-Littoz (ajlittoz)
Summary: General search for files errors out

Initial Comment:
Release 1.0, but bug present since at least 0.9.7

search script

When looking for files ONLY (without general text), matching filenames are stuffed into the @results array as plain strings. However, in the other case, when a text is searched for, each result is an array of several elements, of which the first is the filename. To go through the generic results processing, file results must look like one-element arrays instead of mere strings.

Fix: at line 423 (line number for release 1.0), enclose $_ in square brackets as [ $_ ]
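
The shape mismatch described in this report can be sketched as follows. This is an illustration in Python, not the Perl code of the search script: the function names `normalize_results` and `filenames` and the sample entries are hypothetical. The only point is that filename-only matches get wrapped in a one-element list, so that generic processing can always read the filename from element 0.

```python
def normalize_results(results):
    """Wrap bare filename strings in one-element lists.

    Text-search hits are already lists whose first element is the
    filename; filename-only hits arrive as plain strings.
    """
    normalized = []
    for entry in results:
        if isinstance(entry, str):
            # Same idea as writing [ $_ ] instead of $_ in the Perl script.
            normalized.append([entry])
        else:
            normalized.append(list(entry))
    return normalized


def filenames(results):
    """Generic processing: the filename is always element 0."""
    return [entry[0] for entry in normalize_results(results)]
```

With this normalization, a filename-only search and a text search can share the same downstream code path.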

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3586369&group_id=27350

[Lxr-dev] [ lxr-Bugs-3583172 ] Inconsistent reference to Registry in htaccess

From: SourceForge.net <no...@so...> - 2012-11-11 14:01:26

Bugs item #3583172, was opened at 2012-11-04 01:32
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3583172&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Browsing
Group: v1.0
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Inconsistent reference to Registry in htaccess

Initial Comment:
File htaccess-generic configures the Apache web server to run under various configurations: Apache 1.x or 2.x, prefork or worker module, mod_perl version, ...

When under mod_perl 2 and worker module, the Registry module is referenced as Apache::Registry (which is the name to use under mod_perl 1) instead of ModPerl::Registry. On a brand new Apache 2.x configuration, i.e. not incrementally updated for ages, the Apache::Registry module is not found and LXR errors out.

Change for consistent reference.

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-11 06:01

Message:
Fixed in CVS

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3583172&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-08 13:50:56

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run lxr indexation on regular basis. I would like to know if run was successful or not. Would it be possible to add date/time of last successful run somewhere on the page? Or have a separate page with history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-08 05:50

Message:
I've done the kernel test. The time reported by 'time ./genxref ...' is
2:40:12 instead of 2:39:40, i.e. it is the same within the uncertainty of the
measurement. It is quite fun to monitor indexing progress in real time in
directory display with a browser.
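
As a quick check of the two figures quoted in this comment, here is a small sketch in Python (not part of LXR; the helper name `to_seconds` is made up for illustration, and it assumes the shorter duration is the earlier run without the time stamp):

```python
def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' duration string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


before = to_seconds("2:39:40")  # assumed: earlier run, no time stamp recorded
after = to_seconds("2:40:12")   # run with the time stamp recorded

extra = after - before
print(f"extra time: {extra} s ({100 * extra / before:.2f} % of the run)")
```

The difference is 32 seconds, about a third of a percent of the run.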

I'll proceed with other tests, then I'll implement the warning flag in
identifier search.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-08 02:53

Message:
1- I understand that your main concern is determining if genxref is still
running or not. I needed it also. That's why I changed the genxref log so
that work on a file is only one line long, with colours indicating status
(or type of indexing). To quickly have an idea of where genxref is, the
directory name is repeated every so often (the purple lines) so that it
remains visible even with scrolling.

Of course, if genxref is launched in the background (or on a remote
computer as a batch or cron job), there is no associated terminal and you
have no progress output. In that case, you can try the 'top' utility (or at
least 'ps', but it is not very user-friendly). On Windows, I think your
choice is limited to the 'Task manager'.

2- In my test implementation, I record the current time when reference
collection terminates. Definition collection is supposed to have been done
in a previous pass (the one handled by ctags). Since this previous pass
with ctags is quite fast, I thought it was not necessary to log its
completion time.

This time-stamp is an individual file property. Consequently, you can use
your browser to monitor indexing progress by refreshing a directory page.
You'll see the changes in the "Last indexed" column.

As presently implemented, different versions of a file have their own
timestamps, which are shown if they are OK. With a small smart test, I can
make the difference between a file which cannot be indexed because there is
no scanner for its language (marked as - in the 'last indexed' column) and
a file modified since last genxref (marked as 'Not valid' since displaying
the date would need a careful visual comparison between last modification
and last indexed dates -- I chose this indication because the date is not
important for me if it is stale).
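
The three possible contents of the "Last indexed" column described here can be summarized in a short sketch. It is written in Python purely for illustration and is not the LXR implementation; the function name `last_indexed_cell`, its parameters and the date format are assumptions.

```python
from datetime import datetime


def last_indexed_cell(has_scanner: bool,
                      indexed_at: datetime,
                      modified_at: datetime) -> str:
    """Return the text of a hypothetical 'Last indexed' column cell."""
    if not has_scanner:
        # No scanner for the file's language: the file cannot be indexed.
        return "-"
    if indexed_at < modified_at:
        # File modified since the last genxref: cross-references are stale.
        return "Not valid"
    # Index is at least as recent as the file: show the indexing date.
    return indexed_at.strftime("%Y-%m-%d %H:%M")
```

Showing a fixed label for stale files avoids having to compare two dates by eye, which is the reason given in the comment.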

Visually, it is very fast to see which files will give questionable
cross-reference results.

3- As soon as I've checked the feature reliability, I'll also implement
some flagging in identifier results where it is much more important to know
indexing might be wrong when clicking on a line number (which might not
jump on the identifier line!).

4- I have not yet reindexed a kernel to measure the performance impact. On
my small test cases, it seems negligible but the duration is too short to
be a real bother.

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-08 01:43

Message:
Actually "Or maybe warning on directory
listing page, that not all files are indexed?" might not be so easy to do.
Forget it. :)

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-08 01:29

Message:
I guess it's hard to determine successful run. I guess for me it is when
genxref indexed/tried to index all the files. I think it could be even just
finishing genxref (with or without error) because some servers may get
overloaded and operation may not finish overnight. I think I would like to
know that it is still running after 12h. :)
For point 2, I think it could help. I have doubts about that background
color - maybe just warning would be enough. Or maybe warning on directory
listing page, that not all files are indexed? And I guess you will be able
to determine per version if file needs indexing or not?


----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-07 06:13

Message:
What is the definition of "successful run"? or a "failed run"?

Presently, there is nearly no status information returned. There are too
many independent software layers with their own ways of reporting what they
consider an error.

For example, DB errors are "displayed" with an error message, but when it
returns back to Perl, then to LXR, this information is lost!

Globally, genxref is a dispatcher. If spawning an operation is OK, it
considers it a "success", no matter how the dispatched task behaves.

I have already implemented part of the "indexing time" column. It can give
2 pieces of information:
1- whether the file really went through a parser (if not, column is blank;
e.g. .txt files)
2- whether the DB state is up-to-date with file state (if indexing time is
BEFORE file revision time, then cross-references are not valid).

For item 2, I'm having some trouble with svn. In the end, I could give a
background color (light red) in the directory listing to files whose
cross-references are not valid. I could also add a warning at the top of
file content. And maybe, why not, have a background in identifier search
for references in files whose cross-refs are invalid.

Would that be of any help for your concern?

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-07 03:27

Message:
Well, it's not really what I was thinking about. I was hoping to have last
date of successful run of genxref (for each tree and version). Or last
couple of runs - successful and failed ones. But it's ok to not have this
feature. :) Maybe later.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-05 06:57

Message:
Oh! I missed that point. Following your remark, I can add a field in
lxr_status (which is associated with each base version of a file). With
this implementation there would be a date with each file. There is no DB
table associated with a "version" (because "versions" are rather virtual
with some storage, such as CVS or Git).

If date is in lxr_status, its retrieval is for free. The drawback is I
can't provide it in identifier search (at least until I get a result).

Also, when you index your tree incrementally, date for "file already
indexed" does not change; date is updated only for changed files.
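
The incremental behaviour described here can be pictured with a small sketch. It is written in Python for illustration and is not LXR code: the dictionary `status` merely stands in for the lxr_status table, whose real layout is not shown in this thread, and the names `reindex`, `revisions` and `now` are assumptions.

```python
from datetime import datetime


def reindex(status, revisions, now):
    """Record an indexing date only for new or changed files.

    status:    path -> (indexed revision, date of last indexation)
    revisions: path -> revision currently in the source tree
    """
    for path, revision in revisions.items():
        previous = status.get(path)
        if previous is not None and previous[0] == revision:
            # File already indexed at this revision: keep its old date.
            continue
        status[path] = (revision, now)
    return status
```

After an incremental run, unchanged files therefore keep the date of the run that first indexed them, so dates within one tree can legitimately differ.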

Proposal:
- directory listing: a new column with date of last indexation
- file display: a line with date of last indexation

What do you think?

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-05 02:53

Message:
I think it would be ok. Would there be a date per each version for given
tree?

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-02 05:47

Message:
Would it be right if the date of last indexation appears in the showconfig
page?
This way, the DB would be queried for this date only when that page is
requested. Otherwise, if the date is printed in header/footer area, DB is
queried for every page displayed. Note this is relatively unimportant
because of the numerous other DB accesses. This could also be displayed in
ident page where it is important to be aware of result relevancy.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-01 01:22

Message:
Transferred from "support request" to "feature request"

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-21 08:33

Message:
The structure of the database is not compatible with an indexing history
(in my opinion): there is provision for only one state of the references.
Once genxref has begun its work, it should go to the end (unless something
is wrong with the access rights). The only case of inconsistent DB is that
latter error.

I will experiment with a new table to contain a time stamp and put it
somewhere in the header or footer.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-08 10:53:26

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run lxr indexation on regular basis. I would like to know if run was successful or not. Would it be possible to add date/time of last successful run somewhere on the page? Or have a separate page with history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-08 02:53

Message:
1- I understand that your main concern is determining if genxref is still
running or not. I needed it also. That's why I changed the genxref log so
that work on a file is only one line long, with colours indicating status
(or type of indexing). To quickly have an idea of where genxref is, the
directory name is repeated every so often (the purple lines) so that it
remains visible even with scrolling.

Of course, if genxref is launched in the background (or on a remote
computer as a batch or cron job), there is no associated terminal and you
have no progress output. In that case, you can try the 'top' utility (or at
least 'ps', but it is not very user-friendly). On Windows, I think your
choice is limited to the 'Task manager'.

2- In my test implementation, I record the current time when reference
collection terminates. Definition collection is supposed to have been done
in a previous pass (the one handled by ctags). Since this previous pass
with ctags is quite fast, I thought it was not necessary to log its
completion time.

This time-stamp is an individual file property. Consequently, you can use
your browser to monitor indexing progress by refreshing a directory page.
You'll see the changes in the "Last indexed" column.

As presently implemented, different versions of a file have their own
timestamps, which are shown if they are OK. With a small smart test, I can
make the difference between a file which cannot be indexed because there is
no scanner for its language (marked as - in the 'last indexed' column) and
a file modified since last genxref (marked as 'Not valid' since displaying
the date would need a careful visual comparison between last modification
and last indexed dates -- I chose this indication because the date is not
important for me if it is stale).

Visually, it is very fast to see which files will give questionable
cross-reference results.

3- As soon as I've checked the feature reliability, I'll also implement
some flagging in identifier results where it is much more important to know
indexing might be wrong when clicking on a line number (which might not
jump on the identifier line!).

4- I have not yet reindexed a kernel to measure the performance impact. On
my small test cases, it seems negligible but the duration is too short to
be a real bother.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-08 09:43:38

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by myny
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run lxr indexation on regular basis. I would like to know if run was successful or not. Would it be possible to add date/time of last successful run somewhere on the page? Or have a separate page with history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Lukasz M (myny)
Date: 2012-11-08 01:43

Message:
Actually "Or maybe warning on directory
listing page, that not all files are indexed?" might not be so easy to do.
Forget it. :)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-08 09:29:38

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by myny
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run lxr indexation on regular basis. I would like to know if run was successful or not. Would it be possible to add date/time of last successful run somewhere on the page? Or have a separate page with history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Lukasz M (myny)
Date: 2012-11-08 01:29

Message:
I guess it's hard to determine successful run. I guess for me it is when
genxref indexed/tried to index all the files. I think it could be even just
finishing genxref (with or without error) because some servers may get
overloaded and operation may not finish overnight. I think I would like to
know that it is still running after 12h. :)
For point 2, I think it could help. I have doubts about that background
color - maybe just warning would be enough. Or maybe warning on directory
listing page, that not all files are indexed? And I guess you will be able
to determine per version if file needs indexing or not?


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-07 14:13:24

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run LXR indexation on a regular basis. I would like to know whether a run was successful or not. Would it be possible to add the date/time of the last successful run somewhere on the page? Or have a separate page with a history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-07 06:13

Message:
What is the definition of "successful run"? or a "failed run"?

Presently, there is nearly no status information returned. There are too
many independent software layers with their own ways of reporting what they
consider an error.

For example, DB errors are "displayed" with an error message, but when
control returns to Perl, then to LXR, this information is lost!

Globally, genxref is a dispatcher. If spawning an operation is OK, it
considers it a "success", no matter how the dispatched task behaves.

I have already implemented part of the "indexing time" column. It can give
2 pieces of information:
1- whether the file really went through a parser (if not, column is blank;
e.g. .txt files)
2- whether the DB state is up-to-date with file state (if indexing time is
BEFORE file revision time, then cross-references are not valid).

For item 2, I'm having some trouble with svn. In the end, I could give a
background color (light red) in directory listing to files whose
cross-references are not valid. I could also add a warning at the top of
file content. And maybe, why not, have a background in identifier search
for references in files whose cross-refs are invalid.
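
A minimal sketch of the two checks listed above, assuming both times are available as epoch seconds. The helper name xref_status and the state labels are hypothetical and are not taken from LXR:

```perl
use strict;
use warnings;

# Hypothetical helper: classify a file from two epoch timestamps.
#   $indexed_at : time the file was last indexed (undef if it never
#                 went through a parser, e.g. .txt files)
#   $revised_at : time of the file's current revision
sub xref_status {
    my ($indexed_at, $revised_at) = @_;
    return 'not parsed' unless defined $indexed_at;
    return 'stale'      if $indexed_at < $revised_at;   # indexed BEFORE revision
    return 'valid';
}

# Map each state to the display ideas mentioned in the message.
my %display = (
    'not parsed' => 'blank column',
    'stale'      => 'light red background',
    'valid'      => 'indexing time shown',
);

for my $case ([undef, 100], [50, 100], [150, 100]) {
    my $state = xref_status(@$case);
    print "$state: $display{$state}\n";
}
```

How the revision time is obtained depends on the storage backend (plain files, CVS, svn, Git), which is exactly where the svn trouble mentioned above would show up.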

Would that be of any help for your concern?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-07 11:27:46

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by myny
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run LXR indexation on a regular basis. I would like to know whether a run was successful or not. Would it be possible to add the date/time of the last successful run somewhere on the page? Or have a separate page with a history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Lukasz M (myny)
Date: 2012-11-07 03:27

Message:
Well, it's not really what I was thinking about. I was hoping to have the
date of the last successful run of genxref (for each tree and version). Or the
last couple of runs, successful and failed ones. But it's OK not to have this
feature. :) Maybe later.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-05 14:57:41

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run LXR indexation on a regular basis. I would like to know whether a run was successful or not. Would it be possible to add the date/time of the last successful run somewhere on the page? Or have a separate page with a history of successful/failed runs.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-05 06:57

Message:
Oh! I missed that point. Following your remark, I can add a field in
lxr_status (which is associated with each base version of a file). With
this implementation there would be a date with each file. There is no DB
table associated with a "version" (because "versions" are rather virtual
with some storage, such as CVS or Git).

If the date is in lxr_status, retrieving it costs nothing extra. The drawback
is I can't provide it in identifier search (at least until I get a result).

Also, when you index your tree incrementally, the date for a "file already
indexed" does not change; the date is updated only for changed files.

Proposal:
- directory listing: a new column with date of last indexation
- file display: a line with date of last indexation

What do you think?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578666 ] Add ignorefiles and extend ignoredirs

From: SourceForge.net <no...@so...> - 2012-11-05 14:45:54

Feature Requests item #3578666, was opened at 2012-10-20 04:14
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578666&group_id=27350

Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add ignorefiles and extend ignoredirs

Initial Comment:
It would be nice to have an ignorefiles option for files, just like ignoredirs for directories. I have added this to my lxr and it's just 3 lines of code.
Would it also be possible for the ignoredirs option to handle regexps? I would like to exclude the /include dir from indexing (as header files are also within libs).

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-05 06:45

Message:
I was thinking of a new pair of parameters.

In your specification proposal, you want to be able to filter the full
path.

Presently, 'ignoredirs' and the new 'ignorefiles' are activated in sub
getdir() when scanning the "current" directory. It is thus very fast to
check only the last segment of the path. I could extend 'ignoredirs' to be
a mixed list of strings and regexps (if I can find an efficient Perl way to
discriminate between them) but still on the last path segment.

The new set (or maybe a single parameter, a path is a string after all)
would be an indication that full path filtering is wanted. The reason why
I'd like to keep both sets separate is that I fear the cost of repeatedly
regexp-testing the full path when genxref'ing the kernel (38'000 files and
hundreds of directories with an average path length over 60 characters,
max. around 110 characters). Presently, my best indexing time on my
high-end computer (3.4GHz) is 2 hours 40 minutes on a 3.1 kernel. I had a
hard time squeezing it from 3:50 to 2:40 (this was through DB requests
restructuring, but directory tree traversal seems also expensive -- I know
the worst step is reference collecting because the parser is written in
Perl [interpretation not execution!!] with regexp instead of a good LR
finite state automaton).

If the set does not exist, I can quickly skip the test. If it exists, I can
launch a "long" test on the full path.

In the single set solution, I don't see how I can keep the fast
last-segment test and switch to the long full-path regexp test.
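
The two-set arrangement described above might look roughly like this. It is only a sketch: 'ignorepaths' is a made-up parameter name (the thread has not settled on one), and %config stands in for the real LXR configuration object:

```perl
use strict;
use warnings;

# Stand-in configuration; only 'ignoredirs' is an existing parameter name.
my %config = (
    ignoredirs  => [ 'CVS', '.git' ],    # exact last-segment names
    ignorepaths => qr{^/include/$},      # hypothetical full-path regexp, may be undef
);

# Build the lookup hash once, not once per directory entry.
my %ignored = map { $_ => 1 } @{ $config{ignoredirs} };

# $parent is the directory being scanned (with trailing slash),
# $node the name of a subdirectory found in it.
# Returns 1 if the subdirectory must be kept, 0 if it is excluded.
sub keep_dir {
    my ($parent, $node) = @_;
    return 0 if $ignored{$node};              # fast test on the last segment
    my $re = $config{ignorepaths};
    return 1 unless defined $re;              # no full-path set: skip the long test
    return ("$parent$node/" =~ $re) ? 0 : 1;  # long test on the full path
}

print keep_dir('/', 'include')         ? "keep\n" : "skip\n";
print keep_dir('/somelib/', 'include') ? "keep\n" : "skip\n";
```

With the full-path set left undefined, the cost per entry is a single hash lookup, which is the property the message wants to preserve for kernel-sized trees.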

On what kind of tree do you need such detailed exclusion control? (number
of files/directories, any conventional pattern in names?, mixture of
languages, ...) This information could give me leads in better
understanding your needs.

ajl

PS I've uploaded a beta version of the User Manual with a description of
'ignorefiles'. You can download it through a link in
http://lxr.sf.net/en/index.html. Please give me your feedback.

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-11-05 02:58

Message:
Would it be possible to keep 'ignoredirs'? The list could be extended to
handle something like r'abc', where abc would be a regexp. If just a name is
given, then it would work as it worked before.
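
Perl has no r'abc' literal, but a precompiled qr// object can play that role, and ref() tells the two kinds of entry apart. The sketch below assumes the configuration list is allowed to hold qr// objects; the list contents and the helper name is_ignored are illustrative only:

```perl
use strict;
use warnings;

# Hypothetical extended 'ignoredirs': plain names mixed with precompiled regexps.
my @ignoredirs = ( 'CVS', '.svn', qr/^build-\d+$/ );

# Returns 1 if the directory name (last path segment) must be ignored.
sub is_ignored {
    my ($node) = @_;
    for my $item (@ignoredirs) {
        if (ref($item) eq 'Regexp') {
            return 1 if $node =~ $item;    # regexp entry
        } else {
            return 1 if $node eq $item;    # plain name, same behaviour as before
        }
    }
    return 0;
}

for my $name ('CVS', 'build-42', 'build', 'CVSROOT') {
    printf "%-8s %s\n", $name, is_ignored($name) ? 'ignored' : 'kept';
}
```

Plain names keep their exact-match meaning, so an existing configuration would behave as it did before.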

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-02 06:04

Message:
I rearchitected the "storage" backend by factoring out the common
'ignoredirs' and file filtering processing. They are now located in a single
Files.pm method which can be referenced from the specific classes.

dirs: I can add a new parameter to filter out based on full path instead of
last segment. It is preferably a regexp to allow accurate exclusion.
However, I fear performance impact on kernel indexing (more than 38'000
files which would trigger the regexp -- mostly to tell "go ahead")

What would you suggest for the name of the global directory-excluding
parameter?

files: I replaced the various hard-coded regexp in the storage backends by
a call to the new method which uses regexp contained in 'ignorefiles'. I
also removed the filter in source's direxpand since the regexp already
excludes the previously discarded files (and it is more efficient since the
removal is done when enumerating the directory).

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-01 01:21

Message:
Transferred from "support request" to "feature request"

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-24 07:55

Message:
Mmmh! Your "specification" is hard to twist into the present
implementation. It was designed to be rather efficient: 'ignoredirs' is
taken into consideration when function getdir() is invoked to enumerate the
content of a directory. 'ignoredirs' subdirectories are filtered here. This
is also where 'ignorefiles' could be filtered. But, only this very "local"
path element is compared, not the whole absolute path.

This is very good for large projects such as the Linux kernel (~37 000
files and hundreds, maybe thousands, of directories). I want to keep
performance on such projects.

'ignoredirs' is also scanned in toreal() function with pattern matching.
This is compatible with a longer path fragment (i.e. containing path
separators). But this function exists only in Plain.pm and CVS.pm, not in
GIT.pm nor Subversion.pm. Consequently, this is not the place for
implementation.

While I think about an angle of attack, what about the following strategy
since your concern is to prevent duplicates from entering into the DB:
- before genxref step, disable (or remove) the links (ln) causing the
duplicates,
- launch genxref to create the DB without duplicates,
- recreate the links.

This could temporarily solve your problem. If there are too many links, you
can design a small script so that you only type a short command to do the
removal/creation.
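
Such a script could be sketched as below, assuming the duplicates really come from symbolic links. The two helper names are hypothetical, and the genxref invocation is left out because its options depend on the installation:

```perl
use strict;
use warnings;

# Remove the given symbolic links, remembering where each one pointed.
# Paths that are not symbolic links are left untouched.
sub disable_links {
    my @paths = @_;
    my %saved;
    for my $path (@paths) {
        next unless -l $path;
        my $target = readlink($path);
        next unless defined $target;
        unlink($path) or die "cannot remove $path: $!";
        $saved{$path} = $target;
    }
    return \%saved;
}

# Recreate the links recorded by disable_links().
sub restore_links {
    my ($saved) = @_;
    for my $path (sort keys %$saved) {
        symlink($saved->{$path}, $path) or die "cannot recreate $path: $!";
    }
    return scalar keys %$saved;
}
```

Usage would be: call disable_links() with the link paths, run genxref, then pass the returned hash reference to restore_links().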

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-10-21 08:48

Message:
Exactly, I have duplicate files in the /include folder. Because of that, any
search returns duplicate results. I also cannot add this folder to ignoredirs
because I have some other include dirs in some libs. So really I would like
to ignore only /include folder and not /somelib/include.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-21 08:28

Message:
I experimented with 'filter' and finally got it right. To include only Perl
files for instance, add in lxr.conf:

, 'filter' => '(\\/$|\\.pm$)'

The first alternative keeps directories (they have a canonical trailing
slash as fixed in LXR::Common::httpinit); the second keeps only .pm files.
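
To see what that value selects, here is a small stand-alone check. It assumes the filter is applied as an ordinary pattern match against the file or directory name, which is a guess at how 'source' uses it; is_selected is a made-up helper:

```perl
use strict;
use warnings;

# Value as written in lxr.conf. Inside single quotes '\\' yields one
# backslash, so the regexp engine sees (\/$|\.pm$)
my $filter = '(\\/$|\\.pm$)';

# Returns 1 if the name would be selected (displayed), 0 otherwise.
sub is_selected {
    my ($name) = @_;
    return $name =~ /$filter/ ? 1 : 0;
}

for my $name ('LXR/', 'Common.pm', 'README', 'genxref') {
    printf "%-10s %s\n", $name, is_selected($name) ? 'kept' : 'dropped';
}
```

Names ending in a slash or in .pm are kept; everything else is dropped, which is why a filter that forgets the directory alternative makes directories unlistable.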

I admit that this INCLUDE rule is probably less flexible than an EXCLUDE
rule. Second, it does not prevent genxref from indexing. I'll add an
'ignorefiles' parameter for both genxref and source.

Could you better explain your "exclude /include dir from indexing (as
header files are also within libs)"? Do you mean there is a link resulting
in duplicate files: one set accessed through /include and another accessed
through /libs? I'll see if the "already indexed" feature can cope with
this. Otherwise, add one of the sets to 'ignoredirs'.

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-10-21 02:00

Message:
Actually, I do not mind displaying the file/directory. I would rather it
not be indexed. 
For ignoring files (from indexing) I just modified the following:

LXR/Files/Plain.pm

                # Check directories to ignore
                if (-d $dir . $node) {
                        foreach my $ignoredir (@{$config->{'ignoredirs'}}) {
                                next FILE if $node eq $ignoredir;
                        }
                        # Directory to keep: suffix name with a slash
                        push(@dirs, $node . '/');
                } else {
-->                        foreach my $ignorefile ($config->ignorefiles) {
-->                                next FILE if $node eq $ignorefile;
-->                        }
                        # File: don't change the name
                        push(@files, $node);
                }

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-20 07:12

Message:
Well, this is a new feature which could be included in the next release.

1/ regexp for 'ignoredirs'
I had a quick look at the code sections related to 'ignoredirs'. The change
seems to involve the 'getdir' sub in the various Files/ handlers.

2/ files exclusion
I wonder if it is not already there. Look at the end of 'source' script
(lines 401-406 in release 1.0). There is an undocumented call to lxr.conf's
parameter 'filter'. It looks like it should be a regexp SELECTING (not
excluding) which file or directory is displayed. This has been lurking in
'source' for ages and I really never succeeded in setting it up correctly.
The main difficulty for the regexp is to be valid both for directories
(otherwise they can't be listed) and for wanted files. All failures end up
with 'fil does not exist' (which is what I always got!). You might
experiment with it.

Note that this does not exclude files from indexing, meaning you have no
speed improvement in genxref.

I suppose your 3-line solution is something equivalent to line 242 (release
1.0) which excludes *.o, *.a, core files (and also the index files of the
initial LXR implementation). Can you send your patch?

Best regards
ajl

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578666&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578666 ] Add ignorefiles and extend ignoredirs

From: SourceForge.net <no...@so...> - 2012-11-05 10:58:09

Feature Requests item #3578666, was opened at 2012-10-20 04:14
Message generated for change (Comment added) made by myny
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578666&group_id=27350

Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add ignorefiles and extend ignoredirs

Initial Comment:
It would be nice to have an ignorefiles option for files, just like ignoredirs for directories. I have added this to my lxr and it's just 3 lines of code.
Would it also be possible for the ignoredirs option to handle regexps? I would like to exclude the /include dir from indexing (as header files are also within libs).

----------------------------------------------------------------------

>Comment By: Lukasz M (myny)
Date: 2012-11-05 02:58

Message:
Would it be possible to keep 'ignoredirs'? The list could be extended to
handle something like r'abc', where abc would be a regexp. If just a name is
given, then it would work as it worked before.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578666&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-05 10:53:30

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by myny
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run lxr indexation on a regular basis. I would like to know whether the run was successful or not. Would it be possible to add the date/time of the last successful run somewhere on the page? Or have a separate page with the history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Lukasz M (myny)
Date: 2012-11-05 02:53

Message:
I think it would be ok. Would there be a date for each version of a given
tree?

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-02 05:47

Message:
Would it be right if the date of last indexation appears in the showconfig
page?
This way, the DB would be queried for this date only when that page is
requested. Otherwise, if the date is printed in header/footer area, DB is
queried for every page displayed. Note this is relatively unimportant
because of the numerous other DB accesses. This could also be displayed in
ident page where it is important to be aware of result relevancy.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-01 01:22

Message:
Transferred from "support request" to "feature request"

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-21 08:33

Message:
The structure of the database is not compatible with an indexing history
(in my opinion): there is provision for only one state of the references.
Once genxref has begun its work, it should go to the end (unless something
is wrong with the access rights). The only case of inconsistent DB is that
latter error.

I will experiment with a new table to contain a time stamp and put it
somewhere in the header or footer.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

[Lxr-dev] [ lxr-Bugs-3583172 ] Inconsistent reference to Registry in htaccess

From: SourceForge.net <no...@so...> - 2012-11-04 09:32:31

Bugs item #3583172, was opened at 2012-11-04 01:32
Message generated for change (Tracker Item Submitted) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3583172&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Browsing
Group: v1.0
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Inconsistent reference to Registry in htaccess

Initial Comment:
File htaccess-generic configures the Apache web server to run under various configurations: Apache 1.x or 2.x, prefork or worker module, mod_perl version, ...

When under mod_perl 2 and worker module, Registry module is referenced as Apache::Registry (which is the name to use under mod_perl 1) instead of ModPerl::Registry. On a brand new Apache 2.x configuration, i.e. not incrementally updated for ages, the Apache::Registry module is not found and LXR errors out.

Change for consistent reference.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3583172&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578666 ] Add ignorefiles and extend ignoredirs

From: SourceForge.net <no...@so...> - 2012-11-02 13:04:59

Feature Requests item #3578666, was opened at 2012-10-20 04:14
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578666&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
>Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
>Assigned to: Andre-Littoz (ajlittoz)
Summary: Add ignorefiles and extend ignoredirs

Initial Comment:
It would be nice to have an ignorefiles option for files, just like ignoredirs for directories. I have added this to my lxr and it's just 3 lines of code.
Would it also be possible for the ignoredirs option to handle regexps? I would like to exclude the /include dir from indexing (as header files are also within libs).

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-02 06:04

Message:
I rearchitected the "storage" backend by factoring out the common 'ignoredirs'
and file filtering processing. They are now located in a single Files.pm
method which can be referenced from the specific classes.

dirs: I can add a new parameter to filter out based on full path instead of
last segment. It is preferentially a regexp to allow accurate exclusion.
However, I fear performance impact on kernel indexing (more than 38'000
files which would trigger the regexp -- mostly to tell "go ahead")
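
For illustration only, here is a minimal sketch of what full-path exclusion could look like. The helper name dir_is_excluded and the pattern list are hypothetical (no such parameter exists yet); the path is assumed to be relative to the tree root with leading and trailing slashes. Patterns are precompiled with qr// so that each directory costs only a match, not a compilation.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the directory path (relative to the
# tree root, with leading and trailing slashes) matches a pattern.
sub dir_is_excluded {
    my ($path, @patterns) = @_;
    foreach my $re (@patterns) {
        return 1 if $path =~ $re;
    }
    return 0;
}

# Anchoring at the start excludes the top-level /include only;
# the list name is an assumption, not an existing lxr.conf parameter.
my @ignoredirpaths = (qr{^/include/});

foreach my $path ('/include/', '/somelib/include/') {
    print $path, ' => ',
        (dir_is_excluded($path, @ignoredirpaths) ? 'skip' : 'keep'), "\n";
}
```

With the anchored pattern, /include/ is skipped while /somelib/include/ is kept, which is the distinction a comparison on the last path segment cannot make.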

What would you suggest for the name of the global directory-excluding
parameter?

files: I replaced the various hard-coded regexp in the storage backends by
a call to the new method which uses regexp contained in 'ignorefiles'. I
also removed the filter in source's direxpand since the regexp already
excludes the previously discarded files (and it is more efficient since the
removal is done when enumerating the directory).
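
As a rough sketch of the idea (not the actual Files.pm code), filtering at enumeration time could look like the following. The sub name filter_entries, the callback used in place of the -d test, and the sample 'ignorefiles' pattern are all assumptions made so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the shared filtering method: splits the raw
# entries of one directory into kept subdirectories and kept files.
# $is_dir is a callback so the sketch runs without touching the disk.
sub filter_entries {
    my ($entries, $is_dir, $ignoredirs, $ignorefiles) = @_;
    my (@dirs, @files);
  ENTRY:
    foreach my $node (@$entries) {
        if ($is_dir->($node)) {
            foreach my $ignoredir (@$ignoredirs) {
                next ENTRY if $node eq $ignoredir;
            }
            # Directory to keep: suffix name with a slash
            push @dirs, $node . '/';
        } else {
            # One regexp replaces the per-backend hard-coded tests
            next ENTRY if $node =~ $ignorefiles;
            push @files, $node;
        }
    }
    return (\@dirs, \@files);
}

my %known_dirs = map { $_ => 1 } qw(src CVS);
my ($dirs, $files) = filter_entries(
    [qw(src CVS main.c main.o core notes.txt~)],
    sub { exists $known_dirs{$_[0]} },
    ['CVS'],
    qr/(\.[oa]$|^core$|~$)/,    # assumed pattern, not LXR's default
);
print "dirs: @$dirs\nfiles: @$files\n";
```

Only src/ and main.c survive here; the discarded entries never reach the caller, so nothing downstream needs to filter them again.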

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-01 01:21

Message:
Transferred from "support request" to "feature request"

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-24 07:55

Message:
Mmmh! Your "specification" is hard to twist into the present
implementation. It was designed to be rather efficient: 'ignoredirs' is
taken into consideration when function getdir() is invoked to enumerate the
content of a directory. 'ignoredirs' subdirectories are filtered here. This
is also where 'ignorefiles' could be filtered. But, only this very "local"
path element is compared, not the whole absolute path.

This is very good for large sized projects such as the Linux kernel (~37
000 files and hundreds, maybe thousands, directories). I want to keep
performance on such projects.

'ignoredirs' is also scanned in toreal() function with pattern matching.
This is compatible with a longer path fragment (i.e. containing path
separators). But this function exists only in Plain.pm and CVS.pm, not in
GIT.pm nor Subversion.pm. Consequently, this is not the place for
implementation.

While I think about an angle of attack, what about the following strategy
since your concern is to prevent duplicates from entering into the DB:
- before genxref step, disable (or remove) the links (ln) causing the
duplicates,
- launch genxref to create the DB without duplicates,
- recreate the links.

This could temporarily solve your problem. If there are too many links, you
can design a small script so that you only type a short command to do the
removal/creation.
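
Such a script could be sketched as below. This is only an outline under assumptions: the sub name without_links is made up, the links are taken to be symbolic links, and the indexing step is passed in as a code reference so that your own genxref command line can be plugged in.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical wrapper: removes the given symbolic links, runs $action
# (for example a call to genxref), then recreates the links even if
# $action died. Returns the number of links handled.
sub without_links {
    my ($links, $action) = @_;
    my %target;
    foreach my $link (@$links) {
        next unless -l $link;
        $target{$link} = readlink $link;
        unlink $link or die "Cannot remove $link: $!";
    }
    my $ok  = eval { $action->(); 1 };
    my $err = $@;
    foreach my $link (keys %target) {
        symlink $target{$link}, $link
            or warn "Cannot recreate $link: $!";
    }
    die $err unless $ok;
    return scalar keys %target;
}

# Demonstration in a scratch directory instead of a real source tree
my $root = tempdir(CLEANUP => 1);
mkdir "$root/lib" or die "mkdir: $!";
symlink "$root/lib", "$root/include" or die "symlink: $!";
my $during;
without_links(["$root/include"],
    sub { $during = -e "$root/include" ? 'present' : 'absent' });
print "during: $during, after: ",
    (-l "$root/include" ? 'restored' : 'missing'), "\n";
```

For real use, the code reference would run genxref through system() with the arguments that fit your installation.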

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-10-21 08:48

Message:
Exactly, I have duplicate files in /include folder. Due to that any search
results in duplicate results. I also cannot add this folder to ignoredirs
because I have some other include dirs in some libs. So really I would like
to ignore only /include folder and not /somelib/include.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-21 08:28

Message:
I experimented with 'filter' and finally got it right. To include only Perl
files for instance, add in lxr.conf:

, 'filter' => '(\\/$|\\.pm$)'

The first alternative keeps directories (they have a canonical trailing
slash as fixed in LXR::Common::httpinit); the second keeps only .pm files;
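
To see what this pattern selects, here is a small stand-alone check. It assumes the script simply matches each entry name against the pattern; the sub name is_selected is made up for the sketch. Note that in a single-quoted Perl string each doubled backslash yields one backslash, so the effective regexp is (\/$|\.pm$).

```perl
use strict;
use warnings;

# Value exactly as written in lxr.conf
my $filter = '(\\/$|\\.pm$)';

# Hypothetical helper: true when the entry name matches the filter
sub is_selected {
    my ($name, $pattern) = @_;
    return $name =~ /$pattern/ ? 1 : 0;
}

foreach my $name ('LXR/', 'Common.pm', 'genxref', 'README.txt') {
    printf "%-12s %s\n", $name,
        is_selected($name, $filter) ? 'shown' : 'hidden';
}
```

The directory entry and the .pm file are shown; the other two names are hidden. Dropping the first alternative would hide every directory, which is the failure mode described in my earlier comment.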

I admit that this INCLUDE rule is probably less flexible than an EXCLUDE
rule. Second, it does not prevent genxref from indexing. I'll add an
'ignorefiles' parameter for both genxref and source.

Could you better explain your "exclude /include dir from indexing (as
header files are also within libs)". Do you mean there is a link resulting
in duplicate files: one set accessed through /include and another accessed
through /libs? I'll see if the "already indexed" feature can cope with
this. Otherwise, add one of the sets to 'ignoredirs'.

----------------------------------------------------------------------

Comment By: Lukasz M (myny)
Date: 2012-10-21 02:00

Message:
Actually, I do not mind displaying the file/directory. I would rather it
not be indexed. 
For ignoring files (from indexing) I just modified the following:

LXR/Files/Plain.pm

                # Check directories to ignore
                if (-d $dir . $node) {
                        foreach my $ignoredir (@{$config->{'ignoredirs'}}) {
                                next FILE if $node eq $ignoredir;
                        }
                        # Directory to keep: suffix name with a slash
                        push(@dirs, $node . '/');
                } else {
-->                        foreach my $ignorefile ($config->ignorefiles) {
-->                                next FILE if $node eq $ignorefile;
-->                        }
                        # File: don't change the name
                        push(@files, $node);
                }

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-20 07:12

Message:
Well, this is a new feature which could be included in the next release.

1/ regexp for 'ignoredirs'
I had a quick look at the code sections related to 'ignoredirs'. The change
seems to involve the 'getdir' sub in the various Files/ handlers.

2/ files exclusion
I wonder if it is not already there. Look at the end of 'source' script
(lines 401-406 in release 1.0). There is an undocumented call to lxr.conf's
parameter 'filter'. It looks like it should be a regexp SELECTING (not
excluding) which file or directory is displayed. This has been lurking in
'source' for ages and I really never succeeded in setting it up correctly.
The main difficulty for the regexp is to be valid both for directories
(otherwise they can't be listed) and for wanted files. All failures end up
with 'fil does not exist' (which is what I always got!). You might
experiment with it.

Note that this does not exclude files from indexing, meaning you have no
speed improvement in genxref.

I suppose your 3-line solution is something equivalent to line 242 (release
1.0) which excludes *.o, *.a, core files (and also the index files of the
initial LXR implementation). Can you send your patch?

Best regards
ajl

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578666&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-02 12:47:51

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
>Category: General
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
>Assigned to: Andre-Littoz (ajlittoz)
Summary: Add date of last indexation

Initial Comment:
I run lxr indexation on a regular basis. I would like to know whether the run was successful or not. Would it be possible to add the date/time of the last successful run somewhere on the page? Or have a separate page with the history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-02 05:47

Message:
Would it be right if the date of last indexation appears in the showconfig
page?
This way, the DB would be queried for this date only when that page is
requested. Otherwise, if the date is printed in header/footer area, DB is
queried for every page displayed. Note this is relatively unimportant
because of the numerous other DB accesses. This could also be displayed in
ident page where it is important to be aware of result relevancy.

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-01 01:22

Message:
Transferred from "support request" to "feature request"

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-21 08:33

Message:
The structure of the database is not compatible with an indexing history
(in my opinion): there is provision for only one state of the references.
Once genxref has begun its work, it should go to the end (unless something
is wrong with the access rights). The only case of inconsistent DB is that
latter error.

I will experiment with a new table to contain a time stamp and put it
somewhere in the header or footer.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

[Lxr-dev] [ lxr-Bugs-3580435 ] Files not filtered out with GIT

From: SourceForge.net <no...@so...> - 2012-11-02 09:00:38

Bugs item #3580435, was opened at 2012-10-26 00:46
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3580435&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: SCM support
Group: v1.0
>Status: Closed
>Resolution: Fixed
Priority: 4
Private: No
Submitted By: Andre-Littoz (ajlittoz)
Assigned to: Andre-Littoz (ajlittoz)
Summary: Files not filtered out with GIT

Initial Comment:
Release 1.0.0, GIT module

Files beginning with dot or ending with tilde are listed instead of being skipped.

Caused by reference to wrong variable ($node instead of $entryname) at line 115 in GIT.pm (unchecked copy-and-paste!)
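
A minimal sketch of the intended test follows, assuming it is a plain pattern match on the entry name; the helper name is_hidden_or_backup is made up and this is not the actual GIT.pm code. Because the wrong variable was also declared in scope, 'use strict' could not flag the mistake.

```perl
use strict;
use warnings;

# Hypothetical helper: true for names the directory listing should skip
# (leading dot or trailing tilde). The test must look at the name of the
# entry being enumerated, here $entryname.
sub is_hidden_or_backup {
    my ($entryname) = @_;
    return ($entryname =~ m/^\./ || $entryname =~ m/~$/) ? 1 : 0;
}

my @listing = grep { !is_hidden_or_backup($_) }
    ('.gitignore', 'Makefile', 'main.c~', 'main.c');
print "@listing\n";
```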

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-02 02:00

Message:
Fixed in CVS

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390117&aid=3580435&group_id=27350

[Lxr-dev] [ lxr-Feature Requests-3578866 ] Add date of last indexation

From: SourceForge.net <no...@so...> - 2012-11-01 08:22:10

Feature Requests item #3578866, was opened at 2012-10-21 02:16
Message generated for change (Comment added) made by ajlittoz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
>Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Lukasz M (myny)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add date of last indexation

Initial Comment:
I run lxr indexation on a regular basis. I would like to know whether the run was successful or not. Would it be possible to add the date/time of the last successful run somewhere on the page? Or have a separate page with the history of successful/failed runs.

----------------------------------------------------------------------

>Comment By: Andre-Littoz (ajlittoz)
Date: 2012-11-01 01:22

Message:
Transferred from "support request" to "feature request"

----------------------------------------------------------------------

Comment By: Andre-Littoz (ajlittoz)
Date: 2012-10-21 08:33

Message:
The structure of the database is not compatible with an indexing history
(in my opinion): there is provision for only one state of the references.
Once genxref has begun its work, it should go to the end (unless something
is wrong with the access rights). The only case of inconsistent DB is that
latter error.

I will experiment with a new table to contain a time stamp and put it
somewhere in the header or footer.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=390120&aid=3578866&group_id=27350

6 messages have been excluded from this view by a project administrator.
