lxr-developer Mailing List for LXR Cross Referencer (Page 40)
Brought to you by:
ajlittoz
From: Arne G. G. <ar...@li...> - 2001-11-19 08:26:37
|
* Kristoffer Gleditsch
> I run into the error mentioned in <200...@lu...>
> pretty often: "** Fatal: Can't locate LXR/Lang/Generic.pm in @INC
> (@INC contains: ...". It's the same module it complains about every
> time, but I'm not very familiar with mod_perl, so I don't know where
> to start debugging it.

I fudged this one like this over here:

Index: .htaccess
===================================================================
RCS file: /cvsroot/lxr/lxr/.htaccess,v
retrieving revision 1.3
diff -u -w -r1.3 .htaccess
--- .htaccess 2000/10/31 12:52:10 1.3
+++ .htaccess 2001/11/19 08:03:32
@@ -13,4 +13,5 @@
 <Files ~ (find|search|source|ident|diff|cgi-bin)$>
 SetHandler perl-script
 PerlHandler Apache::Registry
+PerlSetEnv PERL5LIB /home/argggh/src/ping/lxr/lib
 </Files>

A bit gross perhaps, but it works.

> Syntax stuff in the initdb-postgres file (I don't know if this will
> work on older versions of Postgres):
>
>
> Index: initdb-postgres
> ===================================================================
> RCS file: /cvsroot/lxr/lxr/initdb-postgres,v
> retrieving revision 1.4
> diff -u -b -B -r1.4 initdb-postgres
> --- initdb-postgres 2001/11/18 03:31:33 1.4
> +++ initdb-postgres 2001/11/19 00:42:06

I think you want

@@ -72,4 +76,4 @@
 grant select on releases to public;
 grant select on usage to public;
 grant select on status to public;
-
+grant select on declarations to public;

somewhere in there as well.

> I made some changes to LXR::Index::Postgres.pm as well. (This diff is
> made with -ubB, as the indenting in that file is not consistent, and I
> don't want to make the patch bigger than necessary.)
>
> - My PSQL setup needs the username and passwd arguments to
>   $dbi->connect(), so I added those.
>
> - Fixed the occasional typo; a forgotten $ and a place where a field
>   in the database had changed name from type to declid.

I think d.type is now d.declaration, actually, but I'm not sure. Malcolm?

Additionally, I tried indexing 2.4.8 on my laptop using this setup. I had to restart the indexing process a few times, and in so doing I ended up profiling genxref a bit and trimming the database activity some. I think something like this would be nice:

Index: lib/LXR/Lang/Generic.pm
===================================================================
RCS file: /cvsroot/lxr/lxr/lib/LXR/Lang/Generic.pm,v
retrieving revision 1.8
diff -u -w -r1.8 Generic.pm
--- lib/LXR/Lang/Generic.pm 2001/11/18 03:31:34 1.8
+++ lib/LXR/Lang/Generic.pm 2001/11/19 08:12:05
@@ -28,10 +28,11 @@
 use LXR::Common;
 use LXR::Lang;
 
-use vars qw($AUTOLOAD);
+use vars qw($AUTOLOAD $generic_config);
 
 @LXR::Lang::Generic::ISA = ('LXR::Lang');
 
 sub new {
     my ($proto, $pathname, $release, $lang) = @_;
     my $class = ref($proto) || $proto;
@@ -39,35 +40,48 @@
     bless ($self, $class);
     $$self{'release'} = $release;
     $$self{'language'} = $lang;
+
+    read_config() unless defined $generic_config;
+    %$self = (%$self, %$generic_config);
+
+    # Set langid
+    $$self{'langid'} = $self->langinfo('langid');
+    die "No langid for language $lang" if !defined $self->langid;
+
+    return $self;
+}
 
-    open (X, $config->genericconf) || die "Can't open $config->genericconf, $!";
+sub read_config {
+    open (CONF, $config->genericconf)
+        || die "Can't open $config->genericconf, $!";
+
     local($/) = undef;
 
-    my $cfg = eval ("\n#line 1 \"generic.conf\"\n".
-                    <X>);
+    $generic_config = eval ("\n#line 1 \"generic.conf\"\n".
+                            <CONF>);
     die ($@) if $@;
-    close X;
-    %$self= (%$self, %$cfg);
+    close CONF;
 
-    # Set langid
-    $$self{'langid'} = $self->langinfo('langid');
-    die "No langid for language $lang" if !defined $self->langid;
+    my $langmap = $generic_config->{'langmap'};
 
-    # Setup the ctags to declid mapping
-    my $typemap =\%{$self->langinfo('typemap')};
+    foreach my $lang (keys %$langmap) {
+        my $typemap = $langmap->{$lang}{'typemap'};
+        my $typeid = $langmap->{$lang}{'typeid'} = {};
     foreach my $type (keys %$typemap) {
-        $typemap->{$type}=$index->getdecid($self->langid, $typemap->{$type});
+            $typeid->{$type} =
+                $index->getdecid($langmap->{$lang}{'langid'},
+                                 $typemap->{$type});
     }
-
-    return $self;
 }
+}
+
 sub indexfile {
     my ($self, $name, $path, $fileid, $index, $config) = @_;
 
-    my $typemap = $self->langinfo('typemap');
+    my $typemap = $self->langinfo('typeid');
 
     my $langforce = $ {$self->eclangnamemapping}{$self->language};
     if (!defined $langforce) {

And, while hacking the above, I came to miss something like this:

Index: lib/LXR/Lang.pm
===================================================================
RCS file: /cvsroot/lxr/lxr/lib/LXR/Lang.pm,v
retrieving revision 1.24
diff -u -w -r1.24 Lang.pm
--- lib/LXR/Lang.pm 2001/11/14 15:03:29 1.24
+++ lib/LXR/Lang.pm 2001/11/19 08:12:04
@@ -30,6 +30,7 @@
     foreach $type (values %{$config->filetype}) {
         if ($pathname =~ /$$type[1]/) {
             eval "require $$type[2]";
+            die "Unable to load $$type[2] Lang class, $@" if $@;
             my $create = "new $$type[2]".'($pathname, $release, $$type[0])';
             $lang = eval($create);
             die "Unable to create $$type[2] Lang object, $@" unless defined $lang;

This last paragraph was originally dedicated to a rant about Postgres not using indexes on the usage-table queries, but when I tried to test it again just now, well, it did. I've rebooted the box since I tried yesterday, but I was pretty sure I tried restarting Postgres when I ran my head into this then. Darn. Several hours wasted yesterday, then.

Arne.
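The idea behind the Generic.pm patch above (parse generic.conf once per process, then precompute a per-language map from ctags letters to declaration ids) can be sketched roughly as follows. This is a Python analogue for illustration, not LXR code: `getdecid`, `read_config`, `GenericLang`, `load_count` and the sample config are all hypothetical stand-ins.

```python
# Sketch only: a Python analogue of caching generic.conf once per process.
# Every name and the sample config below are hypothetical, not LXR's API.

_generic_config = None   # plays the role of $generic_config in the patch
_decl_ids = {}           # stand-in for the declarations table
load_count = 0           # counts how often the config is actually parsed


def getdecid(langid, declaration):
    """Stand-in for $index->getdecid: one stable id per (langid, string)."""
    return _decl_ids.setdefault((langid, declaration), len(_decl_ids) + 1)


def read_config():
    """Parse the config once and precompute a ctags-letter -> declid map."""
    global _generic_config, load_count
    load_count += 1
    config = {
        "langmap": {
            "Java": {"langid": 1, "typemap": {"c": "class", "i": "interface"}},
            "C": {"langid": 2, "typemap": {"f": "function", "m": "member"}},
        }
    }
    for info in config["langmap"].values():
        # Keep typemap (letter -> string) intact and add typeid (letter -> id).
        info["typeid"] = {
            letter: getdecid(info["langid"], name)
            for letter, name in info["typemap"].items()
        }
    _generic_config = config


class GenericLang:
    def __init__(self, language):
        if _generic_config is None:      # only the first object pays the cost
            read_config()
        info = _generic_config["langmap"].get(language)
        if info is None:
            raise ValueError(f"No langid for language {language}")
        self.language = language
        self.langid = info["langid"]
        self.typeid = info["typeid"]


java = GenericLang("Java")
c_lang = GenericLang("C")
print(load_count, java.typeid, c_lang.typeid)
```

Keeping `typemap` untouched and storing the ids under a separate `typeid` key mirrors the patch: the original version overwrote the strings in place, which would break if the config were shared between objects.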
From: Jan-Benedict G. <jb...@lu...> - 2001-11-18 08:32:42
|
On Sun, 2001-11-18 12:35:56 +0900, Malcolm Box <ma...@br...> wrote in message <3BF...@br...>:
> Hi all,
>
> This is a heads-up for those of you using Postgres with the current CVS
> version. I'm about to land some changes which will alter the structure
> of the database. This is to support each language having its own
> strings when displaying identifiers in ident.

I volunteer for testing, as soon as my board comes back from Tyan (it died successfully, sending blue smoke signals :-(

MfG, JBG

--
Jan-Benedict Glaw . jb...@lu... . +49-172-7608481
http://lug-owl.de/~jbglaw/software/snapshot2cvs/
From: <no...@so...> - 2001-11-18 03:47:41
|
Bugs item #482977, was opened at 2001-11-17 19:47
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=482977&group_id=27350

Category: Database interface
Group: current cvs
Status: Open
Resolution: None
Priority: 5
Submitted By: Malcolm Box (mbox)
Assigned to: Malcolm Box (mbox)
Summary: declarations table not in 1NF

Initial Comment:
The new declarations table is joined to indexes via declid and langid. But currently declid is globally unique, so the langid field in indexes is redundant, though it might speed up language-based searching.

If declid were made non-unique across languages, languages could manage their own namespace without having to get a declid assigned from the database. However, having it unique makes the indexes lookup faster.
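The alternative raised in this report, where declid is unique only within a language, amounts to a composite (langid, declid) key. A minimal sketch using Python's sqlite3 module, with hypothetical table and column definitions that only loosely follow the names used on this list:

```python
import sqlite3

# Sketch only: the schema is an assumption for illustration, not LXR's
# actual initdb script. declid is unique per language, not globally.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE declarations (
        langid      INTEGER NOT NULL,
        declid      INTEGER NOT NULL,
        declaration TEXT    NOT NULL,
        PRIMARY KEY (langid, declid)
    );
    CREATE TABLE indexes (
        symname TEXT    NOT NULL,
        langid  INTEGER NOT NULL,
        type    INTEGER NOT NULL
    );
""")

# Both languages reuse declid 1; the langid column disambiguates them.
conn.executemany(
    "INSERT INTO declarations VALUES (?, ?, ?)",
    [(1, 1, "class"), (1, 2, "interface"), (2, 1, "function definition")],
)
conn.executemany(
    "INSERT INTO indexes VALUES (?, ?, ?)",
    [("Runnable", 1, 2), ("main", 2, 1)],
)

# With per-language ids the join needs both columns, so langid in indexes
# is no longer redundant.
rows = conn.execute("""
    SELECT i.symname, d.declaration
    FROM indexes i
    JOIN declarations d ON d.langid = i.langid AND d.declid = i.type
    ORDER BY i.symname
""").fetchall()
print(rows)
```

The trade-off is the one the report names: each language can hand out its own ids, at the price of a two-column join.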
From: Malcolm B. <ma...@br...> - 2001-11-18 03:38:20
|
Robin Theander wrote:
> I'm going to contact the source to see if it can be GPL'ed or whatever.
> If you know any other GPL'ed VHDL parser that's just the yacc skeleton with
> grammar, I could hack that up instead.

A random thought that strayed across my neurones - what would be the effort to get your parser integrated into ctags? I think ctags has a reasonably well-defined extension system, and if it was in ctags then (a) all the LXR support would be in place and (b) all the other tools like emacs/vi etc. that use ctags would also benefit.

Malcolm
From: <no...@so...> - 2001-11-18 03:34:47
|
Bugs item #476695, was opened at 2001-10-31 01:18
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=476695&group_id=27350

Category: Lang support
Group: current cvs
>Status: Closed
>Resolution: Fixed
Priority: 8
Submitted By: Malcolm Box (mbox)
Assigned to: Malcolm Box (mbox)
Summary: Java interfaces display as docs

Initial Comment:
When doing an ident search for a Java interface, it will be listed in the output as a "documentation entry". This is because the 'i' character is mapped to "documentation entry" by Common.pm.

The simple solution is to change this mapping, but the problem goes deeper than that. The output of ctags is not consistent across different languages. For example, "m" can mean method, module, macros, class member or mixins.

The solution will be to change the db structure to use an int for the type, and provide ctags -> int mappings for each language. There should be enough space in an 8-bit int for all different concepts across all languages.

Even better might be to give each language its own translation strings, so that Java methods are displayed as methods, not functions etc. However, currently ident does not interface to Lang.pm, so the mapping would have to be part of a global namespace.
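The fix described in this report, a per-language mapping from ctags kind letters to small integers with each language keeping its own display string, might look roughly like this. The kind tables below are illustrative assumptions rather than a complete or authoritative list of ctags kinds, and `build_type_table` and `describe` are hypothetical names.

```python
# Sketch only: the letter tables are illustrative, not a full ctags listing.
CTAGS_KINDS = {
    "Java": {"c": "class", "i": "interface", "m": "method"},
    "C": {"f": "function", "m": "struct or union member", "d": "macro"},
}


def build_type_table(kinds_by_language):
    """Assign one small integer per (language, letter) pair."""
    type_ids = {}
    labels = {}
    next_id = 1
    for language in sorted(kinds_by_language):
        for letter in sorted(kinds_by_language[language]):
            if next_id > 255:  # the report assumes an 8-bit type column
                raise ValueError("more than 255 declaration types")
            type_ids[(language, letter)] = next_id
            labels[next_id] = kinds_by_language[language][letter]
            next_id += 1
    return type_ids, labels


TYPE_IDS, LABELS = build_type_table(CTAGS_KINDS)


def describe(language, letter):
    """Return the display string for a ctags letter in a given language."""
    return LABELS[TYPE_IDS[(language, letter)]]


print(describe("Java", "m"), "/", describe("C", "m"))
```

Because the key is the (language, letter) pair, the same letter "m" resolves to different strings for Java and C, which is exactly the ambiguity the report complains about.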
From: Malcolm B. <ma...@br...> - 2001-11-18 03:32:03
|
Hi all, This is a heads-up for those of you using Postgres with the current CVS version. I'm about to land some changes which will alter the structure of the database. This is to support each language having its own strings when displaying identifiers in ident. Since I don't have Postgres here, I've only tested the changes against MySQL. I've made what I think are the correct alterations to Postgres.pm, and it compiles, but I can't test it. So could someone who runs Postgres grab the head and give it a whirl? (I'm so totally out of diskspace that installing Postgres is not an option :-( ) Thanks, Malcolm |
From: Malcolm B. <ma...@br...> - 2001-11-17 15:56:39
|
Hi,

Per Kristian Gjermshus wrote:
> > What do you think - are we at the point where we could declare a 1.0
> > release and move on to the next major development cycle, or are we
> > still missing key features, bug fixes or stability?
>
> There are currently 6 open bugs in the bugtracker. I don't think we
> should declare 1.0 before all of them are resolved. We could either
> decide that a bug is no problem for 1.0 or we should fix it.

I agree, we should go through all the bugs and either postpone them to the next release or fix them for 1.0. Currently there are the following bugs outstanding:

447980 Generic config missing
463138 Unable to enter full glimpse REs
469413 Non-symbols can be mistaken for symbols
471858 Some characters in files create trouble
476695 Java interfaces display as docs
476773 Shouldn't install global signal handlers
476775 mod_perl coding style should be checked
481573 requires non-free software for searching
481597 Should index X::Y() as well as Y()

Of these, I think "Generic config missing", "Non-symbols can be mistaken for symbols", "Shouldn't install global signal handlers", "mod_perl coding style should be checked" and "Should index X::Y as well as Y" can all be futured to post 1.0.

That leaves 4 bugs remaining. I have a fix in progress for 476695 and I can see a way to fix 481573. 471858 & 463138 are the same underlying bug to do with how aggressively we wash incoming http parameters. The signal handlers are trivial to remove, but replacing them with a good solution is more difficult, so maybe we should future this one.

All in all, there's not too much left to do I feel. Of course, there's still the lurking "unable to locate module Foo" bug that seems to crop up periodically, but without a reliable way to reproduce it or a good diagnosis I don't know what to do about it.

> Another thing is that I think that 1.0 should be able to index the
> entire linux-kernel and run on lxr.linux.no. It must of course be me who
> does this testing.

It would be very good to be able to showcase the new version on lxr.linux.no. Are you likely to be able to install the new version any time soon?

It may also be possible to install a demo site on the sourceforge webservers, possibly indexing some of the other projects on SF. I haven't yet investigated how easy this will be to set up.

> We should also decide on which database backends to support. We should
> not ship backends that do not work. Have anyone ever gotten the DBFile
> backend to work?

I know of no-one who has got it to work - unless someone steps up to make it work and maintain it, I suggest it should be dropped from the release.

There's also a PR job to do with all the various sites round the web that run the LXR to try to convince them to upgrade to the new version. So far I know of:

cvs.gnome.org
lxr.mozilla.org

who are running the 0.3 codebase. If anyone knows of any others, please let me know so I can contact them and suggest they might want to try the newer version.

Cheers,

Malcolm
From: <no...@so...> - 2001-11-17 15:37:16
|
Bugs item #481573, was opened at 2001-11-13 20:05
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=481573&group_id=27350

Category: None
Group: None
Status: Open
Resolution: None
Priority: 9
Submitted By: Nobody/Anonymous (nobody)
>Assigned to: Malcolm Box (mbox)
Summary: requires non-free software for searching

Initial Comment:
On your front page you suggest Glimpse, which is horridly non-free:
http://www.arco.de/~kj/harvest/glimpse-license-status

This program is GPL, so a user might assume that its dependencies are GPL. Please provide hooks for Swish-E http://swish-e.org/ or Swish++ http://homepage.mac.com/pauljlucas/software/swish/ which are GPL replacements.

If I am in error and this feature already exists in lxr, please change the content of http://lxr.linux.no/ to suggest Swish-E or Swish++.

I might note that I caught this as a part of our department's implementation of Bugzilla (bonsai uses lxr) which is a mission-critical application for development here, and now that we know that Glimpse has *never* been in the public domain (the source we found had a misleading copyright file) we need to find a replacement or pay up before the audit rolls around.

----------------------------------------------------------------------

Comment By: Rusty Carruth (rustyc)
Date: 2001-11-14 06:25

Message:
Logged In: YES
user_id=215914

There *is* a version of glimpse (I think around 4.0) which does not require a fee. This is the version we are using. A temporary workaround would be to grab that version and use it while 'someone' is fixing lxr to use the free alternative. (We found that version of glimpse by looking around on the 'net till we found it.)

rc

----------------------------------------------------------------------

Comment By: Malcolm Box (mbox)
Date: 2001-11-14 00:06

Message:
Logged In: YES
user_id=215386

I agree, depending on non-free software for a tool like LXR is not a good thing. Moving over to one of the free alternatives is on the to-do list. If you wanted to go ahead and provide the hooks for a replacement, I'd be very happy to incorporate the patch in the LXR.
From: <no...@so...> - 2001-11-17 15:36:42
|
Bugs item #471858, was opened at 2001-10-16 13:53
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=471858&group_id=27350

Category: Browsing
Group: current cvs
Status: Open
Resolution: None
Priority: 5
Submitted By: Per Kristian Gjermshus (pergj)
>Assigned to: Malcolm Box (mbox)
Summary: Some characters in files create trouble

Initial Comment:
I have a directory in my source-tree containing the string 'c-++'. It is not possible to browse these directories.

----------------------------------------------------------------------

Comment By: Malcolm Box (mbox)
Date: 2001-10-23 07:16

Message:
Logged In: YES
user_id=215386

Looks like this is caused by the httpwash function being over-zealous at stripping out characters. Perhaps we can get away with not washing variables - it seems that there are few dangerous calls that use web-provided parameters, and we could simply check at these places for troublesome characters, rather than globally restrict them. This is related to the glimpse RE bug as well.
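The suggestion in the comment above, checking web-provided values at the few places where they are dangerous instead of washing every parameter globally, could be sketched like this for the path case. It is a Python illustration under assumed requirements; `checked_source_path` is a hypothetical helper, not part of LXR.

```python
# Sketch only: validate a browse path at the point of use, rejecting just
# what is dangerous for filesystem access rather than a broad character set.
def checked_source_path(user_path):
    """Return a normalised path relative to the source root, or raise."""
    if "\x00" in user_path:
        raise ValueError("NUL byte in path")
    # Drop empty and "." components so "//a/./b/" becomes "a/b".
    parts = [p for p in user_path.split("/") if p not in ("", ".")]
    if ".." in parts:
        raise ValueError("path escapes the source root")
    return "/".join(parts)


# Characters such as '+' are harmless here, so a 'c-++' directory passes.
print(checked_source_path("/lib/c-++/"))
```

Only the two things that matter for a filesystem lookup are rejected, so directory names with unusual but harmless characters stay browsable.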
From: <no...@so...> - 2001-11-17 15:36:40
|
Bugs item #476775, was opened at 2001-10-31 06:34
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=476775&group_id=27350

Category: Browsing
Group: current cvs
Status: Open
Resolution: None
Priority: 5
Submitted By: Malcolm Box (mbox)
>Assigned to: Malcolm Box (mbox)
Summary: mod_perl coding style should be checked

Initial Comment:
The mod_perl coding hints from http://www.perlreference.com/mod_perl/guide/porting.html should be checked through and applied.
From: <no...@so...> - 2001-11-17 15:36:40
|
Bugs item #476773, was opened at 2001-10-31 06:33
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=476773&group_id=27350

Category: Browsing
Group: current cvs
Status: Open
Resolution: None
Priority: 7
Submitted By: Malcolm Box (mbox)
>Assigned to: Malcolm Box (mbox)
Summary: Shouldn't install global signal handlers

Initial Comment:
Currently Common.pm installs global signal handlers for DIE and WARN. This means that all scripts running under mod_perl will use these handlers, even if they are not part of LXR. This is wrong.
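The general remedy for this kind of bug is to install a handler only for the duration of your own code and restore the previous one afterwards. The sketch below shows that pattern with Python's warnings module as a loose analogy to Perl's WARN handler; it is not how LXR or mod_perl work, and `scoped_warning_handler` is a hypothetical name.

```python
import contextlib
import warnings


# Sketch only: an analogy in Python, not LXR's Perl. The handler is active
# inside the "with" block and the previous state is restored on exit.
@contextlib.contextmanager
def scoped_warning_handler(handler):
    # catch_warnings saves and later restores both the filters and
    # warnings.showwarning, so nothing leaks to unrelated code.
    with warnings.catch_warnings():
        warnings.simplefilter("always")
        warnings.showwarning = handler
        yield


seen = []


def collect(message, category, filename, lineno, file=None, line=None):
    """Record the warning text instead of printing it."""
    seen.append(str(message))


previous = warnings.showwarning
with scoped_warning_handler(collect):
    warnings.warn("inside the scoped block")

print(seen, warnings.showwarning is previous)
```

Code that runs outside the block, such as other scripts sharing the same interpreter, keeps whatever handler it had before.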
From: <no...@so...> - 2001-11-17 15:36:40
|
Bugs item #476695, was opened at 2001-10-31 01:18
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=476695&group_id=27350

Category: Lang support
Group: current cvs
Status: Open
Resolution: None
Priority: 8
Submitted By: Malcolm Box (mbox)
>Assigned to: Malcolm Box (mbox)
Summary: Java interfaces display as docs

Initial Comment:
When doing an ident search for a Java interface, it will be listed in the output as a "documentation entry". This is because the 'i' character is mapped to "documentation entry" by Common.pm.

The simple solution is to change this mapping, but the problem goes deeper than that. The output of ctags is not consistent across different languages. For example, "m" can mean method, module, macros, class member or mixins.

The solution will be to change the db structure to use an int for the type, and provide ctags -> int mappings for each language. There should be enough space in an 8-bit int for all different concepts across all languages.

Even better might be to give each language its own translation strings, so that Java methods are displayed as methods, not functions etc. However, currently ident does not interface to Lang.pm, so the mapping would have to be part of a global namespace.
From: <no...@so...> - 2001-11-17 15:36:39
|
Bugs item #463138, was opened at 2001-09-20 02:35
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=463138&group_id=27350

Category: Browsing
Group: current cvs
Status: Open
Resolution: None
Priority: 5
Submitted By: Malcolm Box (mbox)
>Assigned to: Malcolm Box (mbox)
Summary: Unable to enter full glimpse REs

Initial Comment:
Glimpse supports regular expressions with characters such as < and > in them, and also allows these to be escaped, i.e. \< and \>. However, the http washing code treats these as illegal and aborts the search request.

Clearly the freetext searching functions should allow a wider range of permissible inputs than the other scripts.
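One way to let free-text search accept a wider range of input than the other scripts is a per-script validation table. The sketch below is a Python illustration; the two patterns and the `check_param` helper are assumptions chosen for the example, not LXR's actual washing rules.

```python
import re

# Sketch only: the patterns are illustrative assumptions.
VALIDATORS = {
    # Identifier lookups: word characters plus a few separators.
    "ident": re.compile(r"[\w:~.]+"),
    # Free-text search: anything except control characters, so regular
    # expression syntax such as \< and \> is allowed through.
    "search": re.compile(r"[^\x00-\x1f\x7f]+"),
}


def check_param(script, value):
    """Return value if it is acceptable for the given script, else raise."""
    if VALIDATORS[script].fullmatch(value) is None:
        raise ValueError(f"illegal characters in {script} parameter")
    return value


print(check_param("search", r"\<close\>"))
```

A looser pattern is only safe if the search string is then handed to the search tool as a single argument rather than interpolated into a shell command line.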
From: Malcolm B. <ma...@br...> - 2001-11-17 15:34:34
|
Robin Theander wrote:
>
> Hi Malcolm,
>
> > What I half-implemented last night does the following:
>
> I do not see it in CVS yet ;-)
> BTW, is there any way of getting head revision out of the SF CVS by tarball.

That's cos it's only half implemented and thus not checked in :-)

I don't know how to get the HEAD revision out of the tarball - I assumed that if you untarred the tarball as a CVS repository, you could use the normal CVS commands to retrieve the head, but I've never tried it.

> > 1) Expand indexes.type to an int field
> > 2) Create a declarations table containing declid (int) and declaration
> > (char(255))
>
> If you use two ints (one for the language and one for the id) each language can
> have its own "namespace" (my original major, minor idea). The impact is small
> if using two small ints.

Good point. I've changed my implementation to use two numbers, one for the language code and one for the string id within the language. This also stops two languages sharing the string for, say, "class", which is good if one then wants to change it.

> Given the current bindings from filename to language module, it makes sense to
> put it into the files table. Your webpage example is pretty scary, but a
> special language module should take care of this. Unless the page is
> preprocessed in some hairy way, there's a parser going to read it at some time.

I'm going to put it in indexes since this is the logically correct place. In the webpage example, I would expect a special language module to take care of this, but it might then record the different languages found under multiple lang ids, possibly by delegating the work to different Lang::* modules, e.g. Webpage.pm -> split file into different languages -> pass to X.pm & Y.pm -> index.

> I checked up on the Alliance toolkit. The parser is divided into two.
> Behavioral and structural, and there's a lot of the language (like variables) not
> supported. It is a parser that is almost but not quite entirely unlike a VHDL
> parser...

Not so good.

> The good news is that the VAUL parser is derived from the parser I also used,
> so all I have to do is to recreate my special parser from the VAUL source and
> we're in GPL.

That is good news. Good luck with the extraction.

Malcolm
From: Robin T. <Rob...@te...> - 2001-11-15 13:37:25
|
Hi Malcolm,

> What I half-implemented last night does the following:

I do not see it in CVS yet ;-)
BTW, is there any way of getting head revision out of the SF CVS by tarball.

> 1) Expand indexes.type to an int field
> 2) Create a declarations table containing declid (int) and declaration
> (char(255))

If you use two ints (one for the language and one for the id) each language can have its own "namespace" (my original major, minor idea). The impact is small if using two small ints.

> 3) Each language now maps the ctags types output/whatever other type
> info it has to an int (currently hardwired)
> 4) Index::index() now stores the type field as an int
> 5) Ident then joins declarations.declid to indexes.type to get the right
> string for display.
>
> If we want to record the language the identifier was found in, it can
> either go in the files table, or in the indexes. Putting it in files
> implies that there is a 1 to 1 mapping from files to languages, which
> while currently true may not always be (think webpages with scripting in
> multiple languages :- ) . Putting it in indexes is a little redundant
> at the moment, especially since indexes is one of the biggest tables.

Given the current bindings from filename to language module, it makes sense to put it into the files table. Your webpage example is pretty scary, but a special language module should take care of this. Unless the page is preprocessed in some hairy way, there's a parser going to read it at some time.

> The licence is the number one problem. Cleaning up the code so it runs
> on non-gcc platforms would be good, but it's not essential since gcc is
> so widely available.

I checked up on the Alliance toolkit. The parser is divided into two. Behavioral and structural, and there's a lot of the language (like variables) not supported. It is a parser that is almost but not quite entirely unlike a VHDL parser...

The good news is that the VAUL parser is derived from the parser I also used, so all I have to do is to recreate my special parser from the VAUL source and we're in GPL.

Robin.

--
ASIC Design Engineer
Tellabs Denmark A/S
Direct: +45 4473 2942
rob...@te...
From: Robin T. <Rob...@te...> - 2001-11-15 08:07:55
|
Hi Malcolm,

Malcolm Box wrote:
> I guess you've already looked at them, but a websearch did find some
> other parsers, including a Perl one at
> http://www.cpan.org/modules/by-module/Hardware/ though it sounds *slow*.

Yup, the language definition is so huge and complex that a single small entity with a dummy arch takes 2+ minutes to parse.

> There's a VHDL compiler at Alliance, http://www-asim.lip6.fr/alliance/
> which is GPL'ed and thus might have a grammar that you could rip out.

Hmm, "my" parser was actually derived from an Alliance toolkit way back. I'll have a second look.

> And there's VAUL from http://www.freehdl.seul.org/frontend.html which
> claims to be a flex/bison job.

I looked at the VAUL. It's a big thing and cobbled together from several projects. I expected the job of ripping out and cleaning up the parser to be bigger than starting over. Then I found the current skeleton...

Thanks anyway.

Robin.

--
ASIC Design Engineer
Tellabs Denmark A/S
Direct: +45 4473 2942
rob...@te...
From: Malcolm B. <ma...@br...> - 2001-11-15 07:21:15
|
Hi,

I guess you've already looked at them, but a websearch did find some other parsers, including a Perl one at http://www.cpan.org/modules/by-module/Hardware/ though it sounds *slow*.

There's a VHDL compiler at Alliance, http://www-asim.lip6.fr/alliance/ which is GPL'ed and thus might have a grammar that you could rip out.

And there's VAUL from http://www.freehdl.seul.org/frontend.html which claims to be a flex/bison job.

Malcolm
From: Malcolm B. <ma...@br...> - 2001-11-15 05:26:38
|
Hi Robin,

Robin Theander wrote:

>Malcolm Box wrote:
>
>>Logically the mappings should be per-language, and ideally Common.pm
>>would not depend directly on the installed languages - ie it would not
>>hold the list of mappings. The problem is that the ident script doesn't
>>know what language each of the returned identifiers is in to display the
>>correct string.
>>
>>Probably the best solution is to create another database table that maps
>>a numeric id to a string, and then have the language modules store the
>>id number where they now store a character. Then each language module
>>can contain the string <-> number mapping, and simply check on
>>initialisation that the strings are in the db, adding them if not.
>>
>
>Isn't it too much hassle to put the strings in db? They can be in the language
>module together with all the other language specific stuff.
>
>In my little head it goes like this:
>1) Each identifier has to know what language it is.
>

That's the rub - currently each identifier does not store this
information. It seems like something that could be stored in the
indexes table, and perhaps it would be worth it.

>2) Language types are possibly ints allocated and defined somewhere in each
>language module (or in generic.conf).
>3) Each language module defines its own type to string mapping (as ints instead
>of chars?).
>4) ident looks for the relevant language mapping in the relevant module to
>return the string.
>

This would potentially be pretty slow - assume you have a common
identifier such as "close" which might appear in many different
languages. Each ident result returned would involve creating the
correct language module and then asking it for the type -> string
mapping. You could order the results by language, to reduce the
create/destroy count, but this is unlikely to be the order people
actually want to see the results in. Given I've got an LXR install
running where one common identifier returns over 1000 declarations,
I'm not sure I'd want to take that hit.

What I half-implemented last night does the following:

1) Expand indexes.type to an int field
2) Create a declarations table containing declid (int) and declaration
(char(255))
3) Each language now maps the ctags types output/whatever other type
info it has to an int (currently hardwired)
4) Index::index() now stores the type field as an int
5) Ident then joins declarations.declid to indexes.type to get the
right string for display.

The only difficulty is initialising the mapping in (3). My current
plan is to have each language hold its own type strings (e.g. "class",
"function definition") and on startup build the mapping string -> int
by searching declarations for the string and using the declid if found,
else inserting the string and using the new declid. For languages using
ctags, the initialisation would also build the appropriate ctags char
-> declid mapping. This should be pretty fast and is a one-time cost
for the language module, which is OK because the Lang modules are only
used for genxref & source, both of which have much bigger overheads
than that.

Note this doesn't require ident to know about Lang::* modules.

If we want to record the language the identifier was found in, it can
either go in the files table, or in the indexes table. Putting it in
files implies that there is a 1 to 1 mapping from files to languages,
which while currently true may not always be (think webpages with
scripting in multiple languages :-) ). Putting it in indexes is a
little redundant at the moment, especially since indexes is one of the
biggest tables.

>This could also make identifiers local to their language, but how should that
>be handled when
>1) Searching from scratch
>2) Displaying the identifier from a link from source (here we know the
>language).
>

Making identifiers carry language info is a good idea - it will help
for the source -> ident -> source jump that happens so often, since we
will be able to filter to identifiers from the same language (and even
possibly order by whether the id is in the same file or directory,
which would make it much faster to navigate). ident would then be
extended to allow selection of a language when searching, defaulting
to all languages as at present.

>I think ident should take an optional language identifier from the URL. This
>could be generated from source.
>

Indeed, that's how I see it working.

>Did I miss out on anything here? I haven't been in every dimly lit corner of the
>code..
>

Don't go there without a light, or the grues will get you...

>And something in the far dark of my mind says that the changes probably should
>be compatible with dbm support.
>

dbm support doesn't work at the moment anyway - there was a discussion
here about dropping it totally soon if no-one is prepared to work on
it. The overhead of getting someone to set up an RDBMS is so low that I
don't see it as a big issue, not to mention the fact that dbm
performance was why 0.3 sucked so much on big repositories.

>>I think it will be OK to add a C module to the distribution, provided it
>>comes with some reasonable way to build it. My guess (correct me if I'm
>>wrong) would be that the parser is pretty much vanilla C with no
>>platform dependencies, so it should be easy to build. I would
>>suggest creating a lib/LXR/Lang/VHDL subdir to keep the source and build
>>system in. Then those that want VHDL support can build it, and those
>>that don't can just comment out the config in lxr.conf that maps files
>>to VHDL (and in fact won't ever see a problem unless they have files
>>that look like VHDL).
>>
>
>I agree, but... The code skeleton has the following license (the files are from
>'93):
> * This file is intended not to be used for commercial purposes
> * without permission of the University of Twente and permission
> * of the University of Dortmund
>
>I'm going to contact the source to see if it can be GPL'ed or whatever.
>If you know any other GPL'ed VHDL parser that's just the yacc skeleton with
>grammar, I could hack that up instead.
>

I'd be very reluctant to let any non-free code into the main
distribution. I know glimpse isn't free, but there are moves afoot to
replace it (probably with Swish-E2) RSN.

I don't know of any other VHDL parser out there, although perhaps VHDL
mode from emacs might have something useful?

>The parser code btw, required quite a few hacks. It was in a very old lex
>dialect and gave some trouble with both flex and gcc. It would probably need
>some more cleaning to run on non gcc platforms. Again, the largest problem is
>probably the license.
>

The licence is the number one problem. Cleaning up the code so it runs
on non-gcc platforms would be good, but it's not essential since gcc is
so widely available.

Cheers,

Malcolm
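The five-step plan in the message above (an integer indexes.type, a declarations table, a per-language string -> declid mapping built at startup, and a join in ident) might be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not LXR's Perl code: the table and column names declarations, declid, declaration, indexes and type come from the message, while symname, line and all function names are assumptions made up for the example.

```python
import sqlite3


def open_db():
    """Create an in-memory database holding the two tables from the plan."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE declarations (
            declid      INTEGER PRIMARY KEY,
            declaration CHAR(255) NOT NULL UNIQUE
        );
        CREATE TABLE indexes (
            symname TEXT    NOT NULL,  -- hypothetical column name
            line    INTEGER NOT NULL,  -- hypothetical column name
            type    INTEGER NOT NULL   -- holds a declid, no longer a char
        );
    """)
    return db


def declid_for(db, declaration):
    """Return the declid for a type string, inserting the string if it is new."""
    row = db.execute(
        "SELECT declid FROM declarations WHERE declaration = ?", (declaration,)
    ).fetchone()
    if row is not None:
        return row[0]
    cur = db.execute(
        "INSERT INTO declarations (declaration) VALUES (?)", (declaration,)
    )
    return cur.lastrowid


def build_ctags_map(db, char_to_name):
    """One-time start-up cost for a language: ctags char -> declid."""
    return {char: declid_for(db, name) for char, name in char_to_name.items()}


def ident_rows(db, symname):
    """The lookup ident would do: join indexes.type to declarations.declid."""
    return db.execute(
        "SELECT i.symname, i.line, d.declaration"
        " FROM indexes i JOIN declarations d ON d.declid = i.type"
        " WHERE i.symname = ? ORDER BY i.line",
        (symname,),
    ).fetchall()
```

In this arrangement only the indexing side calls build_ctags_map; the lookup in ident_rows needs nothing from the language modules, which matches the note that ident does not have to know about Lang::*.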
From: <no...@so...> - 2001-11-14 15:42:14
|
Bugs item #426646, was opened at 2001-05-23 08:02
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=426646&group_id=27350

Category: Database interface
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Malcolm Box (mbox)
>Assigned to: Malcolm Box (mbox)
Summary: Mysql insert statements fragile

Initial Comment:
The insert statements in the mysql backend have an implied
dependence on the order of the fields in the tables. This
is likely to be fragile in the long run if the tables are
changed. The fix is to name the columns explicitly.

----------------------------------------------------------------------
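The fragility described in this report can be illustrated with a small sketch. It uses Python and an in-memory SQLite database instead of the MySQL backend, and the table name files and the columns filename and revision are assumptions chosen for the example, not necessarily LXR's schema.

```python
import sqlite3


def make_table(column_order):
    """Create a two-column table whose columns appear in the given order."""
    db = sqlite3.connect(":memory:")
    cols = ", ".join(f"{name} TEXT" for name in column_order)
    db.execute(f"CREATE TABLE files ({cols})")
    return db


def insert_positional(db, filename, revision):
    # Fragile: silently assumes the table was declared as (filename, revision).
    db.execute("INSERT INTO files VALUES (?, ?)", (filename, revision))


def insert_named(db, filename, revision):
    # Robust: each value is bound to a column by name, whatever the table order.
    db.execute(
        "INSERT INTO files (filename, revision) VALUES (?, ?)",
        (filename, revision),
    )


def read_back(db):
    """Return the stored rows as (filename, revision) pairs."""
    return db.execute("SELECT filename, revision FROM files").fetchall()
```

If the table is later declared with the columns in the other order, insert_positional stores the values swapped without raising an error, while insert_named keeps working.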
From: <no...@so...> - 2001-11-14 15:27:47
|
Bugs item #447979, was opened at 2001-08-04 11:22
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=447979&group_id=27350

Category: Browsing
Group: current cvs
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Malcolm Box (mbox)
Assigned to: Nobody/Anonymous (nobody)
Summary: Java import hyperlinking

Initial Comment:
Java import statement hyperlinking is not as good as it
could be, since it doesn't link to the file being imported.

----------------------------------------------------------------------

Comment By: Malcolm Box (mbox)
Date: 2001-10-31 01:05

Message:
Logged In: YES
user_id=215386

This also doesn't work properly for package statements at
the moment.

Ideally, a package statement would provide a fileref to the
specified directory (assuming it's in the tree).

import should provide a fileref to the relevant package, or
a direct link to the right class/interface.

To do this will involve tagging classes/interfaces as
belonging to packages so that we can search for them easily.

----------------------------------------------------------------------
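As a rough illustration of the behaviour asked for in the comment above, the sketch below turns a package statement into a directory path and an import statement into either a package directory or a class file. It is written in Python, not LXR's Perl, and the function name link_target is invented for the example. It deliberately ignores static imports and nested classes, and it does not check that the path exists in the tree, which a real implementation would have to do.

```python
import re

# "import a.b.C;" or "import a.b.*;" (static imports are intentionally not matched)
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+?)(\.\*)?\s*;")
# "package a.b.c;"
PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;")


def link_target(line):
    """Return a tree-relative path for a package or import statement, else None."""
    m = PACKAGE_RE.match(line)
    if m:
        # A package statement points at a directory.
        return m.group(1).replace(".", "/") + "/"
    m = IMPORT_RE.match(line)
    if m:
        path = m.group(1).replace(".", "/")
        # A wildcard import points at the package directory,
        # a single-type import at the class's source file.
        return path + "/" if m.group(2) else path + ".java"
    return None
```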
From: <no...@so...> - 2001-11-14 15:16:21
|
Bugs item #474752, was opened at 2001-10-24 22:56
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=474752&group_id=27350

Category: Browsing
Group: current cvs
>Status: Closed
>Resolution: Fixed
Priority: 4
Submitted By: Malcolm Box (mbox)
Assigned to: Malcolm Box (mbox)
Summary: Tabwidth of files should be configurable

Initial Comment:
Currently tabs are always 8 spaces wide on display (unless
there is an emacs tabwidth line in the file). This should
be made configurable, either on a per-language or per-site
basis.

----------------------------------------------------------------------

>Comment By: Malcolm Box (mbox)
Date: 2001-11-14 07:16

Message:
Logged In: YES
user_id=215386

Tabwidth is now configurable on a per-language basis from
the lxr.conf file

----------------------------------------------------------------------
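A per-language tab width with a fallback of 8, as described in this report, could look something like the sketch below. It is in Python for illustration only; the dictionary TABWIDTH_BY_LANGUAGE, its values and the function name are assumptions and do not reflect the actual lxr.conf syntax.

```python
DEFAULT_TABWIDTH = 8

# Hypothetical per-language settings; the real values would come from lxr.conf.
TABWIDTH_BY_LANGUAGE = {"Java": 4, "Perl": 4}


def expand_line(line, language):
    """Expand tabs to spaces using the language's tab width, or 8 by default."""
    width = TABWIDTH_BY_LANGUAGE.get(language, DEFAULT_TABWIDTH)
    # str.expandtabs pads to the next multiple of width, not a fixed count.
    return line.expandtabs(width)
```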
From: Robin T. <Rob...@te...> - 2001-11-14 15:05:27
|
Hi Malcolm,

Malcolm Box wrote:

> I agree, the current scheme is broken and needs to be replaced. It
> doesn't even work properly with ctags, since the meaning of the letters
> ctags outputs is not constant across languages. See the bug at
> http://sourceforge.net/tracker/index.php?func=detail&aid=476695&group_id=27350&atid=390117
> for an example of what's wrong.

I'm not using LXR with C code, but I noticed the mess when reading the
man page for ctags.

> Logically the mappings should be per-language, and ideally Common.pm
> would not depend directly on the installed languages - ie it would not
> hold the list of mappings. The problem is that the ident script doesn't
> know what language each of the returned identifiers is in to display the
> correct string.
>
> Probably the best solution is to create another database table that maps
> a numeric id to a string, and then have the language modules store the
> id number where they now store a character. Then each language module
> can contain the string <-> number mapping, and simply check on
> initialisation that the strings are in the db, adding them if not.

Isn't it too much hassle to put the strings in db? They can be in the
language module together with all the other language specific stuff.

In my little head it goes like this:
1) Each identifier has to know what language it is.
2) Language types are possibly ints allocated and defined somewhere in
each language module (or in generic.conf).
3) Each language module defines its own type to string mapping (as ints
instead of chars?).
4) ident looks for the relevant language mapping in the relevant module
to return the string.
5) The indexfile function in each module should insert the type number
and the language number.

This could also make identifiers local to their language, but how
should that be handled when
1) Searching from scratch
2) Displaying the identifier from a link from source (here we know the
language).

I think ident should take an optional language identifier from the URL.
This could be generated from source.

Did I miss out on anything here? I haven't been in every dimly lit
corner of the code..
And something in the far dark of my mind says that the changes probably
should be compatible with dbm support.

> Yes, that's fine to include I think. Is glimpse support working for you
> - I think it's actually broken against a recent version of glimpse?

I found a glimpse RPM version 4.12.5 from somewhere and it runs fine
with the path fix.

> I think it will be OK to add a C module to the distribution, provided it
> comes with some reasonable way to build it. My guess (correct me if I'm
> wrong) would be that the parser is pretty much vanilla C with no
> platform dependencies, so it should be easy to build. I would
> suggest creating a lib/LXR/Lang/VHDL subdir to keep the source and build
> system in. Then those that want VHDL support can build it, and those
> that don't can just comment out the config in lxr.conf that maps files
> to VHDL (and in fact won't ever see a problem unless they have files
> that look like VHDL).

I agree, but... The code skeleton has the following license (the files
are from '93):
 * This file is intended not to be used for commercial purposes
 * without permission of the University of Twente and permission
 * of the University of Dortmund

I'm going to contact the source to see if it can be GPL'ed or whatever.
If you know any other GPL'ed VHDL parser that's just the yacc skeleton
with grammar, I could hack that up instead.

The parser code btw, required quite a few hacks. It was in a very old
lex dialect and gave some trouble with both flex and gcc. It would
probably need some more cleaning to run on non gcc platforms. Again,
the largest problem is probably the license.

Regards,
Robin.

--
ASIC Design Engineer
Tellabs Denmark A/S
From: <no...@so...> - 2001-11-14 14:25:07
|
Bugs item #481573, was opened at 2001-11-13 20:05
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=481573&group_id=27350

Category: None
Group: None
Status: Open
Resolution: None
Priority: 9
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: requires non-free software for searching

Initial Comment:
On your front page you suggest Glimpse, which is horridly
non-free:
http://www.arco.de/~kj/harvest/glimpse-license-status

This program is GPL, so a user might assume that its
dependencies are GPL.

Please provide hooks for
Swish-E http://swish-e.org/
or
Swish++ http://homepage.mac.com/pauljlucas/software/swish/
which are GPL replacements.

If I am in error and this feature already exists in lxr,
please change the content of http://lxr.linux.no/ to
suggest Swish-E or Swish++.

I might note that I caught this as a part of our
department's implementation of Bugzilla (bonsai uses lxr)
which is a mission-critical application for development
here, and now that we know that Glimpse has *never* been in
the public domain (the source we found had a misleading
copyright file) we need to find a replacement or pay up
before the audit rolls around.

----------------------------------------------------------------------

Comment By: Rusty Carruth (rustyc)
Date: 2001-11-14 06:25

Message:
Logged In: YES
user_id=215914

There *is* a version of glimpse (I think around 4.0) which
does not require a fee. This is the version we are using.

A temporary workaround would be to grab that version and use
it while 'someone' is fixing lxr to use the free alternative.

(We found that version of glimpse by looking around on the
'net till we found it.)

rc

----------------------------------------------------------------------

Comment By: Malcolm Box (mbox)
Date: 2001-11-14 00:06

Message:
Logged In: YES
user_id=215386

I agree, depending on non-free software for a tool like LXR
is not a good thing. Moving over to one of the free
alternatives is on the to-do list.

If you wanted to go ahead and provide the hooks for a
replacement, I'd be very happy to incorporate the patch in
the LXR.

----------------------------------------------------------------------
From: Malcolm B. <ma...@br...> - 2001-11-14 12:55:47
|
Hi Robin,

Robin Theander wrote:

> I'm hacking VHDL support into lxr and have come across a few things I'd like to
> comment on. I have it fully working now but based on the 0.8 release (I'm
> firewalled so CVS access is N/A). I'm relying on an external parser (lex & yacc
> based skeleton I ripped from the net) because VHDL is a pain to parse. The main
> changes are kept in LXR::Lang::VHDL.

Cool stuff! As an ex-VHDL hacker myself, I know what you mean about it
being a pain to parse. It's great to see a totally new language
becoming supported.

> A few other changes are necessary though.
>
> 1) The %type_names in Common.pm are hashed from chars. With VHDL I have about
> 20 new types and letter allocation is getting ugly. Is there another way of
> doing this? I could think of using numbers and an array instead. The db
> overhead is minimal and the type is referenced in few places.
> I could produce a patch but I'm not dealing with C or C++ files, so I cannot
> offer to fully test the (e)ctags implementation.
> I'm also thinking about making this language dependent. Something like
> major.minor numbers in UNIX devices. Major is the language and minor is the
> local specific set.

I agree, the current scheme is broken and needs to be replaced. It
doesn't even work properly with ctags, since the meaning of the letters
ctags outputs is not constant across languages. See the bug at
http://sourceforge.net/tracker/index.php?func=detail&aid=476695&group_id=27350&atid=390117
for an example of what's wrong.

Logically the mappings should be per-language, and ideally Common.pm
would not depend directly on the installed languages - ie it would not
hold the list of mappings. The problem is that the ident script doesn't
know what language each of the returned identifiers is in to display
the correct string.

Probably the best solution is to create another database table that
maps a numeric id to a string, and then have the language modules store
the id number where they now store a character. Then each language
module can contain the string <-> number mapping, and simply check on
initialisation that the strings are in the db, adding them if not.

> 2) The find and search is using $config->sourceroot to remove leading path so
> it fits to source. However, glimpse has a nasty way of expanding symlinks so
> the glimpse path and the sourceroot is not the same. I have something like this
> in mind (hand edited diff against 0.8 ;-):
>
> --- ../../lxrsrc/lxr/find Tue Oct 16 22:38:37 2001
> +++ find Wed Oct 31 17:17:21 2001
> @@ -58,11 +58,11 @@
>  return;
>  }
>  print("<hr>\n");
> + $glimpseroot = $config->glimpseroot;
> - $sourceroot = $config->sourceroot;
>  while($file = <FILELLISTING>) {
> + $file =~ s/^$glimpseroot//;
> - $file =~ s/^$sourceroot//;
>  if($file =~ /$searchtext/) {
>  print(&fileref("$file", "find-file", "/$file"),"<br>\n");
>
> The same applies for search... Is that acceptable to include?

Yes, that's fine to include I think. Is glimpse support working for you
- I think it's actually broken against a recent version of glimpse?

> 3) Just wondering. How could we go about dealing with an external C based
> parser. Including it in the project would increase the noise (and portability
> problems). Rewriting it into perl would be nice but quite a pain because VHDL
> is context sensitive and generally stateful (and I like lex and yacc for doing
> this). Right now it might make sense to keep VHDL out from the releases and
> offer a language addon.

I think it will be OK to add a C module to the distribution, provided
it comes with some reasonable way to build it. My guess (correct me if
I'm wrong) would be that the parser is pretty much vanilla C with no
platform dependencies, so it should be easy to build. I would suggest
creating a lib/LXR/Lang/VHDL subdir to keep the source and build system
in. Then those that want VHDL support can build it, and those that
don't can just comment out the config in lxr.conf that maps files to
VHDL (and in fact won't ever see a problem unless they have files that
look like VHDL).

> BTW, thanks for all this great work in lxr.

Glad you like it. Thanks for all the work you've been doing adding to
LXR - without contributions like yours this project wouldn't be half as
advanced.

Cheers,

Malcolm
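One detail of the path fix quoted above may be worth a sketch: interpolating a directory name into a Perl pattern, as in s/^$glimpseroot//, treats any regex metacharacters in the path as pattern syntax. The Python function below shows the same prefix stripping done as a literal string comparison instead. The function name strip_root and the example paths are made up for illustration, and returning the path without a leading slash is a choice made here, not what the find script does.

```python
def strip_root(path, root):
    """Remove a literal leading root directory from path.

    The comparison is on plain strings, so characters such as '+' or '.'
    in the root carry no special meaning. Paths outside the root are
    returned unchanged.
    """
    root = root.rstrip("/")
    if path == root:
        return ""
    if path.startswith(root + "/"):
        return path[len(root) + 1:]
    return path
```

Comparing against root plus a slash also avoids stripping a sibling directory that merely shares the prefix, for example /data/src/lxr2 when the root is /data/src/lxr.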
From: <no...@so...> - 2001-11-14 08:06:24
|
Bugs item #481573, was opened at 2001-11-13 20:05
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=390117&aid=481573&group_id=27350

Category: None
Group: None
Status: Open
Resolution: None
>Priority: 9
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: requires non-free software for searching

Initial Comment:
On your front page you suggest Glimpse, which is horridly
non-free:
http://www.arco.de/~kj/harvest/glimpse-license-status

This program is GPL, so a user might assume that its
dependencies are GPL.

Please provide hooks for
Swish-E http://swish-e.org/
or
Swish++ http://homepage.mac.com/pauljlucas/software/swish/
which are GPL replacements.

If I am in error and this feature already exists in lxr,
please change the content of http://lxr.linux.no/ to
suggest Swish-E or Swish++.

I might note that I caught this as a part of our
department's implementation of Bugzilla (bonsai uses lxr)
which is a mission-critical application for development
here, and now that we know that Glimpse has *never* been in
the public domain (the source we found had a misleading
copyright file) we need to find a replacement or pay up
before the audit rolls around.

----------------------------------------------------------------------

>Comment By: Malcolm Box (mbox)
Date: 2001-11-14 00:06

Message:
Logged In: YES
user_id=215386

I agree, depending on non-free software for a tool like LXR
is not a good thing. Moving over to one of the free
alternatives is on the to-do list.

If you wanted to go ahead and provide the hooks for a
replacement, I'd be very happy to incorporate the patch in
the LXR.

----------------------------------------------------------------------