Re: [Lxr-dev] Status of this project?
From: Paul S. <ps...@ne...> - 2007-03-22 20:13:09
On Thu, 2007-03-22 at 20:30 +0100, Jan-Benedict Glaw wrote:

> What may be different is the way to get the newly auto-generated
> number back again.

Yes, this is what I meant.

> But since the INSERT pathes aren't time-critical, we'd just allow to
> SELECT for the value instead of playing tricks to get it.

Hm. Interesting idea. I'm not so sure they aren't time-critical. They
don't impact the web user of course, but genxref does take a while to
index stuff already :).

I had two other performance-related ideas:

Genxref performance improvement:

Especially when adding a new release, genxref can be very slow. I think
it's because of all the indexing that goes on: the DB updates its
indexes after every insert. A common method of adding a lot of data to
an SQL DB is to put the commands into a file, then load the file all at
once with indexing disabled, then re-index everything at the end. I
know both MySQL and Postgres support this model (most likely in
different ways), although I've not investigated it thoroughly.

So, my idea was to change genxref to do this: instead of adding things
to the DB one at a time, it would write the statements out to a file
and, at the end, import that file with indexing disabled.

Web performance improvement:

I'm sure someone else has already thought of this, but right now we
generate our HTML dynamically every time. It seems to me that this is a
prime candidate for caching! Especially for backends that support
annotate/blame etc. The annotations on a file won't change unless the
contents of the file change, for the most part (the other possibility
is that the symbols in the database change--as far as I can tell this
shouldn't happen normally, but if people are worried we can have
genxref flush the cache).

We have a unique file id already, so we can cache the content using the
file id, with a typical directory fan-out to avoid any single directory
being too large. We can compare the creation time of the cached file
vs. the source file to tell when it's out of date. We can use the
access time to clean out old, unused cache entries if we want.

Also, we can just cache the actual file content and leave off the
header information; the header info can be added dynamically when the
user browses it. That way the same cached copy of a file can be used
for different releases, if they all share that fileid.

> Heck, this f*ing column name caused so much grief, lets just rename
> it! Yes, that's somewhat painful and we need a Big Fat Warning in the
> v1.0 docs that the column needs to be renamed, but I'm all for doing
> that.

I'd be happy with that. It's not actually such a big deal to fix this;
you just need a series of ALTER TABLE operations. It's easy enough to
add to the readme, or even to write an update script.

> > Actually there is an "ANSI_QUOTES" mode that we could set, that lets you
> > (among other things) use standard "-quoting in MySQL. That might be a
> > valid thing to do.
>
> Is this in the DBI backend or configury on the MySQL server side?

It can be set globally or per-session. We'd use per-session, obviously.
It's set from the client side.

I also discovered that there's a DBI method that quotes identifiers
like this for you:

    my $release = $dbh->quote_identifier('release');

That would be the safest way to go, although it's annoyingly verbose.

And, there's a DBI get_info() method that lets you ask about all kinds
of features of the server, and one of those is the quoting character,
so we could get that and use it instead of hard-coded quotes.

But, changing the name sounds good to me! :)

-- 
-----------------------------------------------------------------------------
 Paul D. Smith <ps...@ne...>                 http://netezza.com
 "Please remain calm--I may be mad, but I am a professional."--Mad Scientist
-----------------------------------------------------------------------------
   These are my opinions--Netezza takes no responsibility for them.
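
For illustration, the cache lookup described above (file id, directory
fan-out, timestamp check) could look roughly like the following. This is
only a sketch under assumptions: the cache root, the two-level layout,
the ".html" suffix and both function names are hypothetical and not part
of LXR, and it compares modification times, since most Unix filesystems
do not expose a true creation time.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical cache root; not an existing LXR setting.
my $CACHE_ROOT = '/var/cache/lxr';

# Map a numeric file id to a cache path, fanning out over two directory
# levels so that no single directory grows too large.
# Example: id 123456 -> <root>/56/34/123456.html
sub cache_path {
    my ($root, $fileid) = @_;
    my $padded = sprintf('%04d', $fileid);   # make sure there are 4+ digits
    my $level1 = substr($padded, -2);        # last two digits
    my $level2 = substr($padded, -4, 2);     # the two digits before those
    return File::Spec->catfile($root, $level1, $level2, "$fileid.html");
}

# A cache entry is stale if it is missing, if the source file cannot be
# examined, or if the source was modified after the entry was written.
sub cache_is_stale {
    my ($cache_file, $source_file) = @_;
    my $cache_mtime = (stat $cache_file)[9];
    return 1 unless defined $cache_mtime;
    my $source_mtime = (stat $source_file)[9];
    return 1 unless defined $source_mtime;
    return $source_mtime > $cache_mtime ? 1 : 0;
}
```

The page generator would call cache_path($CACHE_ROOT, $fileid), serve
that file when cache_is_stale() returns 0, and otherwise regenerate the
body and rewrite the entry. Because only the file body is cached, the
per-release header would still be added at request time.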