[documancer-devel] Re: new Documancer caching code checked in
From: Kevin & M. O. <kev...@th...> - 2005-01-24 22:11:39
Hi Vaclav,

On Jan 23, 2005, at 1:56 PM, Vaclav Slavik wrote:

> Hi,
>
> I checked in the long-promised generic caching code, any testing would
> be most welcome. Aside from the big internal changes in how the
> updates are carried out, there are some UI improvements as well:
>
>  * Updating book A's index no longer disables searching in book B.
>  * You can no longer attempt to search in a book without index.
>    Previously, this would result in the UI blocking and waiting
>    until the index is regenerated. Now, Documancer will warn you
>    that the index is missing (see attached noindex.png) and will
>    disable the search box -- it will be enabled again when
>    indexing is done (remember: indexing happens in background,
>    it's no longer triggered by user action!).
>  * If a book has index, you can search it even if the index is
>    out of date. The index is generated in temporary directory
>    and replaces old index when it's fully created, so as long as
>    there is at least _some_ index, you can do (potentially inexact)
>    searches (see attached oldindex.png).
>  * Indexing of all books (i.e. even those not currently opened) is
>    done in background when you're reading docs, meaning that
>    there's a good chance that Documancer will already be done
>    generating the indexes when you'll need them.

+4 :-)

> Still missing:
>
>  * Prioritization of the updater -- when you select a book in
>    Documancer, the updater should stop doing whatever it is it's
>    working on and update items owned by currently selected book
>    as soon as possible.

+1

>  * Autodetection of outdated books -- Documancer should check the
>    HTML/info/man/whatever files and regenerate the index if they
>    changed; currently, you have to do it manually.

Can we make this feature an option for the moment, at least until
Documancer's parser can handle more special cases? For example,
sometimes EClasses have redirect pages, and those would stop the
indexer. (As to why they do, it's a bit of a long story. ;-) Anyways,
it's better for EClass books to use an outdated index than to recreate
the index by only indexing the redirect page and nothing else.

Also, I should mention a couple of features I'd like to add. One is
support for searching metadata, and probably also generating metadata
indexes. I'd like people to be able to look for documents produced by
the United Nations, or publications from 1986, for example. Or browse
an alphabetical list of all organizations that have documents in the
book. EClass lets users specify this sort of metadata, and I think it
would be good to allow Documancer users to perform more focused
searches. For Documancer-generated indexes, we can probably use Dublin
Core metadata to get this information from HTML documents. I'm not sure
if info or man pages specify this info in a standard way.

Another feature - what I'd call "bookshelves". Right now, Documancer
presents you simply with a list of books, but what I'd like to see are
books organized by categories. Like, a "Programming" bookshelf, with
books related to programming only. Of course, there will be an "All"
bookshelf, the default, that will show every book. The purpose of this
is of course to keep the books list from getting too crowded as time
goes on. (And we intend to use it for distributing content in fields
like Economics, Agriculture, etc. so you can see how users could get
bogged down with books!)

How I foresee things is that you select your "Bookshelf" from the
drop-down list on the main page, then if you click on the "Browse" tab,
you get a list of the books in that bookshelf inside the tree view.
Eventually, with books like EClass books, you can also expand their
contents in the browse tree view. If Documancer can't determine the
contents, it will just show the root page of the book.

In fact, we can use the "content package" XML file format to manage
bookshelves too, so we get bookshelf support and EClass TOC support in
one shot.
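
To make the bookshelf idea a bit more concrete, here is a minimal sketch of
the grouping, with an "All" shelf that always lists every book. The
`Bookshelves` class, its method names, and the example book titles are all
made up for illustration; none of this is existing Documancer or EClass
code, and a real version would presumably load the shelves from the
"content package" XML file rather than from Python calls.

```python
from collections import defaultdict

ALL_SHELF = "All"  # default shelf that always lists every book


class Bookshelves:
    """Hypothetical in-memory grouping of book titles by category."""

    def __init__(self):
        self._shelves = defaultdict(list)  # shelf name -> book titles
        self._all = []                     # every known book, in insertion order

    def add_book(self, title, shelves=()):
        """Register a book and file it under zero or more shelves."""
        if title not in self._all:
            self._all.append(title)
        for shelf in shelves:
            if shelf == ALL_SHELF:
                continue  # "All" is implicit, never stored as a real shelf
            if title not in self._shelves[shelf]:
                self._shelves[shelf].append(title)

    def shelf_names(self):
        """Names for the drop-down list, with "All" first."""
        return [ALL_SHELF] + sorted(self._shelves)

    def books_on(self, shelf=ALL_SHELF):
        """Books to show in the browse tree for the selected shelf."""
        if shelf == ALL_SHELF:
            return list(self._all)
        return list(self._shelves.get(shelf, []))


# Placeholder titles, only to show the calls.
shelves = Bookshelves()
shelves.add_book("wxPython Manual", ["Programming"])
shelves.add_book("Crop Rotation Basics", ["Agriculture"])
print(shelves.shelf_names())
print(shelves.books_on("Programming"))
```

Selecting a shelf in the drop-down would then just mean calling
`books_on()` with that name and filling the browse tree from the result.
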
Basically, I just need to clean up my conman module and make it into a
portable Python module. I should also do that for wxbrowser.py too, so
that we can start using it with Documancer as well. I think I've pretty
much patched up wxWebKitCtrl, so I'll be using that for the Mac version
of Documancer.

But all of this doesn't have to be in the next release. I'd like to see
a 0.2.4 soon, probably after the caching code is tweaked and I've
committed the indexer code.

Anyways, I think that's all for now. ;-)

Thanks,

Kevin

> Regards,
> Vaclav
>
> --
> PGP key: 0x465264C9, available from http://wwwkeys.pgp.net/
> <noindex.png><oldindex.png>
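
For reference, the "generate the index in a temporary directory, then
replace the old one" behaviour described in the quoted message could look
roughly like the sketch below. This is only an illustration of the general
technique, not the code that was checked in: `rebuild_index` and the
`build_index` callback are hypothetical names, and the real updater runs in
the background, which is left out here.

```python
import os
import shutil
import tempfile


def rebuild_index(index_dir, build_index):
    """Build a fresh index next to the old one, then swap it into place.

    `build_index` is a hypothetical callable that writes index files into
    the directory it is given. The old index stays usable until the new
    one has been fully written.
    """
    index_dir = os.path.abspath(index_dir)
    parent = os.path.dirname(index_dir)

    # Create the temporary directory on the same filesystem as the target,
    # so that the final rename does not have to copy data.
    tmp_dir = tempfile.mkdtemp(prefix="index-", dir=parent)
    try:
        build_index(tmp_dir)
    except Exception:
        # A failed build leaves the existing (possibly outdated) index alone.
        shutil.rmtree(tmp_dir, ignore_errors=True)
        raise

    old_dir = None
    if os.path.isdir(index_dir):
        # Move the old index aside first. There is a short window between
        # the two renames during which no index exists at index_dir.
        old_dir = index_dir + ".old"
        shutil.rmtree(old_dir, ignore_errors=True)
        os.rename(index_dir, old_dir)
    os.rename(tmp_dir, index_dir)
    if old_dir is not None:
        shutil.rmtree(old_dir, ignore_errors=True)
```

If the build fails partway, searches simply keep using whatever index was
already there, which matches the "at least _some_ index" behaviour above.
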