From: Kevin & M. O. <kev...@th...> - 2005-01-26 00:57:44
Hi Vaclav,

On Jan 25, 2005, at 3:32 PM, Vaclav Slavik wrote:

[snip]

>> Can we make this feature an option for the moment, at least until
>> Documancer's parser can handle more special cases? For example,
>> sometimes EClasses have redirect pages, and those would stop the
>> indexer. (As to why they do, it's a bit of a long story. ;-)
>> Anyways, it's better for EClass books to use an outdated index than
>> recreate the index by only indexing the redirect page and nothing
>> else.
>
> I don't understand what's the problem: as long as the index is
> generated from the files (i.e. is newer than them), it's considered
> up-to-date and not regenerated, so it shouldn't be a problem when
> running from a CD, should it?

Actually, I guess it depends on how sophisticated the detection of an
outdated index is. As long as it doesn't try to reindex the
documentation when it shouldn't, I should be fine. (BTW, I did
eventually want to give people the option to copy the CD to their hard
drive for faster/easier access.)

> As for the "for the moment" part, that's already done because it's not
> implemented yet ;-)
>
>> Also, I should mention a couple features I'd like to add. One is
>> support for searching metadata, and probably also generating
>> metadata indexes. I'd like people to be able to look for documents
>> produced by the United Nations, or publications from 1986, for
>> example. Or browse an alphabetical list of all organizations that
>> have documents in the book. EClass lets users specify this sort of
>> metadata, and I think it would be good to allow Documancer users to
>> perform more focused searches. For Documancer-generated indexes, we
>> can probably use DublinCore metadata to get this metadata from HTML
>> documents. I'm not sure if info or man pages specify this info in a
>> standard way.
>
> I don't understand how is this useful when searching in single
> document (which is what Documancer does; as opposed to searching in
> multiple documents). Documancer's search returns pages rather than
> documents and it IMHO doesn't make sense to attach any metadata to
> individual pages, let alone search for them. Am I missing something?

I guess in this context I'm not sure what exactly a document is. I
know pages are the individual 'files' of a book, and a book is a page
or a collection of pages, but I'm not sure where a 'document' falls
between those two. A book can have pages written by different authors,
for example, can't it? (The wxWidgets docs do, for example, though that
information isn't stored in the pages themselves.)

As for its usefulness, this mostly falls under the 'my boss and others
are asking for this' reasoning, although I do see how it is useful for
them. ;-) We're an academic institution, so it wouldn't be uncommon for
our people to want to query Documancer saying "I want all UN
publications on the subject of refugees", for example, if they are
looking for the UN's position on said topic. And since we work in many
developing countries, one thing we try to do is assemble lots of public
domain publications onto CD-ROM and distribute them at universities or
public labs in those countries. So it is not uncommon for our 'books'
to have 'pages' written by many, many different people/organizations.
(And for these people, Googling and pulling up the results is very
unreliable and/or costly.)

>> Another feature - what I'd call "bookshelves". Right now,
>
> I believe it's already in TODO ;)

Hmmm... I don't see it. ;-/ Am I missing something, or is it in the
TODO list that's sitting on top of your shoulders? :-)

Thanks,

Kevin

>> with books!) How I foresee things is that you select your
>> "Bookshelf" from the drop-down list on the main page, then if you
>> click on the "Browse" tab, you get a list of the books in that
>> bookshelf inside the tree view. Eventually, with books like EClass
>> books, you can also expand their contents in the browse tree view.
>> If Documancer can't determine the contents, it will just show the
>> root page of the book.
>
> I like this UI to the bookshelves...
>
> Regards,
> Vaclav
>
> --
> PGP key: 0x465264C9, available from http://wwwkeys.pgp.net/
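
For reference, the up-to-date rule discussed in this message (an index counts as current as long as it is newer than the files it was generated from) could be sketched roughly as below. This is only an illustration under stated assumptions, not Documancer's actual code: the function name and the idea of passing explicit paths are hypothetical, and treating equal timestamps as "up to date" is a choice made here, not something the thread specifies.

```python
from pathlib import Path


def index_is_up_to_date(index_path, source_paths):
    """Return True if the index exists and no source file is newer than it.

    Hypothetical helper: compares file modification times only.
    """
    index = Path(index_path)
    if not index.exists():
        # No index yet, so it has to be generated.
        return False
    index_mtime = index.stat().st_mtime
    # Equal timestamps are treated as up to date (an assumption).
    return all(Path(p).stat().st_mtime <= index_mtime for p in source_paths)
```

A check like this compares timestamps only, so copying a book from CD to a hard drive would not trigger reindexing by itself, provided the copy preserves modification times or copies the index after the pages.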