From: John J. L. <jj...@po...> - 2002-03-03 20:04:05
On Sun, 3 Mar 2002, Frederic Gobry wrote:
[...]
> 1. Storage
>
> Pyblio will have its native storage format. This format will have to

Good plan.

> support the storage of entries taken from another format, with no
> loss. This format will also support all the extra information that
[...]
> I can imagine something like:
>
> <pybliodb>
[...]
> </pybliodb>
>
>
> The rationale behind the duplication of information in the <original/>
> tag is to make it possible to provide the original form of an entry as
> long as it has not been modified in pyblio, while at the same time to
> provide a consolidated view of the data (parsing of names, dates,...)
> for the rest of the application.

I like the idea of being able to directly edit BibTeX, while keeping the
ability to convert between N formats without specifying the conversion
between every pair of formats (impossible task -- actually, how does pyb.
do this ATM??).

However, how useful is it to provide the original BibTeX format *only* if
you haven't used the GUI on a record yet? That introduces a bit of state
you have to remember, something that we're told by UI researchers is a Bad
Thing... OTOH, I guess it only messes up your formatting, not the content
-- or does it affect capitalisation, TeX macros, etc? (I know you
shouldn't use TeX macros in BibTeX, but I find it necessary, so I can't
consistently tell other people not to.)

OTOH, I don't think using BibTeX as the underlying format would solve
this, since you may well find that you need finer divisions of meaning
than are provided by BibTeX in order to cover all other formats without
losing information -- and if this is true, the BibTeX record is no longer
easy to edit by hand. Using BibTeX would also introduce complications due
to baggage you have to carry over from TeX, etc.

On balance, you may have the best solution.

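To make the trade-off concrete, here is a minimal Python sketch of the <original/> idea as I read it: a record keeps the verbatim source text next to the parsed fields, and drops the verbatim copy as soon as a field really changes. Every name in it (Entry, set_field, export, render_bibtex) is an assumption made up for illustration, not pyblio's actual API, and the writer is a toy that does no escaping.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entry:
    """Hypothetical record: a parsed view plus the untouched source text."""
    key: str
    entry_type: str
    fields: Dict[str, str] = field(default_factory=dict)
    original: Optional[str] = None         # verbatim text as imported
    original_format: Optional[str] = None  # e.g. "bibtex"

    def set_field(self, name: str, value: str) -> None:
        """Edit the parsed view; a real change discards the verbatim copy."""
        if self.fields.get(name) == value:
            return  # nothing changed, so the original text is still valid
        self.fields[name] = value
        self.original = None
        self.original_format = None

    def export(self, fmt: str) -> str:
        """Return the verbatim text if still valid, else regenerate it."""
        if self.original is not None and self.original_format == fmt:
            return self.original
        if fmt == "bibtex":
            return render_bibtex(self)
        raise ValueError(f"no writer for format {fmt!r}")


def render_bibtex(entry: Entry) -> str:
    """Toy BibTeX writer: one field per line, sorted by name, no escaping."""
    lines = [f"@{entry.entry_type}{{{entry.key},"]
    for name in sorted(entry.fields):
        lines.append(f"  {name} = {{{entry.fields[name]}}},")
    lines.append("}")
    return "\n".join(lines)
```

With this shape, each format needs only a reader into Entry and a writer out of it, rather than a converter for every pair of formats. The cost is the hidden state discussed above: the hand-written layout survives only until the first real edit, after which export() returns the regenerated form.
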
> I also don't think it is a good idea to create tags for specific parts
> of a description (for instance, a <title> or <author> tag), as it makes
> it cumbersome to customize a database with specific fields. I prefer
> having some base types (a person, a date, a date range,...) and
> possibly a description of what is correct and what is not (a journal
> entry must have a journal name, an author, a title,...). BTW, do you
> think such a description should be placed in the database itself ?
> This is good for file exchange, but it is maybe a bit overkill for
> everyday use ?

Isn't this what DTDs are for?? And if this is a standard format, surely
there is only a need for one DTD? I know only the bare minimum of XML, but
I know you stick the DTD name and a URL at the top of your XML document if
you're doing validation, so all this is already taken care of, no?

Anyway, having miserably failed to understand how Z39.50 works so far, I
don't feel qualified to answer this kind of stuff (despite the fact that
you're trying to keep it simple). I'm sure that if I really understood the
motivations for all those bits and pieces (attribute sets, abstract record
syntaxes, variant sets, schemas...), I'd have an understanding of which
things are important in something simpler, like pyb., and which things
aren't.

> 3. Internal manipulation
>
> Once parsed, any format must fit a single representation (as compared to
> now, where every format could behave a bit differently), which is close
> to the native format. To conciliate the needs of people that manipulate
> small databases and people that have large entries shared by many
> users, it might be of interest to use a relational database as the
> actual processing backend: a lightweight temporary database like gadfly
> for people who don't care, and the ability to plug the system to
> PostgreSQL for instance, on larger configurations.

Sounds like you have your work cut out, Frederic.
(for non-English speakers: idiomatically 'have your work cut out' == 'have
a lot of work to do')
[...]
> 4. Front-ends
>
>
> The text-based interface has been left behind during the development of
> the GUI. Maybe it's time to see if a correct abstraction could be
> written so that multiple front-ends can be developed with a minimum of
> rewriting. The minimum should be a curses and a Gnome front-end, to
> extend according to the people interested in the development.

I'm not certain this isn't more complexity than is necessary, but if
you're determined to do it, how about modelling it on what Sketch does? I
confess I haven't looked at how he does this, but some of the other stuff
from Sketch looks really nicely done. I know you've already pinched some
stuff from there...

> 5. Filters / Web queries
>
> I need some feedback on how to make the development of filters and
> external query mechanisms easier. There is certainly a lot of redundancy
> to remove, but I haven't looked at it yet.

I have some detailed-ish plans for a simple web query system modelled
somewhat after Perl's WWW::Search (with improvements; I don't want to
reinvent the wheel *only* to have it in Python rather than Perl). I won't
start work on it for a couple of months yet, but it will happen (unless
somebody beats me to it :).

> 6. Formatting
[...]
> - easy creation of new formats for specific journals for instance
> - connection with word processors
>
> Here again, I need some feedback from people who have a good experience
> with commercial software in the area, so that we can find out what must
> be done.

If you have access to a Windows box, you can download free trials of most
of them.


John