From: Chris D. <ce...@ui...> - 2007-11-21 15:43:54
On Wed, Nov 21, 2007 at 10:32:02AM -0500, Andrew Nagy wrote:
> So then let's revisit the problem - how many of you have experienced
> problems with storing the local xml file in the filesystem?
>
> I would imagine that a large institution - such as Chris' Illinois
> System - might run into filesystem limitations. Chris - anything you
> can speak to?

Not only are there potential filesystem issues, but storing the files
locally also severely limits our flexibility in hardware configuration,
e.g., we may end up splitting VuFind off into a web tier farm and this
will cause headaches, I'm sure.

In any case, we would like to see some sort of new approach (storing the
data in either one or multiple fields in SOLR or in mysql) in VuFind
main development. If this doesn't happen, we will most likely implement
our own solution anyway out of pure necessity :-).

Chris

>
> Thanks
> Andrew
>
> > -----Original Message-----
> > From: vuf...@li... [mailto:vufind-
> > gen...@li...] On Behalf Of Wayne Graham
> > Sent: Tuesday, November 20, 2007 9:47 PM
> > To: Mark Triggs
> > Cc: vuf...@li...
> > Subject: Re: [VuFind-General] Minor VuFind refactoring
> >
> > A while back I did some experimentation with writing the marcXML file
> > into the index. One might argue that placing an unindexed stored field
> > is a waste of space for an indexing engine...or worse, that it's a
> > mistreatment of what an index is designed for. It does, however,
> > provide a convenient method to encapsulate the entire entity in a
> > convenient place with easy calls. (DB CLOBs are essentially doing the
> > same thing to create their indexes). Some of the pluses of storing the
> > XML in the index were a space savings (mostly because I stored the
> > XML as a single string) and not having any directories that you
> > can't list the contents of without going to get some coffee.
> > I would hesitate to index the entirety of the Marc XML file (or raw
> > marc for that matter) since a lot of what's in there are codes that
> > make sense to catalogers and not a whole lot of others (and I always
> > get in trouble for saying that with catalogers). Most of this stems
> > from the fact that the majority of the algorithms being used to index
> > the information assume that the data is unstructured (like a book or
> > a web page), so that they have to create categories, mispeling
> > indexes, etc., rather than highly controlled metadata vetted by
> > several professionals.
> >
> > I'm very much in favor of including the information from the marc file
> > in an unindexed field, but I don't really think the current
> > implementation is so bad...it may even be what's best for this type of
> > implementation. Basically we need a fast way to grab the display
> > information. If adding the information to the index in any way
> > inhibits the speed at which information is returned, I would argue to
> > just keep things the way they are (and I hadn't gotten far enough to
> > do the load testing for the indexes with the marc in the index). It
> > may turn out that storing some of this in a database (which brings its
> > own indexing elements) may also improve the performance of VuFind.
> >
> > This being said, we're off the rest of the week. When we get back,
> > I've got a little more work to get done with another project, but hope
> > to be able to get back to some of the other things in VuFind after
> > that. I think we need to do some testing to see what happens. I
> > have some hunches, but I've learned that sometimes the way things work
> > in my mind and how they actually work are quite different.
> >
> > Wayne
> >
> > On Nov 20, 2007 5:55 PM, Mark Triggs <mt...@nl...> wrote:
> > > It's funny that both Steve and I independently make reference to the
> > > same record display issue...
> > > you can see how much this has traumatised us ;o)
> > >
> > > Mark
> > >
> > >
> > > Steven McPhillips <smc...@nl...> writes:
> > >
> > > > I suppose that the single field approach could work as long as the
> > > > application provided decent interfaces into that data. I think that
> > > > something more flexible / powerful than xslt is required here: some
> > > > of our xsl templates are rather hideous. Granted this is probably
> > > > due to the dirtiness of our marc records, but I imagine we wouldn't
> > > > be alone on this issue (maybe? I think we have subjects hiding in 7
> > > > fields, under a multitude of subfields thereafter...)
> > >
> > > --
> > > Mark Triggs
> > > <mt...@nl...>
> > >
> > > -------------------------------------------------------------------------
> > > This SF.net email is sponsored by: Microsoft
> > > Defy all challenges. Microsoft(R) Visual Studio 2005.
> > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > > _______________________________________________
> > > VuFind-General mailing list
> > > VuF...@li...
> > > https://lists.sourceforge.net/lists/listinfo/vufind-general
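
For readers following the thread, the "single field" idea discussed above (keeping each record's MARCXML as one stored string, keyed by record id, instead of one file per record on disk) might look roughly like the sketch below. This is only an illustration under stated assumptions and is not VuFind code: Python's built-in sqlite3 stands in for MySQL, and the table name `marc_record`, its columns, and the helper functions are all hypothetical. The only external detail relied on is the standard MARCXML namespace.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC 21 slim") namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def open_store(path=":memory:"):
    """Open the store (hypothetical schema: one row per record)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS marc_record ("
        "id TEXT PRIMARY KEY, marcxml TEXT NOT NULL)"
    )
    return conn


def save_record(conn, record_id, marcxml):
    """Store the whole MARCXML document as a single string."""
    conn.execute(
        "INSERT OR REPLACE INTO marc_record (id, marcxml) VALUES (?, ?)",
        (record_id, marcxml),
    )
    conn.commit()


def load_title(conn, record_id):
    """Fetch one record by id and pull 245$a out of it for display."""
    row = conn.execute(
        "SELECT marcxml FROM marc_record WHERE id = ?", (record_id,)
    ).fetchone()
    if row is None:
        return None
    record = ET.fromstring(row[0])
    node = record.find(
        "marc:datafield[@tag='245']/marc:subfield[@code='a']", MARC_NS
    )
    return node.text if node is not None else None


# A made-up record, used only to exercise the helpers.
SAMPLE = (
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">An example title</subfield>'
    "</datafield></record>"
)

if __name__ == "__main__":
    conn = open_store()
    save_record(conn, "rec1", SAMPLE)
    print(load_title(conn, "rec1"))
```

The lookup is by primary key only, which matches the point made in the thread that the display path mainly needs a fast way to grab one record. Whether this is actually faster than the filesystem or a stored SOLR field would still need the load testing Wayne mentions.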