From: <lb...@us...> - 2010-02-25 07:09:54
|
We at The Hong Kong University of Science and Technology (HKUST) Library are evaluating systems to host a repository of faculty publications metadata. This database will become a component of a university-wide knowledge harvesting and transfer project, and it will be used by various academic, research, and administrative units for various purposes.

One of the candidate systems we have in mind to host the repository is VuFind. Although the major implementations of VuFind host next-generation library catalogs, we would like to find out whether it is appropriate for this repository.

We already have an experimental setup of this repository, which runs on DSpace, but we would like a platform that has:
- more "next-generation" features (e.g. facets, mashups, social networking)
- an author profile data model (e.g. like Thomson Reuters' ResearcherID) that allows selected publications metadata in the repository to be pulled by author
- on-the-fly retrieval of citation counts from external sources such as Web of Science, Scopus, and Google Scholar
- plenty of web services that allow external systems to interface with the repository for reporting and knowledge analysis purposes.

I would like to seek your pointers and advice (I am new to VuFind, so please excuse my ignorance) about the appropriateness of using VuFind for these purposes, and in particular:
- I know VuFind's SolrMarc has an authority data model, but does VuFind have authority control, and can it be used to implement the author profile data model mentioned above? If not, could that be done "fairly" easily (we don't mind doing some development work)?
- Does VuFind have metadata "harvesting" components or tools that can be used or customized to harvest and load metadata from external sources (OAI-PMH, Scopus API, etc.)? If not, how feasible would it be for us to implement such a capability on VuFind?
- VuFind loads metadata from batches of MARC files. What about files in other formats (e.g. Dublin Core, RIS, RefWorks)?
- For record de-duplication, does VuFind have a more sophisticated mechanism than matching on an ID field (e.g. MARC tag 001)?
- Apart from OAI-PMH (and SRW/U? OpenSearch?), does VuFind have an appropriate architecture for implementing web services, so that external systems can query and retrieve metadata from the repository for various purposes?

We found a site, the "Community Bibliography" from Villanova University, that implements their institutional repository on VuFind:
<http://bibliography.library.villanova.edu/>
However, browsing its public interface did not give me enough clues to answer the questions above.

For your information, we have an IR on DSpace <http://repository.ust.hk/> and a next-generation library catalog on Scriblio <http://catalog.ust.hk/>. The experimental repository of scholarly publications on DSpace is at <http://library.ust.hk/spi>.

Sorry for the long email, but your comments and advice will be very useful to us.

Thanks,
K.T. Lam
Head of Systems and Digital Services
The Hong Kong Univ of Sci and Tech Library
|
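To make the de-duplication question above concrete: one common alternative to matching on an ID field is to compare a signature built from normalized descriptive fields. The sketch below is a plain-Python illustration of that idea only, not a VuFind or Solr feature. The record layout, the sample records, and the choice of title, year, and first-author surname as the key are all assumptions made for the example.

```python
import hashlib
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", ascii_only.lower())
    return " ".join(cleaned.split())


def signature(record):
    """Hash of normalized title, year, and first author's surname.

    The field choice is a hypothetical matching rule, not a standard.
    """
    authors = record.get("authors") or [""]
    surname = authors[0].split(",")[0]
    key = "|".join([
        normalize(record.get("title", "")),
        str(record.get("year", "")),
        normalize(surname),
    ])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def group_duplicates(records):
    """Return lists of record IDs that share a signature (groups of 2+)."""
    groups = defaultdict(list)
    for record in records:
        groups[signature(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Invented sample records: the same article loaded from two sources,
# plus one unrelated record.
records = [
    {"id": "dspace:101", "title": "An Example Article.", "year": 2009,
     "authors": ["Doe, Jane"]},
    {"id": "scopus:9f3", "title": "An example article", "year": 2009,
     "authors": ["Doe, J."]},
    {"id": "wos:77", "title": "A Different Paper", "year": 2008,
     "authors": ["Roe, Richard"]},
]

duplicates = group_duplicates(records)
print(duplicates)
```

A rule this coarse will produce false matches (for example, two different papers with the same title, year, and surname), so in practice the groups would more likely be queued for review than merged automatically.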
From: Demian K. <dem...@vi...> - 2010-02-25 13:26:11
|
First of all, the very broad answer: VuFind is currently evolving from a very library-specific application into a more generic Solr front end. I think it will always be aimed primarily at the library market, but as it becomes more configurable, it becomes useful for more purposes. At this point, however, the majority of development has been library-oriented. I think you can do what you want with VuFind, but it would definitely require a significant amount of work on your part. Since I'm interested in making VuFind more flexible, I'd certainly be happy to support that work by answering questions about the architecture and participating in design discussions on some level.

Some more specific notes:

1.) VuFind does not currently have an authority control system. There is an authority index in place, but it is not currently used; it's just a leftover from an unfinished enhancement. However, assuming you have access to the appropriate authorized names, adding them to your index and making them a factor in search and display shouldn't be terribly hard -- we can discuss in more detail if you like.

2.) VuFind does not currently have strong harvesting tools, but I think these could be developed in a fairly straightforward way -- determining the ideal workflow for loading records and optimizing the index might actually be harder than the technical details. The biggest potential complication is that I don't usually recommend harvesting data directly into the VuFind Solr index -- it might be better to develop some sort of data store as an intermediate layer and build the index from there. This way, if you upgrade VuFind or change your index schema, you can easily rebuild the Solr index without having to re-harvest all of the same data.

3.) The latest release version of VuFind (1.0RC2) only supports MARC. The current trunk code (which will be included in the 1.0 release, expected later this year) has been refactored to allow "record drivers" to be written to support custom display of other types of metadata. You would have to write your own indexers (though I'm sure the community will start sharing tools sooner or later); fortunately, this should only amount to an XML transformation in most cases. More details are in the wiki.

4.) VuFind doesn't have any special de-duplication tools, but Solr does -- I haven't looked into them in much detail, but I suspect that you could do some de-duplication work on VuFind's index fairly easily.

5.) Web APIs pose a few challenges for VuFind, mainly with regard to tracking when records were added or changed. This makes certain API calls difficult to implement reliably. There has been some discussion about whether the Jangle project might offer a solution to this problem, but Jangle/VuFind integration is in very early stages and may not apply to your applications. Apart from calls that rely on information VuFind does not currently have available, though, implementing APIs should not be too difficult -- VuFind is designed with the MVC model, and many APIs may not be much more complicated than implementing a new view on top of existing data.

Also, regarding Villanova's Community Bibliography: that is a heavily customized release of a very early version of VuFind, created before I joined the team. One of these days, I plan on updating it to the latest version of VuFind, which may help add some new tools to the trunk.

I hope this helps answer some of your questions. Please let me know if you need more details on anything, and thanks for your interest in the project!

- Demian

________________________________________
From: lb...@us... [lb...@us...]
Sent: Thursday, February 25, 2010 1:40 AM
To: vuf...@li...
Subject: [VuFind-Tech] scholarly publications repository
|
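Points 2 and 3 of the reply above (keep harvested records in an intermediate data store, then build the Solr index from that store with what is essentially an XML transformation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not VuFind code: the SQLite table layout, the helper names, and the Solr field names `id`, `title`, and `author` are all hypothetical and would need to match the real index schema. The output uses Solr's standard XML update message format (`<add><doc><field name="...">`).

```python
import sqlite3
import xml.etree.ElementTree as ET

# Clark-notation prefix for Dublin Core elements.
DC = "{http://purl.org/dc/elements/1.1/}"


def open_store(path=":memory:"):
    """Open the intermediate store (hypothetical single-table layout)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "id TEXT PRIMARY KEY, source TEXT, raw_xml TEXT)"
    )
    return conn


def save_record(conn, record_id, source, raw_xml):
    """Insert or overwrite the raw harvested record, unchanged."""
    conn.execute(
        "INSERT OR REPLACE INTO records (id, source, raw_xml) "
        "VALUES (?, ?, ?)",
        (record_id, source, raw_xml),
    )
    conn.commit()


def to_solr_doc(record_id, raw_xml):
    """Map one Dublin Core record to a Solr <doc> element.

    The Solr field names are assumptions for illustration.
    """
    dc_record = ET.fromstring(raw_xml)
    doc = ET.Element("doc")
    ET.SubElement(doc, "field", {"name": "id"}).text = record_id
    for dc_name, solr_name in (("title", "title"), ("creator", "author")):
        for node in dc_record.iter(DC + dc_name):
            ET.SubElement(doc, "field", {"name": solr_name}).text = node.text
    return doc


def build_add_document(conn):
    """Rebuild a Solr <add> message from everything in the store."""
    add = ET.Element("add")
    rows = conn.execute("SELECT id, raw_xml FROM records ORDER BY id")
    for record_id, raw_xml in rows:
        add.append(to_solr_doc(record_id, raw_xml))
    return ET.tostring(add, encoding="unicode")


# Invented sample record standing in for harvested metadata.
SAMPLE = (
    '<record xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>An Example Article</dc:title>"
    "<dc:creator>Doe, Jane</dc:creator>"
    "</record>"
)

store = open_store()
save_record(store, "example:1", "oai", SAMPLE)
solr_xml = build_add_document(store)
print(solr_xml)
```

Because the raw records stay in the store untouched, a schema change only requires editing the mapping in `to_solr_doc` and running `build_add_document` again; nothing has to be harvested a second time. Posting the resulting message to Solr's update handler is left out of the sketch.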
From: Greg P. <gre...@gm...> - 2010-02-26 02:22:22
|
This sounds remarkably pertinent to the stuff I'm doing in my new job. The team I work with now is developing some software called The Fascinator:
http://fascinator.usq.edu.au/
Don't ask me how it got named :)

We are currently looking at some work just like you've described, both with unique identification of parties:
http://ptsefton.com/2010/02/23/ands-metadata-stores-integrating-vital-with-the-nlas-party-infrastructure-project.htm
and with implementing a repository alongside an existing IR to supplement it with research outputs:
http://ptsefton.com/2010/02/23/ands-metadata-stores-describing-metadata-collections-in-vital.htm
Hopefully you can forgive me for my boss's colour scheme on that blog :)

Beyond all of that, there is a desktop version scheduled for release midway through this year. It is designed to provide academics with a web interface into their own desktop and to make it easy to sync their local research files with servers for backup and/or collaboration.

The tie between all of this and VuFind is my own experience with the product and the fact that USQ uses VuFind for our Library catalogue. One of the next steps in our plan is to install VuFind as a discovery layer.

The Fascinator has a really strong backend with modular harvest components (such as OAI-PMH), document transformation (i.e. it harvests a Word document, renders HTML and PDF versions, and extracts metadata), and a storage API using digital objects prior to Solr indexing. The trade-off here is the added complexity this brings to the user interface, and the fact that we haven't spent much time polishing it from an end-user's perspective.

Our intent is to use The Fascinator's web portal as our staff/admin interface, and to use VuFind as the public interface we expose to students. That way we harvest the rich metadata we've created using The Fascinator and get the benefit of VuFind's reasonably mature and developed user interface.

So in my opinion VuFind isn't a solution to your current problem. As Demian said, it would be a lot of work, but I'm not sure it's work that needs to be done in VuFind. VuFind's core strength is its lightweight neutrality as a discovery layer. I'd gladly sell it as the front end of such a system... that's what we're doing :)

The best part of all of this from my perspective is that all this stuff (and other things not even mentioned, like Moodle and some annotation work we're doing) is open source. Anyone can use it and modify it to their heart's content.

Greg

On 25 February 2010 16:40, <lb...@us...> wrote:
> We at The Hong Kong University of Science and Technology
> (HKUST) Library are evaluating systems to host a repository
> of faculty publications metadata. This database will
> become a component of a university-wide knowledge
> harvesting and transfer project, and it will be utilized
> by various academic/research/administration units for
> various purposes.
>
> One of the candidate systems we have in mind to host
> the repository is VuFind. While major implementations
> of VuFind are to host the next generation library catalog,
> we thought we should give it a try to see whether it
> is appropriate to adopt it for this repository.
>
> We already have an experimental setup of this repository,
> which runs on DSpace, but we would like to have a
> platform that has:
> - more "next-generation" features (e.g. facets, mashup,
> social networking, etc.)
> - author profile data model (e.g. like Thomson Reuters'
> ResearcherID), allowing pulling of selected publications
> metadata in the repository by authors
> - on-the-fly retrieval of citation counts from external
> sources, such as Web of Science, Scopus, Google Scholars,
> etc.
> - lots of web services to allow external systems to
> interface with the repository, for various reporting
> and knowledge analysis purposes.
|
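Greg's message mentions modular harvest components such as OAI-PMH. As a rough idea of what a single harvest step involves, the sketch below parses one OAI-PMH `ListRecords` response into simple dictionaries and picks out the resumption token that points to the next page. It is not The Fascinator's code; the function name, the dictionary keys, and the sample response are assumptions made for the example. The XML namespaces and the `header`, `status="deleted"`, and `resumptionToken` elements come from the OAI-PMH 2.0 protocol.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response.

    Deleted records are skipped; the token is None on the last page.
    """
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        header = rec.find("oai:header", NS)
        if header.get("status") == "deleted":
            continue
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "titles": [t.text for t in rec.findall(".//dc:title", NS)],
        })
    token = root.findtext(
        "oai:ListRecords/oai:resumptionToken", default="", namespaces=NS
    )
    return records, (token.strip() or None)


# Invented sample response: one live record, one deleted record,
# and a token indicating that more pages follow.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2010-02-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc
            xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Article</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:2</identifier>
        <datestamp>2010-02-02</datestamp>
      </header>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

records, token = parse_list_records(SAMPLE_RESPONSE)
print(records, token)
```

A full harvester would wrap this in a loop, repeating the request with `verb=ListRecords&resumptionToken=...` until no token comes back, and would hand each parsed record to whatever storage or transformation stage follows.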