From: Christopher O. <oe...@oe...> - 2009-08-07 11:24:04
Hi Rob,

great! Do you want to provide a jar and a description (+ link to a homepage) for the website?

How about, after your fetcher has been announced, you drop the NTRS guys a note on these problems? Maybe they will fix some of them.

Thanks for the good work. Seems to work for me (with the noted limitations).

Cheers,
Christopher

On Fri, 07 Aug 2009 06:38:22 +0200, Rob McDonald <rob...@gm...> wrote:
> All,
>
> Thanks for all the help, I got the NTRS Fetcher packaged up as a proper
> plugin.
>
> It isn't perfect, but it does a reasonable job given what NTRS spits
> out. This means:
> - XML files with blank lines at the start.
> - Lots of fields with more than one entry (date, identifiers, etc).
> - No particular format for things like date.
> - Not all information pushed by OAI (keyword, etc.)
> - Limited use hours (shut down 8am-8pm EST).
>
> It might be more useful to build an NTRS fetcher based on a pure HTML
> scraper to get around most of these problems.
>
> I hope someone finds it useful.
>
> Rob