Interface with spires / inspire

Help
Anonymous
2010-06-30
2013-05-28
  • When I try to import them in Refbase, I get "There were validation errors regarding the data you entered:" and below "Unrecognized data format!"
    ….
    - is it normal that the Endnote and RefWorks imports fail? Is it a configuration issue on my side, or bad output from inspire?

    Endnote XML is supported by refbase. As far as I know, the Endnote XML format has horrible documentation & differs from version-to-version of Endnote, so it is hard to say whether Spires is technically "wrong." Most Endnote XML files have the structure

    <xml><records><record>
    

    and refbase expects this for format autodetection, but bibutils does not seem to actually require it. Spires does not put its records inside an <xml> root element, so refbase's autodetection does not work. Perhaps includes/import.inc.php can be modified to not rely on the '<xml>' being there (or, as a work-around, you can add it to your data manually). But the Spires Endnote XML export is still a bit "strange." No reference type is specified, for instance, and bibutils can't seem to import the 'authors' branch (I don't know, offhand, why). Have you successfully imported the Endnote XML from Spires into any other program? I'm a bit hesitant to spend a lot of time diagnosing their issues and suggesting improvements to them, or patching bibutils/refbase, while the service is still in beta and they encourage the use of slac.stanford.edu/spires.
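The manual work-around mentioned above (adding the missing '<xml>' wrapper before import) could be scripted. The following is only a rough sketch, not part of refbase or bibutils: the function name `ensure_xml_wrapper` is hypothetical, and it assumes the export is a plain string whose top-level element is '<records>', optionally preceded by an XML declaration. It does nothing about the missing reference type or the 'authors' problem.

```python
import re


def ensure_xml_wrapper(data: str) -> str:
    """Wrap EndNote-style XML in an <xml> root element if it lacks one.

    Hypothetical helper: assumes the input is either already wrapped in
    <xml>...</xml> or starts directly with its records element.
    """
    # Split off an optional XML declaration so it stays at the very top.
    match = re.match(r"\s*(<\?xml[^>]*\?>)?\s*", data)
    declaration = match.group(1) or ""
    body = data[match.end():]

    # Leave the data untouched if the <xml> root is already present.
    if re.match(r"<xml[\s>]", body):
        return data

    wrapped = "<xml>" + body.rstrip() + "</xml>"
    return declaration + "\n" + wrapped if declaration else wrapped


if __name__ == "__main__":
    sample = "<records><record><titles><title>T</title></titles></record></records>"
    print(ensure_xml_wrapper(sample))
```

Running the wrapped result through the normal import would then at least get past format autodetection; whether the individual fields import cleanly is a separate question.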

    RefWorks XML is not a supported format yet. It is only the tagged RefWorks format that is supported. Support for RefWorks XML will likely follow support by bibutils.

    - is there a way not to get duplicates after an import? Are there some unique fields?

    Better duplicate handling is planned. But you can use duplicate_search.php for now.

     

  • Anonymous
    2010-07-05

    Thanks for the answer!
    I've never imported anything from inspire. I used to do some shell processing of spires output, manually import that into a mysql database, and then display it with some php code. My idea was to switch to something more "pro" (and not maintained by me ;) )…
    In order to switch to refbase I "just" need a more or less automatic, cron-based way to fill refbase from spires/inspire.
    I am ready to contribute/interact with the spires/inspire people to make this work. I just need to know in which direction I should go. Should we work to have Inspire Endnote better supported? In that case, there is the "easy" xml fix. Now, as you said, the other fields are not all properly imported… Looking at the xml file, the <authors><author>a1</author><author>a2</author></authors> seems clean to me, but I have no idea what the expected syntax is. Is there any document on which the refbase import is based?
    Spires/inspire is really important in high energy physics, and I guess it is worth spending some time working on it (I mean some of my time)…

     
  • It seems to me that INSPIRE will need to fix their export functions, and I've written to their feedback address to request these changes (dates in MODS, all authors in BibTeX, and a working EndNote XML export).  EndNote X is not able to import INSPIRE-generated files either, so I don't think we want to work around the issues when INSPIRE will almost certainly need to change things.

    -Rick

     
  • Travis Brooks from INSPIRE responded today, noting that the issue is now in their tracking system (but with a low priority because INSPIRE is in early beta).  He also remarked

    BibTeX including all authors can be hard, since many HEP articles
    contain, now, thousands of authors.   We usually truncate at a few, does
    this make sense to you?

    The idea of some truncation is reasonable to me (BibTeX fields are limited to a few thousand characters), but one author may not be sufficient.  I don't know the convention for HEP, but most other fields with high author counts do list more than a single author (AASTeX, for example, defaults to 8 but allows customization).  If you have a different opinion, feel free to mention it.
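To make the truncation idea concrete, here is a small sketch of how an exporter might cap a BibTeX author field. It is an illustration only and not how INSPIRE does it: the function name `truncate_bibtex_authors` is hypothetical, the default of 8 simply borrows the AASTeX figure mentioned above, and it relies on the usual BibTeX convention that a trailing "and others" is rendered as "et al." by the bibliography style.

```python
def truncate_bibtex_authors(authors, max_authors=8):
    """Build a BibTeX author field, keeping at most max_authors names.

    Hypothetical helper: names are joined with " and "; when the list is
    cut short, "and others" is appended so the style can print "et al."
    """
    if max_authors < 1:
        raise ValueError("max_authors must be at least 1")

    # Short author lists are exported in full.
    if len(authors) <= max_authors:
        return " and ".join(authors)

    # Long lists keep the leading names and mark the truncation.
    return " and ".join(authors[:max_authors]) + " and others"


if __name__ == "__main__":
    names = ["Aaa, A.", "Bbb, B.", "Ccc, C."]
    print(truncate_bibtex_authors(names, max_authors=2))
```

Any author cut off this way no longer appears in the exported record, so a search on that name in the importing program will not find the paper.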

     

  • Anonymous
    2010-09-01

    Well… There is no real convention. Even within a single collaboration it is hard to agree on such a convention.
    The main issue with not having all the authors is that if you are, for example, Mr Zee, you will never appear, and then a search on refbase will not list your papers unless you manually add yourself to them, which in some sense defeats the initial goal.
    I would suggest having a way of handling this. I can see 2 options. One is having one style of export that lists all the authors, and the second one is having a way of handling collaborations in refbase. Note that even if the second one may seem simpler, it is in fact not, since the authorship changes with time…
    So basically, I guess that they have in INSPIRE the full list of authors for each paper… And it would be nice to be able to retrieve it and insert it in refbase.
    Does it make sense?

     
  • One is having one style of export that lists all the authors….So basically, I guess that they have in INSPIRE the full list of authors for each paper… And it would be nice to be able to retrieve it and insert it in refbase.

    Yes, and INSPIRE uses the full list for all export formats other than BibTeX.  It is just that the other formats are either invalid or are missing information.  But INSPIRE notes they "will be fixing up these formats soon, but not _too_ soon."