RIS import

Help
2008-03-22
2013-05-28
  • I was testing an RIS import to refbase.org and received errors on import.  I tried two file formats - one plain text and one rich text (to test unicode characters import).
    The format is RIS - here is exactly what it looks like:

    TY  - JOUR
    AU  - Young, Warren C.
    T1  - Whither Evangelicalism
    PY  - 1959
    JA  - BETS
    JF  - Bulletin of the Evangelical Theological Society
    SP  - 5-15
    VL  - 2
    IS  - 1 
    UR  - http://search.ebscohost.com/login.aspx?direct=true&db=rfh&AN=ATLA0000666694&site=ehost-live
    U1  - English
    ER  -

    TY  - JOUR
    AU  - Young, Warren C.
    T1  - Is There a Christian Philosophy
    PY  - 1958
    JA  - BETS
    JF  - Bulletin of the Evangelical Theological Society
    SP  - 6-14
    VL  - 1
    IS  - 4 
    UR  - http://search.ebscohost.com/login.aspx?direct=true&db=rfh&AN=ATLA0000666690&site=ehost-live
    U1  - English
    ER  -

    Can someone let me know what I'm doing wrong?

     
    • I don't think sourceforge is properly displaying those; they have neither unicode nor rich text.  Note, though, that in the formal RIS specification, there is no support for either of these features anyway.

      When copied & pasted from sourceforge, the RIS is also invalid because there are supposed to be two (not one) spaces before the dash.
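To make the spacing rule concrete, here is a minimal Python sketch (not refbase code) that flags tag lines without exactly two spaces before the dash. It assumes a tag line is a two-character tag, two spaces, a hyphen, then a space and the value; the function name `malformed_tag_lines` is made up for illustration.

```python
import re

# Assumed layout of an RIS tag line: two-character tag, exactly two spaces,
# a hyphen, then a space and the value (the value may be missing, as in "ER  -").
RIS_TAG_LINE = re.compile(r"^[A-Z][A-Z0-9] {2}-(?: .*)?$")

# Anything that starts like a tag followed by spaces and a hyphen.
TAG_LIKE = re.compile(r"^[A-Z][A-Z0-9] +-")


def malformed_tag_lines(text):
    """Return (line_number, line) pairs that look like tag lines but break the layout."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.rstrip()
        if TAG_LIKE.match(stripped) and not RIS_TAG_LINE.match(stripped):
            problems.append((number, line))
    return problems
```

Running it over the text as pasted into a form and over the original file would show whether a space is lost along the way.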

       
    • Hi Danny,

      I tried to import your records at refbase.org as received from the SourceForge post notification email. They seem to import fine for me here. What is the exact error message that you receive? And have you tried copying your Unicode records and pasting them into the import form (instead of uploading a file via the "upload file" button)?

      Matthias

       
    • Sorry, I confused the question by talking about unicode.  All I was saying is that some of my other records have unicode which is why I tried a rich text document - I realize I have no unicode characters in the above examples.

      I figured out the problem, and Richard was right, but I don't really understand why it is happening.  When I open the files in TextEdit there are two spaces before the dash, but on import only one space is seen.  I can't figure out why, though.

       
      • What leads you to think it is the space problem?  (I had assumed that the sourceforge forum is just eating them, but refbase should NOT be.)

        Are you copy/pasting or are you uploading the file?  When you try the other method, do you get the same results?

         
    • When I open the .txt file there are clearly 2 spaces before the dash.  But when copying the text and pasting it, they turn into a single space.  I'm assuming that this is the problem when importing from the file as well.

       
      • Please try to import the file.  Also try to copy/paste from and into some other program.  Do you have any third-party clipboard manager?

         
    • - importing the file doesn't work.
      - copying and pasting from the text file doesn't work
      - copying and pasting into Pages or Word keeps the double spaces
      I'm wondering, does the text encoding have something to do with it?  the RIS .txt file was exported from Bookends, which makes it ASCII .txt.

       
      • Hi Danny,

        just to understand you correctly:

        - The problems you're describing occur with the refbase installation at http://refbase.org using the two RIS records you've posted in your original email in this thread, is this correct?
        - What is the exact error message(s) you're receiving?
        - What browser (and browser version) are you using?

        > I'm wondering, does the text encoding have something to do with it?

        I don't think so.

        > the RIS .txt file was exported from Bookends, which makes it ASCII .txt

        Btw, some former versions of Bookends had a bug where the second space in exported RIS records got swallowed by Bookends, but I assume that you're using the most recent version which doesn't have this issue (AFAIK).

        FWIW, I just tried again to import your above given records into the refbase db at refbase.org (using Safari 3.0.4 on OSX 10.4 via copy&paste) and they do import fine for me. Have you tried another browser?

        Matthias

         
    • One other thing about RIS importing for you, in regards to the tags.

      Looking at the refbase wiki (http://wiki.refbase.net/index.php/Import_Example:_RIS)  "T2" is used for the journal abbreviation, but according to the reference manager page (http://www.refman.com/support/risformat_tags_05.asp) it should be "JA".

      Also, an RIS tag for DOI's is not specified by reference manager, but I think "L1" makes the most sense (http://www.refman.com/support/risformat_tags_07.asp).  Can this be taken into consideration — or let me know what tag is already assigned to DOI's in refbase.

      Danny

       
      • > Looking at the refbase wiki (http://wiki.refbase.net/index.php/Import_Example:_RIS)
        > "T2" is used for the journal abbreviation, but according to the reference manager
        > page (http://www.refman.com/support/risformat_tags_05.asp) it should be "JA".

        When JA is present (whether or not there is a T2), JA will be used for the abbreviation.  If only T2 is present, T2 is used for the abbreviation.
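As a rough Python sketch of that precedence rule (illustrative only, not the refbase importer): `fields` is assumed to be a dict mapping RIS tags to values for one record, and an empty value is treated as "not present", which is an assumption.

```python
def journal_abbreviation(fields):
    """Pick the abbreviated journal name from a dict of RIS tag -> value."""
    # JA wins whenever it is present, whether or not T2 is also given.
    if fields.get("JA"):
        return fields["JA"]
    # Otherwise fall back to T2; returns None if neither tag is present.
    return fields.get("T2")
```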

        > Also, an RIS tag for DOI's is not specified by reference manager, but I think
        > "L1" makes the most sense (http://www.refman.com/support/risformat_tags_07.asp).
        > Can this be taken into consideration — or let me know what tag is already
        > assigned to DOI's in refbase.

        But a DOI is usually NOT a link to a PDF.  Springer & others put in the dx.doi.org link in the 'UR' field.

        In the SVN version, refbase will strip off the 'http://dx.doi.org/' from the beginning of the URI & place what remains in the DOI field.

         
    • Matthias,
      yes the two citations are what I'm trying to upload, they are in a .txt file.  I tried the upload in Firefox and got the same errors.  When I cut and paste the text into firefox, it did work.    I just emailed you the file.

      In regards to DOI, if L1 isn't good, what about L2?  I realize some just put it in a UR field, but sometimes you have a separate UR to attach to a citation.  If you had two UR fields, one a regular UR and the other a dx.doi.org UR, would refbase know to put an identical tag into separate fields?

      In any case, I still think choosing a RIS tag for DOI's is a good idea, for importing and for exporting.

       
      • > I tried the upload in Firefox and got the same errors.  When I cut and paste
        > the text into firefox, it did work.    I just emailed you the file.

        Matthias may well figure it out from what you sent him, but the following would be useful:
        (1) what is the error message that you get?
        (2) do you get it when you upload the file from other browsers?
        (3) what is the filename (with extension) of your import file?
        (4) can you think of any other potential limitations to file uploads you might have (proxy/firewall/antivirus/etc.)?

        If it works in other browsers:
        (5) what extensions do you have installed in firefox?

        > In regards to DOI, if L1 isn't good, what about L2?

        It is better, but DOIs don't always point to the full text either.  Further: Some implementations of RIS seem not to accept proper URIs in these fields anyway.

        > If you had two UR fields, one a regular UR and the other a dx.doi.org UR, would
        > refbase know to put an identical tag into separate fields?

        Not at this time.  It SHOULD, though.

        > In any case, I still think choosing a RIS tag for DOI's is a good idea, for
        > importing and for exporting.

        For import:
        We should do our best to both support the specification and to support what is popular in the field.  Since I can't think of any other popular RIS exporter that has both DOI & a standard URL, I think we currently do the latter.  But we should certainly improve our importer to do the former as well.

        For export:
        We pawn this task off to 'bibutils.'  As long as our native MODS XML export is as good as it can be, I'm hesitant to make changes when it is unclear that there is a net gain without any significant downside.  There seems to be NO gain for using our own, non-standard field for this.  There MIGHT be a small gain of exporting multiple URIs if other clients read them properly.  I haven't done enough testing to know behavior of other clients.  Endnote seems to be fine with one URL per line, but not with semicolon-separated URLs on one line.  I have a slight fear that other clients may only take the first or last URL or might choke (since I've seen no exporter that has this behavior already). 

        If this fear is unfounded, perhaps a request should be made to bibutils first (so that other users of that tool could reap the same benefits).

        --Rick

         
      • Danny, thanks for sending me the file, that helped me figure out what was causing the problem. It's actually an old (i.e. known) problem, which I haven't been able to solve.

        refbase can import RIS (and other) record formats from a file if the line endings are either Unix (LF) or Windows (CRLF), but Mac (CR) line endings seem to cause problems. Your file had Mac (CR) line endings; changing them to Unix or Windows line endings allows the file to be imported into refbase (this can be done, e.g., using the free "TextWrangler" app available from http://www.barebones.com/products/textwrangler/ ).
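If no suitable editor is at hand, the same conversion can be scripted. The following is a minimal Python sketch (not part of refbase) that rewrites Mac (CR) and Windows (CRLF) line endings as Unix (LF); the function name is made up for illustration.

```python
def normalize_line_endings(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    # Replace CRLF first so a Windows line ending does not become two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

To convert a file, read it in binary mode, pass the bytes through this function, and write the result back in binary mode so that Python does not translate the line endings again.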

        Compared to uploading files, copying & pasting records usually causes fewer problems.

        Btw, if you're exporting records from Bookends, you could also try the direct "Upload to refbase" functionality. See the Bookends PDF manual for more info on how to set this up (setup is very easy, just enter the refbase base url plus your account info in the Bookends Internet prefs).

        Matthias

         
        • Does
            ini_set('auto_detect_line_endings', true);
          magically fix import with Mac line endings?

          See:
          http://us3.php.net/manual/en/ref.filesystem.php#ini.auto-detect-line-endings

          --Rick

           
          • > Does
            >   ini_set('auto_detect_line_endings', true);
            > magically fix import with Mac line endings?

            Rick, thanks for the tip, actually I didn't know about that one!

            It "feels" as if this must be the issue (since I *think* that we always check for \n AND \r in the refbase code). However, setting

            ini_set('auto_detect_line_endings', true);

            in function 'readFromFile()' didn't seem to make any difference. But we currently use 'file_get_contents()' (and not 'fgets()' or 'file()') to read the file's contents, and I haven't tried it yet with these latter two functions.

            Thanks again for the pointer. I'll test this more tomorrow...

            Matthias

             
      • W.r.t. DOIs, Danny Zacharias wrote:

        > I realize some just put it in a UR field, but sometimes you have a
        > separate UR to attach to a citation. If you had two UR fields, one
        > a regular UR and the other a dx.doi.org UR, would refbase know to
        > put an identical tag into separate fields?

        I've modified the relevant import function so that refbase recognizes multiple 'UR' tags and extracts any DOI (given as http://dx.doi.org/... URL) to the 'doi' field.

        Otherwise the behaviour w.r.t. multiple 'UR' tags is unchanged, i.e. in case of multiple (non-DOI) URLs, refbase will take the URL from the first given 'UR' tag and store it in the refbase 'url' field. Support for multiple URLs in refbase is a planned feature but isn't done yet.
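The behaviour described above can be sketched roughly as follows. This is illustrative Python, not the actual refbase import function (which is PHP); the name `split_urls` is hypothetical, and keeping only the first DOI when several are given is an assumption.

```python
DOI_PREFIX = "http://dx.doi.org/"


def split_urls(ur_values):
    """Return (doi, url) from the values of all UR tags in one record."""
    doi = None
    url = None
    for value in ur_values:
        value = value.strip()
        if value.startswith(DOI_PREFIX):
            # Strip the resolver prefix; what remains goes to the 'doi' field.
            remainder = value[len(DOI_PREFIX):]
            if doi is None and remainder:
                doi = remainder
        elif url is None and value:
            # Only the first non-DOI URL is kept for the 'url' field.
            url = value
    return doi, url
```

For example, `split_urls(["http://example.org/a", "http://dx.doi.org/10.1000/xyz", "http://example.org/b"])` returns `("10.1000/xyz", "http://example.org/a")`.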

        > In any case, I still think choosing a RIS tag for DOI's is a good
        > idea, for importing and for exporting.

        Generally, I agree with you here. I'll ask the Bibutils developer (Chris Putnam) whether he can add support for DOIs in RIS import ('ris2xml') and export ('xml2ris'). We'll then try to support the same tags for DOIs in the refbase RIS importer.

        More info on how other tools handle this would be welcome.

        Matthias

         
    • Not sure it matters now, but my apologies for continuing to forget to report the error I was receiving.

      "There were validation errors regarding the data you entered:
      Unrecognized data format!"

      Matthias, can other refbase databases be uploaded to from Bookends (i.e. if/when we set up our refbase site)?
      Second Bookends question - what is the field mapping from BE to refbase?

       
    • > can other refbase databases be uploaded to from Bookends (i.e.
      > if/when we set up our refbase site)?

      Yes, that's possible, just adjust the refbase settings (base URL, user name & pwd) in the Bookends Internet prefs accordingly.

      However, note that the "Upload to refbase" feature was added to refbase after the release of refbase-0.9.0, which means that you'll need the refbase version from SVN.

      http://svn.refbase.net/

      > Second Bookends question - what is the field mapping from BE to
      > refbase?

      I don't have a thorough field mapping table at hand. The easiest is probably to upload some records of different types (article, chapter, whole book, etc) from within Bookends to the refbase database at http://refbase.org and see where the record content ends up. Sorry for that.

      It may be useful to test the two different upload formats (options "Endnote XML" and "RIS" available in the Bookends Internet prefs) to see whether one works better than the other for your own records. While refbase imports records in Endnote XML format thru Bibutils (and is thus dependent on the Bibutils field mapping), RIS records get imported directly (i.e. without passing them thru Bibutils).

      Matthias