Plugin for automatic download of PDF files

2009-03-25
2012-11-08
  • Hi all,

    I wrote a plugin to automatically download pdf files from the preprint server
    arXiv or from the journal homepage (linked by the doi field). Maybe it is also
    useful for some of you.

    You can find it at http://www.lhnr.de/ext/localcopy.php .

    It uses the eprint field for the preprint and the doi field for the journal
    download. Both fields are automatically set if you fetch, e.g., from SPIRES.

    I tested the plugin against the following journals:
    - Phys. Rev. D.
    - J. Math. Phys.
    - JHEP
    - Some journals on www.sciencedirect.com.

    It should, in principle, work with any journal that links to the pdf file
    on the doi-destination website.
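
As a rough illustration of the idea (not the plugin's actual code), the following Java sketch scans the HTML of the page reached via the DOI for the first link ending in .pdf and resolves it against the page address. The class name PdfLinkFinder, the method findPdfLink, and the regular expression are assumptions made for this example; real journal pages may need more robust HTML parsing.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class PdfLinkFinder {
    // Matches href="...pdf" or href='...pdf', optionally followed by a query string.
    private static final Pattern PDF_HREF = Pattern.compile(
            "href\\s*=\\s*[\"']([^\"']+\\.pdf(?:\\?[^\"']*)?)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns the first PDF link found in the HTML, resolved against the page URI.
    static Optional<URI> findPdfLink(String html, URI pageUri) {
        Matcher m = PDF_HREF.matcher(html);
        if (m.find()) {
            return Optional.of(pageUri.resolve(m.group(1)));
        }
        return Optional.empty();
    }
}
```

A page that only shows an abstract and has no such link yields an empty result, which is the case where a download cannot succeed.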

    I am looking forward to your feedback.

    Best,

    Christoph

     
    • tupduq bien
      2009-03-26

      Nice plugin Christoph!!!
      Very useful and works perfectly.
      If you could add the option "downloading PDF from URL", it would be even better, since the DOI field is not always present in bib files.

      Thanks.
      Best regards.

       
    • Thanks for your comment. I also have some entries in my bib file that only have the url field set.

      Please find a new version (0.4) online that now includes support for the url field. I tested it with links to citebase and it worked fine. Just use the "Journal PDF" option.

      Best

       
    • tupduq bien
      2009-03-26

      Really impressive!!!
      This plugin should be included by default in the next version of JabRef!
      Congratulations.

      Best regards.

       
      • mel
        2009-03-28

        Hi,
        Unfortunately, I haven't had much success with this exciting plugin. Is it compatible with an automatic proxy URL (with authentication)? I tried adding the proxy address in the Java config, but I don't know whether the id/password is set. I also tried "use browser config", but that didn't work either. I always get

        ! file is not of mime type application/pdf, but text/html;charset=iso-8859-1.
        ! usually this happens if authentication with the journal failed.
        ! please follow the doi link with your web browser.

        Another, maybe stupid, question: for some PDFs, downloading is *almost* fine (it downloads only the free content, i.e. only the abstract). But where is the file saved?

        Thanks,
        Mélanie
        PS: I can give you more details or examples if needed. I would love to use it!
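
The error message quoted above suggests that the plugin inspects the Content-Type header of the response. A minimal sketch of such a check in Java, assuming a hypothetical helper isPdf (the plugin's real implementation may differ):

```java
import java.util.Locale;

// Hypothetical helper, for illustration only.
class MimeCheck {
    // True if the Content-Type header names application/pdf,
    // ignoring parameters such as ";charset=iso-8859-1".
    static boolean isPdf(String contentType) {
        if (contentType == null) {
            return false;
        }
        int semicolon = contentType.indexOf(';');
        String mediaType = semicolon >= 0
                ? contentType.substring(0, semicolon)
                : contentType;
        return mediaType.trim().toLowerCase(Locale.ROOT).equals("application/pdf");
    }
}
```

A login or "access denied" page is usually served as text/html, so a check like this rejects it even when the HTTP request itself succeeds.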

         
        • Hi Mélanie,

          For your information, there is a new version of the plugin available (0.7) that now also uses the per-database file directory settings when they are available. Furthermore, it now uses the main file directory instead of the pdf directory, so the problem you described (by default the files were stored in /) should no longer occur.

          Best,

          Christoph

           
        • Hi Mélanie,

          thanks for your interest in the plugin. Here is how I configure JabRef/the plugin to use a proxy:

          I use a SOCKS proxy obtained by a "ssh -D 9500 ..." call, but any other HTTP or SOCKS proxy will do as well.

          For a SOCKS proxy I start JabRef using the command line:

          java -DsocksProxyHost=localhost -DsocksProxyPort=9500 -jar JabRef-2.4.2.jar ,

          where localhost and 9500 have, of course, to be replaced with your SOCKS proxy hostname and port.

          General documentation that also covers HTTP proxies (and how to specify a username/password) can be found here:

          http://java.sun.com/j2se/1.5.0/docs/guide/net/properties.html .

          This works very well with my configuration.
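
The same settings can in principle also be applied from Java code, before the first network connection is opened. The sketch below uses the standard system properties socksProxyHost, socksProxyPort, http.proxyHost and http.proxyPort, plus java.net.Authenticator for proxy credentials. The class ProxySetup and its method names are hypothetical, and whether a given proxy accepts these credentials depends on its authentication scheme.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

// Hypothetical helper, for illustration only.
class ProxySetup {
    // Equivalent to -DsocksProxyHost=... -DsocksProxyPort=... on the command line.
    static void useSocksProxy(String host, int port) {
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", Integer.toString(port));
    }

    // HTTP proxy with username/password supplied through an Authenticator.
    static void useHttpProxy(String host, int port, final String user, final String password) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                // Only answer challenges that come from the proxy, not from web servers.
                if (getRequestorType() == Authenticator.RequestorType.PROXY) {
                    return new PasswordAuthentication(user, password.toCharArray());
                }
                return null;
            }
        });
    }
}
```

For example, ProxySetup.useSocksProxy("localhost", 9500) mirrors the command line shown above.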

          As for the second question: The plugin saves the PDF file in the "pdf" folder that can be set in the preferences of JabRef. It then uses the BibTeX key field of the entry (plus .pdf) as the filename of the local copy of the article.
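
A minimal sketch of this naming rule in Java, assuming a hypothetical helper LocalCopyPath.targetFor (not the plugin's actual code). It refuses an empty directory setting rather than silently falling back to some default location:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
class LocalCopyPath {
    // Builds "<directory>/<BibTeX key>.pdf" for the local copy of an article.
    static Path targetFor(String directory, String bibtexKey) {
        if (directory == null || directory.trim().isEmpty()) {
            throw new IllegalArgumentException("no file directory configured");
        }
        if (bibtexKey == null || bibtexKey.trim().isEmpty()) {
            throw new IllegalArgumentException("entry has no BibTeX key");
        }
        return Paths.get(directory).resolve(bibtexKey.trim() + ".pdf");
    }
}
```

With the directory set to a papers folder and the key Khandkar2003, the file name of the result is Khandkar2003.pdf.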

          Best regards,

          Christoph

           
          • mel
            2009-03-29

            Thanks for the quick answer.
            I will check this next week. I have no idea what SOCKS is, so I will have to learn about it before trying config modifications.

            As for the file location, that makes sense. I couldn't find the files at first, but I found out why: I had not set a directory specifically for PDFs, only one for main files. By the way, on Ubuntu they were saved in root (/). Now it works fine.
            Thanks,
            Mélanie

             
    • Denis FEURER
      2009-04-14

      Dear Christoph,

      I would be very pleased to use the plugin you developed; I find the idea excellent.

      Unfortunately, I can't manage to install it. I tried several different methods, including:
      - putting the .jar file in a "plugin" subdirectory of the directory where jabref.2.3.1.jar is.
      - putting the files from the /plugin directory of the .jar file in the same directory.

      I don't see anything changed in the JabRef interface. What am I supposed to see?

      I'm running JabRef 2.3.1 under Ubuntu 8.10

      Thanks in advance for your answer, and thank you again for this plugin,

      Denis

       
      • mel
        2009-04-14

        The plugin doesn't work with 2.3.1. You have to download 2.4, and Java 1.6 if you don't already have it.
        Good luck!

         
    • Rajil Saraswat
      2009-05-21

      Thanks for the excellent plugin. JabRef has become much nicer to use with the automatic downloading of PDFs. I have a request though.

      Is it possible to automatically download from websites where a form-based login/password is required? The specific website I want to download from is http://www.ingentaconnect.com.

       
      • Dear Rajil,

        I am glad you find the plugin useful. I am currently thinking about implementing a mechanism that
        will offer form-based logins if IP authentication fails and the website offers a <form> that includes
        username/password fields. It would be very helpful if you could help with the debugging process by providing a username/password and a relevant article on www.ingentaconnect.com; if you are willing to do so and your institution's policy allows this, please contact me at clehner // users.sourceforge.net.

        Best,

        Christoph

         
    • Rajil Saraswat
      2009-05-22

      Dear Christoph,

      Unfortunately, my institution's policy doesn't allow me to share the username/password. But I can always test your code and send you back the responses if that is helpful.

      On another note, I like to download several articles in one go. Since these articles have both doi and url fields, LocalCopy asks me to select one of them every time. Is it possible to set a preference so that the DOI is picked automatically and, failing that, the URL is tried next? This way a user doesn't need to keep selecting the download mode for every file.

      Cheers
      Rajil
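
One way such a preference could behave, sketched in Java under the assumption that an entry is available as a simple map of field names to values (the class DownloadOrder and this map representation are hypothetical, not the plugin's API): the DOI is turned into a dx.doi.org address and tried first, and the url field is kept as a fallback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class DownloadOrder {
    // Returns the addresses to try, in order: DOI first, then the url field.
    static List<String> candidates(Map<String, String> fields) {
        List<String> urls = new ArrayList<>();
        String doi = fields.get("doi");
        if (doi != null && !doi.trim().isEmpty()) {
            urls.add("http://dx.doi.org/" + doi.trim());
        }
        String url = fields.get("url");
        if (url != null && !url.trim().isEmpty()) {
            urls.add(url.trim());
        }
        return urls;
    }
}
```

A batch download could then walk this list for each entry and stop at the first address that yields a PDF, without prompting the user.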

       
    • Rajil Saraswat
      2009-05-28

      Hi Christoph,

      Please bear with me for some time, as there has been a subscription problem between my institute and ingentaconnect. Because of this I am not able to log in. As soon as it is resolved I will give you feedback.

      Cheers
      Rajil

       
    • Rajil Saraswat
      2009-05-29

      Hi Christoph,

      OK, the journal subscription is sorted out. The plugin did ask me for a login and password, but then it failed and could not download the PDF file. Here is the log:

      ! http request rejected
      ! could not download file 'http://dx.doi.org/10.1179/136217103225010943'.
      ! Content-Type: text/html; charset=iso-8859-1
      ! Transfer-Encoding: chunked
      ! Connection: Keep-Alive
      ! Keep-Alive: timeout=2, max=99
      ! Server: Apache/1.3.41 (Unix) mod_ssl/2.8.31 OpenSSL/0.9.8h mod_jk/1.2.19
      ! Date: Fri, 29 May 2009 20:08:35 GMT
      ! null: HTTP/1.1 403 Forbidden
      - cookie received: ingentaCookie=token004f1cc52bac8ff3b29964754766543d30237e224f58592f557c685047305f432f266e6b4392799eaf66f0e1e49003cd9d4645d564a41f9;path=/;expires=Thu, 27-Aug-09 20:08:35 GMT.
      - follow redirect to http://www.ingentaconnect.com/content/maney/stwj/2003/00000008/00000003/art00002;jsessionid=771a71vln1jdd.victoria?token=005f1bb01794a892f843f4e4b3b49264f655d375c6b6876335045416762496e586546244072423520654c763f6c4444.
      - no pdf link found. second try with login.
      - follow redirect to http://www.ingentaconnect.com/content/maney/stwj/2003/00000008/00000003/art00002?token=005f1bb01794a892f843f4e4b3b49264f655d375c6b6876335045416762496e586546244072423520654c763f6c4444.
      - follow redirect to http://www.ingentaselect.com/rpsv/cgi-bin/cgi?ini=xref&body=linker&reqdoi=10.1179/136217103225010943.
      - downloading journal html from http://dx.doi.org/10.1179/136217103225010943...
      - processing Khandkar2003...

      Cheers
      Rajil

       
      • Dear Rajil,

        thank you for testing the plugin. Meanwhile there is a new debug version of the plugin available at

        http://www.lhnr.de/ext/localcopy/net.sf.jabref.plugin.localcopy-1.0.jar .

        This version was successfully tested against form-login on IEEE.org. If you are interested in providing
        further debugging results, please contact me at

        clehner@users.sourceforge.net .

        Best,

        Christoph

         
    • Rajil Saraswat
      2009-06-02

      Hi Christoph,

      This time it worked very well. I was able to download the file after entering the username/password. However, I can't provide the log, since the download window closes after the download (I guess this is intentional). Otherwise it is working well.

      I don't understand the Policy button and the Update fields, but I guess this is something you are working on.

      Cheers
      Rajil