Automatic metadata from DOI, DOI from PDF

2009-02-01
2013-05-28
  • Colon Grease
    2009-02-01

    Feature:

    I would like refbase to be able to fetch journal article information - authors, title, journal, page, issue, year, all that - from a DOI.

    Ideally, this DOI would come from a scan of the pdf, but even being able to retrieve it from a user-specified DOI, or scanning it from the article title, would be an improvement and a good starting point.

    What we want is a batch import feature that imports all author information automatically (from a scanned or specified DOI). If the DOI had to be specified, it would be preferable for the user to select a batch of pdfs for upload and then select DOIs one by one from a list, without having to hit 'back' or find the reference in the library after it has been added. Alternatively, if the user could add a list of DOIs and associate the added references with pdfs manually without having to hit 'back' or find the individual references, this would work.

    Usage:

    I am using refbase for a company-wide database of journal articles. Each of the scientists on the team wants to share relevant journal articles on our central server, which serves refbase and holds hundreds of pdfs. The problem is that each of us has tens or hundreds of articles in our personal libraries that we'd like to put in the database, but we don't have the patience to manually enter all author information and link it to the pdf, or to return to the web to download the RIS citation or export an ENW citation from our personal library. This is very tedious.

    Perhaps as an incentive, we are prepared to make a donation if that helps speed development. I strongly believe this is a very powerful feature, so it's not just us that wants to make this happen! You may email me at my nickname at gmail.com.

    Best,
    Colin

     
    • Hi Colin,

      > I would like refbase to be able to fetch journal article information
      > - authors, title, journal, page, issue, year, all that - from a DOI.

      This is possible with refbase-0.9.5. Records from arXiv.org, CrossRef.org and PubMed.gov can be imported directly via their identifiers -- just enter one or more arXiv IDs, DOIs (or OpenURLs), or PubMed IDs (PMIDs), and refbase will fetch & import the corresponding record metadata.

      What fields get imported depends on the remote web service. E.g., while records imported from arXiv.org and PubMed.gov usually include the abstracts, CrossRef.org unfortunately only includes the basic metadata but no abstracts.
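To illustrate what DOI-based metadata retrieval involves, here is a minimal Python sketch. It is not refbase's importer (refbase is a PHP application, and the thread does not say which CrossRef interface version 0.9.5 uses). It assumes the present-day public Crossref REST API at `https://api.crossref.org/works/<DOI>` and its JSON field names; `parse_crossref_work` and `fetch_metadata` are hypothetical helper names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the public Crossref REST API, not the service refbase itself calls.
CROSSREF_WORKS_URL = "https://api.crossref.org/works/"


def parse_crossref_work(message):
    """Map the 'message' object of a Crossref works response to flat fields."""
    # Authors may lack a given name (e.g. consortia), so join only what exists.
    authors = [
        ", ".join(part for part in (a.get("family"), a.get("given")) if part)
        for a in message.get("author", [])
    ]
    date_parts = message.get("issued", {}).get("date-parts", [[None]])
    year = date_parts[0][0] if date_parts and date_parts[0] else None
    return {
        "authors": authors,
        "title": (message.get("title") or [""])[0],
        "journal": (message.get("container-title") or [""])[0],
        "year": year,
        "volume": message.get("volume", ""),
        "issue": message.get("issue", ""),
        "pages": message.get("page", ""),
        # Many Crossref records carry no abstract, so this is often empty.
        "abstract": message.get("abstract", ""),
    }


def fetch_metadata(doi):
    """Fetch and parse one record (requires network access)."""
    with urlopen(CROSSREF_WORKS_URL + quote(doi)) as response:
        payload = json.load(response)
    return parse_crossref_work(payload["message"])
```

Keeping the parsing separate from the network call makes it possible to check the field mapping against a saved response without contacting the service.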

      > Ideally, this DOI would come from a scan of the pdf,

      Right. For future reference, this was also discussed in this thread:

      https://sourceforge.net/forum/forum.php?thread_id=2955016&forum_id=218758
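For the idea of scanning a PDF for its DOI, a rough Python sketch of the text-matching step might look as follows. It assumes the PDF text has already been extracted by some other tool, and it uses a heuristic pattern that matches most modern DOIs but not every DOI ever issued. `DOI_PATTERN` and `find_dois` are hypothetical names, not part of refbase.

```python
import re

# Heuristic: "10.", a 4-9 digit registrant prefix, a slash, then a suffix.
# This covers most modern DOIs but is not a complete DOI grammar.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_dois(text):
    """Return candidate DOIs found in text, in order, without duplicates."""
    found = []
    for match in DOI_PATTERN.finditer(text):
        # Drop trailing sentence punctuation the pattern may have swallowed.
        # Caveat: this also trims a DOI that genuinely ends in ")" or ".".
        doi = match.group(0).rstrip(".,;)")
        # DOIs are case-insensitive, so compare in lowercase.
        if doi.lower() not in (d.lower() for d in found):
            found.append(doi)
    return found
```

In practice the first page of an article usually carries its own DOI, while later pages may contain DOIs of cited works, so a real implementation would probably restrict the search to the first page or let the user confirm the match.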

      > but even being able to retrieve it from a user-specified DOI, or
      > scanning it from the article title, would be an improvement and a
      > good starting point.

      If the user provides the DOI himself, why not use the DOI importer then? I.e.:

      1. Open the PDF and copy the DOI number.

      2. Go to the refbase import page, paste the DOI and hit the "Import" button.

      3. Upon successful metadata retrieval, the "Add Record" mask will open with the metadata prefilled. Attach your PDF file and click the "Save Record" button.

      Repeat this process for the next PDF file.

      > Alternatively, if the user could add a list of DOIs and associate
      > the added references with pdfs manually without having to hit 'back'
      > or find the individual references, this would work.

      The refbase DOI importer accepts multiple DOIs at once. Just enter multiple DOIs (separated with whitespace) and click the "Import" button. Then use the first list of steps mentioned here:

      https://sourceforge.net/forum/forum.php?thread_id=2957962&forum_id=351913

      No need to hit 'back' or find the individual references. Wouldn't this work for you?
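As an illustration of handling several whitespace-separated DOIs in one input, here is a small Python sketch. It is not refbase's own parsing code. The function name `split_dois` and the list of prefixes it strips are assumptions made for the example.

```python
def split_dois(raw_input):
    """Split whitespace-separated input into normalized, de-duplicated DOIs."""
    # Assumed prefixes that users commonly paste along with the DOI itself.
    prefixes = ("https://doi.org/", "http://dx.doi.org/", "doi:")
    result = []
    for token in raw_input.split():  # split() handles spaces, tabs and newlines
        lowered = token.lower()
        for prefix in prefixes:
            if lowered.startswith(prefix):
                token = token[len(prefix):]
                break
        # Keep the first occurrence of each DOI, preserving input order.
        if token and token not in result:
            result.append(token)
    return result
```

Preserving the input order matters for a batch workflow: the imported records then appear in the same sequence as the PDFs the user is about to attach.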

      > Perhaps as an incentive, we are prepared to make a donation if that
      > helps speed development.

      Thanks for the offer. Unfortunately, the time I have available for development does not depend on money. However, maybe you could pay someone else to implement this?

      > I strongly believe this is a very powerful feature, so it's not just
      > us that wants to make this happen!

      Yep, I fully agree with you. Batch import of PDFs with automatic metadata retrieval would be a killer feature. And actually I think this is what everyone will expect from a bibliographic tool (i.e. take for granted) in a few years or less.

      Matthias

       
    • Colon Grease
      2009-02-01

      Once that happens, this product becomes something I would easily pay money for. The combination of server pdf storage and the ease of use that automatic DOI scan and metadata retrieval would provide is available nowhere else.

      Thanks again for your help.