#1 Use with manuscript PDFs that were drafted in Word

v1.2-alpha
open
nobody
None
6 days ago
2026-01-19
Anonymous
No

Originally created by: Anto007
Originally owned by: lidianycs

Hi @lidianycs

Interesting tool that could be very useful for defending the academic literature from AI-generated hallucinated citations, and thank you for bringing it to the community. In my testing, it either fails to identify most references in some of my published article PDFs or, for PDFs generated from Word (my manuscripts that are under preparation or review), fails with the error "Try a standard digital PDF". It seems to work reasonably well with some published-paper PDFs, though. Is this perhaps an issue specifically related to the formatting of the references section? If so, what is the expected format?

Discussion

  • Anonymous

    Anonymous - 2026-01-19

    Originally posted by: lidianycs

    Hi @Anto007,

    Thank you for the kind words and for testing CERCA! I appreciate the detailed feedback. To answer your question: Yes, this is directly related to the formatting and the underlying engine CERCA uses.

    It currently relies on CERMINE, a Java library for extracting metadata and content from PDFs of academic publications.

    CERMINE was chosen deliberately to keep the project lightweight and deliver a working tool quickly.
    For the current version to work best, the PDF should use a standard academic layout with a clear, bolded header labeled "References" or "Bibliography."
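
    To illustrate why the header matters, here is a minimal sketch of header-based section detection on already-extracted text. It is not CERMINE's actual logic (CERMINE works on PDF layout, not plain text); the function name `find_reference_section` and the heading pattern are assumptions made up for this example.

```python
import re

# Hypothetical heading pattern: a line containing only "References" or
# "Bibliography", optionally preceded by a section number such as "7.".
HEADING = re.compile(
    r"^\s*(?:\d+\.?\s*)?(references|bibliography)\s*:?\s*$",
    re.IGNORECASE,
)


def find_reference_section(text):
    """Return the text after the first references heading, or None."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if HEADING.match(line):
            return "\n".join(lines[i + 1:]).strip()
    # No recognizable heading: a header-based parser has nothing to anchor on.
    return None
```

    If the exported PDF has no such heading, or uses a different label, a detector of this kind returns nothing, which is consistent with the failures described above.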

    Since parsing "wild" PDFs is a complex challenge, I am working on adding a "Manual Text" option. This will allow you to copy the reference list from your document and paste it directly into CERCA. This bypasses the strict layout requirements of the PDF parser while still giving you the full power to check the references against Crossref, OpenAlex, and other databases.
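
    As a rough sketch of what a pasted-text workflow could look like (not CERCA's implementation), the snippet below splits a pasted list into entries and builds a Crossref REST API lookup URL for each one. The helper names are hypothetical, and it assumes one reference per line; the `query.bibliographic` and `rows` parameters are part of the public Crossref `/works` endpoint. No request is sent here.

```python
import re
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def split_references(pasted):
    """Split pasted text into entries, assuming one reference per line."""
    entries = []
    for line in pasted.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between entries
        # Drop a leading marker such as "[12]" or "12." if present.
        line = re.sub(r"^(?:\[\d+\]|\d+\.)\s*", "", line)
        entries.append(line)
    return entries


def crossref_query_url(reference, rows=1):
    """Build (but do not send) a Crossref bibliographic search URL."""
    params = {"query.bibliographic": reference, "rows": rows}
    return CROSSREF_WORKS + "?" + urlencode(params)
```

    Each URL can then be fetched with any HTTP client, and the top match compared against the pasted entry; how strict that comparison should be is a design choice left open here.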

    I am adding this to the roadmap for the next release (v1.2).

    Thanks again for helping improve the tool!

     
  • Anonymous

    Anonymous - 2026-01-19
     
  • Anonymous

    Anonymous - 2026-01-19

    Originally posted by: cutiepiemayam244

    Image

     
  • Anonymous

    Anonymous - 2026-01-20

    Originally posted by: Anto007

    Thank you very much, @lidianycs, for getting back to me and for the detailed explanation. I look forward to testing the new version.

     
  • Anonymous

    Anonymous - 6 days ago

    Originally posted by: lidianycs

    @cutiepiemayam244 thanks for sharing your screenshot. I'll take a look at it over the weekend.

     