#196 Unknown FirstFileContentDateNormalized

Milestone: v8.5

Status: open

Owner: nobody

Labels: PDF (1) OCR (1) FirstFileContentDateNormalized (1) Associations (1)

Priority: 1

Updated: 2024-06-03

Created: 2022-12-31

Creator: Anonymous

Private: No

Hello, I LOVE this program, and if I knew anything about coding I'd be happy to help (where can I learn?). In the last year or two, I've noticed an ongoing inability to recognize dates in the content of PDF files. It used to work excellently, but lately it seems unable to identify them and gives the default output:

Unknown FirstFileContentDateNormalized

When I open the PDF, the text is all OCR'd and easily readable, and I can "highlight" or select the date/text, so it should be able to find it?

David Colbeth - 2022-12-31

following

divinity666 - 2023-09-19

I am puzzled by this issue. I am still using DropIt, and as long as the PDF files contain text (e.g. via OCR), the text is recognized properly.

You might want to check the pattern configuration in the configuration file.

David Colbeth - 2024-06-03

Hello Divinity, is there a better OCR program that you are using? I scan all of my documents, and the built-in OCR service for ScanSnap used to be great but seems to be inadequate lately.

- divinity666 - 2024-06-03
  
  We use pdftotext to extract content from PDF files, i.e. you can use any OCR tool on your PDF documents to recognize content (e.g. the one from ScanSnap), and DropIt will use exactly that recognized text for further evaluation.
  
  - David Colbeth - 2024-08-31
    
    Thanks for the reply.
    Unfortunately it was not helpful.
    That is exactly the issue I am experiencing.
    
    The PDF has been OCR'd and the text is readable, but DropIt does not find it
    for some reason.
    
    Please help.
    
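Since DropIt reads PDF content through pdftotext (see divinity666's comment above), one way to narrow this down is to run pdftotext by hand on an affected file and check whether a date actually shows up in its output. The following is a minimal sketch of that check, not DropIt's own logic: the two date patterns, the function names, and the choice of the `-layout` option are assumptions made for illustration, and DropIt's configured patterns may well differ.

```python
import re
import subprocess
import sys
from datetime import date

# Hypothetical patterns for illustration only; DropIt's own patterns may differ.
DATE_PATTERNS = [
    # ISO style, e.g. 2024-06-03
    (re.compile(r"\b(\d{4})-(\d{1,2})-(\d{1,2})\b"), ("y", "m", "d")),
    # US style, e.g. 06/03/2024
    (re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b"), ("m", "d", "y")),
]


def extract_text(pdf_path):
    """Run the pdftotext CLI; the trailing '-' sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def first_date(text):
    """Return the first valid date found in text as YYYY-MM-DD, or None."""
    candidates = []
    for pattern, order in DATE_PATTERNS:
        for match in pattern.finditer(text):
            parts = dict(zip(order, map(int, match.groups())))
            try:
                found = date(parts["y"], parts["m"], parts["d"])
            except ValueError:
                continue  # digits matched, but not a real calendar date
            candidates.append((match.start(), found))
    if not candidates:
        return None
    # Pick the match that appears earliest in the text.
    return min(candidates, key=lambda c: c[0])[1].isoformat()


if __name__ == "__main__" and len(sys.argv) > 1:
    text = extract_text(sys.argv[1])
    print(first_date(text) or "no date found in pdftotext output")
```

If the script reports no date for a PDF whose date can be selected in a viewer, printing the raw pdftotext output may show why: OCR layers sometimes store look-alike characters (a letter O in place of a zero, for instance) or a date format that none of the patterns cover.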
