
XAMPP on Windows: unable to index PDF

Bob
2018-05-16
2018-06-01
  • Bob

    Bob - 2018-05-16

    Hello,

    I copied the configuration line for PDF indexing from the demo server.
    I installed pdftotext for Windows and it works fine.

    In the PDF indexing configuration I put this line:
    C:\xampp\htdocs\pdftotext.exe -nopgbrk %s | sed -e's/[a-zA-Z0-9.]{1} / /g'
    This creates the txt file but nothing more.

    I did some research and the "sed" command is not available on Windows. PowerShell does have a "where" command, but PHP runs cmd.exe, not PowerShell.

    How can I use plain-text indexing under Windows?

    Sincerely

     
  • Uwe Steinmann

    Uwe Steinmann - 2018-05-17

    The sed call is not required; it just filters out some terms that are not worth indexing. I'd suggest trying it without the sed and seeing if it works.
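
Since sed is not available in cmd.exe, the filtering step could also be done by a small script. The following is a minimal sketch in Python, not the filter used on the demo server. It assumes the intent of the sed expression is to drop one-character tokens (a single letter, digit, or dot) from the extracted text. The function name and the script itself are hypothetical.

```python
import re
import sys

# Assumption: a "term not worth indexing" is a token that consists of
# exactly one letter, digit, or dot.
_SINGLE_CHAR = re.compile(r"[a-zA-Z0-9.]")


def drop_single_char_terms(line: str) -> str:
    """Return the line without one-character tokens, joined by single spaces."""
    kept = [tok for tok in line.split() if not _SINGLE_CHAR.fullmatch(tok)]
    return " ".join(kept)


def main() -> None:
    # Read the extracted text from stdin and write the filtered text to
    # stdout, so the script can sit at the end of a pipe the way sed would.
    for line in sys.stdin:
        sys.stdout.write(drop_single_char_terms(line) + "\n")


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as filter_terms.py, it might replace the sed stage like this: C:\xampp\htdocs\pdftotext.exe -nopgbrk %s - | python filter_terms.py. The trailing "-" tells pdftotext to write the text to standard output instead of to a .txt file next to the PDF, which may matter if the indexer reads the command's output. This assumes python is on the PATH seen by the cmd.exe that PHP starts.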

     
  • Bob

    Bob - 2018-05-22

    Hello,

    Thank you for the reply. I have deleted the sed command and it works.
    But when I search for a term, it doesn't find any document.
    I can see the txt file with the text extracted from the PDF.

    Any idea?

     

    Last edit: Bob 2018-05-22
  • Bob

    Bob - 2018-05-22

    Sorry, duplicated message xD

     

    Last edit: Bob 2018-05-22
  • Bob

    Bob - 2018-06-01

    Up

     
