/seeddms/out/out.IndexInfo.php shows me 10 documents and 42 terms, with the fields document_id, mimetype, origfilename, owner and title only, but on the demo site https://demo.seeddms.org/out/out.IndexInfo.php it also includes a content attribute. Could this be the reason why I am not able to use full-text search? If so, what am I missing?
It's not indexing the content of the documents. Make sure the configured programs that turn your documents into plain text work. That means 1. they exist and 2. they are in the right place. The 'Advanced' tab in the settings contains the configuration for those programs.
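A quick way to check point 1 is to ask whether each converter can be found on the search path. The following is only a minimal sketch in Python, not part of SeedDMS; the helper name `missing_programs` and the program list are assumptions for illustration, so substitute the programs named in your own 'Advanced' tab.

```python
import shutil


def missing_programs(programs):
    """Return the entries of `programs` that cannot be found on PATH."""
    return [name for name in programs if shutil.which(name) is None]


if __name__ == "__main__":
    # Hypothetical list: replace with the converters from your own configuration.
    for name in missing_programs(["pdftotext", "sed"]):
        print(f"not found on PATH: {name}")
```

Keep in mind that the web server may run with a different PATH than your own login, so a program found here can still be invisible to the indexer.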
The converters are doing just fine, at least for those I managed to check. Example:
I inserted code to print $cmd in "C:\pearl\pear\SeedDMS\SQLiteFTS\IndexedDocument.php" before proc_open() is called. I can explain how if needed.
The command looks like this:
pdftotext -enc UTF-8 -nopgbrk E:/full_content/\1048576/52/1.pdf - | sed -e "s/ [a-zA-Z0-9.]{1} / /g" -e "s/[0-9.]//g"
I only changed the ' to " from the original (default) config in the 'Advanced' tab.
I ran this command in cmd and Git Bash and it works fine in both of them, printing the text to stdout, so what could be the problem?
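A command that works in an interactive shell can still fail when it is started non-interactively, so it may help to run it as a child process and look at the exit code and stderr as well as stdout. The following is a rough sketch in Python, not the SeedDMS code itself; `run_converter` is a hypothetical helper and the `echo hello` command is only a placeholder for the string printed from $cmd.

```python
import subprocess


def run_converter(cmd, timeout=60):
    """Run `cmd` through the system shell and return (exit_code, stdout, stderr)."""
    result = subprocess.run(
        cmd,
        shell=True,                # the command contains a pipe, so a shell is needed
        stdin=subprocess.DEVNULL,  # no interactive input, as in a web server process
        capture_output=True,
        timeout=timeout,
    )
    return (
        result.returncode,
        result.stdout.decode("utf-8", errors="replace"),
        result.stderr.decode("utf-8", errors="replace"),
    )


if __name__ == "__main__":
    # Placeholder command: paste the exact string printed from $cmd here.
    code, out, err = run_converter("echo hello")
    print("exit code:", code)
    print("stdout characters:", len(out))
    print("stderr:", err.strip() or "(empty)")
```

If stdout comes back empty here while the same command prints text in your terminal, the stderr text is the first place to look.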
You write that 10 documents are being indexed. Which file types are those documents? Which type of file isn't indexed?
**Apache Tika** ... totally solved my problem with full-text search, and I think it's far easier than going through all the converters, especially for Windows users.
I don't suppose you have any setup instructions for Tika, do you? I am having the same sort of issues with full-text search on Windows.