Hi,
I make intensive use of the classifier. My classifier has more than 80,000 items.
It works great and the URLs I crawl are correctly classified.
But it takes 8 hours to start OSS because of my large classifier.
Is there any configuration tip to reduce the start time?
I'm running OSS on a quad-core Xeon 2.8 GHz / 20 GB RAM / Ubuntu 12.04
16 GB allocated to OSS
Thanks,
Laurent
Hi Laurent,
It is probably an issue with the XML/XPath parser that is in charge of loading the configuration. It is powerful, but it does not work efficiently with large XML documents.
Can you send us the XML configuration file of your classifier?
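To illustrate the kind of difference involved, here is a minimal sketch in Python. It is not OpenSearchServer's actual code, and the `classifier`/`item` element names and the `name` attribute are hypothetical, not the real configuration schema. It shows a streaming parse that reads each item once as it goes by, rather than evaluating XPath expressions against a large in-memory tree:

```python
import io
import xml.etree.ElementTree as ET


def load_items_streaming(source):
    """Yield (name, query) pairs from a classifier-like XML stream in one pass."""
    # iterparse emits an "end" event once an element has been fully read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            yield elem.get("name"), (elem.text or "").strip()
            # Drop the item's text and attributes once they have been read.
            # The emptied element itself stays attached to the root.
            elem.clear()


# Hypothetical sample document, for illustration only.
sample = (
    b"<classifier>"
    b"<item name='sports'>football OR tennis</item>"
    b"<item name='tech'>linux</item>"
    b"</classifier>"
)
items = list(load_items_streaming(io.BytesIO(sample)))
```

With this approach the cost grows roughly linearly with the number of items, which is the property that matters for a classifier with tens of thousands of entries.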
Here is my classifier.
Thank you, it was useful. Initially, we did not design the Classifier to handle hundreds of thousands of items.
So, we made the required improvements. Now, the instance starts in a few minutes.
I created a ticket to track this issue:
http://sourceforge.net/p/opensearchserve/bug-report/185/
The fix is available in v1.4-rc4, which has just been published. Let us know if it works in your environment.
Hi Emmanuel,
I've just upgraded to v1.4-rc4. It starts in less than 20 seconds with 10 classifiers, 3 of them with more than 80,000 items.
No problem so far.
Thanks a million for your very fast answer.
Laurent