Web crawler over https with opensearchserver-1.5.2-SNAPSHOT-b394

  • Poncho

    Poncho - 2013-12-12

    Hello everybody, I'm a beginner with OpenSearchServer.

    Does anybody know if it's possible to crawl the web over the HTTPS protocol?
    Here is the situation: we have a GLPI server and we would like to index all of its content (especially the knowledge base) with the web crawler. I have tried all the options in the "crawler > web > authentication" tab using an existing UID and password, but unfortunately it doesn't work. The only page that gets indexed is the home page.
    Nothing is written to the log file.
    I'm using version opensearchserver-1.5.2-SNAPSHOT-b394.

  • Emmanuel Keller

    Emmanuel Keller - 2013-12-31

    The web crawl over SSL (HTTPS) should work.

    We made several improvements by integrating the latest HttpClient version and by handling crawl errors better in the URL browser.

    Can you try the latest 1.5.2 version and check the status in the URL Browser?

    Last edit: Emmanuel Keller 2013-12-31

  • Poncho

    Poncho - 2014-01-02

    Hello Emmanuel,
    Thanks for your answer, and sorry for taking up your time. In fact, I think I have found an explanation myself. The problem is that GLPI uses web form authentication, and I've read here https://github.com/jaeksoft/opensearchserver/issues/22 that this feature still doesn't work.
    So I will wait for progress on this in the project. Thanks again, Emmanuel.
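
    For anyone who lands on this thread, here is a rough Python sketch of why credentials from an authentication tab may not be enough for a form-based login. It is not OpenSearchServer code and does not describe how its crawler works internally. HTTP Basic authentication sends the credentials in a header on every request, whereas a web form login needs a POST to the login page followed by reuse of the session cookie. The field names login_name and login_password and the example URL are assumptions for illustration only; the real names must be read from the login page's HTML.

```python
import base64
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def basic_auth_header(username, password):
    """Build the header used by HTTP Basic authentication (sent on every request)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def build_form_login(login_url, username, password,
                     user_field="login_name", pass_field="login_password"):
    """Prepare a form-based login: a POST request plus a cookie-aware opener.

    The default field names are hypothetical placeholders, not confirmed
    GLPI form fields.
    """
    jar = CookieJar()
    # The opener stores the session cookie returned by the login page and
    # sends it back on later requests made through the same opener.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    body = urllib.parse.urlencode({user_field: username, pass_field: password})
    request = urllib.request.Request(login_url, data=body.encode("ascii"), method="POST")
    return opener, request


if __name__ == "__main__":
    # Nothing is sent over the network here; this only shows what would be sent.
    print(basic_auth_header("user", "pass"))
    opener, request = build_form_login("https://glpi.example.org/login.php", "user", "pass")
    print(request.get_method(), request.full_url, request.data)
    # To actually log in and then fetch a protected page, one would call
    # opener.open(request) and then opener.open(<protected page URL>).
```

    A crawler that only supports the first style will typically be sent back to the login page for every protected URL, which would match the symptom of only the home page being indexed.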

