Links with special chars (umlaut, etc.)

  • Heiko

    Heiko - 2012-03-09

    I noticed that URLs containing umlauts (ö, ä, ü and others), which appear URL-encoded as e.g. %FC, were converted to \0x0FC.
    These URLs were rejected by our indexer (Solr 3.5); the old Solr 1.4 still accepts them.
    It would be helpful if these characters were left untouched and remained URL-encoded.

    PS: PHPCrawl is a great tool. Thanks a lot. Heiko

  • Uwe Hunfeld

    Uwe Hunfeld - 2012-03-09

    Hey Heiko again,

    could you please post an example website containing (working) links with umlauts (so I can understand what's going on there)?

    Thanks again and best regards,


  • Uwe Hunfeld

    Uwe Hunfeld - 2012-03-14

    Hi Heiko,

    I just noticed that PHPCrawl has some problems with links containing umlauts (and other special characters) in general.
    HTTP requests for these URLs sometimes don't work as expected (depending on the character encoding of the referring document and other encoding issues).

    Will try to fix that.
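
To illustrate the behaviour discussed above (existing escapes such as %FC stay untouched, raw umlauts are percent-encoded using the charset of the referring page), here is a minimal sketch in Python. It is not PHPCrawl code and not the fix that was announced; the function name `encode_link`, the `SAFE` set, and the example.com URLs are assumptions made for illustration only.

```python
from urllib.parse import quote

# Characters left unescaped. '%' is included so that escapes already
# present in the link (e.g. %FC) are not encoded a second time.
# This set is an assumption for the sketch, not taken from PHPCrawl.
SAFE = "%/:?#[]@!$&'()*+,;="


def encode_link(link: str, page_charset: str = "iso-8859-1") -> str:
    """Percent-encode raw non-ASCII characters in `link` using the
    charset of the referring page, keeping existing %XX escapes."""
    return quote(link, safe=SAFE, encoding=page_charset)


if __name__ == "__main__":
    # Link found on an ISO-8859-1 page: the existing %FC is kept and
    # the raw 'ü' becomes %FC as well.
    print(encode_link("http://example.com/f%FCr/müller.html"))
    # Same raw character on a UTF-8 page: 'ü' becomes %C3%BC.
    print(encode_link("http://example.com/müller.html", "utf-8"))
```

The same raw character yields different escapes depending on the page charset, which is one plausible reason why requests for such links only sometimes work.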



