I noticed that URLs containing umlauts (ö, ä, ü and others) that are percent-encoded, e.g. as %FC, were converted to \0x0FC.
These URLs were rejected by our indexer (Solr 3.5); the old Solr 1.4 still accepts them.
It would be helpful if these characters were left untouched and remained URL-encoded.
PS: PHPCrawl is a great tool. Thanks a lot. Heiko
Hey Heiko again,
could you please post an example website containing (working) links with umlauts, so that I can reproduce what's going on there?
Thanks again and best regards,
You can try these documents:
I just noticed that PHPCrawl has some problems with links containing umlauts (and other special characters) in general.
HTTP requests for these URLs sometimes don't work as expected (depending on the character encoding of the referring document and other encoding issues).
Will try to fix that.
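
To illustrate the encoding dependence described above, here is a minimal sketch in Python (not PHPCrawl's actual code). It assumes the crawler knows the charset of the referring page; the function name `encode_link` and the example.com URLs are hypothetical. Existing escapes such as %FC are left untouched, and raw umlauts are percent-encoded using the page's charset:

```python
from urllib.parse import quote

# Characters that must pass through unchanged: URL delimiters plus '%',
# so that already-encoded sequences such as %FC are not encoded a second time.
SAFE_CHARS = "/:?#[]@!$&'()*+,;=%~"

def encode_link(url: str, page_charset: str = "utf-8") -> str:
    """Percent-encode raw non-ASCII characters in a link.

    page_charset is assumed to be the character encoding of the
    document the link was found in.
    """
    return quote(url, safe=SAFE_CHARS, encoding=page_charset)

# An already-encoded link stays exactly as it is.
print(encode_link("http://example.com/m%FCller.html"))
# The same raw umlaut yields different bytes depending on the page charset.
print(encode_link("http://example.com/müller.html", "iso-8859-1"))
print(encode_link("http://example.com/müller.html", "utf-8"))
```

The last two calls show why the charset of the referring document matters: the same visible link produces %FC on a Latin-1 page and %C3%BC on a UTF-8 page, and a server may only answer to one of them.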