Hi,
Can somebody check my config file to see what's wrong with it, please? It
seems like it doesn't even get into the second loop!
http://pastebin.com/x1pKdEDq
Thanks in advance
Again, you didn't give us the URL that you are crawling, so I couldn't test it
:) If it doesn't enter the second loop, maybe the list in that loop is empty?
First check that.
Try checking the first loop, please.
Again, I'm sorry for disturbing you...
org.apache.commons.httpclient.util.URIUtil.encodeQuery returned bad links in
the second loop; replacing it with sys.fullURL solved the problem...
It took me more than a day to figure it out... but I'm getting better at
error detection, and I hope this will be my last "useless" spam xD
Thanks again guys :)
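For anyone who lands here with the same symptom, here is a minimal sketch of why escaping a link and building a full URL are not the same thing. It is plain Python, not Web-Harvest or HttpClient code, and it assumes the bad links were relative hrefs (the thread doesn't say what they looked like). `page_url` and `href` are made-up values, and `quote` / `urljoin` are only stand-ins for the two approaches.

```python
from urllib.parse import quote, urljoin

# Hypothetical values, only for illustration.
page_url = "http://example.com/list/page.html?id=1"
href = "../item/detail.html?name=a b"

# Escaping alone fixes illegal characters, but the link stays relative,
# so it cannot be fetched on its own.
escaped_only = quote(href, safe="/?=&")

# Resolving against the page URL gives an absolute link that can be fetched.
full = urljoin(page_url, escaped_only)

print(escaped_only)
print(full)
```

If the links in the second loop come from href attributes, printing them before fetching is a quick way to see whether they are absolute.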
BTW, in the generated XML (using this code on two websites) I get malformed
characters:
1st situation: HTML source in UTF-8, XML stored using encoding="UTF-8", but
the "ć" in "Ajdarević" comes out garbled.
2nd situation: HTML source in ISO-8859-1 (contains a lot of accented letters:
éàè...), and I want to store the data using encoding="UTF-8". How can I do
that correctly?
Thanks in advance again :)
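A minimal sketch of what is probably going on in both situations, in plain Python rather than Web-Harvest configuration. The assumption is that somewhere the bytes are read with a different charset than the one they were written in. The sample strings are stand-ins for the downloaded pages, and `cp1252` is just one possible wrong charset.

```python
name = "Ajdarević"

# 1st situation: UTF-8 bytes decoded with the wrong charset come out garbled.
utf8_bytes = name.encode("utf-8")
garbled = utf8_bytes.decode("cp1252")    # wrong charset, on purpose
restored = utf8_bytes.decode("utf-8")    # right charset gives the text back

# 2nd situation: decode the page with its real charset first,
# then encode the result as UTF-8 when writing the XML.
latin1_bytes = "éàè".encode("iso-8859-1")   # stand-in for the page bytes
text = latin1_bytes.decode("iso-8859-1")
xml = '<?xml version="1.0" encoding="UTF-8"?>\n<name>' + text + "</name>\n"
xml_bytes = xml.encode("utf-8")

print(garbled)
print(restored)
print(xml_bytes.decode("utf-8"))
```

So the thing to check is which charset the page is read with and which charset the file is written with. The encoding="UTF-8" in the XML header only declares the encoding; it does not convert anything.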