2007-08-13 14:53:12 UTC
Hello,
I have been successfully using Web-Harvest for scraping for some time.
I recently ran into a problem when scraping multiple pages and processing them with XPath expressions.
I did some basic profiling, and apparently the class
org.webharvest.runtime.variables.NodeVariable is the one that keeps growing with every downloaded page, which eventually ends in an OutOfMemoryError for the scraping process.
Maybe there is a way to fix this issue.
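To illustrate the behaviour I would hope for, here is a rough sketch in plain JDK code. It is not Web-Harvest code and does not use its API; the class name PageTitles and the method extract are made up for this example. The idea is only that each page is parsed, the XPath matches are copied out as plain strings, and the parsed tree is dropped before the next page is handled, so memory should not grow with the number of pages.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only; this is NOT how Web-Harvest is implemented.
class PageTitles {

    // Parses each page, copies the matched text out as Strings,
    // and keeps no reference to the parsed tree between pages.
    static List<String> extract(List<String> pages, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> results = new ArrayList<>();
        for (String page : pages) {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(page)));
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                // Only the text is retained, not the node itself.
                results.add(nodes.item(i).getTextContent());
            }
            // doc and nodes go out of scope here and can be garbage collected.
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        List<String> pages = List.of(
                "<html><head><title>Page 1</title></head></html>",
                "<html><head><title>Page 2</title></head></html>");
        System.out.println(extract(pages, "//title"));
    }
}
```

I do not know whether something similar is possible inside the scraper runtime, but it shows the kind of per-page cleanup I have in mind.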
Kind regards,
Mile