Hi,
I have a serious issue when requesting detail information from a specific URL.
It seems that the links are created in an onclick event.
PS20111026_INCOMPANYSEV
I've read a lot about it and I've seen that the HtmlUnit API can help me get
the correct link. Could anyone show me an example of integrating it in a
Web-Harvest script?
I'd like to integrate HtmlUnit in a Web-Harvest script and, once HtmlUnit has
led me to the detail information, come back and parse that URL with Web-Harvest.
Thanks in advance
Use WH Plugin API to make a plugin. See the very basic example here -
http://web-harvest.sourceforge.net/plugins.php
It is very easy to do. Also take a look at the existing plugins to get some
clues.
Note that since WH 2.1 the Plugin API has changed slightly, although it is
nothing special and you will easily get it when you look at the source code.
Basically what you need to do is inherit from
org.webharvest.runtime.processors.WebHarvestPlugin and implement some
methods. Then you need to register your plugin, either via the WH IDE or (if you
use WH programmatically) via
org.webharvest.definition.DefinitionResolver.registerPlugin(Class pluginClass,
String uri).
Also, in WH 2.1 the usage is a bit different: you need to use the custom
XML namespace that your plugin is associated with.
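
To make the steps above a bit more concrete, here is a rough, self-contained
sketch of the pattern (subclass a plugin base, implement its methods, register
the class under a namespace URI). Please note that PluginBase and Registry are
hypothetical stand-ins I made up so the example runs without Web-Harvest on the
classpath; they only mimic the idea of WebHarvestPlugin and
DefinitionResolver.registerPlugin(Class, String), and the real method signatures
differ, so check the source code. The namespace URI and the "resolve-link" name
are also invented. The regex is just a placeholder for the HtmlUnit call and
assumes the onclick handler contains the target URL as a quoted string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Self-contained sketch: these types are HYPOTHETICAL stand-ins, not Web-Harvest classes.
final class PluginSketch {

    /** Stand-in for org.webharvest.runtime.processors.WebHarvestPlugin (assumed shape). */
    public abstract static class PluginBase {
        /** Element name used for this plugin inside a script. */
        public abstract String getName();

        /** Work done when the element runs; the real API has a different signature. */
        public abstract String execute(String input);
    }

    /** Example plugin. A real one would drive HtmlUnit here instead of a regex. */
    public static class ResolveLinkPlugin extends PluginBase {
        // Matches the first single- or double-quoted string literal.
        private static final Pattern QUOTED = Pattern.compile("['\"]([^'\"]+)['\"]");

        @Override
        public String getName() {
            return "resolve-link";
        }

        @Override
        public String execute(String onclick) {
            Matcher m = QUOTED.matcher(onclick);
            // Empty string when the handler holds no quoted literal.
            return m.find() ? m.group(1) : "";
        }
    }

    /** Stand-in for DefinitionResolver.registerPlugin(Class pluginClass, String uri). */
    public static class Registry {
        private final Map<String, Class<? extends PluginBase>> plugins = new HashMap<>();

        public void registerPlugin(Class<? extends PluginBase> pluginClass, String uri) {
            // Instantiate once to learn the element name, then key by namespace + name.
            plugins.put(uri + "#" + newInstance(pluginClass).getName(), pluginClass);
        }

        public PluginBase create(String uri, String name) {
            Class<? extends PluginBase> pluginClass = plugins.get(uri + "#" + name);
            if (pluginClass == null) {
                throw new IllegalArgumentException("No plugin " + name + " in namespace " + uri);
            }
            return newInstance(pluginClass);
        }

        private static PluginBase newInstance(Class<? extends PluginBase> pluginClass) {
            try {
                return pluginClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Plugin needs an accessible no-arg constructor", e);
            }
        }
    }

    public static void main(String[] args) {
        String ns = "http://example.com/my-plugins"; // hypothetical namespace URI
        Registry registry = new Registry();
        registry.registerPlugin(ResolveLinkPlugin.class, ns);
        PluginBase plugin = registry.create(ns, "resolve-link");
        System.out.println(plugin.execute("window.location='detail.jsp?id=42'"));
    }
}
```

The URI passed at registration plays the role of the custom XML namespace
mentioned above: a script would declare that namespace and call the plugin
element through it. If the onclick handler builds the URL dynamically, the regex
placeholder will not be enough and the body of execute is where HtmlUnit would
come in.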