This is tricky, since some sites use a different action URL once you visit or
log in to a page, or they look for a certain User-Agent in the HTTP header. I
can save you some time in this respect (note I'm using the 2.1 trunk, so the
XML will look different). Consider the following code fragment, which visits a
page to extract the action URL and other POST parameters and then posts the
required parameters (also look for my cookie fix for the 2.1 trunk code a few
posts down):
<config xmlns="http://web-harvest.sourceforge.net/schema/2.1/core" charset="UTF-8">

    <def var="userAgent">Mozilla/5.0 (Windows; U; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)</def>
    <def var="xmlDoc"/>
    <def var="postUrl"/>
    <def var="bowStEvent"/>

    <script><![CDATA[
        // Date format
        java.text.SimpleDateFormat simpleDateFormat = new java.text.SimpleDateFormat("MM/dd/yyyy");
        // Get current Date as String
        String dateStr = simpleDateFormat.format(new Date());
    ]]></script>

    <!-- Parse common page elements into variables defined above -->
    <function name="parseCommon">
        <!-- Get POST URL from form's action attribute -->
        <set var="postUrl">
            <xpath expression="//form/@action">
                <value-of expr="${xmlDoc}"/>
            </xpath>
        </set>
        <!-- Get _bowStEvent from input value attribute -->
        <set var="bowStEvent">
            <xpath expression="//input[@name='_bowStEvent']/@value">
                <value-of expr="${xmlDoc}"/>
            </xpath>
        </set>
    </function>

    <!-- Visit page, so we can extract common post elements -->
    <set var="xmlDoc">
        <html-to-xml>
            <http method="get" url="http://www.manateesheriff.com/wps/portal/PublicInterest/ActiveCalls" follow-redirects="true">
                <http-header name="User-Agent"><value-of expr="${userAgent}"/></http-header>
            </http>
        </html-to-xml>
    </set>

    <!-- Parse common page elements into variables -->
    <call name="parseCommon"/>

    <!-- Get a list of TR elements of name DataContainer using URL postUrl -->
    <set var="xmlDoc">
        <xpath expression='//tr[@name="DataContainer"]'>
            <html-to-xml>
                <http method="post" url="www.manateesheriff.com${postUrl}" follow-redirects="true">
                    <http-header name="User-Agent"><value-of expr="${userAgent}"/></http-header>
                    <http-param name="_bowStEvent"><value-of expr="${bowStEvent}"/></http-param>
                    <http-param name="Street"></http-param>
                    <http-param name="DateFrom"><value-of expr="${dateStr}"/></http-param>
                    <http-param name="DateTo"><value-of expr="${dateStr}"/></http-param>
                    <http-param name="submit_button">Submit</http-param>
                </http>
            </html-to-xml>
        </xpath>
    </set>

    <file action="write" path="/home/someuser/mcso_active_calls_output.xml">
        <value-of expr="${xmlDoc}"/>
    </file>
</config>
Anonymous
-
2011-11-15
Thank you for your code. I'll analyze it and try to do something similar.
You are using a parameter, follow-redirects.
What is it for?
Anonymous
-
2011-11-15
An important detail: where (and how) do I get the 2.1 trunk version?
When you compile the Maven project you will notice that saxon/saxon-dom are
not in the Maven repos. You need to add them to your local Maven or Nexus
repository; then it will build without error.
Anonymous
-
2011-11-16
Compiling the project is a new world to me... sorry.
Is there any specific or general help that explains how to do it (get the
local repository and compile)?
Thanks
Just scan the forums or google "Maven build". Install Maven as directed, put
mvn in your execution path, cd to the directory containing the pom.xml file,
and run mvn with the appropriate target.
Compiling the project is a new world to me... sorry.
Eric, you would be surprised how easy that world is in WH 2.1 :)
As Steven said in the previous post, just install Maven and then execute the
"mvn install" command from the WH project folder (where pom.xml is located).
Anonymous
-
2011-11-17
I'll look into it ...
Thanks for your explanations.
Hello,
Like some others, I have some problems with the HTTP POST method...
I have identified the different parameters to post, but it still does not work.
Is it possible that the destination page is looking for a specific form name,
and in that case how could I define it?
Thanks
follow-redirects does what it says: it follows any HTTP redirects the site
issues. It defaults to true in 2b1 but to false in the 2.1 trunk.
You can get the 2.1 trunk from
https://web-harvest.svn.sourceforge.net/svnroot/web-harvest/trunk
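
Since the follow-redirects default differs between versions, one way to avoid surprises is to set it explicitly on every http element instead of relying on the default. Below is a minimal sketch in the 2.1 trunk syntax used in the fragment earlier in this thread. The variable name page and the example.com URL are placeholders, not taken from the original post.

```xml
<config xmlns="http://web-harvest.sourceforge.net/schema/2.1/core" charset="UTF-8">
    <!-- "page" is a hypothetical variable name used only for this sketch -->
    <def var="page"/>

    <set var="page">
        <html-to-xml>
            <!-- follow-redirects is stated explicitly, so the result does not
                 depend on which default the installed version uses.
                 The URL is a placeholder. -->
            <http method="get" url="http://www.example.com/" follow-redirects="true"/>
        </html-to-xml>
    </set>
</config>
```

Set follow-redirects="false" the same way if you want to inspect the redirect response itself rather than the page it points to.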
Guys, the build should be fixed now :)