Re: [Htmlparser-user] Harvester
From: Somik R. <so...@ya...> - 2003-02-23 05:22:19
You could go through the docs at
http://htmlparser.sourceforge.net/docs/index.php/LinkExtraction

Forms and frames are represented by HTMLFormTag and HTMLFrameTag. You could write your own visitor that collects form tags and string nodes and, on encountering a frame tag, opens a new parser object for the frame URL and visits it with the same visitor (probably a different object).

Try out the programs on that page, and it should be easy. Feel free to post here if you face any problems.

Regards,
Somik

----- Original Message -----
From: "Mohd-Taqiyuddin Zalfan" <mt...@ec...>
To: <htm...@li...>
Sent: Saturday, February 22, 2003 10:44 AM
Subject: [Htmlparser-user] Harvester

> hi,
>
> I would like to write a program that can harvest certain information (mostly
> text) from web pages. Some of the pages require input from the user (they
> contain a <form> tag) to get more information. Some pages are just plain
> text, and some are in frames. How can I write a single harvester that
> handles all three types of pages?
>
> Below are the sample pages that I want to harvest (harvest the questions and
> get the correct answers):
>
> i) with a form: http://developer.java.sun.com/developer/Quizzes/jbasics1-1/
> ii) plain text: http://www.jchq.net/mockexams/exam3.htm
> iii) with frames: http://www.angelfire.com/or/abhilash/Main.html
>
> Hope you can give me some advice on how to do this. Thank you.
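
For readers following the thread, the visitor idea in the reply could be sketched roughly as below. This is only an illustration of the pattern, not the htmlparser API: every type and method name here (Visitor, Node, PageSource, Harvester, visitText, visitForm, visitFrame) is hypothetical, and PageSource is a stand-in for "open a new parser on this URL". In real code the callbacks would presumably receive the library's own tag objects, such as HTMLFormTag and HTMLFrameTag, so check the docs linked above for the actual signatures.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these types belong to the htmlparser library.
class HarvesterSketch {

    /** Callbacks for the three node kinds the harvester cares about. */
    interface Visitor {
        void visitText(String text);
        void visitForm(String action);
        void visitFrame(String src);
    }

    /** A parsed node; accept() dispatches to the matching callback. */
    interface Node {
        void accept(Visitor v);
    }

    // Small factories so the demo pages are easy to build.
    static Node text(String s)  { return v -> v.visitText(s); }
    static Node form(String a)  { return v -> v.visitForm(a); }
    static Node frame(String s) { return v -> v.visitFrame(s); }

    /** Stand-in for "open a parser on this URL": returns that page's nodes. */
    interface PageSource {
        List<Node> parse(String url);
    }

    /** Collects text and form actions, following frames into their own pages. */
    static class Harvester implements Visitor {
        final List<String> texts = new ArrayList<>();
        final List<String> formActions = new ArrayList<>();
        private final PageSource source;
        private final Set<String> seen = new HashSet<>();

        Harvester(PageSource source) { this.source = source; }

        void harvest(String url) {
            if (!seen.add(url)) return; // skip pages already visited (frame cycles)
            for (Node n : source.parse(url)) n.accept(this);
        }

        @Override public void visitText(String text) {
            String trimmed = text.trim();
            if (!trimmed.isEmpty()) texts.add(trimmed);
        }

        @Override public void visitForm(String action) { formActions.add(action); }

        // A frame is just another page: parse its URL and keep visiting.
        @Override public void visitFrame(String src) { harvest(src); }
    }

    public static void main(String[] args) {
        // In-memory pages standing in for the three kinds of sites in the question.
        Map<String, List<Node>> pages = Map.of(
            "main.html", List.of(frame("quiz.html"), frame("notes.html")),
            "quiz.html", List.of(text("Question 1"), form("/submit")),
            "notes.html", List.of(text("Plain text page")));
        Harvester h = new Harvester(url -> pages.getOrDefault(url, List.of()));
        h.harvest("main.html");
        System.out.println(h.texts);
        System.out.println(h.formActions);
    }
}
```

The sketch reuses one Harvester object for framed pages so that everything lands in the same lists; the reply suggests a separate visitor object per frame, which would work just as well if the results are merged afterwards. The set of visited URLs is an addition of this sketch, there to stop frames that point back at each other from looping forever.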