Thread: [Htmlparser-cvs] htmlparser/docs/samples index.html,1.5,1.6
Brought to you by:
derrickoswald
From: <der...@us...> - 2004-01-04 03:23:12
Update of /cvsroot/htmlparser/htmlparser/docs/samples
In directory sc8-pr-cvs1:/tmp/cvs-serv11427/htmlparser/docs/samples

Modified Files:
	index.html
Log Message:
Web site revamp, phase 1. Main and first level pages are refurbished. The wiki is still to do.
Fixed bug #865279 Documentation
The samples directory is now orphaned and no longer shipped.

Index: index.html
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/docs/samples/index.html,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** index.html 2 Sep 2003 00:41:56 -0000 1.5
--- index.html 4 Jan 2004 03:23:08 -0000 1.6
***************
*** 2,28 ****
  <html>
  <head>
! <title>Sample Programs</title>
! <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  </head>
  <body>
! <h3><font size="4"><strong>Sample Programs </strong></font></h3>
! <strong>WARNING: These examples are outdated. Except for the embedded links article,
! they need to be reworked to the most recent version of HTML Parser.
! <p>Please see <a
! href="http://htmlparser.sourceforge.net/docs/index.php/SamplePrograms">WikiPages
! Sample Programs</a> for more recent versions.</strong>
! <p>We provide below some commonly-used sample programs that were created using
! HTMLParser. Going through these programs will give you an idea of the design
! of the parser, and its expected usage.</p>
! <p><a href="links.html">Extracting Links / Mail addresses from a Web Page<br>
! </a><a href="linksEmbedded.html">Extracting Embedded Links</a><br>
! <a href="text.html">Extracting Text Content from a Web Page</a><br>
! <a href="imageslinks.html">Extracting Images Within Links</a><br>
! <a href="exception.html">Exception Handling in the parser</a><br>
! <a href="crawler.html">Web Crawler</a><br>
! <a href="ripper.html">Web Ripper (Modifying links and image locations)</a><br>
! <a href="feedback.html">Feedback Mechanism</a><br>
! <a href="custom.html">Supporting Custom Tags (extending the parser)</a></p>
  </body>
  </html>
--- 2,106 ----
  <html>
  <head>
! <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
! <title>Sample Programs</title>
! <link REL ="stylesheet" TYPE="text/css" HREF="../javadoc/stylesheet.css" TITLE="Style">
  </head>
  <body>
! <h2>Sample Programs</h2>
! <p>The example programs included with the HTML Parser distribution are listed
! below, with some details.</p>
! <p><strong>Note:</strong> On unix systems if you used the Java jar command or
! some older unzip utility to extract the distribution zip file, the
! executable flag will not have been preserved on the files in the bin
! directory. You can fix this by issuing the following command:
! <pre>
! <code>chmod u+x bin/*</code>
! </pre>
! <p>
! <table width="94%" border="0">
! <tr>
! <td valign="top">
! <strong>Parser</strong><br>
! </td>
! <td>
! <i>Parse a web page and print the tags in a simple loop.</i><br>
! <a href="../javadoc/org/htmlparser/Parser.html#main(java.lang.String[])" target="_parent">org.htmlparser.Parser.main(String[] args)</a>
! <pre>
! <code>bin/parser http://website_url [tag_name]</code>
! where tag_name is an optional tag name to be used as a filter, i.e.
! A - Show only the link tags extracted from the document
! IMG - Show only the image tags extracted from the document
! TITLE - Extract the title from the document
! NOTE: this is also the default program for the htmlparser.jar, so the above could be:
! <code>java -jar lib/htmlparser.jar http://website_url [tag_name]</code>
! </pre>
! </td>
! </tr>
! <tr>
! <td valign="top">
! <strong>Link Extractor</strong><br>
! </td>
! <td>
! <i>Extract links/mail addresses from a web page.</i><br>
! <a href="../javadoc/org/htmlparser/parserapplications/LinkExtractor.html" target="_parent">org.htmlparser.parserapplications.LinkExtractor</a>
! <pre>
! <code>bin/linkextractor http://website_url [-maillinks]</code>
! the optional -maillinks argument causes mailto: links to be printed
! </pre>
! </td>
! </tr>
! <tr>
! <td valign="top">
! <strong>String Extractor</strong><br>
! </td>
! <td>
! <i>Extract text from a web page.</i><br>
! <a href="../javadoc/org/htmlparser/parserapplications/LinkExtractor.html" target="_parent">org.htmlparser.parserapplications.StringExtractor</a>
! <pre>
! <code>bin/stringextractor http://website_url [-links]</code>
! the optional -links argument causes hyperlinks to be shown within the text
! </pre>
! </td>
! </tr>
! <tr>
! <td valign="top">
! <strong>Site Capturer</strong><br>
! </td>
! <td>
! <i>Save a web site locally.</i><br>
! <a href="../javadoc/org/htmlparser/parserapplications/SiteCapturer.html" target="_parent">org.htmlparser.parserapplications.SiteCapturer</a>
! <pre>
! <code>bin/sitecapturer http://source_website /target_directory/ [true|false]</code>
! the optional boolean argument determines whether resources such as images,
! audio and video are to be captured
! </pre>
! </td>
! </tr>
! <tr>
! <td valign="top">
! <strong>Thumbelina</strong><br>
! </td>
! <td>
! <i>View images behind thumbnails.</i><br>
! <a href="../javadoc/org/htmlparser/lexerapplications/thumbelina/package-summary.html" target="_parent">org.htmlparser.lexerapplications.thumbelina.Thumbelina</a>
! <pre>
! <code>bin/thumbelina [http://starting_website]</code>
! </pre>
! </td>
! </tr>
! <tr>
! <td valign="top">
! <strong>BeanyBaby</strong><br>
! </td>
! <td>
! <i>Parser Java Bean demo.</i><br>
! <a href="../javadoc/org/htmlparser/beans/BeanyBaby.html" target="_parent">org.htmlparser.beans.BeanyBaby</a>
! <pre>
! <code>bin/beanybaby [http://starting_website]</code>
! </pre>
! </td>
! </tr>
! </table>
  </body>
  </html>