Thread: [Htmlparser-cvs] htmlparser/src/doc-files using.html,NONE,1.1 building.html,1.2,1.3 overview.html,1.
From: Derrick O. <der...@us...> - 2005-04-24 17:48:35
Update of /cvsroot/htmlparser/htmlparser/src/doc-files
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv5117/htmlparser/src/doc-files

Modified Files:
    building.html overview.html
Added Files:
    using.html
Log Message:
Documentation revamp part three.
Reworked some JavaDoc descriptions.
Added "HTML Parser for dummies" introductory text.
Removed checkstyle.jar and fit.jar (and its cruft).

Index: building.html
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/doc-files/building.html,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** building.html 15 Mar 2004 22:50:55 -0000 1.2
--- building.html 24 Apr 2005 17:48:26 -0000 1.3
***************
*** 10,14 ****
  <H2>JDK</H2>
  Set up java. I won't include instructions here, just a link to the
! <a href="http://java.sun.com/j2se">Sun j2se site</a>. I use version 1.4.1, and you need
  a JDK (java development kit), not a JRE (java runtime environment).<p>
  Test your installation by typing command:<p>
--- 10,14 ----
  <H2>JDK</H2>
  Set up java. I won't include instructions here, just a link to the
! <a href="http://java.sun.com/j2se">Sun j2se site</a>. I use version 1.5, and you need
  a JDK (java development kit), not a JRE (java runtime environment).<p>
  Test your installation by typing command:<p>
***************
*** 21,25 ****
  HTML Parser uses relies on command tags available in Ant version 1.4.1 or
  higher. The version currently used on the build machine
! is 1.5.3. The current version of Ant is available
  <a href="http://archive.apache.org/dist/ant/ant-current-bin.zip">here</a>.<p>
  Basically you unzip the file into a directory and add an ANT_HOME environment
--- 21,25 ----
  HTML Parser uses relies on command tags available in Ant version 1.4.1 or
  higher. The version currently used on the build machine
! is 1.6.2. The current version of Ant is available
  <a href="http://archive.apache.org/dist/ant/ant-current-bin.zip">here</a>.<p>
  Basically you unzip the file into a directory and add an ANT_HOME environment

Index: overview.html
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/doc-files/overview.html,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** overview.html 18 Jul 2004 21:31:21 -0000 1.4
--- overview.html 24 Apr 2005 17:48:26 -0000 1.5
***************
*** 12,23 ****
  The HTML Parser distribution is composed of:
  <ul>
! <li>a low level {@link org.htmlparser.lexer.Lexer lexer} that converts characters into tags</li>
! <li>a high level {@link org.htmlparser.Parser parser} that provides a heirarchical document view</li>
! <li>several example applications</li>
  </ul>
  <p>
  <h2>Building</h2>
! To build the system you'll need to get the sources from the
! <a href="http://sourceforge.net/project/showfiles.php?group_id=24399&release_id=161563">HTML Parser project on Sourceforge</a>
  if you haven't already, and then follow the
  <A href="{@docRoot}/doc-files/building.html">build instructions</A>.
--- 12,26 ----
  The HTML Parser distribution is composed of:
  <ul>
! <li>a low level {@link org.htmlparser.lexer.Lexer lexer} that converts characters from a HTML page into a linear sequence of nodes</li>
! <li>a high level {@link org.htmlparser.Parser parser} that provides a heirarchical document model of a HTML page</li>
! <li>source code in the src.zip file</li>
  </ul>
  <p>
+ <h2>Getting Started</h2>
+ For novice users, an introductory guide on how to set up your environment to
+ use the HTML Parser is provided in <A href="{@docRoot}/doc-files/using.html">HTML Parser for Dummies</A>.
  <h2>Building</h2>
! To build the HTML Parser you'll need to get the sources from the
! <a href="http://sourceforge.net/project/showfiles.php?group_id=24399" target="_top">HTML Parser project on Sourceforge</a>
  if you haven't already, and then follow the
  <A href="{@docRoot}/doc-files/building.html">build instructions</A>.

--- NEW FILE: using.html ---
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>How to Use the HTML Parser Libraries</title>
<link REL ="stylesheet" TYPE="text/css" HREF="../stylesheet.css" TITLE="Style">
</head>
<body>
<H1>How to Use the HTML Parser Libraries</H1>
<h3>Step 1: Java</h3>
You should make sure that a Java development system (JDK) is installed, not just a
Java runtime (JRE). If you are working in an IDE (Integrated Development Environment)
this is usually taken care of for you. If you are using just a command line, you
should see help information when you type:
<pre>
javac
</pre>
Java versions greater than 1.2 are supported for the parser, and Java 1.1 for the lexer.
You can check your version with the command:
<pre>
java -version
</pre>
If you are using Java 5, you may need to specify option "-source 1.3" to avoid some warnings.
<h3>Step 2: Setting the CLASSPATH</h3>
To use the HTML Parser you will need to add the htmlparser.jar to the classpath.
This jar includes all the files in htmllexer.jar, which is the subset of classes
used by the lexer. If you are using an IDE, you need to add the htmlparser.jar to
the list of jars/libraries used by your project.
<h4>NetBeans</h4>
<ul>
<li>Right click on your project in the Projects Window (Ctrl-1) and choose Properties.</li>
<li>In the Project Properties pane choose the Libraries view.</li>
<li>Select the Compile tab.</li>
<li>Click the Add Jar/Folder button.</li>
<li>Browse to &lt;htmlp_dir&gt;/lib (where &lt;htmlp_dir&gt; is the directory where you unzipped the distribution: xxx/htmlparser1_5), select the htmlparser.jar file and click on OK.</li>
</ul>
<h4>Eclipse</h4>
<ul>
<li>Right click on your project in the Package Explorer Window (Shift-Alt-Q + P) and choose Properties.</li>
<li>In the Properties pane choose the Java Build Path view.</li>
<li>Select the Libraries tab.</li>
<li>Click the Add External Jars button.</li>
<li>Browse to &lt;htmlp_dir&gt;/lib (where &lt;htmlp_dir&gt; is the directory where you unzipped the distribution: xxx/htmlparser1_5), select the htmlparser.jar file and click on OK.</li>
</ul>
<h4>Command Line</h4>
You can either add the jar to the CLASSPATH environment variable, or specify it
each time on the command line:
<h5>Windows</h5>
<pre>set CLASSPATH=[htmlp_dir]\lib\htmlparser.jar;%CLASSPATH%</pre>
where [htmlp_dir] is the directory where you unzipped the distribution: xxx\htmlparser1_5, or use:
<pre>javac -classpath [htmlp_dir]\lib\htmlparser.jar MyProgram.java</pre>
<h5>Linux</h5>
<pre>export CLASSPATH=[htmlp_dir]/lib/htmlparser.jar:$CLASSPATH</pre>
where [htmlp_dir] is the directory where you unzipped the distribution: xxx/htmlparser1_5, or use:
<pre>javac -classpath [htmlp_dir]/lib/htmlparser.jar MyProgram.java</pre>
<h3>Step 3: Import Necessary Classes</h3>
Whatever classes you use from the HTML Parser libraries will need to be imported by your program.
For example, the simplest usage is:
<pre>
import org.htmlparser.Parser;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

class Test
{
    public static void main (String[] args)
    {
        try
        {
            // args[0] names the page to parse
            Parser parser = new Parser (args[0]);
            // a null filter returns every node
            NodeList list = parser.parse (null);
            // print the parsed nodes back out as HTML
            System.out.println (list.toHtml ());
        }
        catch (ParserException pe)
        {
            pe.printStackTrace ();
        }
    }
}
</pre>
Note that the import statements could also be written:
<pre>
import org.htmlparser.*;
import org.htmlparser.util.*;
</pre>
<h3>Step 4: Compile &amp; Run</h3>
Within an IDE the compile and execute steps are usually combined.
<h4>NetBeans</h4>
<ul>
<li>From the Run menu select Run Main Project (F6).</li>
</ul>
<h4>Eclipse</h4>
<ul>
<li>From the Run menu select Run..., browse to the Main class and click the Run button.</li>
</ul>
<h4>Command Line</h4>
The above program in a file called Test.java can be compiled and run with the commands:
<h5>Windows</h5>
<pre>
javac -classpath [htmlp_dir]\lib\htmlparser.jar Test.java
java -classpath .;[htmlp_dir]\lib\htmlparser.jar Test [url]
</pre>
<h5>Linux</h5>
<pre>
javac -classpath [htmlp_dir]/lib/htmlparser.jar Test.java
java -classpath .:[htmlp_dir]/lib/htmlparser.jar Test [url]
</pre>
In both cases [url] is the address of the HTML page to parse, which the program reads as its first argument.
</body>
</html>
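
The Windows and Linux command lines in using.html differ only in the file separator (\ versus /) and the classpath separator (; versus :). As a rough illustration, not part of the commit, the sketch below builds the classpath value with the separators Java reports for the current platform. The class name ClasspathDemo and its join helper are hypothetical, the default directory name is taken from the xxx/htmlparser1_5 example, and String.join assumes Java 8 or later, which is newer than the JDK versions the page discusses.

```java
import java.io.File;

public class ClasspathDemo
{
    /**
     * Joins classpath entries with the platform's path-list separator:
     * ";" on Windows, ":" on Linux and other Unix-like systems.
     */
    public static String join (String... entries)
    {
        return String.join (File.pathSeparator, entries);
    }

    public static void main (String[] args)
    {
        // Assumed install location; pass your own [htmlp_dir] as the first argument.
        String htmlpDir = args.length > 0 ? args[0] : "htmlparser1_5";
        // File.separator is "\" on Windows and "/" on Linux.
        String jar = htmlpDir + File.separator + "lib" + File.separator + "htmlparser.jar";
        // The current directory comes first so a locally compiled class is found.
        System.out.println (join (".", jar));
    }
}
```

Running it with the unzip directory as its argument prints a value that can be passed to the -classpath option on that platform.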