No, I'm afraid there isn't. The only real documentation is the
javadocs <http://htmlparser.sourceforge.net/javadoc/index.html>.
There are quite a few examples of standalone programs provided in the source
code. Search for

    public static void main (String[] args)

(the spacing varies between files, e.g. "String [] args"; see the first
example below).

For example, in org.htmlparser.Parser (located in
htmlparser\trunk\parser\src\main\java\org\htmlparser\Parser.java):

    /**
     * The main program, which can be executed from the command line.
     * @param args A URL or file name to parse, and an optional tag name to be
     * used as a filter.
     */
    public static void main (String [] args)
    {
        Parser parser;
        NodeFilter filter;
        if (args.length < 1 || args[0].equals ("-help"))
        {
            System.out.println ("HTML Parser v" + getVersion () + "\n");
            System.out.println ();
            System.out.println ("Syntax : java -jar htmlparser.jar"
                + " <file/page> [type]");
            System.out.println (" <file/page> the URL or file to be parsed");
            System.out.println (" type the node type, for example:");
            System.out.println (" A - Show only the link tags");
            System.out.println (" IMG - Show only the image tags");
            System.out.println (" TITLE - Show only the title tag");
            System.out.println ();
            System.out.println ("Example : java -jar htmlparser.jar"
                + " http://www.yahoo.com");
            System.out.println ();
        }
        else
            try
            {
                parser = new Parser ();
                if (1 < args.length)
                    filter = new TagNameFilter (args[1]);
                else
                {
                    filter = null;
                    // for a simple dump, use more verbose settings
                    parser.setFeedback (Parser.STDOUT);
                    getConnectionManager ().setMonitor (parser);
                }
                getConnectionManager ().setRedirectionProcessingEnabled (true);
                getConnectionManager ().setCookieProcessingEnabled (true);
                parser.setResource (args[0]);
                System.out.println (parser.parse (filter));
            }
            catch (ParserException e)
            {
                e.printStackTrace ();
            }
    }

and in org.htmlparser.beans.StringBean
(htmlparser\trunk\parser\src\main\java\org\htmlparser\beans\StringBean.java):

    /**
     * Unit test.
     * @param args Pass arg[0] as the URL to process.
     */
    public static void main (String[] args)
    {
        if (0 >= args.length)
            System.out.println ("Usage: java -classpath htmlparser.jar"
                + " org.htmlparser.beans.StringBean <http://whatever_url>");
        else
        {
            StringBean sb = new StringBean ();
            sb.setLinks (false);
            sb.setReplaceNonBreakingSpaces (true);
            sb.setCollapse (true);
            sb.setURL (args[0]);
            System.out.println (sb.getStrings ());
        }
    }

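If you would rather not search by hand, the lookup for the remaining
standalone programs can be automated. What follows is only a rough sketch
using the standard Java library, not anything shipped with htmlparser: the
class name FindMainMethods and the helpers declaresMain and findMainClasses
are made up for illustration, and the pattern assumes the usual
"String[] name" parameter form with optional whitespace (it will not catch
"String... args" or "String args[]").

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of htmlparser: lists .java files that declare a main method.
public class FindMainMethods {

    // Tolerates spacing differences such as "main (String [] args)" and "main(String[] args)".
    private static final Pattern MAIN = Pattern.compile(
        "public\\s+static\\s+void\\s+main\\s*\\(\\s*String\\s*\\[\\s*\\]\\s*\\w+\\s*\\)");

    // True if the given source text contains a main method declaration.
    public static boolean declaresMain(String source) {
        return MAIN.matcher(source).find();
    }

    // Walks the tree under root and returns, in sorted order, the .java files declaring main.
    public static List<Path> findMainClasses(Path root) throws IOException {
        List<Path> javaFiles;
        try (Stream<Path> paths = Files.walk(root)) {
            javaFiles = paths
                .filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".java"))
                .sorted()
                .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path file : javaFiles) {
            // ISO-8859-1 maps every byte to a character, so decoding never fails.
            String source = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
            if (declaresMain(source)) {
                hits.add(file);
            }
        }
        return hits;
    }

    // Usage: java FindMainMethods <source root>   (defaults to the current directory)
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path hit : findMainClasses(root)) {
            System.out.println(hit);
        }
    }
}
```

Point it at the directory where you checked out the source (for example the
trunk directory mentioned above) and it prints one path per file that has a
main method.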
On Mon, Jun 20, 2011 at 7:23 AM, Michael Ghasemi <mi...@co...> wrote:
> Hello
> I am new to java, is there a tutorial/guide on how to use all features of
> htmlparser?
>
> Thanks
> _______________________________________________
> Htmlparser-user mailing list
> Htm...@li...
> https://lists.sourceforge.net/lists/listinfo/htmlparser-user