Thread: [Htmlparser-cvs] htmlparser/src/org/htmlparser/tags package.html,1.20,1.21

Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv5117/htmlparser/src/org/htmlparser/tags

Modified Files:
	package.html 
Log Message:
Documentation revamp part three.
Reworked some JavaDoc descriptions.
Added "HTML Parser for dummies" introductory text.
Removed checkstyle.jar and fit.jar (and its cruft).

Index: package.html
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags/package.html,v
retrieving revision 1.20
retrieving revision 1.21
diff -C2 -d -r1.20 -r1.21
*** package.html	10 Apr 2005 23:20:45 -0000	1.20
--- package.html	24 Apr 2005 17:48:27 -0000	1.21
***************
*** 41,48 ****
  <p>The classes in this package have been added in an ad-hoc fashion, with the
  most useful ones having existed a long time, while some obvious ones are rather
! new. Please feel free to add your own, and register them with the
  {@link org.htmlparser.PrototypicalNodeFactory PrototypicalNodeFactory},
  and they will be treated like any other in-built tag. In fact tags do not need
  to reside in this package.</p>
  <p>If the tag can contain other nodes, i.e. {@.html <h1>My Heading</h1>}, then
  it should derive from (i.e. be a subclass of) {@link org.htmlparser.tags.CompositeTag}.
--- 41,51 ----
  <p>The classes in this package have been added in an ad-hoc fashion, with the
  most useful ones having existed a long time, while some obvious ones are rather
! new. Please feel free to add your own custom tags, and register them with the
  {@link org.htmlparser.PrototypicalNodeFactory PrototypicalNodeFactory},
  and they will be treated like any other in-built tag. In fact tags do not need
  to reside in this package.</p>
+ <br><b>Custom Tags</b>
+ <p>Creating custom tags is fairly straightforward. Simply copy one of the
+ simpler tags you find in this package and alter it as follows.
  <p>If the tag can contain other nodes, i.e. {@.html <h1>My Heading</h1>}, then
  it should derive from (i.e. be a subclass of) {@link org.htmlparser.tags.CompositeTag}.
***************
*** 51,59 ****
  and nodes between the start and end tag will be gathered into the list of
  children. Most of the tags in this package derive from CompositeTag, and that
! why the nodes returned from the Parser are nested.</p>
  <p>If it is a simple tag, i.e. {@.html <br>}, then it should derive from
  {@link org.htmlparser.nodes.TagNode TagNode}. See for example
  {@link org.htmlparser.tags.MetaTag}
  or {@link org.htmlparser.tags.ImageTag}.</p>
  <!-- Put @see and @since tags down here. -->
  
--- 54,89 ----
  and nodes between the start and end tag will be gathered into the list of
  children. Most of the tags in this package derive from CompositeTag, and that
! is why the nodes returned from the Parser are nested.</p>
  <p>If it is a simple tag, i.e. {@.html <br>}, then it should derive from
  {@link org.htmlparser.nodes.TagNode TagNode}. See for example
  {@link org.htmlparser.tags.MetaTag}
  or {@link org.htmlparser.tags.ImageTag}.</p>
+ <p>To be registered with {@link org.htmlparser.PrototypicalNodeFactory#registerTag},
+ and especially if it is a composite tag, the tag needs to implement
+ <code>getIds</code> which returns the UPPERCASE list of names for the tag
+ (usually only one), for example "HTML". If the tag can be smart enough to know
+ what other tags can't be contained within it, it should also implement
+ {@link org.htmlparser.nodes.TagNode#getEnders getEnders()} which returns the
+ list of other tags that should cause this tag to close itself, and 
+ {@link org.htmlparser.nodes.TagNode#getEndTagEnders getEndTagEnders()} which
+ returns the list of end tags (i.e. {@.html </xxx>}), other than its own name, that
+ should cause this tag to close itself. When these 'ender' lists cause a tag to
+ end before seeing its own end tag, a virtual end tag is created and 'inserted'
+ at the location where the end tag should have been. These end tags can be
+ distinguished because their {@link org.htmlparser.Node#getStartPosition starting}
+ and {@link org.htmlparser.Node#getEndPosition ending} locations are the same
+ (i.e. they take up no character length in the HTML stream).
+ <p>For example, the {@.html <OPTION>} tag from a form can be prematurely ended by
+ any of {@.html <INPUT>}, {@.html <TEXTAREA>}, {@.html <SELECT>},
+ or another {@.html <OPTION>} tag. These are the tags in the getEnders() list.
+ It can also be prematurely ended by {@.html </SELECT>}, {@.html </FORM>},
+ {@.html </BODY>}, or {@.html </HTML>}. These are the tags in the
+ getEndTagEnders() list.
+ <p>Other than that, any functionality is up to you. You should note that
+ {@link org.htmlparser.Node#doSemanticAction doSemanticAction()} is called after
+ the tag has been completely scanned (it has its children and end tag), but before
+ its siblings further downstream have been scanned. If transformation is your purpose,
+ this is the opportunity to mess around with the content, for example to set the link URL,
+ or lowercase the tag name, or whatever.
  <!-- Put @see and @since tags down here. -->
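
The 'ender' behaviour described in the new text can be hard to picture from prose alone, so here is a minimal standalone sketch of it. It does not use the htmlparser classes: EnderSketch, Token and closeOptions are hypothetical names invented for illustration, and the loop is only an assumed model of the idea, not the library's scanner. The two sets mirror the OPTION example from the added documentation, and a virtual end tag is modelled as a token whose start and end positions are equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Standalone model of the 'ender' lists; not part of htmlparser. */
public class EnderSketch {
    // Start tags that cut an open OPTION short (the getEnders() list in the text).
    public static final Set<String> ENDERS =
        new HashSet<>(Arrays.asList("INPUT", "TEXTAREA", "SELECT", "OPTION"));
    // End tags other than </OPTION> that do the same (the getEndTagEnders() list).
    public static final Set<String> END_TAG_ENDERS =
        new HashSet<>(Arrays.asList("SELECT", "FORM", "BODY", "HTML"));

    /** Hypothetical tag token: uppercase name, end-tag flag, character span. */
    public static final class Token {
        public final String name;
        public final boolean end;
        public final int start;
        public final int stop;

        public Token(String name, boolean end, int start, int stop) {
            this.name = name;
            this.end = end;
            this.start = start;
            this.stop = stop;
        }

        /** A virtual end tag occupies no characters in the stream. */
        public boolean isVirtual() {
            return end && start == stop;
        }

        @Override
        public String toString() {
            return (end ? "</" : "<") + name + "> [" + start + "," + stop + ")";
        }
    }

    /** Copies the tokens, inserting a zero-length </OPTION> wherever an open OPTION is ended early. */
    public static List<Token> closeOptions(List<Token> in) {
        List<Token> out = new ArrayList<>();
        boolean open = false; // true while an OPTION is waiting for its end tag
        for (Token t : in) {
            if (open) {
                boolean ownEnd = t.end && t.name.equals("OPTION");
                boolean forced = t.end ? END_TAG_ENDERS.contains(t.name)
                                       : ENDERS.contains(t.name);
                if (ownEnd) {
                    open = false;
                } else if (forced) {
                    // Virtual end tag placed where the real one should have been.
                    out.add(new Token("OPTION", true, t.start, t.start));
                    open = false;
                }
            }
            if (!t.end && t.name.equals("OPTION")) {
                open = true;
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        // Models the markup <select><option><option></select>
        List<Token> in = Arrays.asList(
            new Token("SELECT", false, 0, 8),
            new Token("OPTION", false, 8, 16),
            new Token("OPTION", false, 16, 24),
            new Token("SELECT", true, 24, 33));
        for (Token t : closeOptions(in)) {
            System.out.println(t + (t.isVirtual() ? " (virtual)" : ""));
        }
    }
}
```

In this model the first OPTION is closed by the second OPTION start tag and the second by the SELECT end tag. Both inserted end tags report identical start and end positions, which is how the documentation says they can be told apart from real ones.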
