Re: [Htmlparser-user] Parsing for links
From: Derrick O. <der...@ro...> - 2007-08-07 22:16:23
Hi,

The HasAttributeFilter should have worked... at least enough to extract all links with the Id attribute:
    new AndFilter (new TagNameFilter ("A"), new HasAttributeFilter ("Id"))

That said, there isn't a "HasAttributeRegexFilter" that would match an attribute value pattern,
although it has been discussed on the dev forum - or was that the LinkRegexFilter?

What you need is a combination of the HasAttributeFilter and the RegexFilter, where the exact equality test in the accept() method of HasAttributeFilter is replaced by the pattern matching code from the RegexFilter. Something like this:

    /**
     * Accept tags with a certain attribute.
     * @param node The node to check.
     * @return <code>true</code> if the node has the attribute
     * (and value if that is being checked too), <code>false</code> otherwise.
     */
    public boolean accept (Node node)
    {
        Tag tag;
        Attribute attribute;
        String string;
        Matcher matcher;
        boolean ret;

        ret = false;
        if (node instanceof Tag)
        {
            tag = (Tag)node;
            attribute = tag.getAttributeEx (mAttribute);
            ret = null != attribute;
            if (ret && (null != mValue))
            {
                string = attribute.getValue ();
                matcher = mPattern.matcher (string);
                switch (mStrategy)
                {
                    case MATCH:
                        ret = matcher.matches ();
                        break;
                    case LOOKINGAT:
                        ret = matcher.lookingAt ();
                        break;
                    case FIND:
                    default:
                        ret = matcher.find ();
                        break;
                }
            }
        }

        return (ret);
    }

Derrick

----- Original Message ----
From: Mark Goking <Mar...@as...>
To: htm...@li...orge.net
Sent: Tuesday, August 7, 2007 5:19:29 AM
Subject: [Htmlparser-user] Parsing for links


Hi all

I used the filterbean class to extract only tags with links <a href>

However I wish to only retrieve links that have an id attribute with
value that starts with string test_

I don't see any method in the API that lets you do a search for the id's
value that acts like a String's indexOf() method.

What would be the filters needed for this operation? Even though I've
added attributes to the LinkTag to search for an id=value attribute, it
still won't work.

Thanks
Chester
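
As a rough illustration of how the three strategies in the accept() sketch from the reply would treat the "test_" prefix from the question, here is a minimal standalone sketch that uses only java.util.regex. The class name AttributeValueMatch, the Strategy enum, and the valueMatches() helper are hypothetical names made up for this example; they are not part of the HTML Parser library, and the null guard is an added assumption rather than something the reply's code does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the HTML Parser library.
class AttributeValueMatch {

    // Hypothetical names mirroring the MATCH / LOOKINGAT / FIND cases in the reply.
    enum Strategy { MATCH, LOOKINGAT, FIND }

    // Returns true if the attribute value satisfies the pattern under the given strategy.
    static boolean valueMatches(String value, Pattern pattern, Strategy strategy) {
        if (value == null) {
            return false; // assumption: an attribute without a value never matches
        }
        Matcher matcher = pattern.matcher(value);
        switch (strategy) {
            case MATCH:
                return matcher.matches();   // the whole value must match the pattern
            case LOOKINGAT:
                return matcher.lookingAt(); // the value must start with a match
            case FIND:
            default:
                return matcher.find();      // a match anywhere in the value
        }
    }

    public static void main(String[] args) {
        Pattern prefix = Pattern.compile("test_");
        String[] ids = { "test_", "test_login", "my_test_id", "other" };
        for (String id : ids) {
            System.out.println(id
                + " MATCH=" + valueMatches(id, prefix, Strategy.MATCH)
                + " LOOKINGAT=" + valueMatches(id, prefix, Strategy.LOOKINGAT)
                + " FIND=" + valueMatches(id, prefix, Strategy.FIND));
        }
    }
}
```

With the pattern "test_", LOOKINGAT is the strategy closest to a "starts with" test: it accepts "test_" and "test_login" but rejects "my_test_id", which FIND would accept. MATCH accepts only the exact value "test_" unless the pattern is widened, for example to "test_.*".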