[Htmlparser-user] Problem parsing a link
From: Stephen H. <Ste...@tr...> - 2002-09-27 17:41:23
I have a simple document which I am trying to parse a link out of. Here is the code:

    <html>
    <body>
    <DL>
    <DT>YOUR QUERY WAS:
    </DL>
    Select one of the following documents to retrieve.
    <P>
    <HR>
    <P><DL>
    <DT><B>1:</B> <!-- hit --><A HREF="/cgi-bin/view_search?query_text=postdate>20020701&txt_clr=White&bg_clr=Red&url=http://localhost/Testing/Report 1.html">20020702 Report 1</A>
    <DD><font size="-1">Score: 1000, Size: 7.4 kbytes, Type: URL file</font>
    </DL>
    </body>
    </html>

The parser is getting confused by the '>' after the postdate. Instead of returning the whole link:

    http://localhost/cgi-bin/view_search?query_text=postdate>20020701&txt_clr=White&bg_clr=Red&url=http://localhost/Testing/Report 1.html

only a portion of the link is returned:

    http://localhost/cgi-bin/view_search?query_text

If the 'postdate>' is replaced by 'postdate=' then it functions properly. It seems like the parser is not looking at the double quotes.

I am using the latest integration build (1.2-2002_08_31).

Before digging into the source code and trying to fix the problem, I thought maybe someone might have run into this problem before.

Thanks,
--stephen
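The symptom described above is what one would expect if a tag is ended at the first '>' regardless of whether it sits inside a quoted attribute value. The following is a minimal, self-contained sketch of quote-aware scanning in plain Java. It is not HTMLParser's code and uses none of its API; the class and method names (QuoteAwareTagScanner, findTagEnd, extractHref) are hypothetical, and it assumes quotes inside a tag only ever delimit attribute values.

```java
import java.util.Locale;

// Hypothetical helper, not part of HTMLParser: illustrates quote-aware tag scanning.
final class QuoteAwareTagScanner {

    /**
     * Returns the index of the '>' that closes the tag opening at tagStart,
     * skipping any '>' found inside a single- or double-quoted value.
     * Returns -1 if the tag is never closed.
     */
    static int findTagEnd(String html, int tagStart) {
        char quote = 0; // 0 means "not inside a quoted value"
        for (int i = tagStart + 1; i < html.length(); i++) {
            char c = html.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0; // closing quote reached
                }
            } else if (c == '"' || c == '\'') {
                quote = c; // opening quote: remember which kind
            } else if (c == '>') {
                return i; // first '>' outside quotes ends the tag
            }
        }
        return -1;
    }

    /**
     * Returns the raw HREF value of a single tag, or null if there is none.
     * Assumes no whitespace around the '=' (a simplification).
     */
    static String extractHref(String tag) {
        int at = tag.toLowerCase(Locale.ROOT).indexOf("href=");
        if (at < 0) {
            return null;
        }
        int start = at + "href=".length();
        if (start >= tag.length()) {
            return null;
        }
        char q = tag.charAt(start);
        if (q == '"' || q == '\'') {
            // Quoted value: everything up to the matching quote, including any '>'.
            int end = tag.indexOf(q, start + 1);
            return end < 0 ? null : tag.substring(start + 1, end);
        }
        // Unquoted value: read up to whitespace or the end of the tag.
        int end = start;
        while (end < tag.length()
                && !Character.isWhitespace(tag.charAt(end))
                && tag.charAt(end) != '>') {
            end++;
        }
        return tag.substring(start, end);
    }

    public static void main(String[] args) {
        String html = "<DT><B>1:</B> <!-- hit --><A HREF=\"/cgi-bin/view_search?query_text=postdate>20020701"
                + "&txt_clr=White&bg_clr=Red&url=http://localhost/Testing/Report 1.html\">20020702 Report 1</A>";
        int start = html.indexOf("<A ");
        int end = findTagEnd(html, start);
        String tag = html.substring(start, end + 1);
        System.out.println(extractHref(tag));
    }
}
```

Run against the anchor from the document above, findTagEnd stops at the '>' after the closing double quote rather than the one after 'postdate', so extractHref sees the complete value. Resolving that relative value against http://localhost is a separate step not shown here.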