Re: [Htmlparser-user] how to deal with form tag following table tag
From: Leslie R. <le...@op...> - 2002-12-06 00:40:13
actually, there are two problems in the case at hand, and i am not at all sure that the <table><form> construction is the worse of them. not only does hotbot produce this invalid sequence, but they also failed to close the form tag.

it looks like HTMLFormScanner simply falls out of the loop at lines 136-154 looking for the end tag and throws an exception when it is not found. it would be better if the end-form tag could be 'assumed' so that the file could at least be parsed. that would mirror the behavior of commercial browsers.

Leslie Rohde wrote:
> the construction <table ...><form...> is not allowed in spec, but it
> does occur in such places as the hotbot search engine results page.
> currently, htmlparser delivers a flood of errors and exceptions when
> parsing a hotbot results page.
>
> how best to handle this?
>

--
Leslie Rohde
mailto:le...@op...
http://www.optitext.com
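
p.s. a rough sketch of what i mean by 'assuming' the end tag. this is not HTMLFormScanner code and does not use the htmlparser API; the class and method names (LenientFormScan, scanFirstForm, FormResult) are made up for illustration, and the matching is deliberately naive (it would also match a tag such as <formula>). the only point is the fallback: when no end-form tag turns up, the form is treated as running to the end of the input rather than raising an exception.

```java
public class LenientFormScan {

    /** Result of scanning one form: its inner markup and whether the end tag was assumed. */
    static final class FormResult {
        final String body;
        final boolean endTagAssumed;

        FormResult(String body, boolean endTagAssumed) {
            this.body = body;
            this.endTagAssumed = endTagAssumed;
        }
    }

    /** Case-insensitive indexOf that keeps indices valid for the original string. */
    static int indexOfIgnoreCase(String s, String needle, int from) {
        for (int i = Math.max(from, 0); i + needle.length() <= s.length(); i++) {
            if (s.regionMatches(true, i, needle, 0, needle.length())) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Returns the contents of the first form in html, or null if there is no
     * usable form start tag. If no end tag is found, the form is assumed to
     * end at the end of the input.
     */
    static FormResult scanFirstForm(String html) {
        int open = indexOfIgnoreCase(html, "<form", 0);
        if (open < 0) {
            return null; // no form at all
        }
        int startTagEnd = html.indexOf('>', open);
        if (startTagEnd < 0) {
            return null; // start tag itself is truncated, nothing to recover
        }
        int bodyStart = startTagEnd + 1;
        int close = indexOfIgnoreCase(html, "</form", bodyStart);
        if (close < 0) {
            // end tag missing: assume it at end of input instead of throwing
            return new FormResult(html.substring(bodyStart), true);
        }
        return new FormResult(html.substring(bodyStart, close), false);
    }

    public static void main(String[] args) {
        // a hotbot-like fragment: form directly inside table, never closed
        FormResult r = scanFirstForm("<table><form action=\"/s\"><input name=\"q\"></table>");
        System.out.println(r.body + " (end tag assumed: " + r.endTagAssumed + ")");
    }
}
```

a caller could log a warning when endTagAssumed is true, so the bad markup is still reported while the rest of the page gets parsed.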