We got a complaint from a webmaster that we are generating
a lot of bogus requests because we treat values of the
HTML FOR attribute as relative URLs.
Since the value of a FOR attribute can be a relative URI, we
should probably continue to extract them. However, to
reduce bad requests, we should probably apply the
LIKELY_URI_PATH rule to determine whether a FOR value is
likely to be a relative URL.
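
To make the proposal concrete, here is a minimal sketch of filtering FOR values through a "likely URI path" check before queuing them. The class name, method name, and the regular expression itself are assumptions made for illustration; the pattern is not the crawler's actual LIKELY_URI_PATH definition, only a rough stand-in that accepts values containing a slash or a dot followed by a word character and rejects anything with whitespace or angle brackets.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the crawler's code base.
final class ForAttributeFilter {

    // ASSUMPTION: a stand-in heuristic for the LIKELY_URI_PATH rule.
    // A value qualifies only if it has no whitespace or angle brackets
    // and contains either "/" or "." followed by a word character.
    private static final Pattern LIKELY_URI_PATH =
            Pattern.compile("[^\\s<>]*(?:/|\\.\\w)[^\\s<>]*");

    private ForAttributeFilter() {
    }

    // Returns true if the FOR value looks enough like a relative URI
    // path to be worth extracting; plain element ids return false.
    static boolean isLikelyUriPath(String forValue) {
        if (forValue == null) {
            return false;
        }
        return LIKELY_URI_PATH.matcher(forValue.trim()).matches();
    }
}
```

Under this sketch a typical label target such as `email` or `user_name` would be skipped, while `forms/help.html` would still be extracted. An id that happens to contain a dot followed by a letter would still slip through, so the heuristic reduces bad requests rather than eliminating them.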
--- A note from a webmaster ---
However, there seems to be a problem with the spider's interpretation
of certain HTML constructs, particularly the 'for' attribute of
'label' elements. Apparently the spider interprets these attributes as
relative links and subsequently tries to retrieve those from the
server. This, of course, doesn't make sense since these attributes are
meant to refer to elements inside the document, not external resources
(see http://www.w3.org/TR/html4/interact/forms.html#edef-LABEL).
Karl Thiessen
Extraction
1.6.0
Public
Date: 2007-03-14 00:22
Date: 2005-06-09 19:49
Date: 2005-06-09 17:40
Date: 2005-04-15 01:30

| Field | Old Value | Date | By |
|---|---|---|---|
| artifact_group_id | None | 2005-09-23 18:27 | gojomo |
| close_date | - | 2005-06-09 19:49 | karl-ia |
| status_id | Open | 2005-06-09 19:49 | karl-ia |
| resolution_id | None | 2005-06-09 19:49 | karl-ia |
| assigned_to | gojomo | 2005-06-09 17:40 | gojomo |
| priority | 5 | 2005-06-09 17:26 | gojomo |
| assigned_to | nobody | 2005-06-09 17:26 | gojomo |