I think it would be useful to be able to preserve the original content of attributes (e.g. for <input value=""> or title attributes).
class simple_html_dom {
    protected $strip_rn = true;

    function load(
        $str,
        $lowercase = true,
        $stripRN = true,
        $defaultBRText = DEFAULT_BR_TEXT,
        $defaultSpanText = DEFAULT_SPAN_TEXT,
        $options = 0
    ) {
        [...]
        $this->strip_rn = $stripRN;
        [...]
    }

    protected function parse_attr($node, $name, &$space) {
        [...]
        if ($this->strip_rn) {
            $node->attr[$name] = str_replace("\r", '', $node->attr[$name]);
            $node->attr[$name] = str_replace("\n", '', $node->attr[$name]);
        }
        [...]
    }
}
Joseph
Good point. I'm actually not sure why newlines are removed in the first place, as browsers seem to allow them in attribute values. The comment for this particular if-statement doesn't link to any particular source:
The specification is also quite clear:
That said, removing the offending if-statement has no effect, because all newline characters are replaced by blanks if $stripRN = true in load(...). Making multiline attributes possible essentially means newline characters can no longer be stripped by a simple preg_replace. It needs to be done in a content-aware way. This is certainly something worth looking into eventually.
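To illustrate what "content aware" could mean here, below is a minimal sketch, not the library's actual implementation. The function name strip_newlines_content_aware is hypothetical, and the tag pattern is a deliberate simplification: it only recognises opening tags that start with a letter, and it treats comments, closing tags and script bodies as plain text. The idea is to split the input into tags and text, replace newlines with blanks everywhere, but leave quoted attribute values untouched.

```php
<?php
// Hypothetical helper, NOT part of simple_html_dom: replaces CR/LF with
// blanks everywhere except inside quoted attribute values.
function strip_newlines_content_aware(string $html): string
{
    // Simplified opening tag: "<", a letter, then any mix of quoted strings
    // and other characters that are not quotes or ">", then ">".
    // Quoted strings are matched as a unit, so they may contain ">".
    $tagPattern = '/(<[a-zA-Z](?:"[^"]*"|\'[^\']*\'|[^\'">])*>)/';

    // With DELIM_CAPTURE, even indexes hold text and odd indexes hold tags.
    $parts = preg_split($tagPattern, $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        return $html; // regex failure: leave the input unchanged
    }

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // Inside a tag: keep quoted values verbatim, blank out any
            // newline characters found between them.
            $parts[$i] = preg_replace_callback(
                '/"[^"]*"|\'[^\']*\'|[\r\n]/',
                function (array $m): string {
                    $first = $m[0][0];
                    return ($first === '"' || $first === "'") ? $m[0] : ' ';
                },
                $part
            );
        } else {
            // Plain text between tags: replace each newline with a blank.
            $parts[$i] = str_replace(["\r", "\n"], ' ', $part);
        }
    }

    return implode('', $parts);
}
```

With something like this, a newline inside value="..." or title='...' would survive, while newlines between attributes and in text nodes would still be replaced by blanks. A real fix would also have to deal with unquoted attribute values and malformed markup, which this sketch ignores.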