Many of the emails I am trying to parse have attributes without the " or ' surrounding the value (I think these are mainly Microsoft-generated emails). These don't seem to parse correctly. Is there an easy way to fix this?
Could you please provide an example HTML snippet so I could take a look? If it's an easy fix I might be able to fix it.
Though I want to be straight with you: I don't use this stuff myself anymore, so development has pretty much stopped...
Thanks. Any help you could provide would be greatly appreciated. Here is an example. The span tag is removed correctly, but the paragraph tag stays. This happens on your demo page also.
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif";
color:#1F497D'>inside text</span></p>
Here is the source code that I added. I am not really great at this stuff, but I tried to figure something out. The second elseif branch is what I added:
elseif (isset($this->html[$pos + 1]) && in_array($this->html[$pos].$this->html[$pos + 1], array('="', "='"))) {
    # get quoted attribute value
    $pos++;
    $await = $this->html[$pos]; # single or double quote
    $pos++;
    $value = '';
    while (isset($this->html[$pos]) && $this->html[$pos] != $await) {
        $value .= $this->html[$pos];
        $pos++;
    }
    $attributes[$currAttrib] = $value;
    $currAttrib = '';
} elseif ($this->html[$pos] === '=') {
    # get unquoted attribute value: read up to the next whitespace or ">"
    $await = array(" ", "\t", "\n", "\r", ">");
    $pos++;
    $value = '';
    while (isset($this->html[$pos]) && !in_array($this->html[$pos], $await)) {
        $value .= $this->html[$pos];
        $pos++;
    }
    # step back so the enclosing loop still sees the closing ">"
    if (isset($this->html[$pos]) && $this->html[$pos] === ">") {
        $pos--;
    }
    $attributes[$currAttrib] = $value;
    $currAttrib = '';
}
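In case it helps to test the idea outside the parser class, here is a minimal standalone sketch of the same approach. The function name parseAttributes is made up for illustration and is not part of the library. It assumes the input is only the attribute portion of a tag (the text between the tag name and the closing ">"), and it handles double-quoted, single-quoted and unquoted values.

```php
<?php
// Hypothetical helper, NOT part of the library: parses the attribute
// portion of a tag, e.g. "class=MsoNormal style='color:red'".
function parseAttributes(string $str): array
{
    $attributes = [];
    $len = strlen($str);
    $pos = 0;
    while ($pos < $len) {
        // skip whitespace and stray "/" or ">" between attributes
        $pos += strspn($str, " \t\r\n/>", $pos);
        if ($pos >= $len) {
            break;
        }
        // read the attribute name up to "=" or whitespace
        $nameLen = strcspn($str, "= \t\r\n", $pos);
        $name = strtolower(substr($str, $pos, $nameLen));
        $pos += $nameLen;
        if ($pos >= $len || $str[$pos] !== '=') {
            // attribute without a value, e.g. "disabled"
            $attributes[$name] = '';
            continue;
        }
        $pos++; // skip "="
        if ($pos < $len && ($str[$pos] === '"' || $str[$pos] === "'")) {
            // quoted value: read up to the matching quote
            $quote = $str[$pos];
            $pos++;
            $valueLen = strcspn($str, $quote, $pos);
            $attributes[$name] = substr($str, $pos, $valueLen);
            $pos += $valueLen + 1; // also skip the closing quote
        } else {
            // unquoted value: read up to the next whitespace or ">"
            $valueLen = strcspn($str, " \t\r\n>", $pos);
            $attributes[$name] = substr($str, $pos, $valueLen);
            $pos += $valueLen;
        }
    }
    return $attributes;
}

// Example call with unquoted values, as in the Microsoft-style markup above
var_dump(parseAttributes('class=MsoNormal align=left'));
```

Using strspn/strcspn instead of a character-by-character loop is just a choice to keep the sketch short; the branch structure (quoted first, then unquoted) mirrors the two elseif branches above.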