Parser should decode HTML entities

A PHP-based DOM parser.

Brought to you by: john_schlick, logmanoriginal, me578022

#29 Parser should decode HTML entities

Status: closed

Owner: LogMANOriginal

Labels: None

Updated: 2019-04-19

Created: 2009-07-16

Creator: Francesc Rosàs

Private: No Discussion Disabled

Example:

$dom = new simple_html_dom('<p>&amp;</p>');
echo $dom->find('*', 0)->plaintext; // Got "&amp;", but expected "&"

Discussion

Francesc Rosàs - 2009-07-16

bugfix.diff

Francesc Rosàs - 2009-07-16

I've solved it by adding a htmlspecialchars_decode() call to every output function, but I suppose it should be fixed in the parser itself.
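
A minimal sketch of that kind of workaround, for illustration only. The helper name decoded_plaintext() is hypothetical and is not part of simple_html_dom; the input string stands in for whatever ->plaintext returned in the report above.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): wraps a value that was
// read from ->plaintext and still contains escaped special characters.
function decoded_plaintext(string $raw): string
{
    // Decodes only &amp;, &lt;, &gt;, &quot; and, with ENT_QUOTES, &#039;.
    return htmlspecialchars_decode($raw, ENT_QUOTES);
}

// Stand-in for the value the report got back from ->plaintext.
echo decoded_plaintext('&amp;'), "\n"; // prints "&"
```

Note that htmlspecialchars_decode() leaves other named entities such as &eacute; untouched; html_entity_decode() would be the call to consider if those should be decoded as well.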

Alex - 2010-05-05

I don't think it's a good idea to make this permanent. Let's say you have ">" (greater-than symbol) and/or "<" (less-than symbol) anywhere in your text: if you decode that, your HTML would basically become invalid.
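
The concern can be sketched with plain PHP, without the parser. The markup string below is an invented example; it shows what happens when decoding is applied to HTML output rather than to plain text.

```php
<?php
// Invented example markup: the text contains an escaped "<b>" that is meant
// to be displayed literally, not interpreted as a tag.
$markup = '<p>if a &lt;b&gt; c</p>';

// Decoding the whole HTML string turns the escaped text into real markup.
$decoded = htmlspecialchars_decode($markup);

echo $decoded, "\n"; // prints "<p>if a <b> c</p>"
```

After decoding, the literal text has become an unclosed <b> element, which is the kind of breakage described above. This only applies to output that is still HTML.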

Francesc Rosàs - 2010-05-06

Please check the patch I submitted; it only modifies the text() and __get() functions. These functions don't return HTML, so keeping HTML entities in their output doesn't make any sense.

BTW, there is probably a bug in the text() implementation, as the same text can be decoded multiple times.
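
Why decoding the same text more than once matters can be shown in isolation. This is a sketch with an invented input string; it does not reproduce the text() code path itself.

```php
<?php
// HTML source for the literal text "&lt;" (the ampersand itself is escaped).
$source = '&amp;lt;';

// One decoding pass gives the intended text.
$once = htmlspecialchars_decode($source);

// A second pass decodes what was meant to stay literal.
$twice = htmlspecialchars_decode($once);

echo $once, "\n";  // prints "&lt;"
echo $twice, "\n"; // prints "<"
```

Decoding is not idempotent, so each piece of text should pass through the decoder exactly once.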

LogMANOriginal - 2019-04-19

status: open --> closed

assigned_to: LogMANOriginal

discussion: enabled --> disabled

LogMANOriginal - 2019-04-19

Thanks for including a patch.
Closing this in favor of https://sourceforge.net/p/simplehtmldom/feature-requests/52/ - please continue discussion on that ticket.
