Parsing large pages results in a fatal error like this one:
"Allowed memory size of 67108864 bytes exhausted"
I assume this is because the DOM structure and data are stored entirely in RAM. That's fine for small pages, but I often run into huge pages I need to parse.
Could the class have an option to write data to a file on the fly, so that the DOM structure within the object holds only pointers (integer offsets) to that data?
RAM would then hold only the information needed to search and traverse the DOM tree, while the actual content (text nodes) would be stored in the file.
If I'm not mistaken, this could significantly reduce RAM usage.
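To illustrate the idea (not the parser's actual internals), here is a minimal sketch in Python of the proposed scheme: text-node content is appended to a spill file, and the in-memory node keeps only a small `(offset, length)` pair, reading the text back from disk on demand. The `TextStore` class and its method names are hypothetical, purely for illustration.

```python
import os
import tempfile


class TextStore:
    """Spill text-node content to a temp file; keep only (offset, length) in RAM."""

    def __init__(self):
        # Anonymous temporary file, deleted automatically when closed.
        self.file = tempfile.TemporaryFile()

    def write(self, text):
        """Append text to the file; return a small reference to keep in the DOM node."""
        data = text.encode("utf-8")
        offset = self.file.seek(0, os.SEEK_END)  # append at end, remember position
        self.file.write(data)
        return (offset, len(data))  # two integers instead of the full string

    def read(self, ref):
        """Fetch the text back from disk using the stored reference."""
        offset, length = ref
        self.file.seek(offset)
        return self.file.read(length).decode("utf-8")


# A DOM node would store only `ref`, not the (possibly huge) text itself.
store = TextStore()
ref = store.write("some very large text node content")
print(store.read(ref))  # round-trips the stored text
```

With this layout, traversing and searching the tree touches only the lightweight node structures in RAM; the heavy string payloads stay on disk until a node's content is actually requested.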
I'd like to see that in Simple HTML DOM Parser as an option for parsing big HTML files.