
#198 iconv() detected an illegal character in input string

closed
None
2022-04-24
2022-04-11
No

Problem: simple_html_dom->parse_charset() at line 1765 produces an error:
iconv() detected an illegal character in input string in ...

Fix: append '//IGNORE' to the target charset so that characters that cannot be converted are skipped, like so:
if (!@iconv('CP1252', 'UTF-8//IGNORE', $this->doc)) {
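
For context, a minimal standalone sketch of the check proposed above. The function name looks_like_cp1252 is hypothetical and not part of simple_html_dom; only the iconv() call is taken from the report. Whether '//IGNORE' actually drops unconvertible bytes depends on the iconv implementation in use.

```php
<?php
// Hypothetical helper, for illustration only (not part of simple_html_dom).
function looks_like_cp1252(string $doc): bool
{
    // '//IGNORE' asks iconv to skip bytes it cannot convert instead of failing.
    // '@' is kept because some iconv implementations still raise a notice.
    $converted = @iconv('CP1252', 'UTF-8//IGNORE', $doc);

    // Like the original "!@iconv(...)" test, treat false and '' as failure.
    return $converted !== false && $converted !== '';
}

// 0xE9 is "é" in CP1252, so this input converts cleanly.
var_dump(looks_like_cp1252("caf\xE9"));
```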

Discussion

  • LogMANOriginal

    LogMANOriginal - 2022-04-19
    • labels: iconv, simple_html_dom->parse_charset(), simplehtmldom-1.9.1 -->
    • status: open --> accepted
    • assigned_to: LogMANOriginal
     
  • LogMANOriginal

    LogMANOriginal - 2022-04-19

    Thanks for reporting. It took me a while to figure out what is going on. Am I right to assume that you are running on PHP 8.x?

    In previous versions that error would not have been reported because of the error suppression operator (@). (Un-)fortunately the behavior of this operator changed in PHP 8: https://php.watch/versions/8.0/fatal-error-suppression

    The behavior of //IGNORE depends on the specific iconv implementation; some implementations ignore this flag completely. Still, this is a good hack to make it work for now.
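
    To see which iconv implementation a given PHP build uses, the constants ICONV_IMPL and ICONV_VERSION from PHP's iconv extension can be printed. This is only a diagnostic sketch; the function name iconv_backend is made up for this example.

```php
<?php
// Hypothetical diagnostic helper; ICONV_IMPL and ICONV_VERSION are
// constants defined by PHP's iconv extension.
function iconv_backend(): string
{
    return ICONV_IMPL . ' ' . ICONV_VERSION;
}

echo 'iconv backend: ', iconv_backend(), "\n";
```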

     
  • Roland Heymanns

    Roland Heymanns - 2022-04-19

    Thanks for your good work!

    The error message first appeared after I upgraded PHP from 8.0 to 8.1 last week.

     
  • LogMANOriginal

    LogMANOriginal - 2022-04-24
    • status: accepted --> closed
     
  • LogMANOriginal

    LogMANOriginal - 2022-04-24

    This is fixed via [c53a612e6fe61d5b1efc0c3270e20aa34e4e84ee]. Instead of using //IGNORE, the iconv() call needs to be wrapped inside a try-catch block, so that the character set is detected properly. Eventually, this will be replaced by a better solution, but this works for now.

    Thanks again for reporting!
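
    The commit itself is not reproduced in this ticket, so the following is only a rough sketch of the try-catch idea, not the code from the commit. iconv() reports illegal input as a PHP notice rather than an exception, so this sketch assumes a temporary error handler that turns such notices into ErrorException. The function name try_iconv is hypothetical.

```php
<?php
// Hypothetical helper; NOT the code from the referenced commit.
// Returns the converted string, or null if the input cannot be converted.
function try_iconv(string $from, string $to, string $input): ?string
{
    // Assumption: convert PHP notices/warnings into exceptions for the
    // duration of the call so that they can be caught below.
    set_error_handler(static function (int $severity, string $message, string $file, int $line): bool {
        throw new ErrorException($message, 0, $severity, $file, $line);
    });

    try {
        $result = iconv($from, $to, $input);
        return $result === false ? null : $result;
    } catch (ErrorException $e) {
        // Illegal character in input: treat as "not this charset".
        return null;
    } finally {
        // Always restore the previous error handler.
        restore_error_handler();
    }
}

var_dump(try_iconv('CP1252', 'UTF-8', "caf\xE9")); // valid CP1252 input
var_dump(try_iconv('UTF-8', 'UTF-16', "\xFF"));    // invalid UTF-8 input
```

    A null result signals that the input is not valid in the assumed source charset, so charset detection can move on to the next candidate without //IGNORE masking the failure.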

     

    Related

    Commit: [c53a61]

