
#3 Unicode support

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2007-07-03
Created: 2007-07-03
Creator: Anonymous
Private: No

I am trying to read Hebrew characters inside the HTML, and I get a garbled string.

Discussion

  • Davi de Castro Reis

    The htmlcxx library is encoding agnostic. You need to know the original encoding of the text.

     
  • Alexis Wilke - 2011-09-10

    Actually, this isn't correct. htmlcxx is limited to "basic" HTML (as opposed to full XML) and expects an encoding that is compatible with the first half of ISO-8859-1, i.e. 7-bit ASCII, which UTF-8 also preserves.

    Encodings that are not ASCII-compatible, such as UTF-16, are likely to fail.
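
    To illustrate the point about knowing the original encoding: if the page turns out to be in ISO-8859-8 (an assumption, since the report does not say which encoding was used), the bytes that htmlcxx passes through unchanged could be converted to UTF-8 before display. The helper below is hypothetical and not part of htmlcxx. It is a minimal sketch that maps only the Hebrew letters (0xE0-0xFA to U+05D0-U+05EA) and replaces every other non-ASCII byte with '?'.

```cpp
#include <string>

// Hypothetical helper, NOT part of htmlcxx.
// Assumes the input bytes are ISO-8859-8; returns UTF-8.
std::string iso8859_8_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);  // each Hebrew letter becomes two bytes
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII half: identical in ISO-8859-8 and UTF-8.
            out.push_back(static_cast<char>(c));
        } else if (c >= 0xE0 && c <= 0xFA) {
            // Hebrew letters alef..tav map to U+05D0..U+05EA.
            unsigned cp = 0x05D0u + (c - 0xE0u);
            // Two-byte UTF-8 sequence: 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0u | (cp >> 6)));
            out.push_back(static_cast<char>(0x80u | (cp & 0x3Fu)));
        } else {
            // Symbols and unassigned bytes are not handled in this sketch.
            out.push_back('?');
        }
    }
    return out;
}
```

    Because the ASCII half is left untouched, the markup survives the conversion, so the helper could be applied either to the whole document before parsing or to the text of individual nodes afterwards. For other source encodings a general converter such as iconv would be the more realistic choice.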

     
