
#8 Modified entity decoder.

open
None
5
2010-04-19
2010-03-24
Anonymous
No

Here is a modified entity decoder. It converts the decoded characters to UTF-8.

Discussion

  • Nobody/Anonymous

    The source

     
  • Davi de Castro Reis

    Hi Robson,

    This patch makes sense, since it encodes decoded entities as UTF-8. What encoding does the current decoder produce? Is it Latin-1?

     
  • Davi de Castro Reis

    • assigned_to: nobody --> braga
     
  • Davi de Castro Reis

    By the way, the code this patch is supposed to replace is the one at html/utils.

     
  • Anonymous

    Anonymous - 2010-05-08

    This also decodes hexadecimal entities.
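For readers without the attachment, here is a minimal sketch of what decoding decimal and hexadecimal character references to UTF-8 can look like. It is not the submitted patch: the names `append_utf8`, `digit_value` and `decode_numeric_entities_utf8` are hypothetical, and named entities such as `&amp;` are deliberately left untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append the UTF-8 encoding of code point `cp` to `out`.
static void append_utf8(std::string& out, std::uint32_t cp) {
    assert(cp <= 0x10FFFF);  // caller must pass a valid Unicode code point
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Value of digit `c` in the given base (10 or 16), or -1 if it is not a digit.
static int digit_value(char c, int base) {
    if (c >= '0' && c <= '9') return c - '0';
    if (base == 16 && c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (base == 16 && c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: decode &#NNN; and &#xHHH; references to UTF-8.
// Anything that is not a well-formed numeric reference is copied unchanged.
std::string decode_numeric_entities_utf8(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == '&' && i + 2 < in.size() && in[i + 1] == '#') {
            std::size_t j = i + 2;
            int base = 10;
            if (in[j] == 'x' || in[j] == 'X') {
                base = 16;
                ++j;
            }
            std::uint32_t cp = 0;
            std::size_t digits = 0;
            // Stop accumulating once the value leaves the Unicode range,
            // which also keeps `cp` from overflowing.
            while (j < in.size() && cp <= 0x10FFFF) {
                int d = digit_value(in[j], base);
                if (d < 0) break;
                cp = cp * static_cast<std::uint32_t>(base) + static_cast<std::uint32_t>(d);
                ++j;
                ++digits;
            }
            bool valid = digits > 0 && j < in.size() && in[j] == ';' &&
                         cp != 0 && cp <= 0x10FFFF &&
                         !(cp >= 0xD800 && cp <= 0xDFFF);  // reject surrogates
            if (valid) {
                append_utf8(out, cp);
                i = j + 1;
                continue;
            }
        }
        out += in[i++];
    }
    return out;
}
```

With this sketch, `&#233;` and `&#xE9;` both decode to the two bytes `C3 A9`, while malformed input such as `&#;` passes through unchanged.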

     
  • Robson Braga Araujo

    Yes, it's Latin-1. It's documented there:
    "Convert html entities into their ISO8859-1 equivalents"

    I'll add this as a decode_entities_utf8 helper method. Eduardo, can you resubmit the file with a copyright header? Preferably under the LGPL, like the rest of the project.
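To illustrate why a separate UTF-8 helper is useful next to the Latin-1 decoder, here is a rough sketch of how the two encodings differ for the same character. The function name `latin1_to_utf8` is an assumption made for this example and is not part of the project: every ISO-8859-1 byte at or above 0x80 becomes two bytes in UTF-8, so Latin-1 output cannot be passed off as UTF-8 without re-encoding.

```cpp
#include <string>

// Hypothetical helper: re-encode an ISO-8859-1 byte string as UTF-8.
// Latin-1 bytes map one-to-one onto code points U+0000..U+00FF.
std::string latin1_to_utf8(const std::string& latin1) {
    std::string out;
    out.reserve(latin1.size());
    for (unsigned char byte : latin1) {
        if (byte < 0x80) {
            // ASCII range: identical in both encodings.
            out += static_cast<char>(byte);
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence.
            out += static_cast<char>(0xC0 | (byte >> 6));
            out += static_cast<char>(0x80 | (byte & 0x3F));
        }
    }
    return out;
}
```

For example, the Latin-1 byte `E9` (é) comes out as `C3 A9`, whereas plain ASCII text is returned unchanged.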

     
  • Anonymous

    Anonymous - 2010-05-11

    I made a few small changes and added the copyright header.
    I forgot to log in before submitting this, and it seems I can't add more attachments, so I uploaded the new file to my website.
    Download here: http://eduardo38.netne.net/decode_entities_utf8.cpp

     
