TinyXML / Discussion / Developer: Implementing a Character Set

Anonymous - 2003-10-25

Hi!

Great piece of software! However, I read somewhere that it supports only the Latin-1 character set. I would also like it to support the Windows-1250 character set.

What is the fastest way to enable this?

- B Sizer - 2003-10-27
  
  I expect it's not Latin-1 specific, but ASCII-specific. It will probably support whichever codeset your locale is currently set to. If you need it to support wide characters, you probably need to change the #define for TIXML_STRING from std::string to std::wstring, but I expect that will break other parts of the code.
  
- Anonymous - 2003-11-23
  
  It works with extended ASCII (European characters) with a small modification:
  
  line 234, tinyxmlparser.cpp:
  else if ( *p==' '/*isspace( *p )*/ )
  
  so, the alien characters (because *yes* we are aliens!) won't be skipped.
  
  -sbrt
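
  A note on the one-line change above: on platforms where plain char is signed, bytes at or above 0x80 become negative values, and passing a negative value (other than EOF) to isspace() is undefined behavior in C and C++. That is one possible reason the accented characters were misclassified, though the thread does not confirm it. A minimal sketch of an alternative that keeps isspace() but casts each byte to unsigned char first; the helper names are hypothetical and this is not TinyXML's actual code:

  ```cpp
  #include <cassert>
  #include <cctype>

  // Hypothetical helper, not part of TinyXML: classify one byte as whitespace.
  // The cast to unsigned char keeps bytes >= 0x80 inside the range that
  // isspace() is defined for.
  inline bool IsWhiteSpaceByte(char c)
  {
      return std::isspace(static_cast<unsigned char>(c)) != 0;
  }

  // Hypothetical helper: advance past leading whitespace in a
  // NUL-terminated buffer and return a pointer to the first other byte.
  inline const char* SkipLeadingWhiteSpace(const char* p)
  {
      assert(p != nullptr);
      while (*p != '\0' && IsWhiteSpaceByte(*p))
          ++p;
      return p;
  }
  ```

  With the default "C" locale, a call such as SkipLeadingWhiteSpace(" \t\n\xE9t\xE9") should stop at the first 0xE9 byte, so tabs and newlines are still skipped while the accented characters are kept.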
  
  - B Sizer - 2003-11-23
    
    However, doing that will probably break your code when it encounters a tab, newline, or carriage return in that position, no?
    
- Anonymous - 2003-12-03
  
  As this code is intended to skip the "empty" characters surrounding the meat of real words, the tab, newline, etc. simply won't be skipped, which is not very important. In any case, it can't make the code crash or corrupt the output XML.
  
  I've just tested my app ... and it works fine.
  
  Note that many people use XML for config files containing installation/current/whatever folders and filenames, and extended ASCII can appear in these paths/filenames on European/Asian systems ...
  
  It was a pain to find a small, stable and extended-ASCII-compliant XML I/O library ... Thanks to TinyXML (with the slight modification above). I used expat/sxp before and had to pad all Asian/European strings with two blank characters, otherwise it simply crashed.
  
  -sbrt
  
  - B Sizer - 2003-12-05
    
    Ok, I just took a closer look at the code.
    
    It already checks for newlines and carriage returns before the isspace check, so you're OK there. However, if you want tabs to be handled as well, just add:
    
    || *p == '\t'
    
    to the carriage return checks above. I don't remember the code for a form feed but it's so rarely used that I doubt it matters.
    
    Incidentally, for most (if not all) users, the separate carriage return checks are unnecessary anyway, as they are covered by isspace().
    
    However, I would point out that if 'isspace' is treating your extended ASCII as spaces, there's probably something else wrong. For example, maybe you have the wrong locale set in tinyxml/the application using it. You may just need to call setlocale(LC_ALL, "french") for example, choosing the appropriate language/region string.
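
    The explicit comparisons suggested above can be collected into a single locale-independent test. A minimal sketch, assuming only the four characters that the XML 1.0 specification defines as whitespace (space, tab, carriage return, line feed) should be trimmed; the function names are hypothetical and not taken from TinyXML. For reference, a form feed is written '\f' in C and C++, and XML 1.0 does not count it as whitespace:

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <string>

    // Hypothetical helper, not TinyXML code: true for the four whitespace
    // characters of XML 1.0. No locale is consulted, so bytes >= 0x80 are
    // never treated as whitespace.
    inline bool IsXmlWhiteSpace(char c)
    {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    // Hypothetical helper: return a copy of s without leading and trailing
    // XML whitespace. Interior whitespace is left untouched.
    inline std::string TrimXmlWhiteSpace(const std::string& s)
    {
        std::size_t first = 0;
        std::size_t last = s.size();
        while (first < last && IsXmlWhiteSpace(s[first]))
            ++first;
        while (last > first && IsXmlWhiteSpace(s[last - 1]))
            --last;
        assert(first <= last);
        return s.substr(first, last - first);
    }
    ```

    Because the check never calls isspace(), its result does not depend on which locale the application has selected.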
    
- Anonymous - 2004-01-07
  
  Argh! You're right, and I've wasted hours just because of my ignorance about setlocale :-/
  
  But it is not required to specify "french", etc. The simplest approach is to call this at app startup:
  setlocale(LC_ALL,"");
  Which, according to MSDN:
  "Sets the locale to the default, which is the system-default ANSI code page obtained from the operating system."
  
  -sbrt
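
  A minimal sketch of that startup call, assuming a hosted C++ program that begins in the default "C" locale, as the standard specifies. The helper name UseEnvironmentLocale is hypothetical; only std::setlocale is a real library function. The MSDN wording above describes Windows; on other systems the empty string typically selects the locale named by the environment:

  ```cpp
  #include <cassert>
  #include <clocale>
  #include <string>

  // Hypothetical startup helper: try to switch from the default "C" locale
  // to the environment's locale, then report which locale is active.
  inline std::string UseEnvironmentLocale()
  {
      // An empty string requests the environment's default locale. If the
      // request cannot be honored, setlocale returns a null pointer and the
      // previous locale stays in effect, so the result is ignored here.
      std::setlocale(LC_ALL, "");

      // A null locale argument queries the current locale without changing it.
      const char* active = std::setlocale(LC_ALL, nullptr);
      assert(active != nullptr);
      return std::string(active);
  }
  ```

  Calling UseEnvironmentLocale() once at the top of main(), before any XML is parsed, should let isspace() and related functions classify characters according to the user's code page.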
  