#388 Expat 1.95.8 ReadMe

Not a Bug
closed-rejected
1
2005-08-10
2005-08-10
abhijitk
No

From Expat ReadMe:
--------------------------------------------------------------------------------------
If you are interested in building Expat to provide document
information in UTF-16 rather than the default UTF-8,
follow these instructions:
1. For UTF-16 output as unsigned short (and
version/error strings as char), run:
./configure CPPFLAGS=-DXML_UNICODE

For UTF-16 output as wchar_t (incl. version/error
strings), run:
./configure CFLAGS="-g -O2 -fshort-wchar" \
            CPPFLAGS=-DXML_UNICODE_WCHAR_T

2. Edit the MakeFile, changing:
LIBRARY = libexpat.la
to:
LIBRARY = libexpatw.la
(Note the additional "w" in the library name.)

3. Run "make buildlib" (which builds the library only).
4. Run "make installlib" (which installs the library
only).
--------------------------------------------------------------------------------------
As per the definition of -fshort-wchar:

-fshort-wchar
Override the underlying type for wchar_t to be short
unsigned int instead of the default for the target.
This option is useful for building programs to run
under WINE.

Warning: the -fshort-wchar switch causes GCC
to generate code that is not binary compatible with
code generated without that switch. Use it to conform
to a non-default application binary interface.
--------------------------------------------------------------------------------------

So this suggests that -fshort-wchar should be used
when I need UTF-16 output as unsigned short. But the
ReadMe uses it for UTF-16 output as wchar_t. Should I
use -fshort-wchar for UTF-16 output as wchar_t, as the
ReadMe suggests?

Please correct me if my understanding is incorrect.

Discussion

  • Karl Waclawek

    Karl Waclawek - 2005-08-10

    Logged In: YES
    user_id=290026

    No, use -fshort-wchar only if you want UTF-16 output
    to be delivered as a *two-byte* wchar_t type (necessary
    because on Unix wchar_t is typically four bytes).
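
To make the two README variants concrete, here is a minimal sketch of how the character type follows from the two -D flags. The names sketch_xml_char, sketch_mode, sketch_char_size and sketch_needs_short_wchar are hypothetical stand-ins for illustration, not Expat's own identifiers; only the macro names XML_UNICODE and XML_UNICODE_WCHAR_T come from the README.

```c
#include <stddef.h>  /* wchar_t, size_t */

/* Hypothetical stand-in for the parser's character type. The selection
   mirrors the README: XML_UNICODE_WCHAR_T gives wchar_t, XML_UNICODE
   gives unsigned short, and no flag gives the default UTF-8 char. */
#if defined(XML_UNICODE_WCHAR_T)
typedef wchar_t sketch_xml_char;
#define SKETCH_MODE "UTF-16 as wchar_t"
#elif defined(XML_UNICODE)
typedef unsigned short sketch_xml_char;
#define SKETCH_MODE "UTF-16 as unsigned short"
#else
typedef char sketch_xml_char;
#define SKETCH_MODE "UTF-8 as char"
#endif

/* Human-readable name of the selected build mode. */
const char *sketch_mode(void) { return SKETCH_MODE; }

/* Size in bytes of one character unit in the selected mode. */
size_t sketch_char_size(void) { return sizeof(sketch_xml_char); }

/* Returns 1 only when the wchar_t variant was selected but wchar_t is
   not two bytes wide, i.e. the case where -fshort-wchar is needed. */
int sketch_needs_short_wchar(void)
{
#if defined(XML_UNICODE_WCHAR_T)
    return sizeof(wchar_t) != 2;
#else
    return 0;
#endif
}
```

With plain -DXML_UNICODE the unit is unsigned short and the width of wchar_t does not matter; only the -DXML_UNICODE_WCHAR_T variant depends on a two-byte wchar_t.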

     
  • abhijitk

    abhijitk - 2005-08-10

    Logged In: YES
    user_id=1312629

    So wchar_t has to be two bytes for expat to work correctly?

    I am compiling a 32-bit application on Solaris, so if I don't
    use -fshort-wchar, wchar_t will be defined as long. My
    code does not depend on the size of wchar_t; will expat give
    the desired result in this scenario?

    Basically, I am getting a segmentation fault in my application,
    so I am checking whether the -fshort-wchar switch, which causes
    GCC to generate code that is not binary compatible with
    code generated without that switch, is the reason.
    My own libraries are built with this option, but other system
    libraries (from /usr/lib) are not compiled with it.

     
  • Karl Waclawek

    Karl Waclawek - 2005-08-10

    Logged In: YES
    user_id=290026

    Apparently, you want to process Unicode on Solaris not as
    UTF-8 but as UTF-16 encoded. Therefore you need a
    two-byte data type for the UTF-16 base character type.

    Which base data-type for UTF-16 are you using elsewhere,
    unsigned short or a (redefined) wchar_t?

     
  • abhijitk

    abhijitk - 2005-08-10

    Logged In: YES
    user_id=1312629

    In my code I am using the wchar_t data type; I have not
    specifically defined it in my code.
    The existing library works on Mac because Mac has wchar_t
    defined as unsigned short in the system headers.
    On Solaris I use the -fshort-wchar option for compiling, so I
    guess wchar_t gets defined as unsigned short.

    Let me explain briefly what I am trying to do here: I am
    porting the application from Mac to Solaris.
    On Mac the expat library is compiled with XML_UNICODE, so
    XML_Char is defined as wchar_t and wchar_t is defined as
    unsigned short in the system headers. My library, which
    interfaces with expat, uses wchar_t everywhere because it is
    available on Mac as unsigned short, so it worked fine.

    Now on Solaris, if I have to use the -fshort-wchar option to
    get a two-byte wchar_t, I have two options:
    1) Compile everything with the -fshort-wchar option, including
    system libraries, so that everything uses a two-byte wchar_t.
    OR
    2) Change my entire library code to use something
    like XML_Char so I can control how it is defined.
    But in the second case the system libraries will still use
    wchar_t defined as long.

    This is a somewhat off-topic question, but please advise, or point
    me to any place where I can get more information on the wide build
    of expat.

    Thanks.
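
Option 2 above can be sketched as follows: the library defines its own 16-bit unit type and converts at the boundary with code that still uses the platform's wchar_t. This is only an illustration under assumptions: the names app_utf16 and app_wide_to_utf16 are hypothetical, and the sketch assumes a wide wchar_t holds Unicode code points.

```c
#include <stddef.h>  /* size_t, wchar_t */

/* Hypothetical 16-bit unit type, independent of the width of wchar_t. */
typedef unsigned short app_utf16;

/* Convert a NUL-terminated wchar_t string to NUL-terminated UTF-16.
   Returns the number of units written (terminator excluded), or
   (size_t)-1 if the output does not fit or a value is out of range.
   Assumes a wide wchar_t holds Unicode code points; with a two-byte
   wchar_t the units are copied through unchanged. */
size_t app_wide_to_utf16(const wchar_t *in, app_utf16 *out, size_t out_cap)
{
    size_t n = 0;

    for (; *in != L'\0'; ++in) {
        unsigned long cp = (unsigned long)*in;

        if (cp < 0x10000UL) {
            /* One unit, plus room for the terminator. */
            if (n + 1 >= out_cap)
                return (size_t)-1;
            out[n++] = (app_utf16)cp;
        } else if (cp <= 0x10FFFFUL) {
            /* Surrogate pair, plus room for the terminator. */
            if (n + 2 >= out_cap)
                return (size_t)-1;
            cp -= 0x10000UL;
            out[n++] = (app_utf16)(0xD800UL + (cp >> 10));
            out[n++] = (app_utf16)(0xDC00UL + (cp & 0x3FFUL));
        } else {
            return (size_t)-1;  /* not a valid code point */
        }
    }

    if (n >= out_cap)
        return (size_t)-1;
    out[n] = 0;
    return n;
}
```

With a type like this the library no longer needs -fshort-wchar, so it stays binary compatible with the system libraries. The cost is a conversion wherever wchar_t strings cross into the 16-bit code.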

     
  • Karl Waclawek

    Karl Waclawek - 2005-08-10

    Logged In: YES
    user_id=290026

    I don't know of any specific sources for wchar_t problems,
    but Google should help you there.

    I would think that if all your application code is built on the
    assumption of a 2-byte wchar_t, then it would make sense to
    use the system libraries compiled for a short wchar_t.
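
Since everything that exchanges wchar_t data has to agree on its width, a guard along these lines may help catch a mismatch before it shows up as a crash. It is a sketch only: the macro APP_REQUIRE_SHORT_WCHAR and the app_* functions are hypothetical names, not part of Expat or of any system library.

```c
#include <stddef.h>  /* size_t, wchar_t */

/* Hypothetical opt-in macro: define APP_REQUIRE_SHORT_WCHAR when building
   code that assumes a two-byte wchar_t. If the assumption does not hold,
   the array size is -1 and compilation stops here. */
#ifdef APP_REQUIRE_SHORT_WCHAR
typedef char app_wchar_must_be_two_bytes[(sizeof(wchar_t) == 2) ? 1 : -1];
#endif

/* Width of wchar_t as seen by the translation unit holding this code,
   for example a library built separately from its callers. */
size_t app_built_wchar_size(void)
{
    return sizeof(wchar_t);
}

/* The caller passes its own sizeof(wchar_t); the result is 1 when both
   sides were compiled with the same width and 0 otherwise. */
int app_wchar_size_matches(size_t caller_size)
{
    return caller_size == app_built_wchar_size();
}
```

An application could call app_wchar_size_matches(sizeof(wchar_t)) once at start-up and refuse to continue on 0. The check only covers code you build yourself; it cannot make prebuilt system libraries accept a two-byte wchar_t.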

     
  • Karl Waclawek

    Karl Waclawek - 2005-08-10
    • milestone: --> Not a Bug
    • priority: 5 --> 1
     
  • abhijitk

    abhijitk - 2005-08-10

    Logged In: YES
    user_id=1312629

    Yes, Google is the only help I have been using for a few years
    now.....
    Thanks for your time and info; it made things a bit clearer
    for me.

     
  • abhijitk

    abhijitk - 2005-08-10

    Logged In: YES
    user_id=1312629

    I came across this bug :
    [ 931546 ] Unicode support for Windows and Unix are not
    compatible

    This is exactly the problem I am facing too. I already found
    your answer there.

     
  • Karl Waclawek

    Karl Waclawek - 2005-08-10

    Logged In: YES
    user_id=290026

    I am glad you found some answers, even though
    they are probably not what you wanted.

    Closing this issue.

     
  • Karl Waclawek

    Karl Waclawek - 2005-08-10
    • status: open --> closed-rejected
     
