#7 Fix for big-endian systems

closed-fixed
None
5
2008-03-13
2008-03-02
No

There are issues with big-endian systems, which the following patch addresses (plus a few minor things).

In detail:
* espeak-phoneme-data.c: Cross-compiling is easier as one can define TARGET_BYTE_ORDER to override the BYTE_ORDER define.
* dictionary.cpp: reverse_word_bytes was not working correctly, so it is replaced by macros. Two more words are now also swapped, which makes it possible to use the original little-endian dict files -> no need to recompile them! (Now only the phon* files need to be converted.)
  GetFileLength may return a size < 0, so the check is now for <= 0 (not == 0).
* Makefile: Removed flag -O2 to allow for system-specific optimization settings, like -Os for small systems.
* synthesize.cpp: buf[3] was never set, fixed.
* numbers.cpp: Made letter_accents_0e0 unsigned to get rid of the warning "overflow in implicit constant conversion"
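
As an illustration of the dictionary.cpp item, a macro-based byte swap for a 32-bit word could look roughly like the sketch below. The name SWAP_U32 is hypothetical; the actual macros from the patch are not reproduced in this ticket.

```c
#include <stdint.h>

/* Hypothetical macro (not taken from the patch): reverse the four bytes
   of a 32-bit word. The argument is evaluated several times, so it must
   not have side effects (no SWAP_U32(*p++)). */
#define SWAP_U32(x) \
    ((((uint32_t)(x) & 0x000000FFu) << 24) | \
     (((uint32_t)(x) & 0x0000FF00u) <<  8) | \
     (((uint32_t)(x) & 0x00FF0000u) >>  8) | \
     (((uint32_t)(x) & 0xFF000000u) >> 24))
```

Applying the macro twice gives back the original value, which makes it easy to check.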

Please review the changes and if appropriate please apply them.
Thanks,
-Thomas

Discussion

  • Jonathan Duddington

    Logged In: YES
    user_id=1448760
    Originator: NO

    Sorry for the delay. I didn't get an email from SourceForge when you posted this patch here. I think I've fixed that setting now.

    Please send (email) me your modified "espeak-phoneme-data.c" file. I'm not confident about applying patch files, and I'm not sure I understand the changes.

    I'm OK with the changes to espeak.

     
  • Thomas Reitmayr

    Thomas Reitmayr - 2008-03-05

    Logged In: YES
    user_id=1811164
    Originator: YES

    Hi,
    the changed espeak-phoneme-data.c is attached. As said earlier, it might just make life easier for cross-compiling, when the phon* files should already be converted on the build host: the build host's endianness may differ from the target system's. This can now be accounted for by setting TARGET_BYTE_ORDER.

    I just noticed a problem with my patch in dictionary.cpp. The patch now swaps all the words found in the dict files. However, when a dict file is compiled, the first two words at the beginning of the file are written in the host's byte order (compiledict.cpp, function CompileDictionary), while the contents are explicitly written little-endian (function compile_dictrules). I was under the impression that the initial two words were also written using Write4Bytes...

    Should these two words also be written in little-endian order on any system (i.e. adapt the function CompileDictionary), or would you like to keep the host's endianness there?
    -Thomas
    File Added: espeak-phoneme-data.c

     
  • Jonathan Duddington

    Logged In: YES
    user_id=1448760
    Originator: NO

    > Should these two words also be written in little-endian order on
    > any system?

    Yes. I've now changed CompileDictionary() to use Write4Bytes() for
    these two words.
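
For readers unfamiliar with the technique: writing a word byte by byte, least significant byte first, gives the same file on any host. The sketch below illustrates the idea only; put_u32_le and write_u32_le are hypothetical names, and this is not the actual Write4Bytes() source.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: store a 32-bit value least significant byte first,
   independent of the byte order of the machine running the code. */
void put_u32_le(unsigned char out[4], uint32_t value)
{
    out[0] = (unsigned char)(value & 0xFFu);
    out[1] = (unsigned char)((value >> 8) & 0xFFu);
    out[2] = (unsigned char)((value >> 16) & 0xFFu);
    out[3] = (unsigned char)((value >> 24) & 0xFFu);
}

/* Hypothetical helper: write the four bytes to a file.
   Returns 0 on success, -1 if the write was short. */
int write_u32_le(FILE *f, uint32_t value)
{
    unsigned char bytes[4];
    put_u32_le(bytes, value);
    return fwrite(bytes, 1, sizeof bytes, f) == sizeof bytes ? 0 : -1;
}
```

Because only shifts and masks on the value are used, no host-specific #if is needed on the writing side.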

    > the changed espeak-phoneme-data.c is attached. As said earlier it might
    > just make life easier for cross-compiling when the phon* files should be
    > already fixed on the build host, because the build host's endianness would
    > differ from the target system's. This can now be accounted for by setting
    > the TARGET_BYTE_ORDER.

    I'm sorry, but I still don't understand the purpose of this change.

    What is the problem with the original espeak-phoneme-data ?

    I didn't write it, and I don't use it myself, but I thought its purpose was
    to convert little-endian data to big-endian data. Can it be run on either
    type of processor in order to produce the big-endian data?

    When I run your new version, it says: "No conversion necessary".

     
  • Thomas Reitmayr

    Thomas Reitmayr - 2008-03-06

    Logged In: YES
    user_id=1811164
    Originator: YES

    > Yes. I've now changed CompileDictionary() to use Write4Bytes() for
    > these two words.
    Sounds great!

    > I'm sorry, but I still don't understand the purpose of this change.
    > What is the problem with the original espeak-phoneme-data ?
    > I didn't write it, and I don't use it myself, but I thought its
    > purpose was to convert little-endian data to big-endian data. Can
    > it be run on either type of processor in order to produce the
    > big-endian data?
    > When I run your new version, it says: "No conversion necessary".
    You are right about the purpose.

    Now speaking about the original version:
    The knowledge about the endianness to use comes from sys/types.h, which provides the define "BYTE_ORDER". This obviously is the byte order of the system for which you build.
    -> If you compile the program for a big-endian architecture, you can convert the phon* files on that platform from little to big-endian.
    -> If you compile it for a little-endian architecture, the program will read and write the phon* files without changing their contents (the macros SWAP_XX do nothing).

    In a cross-compile environment the build host may be little-endian (e.g. your regular x86 PC) and the target might be big-endian (e.g. some embedded platform). IMHO it makes sense to convert the phon* files on the build host and package only those files for the embedded platform that are really needed and ready to be installed.
    So espeak-phoneme-data would run on a little-endian architecture but has to swap the endianness of the phon* files!
    So I originally redefined BYTE_ORDER by hacking espeak-phoneme-data.c on the fly (during the build process), which works OK as well. But I thought that if this could be done from outside by defining a TARGET_BYTE_ORDER, it would be nicer.
    Also, for little-endian targets it does not make much sense to read and write identical data, so I just added the "No conversion necessary" message + return(0).
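
The override described above can be sketched as follows. This is a simplified illustration, not the attached file: TARGET_IS_BIG_ENDIAN, swap_u16 and phon_u16_for_target are hypothetical names, and falling back to little-endian when BYTE_ORDER is unknown is an assumption made only for this sketch. The literal values 1234 and 4321 are the ones conventionally used for LITTLE_ENDIAN and BIG_ENDIAN.

```c
#include <stdint.h>

/* TARGET_BYTE_ORDER may be set from outside, e.g. -DTARGET_BYTE_ORDER=4321.
   Otherwise it defaults to the byte order of the system being built for. */
#ifndef TARGET_BYTE_ORDER
#  ifdef BYTE_ORDER
#    define TARGET_BYTE_ORDER BYTE_ORDER
#  else
#    define TARGET_BYTE_ORDER 1234   /* assumption: little-endian if unknown */
#  endif
#endif

/* Hypothetical convenience flag: 4321 is the usual BIG_ENDIAN value. */
#define TARGET_IS_BIG_ENDIAN (TARGET_BYTE_ORDER == 4321)

/* Reverse the two bytes of a 16-bit value. */
uint16_t swap_u16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Convert one 16-bit field of little-endian phon* data for the target.
   The decision depends on the target only, never on the build host. */
uint16_t phon_u16_for_target(uint16_t v)
{
#if TARGET_IS_BIG_ENDIAN
    return swap_u16(v);
#else
    return v;   /* little-endian target: no conversion necessary */
#endif
}
```

With a setup like this, a little-endian build host can produce big-endian data files simply by passing the define on the compiler command line.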

    Conclusion: The current tool works ok even in cross-build environments if you know how to hack it. I can also live without this part of the patch being applied :)

     
  • Jonathan Duddington

    • assigned_to: nobody --> jonsd
     
  • Jonathan Duddington

    Logged In: YES
    user_id=1448760
    Originator: NO

    I had a report from someone using a big-endian Mac of problems with the latest version of eSpeak. Unfortunately he is not available at the moment for testing, and the symptoms didn't make sense. Possibly they were caused by some configuration or installation error.

    The only relevant recent changes seem to be those discussed here.

    Are you able to test the current development eSpeak on a big-endian processor?
    Currently version 1.33.04 at http://espeak.sf.net/test/latest.html

    This would be very helpful.

     
  • Jonathan Duddington

    • status: open --> open-fixed
     
  • Jonathan Duddington

    Logged In: YES
    user_id=1448760
    Originator: NO

    eSpeak's *_dict files are now binary compatible between little and big endian systems.
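
One common way to get this kind of binary compatibility on the reading side is to assemble each word from its individual bytes, as in the sketch below. This only illustrates the general technique; get_u32_le is a hypothetical name, and the ticket does not show how eSpeak itself reads the files.

```c
#include <stdint.h>

/* Hypothetical helper: decode a 32-bit little-endian word from raw file
   bytes. It gives the same result on little- and big-endian hosts because
   it never reinterprets the bytes as a native integer. */
uint32_t get_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```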

     
  • Jonathan Duddington

    • status: open-fixed --> closed-fixed
     
