
#104 Problem with dot and comma per locale

open
nobody
8
2014-08-19
2009-03-16
SauLus
No

When loading files, all vertex coordinates are truncated to integers because the decimal separator is recognized incorrectly!

My locale is German; on Ubuntu Jaunty, locale reports:
$ locale
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=

In Germany, the decimal separator is ','. Some other countries do it the other way round. I suppose Qt relies on the locale to determine the decimal separator.

That's why exchanging models using MeshLab will not work across countries.

Many programs decided to hardcode the decimal separator.

That's why even sharing files between programs running under the same locale may fail.

A hardcoded workaround would be to look for ',' or '.' and change it to the appropriate character. I do this with the following code:

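// substitute all occurrences of ',' by '.'
while (getline(file, line)) {
    string::size_type pos = line.find(',');
    while (pos != string::npos) {
        line[pos] = '.';
        pos = line.find(',', pos + 1);  // continue after the replaced character
    }
    // use the converted line here; it is overwritten on the next iteration
}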

This is not a Linux-only bug. I've had the same trouble with Windows XP (exporting) and Vista (importing). Even though both had the same language settings, the decimal separator differed.

Example: I exported a file with blender that looks like this:
$ head doublediamond.off
OFF
98 192 0
-0.468690 0.214344 0.230856
-0.326480 0.271458 0.181382
-0.468690 0.280055 0.124938
-0.326480 0.320207 0.063693
-0.468690 0.303129 -0.000000
-0.326480 0.320207 -0.063694
-0.468690 0.280055 -0.124939
-0.326480 0.271458 -0.181383

MeshLab didn't show anything, so I exported the mesh again from MeshLab. Now the file looks like this:
$ head doublediamond-mshlb-exp.off
OFF
98 192 0
-0 0 0
-0 0 0
-0 0 0
-0 0 0
-0 0 -0
-0 0 -0
-0 0 -0
-0 0 -0
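
The truncation above can be reproduced without installing a German locale. The following is a minimal sketch, not MeshLab's actual parsing code: `CommaDecimal`, `commaLocale`, and `parseToken` are hypothetical names, and the facet only simulates a locale whose decimal separator is ','. Under such a locale the stream stops reading at the '.', so "-0.468690" yields -0, while the classic "C" locale reads the full value.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that mimics a locale whose decimal separator is ','.
struct CommaDecimal : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

// A locale that behaves like the classic "C" locale except for the decimal separator.
std::locale commaLocale() {
    // The locale takes ownership of the facet and deletes it when no longer used.
    return std::locale(std::locale::classic(), new CommaDecimal);
}

// Parse one numeric token using the given locale.
// Characters that the locale does not accept as part of a number end the parse.
double parseToken(const std::string& token, const std::locale& loc) {
    std::istringstream in(token);
    in.imbue(loc);
    double value = 0.0;
    in >> value;
    return value;
}
```

Calling `parseToken("0.214344", commaLocale())` returns 0, whereas `parseToken("0.214344", std::locale::classic())` returns the expected value. Imbuing the classic locale on the stream that reads the file is one way to make parsing independent of the user's settings.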

Please fix this as soon as possible, because the manual conversion takes time and is often forgotten.

Discussion

  • SauLus

    SauLus - 2010-04-02

    This bug does not only affect the .off file format. I still get this problem in the latest svn version, and it affects many file formats, e.g. .off, .ply, and .obj. Fixing this manually is a big pain and severely slows down the workflow. I repeat my feature request to check for ',' or '.' and change it to the appropriate delimiter.

    I saw some other bug reports that probably result from the same issue, e.g. 2960691, 2900886, and 2530990.

     
  • SauLus

    SauLus - 2010-04-02

    OK, I wrote a patch for this problem. The funny thing is that the delimiter has either changed or I mixed them up in my last report. However, the following works fine for me:

    Index: import_obj.h

    --- import_obj.h (revision 3680)
    +++ import_obj.h (working copy)
    @@ -703,8 +703,14 @@
     {
     if(stream.eof()) return;
     std::string line;
    -do
    -  std::getline(stream, line);
    +do {
    +  std::getline(stream, line);
    +  std::string::size_type pos = line.find(".");
    +  while (pos != std::string::npos) {
    +    line.replace(pos,1,",");
    +    pos = line.find(".");
    +  }
    +}
     while ((line[0] == '#' || line.length()==0) && !stream.eof()); // skip comments and empty lines
     if ((line[0] == '#') || (line.length() == 0))  // can be true only on last line of file
      

    The file resides in devel/vcglib/wrap/io_trimesh

     
  • SauLus

    SauLus - 2010-04-02

    I thought a bit about this hardcoding fix. I assume MeshLab should be able to read all supported files, regardless of the decimal separator used. So I examined the problem a bit using this C++ program:

    /* setlocale example */

    #include <stdio.h>
    #include <locale.h>

    int main ()
    {
    struct lconv * lc;
    int run=0;

    do {
    lc = localeconv ();
    printf ("Locale: %s, Decimal point: \"%s\", Thousands separator: \"%s\"\n", setlocale(LC_ALL,NULL),lc->mon_decimal_point,lc->mon_thousands_sep );

    switch (run) {
      case 0:
        setlocale (LC_ALL,"");
        break;
      case 1:
        setlocale (LC_ALL,"en_US.utf8");
        break;
      case 2:
        setlocale (LC_ALL,"en_GB.utf8");
        break;
      case 3:
        setlocale (LC_ALL,"it_IT.utf8");
        break;
      case 4:
        setlocale (LC_ALL,"it_CH.utf8");
        break;
    }
    

    } while (run++ < 5);

    return 0;
    }

    For me it gives the following output:

    $ g++ locale.cpp -o locale; ./locale
    Locale: C, Decimal point: "", Thousands separator: ""
    Locale: de_DE.UTF-8, Decimal point: ",", Thousands separator: "."
    Locale: en_US.utf8, Decimal point: ".", Thousands separator: ","
    Locale: en_GB.utf8, Decimal point: ".", Thousands separator: ","
    Locale: it_IT.utf8, Decimal point: ",", Thousands separator: "."
    Locale: it_CH.utf8, Decimal point: ".", Thousands separator: "'"

    Here we see that different nations handle the decimal point and the thousands separator differently. The problem is also obvious now: some nations use as their decimal point the character that another nation uses as its thousands separator. Therefore, floating point values are parsed incorrectly during import.
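
One detail worth noting about the program above: it prints the monetary fields of `lconv` (`mon_decimal_point`, `mon_thousands_sep`), which is why the "C" locale line shows empty strings. Number parsing functions such as `strtod` consult the numeric field `decimal_point` instead, and that field is "." in the "C" locale. The sketch below reads both fields side by side; `Separators` and `currentSeparators` are hypothetical names introduced only for illustration.

```cpp
#include <clocale>
#include <string>

// Hypothetical holder for the two decimal separators reported by localeconv().
struct Separators {
    std::string numeric;   // lconv::decimal_point, governed by LC_NUMERIC
    std::string monetary;  // lconv::mon_decimal_point, governed by LC_MONETARY
};

// Read both separators for whatever locale is currently set via setlocale().
Separators currentSeparators() {
    const std::lconv* lc = std::localeconv();
    // Copy into std::string right away: the lconv data may be overwritten
    // by a later setlocale() or localeconv() call.
    return Separators{lc->decimal_point, lc->mon_decimal_point};
}
```

After `std::setlocale(LC_ALL, "C")`, `currentSeparators().numeric` is "." while `.monetary` is empty. For locales such as de_DE the two fields usually agree, but a fix that targets number parsing would more naturally be based on `decimal_point`.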

    To make my fix more portable I check for the current locale and replace in accordance to the specific locale. The diff is given below.

    For any discussion you can always reach me in #meshlab at irc.freenode.org .

    David

    Index: import_obj.h

    --- import_obj.h (revision 3680)
    +++ import_obj.h (working copy)
    @@ -703,8 +703,17 @@
     {
     if(stream.eof()) return;
     std::string line;
    -do
    -  std::getline(stream, line);
    +struct lconv* lc = localeconv();
    +char * decimal_delimiter = lc->mon_decimal_point;
    +char * thousands_delimiter = lc->mon_thousands_sep;
    +do {
    +  std::getline(stream, line);
    +  std::string::size_type pos = line.find(thousands_delimiter);
    +  while (pos != std::string::npos) {
    +    line.replace(pos,1,decimal_delimiter);
    +    pos = line.find(thousands_delimiter);
    +  }
    +}
     while ((line[0] == '#' || line.length()==0) && !stream.eof()); // skip comments and empty lines
     if ((line[0] == '#') || (line.length() == 0))  // can be true only on last line of file
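
A different approach from the patch above, sketched here only as an alternative and not as code from vcglib, is to normalize the separator to '.' and then parse with the classic "C" locale, so the result no longer depends on the user's locale at all. `parseVertexLine` is a hypothetical helper for a bare "x y z" line as found in .off files. It assumes the file never uses ',' as a field or thousands separator, which would have to be checked for each supported format.

```cpp
#include <algorithm>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parse three coordinates from one text line into out[0..2].
// Accepts both '.' and ',' as decimal separator; returns false if parsing fails.
bool parseVertexLine(std::string line, double out[3]) {
    // Normalize: treat every ',' as a decimal point (assumption stated above).
    std::replace(line.begin(), line.end(), ',', '.');

    std::istringstream in(line);
    // The classic locale always uses '.' as decimal point, whatever LC_NUMERIC says.
    in.imbue(std::locale::classic());
    return static_cast<bool>(in >> out[0] >> out[1] >> out[2]);
}
```

With this helper, both "-0.468690 0.214344 0.230856" and "-0,468690 0,214344 0,230856" yield the same three values, and a comment line such as "# comment" is rejected.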
      
     
