
Possible sorting error - Norwegian

Help
2009-09-12
2013-05-30
  • John Morten Malerbakken

    I have a Norwegian name "Moen".
    It seems that the letter combination "oe" is somehow interpreted as the Norwegian character "ø" (also known as o-slash). The result is that the name shows up in the wrong place when looking at the list of surnames.

    Is this a bug in PGV, a bug related to the Norwegian translations or could it be a setting on my server?

    John Morten

     
  • Greg Roach

    Greg Roach - 2009-09-12

    What language have you selected in PGV?  Different collation rules are used for different languages.

    Do you use the "USE DB COLLATION" option in your database settings?  If it is off, sorting is done in PHP; if it is on, sorting is done in the database.

     
  • John Morten Malerbakken

    Sorry, but I do not know anything about this setting, where do I find it?

    John Morten

     
  • John Morten Malerbakken

    The language I have selected for my user is Norwegian.

     
  • Greg Roach

    Greg Roach - 2009-09-12

    If you haven't changed it, the default is "NO", which means sorting is done in PHP.

    I've just checked the code, and this would appear to be working as designed.  When you select Danish or Norwegian, the letter sequences AA, AE and OE are sorted as Å, Æ and Ø.
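
    A minimal sketch of the behaviour described above, written in Python rather than PGV's PHP. It is not PGV's actual code: the alphabet string, the `DIGRAPHS` table and the `sort_key` helper are all assumptions made for illustration. It shows why "Moen" ends up near the end of the M names once OE is treated as Ø.

```python
# Illustrative only: hypothetical names, not taken from PhpGedView.
# Norwegian/Danish alphabet: Æ, Ø and Å follow Z.
NORWEGIAN_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"
RANK = {letter: position for position, letter in enumerate(NORWEGIAN_ALPHABET)}

# The substitutions described in the post above.
DIGRAPHS = {"AA": "Å", "AE": "Æ", "OE": "Ø"}


def sort_key(name):
    """Build a sort key after replacing AA, AE and OE with Å, Æ and Ø."""
    text = name.upper()
    for digraph, letter in DIGRAPHS.items():
        text = text.replace(digraph, letter)
    # Characters outside the alphabet (spaces, hyphens) sort last.
    return [RANK.get(ch, len(RANK)) for ch in text]


surnames = ["Moen", "Monsen", "Mykle", "Møller"]
result = sorted(surnames, key=sort_key)
print(result)  # ['Monsen', 'Mykle', 'Møller', 'Moen']
```

    "Moen" is compared as "MØN", so it lands after "Mykle" and "Møller" instead of before "Monsen".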

    I don't think this code has changed in years. I'm afraid I don't speak Danish or Norwegian, so cannot say whether this behaviour is correct.  However, we have many Scandinavian users, and I would be surprised if nobody has reported it before.

     
  • Stephen Arnold

    Stephen Arnold - 2009-09-12

    John,
    This is part of your CONFIG.php setup: #2 - Database connection.

    "Configuration help

    Controls whether PhpGedView should use the database's built-in sorting and collation facilities. It is generally quicker to use the database to sort and filter data rather than PHP, although not all databases/versions provide this feature. The collation sequence used for each language is set in that language's settings page.

    IMPORTANT: You should only set this value to YES if you do so BEFORE the database tables are created for the first time. Selecting it on an existing database could cause your data to become corrupted.

    MySQL and PostgreSQL both offer good support for UTF-8, although not all collation sequences are available in earlier versions of MySQL. Other databases offer little or no support for UTF-8. If you are unsure of your database's support of UTF-8, you should set this value to No." 
    Stephen

     
  • Gerry Kroll

    Gerry Kroll - 2009-09-12

    John:
    Can you give us a definitive answer on the behaviour described by fisharebest?

    There's no reason the Norwegian rules couldn't be made different from the Danish ones.  To the best of your knowledge, are the rules correct for Danish?

    We need a little guidance here.

     
  • John Morten Malerbakken

    OK, I found the setting, and it is set to "No". Having read the help text which is referenced by Stephen above, I realise that my set-up should have been changed. I have a 60k database with 450 registered users. Sorting is one of the things that really takes time. Is there any way to change this after the database is created? I have my own "master database" in BK on another machine, so re-loading is not a major problem. I am normally more concerned about my users and their set-ups and stored messages.

    As for the sorting. It is correct that AA should be sorted as Å.
    OE and AE are sometimes used as substitutes when someone who is unfamiliar with Ø and Æ needs to type a name or a word, but they are not used this way in Scandinavia. A typical case would be a Norwegian sitting at an unfamiliar PC in an Internet cafe somewhere. He would likely substitute for the Norwegian characters, and we would be able to read it, but it is not correct Norwegian. The same goes for Swedish and Danish.

    The correct sorting for those two letter combinations (OE and AE) is to let them remain as they are. They should not be substituted during sorting or comparison in any of the three languages.
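
    The rule proposed above can be sketched in Python as follows. This is an illustration under assumptions, not a patch for PGV: the `sort_key` helper and its `digraphs` parameter are hypothetical. Only AA is mapped to Å, while OE and AE are compared letter by letter.

```python
# Illustrative only: hypothetical helper, not taken from PhpGedView.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"
RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def sort_key(name, digraphs):
    """Build a sort key, replacing only the digraphs listed in `digraphs`."""
    text = name.upper()
    for digraph, letter in digraphs.items():
        text = text.replace(digraph, letter)
    # Characters outside the alphabet sort last.
    return [RANK.get(ch, len(RANK)) for ch in text]


surnames = ["Aasen", "Moen", "Monsen", "Møller"]

# Proposed rule: AA sorts as Å; OE and AE stay as they are.
proposed = sorted(surnames, key=lambda name: sort_key(name, {"AA": "Å"}))
print(proposed)  # ['Moen', 'Monsen', 'Møller', 'Aasen']
```

    With this table "Moen" sorts under M-O-E, ahead of "Monsen", and "Aasen" moves to the end of the list together with the Å names.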

    John Morten

     
  • John Morten Malerbakken

    There is apparently one more sorting question here. The order of the three extra characters in the Scandinavian alphabets is not the same in Sweden as in Norway and Denmark.

    Norway: Æ Ø Å(AA)
    Denmark: Æ Ø Å(AA)
    Sweden: Å Ä Ö

    My Norwegian database is currently sorted according to Swedish standard.
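
    To make the difference concrete, here is a small Python sketch that sorts the same names with each alphabet order. The `TAILS` table and the `make_key` helper are hypothetical, and treating Æ as Ä and Ø as Ö under the Swedish order is an assumption made for this example. The AA rule is left out to keep it short.

```python
# Illustrative only: hypothetical names, not taken from PhpGedView.
BASE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# The three extra letters, in the order listed above for each country.
TAILS = {"no": "ÆØÅ", "da": "ÆØÅ", "sv": "ÅÄÖ"}

# Assumption: under the Swedish order, Æ is treated as Ä and Ø as Ö.
SWEDISH_EQUIVALENTS = str.maketrans("ÆØ", "ÄÖ")


def make_key(lang):
    """Return a sort-key function for the given language code."""
    alphabet = BASE + TAILS[lang]
    rank = {letter: position for position, letter in enumerate(alphabet)}

    def key(name):
        text = name.upper()
        if lang == "sv":
            text = text.translate(SWEDISH_EQUIVALENTS)
        # Characters outside the alphabet sort last.
        return [rank.get(ch, len(rank)) for ch in text]

    return key


names = ["Østby", "Ås", "Ærøy", "Berg"]
norwegian = sorted(names, key=make_key("no"))
swedish = sorted(names, key=make_key("sv"))
print(norwegian)  # ['Berg', 'Ærøy', 'Østby', 'Ås']
print(swedish)    # ['Berg', 'Ås', 'Ærøy', 'Østby']
```

    A Norwegian list sorted with the Swedish order puts the Å names ahead of Æ and Ø, which matches the symptom described above.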

    There is an explanation here: http://en.wikipedia.org/wiki/Danish_and_Norwegian_alphabet

    However, when I read the wikis in Norwegian, Swedish and Danish, there are variations which are not covered in the English version. The Danish wiki states that oe substitution for ø should be used. That is not the case for Norwegian.
    I would say we need the help of linguists from each country to get all the details right.

    John Morten

     
  • John Morten Malerbakken

    Thanks for the descriptions. I am not sure that I want to do that or if I will be able to. When I said that sorting and comparison could be sped up by altering this it sounded like a good thing to do. My database is already using UTF-8, so I am not sure that a change is required in the DB itself. The main problem here is that I do not know enough about working with the DB to do things like this. I jumped at it because of the "better speed" statement.

    John Morten

     
