Anonymous - 2009-09-13
I have a Norwegian name "Moen".
It seems that the letter combination "oe" is somehow interpreted as the Norwegian character "ø" (also known as o-slash). The result is that the name shows up in the wrong place when looking at the list of surnames.
Is this a bug in PGV, a bug related to the Norwegian translations or could it be a setting on my server?
John Morten
What language have you selected in PGV? Different collation rules are used for different languages.
Do you use the "USE DB COLLATION" option in your database settings? When it is set to NO, sorting is done in PHP; when it is set to YES, it is done in the database.
Sorry, but I do not know anything about this setting. Where do I find it?
John Morten
The language I have selected for my user is Norwegian.
If you haven't changed it, the default is "NO", which means sorting is done in PHP.
I've just checked the code, and this would appear to be working as designed. When you select Danish or Norwegian, the letter sequences AA, AE and OE are sorted as Å, Æ and Ø.
I don't think this code has changed in years. I'm afraid I don't speak Danish or Norwegian, so I cannot say whether this behaviour is correct. However, we have many Scandinavian users, and I would be surprised if nobody has reported it before.
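To make the behaviour described above concrete, here is a minimal Python sketch of a sort key that folds AA, AE and OE into Å, Æ and Ø before comparing. It is not PhpGedView's actual PHP code. The names `fold_digraphs` and `folding_key`, the sample surnames, and the assumption that Æ, Ø and Å rank after Z are all illustrative.

```python
# Illustration only: a hypothetical stand-in for the folding rule, not PGV code.
# Assumption: the extra letters are ranked after Z in the order Æ, Ø, Å.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"
RANK = {letter: position for position, letter in enumerate(ALPHABET)}
DIGRAPHS = {"AA": "Å", "AE": "Æ", "OE": "Ø"}


def fold_digraphs(name):
    """Replace AA, AE and OE with Å, Æ and Ø, scanning left to right."""
    upper = name.upper()
    folded = []
    i = 0
    while i < len(upper):
        pair = upper[i:i + 2]
        if pair in DIGRAPHS:
            folded.append(DIGRAPHS[pair])
            i += 2  # skip both letters of the digraph
        else:
            folded.append(upper[i])
            i += 1
    return "".join(folded)


def folding_key(name):
    """Sort key: rank of each folded letter; unknown characters sort last."""
    return [RANK.get(ch, len(ALPHABET)) for ch in fold_digraphs(name)]


names = ["Moen", "Monsen", "Myhre", "Mørk"]
folded_order = sorted(names, key=folding_key)
print(folded_order)
# ['Monsen', 'Myhre', 'Moen', 'Mørk']
```

With this key, "Moen" is compared as "MØN", so it lands after "Myhre" rather than before "Monsen", which matches the misplacement reported at the top of the thread.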
John
This is part of your CONFIG.php setup, under "#2 - Database connection".
"Configuration help
Controls whether PhpGedView should use the database's built-in sorting and collation facilities. It is generally quicker to use the database to sort and filter data rather than PHP, although not all databases/versions provide this feature. The collation sequence used for each language is set in that language's settings page.
IMPORTANT: You should only set this value to YES if you do so BEFORE the database tables are created for the first time. Selecting it on an existing database could cause your data to become corrupted.
MySQL and PostgreSQL both offer good support for UTF-8, although not all collation sequences are available in earlier versions of MySQL. Other databases offer little or no support for UTF-8. If you are unsure of your database's support of UTF-8, you should set this value to No."
Stephen
John:
Can you give us a definitive answer on the behaviour described by fisharebest?
There's no reason the Norwegian rules couldn't be made different from the Danish ones. To the best of your knowledge, are the rules correct for Danish?
We need a little guidance here.
OK, I found the setting, and it is set to "No". Having read the help text quoted by Stephen above, I realise that my set-up should have been changed. I have a 60k database with 450 registered users. Sorting is one of the things that really takes time. Is there any way to change this after the database is created? I have my own "master database" in BK on another machine, so re-loading is not a major problem. I am normally more concerned about my users and their set-ups and stored messages.
As for the sorting: it is correct that AA should be sorted as Å.
OE and AE are sometimes used as substitutes when someone who is unfamiliar with Ø and Æ needs to type a name or a word, but they are not used this way in Scandinavia. A typical case would be a Norwegian sitting at an unfamiliar PC in an Internet cafe somewhere. He would likely substitute OE and AE for the Norwegian characters, and we would be able to read it, but it is not correct Norwegian. The same goes for Swedish and Danish.
The correct sorting for those two letter combinations (OE and AE) is to let them remain as they are. They should not be substituted during sorting or comparison in any of the three languages.
John Morten
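The rule proposed in the post above can be sketched as follows, assuming only AA is treated as Å while AE and OE are left as two ordinary letters. This is Python for illustration, not a patch for PGV; `norwegian_key`, the sample surnames, and the placement of Æ, Ø and Å after Z are assumptions.

```python
# Illustration only: hypothetical key implementing "fold AA, leave AE and OE".
# Assumption: the extra letters are ranked after Z in the order Æ, Ø, Å.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"
RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def norwegian_key(name):
    # Only AA is folded; AE and OE stay as they are.
    folded = name.upper().replace("AA", "Å")
    # Characters outside the alphabet (spaces, hyphens) sort last.
    return [RANK.get(ch, len(ALPHABET)) for ch in folded]


surnames = ["Aasen", "Moen", "Mørk", "Monsen", "Zakariassen", "Myhre"]
proposed_order = sorted(surnames, key=norwegian_key)
print(proposed_order)
# ['Moen', 'Monsen', 'Myhre', 'Mørk', 'Zakariassen', 'Aasen']
```

Here "Moen" stays among the MO names, while "Aasen" is still filed under Å at the end of the alphabet.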
There is apparently one more sorting question here. The order of the three extra characters in the Scandinavian alphabets is not the same in Sweden as in Norway and Denmark.
Norway: Æ Ø Å(AA)
Denmark: Æ Ø Å(AA)
Sweden: Å Ä Ö
My Norwegian database is currently sorted according to Swedish standard.
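The difference between the alphabet orders listed above can be shown with a small Python sketch that builds one sort key per language. It is only an illustration: `make_key` and the sample names are hypothetical, and filing Æ with Ä and Ø with Ö under the Swedish order is an assumption, not something stated in this thread.

```python
# Illustration only: per-language ordering of the three extra letters.
BASE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
TAILS = {
    "no": "ÆØÅ",  # Norway
    "da": "ÆØÅ",  # Denmark
    "sv": "ÅÄÖ",  # Sweden
}
# Assumption: under the Swedish order, Æ is filed with Ä and Ø with Ö.
SWEDISH_EQUIVALENTS = {"Æ": "Ä", "Ø": "Ö"}


def make_key(lang):
    """Return a sort-key function for the given language code."""
    alphabet = BASE + TAILS[lang]
    rank = {letter: position for position, letter in enumerate(alphabet)}
    equivalents = SWEDISH_EQUIVALENTS if lang == "sv" else {}

    def key(name):
        letters = [equivalents.get(ch, ch) for ch in name.upper()]
        # Letters missing from this alphabet sort after everything else.
        return [rank.get(ch, len(alphabet)) for ch in letters]

    return key


sample = ["Ødegård", "Åsen", "Ærø"]
by_norwegian = sorted(sample, key=make_key("no"))
by_swedish = sorted(sample, key=make_key("sv"))
print(by_norwegian)  # ['Ærø', 'Ødegård', 'Åsen']
print(by_swedish)    # ['Åsen', 'Ærø', 'Ødegård']
```

A Norwegian list sorted with the Swedish order would therefore show the Å names first among the extra letters instead of last.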
There is an explanation here: http://en.wikipedia.org/wiki/Danish_and_Norwegian_alphabet
However, when I read the wikis in Norwegian, Swedish and Danish, there are variations which are not covered in the English version. The Danish wiki states that oe substitution for ø should be used. That is not the case for Norwegian.
I would say we need the help of linguists from each country to get all the details right.
John Morten
If you want to change your db collation, take a look at the database import/export/conversion utility:
https://sourceforge.net/tracker/?func=detail&aid=2318005&group_id=55456&atid=477081
Thanks for the descriptions. I am not sure that I want to do that or if I will be able to. When I read that sorting and comparison could be sped up by altering this, it sounded like a good thing to do. My database is already using UTF-8, so I am not sure that a change is required in the DB itself. The main problem here is that I do not know enough about working with the DB to do things like this. I jumped at it because of the "better speed" statement.
John Morten