Weird “ †’ – showing instead of " or '

Help
2020-10-27
2020-10-29
  • Paul Stephens

    Paul Stephens - 2020-10-27

    I'm having strange characters like “ †’ – showing instead of " or ' often inside notes
    How can I fix this?
    PhpGedView 4.3.1
    PHP version: 5.6.30
    Cheers Paul.

     
  • Gerry Kroll

    Gerry Kroll - 2020-10-27

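    You may have a browser configuration problem. PhpGedView produces all of its pages in UTF-8. If your browser isn't interpreting them as UTF-8 and is instead using a single-byte encoding (loosely "ASCII", more precisely something like Windows-1252), each special character shows up as several odd-looking characters.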

    It's also possible that some of your Notes text was double-encoded. This can happen when you use an external text editor such as Notepad: you open a UTF-8 encoded file in ASCII mode and then incorrectly save it in UTF-8 mode. Recent versions of Notepad save in UTF-8 mode unless you tell them otherwise.

    When a UTF-8 file is read in ASCII mode (in practice a single-byte encoding such as Windows-1252), each special character is shown as two or more consecutive odd-looking characters, like the ones in your original post. If you then save that file in UTF-8 mode, each of those odd-looking characters is individually encoded into UTF-8. The result is Notes text in which the original special characters are no longer stored at all, so they can't be displayed properly.
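    The double-encoding described above can be reproduced in a few lines of Python. This is only a sketch: it assumes the "ASCII mode" was really Windows-1252 (Python codec `cp1252`), which is a guess about the editor involved, and the helper name `misread_as_cp1252` is made up for illustration.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode as UTF-8, then decode those bytes as Windows-1252 (the misreading)."""
    return text.encode("utf-8").decode("cp1252")


original = "\u201c"                        # left double quotation mark
garbled = misread_as_cp1252(original)      # one character becomes three
double_encoded = garbled.encode("utf-8")   # what saving "in UTF-8 mode" then writes

print(repr(garbled))
print(len(original.encode("utf-8")), "bytes before,",
      len(double_encoded), "bytes after double-encoding")
```

    Once the file has been saved this way, the original 3-byte quote is gone from the data, which is why every browser and every computer shows the same odd characters.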

    This Wikipedia article discusses UTF-8:
    https://en.wikipedia.org/wiki/UTF-8

     
  • Paul Stephens

    Paul Stephens - 2020-10-27

    G'Day & Thanks Gerry,
    It's happening in all browsers & on different computers & for other users viewing my pages too.
    I guess they must have been encoded into ASCII mode at some point.
    Is there some way of correcting this in the database? or am I stuck manually fixing each note?
    Cheers, Paul

     
  • Gerry Kroll

    Gerry Kroll - 2020-10-27

    The trouble is, not all of your Notes that contain special characters have this problem.

    You can export your database to a GEDCOM file that you can then feed into Notepad in UTF-8 mode. Inspect the GEDCOM file until you find a Note whose text exhibits this problem. Do a replace all, replacing that particular sequence (it should be 2 characters) with the correct special character. Repeat for other weird sequences, always doing just the first 2 characters of that sequence, until there are no more errors.
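    The replace-all procedure above could also be scripted. The following is a rough sketch, not something PhpGedView provides: it assumes the text was misread as Windows-1252, the names `build_repair_table` and `repair` are hypothetical, and the list of characters is only an illustrative sample that you would extend with whatever you find in your own GEDCOM.

```python
def build_repair_table(chars: str) -> dict:
    """Map each garbled sequence to the character it originally stood for."""
    table = {}
    for ch in chars:
        try:
            garbled = ch.encode("utf-8").decode("cp1252")
        except UnicodeDecodeError:
            # Some bytes (e.g. 0x9D, the last byte of a right double quote)
            # have no Windows-1252 meaning; those need fixing by hand.
            continue
        table[garbled] = ch
    return table


def repair(text: str, table: dict) -> str:
    """Apply every replacement, longest garbled sequence first."""
    for garbled in sorted(table, key=len, reverse=True):
        text = text.replace(garbled, table[garbled])
    return text


# Illustrative sample: curly quotes, en dash, ellipsis, two accented letters.
SAMPLE_CHARS = "\u201c\u201d\u2018\u2019\u2013\u2026\u00e9\u00fc"
TABLE = build_repair_table(SAMPLE_CHARS)

damaged = "It\u00e2\u20ac\u2122s a caf\u00c3\u00a9"
print(repair(damaged, TABLE))
```

    Text that was never damaged contains none of the garbled sequences, so it passes through unchanged. That matters here because only some of the Notes are affected. Check the result by eye before importing it, since a blind replace can also hit text that legitimately contains one of these sequences.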

    Note: You may encounter the same erroneous coding in other places, such as place names. These can be corrected in the same way.

    When there are no more weird sequences in your GEDCOM, look at the first few lines of the GEDCOM. There should be a 1 CHAR UTF-8 line in there. If not, change the 1 CHAR line that you find to say "UTF-8".
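    As a sketch of the header check just described, the function below forces the `1 CHAR` line inside the `0 HEAD` record to say UTF-8. The function name `set_char_utf8` and the sample lines are made up for illustration, and it doesn't add a `1 CHAR` line if the header has none.

```python
def set_char_utf8(lines: list) -> list:
    """Return a copy of the GEDCOM lines with the header's 1 CHAR line set to UTF-8."""
    result = []
    in_header = False
    for line in lines:
        stripped = line.strip().lstrip("\ufeff")  # ignore a leading BOM, if any
        if stripped.startswith("0 "):
            # Every level-0 line starts a new record; only HEAD is the header.
            in_header = stripped == "0 HEAD"
        if in_header and stripped.split(" ", 2)[:2] == ["1", "CHAR"]:
            line = "1 CHAR UTF-8"
        result.append(line)
    return result


sample = ["0 HEAD", "1 SOUR PhpGedView", "1 CHAR ANSEL", "0 @I1@ INDI", "1 NAME Paul /Stephens/"]
for line in set_char_utf8(sample):
    print(line)
```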

    When you're completely done, save the resulting file in UTF-8 mode, preferably without a BOM. The Byte Order Mark (BOM) is a 3-byte sequence at the beginning of the file that identifies the file as being in UTF-8. If it's there, PhpGedView will tell you, but it only causes a warning message.
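    If your editor insists on writing a BOM, it can be removed afterwards. A minimal sketch using Python's standard `codecs.BOM_UTF8` constant follows; the function name `strip_bom` is just for illustration.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Remove the 3-byte UTF-8 BOM (EF BB BF) if the data starts with one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = codecs.BOM_UTF8 + "0 HEAD\n1 CHAR UTF-8\n".encode("utf-8")
print(len(with_bom) - len(strip_bom(with_bom)), "bytes removed")
```

    In practice you would read the file in binary mode, pass the bytes through this function, and write them back out in binary mode.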

    You can zip the resultant file and then upload and import it into your database. Be sure to tell PhpGedView to replace the existing database contents. If you're asked whether you want to keep media links (shouldn't happen), say "no". The GEDCOM you previously downloaded has all the required media information in it.

     
  • Gerry Kroll

    Gerry Kroll - 2020-10-27

    Addendum:

    I said, "should be 2 characters". That's not true. Depending on which character was originally represented, you can have a 3-byte sequence. You'll just have to play it by ear.

    The UTF-8 specification allows sequences of up to 4 bytes. 4-byte sequences are very rarely used, and I doubt that you'll run into those.
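    To see how many bytes a given character takes, and so how many odd-looking characters to expect when it has been misread one byte at a time, a quick check like this can help. The sample characters are arbitrary picks for illustration, and `utf8_length` is a made-up helper name.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# A plain letter, an accented letter, a curly quote, and a musical symbol.
for ch in ["A", "\u00e9", "\u201c", "\U0001d11e"]:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s) in UTF-8")
```

    Curly quotes and dashes are 3-byte characters, which would explain why the sequences in the original post are three characters long rather than two.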

     
  • Paul Stephens

    Paul Stephens - 2020-10-29

    Hmm... Ok I'll give it a go ... watch this space ....

     
  • Paul Stephens

    Paul Stephens - 2020-10-29

    Cool..all done, Thank you Gerry, not as painful as I thought it might be. Cheers.

     
  • Gerry Kroll

    Gerry Kroll - 2020-10-29

    Good. Glad to have been able to help.

    UTF-8 encoding is quite tricky. I had to become intimately familiar with it when I cooked up some UTF-8 functions and sorting algorithms that are used extensively in PhpGedView.

     
