Uploading GEDCOM with the admin function

Help
2003-11-21
2003-11-25
  • Robert Weinland

    Robert Weinland - 2003-11-21

    Hello,

    For compatibility with the accented characters of the French language and with my genealogy program (TMG), I set the character encoding of PhpGedView to iso-8859-1 instead of UTF-8, with the CHAR tag set to ANSI in my GEDCOM files.
    This works well as long as I upload my GEDCOM files with an FTP program.

    But if I use the upload option of the PhpGedView admin functions, I run into trouble because the CHAR tag of my GEDCOM file is then set to UTF-8.
    The accented characters of the imported data are then not displayed properly.

    Why does PhpGedView override the CHAR tag of GEDCOM files when the program itself allows a character encoding other than UTF-8?

    --
    Robert

     
    • John Finlay

      John Finlay - 2003-11-21

      The upload function of the admin page will detect if the GEDCOM is ANSI and automatically convert it to UTF-8.  I should add a checkbox to the upload form that allows you to specify whether you want to convert it or not.
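
A minimal sketch of the kind of conversion described above, not PhpGedView's actual code. The function name `ansi_gedcom_to_utf8` and the choice of Windows-1252 (`cp1252`) as the meaning of "ANSI" are assumptions made for illustration; a file written as ISO-8859-1 could pass `source_encoding="iso-8859-1"` instead.

```python
import re

# Matches the header line that declares the character set, e.g. "1 CHAR ANSI".
CHAR_TAG = re.compile(r"^1 CHAR[ \t]+(\S+)", re.MULTILINE)


def ansi_gedcom_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode a GEDCOM as UTF-8 if its header declares ANSI.

    Hypothetical helper: files that do not declare ANSI are returned unchanged.
    """
    # Latin-1 maps every byte to a character, so it is safe for inspecting
    # the ASCII-only CHAR tag without knowing the real encoding yet.
    probe = raw.decode("latin-1")
    match = CHAR_TAG.search(probe)
    if match is None or match.group(1).upper() != "ANSI":
        return raw

    text = raw.decode(source_encoding)
    # Update the header so it matches the new byte encoding.
    text = CHAR_TAG.sub("1 CHAR UTF-8", text, count=1)
    return text.encode("utf-8")
```

Rewriting the CHAR tag together with the bytes keeps the header truthful; converting the bytes alone would leave a file that claims ANSI but contains UTF-8.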

      I have done more research on the UTF-8 encoding scheme, and I have found that UTF-8 really is the best way to encode your GEDCOM files because it allows you to mix and match languages.  So I could read French, English, Chinese, and Hebrew all in the same page.  Anything that you can encode in ISO-8859 or Unicode UTF-16 can be encoded in UTF-8.  This is good for a project like PhpGedView where you may be reading an English GEDCOM in Chinese.

      The only disadvantage of UTF-8 is that it may take a few more bytes to encode the same character.
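
The size difference can be checked directly. The sketch below uses a hypothetical helper, `encoded_sizes`, to compare the number of bytes a string needs in UTF-8 and in ISO-8859-1; it reports `None` when the text cannot be written in ISO-8859-1 at all.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of `text` in UTF-8 and ISO-8859-1.

    Hypothetical helper for illustration; the ISO-8859-1 entry is None
    when the text contains characters that encoding cannot represent.
    """
    sizes = {"utf-8": len(text.encode("utf-8"))}
    try:
        sizes["iso-8859-1"] = len(text.encode("iso-8859-1"))
    except UnicodeEncodeError:
        sizes["iso-8859-1"] = None
    return sizes


# Plain ASCII costs the same in both encodings.
ascii_sizes = encoded_sizes("artillerie")
# Each accented Latin letter takes two bytes in UTF-8 but one in ISO-8859-1.
french_sizes = encoded_sizes("régiment")
# A Chinese character takes three bytes in UTF-8 and has no ISO-8859-1 form.
chinese_sizes = encoded_sizes("中")
```

For a mostly French or English GEDCOM the overhead is one extra byte per accented letter, which is usually a small fraction of the file.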

      So I recommend that everyone encode their GEDCOMs in UTF-8.

      --John

       
    • Robert Weinland

      Robert Weinland - 2003-11-22

      Hi John,

      The problem is that my genealogy program does not manage UTF-8 encoding (like many other programs I suppose).  

      For example, when you use UTF-8 encoding for a field that contains accented characters, 'View gedcom record' will display something like:

      2 NOTE soldat au 8Ã¨me rÃ©giment d'artillerie, 21 ans

      This field won't be imported properly by a genealogy program that can't handle UTF-8 encoding. Am I wrong?

      If I use the iso-8859-1 encoding, PhpGedView will display:

      2 NOTE soldat au 8ème régiment d'artillerie, 21 ans

      ... that I can import properly with my genealogical program.
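
The garbling a non-UTF-8 program would see can be reproduced by encoding the note as UTF-8 and then reading those bytes as ISO-8859-1. This is only an illustration of the byte-level effect; the function names are hypothetical and nothing here is taken from TMG or PhpGedView.

```python
# The note as it should read once decoded correctly.
note = "2 NOTE soldat au 8ème régiment d'artillerie, 21 ans"


def read_utf8_as_latin1(text: str) -> str:
    """Simulate a program that assumes ISO-8859-1 when the bytes are UTF-8."""
    return text.encode("utf-8").decode("iso-8859-1")


def undo_misreading(garbled: str) -> str:
    """Reverse the misreading, which works because no bytes were lost."""
    return garbled.encode("iso-8859-1").decode("utf-8")


garbled = read_utf8_as_latin1(note)
# garbled == "2 NOTE soldat au 8Ã¨me rÃ©giment d'artillerie, 21 ans"
restored = undo_misreading(garbled)
```

Each accented letter becomes two characters because UTF-8 stores it as two bytes and ISO-8859-1 treats every byte as a character of its own.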

      --
      Robert

       
    • John Finlay

      John Finlay - 2003-11-22

      Hi Robert,

      It is pretty easy to convert between UTF-8 and ISO-8859-1 using Windows Notepad, MS Word, or almost any other text editor or word processor.  You just open the GEDCOM in one of those programs and then save it in the encoding that you need.

      Along with adding the option of converting an ISO-8859-1 file to UTF-8 during upload, I should also add the option of converting it back to ISO-8859-1 on download.  Then your problem would be solved.
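
A sketch of what a download-time conversion back to ISO-8859-1 could look like, under stated assumptions: the name `utf8_gedcom_to_latin1` is hypothetical, the header is rewritten to `1 CHAR ANSI`, and characters that ISO-8859-1 cannot represent are replaced with `?`. How PhpGedView itself handles those cases is not stated in this thread.

```python
import re

# Matches the header line that declares the character set.
CHAR_TAG = re.compile(r"^1 CHAR[ \t]+\S+", re.MULTILINE)


def utf8_gedcom_to_latin1(raw: bytes) -> bytes:
    """Convert a UTF-8 GEDCOM to ISO-8859-1 for programs without UTF-8 support.

    Hypothetical helper: the conversion is lossy for any character outside
    ISO-8859-1 (for example Chinese or Hebrew), which becomes "?".
    """
    # "utf-8-sig" also drops a leading byte order mark if one is present.
    text = raw.decode("utf-8-sig")
    # Keep the header consistent with the bytes that will be written.
    text = CHAR_TAG.sub("1 CHAR ANSI", text, count=1)
    return text.encode("iso-8859-1", errors="replace")
```

The replacement with `?` is a deliberate choice for the sketch: it keeps the file importable, at the cost of silently losing characters that only UTF-8 could hold.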

      --John

       
    • Robert Weinland

      Robert Weinland - 2003-11-23

      Hi John,

      It seems that Unicode is managed only by Windows 2000/XP with MSOffice 2000/XP.

      My configuration runs Win98 with MSOffice97. So I think the best thing I can do is wait for my Christmas gift (a new configuration) or for your Christmas gift to us (an ISO-8859-1 <-> UTF-8 conversion tool)! ;-)

      --
      Robert

       
    • Robert Weinland

      Robert Weinland - 2003-11-25

      Hi John,

      I will be able to use UTF-8 encoding after all, as I have got hold of a text editor with Unicode <-> ANSI conversion capability.

      --
      Robert

       
    • John Finlay

      John Finlay - 2003-11-25

      In version 2.61b2, which I released yesterday, you have the option of converting the GEDCOM back to ANSI when you download it.

      --John

       
