Problem with accents in French words

Help
2005-06-23
2013-05-28
  • Nobody/Anonymous

    Hi,
    I installed refbase 0.8 yesterday, imported our PhD data from Endnote, and got it working, except that accented characters were displayed incorrectly.
    I therefore recreated the database with utf-8 as the character set. That fixed the display problem, but search does not find accented characters.
    Can anyone tell me how to get around this problem?
    Thanks!

    For information I have:
    Apache 2.0
    MySQL 4.1.10

     
    • Matthias Steffens

      As long as all of your higher ascii chars belong to the ISO latin 1 character set (which I assume to be the case with French characters), it should definitely be fine to use "ISO-8859-1" (aka "latin1") for your database. I.e., there should be no need for "utf8" in your case.

      When importing from Endnote to a "latin1" database, please make sure that the file's character encoding is set to "Western (ISO Latin 1)" aka "ISO-8859-1", otherwise higher ascii chars won't be displayed correctly.

      More generally speaking, the character encoding of any import file must match the encoding used for the MySQL database which, in turn, must match the encoding specified in variable '$contentTypeCharset' (in file 'ini.inc.php').

      I'm using a 'latin1' database with German umlauts and French accents without any problems, so it is definitely possible. Searching for higher ascii chars works as well.

      Example output:

      http://polaris.ipoe.uni-kiel.de/refs/show.php?serial=%5E\(14167%7C23624%7C11628)$&submit=Cite

      Regards, Matthias
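
The rule above (import file, database, and '$contentTypeCharset' must all agree) can be illustrated with a minimal Python sketch. It only shows what happens at the byte level when text is decoded with the wrong character set; it is not refbase or MySQL code, and the sample word is made up for illustration.

```python
# Sketch: what a reader sees when bytes are decoded with the wrong charset.
word = "élève"  # illustrative sample, not taken from the thread

utf8_bytes = word.encode("utf-8")      # how a UTF-8 export file stores the word
latin1_bytes = word.encode("latin-1")  # how an ISO-8859-1 file stores it

# Matching encodings round-trip cleanly.
assert utf8_bytes.decode("utf-8") == word
assert latin1_bytes.decode("latin-1") == word

# UTF-8 bytes read as latin1: each accented letter turns into two characters.
garbled = utf8_bytes.decode("latin-1")

# latin1 bytes read as UTF-8: the lone high bytes are invalid sequences,
# so they are replaced by the U+FFFD replacement character.
replaced = latin1_bytes.decode("utf-8", errors="replace")

print(repr(garbled))
print(repr(replaced))
```

Either kind of garbling on a page is a hint that one link in the chain (file, database, or page charset) uses a different encoding from the others.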

       
    • Matthias Steffens

      What happens when you enter accented characters via the web GUI? Will they display correctly? And can you search for these manually entered characters?

      Regards, Matthias

       
    • Nobody/Anonymous

      Hi Matthias, and thanks for the fast reply!
      In my current configuration (utf-8 database), accents show correctly in the web GUI, but a search on them yields zero results. Unfortunately, what I didn't try was a search with the original default database, where the characters didn't show up properly.
      I'll try again with a latin1 database and import file to see if that helps.
      Thanks again, Olivier

       
    • Nobody/Anonymous

      Hmmm, I've switched back to latin1 for the database (and in 'ini.inc.php' as well) and to latin1_swedish for the database collation; once again everything shows up properly on the screen, but I still can't search with accents.
      Strange!

      Olivier

       
    • Matthias Steffens

      Hi Olivier,

      if I'm understanding you correctly, compared to your initial posting you seem to have succeeded regarding your display problems.

      Then, it's strange that searching for accented characters doesn't work as well.

      What browser and platform are you using? Have you tried other browsers as well?

      And again: What happens when you enter accented characters via the web GUI? Will they display correctly? Can you search for these manually entered characters?

      Thanks and good luck, Matthias

       
    • Nobody/Anonymous

      Hi Matthias,
      to recapitulate: I'm running Linux (Debian/Ubuntu).
      With both latin1 and utf-8 I can now get the references to display properly (accented characters), but search works with neither. When I type an accented character in the search field it shows up properly, but I'm told that nothing was found, even if I search for a single accented character.
      Hmm, let me try another browser (currently using Firefox):
      same thing with Epiphany (GNOME browser) and Galeon :-(
      I'll keep looking!

       
      • Matthias Steffens

        Olivier,

        can I access your database via the internet? Could you pass me the URL so that I could try a search with my setup? If you don't want to disclose the URL in the forums, send an email to msteffens at users.sourceforge.net.

        Can you successfully perform a search for accented characters at:

        http://polaris.ipoe.uni-kiel.de/refs/

        E.g., you should get 180 records returned when searching for an '' using the "Quick Search" form on the main page with 'author' selected.

        Thanks, Matthias

         
    • Nobody/Anonymous

      gee, thanks for taking the time to help!
      I get the 180 records on your database with my browser, so that works.
      My database is currently running at http://156.18.39.28/refbase-0.8.0/

      it should be accessible I think; I'm not exactly sure what our firewall status is!

      Olivier

       
    • Matthias Steffens

      Unfortunately, I can't access your server from here.

      Have you tried entering some higher ascii chars manually via the web GUI? Do these characters display correctly as well? Can you search for them?

      If you view the page source, does it say

      <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">

      in the HTML <head>?

      Thanks, Matthias

       
    • Nobody/Anonymous

      Yeah, I feared as much for the exterior access, our university has tight rules on web servers...
      I have exactly what you wrote in my page source.
      What do you mean by manually entering the higher ascii chars?

       
    • Matthias Steffens

      > What do you mean by manually entering the higher ascii chars?

      If I understand you correctly, your data stem from an Endnote import, i.e. they were not entered via the browser interface but imported directly into the MySQL database.

      As opposed to this direct import, have you tried entering accented characters via your browser, i.e. by clicking on "edit record", writing some accented characters into, say, the author field, and clicking on "Edit Record"?

      Please verify if these characters display correctly and if you can search for them.

      Matthias

       
    • Nobody/Anonymous

      OK, I understand, and indeed you're quite right: accents imported from endnote show correctly whereas those entered by hand in a new record don't work...

       
    • Nobody/Anonymous

      I'll have to clarify the "don't work": they don't show up properly, but I just noticed that they do work for the search!

       
    • Matthias Steffens

      Yes, that's what I expected. I had a similar problem months ago, but unfortunately, I don't remember what did the trick to fix the problem.

      How did you revert to the "latin1" database? Did you use 'install.php'? If not, it would make sense to start with a completely fresh installation (in another directory) and use another database name in 'db.inc.php' (to avoid collisions).

      Then, choose "latin1" on installation. Before you import any data, make sure that variable '$contentTypeCharset' (in file 'ini.inc.php') is set to "ISO-8859-1" and that your page source shows

      <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">

      Then, verify that the example records that were provided with refbase display accented characters correctly and that you can successfully search for them.

      Now, edit one record and enter some accented characters. Again, make sure that these characters display correctly and that you can search for them.

      If this all works, then you have a working setup.

      Before importing your Endnote data, make sure that accented characters display correctly in a text editor with "latin1" as character encoding. Then import your data.

      Matthias
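
The last step above (checking the export file's encoding before importing) can also be done in a script. The following is a rough Python sketch under stated assumptions: the helper names `is_valid_utf8` and `to_latin1` are hypothetical, and the check is a heuristic, since any byte sequence decodes as latin1 and only UTF-8 validity can actually be tested.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_latin1(data: bytes) -> bytes:
    """Return the export as ISO-8859-1 bytes, converting from UTF-8 if needed.

    Raises UnicodeEncodeError if a character has no latin1 equivalent.
    """
    if is_valid_utf8(data):
        return data.decode("utf-8").encode("latin-1")
    # Not valid UTF-8: assumed to be latin1 already (this cannot be verified).
    return data


# Illustrative sample data standing in for the contents of an export file.
utf8_export = "Müller, Hélène".encode("utf-8")
latin1_export = "Müller, Hélène".encode("latin-1")

print(is_valid_utf8(utf8_export), is_valid_utf8(latin1_export))
print(to_latin1(utf8_export) == latin1_export)
```

For a real file, the bytes would come from reading it in binary mode, e.g. `open(path, "rb").read()`, where `path` is whatever the export is called locally. Pure ASCII files pass the UTF-8 check and are returned unchanged by the conversion.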

       
    • Matthias Steffens

      Regarding your existing refbase MySQL database (I assume it's named 'literature'; adapt the statements below if it isn't):

      Can you execute the following two statements using your MySQL command line interpreter and report the output:

      show create database literature;

      show create table literature.refs;

      Thanks, Matthias

       
    • Nobody/Anonymous

      Arghh! I finally worked out what was wrong:
      by default, mysqld was setting the server character set to utf8 and the connection character set to latin1...
      Fixing that by hand fixed everything! For the sake of thoroughness I tried it with both utf8 and latin1: both work :-)
      Thanks again,
      Olivier
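
Why a charset mismatch between server and connection can make text display correctly and still break search is easiest to see at the byte level. The Python sketch below is a deliberately simplified model, not a description of MySQL's actual conversion rules: the `search` function, the sample rows, and the byte-for-byte comparison are all assumptions made for illustration. On a real server, the settings involved can be listed with the MySQL statement `SHOW VARIABLES LIKE 'character_set%';`.

```python
def search(stored_rows: list[bytes], term: str, connection_charset: str) -> list[bytes]:
    """Return the rows that contain the search term, compared byte for byte."""
    needle = term.encode(connection_charset)
    return [row for row in stored_rows if needle in row]


# Hypothetical table content: author names stored as latin1 bytes.
rows = [name.encode("latin-1") for name in ("Hélène Dupont", "John Smith")]

mismatched = search(rows, "é", "utf-8")   # term arrives as b"\xc3\xa9"
matched = search(rows, "é", "latin-1")    # term arrives as b"\xe9"
ascii_only = search(rows, "Smith", "utf-8")  # ASCII is identical in both charsets

print(len(mismatched), len(matched), len(ascii_only))
```

In this model, plain ASCII terms match regardless of the mismatch, which fits the symptom reported in the thread: only searches containing accented characters came back empty.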

       
    • Matthias Steffens

      Glad you could solve the problem! :-)

      Best regards, Matthias

       
