First Names are not shown properly

Help
2010-04-04
2013-05-28
  • rayhan rashid

    rayhan rashid - 2010-04-04

    Hi,
    We are having a problem with how the "first name" is displayed in the "Most recent additions" panel on the first page. For example, "Cassese, Antonio" is displayed as "Cassese, Ant. onio", and "Huyse, Luc" is displayed as "Huyse, L. u. c.". I am not sure which file or directory produces this output. Any suggestions would be gratefully received.
    Warm regards,
    Rayhan

     
  • Florian

    Florian - 2010-04-06

    How did you import those references? I suspect that this is less a problem with refbase itself than with corrupted data in your metadata.
    It would help to check the metadata files that you used for the import into refbase and see whether the same errors (in the names) occur there as well.

    Since I guess that this happened with a lot of names, I don't think it makes sense to use SQL commands to "fix" those references; a reimport of all of them should work fine…

     
  • rayhan rashid

    rayhan rashid - 2010-06-09

    @ florianneuro,

    Thank you for your reply. For clarification: the entries in question were not imported; they were entered manually. So a metadata problem is unlikely. We have done some checks, and this is what we found, which hopefully narrows down the problem:

    1. The first names break only in entries that are either "Whole books" or "book chapters". We have checked and can confirm that the problem is not with any individual citation-style file, as the same happens with all of the styles, including MLA, APA, Harvard 1 etc. We even tried replacing the citation-related core PHP files with the ones from the refbase repository. That did not change anything, so we can rule out a corrupt file.

    2. Currently, we are running the refbase database on a Virtual Dedicated server at MediaTemple. In fact, the whole problem started when we moved the site to this dedicated server. As a test, we transferred all refbase files and the database back to a localhost and to another shared host. In both cases, the problem disappeared. So it is possible that the problem lies with the server settings, which conflict with some refbase code.

    3. Based on point 2 above, we can also rule out a problem with the database, as it delivers perfect output in the localhost and shared-host environments.

    Hope this helps. Please let me know what you think.

    With warm regards,

    RR

     
  • rayhan rashid

    rayhan rashid - 2010-06-09

    a)
    I found another ongoing thread which may have some bearing on this problem:
    https://sourceforge.net/projects/refbase/forums/forum/218758/topic/3371081/index/page/1

    b)
    SMALL CORRECTION TO MY PREVIOUS COMMENT:

    The first names break only in the cases of entries that are either "Whole books" or "book chapters".

    It appears that the problem is not limited to BOOK entries. In some cases the journal entries are also affected.

    c)
    I found a pattern among the affected entries. The strange periods appear only in first names, usually after the letters "U" and "T", and in some cases after "L". For example: 'Naziru. ddin' (it should be Naziruddin), 'Su. zau. ddin' (it should be Suzauddin), 'Qu. t. u. bu. ddin' (it should be Qutubuddin), 'L. awrence' (it should be Lawrence). Let me re-confirm: none of the entries were imported from Endnote or any other format; each was entered manually.

    d)
    This is our reference website: http://www.wcsf.info/library/

    Thanks.
    rr

     
  • Richard Karnesky

    This only happens in cite view for some styles.  AMA, Harvard 3, Vancouver, and Mar Biol are fine.  As a temporary work-around, you can set one of these "safe" styles to your default style.

    I have still not been able to replicate this on a server that I have control over.  Please see http://www.refbase.net/index.php/Installation-Troubleshooting

    In particular: give type and version info for: OS, web server, database server, and refbase.  Please also check character set information, as per http://www.refbase.net/index.php/Troubleshooting

     
  • Richard Karnesky

    AMA, Harvard 3, Vancouver, and Mar Biol are fine.

    This is merely because reArrangeAuthorContents has a null delimiter between multiple initials for these styles (variable 9).  Another work-around would be to make this a null character for other styles (but the other styles would then render slightly incorrectly).

    So, there seems to be a problem splitting on parts of the name.  This smells of a character encoding problem…

     
  • rayhan rashid

    rayhan rashid - 2010-06-09

    Thank you very much Karnesky. I will try the workaround.

    Here are the details you asked for:

    Refbase version: 0.9.5
    MySql: 5.0.77
    MySql Charset: UTF8
    DB Collation: utf8_general_ci
    Server: Media Temple (Virtual Dedicated)

    Thanks.

     
  • Richard Karnesky

    apache/php versions?
    grep contentTypeCharset initialize/ini.inc.php

    in mysql:
    SHOW CREATE DATABASE literature;

     
  • rayhan rashid

    rayhan rashid - 2010-06-09

    Update:

    Harvard 3: not safe either. For instance, the letter "l" in first names gets omitted, e.g. 'Abdullah' displays as 'Abduah' and 'Wardatul' as 'Wardatu'.

    AMA: not safe either. It omits the Ls in first names and also disrupts the first-last name sequencing.

    Would prefer to have at least one Citation style working properly, preferably MLA.

     
  • Richard Karnesky

    Would prefer to have at least one Citation style working properly, preferably MLA.

    Well, another work-around would be to not use reArrangeAuthorContents.  Still waiting for additional information to see if I can replicate the problem you're having, though…

     
  • rayhan rashid

    rayhan rashid - 2010-06-10

    Further server info, as requested:

    1: Apache v2.2.3
    2: PHP v5.2.6
    3: MySql v5.1.26
    4. contentTypeCharset initialize/ini.inc.php = UTF-8
    5. Not sure what you meant by "in mysql: SHOW CREATE DATABASE literature". Please could you explain?

    UPDATE:
    Edited value #9 (the period delimiter) in the "cite_MLA.php" file. The citation-output problem with first names in MLA appears to be solved now. Thank you very much for the clue.

    I still have a feeling that this is a bug in the citation system and that it may have something to do with the character encoding, as you predicted. Hope to see a permanent solution in later versions of the software.

    With kind regards,
    Rayhan

     
  • Richard Karnesky

    5. Not sure what you meant by "in mysql: SHOW CREATE DATABASE literature". Please could you explain?

    You can have a UTF-8 mysql with a non-UTF-8 database or non-UTF-8 tables. 

    I still have a feeling that this is bug in the citation system

    Possibly, but it works fine on other UTF-8 systems that have data that does not work on your system.  Character encoding issues are very tricky & since this has only been seen on two instances, I do wonder if it is not an issue outside of refbase's control (such as the encoding of your database and/or tables).

     
  • Matthias Steffens

    Hi Rayhan,

    I don't think that this is a bug in the citation system. As Rick noted, this is most likely an issue either with your server's UTF-8/Unicode support, or with the UTF-8 setup of your refbase installation.

    To ensure that everything has been set up correctly, you may want to check out the links given by Rick, especially this section:

    http://www.refbase.net/index.php/Installation-Troubleshooting#Problems_with_special_characters

    When debugging encoding/character set problems, the following MySQL commands may be helpful:

    SHOW VARIABLES LIKE '%character%';
    SHOW VARIABLES LIKE '%collation%';

    You'd need to enter these commands in your server's MySQL command line interpreter (either via the shell or via a web-based tool such as phpMyAdmin). Additionally, as Rick noted, please enter the following command (also from within your server's MySQL command line interpreter) and report the output:

    SHOW CREATE DATABASE literature;

    W.r.t. your modifications in file 'cite_MLA.php': While this avoids the problem, it doesn't fix it. Rick mentioned function 'reArrangeAuthorContents()' (in file 'includes/include.inc.php') which (among other things) abbreviates given names and separates multiple author initials with the delimiter specified by the citation style.

    The issue seems to be with the definitions of variables '$upper' and/or '$lower' in file 'includes/transtab_unicode_charset.inc.php'. In this file (which is used for UTF-8 based refbase databases), variables are defined in a Unicode-aware manner. E.g., in case of '$upper', the variable defines the Unicode-aware equivalent of "[:upper:]" as follows:

    $upper = "\p{Lu}\p{Lt}";

    I.e., this file uses Unicode character properties instead of the normal POSIX character classes. E.g., '\p{Lu}' matches Unicode upper case letters while "\p{Lt}" matches Unicode title case letters. These Unicode character properties are available since PHP 4.4.0 and PHP 5.1.0. More info:

    http://www.php.net/manual/en/regexp.reference.unicode.php

    As a comment on that page indicates, these properties seem to be available only if PCRE has been compiled with the "--enable-unicode-properties" option. Maybe this is the issue? Could you verify this using the output of the phpinfo() function?

    In any case, it seems as if your server treats these Unicode character properties as literal characters (L, U, T), and thus the 'preg_replace()' function calls in function 'reArrangeAuthorContents()' match & behave incorrectly.
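
    The hypothesis above can be illustrated with a small simulation. This is a minimal Python sketch, not refbase code: the names 'BROKEN_UPPER' and 'insert_delimiter' are hypothetical, and the sketch simply assumes that the characters of "\p{Lu}\p{Lt}" end up as literal members of a character class and that the style's delimiter (". ") is inserted after every match. Under that assumption it reproduces the broken names reported earlier in this thread:

```python
import re

# Hypothetical stand-in (not refbase code): if "\p{Lu}\p{Lt}" is not
# understood as Unicode properties, a character class built from it
# would contain the literal characters p, {, L, u, }, and t.
BROKEN_UPPER = "[pLut{}]"


def insert_delimiter(name, char_class, delimiter=". "):
    """Insert the delimiter after every character matched by char_class."""
    return re.sub("(" + char_class + ")",
                  lambda m: m.group(1) + delimiter,
                  name)


for name in ("Antonio", "Naziruddin", "Suzauddin", "Qutubuddin", "Lawrence"):
    print(name, "->", insert_delimiter(name, BROKEN_UPPER))
```

    The fact that only "L", "u" and "t" are affected in the reported names is consistent with this reading, but the sketch says nothing about which 'preg_replace()' call in refbase is actually involved.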

    Matthias

     
  • rayhan rashid

    rayhan rashid - 2010-06-10

    Thank you very much Matthias for your detailed comment. This looks very helpful.

    I can confirm that:
    a) contentTypeCharset initialize/ini.inc.php = UTF-8
    b) there are 16 tables in my refbase db and their collations are showing as = utf8_general_ci

    However, I have followed your instructions and run the SQL commands, and there seem to be some discrepancies. I am not sure what to make of them. They are as follows:

    SHOW VARIABLES LIKE '%character%';

    Variable_name             Value
    character_set_client      utf8
    character_set_connection  utf8
    character_set_database    utf8
    character_set_filesystem  binary
    character_set_results     utf8
    character_set_server      latin1
    character_set_system      utf8
    character_sets_dir        /usr/share/mysql/charsets/

    SHOW VARIABLES LIKE '%collation%';

    Variable_name         Value
    collation_connection  utf8_general_ci
    collation_database    utf8_general_ci
    collation_server      latin1_swedish_ci

    Kind regards,
    Rayhan

     
  • rayhan rashid

    rayhan rashid - 2010-06-10

    Ran "SHOW CREATE DATABASE literature;" from the MySQL command line interpreter and got this error message:

    #1049 - Unknown database 'literature'

     
  • rayhan rashid

    rayhan rashid - 2010-06-10

    oops, sorry, my mistake. this is the Report from "show create database literature" command:

    SQL result

    Host: localhost:3306
    Database: dbname
    Generation Time: Jun 10, 2010 at 03:37 PM
    Generated by: phpMyAdmin 2.8.2.4 / MySQL 5.0.77
    SQL query: SHOW CREATE DATABASE dbname;;
    Rows: 1
    Database Create Database
    dbname CREATE DATABASE `dbname` /*!40100 DEFAULT CHARACTER SET utf8 */

     
  • Matthias Steffens

    Hi Rayhan,

    thanks for the output from the SQL commands. The fact that the server's character set/collation is set to latin1 shouldn't pose a problem, since the lower levels (database & connection) are set to utf8. Your "SHOW CREATE DATABASE …" statement reports that your refbase MySQL database is utf8-based, and you've set variable '$contentTypeCharset' in file 'initialize/ini.inc.php' to "UTF-8". So this all looks good, I think.
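
    The check described above can be written down as a small sketch. It is plain Python and does not talk to MySQL: the values are copied from the "SHOW VARIABLES" output posted earlier in this thread, and the names 'RELEVANT' and 'non_utf8' are hypothetical. Which variables count as "relevant" is an assumption made for this example (the client, connection, database and results levels), not something refbase defines:

```python
# Values copied from the SHOW VARIABLES output posted earlier in this thread.
reported = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_database": "utf8",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
}

# Assumption for this sketch: only these levels matter for refbase's data,
# because they take precedence over the server-wide default.
RELEVANT = (
    "character_set_client",
    "character_set_connection",
    "character_set_database",
    "character_set_results",
)


def non_utf8(variables, names=RELEVANT):
    """Return the named variables whose value is not a utf8 character set."""
    return {n: variables[n] for n in names if not variables[n].startswith("utf8")}


mismatches = non_utf8(reported)
print(mismatches or "all relevant levels report utf8")
```

    With the posted values, only the server-wide default ('character_set_server') is latin1, and it is not among the levels checked here.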

    As outlined previously, it could be that the PHP PCRE extension on your server has been compiled without support for Unicode character properties. If you're able to compile software on your server, please try to compile the extension with Unicode support (--enable-unicode-properties); otherwise, please ask your provider about it.

    See e.g. the end of this thread: http://www.contao.org/forum/topic/8670.html#new-name

    HTH, Matthias

     
