Hi,
We are having a problem displaying the "first name" in the "Most recent additions" display panel on the first page. For example, "Cassese, Antonio" is displayed as "Cassese, Ant. onio", and "Huyse, Luc" is displayed as "Huyse, L. u. c.". I am not sure which file/directory processes this display output. Any suggestions would be gratefully received.
Warm regards,
Rayhan
How did you import those references? I suspect that this is less a problem with refbase itself than with corrupted data in your metadata.
It would be helpful to check the metadata files you used for the import into refbase and see whether the same errors (names) occur there as well.
Since I guess this happened with a lot of names, I don't think it would make sense to use SQL commands to "fix" those references, but a reimport of all those references should work fine…
@ florianneuro,
Thank you for your reply. For clarification: the entries in question were not imported; they were entered manually. So it is unlikely to be a problem with the metadata. We have done some checks, and this is what we have found, which hopefully narrows down the problem:
1. The first names break only in entries that are either "Whole books" or "book chapters". We have checked and can confirm that the problem is not with any individual citation-style file, as the same happens with all of the styles, including MLA, APA, Harvard 1 etc. We have even tried replacing the citation-related core PHP files with the ones from the refbase repository. This did not change anything. So we can rule out that it is due to a corrupt file.
2. Currently, we are running the refbase DB on a Virtual Dedicated server at MediaTemple. In fact, this whole problem started when we moved the site to this dedicated server. As a test, we transferred all refbase files and the database back to a localhost and to another shared host. In both cases, the problem disappeared. So it is possible that the problem lies with the server settings, which conflict with some refbase code.
3. Based on no. 2 above, we can also rule out a problem with the DB, as the DB delivers perfect output in the localhost and shared server environments.
Hope this helps. Please let me know what you think.
With warm regards,
RR
a)
I found another ongoing thread which may have some bearing on this problem:
https://sourceforge.net/projects/refbase/forums/forum/218758/topic/3371081/index/page/1
b)
SMALL CORRECTION TO MY PREVIOUS COMMENT:
It appears that the problem is not limited to BOOK entries. In some cases the journal entries are also affected.
c)
Of the affected entries, I found a pattern. The strange periods appear only in first names, usually after the letters "U" and "T", and in some cases after "L". For example: 'Naziru. ddin' (it should be Naziruddin), 'Su. zau. ddin' (it should be Suzauddin), 'Qu. t. u. bu. ddin' (it should be Qutubuddin), 'L. awrence' (it should be Lawrence). Let me re-confirm: none of the entries were imported from Endnote or any other format; each was manually entered.
d)
This is our reference website: http://www.wcsf.info/library/
Thanks.
rr
This only happens in cite view for some styles. AMA, Harvard 3, Vancouver, and Mar Biol are fine. As a temporary work-around, you can set one of these "safe" styles to your default style.
I have still not been able to replicate this on a server that I have control over. Please see http://www.refbase.net/index.php/Installation-Troubleshooting
In particular: give type and version info for: OS, web server, database server, and refbase. Please also check character set information, as per http://www.refbase.net/index.php/Troubleshooting
This is merely because reArrangeAuthorContents has a null delimiter between multiple initials for these styles (variable 9). Another work-around would be to make this a null character for other styles (but the other styles would then render slightly incorrectly).
So, there seems to be a problem splitting on parts of the name. This smells of a character encoding problem…
Thank you very much Karnesky. I will try the workaround.
Here are the details you asked for:
Refbase version: 0.9.5
MySql: 5.0.77
MySql Charset: UTF8
DB Collation: utf8_general_ci
Server: Media Temple (Virtual Dedicated)
Thanks.
apache/php versions?
grep contentTypeCharset initialize/ini.inc.php
in mysql:
SHOW CREATE DATABASE literature;
Update:
Harvard 3: not safe either. For instance, "l" in first names gets omitted, e.g., 'Abdullah' displays as 'Abduah', 'Wardatul' as 'Wardatu'.
AMA: not safe either. It omits the Ls in first names and also disrupts the first-last name sequencing.
Would prefer to have at least one Citation style working properly, preferably MLA.
Well, another work-around would be to not use rearrangeAuthorContents. Still waiting for additional information to see if I can replicate the problem you're having, though…
Further server info, as requested:
1: Apache v2.2.3
2: PHP v5.2.6
3: MySql v5.1.26
4. contentTypeCharset initialize/ini.inc.php = UTF-8
5. Not sure what you meant by "in mysql: SHOW CREATE DATABASE literature". Please could you explain?
UPDATE:
Edited value #9 in the "cite_MLA.php" file regarding period delimiters. The citation-output problem in MLA regarding the first names appears to be solved now. Thank you very much for the clue.
I still have a feeling that this is a bug in the citation system and that it may have something to do with the character encoding, as you predicted. Hope to see a permanent solution in later versions of the software.
With kind regards,
Rayhan
You can have a UTF-8 mysql with a non-UTF-8 database or non-UTF-8 tables.
Possibly, but it works fine on other UTF-8 systems that have data that does not work on your system. Character encoding issues are very tricky, and since this has only been seen on two instances, I do wonder if it is not an issue outside of refbase's control (such as the encoding of your database and/or tables).
Hi Rayhan,
I don't think that this is a bug in the citation system. As Rick noted, this is most likely an issue either with your server's UTF-8/Unicode support, or with the UTF-8 setup of your refbase installation.
To ensure that everything has been set up correctly, you may want to check out the links given by Rick, especially this section:
http://www.refbase.net/index.php/Installation-Troubleshooting#Problems_with_special_characters
When debugging encoding/character set problems, the following MySQL commands may be helpful:
SHOW VARIABLES LIKE '%character%';
SHOW VARIABLES LIKE '%collation%';
You'd need to enter these commands in your server's MySQL command line interpreter (either via the shell or via a web-based tool such as phpMyAdmin). Additionally, as Rick noted, please enter the following command (also from within your server's MySQL command line interpreter) and report the output:
SHOW CREATE DATABASE literature;
W.r.t. your modifications in file 'cite_MLA.php': While this avoids the problem, it doesn't fix it. Rick mentioned function 'reArrangeAuthorContents()' (in file 'includes/include.inc.php') which (among other things) abbreviates given names and separates multiple author initials with the delimiter specified by the citation style.
The issue seems to be with the definitions of variables '$upper' and/or '$lower' in file 'includes/transtab_unicode_charset.inc.php'. In this file (which is used for UTF-8 based refbase databases), variables are defined in a Unicode-aware manner. E.g., in case of '$upper', the variable defines the Unicode-aware equivalent of "[:upper:]" as follows:
$upper = "\p{Lu}\p{Lt}";
I.e., this file uses Unicode character properties instead of the normal POSIX character classes. E.g., '\p{Lu}' matches Unicode upper case letters while "\p{Lt}" matches Unicode title case letters. These Unicode character properties are available since PHP 4.4.0 and PHP 5.1.0. More info:
http://www.php.net/manual/en/regexp.reference.unicode.php
As a comment on that page indicates, these properties seem to be available only if PCRE has been compiled with the "--enable-unicode-properties" option. Maybe this is the issue? Could you verify this using the output of the phpinfo() function?
In any case, it seems as if your server treats these Unicode character properties as literal characters (L, U, T), and thus the 'preg_replace()' function calls in function 'reArrangeAuthorContents()' match & behave incorrectly.
Matthias
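The symptom reported earlier in this thread can be reproduced in isolation. The following is only a minimal sketch, not refbase's actual code. It assumes that a PCRE build without Unicode property support reads "\p{Lu}\p{Lt}" inside a character class as the literal characters p, {, L, u, }, t, and it uses a hypothetical helper, markInitials(), that appends a period-and-space delimiter after every matched character, roughly as an initials-abbreviating step might do.

```php
<?php
// Hypothetical helper (NOT part of refbase): append ". " after every
// character that the given character class considers an "upper case" letter.
function markInitials($name, $upperClass)
{
    return preg_replace('/([' . $upperClass . '])/', '$1. ', $name);
}

// Assumption: this is what "\p{Lu}\p{Lt}" degrades to when the backslash
// escapes are not understood and each character is taken literally.
$brokenUpper = 'p{Lu}p{Lt}';

echo markInitials('Antonio', $brokenUpper), "\n";    // Ant. onio
echo markInitials('Luc', $brokenUpper), "\n";        // L. u. c
echo markInitials('Qutubuddin', $brokenUpper), "\n"; // Qu. t. u. bu. ddin
echo markInitials('Lawrence', $brokenUpper), "\n";   // L. awrence
```

Under that assumption, the output matches the broken names quoted above, which is consistent with the periods appearing only after "L", "u" and "t".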
Thank you very much Matthias for your detailed comment. This looks very helpful.
I can confirm that:
a) contentTypeCharset initialize/ini.inc.php = UTF-8
b) there are 16 tables in my refbase db and their collations are showing as = utf8_general_ci
However, I have followed your instructions and run the SQL commands, including "SHOW CREATE DATABASE literature", and it seems there are some discrepancies. Not sure what to make of them. They are as follows:
SHOW VARIABLES LIKE '%character%';
SHOW VARIABLES LIKE '%collation%';
Kind regards,
Rayhan
I ran "SHOW CREATE DATABASE literature;" from the MySQL command line interpreter and got this error message:
#1049 - Unknown database 'literature'
oops, sorry, my mistake. this is the Report from "show create database literature" command:
Host: localhost:3306
Database: dbname
Generation Time: Jun 10, 2010 at 03:37 PM
Generated by: phpMyAdmin 2.8.2.4 / MySQL 5.0.77
SQL query: SHOW CREATE DATABASE dbname;;
Rows: 1
Database Create Database
dbname CREATE DATABASE `dbname` /*!40100 DEFAULT CHARACTER SET utf8 */
Hi Rayhan,
thanks for the output from the SQL commands. The fact that the server's character set/collation is set to latin1 shouldn't pose a problem, since the lower levels (database & connection) are set to utf8. Your "SHOW CREATE DATABASE …" statement reports that your refbase MySQL database is utf8-based, and you've set variable '$contentTypeCharset' in file 'initialize/ini.inc.php' to "UTF-8". So this all looks good, I think.
As outlined previously, it could be that the PHP PCRE extension on your server has been compiled without support for Unicode character properties. If you're able to compile stuff on your server, please try to compile the extension with Unicode support (--enable-unicode-properties); otherwise, please ask your provider about it.
: See e.g. the end of this thread: http://www.contao.org/forum/topic/8670.html#new-name
HTH, Matthias
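Besides reading the phpinfo() output, a quick way to test this on the affected server might be a small script like the one below. It is a sketch only: the function name pcreHasUnicodeProperties() is made up for this example, and the check assumes that a build lacking Unicode property support either fails to compile the pattern or treats the escape as literal characters.

```php
<?php
// Hypothetical check (not part of refbase): returns true when the PCRE
// library used by PHP understands Unicode properties such as \p{Lu}.
function pcreHasUnicodeProperties()
{
    // "É" (UTF-8 bytes C3 89) is an upper case letter outside ASCII, so it
    // should only match \p{Lu} if Unicode properties are really supported.
    $matchesUpper = @preg_match('/^\p{Lu}$/u', "\xC3\x89") === 1;

    // A lower case "u" must NOT match; if it does, the escape is being
    // read as a set of literal characters.
    $rejectsLower = @preg_match('/^[\p{Lu}\p{Lt}]$/u', 'u') === 0;

    return $matchesUpper && $rejectsLower;
}

echo pcreHasUnicodeProperties()
    ? "PCRE Unicode properties: supported\n"
    : "PCRE Unicode properties: NOT supported\n";
```

If the script reports "NOT supported", that would point to the PCRE build rather than to the database or refbase settings.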