Malformed strings in message files cause crash
Virtual Research Environment / On-line Bibliography Manager
Just tried one of the Arabic locales and ended up with this:
WIKINDX version : 6.12.1
PHP version : 8.2.0
Script : /Applications/MAMP/htdocs/wikindx6/trunk/src/core/messages/wikindx-ar.php
Line : 761
Code : 0
Message : Unmatched ')'
At that line, there is:
\'C:\\Program Files\\\example\\\\').';
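The four backslashes at the end should be three. With four, PHP reads two escaped backslashes, so the quote that follows closes the string early and the ')' is left unmatched. The same error occurs elsewhere in that file (e.g., line 762) and may well be endemic in many language files.
As the new locale has already been written to the users table, there is no way to recover from this without manually editing the users table.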
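To make the failure concrete, here is a minimal sketch in Python (not WIKINDX code) of how a single-quoted PHP string literal is scanned. It assumes the message is a single-quoted string, as the \' escapes suggest. The function name find_closing_quote is hypothetical.

```python
def find_closing_quote(body):
    # `body` is the text after the opening quote of a single-quoted PHP string.
    # Only two escapes exist there: backslash-backslash and backslash-quote.
    # Any other backslash is an ordinary character.
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body) and body[i + 1] in ("\\", "'"):
            i += 2  # skip the escaped character
        elif ch == "'":
            return i  # unescaped quote: the string ends here
        else:
            i += 1
    return -1  # the string is never closed


# The fragment from line 761 as reported, and with three trailing backslashes.
reported = r"\'C:\\Program Files\\\example\\\\').';"
corrected = r"\'C:\\Program Files\\\example\\\').';"

for label, body in (("reported", reported), ("corrected", corrected)):
    end = find_closing_quote(body)
    print(label, "-> text after the closing quote:", body[end + 1:])
```

For the reported fragment the text left after the closing quote is ).'; which is where the unmatched ')' comes from. For the corrected fragment only the ; remains.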
I'm taking the raw file as it comes out of Transifex. I'm going to see if there's a way in Transifex to protect specific sequences because it's the automatic translation that's doing a poor job.
Hi Mark,
The PHP output of Transifex is buggy for two reasons:
Since I have no way to control the output of Transifex, I changed the input/output format to JSON. In the process, half of the translations are lost because the encoding of the previously misinterpreted source strings has changed.
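One way a JSON catalog loader could avoid locking a user out when a catalog is malformed or incomplete is to fall back to the default strings. The following is only a sketch under assumptions: the function name load_catalog and the flat key/string layout are hypothetical and do not describe how WIKINDX actually loads its catalogs.

```python
import json


def load_catalog(text, fallback):
    """Parse a JSON message catalog, falling back to default strings.

    `text` is the raw JSON of the translated catalog; `fallback` is a dict
    of default (source-language) strings keyed by message id.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Malformed catalog: use the defaults instead of crashing.
        return dict(fallback)
    if not isinstance(data, dict):
        return dict(fallback)
    # Start from the defaults so that missing translations still display,
    # then overlay every entry that is actually a string.
    merged = dict(fallback)
    merged.update({k: v for k, v in data.items() if isinstance(v, str)})
    return merged


defaults = {"save": "Save", "cancel": "Cancel"}
print(load_catalog('{"save": "Enregistrer"}', defaults))
print(load_catalog('{"save": "Enregistrer"', defaults))  # truncated JSON
```

With this shape, a catalog that lost half its translations would still show the default text for the missing entries.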
I'm not immediately deleting the PHP catalogs until you've tested the new version.
Transifex added a JSON suffix to the filename and I can't figure out how to get rid of it without reloading all 352 catalogs to avoid losing the translation memories. :-((
That's enough for today. I haven't dealt with the time zone.
Regards,
Sounds like a significant problem. I've done a quick check with the latest SVN and all seems fine. Arabic (multiple versions, I think) is listed. I chose one or two of these and did not see the issues previously reported.
Mark
Thank you, a first step in the right direction. After spending the night on it, I'm hesitant.
Even in the newly exported JSON files, there were incorrectly encoded characters. I could only correct them manually, searching for the strings one by one for hours. I simply deleted the erroneous translations. Transifex, once again, exports without properly respecting the encoding of the target format. So, now I have serious doubts about the platform's reliability.
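Part of the manual search for badly encoded strings could perhaps be automated. The sketch below is a heuristic, not a guaranteed detector: it flags catalog entries that contain the Unicode replacement character, or that look like UTF-8 text decoded with a single-byte codec. The function names and the flat key/string catalog layout are assumptions made for illustration.

```python
def looks_mis_encoded(text):
    # U+FFFD is what decoders emit for bytes they could not interpret.
    if "\ufffd" in text:
        return True
    # Classic mojibake: UTF-8 bytes that were decoded as a single-byte codec.
    # If re-encoding with that codec yields valid UTF-8 that differs from the
    # input, the string was very likely decoded with the wrong codec.
    for codec in ("cp1252", "latin-1"):
        try:
            repaired = text.encode(codec).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # not representable or not valid UTF-8: no evidence
        if repaired != text:
            return True
    return False


def suspicious_keys(catalog):
    """Return the sorted message ids whose text looks mis-encoded."""
    return sorted(k for k, v in catalog.items() if looks_mis_encoded(v))


catalog = {
    "ok_ascii": "Save",
    "ok_arabic": "مرحبا",
    "ok_accent": "café",
    "bad": "café".encode("utf-8").decode("cp1252"),  # simulated mojibake
    "replaced": "caf\ufffd",
}
print(suspicious_keys(catalog))
```

A check like this can only narrow down candidates; each flagged string still needs a human look before it is corrected or deleted.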
Should we redo all the translations or not? Should we simply remove Arabic because the automatic translator doesn't seem up to the task? Won't this break again later?
Our placeholder system in strings is also flawed, non-standard, and therefore poorly recognized. Variable substitution is poorly implemented. We also can't specify text sequences not to be translated.
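As an illustration of what a stricter placeholder convention would allow, here is a minimal sketch of two checks on a translated string: that it keeps the same placeholders as the source, and that sequences marked as not to be translated survive verbatim. The {name} placeholder syntax and both function names are hypothetical; they are not the current WIKINDX conventions.

```python
import re
from collections import Counter

# Hypothetical placeholder syntax: {identifier}
PLACEHOLDER = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")


def preserves_placeholders(source, translation):
    """True if both strings contain the same placeholders, the same number of times."""
    return Counter(PLACEHOLDER.findall(source)) == Counter(
        PLACEHOLDER.findall(translation)
    )


def preserves_protected(source, translation, protected):
    """True if every protected sequence occurs as often in the translation as in the source."""
    return all(source.count(seq) == translation.count(seq) for seq in protected)


src = "Deleted {count} files from {path}"
print(preserves_placeholders(src, "{count} fichiers supprimés de {path}"))  # order may change
print(preserves_placeholders(src, "{nombre} fichiers supprimés de {path}"))  # renamed placeholder

win_path = "C:\\Program Files\\example"
print(preserves_protected("Install to " + win_path,
                          "Installer dans " + win_path,
                          [win_path]))
```

Checks of this kind could run on every exported catalog, so that a translation which mangles a placeholder or a path is rejected before it reaches users.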
So, since we're losing half the translations, I'm wondering if I shouldn't just fix our system and abandon Transifex.