Logged In: YES
user_id=1466942
Originator: NO
This is caused by the inability of PHP to recognise UTF-8 characters in regular expressions (even when the "u" modifier is specified).
The code that replaces local text with GEDCOM keywords uses the "\b" (word-boundary) assertion. This prevents, for example, the German "Juli" from matching "@#DJULIAN@".
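To make the word-boundary behaviour concrete, here is a minimal sketch. The function name and the one-entry keyword table are hypothetical and are not taken from the project's code; the example only illustrates how "\b" keeps "Juli" from matching inside "@#DJULIAN@".

```php
<?php
// Hypothetical helper for illustration; not the project's actual function.
function replace_whole_words($text, $keywords) {
    foreach ($keywords as $local => $gedcom) {
        // \b requires a word boundary on both sides; "i" ignores case.
        $pattern = '/\b' . preg_quote($local, '/') . '\b/i';
        $text = preg_replace($pattern, $gedcom, $text);
    }
    return $text;
}

$keywords = array('Juli' => 'JUL');   // assumed sample entry
$date     = '@#DJULIAN@ 4 Juli 1850';

// With \b only the stand-alone word is replaced.
echo replace_whole_words($date, $keywords), "\n";   // @#DJULIAN@ 4 JUL 1850

// A plain substring replacement also damages the calendar escape.
echo str_ireplace('Juli', 'JUL', $date), "\n";      // @#DJULAN@ 4 JUL 1850
```

The same protection is lost whenever "\b" does not see the surrounding letters as word characters, which is the situation described above for Hebrew text.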
The solution will be to implement a Hebrew-specific version of this particular function. The framework already exists, so we'll simply need to add a function to includes/extras/functions.he.php.
Before I can do this, I need you to answer a question for me. Are there any cases where one Hebrew date word might appear as a substring of another (like Juli above)?
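One possible shape for such a Hebrew-specific function is sketched below. The function name and the keyword table are hypothetical. It also assumes a PCRE build with Unicode property support (\p{L}, \p{N}), which this thread does not confirm for the PHP versions the project targets. Instead of "\b", it uses lookarounds that reject any Unicode letter or digit next to the keyword.

```php
<?php
// Hypothetical helper; the real function name is not given in this thread.
function replace_hebrew_keywords($text, $keywords) {
    foreach ($keywords as $local => $gedcom) {
        // No Unicode letter or digit may touch the keyword on either side.
        // Assumption: PCRE was built with Unicode property support.
        $pattern = '/(?<![\p{L}\p{N}])' . preg_quote($local, '/')
                 . '(?![\p{L}\p{N}])/u';
        $text = preg_replace($pattern, $gedcom, $text);
    }
    return $text;
}

// Assumed sample entries, using the strings quoted in this thread.
$keywords = array(
    'יולי'    => 'JUL',
    'יוליאני' => '@#DJULIAN@',
);

echo replace_hebrew_keywords('יוליאני 5 יולי 1850', $keywords), "\n";
// @#DJULIAN@ 5 JUL 1850
```

This sketch does not deal with prefixes such as "ו-" or with years written as letters, so it is only a starting point.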
Logged In: YES
user_id=959928
Originator: YES
We have "adr", "ads" and "adr_leap_year"
and
$pgv_lang["jul"] = "יולי";
$pgv_lang["cal_julian"] = "יוליאני";
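The "jul" / "cal_julian" pair is a substring case like the German "Juli". As an illustration only (the mapping below is an assumed sample, not the project's table), PHP's strtr() with an array tries the longest keys first and never rescans text it has already replaced, while a plain str_replace() on the shorter word breaks the longer one:

```php
<?php
// Assumed sample mapping built from the two strings above.
$map = array(
    'יולי'    => 'JUL',
    'יוליאני' => '@#DJULIAN@',
);

// strtr() prefers the longest matching key at each position.
echo strtr('יוליאני', $map), "\n";                   // @#DJULIAN@

// Replacing the short word first leaves a damaged remainder.
echo str_replace('יולי', 'JUL', 'יוליאני'), "\n";    // JUL followed by אני
```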
We have Hebrew days and years as Hebrew letters that represent the numbers.
And we have
$pgv_lang["and"] = "ו-";
which is 6-
$pgv_lang["from"] = "מ-";
which is 40-
$pgv_lang["to"] = "עד";
which is 74
I do not think they will be confused as days or years.
תש or ת"ש is 5700 and is part of
$pgv_lang["tsh"] = "תשרי";
שו or ש"ו is 5306 and is part of
$pgv_lang["csh"] = "חשוון";
תמ or ת"מ is 5440 and is part of
$pgv_lang["tmz"] = "תמוז";
תמו or תמ"ו is 5446 and is part of
$pgv_lang["tmz"] = "תמוז";
ת is 5400 and part of tsh, tvt and tmz
In addition, the unit letters (the end of a number) for 1, 2, 4, 6, 7, 8, 9,
the letters for the tens 10, 16, 20, 30, 36, 40, 46, 50, 60, 70, 74, 80,
and the letters for the hundreds (with the thousands not written explicitly)
200 210 300 302 306 350 356 400 440 446 5200 5210 5300 5302 5350 5356 5400
are part of the month names and the ABT etc. texts...
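The values quoted above follow the standard Hebrew letter values. A minimal sketch of the arithmetic, with a hypothetical helper name, shows both how the numbers are obtained and why the year letters collide with the month names. Treating the final letter forms like their base forms and adding 5000 for the unwritten thousands are assumptions based on the examples in this comment.

```php
<?php
// Hypothetical helper, not part of the project: sums Hebrew letter values.
function hebrew_letters_to_number($text) {
    static $values = array(
        'א' => 1,   'ב' => 2,   'ג' => 3,   'ד' => 4,   'ה' => 5,
        'ו' => 6,   'ז' => 7,   'ח' => 8,   'ט' => 9,
        'י' => 10,  'כ' => 20,  'ל' => 30,  'מ' => 40,  'נ' => 50,
        'ס' => 60,  'ע' => 70,  'פ' => 80,  'צ' => 90,
        'ק' => 100, 'ר' => 200, 'ש' => 300, 'ת' => 400,
        // Assumption: final forms count the same as their base letters.
        'ך' => 20,  'ם' => 40,  'ן' => 50,  'ף' => 80,  'ץ' => 90,
    );
    $sum = 0;
    // Split the UTF-8 string into single characters.
    foreach (preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        if (isset($values[$char])) {
            $sum += $values[$char];   // quotes and hyphens are skipped
        }
    }
    return $sum;
}

echo hebrew_letters_to_number('ת"ש'), "\n";         // 700
echo 5000 + hebrew_letters_to_number('תש'), "\n";   // 5700, thousands assumed
echo hebrew_letters_to_number('עד'), "\n";          // 74

// The year letters are also the first letters of a month name.
var_dump(strpos('תשרי', 'תש') === 0);               // bool(true)
```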
Logged In: YES
user_id=1466942
Originator: NO
This is more complicated than I had expected.
I think it is probably best to leave this task to a native Hebrew speaker.