I'd like to know how to input the vowel "lengthening dash" in
Japanese katakana mode. I have already been scolded once for
using the wrong character (㇝).
Logged In: YES
user_id=1163254
Which input method are you using for Japanese, scim-tables, UIM's
or the one provided by M17N?
The hyphen (-) is usually used for the vowel "lengthening dash" in
Japanese input methods, but it looks like a wrong glyph is assigned
to it in scim-tables (KATAKANA). Here's an example.
[scim-tables: KATAKANA]
pu [Space] - ru [Space] gives プ-ル
[UIM-anthy: hiragana]
pu-ru [F7] gives プール
They look almost the same but are different. When I tried the first
one on Babel Fish translator (from Japanese to English), it
returned "プ? ル" instead of "pool".
I don't use scim-tables for Japanese, so I'm not sure if there's any
other way to input the correct one...
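
For anyone who wants to check which dash a table actually emits, here is a minimal Python sketch (not part of scim-tables; the function names are made up for illustration). It prints the Unicode name of each character, assuming the first sample above contains a plain ASCII hyphen. The correct long-vowel mark is U+30FC, KATAKANA-HIRAGANA PROLONGED SOUND MARK.

```python
import unicodedata

# The katakana/hiragana lengthening dash.
PROLONGED_SOUND_MARK = "\u30fc"


def describe_chars(text):
    """Return (char, code point, Unicode name) for every character in text."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


def has_proper_long_vowel_mark(text):
    """True if text contains the real prolonged sound mark (U+30FC)."""
    return PROLONGED_SOUND_MARK in text


if __name__ == "__main__":
    # First sample uses an ASCII hyphen (an assumption), second uses U+30FC.
    for sample in ("\u30d7-\u30eb", "\u30d7\u30fc\u30eb"):
        print(sample, has_proper_long_vowel_mark(sample))
        for ch, code, name in describe_chars(sample):
            print("   ", ch, code, name)
```

Pasting the output of an input method into `describe_chars` shows at a glance whether the dash is U+30FC or just a lookalike.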
Logged In: YES
user_id=600121
Yukiko suggested that there may be a mistake in the table
file for katakana. How do you think I can rectify it?
Logged In: YES
user_id=312071
My guess is that "scim_event.cpp" should be edited somehow but
I don't know how.
Apart from that, how about also applying the attached patch to
scim-tables-ja (which I do use, to answer Yukiko's question)?
Writing カーテン as "kaaten" is, IMHO, much more convenient than
"ka - ten". It also adds some foreign syllables so that I can, for
example, write my homeland フィンランド as "finlando" instead of
"w_inrando".
Logged In: YES
user_id=312071
Another related question, BTW: how do I get the "middle dot"?-)
Logged In: YES
user_id=600121
Thanks for your patch.
As I don't know Japanese, could the other Japanese
users comment on the patch?
Yukiko, what do you think?
Improved version of the "double vowel trick" patch
Logged In: YES
user_id=312071
I'm attaching a slightly improved version of the patch. It also defines
alternative spellings for "oo" ("o-", "ou") and "ee" ("e-", "ei").
Making similar shortcuts for double consonants (e.g. "hokkaido"
=> "ho_tsukaido") for hiragana, too, might also be handy --
perhaps even *generating* all the necessary combinations at
build time.
I'm not an expert in kana writing or a native Japanese speaker, though,
so I'd also very much like to hear Yukiko's comments on whether or not
these sound like good ideas.
Logged In: YES
user_id=1163254
> how do I get the "middle dot"?
It looks like the middle dot is not defined in scim-tables-ja. In
UIM-anthy and UIM-skk (which I use), "z/" is used for it.
> Writing カーテン as "kaaten" is, IMHO, much more convenient than
> "ka - ten".
Maybe... but what if you want to write "kaasan(Mom)" in katakana
for some reason?
> write my homeland フィンランド as "finlando" instead of "w_inrando".
I think "finrando" would be better. As far as I know, "l" is not used
for ra-ri-ru-re-ro in any Japanese romanization system. It is
reserved for other things in UIM-anthy. For example, "la" for a
small a.
Logged In: YES
user_id=1163254
I've uploaded japanese.scm to my web space. It defines all key
combinations to generate hiragana/katakana in UIM-anthy.
http://www.h4.dion.ne.jp/~apricots/files/japanese.scm
Note that it is written in EUC-JP encoding.
This page also might be of some help.
http://en.wikipedia.org/wiki/Romaji
(Click "nihongo" in "in other languages" to see more information.)
Logged In: YES
user_id=1163254
> It also defines alternative spellings for "oo" ("o-", "ou") and
> "ee" ("e-", "ei").
Sorry, I don't seem to understand your intention here, but please
note that "oo", "ou" and "o-" are all different. For example:
"Osaka" is "o o sa ka" in hiragana/katakana
"king" is "o u", not "o o"
"OK" is "o - ke (-)" in katakana.
I think the problem is that "-" (for the lengthening dash) is not treated
the same way as a consonant. For example, you can simply type
"biru" for "building" but have to use Space to write "beer" like
"bi(Space)-ru". If "bi" is committed when you input "-", you don't
need to press Space.
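
To illustrate the idea, here is a minimal Python sketch of a greedy longest-match converter in which "-" is just another key, so "bi" is committed as soon as "-" arrives. This is only an illustration with a tiny made-up table; it is not how scim-tables is implemented, and the names `KATAKANA` and `to_katakana` are hypothetical.

```python
# Tiny illustrative table; a real table would cover every syllable.
KATAKANA = {
    "bi": "\u30d3",  # ビ
    "ru": "\u30eb",  # ル
    "pu": "\u30d7",  # プ
    "-": "\u30fc",   # ー, the lengthening dash
}


def to_katakana(keys):
    """Convert typed keys to katakana by greedy longest match."""
    out = []
    i = 0
    max_len = max(len(k) for k in KATAKANA)
    while i < len(keys):
        # Try the longest possible chunk first, then shorter ones.
        for n in range(min(max_len, len(keys) - i), 0, -1):
            chunk = keys[i:i + n]
            if chunk in KATAKANA:
                out.append(KATAKANA[chunk])
                i += n
                break
        else:
            # Unknown key: pass it through unchanged.
            out.append(keys[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(to_katakana("biru"))   # building
    print(to_katakana("bi-ru"))  # beer, no Space needed
```

With the dash in the table like any other key, "bi-ru" and "pu-ru" need no extra keystroke.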
"hokkaido" is a common practice.
> perhaps even *generating* all the necessary combinations at
> build time.
That will improve usability. From the point of view of a native
Japanese speaker, the current scim-tables-ja is not easy to use...
Logged In: YES
user_id=312071
>> "ee" ("e-", "ei").
>Sorry, I don't seem to understand your intention here
I suggested adding alternative spellings so you can easily switch
between them. Suppose you want to write cake in Japanese:
"keeki" would automatically become "ke-ki" in katakana mode. If,
however, you wanted to write "business" in katakana (like in your
"okaasan" example), you could write "kee<down>ki" to get "ke i
ki".
Of course, you could always type "keiki" and "oka asan" but since
<down> (or whatever you have configured it to) is supposed to
give you alternative spellings in Scim, I think it would be logical for
katakana and hiragana ("oo" vs. "ou") to do so, too.
> I think the problem is "-" (for the lengthening dash) is
> not treated the same way as a consonant.
Mm. Perhaps that should be corrected. However, as the standard
romanization for beer is "biiru", making scim-tables-ja output
"bi-ru" by default in katakana mode (and having "ii" as alternative,
toggled by <down>) would also seem quite logical and convenient
to me. Even more so because it's not obvious to the user what
sort of "kana dash" the dash key is mapped to, but it's
immediately clear that the dash that appears when you write
"biiru" is most likely the correct one.
> "hokkaido" is a common practice. [...]
> That will improve usability.
Ok, I'll write a shell script once we agree about the vowel
doubling.
> I think "finrando" would be better.
Yes, you're probably right. You have to know some katakana in
order to write your name with it anyway, so using standard
romanization rules is most likely better than providing hints like
this.
> It looks like the middle dot is not defined in scim-tables-ja.
> In UIM-anthy and UIM-skk (which I use), "z/" is used for it.
Hmm. Scim-tables seems to ignore punctuation characters like '/'
in the table definitions. In your opinion, would for example the "x"
key do?
Logged In: YES
user_id=1163254
> <down> is supposed to give you alternative spellings in Scim, I
> think it would be logical for katakana and hiragana to do so, too.
In commonly used Japanese input methods, we press <space>
to convert a string of hiragana into a mixture of kanji and hiragana
(and sometimes katakana). We never have to press that key
while writing in hiragana or katakana. Frankly, "kee<down>ki"
appears very odd to me.
However, I have the impression that scim-tables-ja is meant for
non-native occasional writers. So, if you think giving alternatives
would help people who are not familiar with Japanese or any
Japanese input methods, it might be a good idea.
"x" is also used to get small vowels like "l".
If you think ease of use (for non-native writers) should come first,
it might also be better to allow "finlando". It's easier to type
lemon than remon for me too. :)
Logged In: YES
user_id=312071
Ok, here are the generation scripts ("bash HIRAGANA.txt.in.sh >
HIRAGANA.txt.in"). They are a compromise based on the discussion:
Double consonant and the foreign extras are in but the vowel lengthening trick is
*reversed*, i.e. typing 'kaaten' now results in "ka a te n" by default and
'kaa<down>ten' makes it a dash: "ka- te n". Key 'x' now works as an alternative to '_'
so instructions for other systems mostly apply and 'q' writes the middle dot (this is a
bit questionable but it's the best I could come up with without editing any C code).
I'm quite satisfied with the tables now. Some examples:
In katakana mode 'jack daniels' writes "zi _a _tsu ku da ni e ru su", and 'helsinki' writes
"he ru su _i n ki". On the other hand, 'keeki' => "ke e ki" and 'kee<down>ki' => "ke-
ki" and further 'kee<down><down>ki' => "ke i ki" (of course 'keiki' works too!).
In hiragana, 'hokkaido' => "ho _tsu ka i do" but 'gomennasai' => 'go me n na sa i' as
expected.
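
For readers without the attachments, here is a rough Python sketch of the generation idea. It is not the attached bash scripts and it does not emit the real scim-tables file format; the base table, the names and the dictionary output are all simplifications made up for illustration. Only the candidate order (plain doubled vowel first, then the dash, then the "ei"/"ou" spelling) follows the behaviour described above.

```python
SMALL_TSU = "\u30c3"   # ッ
LONG_MARK = "\u30fc"   # ー
VOWEL_KANA = {"a": "\u30a2", "i": "\u30a4", "u": "\u30a6",
              "e": "\u30a8", "o": "\u30aa"}
# Extra spelling offered for a doubled vowel: "ee" -> "ei", "oo" -> "ou".
ALT_VOWEL = {"e": "\u30a4", "o": "\u30a6"}


def double_consonant_entries(base):
    """Map e.g. 'kka' to small tsu + kana; vowels and 'n' are skipped."""
    entries = {}
    for romaji, kana in base.items():
        first = romaji[0]
        if first in "aiueon":
            continue
        entries[first + romaji] = [SMALL_TSU + kana]
    return entries


def long_vowel_entries(base):
    """Map e.g. 'kee' to its candidates, default spelling first."""
    entries = {}
    for romaji, kana in base.items():
        vowel = romaji[-1]
        if vowel not in VOWEL_KANA:
            continue
        candidates = [kana + VOWEL_KANA[vowel], kana + LONG_MARK]
        if vowel in ALT_VOWEL:
            candidates.append(kana + ALT_VOWEL[vowel])
        entries[romaji + vowel] = candidates
    return entries


if __name__ == "__main__":
    base = {"ka": "\u30ab", "ke": "\u30b1", "na": "\u30ca"}
    print(double_consonant_entries(base))
    print(long_vowel_entries(base))
```

Skipping "n" in the doubling step is what keeps 'gomennasai' from getting a small tsu.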
Logged In: YES
user_id=312071
Logged In: YES
user_id=312071
(a few more minor fixes)
Logged In: YES
user_id=312071
(...and still some more. This is the last one for now, I promise :))
Generation scripts for both kanas
Logged In: YES
user_id=600121
How to use the new tables: (reply to Yukiko)
AFAICT, just download
http://sourceforge.net/tracker/download.php?group_id=108454&atid=650540&file_id=110169&aid=1070059
and run
./KATAKANA.txt.in.sh > KATAKANA.txt.in
(do not forget to
chmod +x KATAKANA.txt.in.sh
first), then
cp KATAKANA.txt.in scim-tables/ja
to replace the original one, and make && make install
Logged In: YES
user_id=312071
Yes, or instead of chmod'ing, you could type "bash
KATAKANA.txt.in.sh > KATAKANA.txt.in && bash
HIRAGANA.txt.in.sh > HIRAGANA.txt.in". Of course, in the final
packaging, the makefile should run the scripts.
Logged In: YES
user_id=1163254
Thanks, Cougar, for your help.
> In katakana mode 'jack daniels' writes "zi _a _tsu ku da ni e ru
> su", and 'helsinki' writes "he ru su _i n ki".
It works great! Congratulations! It's an interesting way of typing
foreign words in Japanese katakana. :)
Though, there seems to be no way to type the lengthening dash
independently... Also, how about adding combinations of a
consonant and y which mostly end with the dash when written in
katakana? Here are some examples.
ny > "ni-" as in Sony
ky > "ki-" as in whisky
dy > "de_i-" as in candy
ly > "ri-" as in lily
py > "pi-" as in Snoopy etc.
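
Written as a lookup table, the rule could look like the following Python sketch. The five pairs are exactly the ones listed above; the names `Y_ENDINGS` and `kana_for_y_ending` are made up for illustration and this is not scim-tables syntax.

```python
# Consonant + y at the end of a word -> kana ending with the dash.
Y_ENDINGS = {
    "ny": "\u30cb\u30fc",        # ニー, as in Sony
    "ky": "\u30ad\u30fc",        # キー, as in whisky
    "dy": "\u30c7\u30a3\u30fc",  # ディー, as in candy
    "ly": "\u30ea\u30fc",        # リー, as in lily
    "py": "\u30d4\u30fc",        # ピー, as in Snoopy
}


def kana_for_y_ending(keys):
    """Return the kana for a trailing consonant+y pair, or None."""
    return Y_ENDINGS.get(keys[-2:].lower())


if __name__ == "__main__":
    for word in ("sony", "candy", "Snoopy", "lemon"):
        print(word, kana_for_y_ending(word))
```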
> 'q' writes the middle dot
I think it's a good idea.
Logged In: YES
user_id=312071
I filed a separate issue for the punctuation marks: #1083680.
Will try the extensions suggested by Yukiko once I get some
other busy business out of my hands.