Smart Common Input Method platform / Support Requests / #1 Inputting "vowel lengthening dash" in katakana mode?

Jarno Elonen - 2004-11-20

summary: Inputting "vocal dash" in katakana mode? --> Inputting "vowel lengthening dash" in katakana mode?

Yukiko Bando - 2004-11-21

Logged In: YES
user_id=1163254

Which input method are you using for Japanese, scim-tables, UIM's
or the one provided by M17N?

The hyphen (-) is usually used for the vowel "lengthening dash" in
Japanese input methods, but it looks like a wrong glyph is assigned
to it in scim-tables (KATAKANA). Here's an example.

[scim-tables: KATAKANA]
pu [Space] - ru [Space] gives プ－ル
[UIM-anthy: hiragana]
pu-ru [F7] gives プール

They look almost the same but are different. When I tried the first
one on Babel Fish translator (from Japanese to English), it
returned "プ? ル" instead of "pool".

I don't use scim-tables for Japanese, so I'm not sure if there's any
other way to input the correct one...
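
The difference between the two results above is easier to see by code point than by eye. The sketch below assumes the glyph emitted by scim-tables is U+FF0D (FULLWIDTH HYPHEN-MINUS), which is what the example appears to contain; the katakana lengthening mark itself is U+30FC. The helper name `describe` is made up for illustration.

```python
import unicodedata

# Assumption: the scim-tables KATAKANA table emits U+FF0D for "-".
wrong = "\u30d7\uff0d\u30eb"   # プ－ル (katakana PU, fullwidth hyphen, katakana RU)
right = "\u30d7\u30fc\u30eb"   # プール (katakana PU, prolonged sound mark, katakana RU)

def describe(text):
    """Return a (code point, Unicode name) pair for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for label, text in (("scim-tables", wrong), ("UIM-anthy", right)):
    print(label, describe(text))
```

Only the middle character differs, which would explain why a translator that expects U+30FC fails on the first string.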

LiuCougar - 2004-11-21

Logged In: YES
user_id=600121

Yukiko suggested that there may be a mistake in the table
file for katakana. How do you think I can rectify it?

Jarno Elonen - 2004-11-21

Logged In: YES
user_id=312071

My guess is that "scim_event.cpp" should be edited somehow but
I don't know how.

Apart from that, how about also applying the attached patch to
scim-tables-ja (which I do use, to answer Yukiko's question)?
Writing カーテン as "kaaten" is, IMHO, much more convenient than
"ka - ten". It also adds some foreign syllables so that I can, for
example, write my homeland フィンランド as "finlando" instead of
"w_inrando".

Jarno Elonen - 2004-11-21

prolongedsound-double-vowel-trick.diff

Jarno Elonen - 2004-11-21

Logged In: YES
user_id=312071

Another related question, BTW: how do I get the "middle dot"?-)

LiuCougar - 2004-11-21

Logged In: YES
user_id=600121

Thanks for your patch.

As I don't know Japanese, could the Japanese
users comment on the patch?

Yukiko, what do you think?

Jarno Elonen - 2004-11-21

Improved version of the "double vowel trick" patch

prolongedsound-double-vowel-trick2.diff

Jarno Elonen - 2004-11-21

Logged In: YES
user_id=312071

I'm attaching a slightly improved version of the patch. It also defines
alternative spellings for "oo" ("o-", "ou") and "ee" ("e-", "ei").

Making similar shortcuts for double consonants (e.g. "hokkaido"
=> "ho_tsukaido") for hiragana, too, might also be handy --
perhaps even *generating* all the necessary combinations at
build time.

I'm not an expert in kana writing or a native Japanese speaker, though,
so I'd also very much like to hear Yukiko's comments on whether or not
these sound like good ideas.
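
Generating the double-consonant shortcuts could look roughly like the sketch below. It is only an illustration: the `BASE` excerpt and the function name are hypothetical, and the real table format of scim-tables-ja is not modelled. It assumes a doubled consonant should expand to a small tsu (typed as "_tsu" in the current table) plus the plain syllable, and it skips "n", since a doubled "n" is normally the moraic n followed by an n-syllable rather than a small tsu.

```python
SOKUON = "\u3063"  # hiragana small tsu (っ)

# Hypothetical excerpt of a romaji -> hiragana table.
BASE = {"ka": "か", "ki": "き", "ta": "た", "pa": "ぱ", "na": "な", "a": "あ"}

def add_double_consonants(base):
    """Return a copy of base extended with doubled-consonant keys."""
    table = dict(base)
    for key, kana in base.items():
        first = key[0]
        # Skip keys starting with a vowel, and "n" (assumption: "nn" is not a small tsu).
        if first in "aiueon":
            continue
        table[first + key] = SOKUON + kana
    return table

table = add_double_consonants(BASE)
print(sorted(k for k in table if k not in BASE))
```

With this excerpt, "kka" maps to small tsu + "ka", which is the piece needed for typing "hokkaido" directly.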

Yukiko Bando - 2004-11-22

Logged In: YES
user_id=1163254

> how do I get the "middle dot"?
It looks like the middle dot is not defined in scim-tables-ja. In
UIM-anthy and UIM-skk (which I use), "z/" is used for it.

> Writing カーテン as "kaaten" is, IMHO, much more convenient than
> "ka - ten".
Maybe... but what if you want to write "kaasan" (Mom) in katakana
for some reason?

> write my home land フィンランド as "finlando" instead of "w_inrando".
I think "finrando" would be better. As far as I know, "l" is not used
for ra-ri-ru-re-ro in any Japanese romanization system. It is
reserved for other things in UIM-anthy. For example, "la" gives a
small a.

Yukiko Bando - 2004-11-22

Logged In: YES
user_id=1163254

I've uploaded japanese.scm to my web space. It defines all key
combinations to generate hiragana/katakana in UIM-anthy.
http://www.h4.dion.ne.jp/~apricots/files/japanese.scm
Note that it is written in eucjp encoding.

This page also might be of some help.
http://en.wikipedia.org/wiki/Romaji
(Click "nihongo" in "in other languages" to see more information.)

Yukiko Bando - 2004-11-22

Logged In: YES
user_id=1163254

> It also defines alternative spellings for "oo" ("o-", "ou") and
> "ee" ("e-", "ei").
Sorry, I don't seem to understand your intention here, but please
note that "oo", "ou" and "o-" are all different. For example:

"Osaka" is "o o sa ka" in hiragana/katakana
"king" is "o u", not "o o"
"OK" is "o - ke (-)" in katakana.

I think the problem is "-" (for the lengthening dash) is not treated
the same way as a consonant. For example, you can simply type
"biru" for "building" but have to use Space to write "beer" like
"bi(Space)-ru". If "bi" is committed when you input "-", you don't
need to press Space.

"hokkaido" is a common practice.

> perhaps even *generating* all the necessary combinations at
> build time.
That will improve usability. From the point of view of a native
Japanese speaker, the current scim-tables-ja is not easy to use...
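
The behaviour suggested above can be pictured with a toy converter in which "-" is just another key, so the preceding syllable is committed as soon as the dash is typed. This is a sketch only, not how scim-tables matches keys; `TABLE` and `convert` are hypothetical names, and the table holds just enough entries for the examples.

```python
PROLONGED = "\u30fc"  # katakana lengthening dash (ー)

# Hypothetical romaji -> katakana excerpt; "-" is treated like any other key.
TABLE = {"bi": "ビ", "ru": "ル", "pu": "プ", "-": PROLONGED}

def convert(keys):
    """Greedy longest-match conversion of a key sequence, left to right."""
    out = []
    i = 0
    max_len = max(len(k) for k in TABLE)
    while i < len(keys):
        for n in range(max_len, 0, -1):
            chunk = keys[i:i + n]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += n
                break
        else:
            # No table entry starts here: pass the key through unchanged.
            out.append(keys[i])
            i += 1
    return "".join(out)

print(convert("biru"))   # building
print(convert("bi-ru"))  # beer, typed without Space
```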

Jarno Elonen - 2004-11-22

Logged In: YES
user_id=312071

>> "ee" ("e-", "ei").
>Sorry, I don't seem to understand your intention here

I suggested adding alternative spellings so you can easily switch
between them. Suppose you want to write cake in Japanese:
"keeki" would automatically become "ke-ki" in katakana mode. If,
however, you wanted to write "business" in katakana (like in your
"okaasan" example), you could write "kee<down>ki" to get "ke i
ki".

Of course, you could always type "keiki" and "oka asan" but since
<down> (or whatever you have configured it to) is supposed to
give you alternative spellings in Scim, I think it would be logical for
katakana and hiragana ("oo" vs. "ou") to do so, too.

> I think the problem is "-" (for the lengthening dash) is
> not treated the same way as a consonant.

Mm. Perhaps that should be corrected. However, as the standard
romanization for beer is "biiru", making scim-tables-ja output
"bi-ru" by default in katakana mode (and having "ii" as alternative,
toggled by <down>) would also seem quite logical and convenient
to me. Even more so because it's not obvious to the user what
sort of "kana dash" the dash key is mapped to, but it's
immediately clear that the dash that appears when you write
"biiru" is most likely the correct one.

> "hokkaido" is a common practice. [...]
> That will improve usability.

Ok, I'll write a shell script once we agree about the vowel
doubling.

> I think "finrando" would be better.

Yes, you're probably right. You have to know some katakana in
order to write your name with it anyway, so using standard
romanization rules is most likely better than providing hints like
this.

> It looks like the middle dot is not defined in scim-tables-ja.
> In UIM-anthy and UIM-skk (which I use), "z/" is used for it.

Hmm. Scim-tables seems to ignore punctuation characters like '/'
in the table definitions. In your opinion, would for example the "x"
key do?

Yukiko Bando - 2004-11-22

Logged In: YES
user_id=1163254

> <down> is supposed to give you alternative spellings in Scim, I
> think it would be logical for katakana and hiragana to do so, too.

In commonly used Japanese input methods, we press <space>
to convert a string of hiragana into a mixture of kanji and hiragana
(and sometimes katakana). We never have to press that key
while writing in hiragana or katakana. Frankly, "kee<down>ki"
appears very odd to me.

However, I have the impression that scim-tables-ja is meant for
non-native occasional writers. So, if you think giving alternatives
would help people who are not familiar with Japanese or any
Japanese input methods, it might be a good idea.

"x" is also used to get small vowels, just as "l" is.

If you think ease of use (for non-native writers) should come first,
it might also be better to allow "finlando". It's easier to type
lemon than remon for me too. :)

LiuCougar - 2004-11-24

assigned_to: nobody --> ybando

Jarno Elonen - 2004-11-26

Logged In: YES
user_id=312071

Ok, here are the generation scripts ("bash HIRAGANA.txt.in.sh >
HIRAGANA.txt.in"). They are a compromise based on the discussion:

Double consonants and the foreign extras are in, but the vowel lengthening trick is
*reversed*, i.e. typing 'kaaten' now results in "ka a te n" by default and
'kaa<down>ten' makes it a dash: "ka- te n". Key 'x' now works as an alternative to '_'
so instructions for other systems mostly apply and 'q' writes the middle dot (this is a
bit questionable but it's the best I could come up with without editing any C code).

I'm quite satisfied with the tables now. Some examples:

In katakana mode 'jack daniels' writes "zi _a _tsu ku da ni e ru su", and 'helsinki' writes
"he ru su _i n ki". On the other hand, 'keeki' => "ke e ki" and 'kee<down>ki' => "ke-
ki" and further 'kee<down><down>ki' => "ke i ki" (of course 'keiki' works too!).

In hiragana, 'hokkaido' => "ho _tsu ka i do" but 'gomennasai' => 'go me n na sa i' as
expected.
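
The candidate order described above (plain doubled vowel first, then the dash, then "ei"/"ou" where applicable) could be generated along these lines. This is a sketch, not the attached shell scripts: the table excerpts and `doubled_vowel_candidates` are hypothetical, and the third candidate for "oo" is assumed by analogy with the "ee" example.

```python
PROLONGED = "\u30fc"  # katakana lengthening dash (ー)

# Hypothetical excerpts of the katakana table.
BASE = {"ka": "カ", "ke": "ケ", "ko": "コ", "bi": "ビ"}
VOWEL_KANA = {"a": "ア", "i": "イ", "u": "ウ", "e": "エ", "o": "オ"}
# Extra alternative spellings: "ee" may also stand for "ei", "oo" for "ou".
EXTRA_ALT = {"e": "イ", "o": "ウ"}

def doubled_vowel_candidates(key):
    """Return (typed keys, ordered candidates) for a syllable with its vowel doubled."""
    kana = BASE[key]
    vowel = key[-1]
    # Default: the literal doubled vowel; first <down>: the lengthening dash.
    candidates = [kana + VOWEL_KANA[vowel], kana + PROLONGED]
    if vowel in EXTRA_ALT:
        # Second <down>: the "ei" / "ou" spelling.
        candidates.append(kana + EXTRA_ALT[vowel])
    return key + vowel, candidates

for syllable in ("ka", "ke"):
    print(doubled_vowel_candidates(syllable))
```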

Jarno Elonen - 2004-11-26

Logged In: YES
user_id=312071

Jarno Elonen - 2004-11-26

Logged In: YES
user_id=312071

(a few more minor fixes)

Jarno Elonen - 2004-11-26

Logged In: YES
user_id=312071

(...and still some more. This is the last one for now, I promise :))

Jarno Elonen - 2004-11-26

Generation scripts for both kanas

kana-scripts-3.tar.gz

LiuCougar - 2004-11-27

labels: --> Everyday use (inputing)

LiuCougar - 2004-11-27

Logged In: YES
user_id=600121

How to use the new tables: (reply to Yukiko)

AFAICT, just download
http://sourceforge.net/tracker/download.php?group_id=108454&atid=650540&file_id=110169&aid=1070059
and run
./KATAKANA.txt.in.sh > KATAKANA.txt.in
(do not forget to
chmod +x KATAKANA.txt.in.sh
first). Then
cp KATAKANA.txt.in scim-tables/ja
to replace the original one, and run
make && make install

Jarno Elonen - 2004-11-28

Logged In: YES
user_id=312071

Yes, or instead of chmod'ing, you could type "bash
KATAKANA.txt.in.sh > KATAKANA.txt.in && bash
HIRAGANA.txt.in.sh > HIRAGANA.txt.in". Of course, in the final
packaging, the makefile should run the scripts.

Yukiko Bando - 2004-11-28

Logged In: YES
user_id=1163254

Thanks, Cougar, for your help.

> In katakana mode 'jack daniels' writes "zi _a _tsu ku da ni e ru
> su", and 'helsinki' writes "he ru su _i n ki".

It works great! Congratulations! It's an interesting way of typing
foreign words in Japanese katakana. :)

However, there seems to be no way to type the lengthening dash
on its own... Also, how about adding combinations of a
consonant and y, which mostly end with the dash when written in
katakana? Here are some examples.

ny > "ni-" as in Sony
ky > "ki-" as in whisky
dy > "de_i-" as in candy
ly > "ri-" as in lily
py > "pi-" as in Snoopy etc.

> 'q' writes the middle dot

I think it's a good idea.
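
The "consonant + y" endings listed above follow one pattern: the consonant's i-row katakana followed by the lengthening dash. A generator for such entries might look like the sketch below; `I_ROW` and `y_ending_rules` are hypothetical, and only the five consonants from the examples are covered ("d" uses "de" plus a small "i", as in the list).

```python
PROLONGED = "\u30fc"  # katakana lengthening dash (ー)

# i-row katakana for the consonants in the examples (hypothetical excerpt).
I_ROW = {"n": "ニ", "k": "キ", "d": "ディ", "l": "リ", "p": "ピ"}

def y_ending_rules(i_row):
    """Map each 'consonant + y' key to its i-row kana plus the lengthening dash."""
    return {consonant + "y": kana + PROLONGED for consonant, kana in i_row.items()}

rules = y_ending_rules(I_ROW)
for key in sorted(rules):
    print(key, rules[key])
```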

Jarno Elonen - 2004-12-12

Logged In: YES
user_id=312071

I filed a separate issue for the punctuation marks: #1083680.
I will try the extensions suggested by Yukiko once I get some
other pressing business off my hands.
