|
From: Bert F. <be...@im...> - 2004-09-28 10:19:57
|
Folks,

I discussed this with Michael a bit and we came up with this proposal:

* Add a new system attribute 1006 returning a string describing the expected VM string encoding (http://minnow.cc.gatech.edu/squeak/314).
* Values are "UTF-8", "macintosh", "ISO-8859-1", etc. (exact spelling as in http://www.iana.org/assignments/character-sets).
* If attribute 1006 is not supported, assume "macintosh".

Using an attribute seems simpler than adding a new primitive.

I'm not sure, but I think it's possible to have filesystem-specific file name encodings under Unix. If necessary, we should add a primitive to FilePlugin to answer the encoding for a particular directory, using the same values as described above. Anyway, we can probably leave that for later.

- Bert -

On 27.09.2004 at 02:54, Andreas Raab wrote:

> John,
>
> We talked about this in the past - we need to do something to figure out what the primitives expect in their stringy interfaces (file names, clipboard, etc). I'm still in favour of having a primitive which answers the VM's expected encoding and defaults to MacRoman (which is indeed what I think most VMs actually use). After which we can start playing with using UTF8 or Latin1 or whatever else (I could easily imagine that an Eastern European VM uses a different encoding than a Far East VM).
>
> Cheers,
> - Andreas
>
> PS. I bought a Really, Really Good (tm) coffee maker today. It's just unbelievable how good coffee can be (heh, heh).
>
> ----- Original Message -----
> From: "John M McIntosh" <jo...@sm...>
> To: "The general-purpose Squeak developers list" <squ...@li...>
> Sent: Sunday, September 26, 2004 3:43 PM
> Subject: Re: umlaute in squeak?
>
> I think for this we need cut/copy/paste primitives that understand unicode
> Yell louder, I'm sure it exists in the OS api, just no-one has looked at it yet...
> This would imply that m17n would need to handle things. I'd think one could change the methods to figure out if the VM supports unicode cut/copy/paste and do the right thing...
>
> Perhaps one could even be convinced to allow for other types of data on the clipboard (pictures?)
>
> On Sep 26, 2004, at 11:18 AM, Bert Freudenberg wrote:
>
>> Yep, there are still some open ends in m17n, mostly VM related. For example, cut and paste from external sources shreds umlauts (tested on Win and Mac), and file names in the file list do not look right (although, on the Mac at least they can still be accessed).
>>
>> - Bert -
>>
>> On 26.09.2004 at 19:26, Martin Kuball wrote:
>>
>>> Hi!
>>>
>>> After some digging in the source code I found the problem. I'm using a utf8 locale and that produces 2 byte characters for the special german characters. But the vm uses only the 1st byte. This explains why I always see the same character for different umlaut characters. They always have the same 1st byte and differ only in the 2nd byte.
>>>
>>> I will try to work out a solution (other than changing the locale, because I think it should work out of the box in as many environments as possible)
>>>
>>> Martin
>>>
>>>
>>> On Monday 20 September 2004 at 11:12, danil a. osipchuk wrote:
>>>> Hi, Martin
>>>>
>>>> It seems that you are using unix vm. I've solved the issue by editing sqUnixX11.c and setting there:
>>>> static x2sqKey_t x2sqKey= x2sqKeyInput;
>>>> (it's x2sqKeyPlain by default in sources on Ian site). I also have built some fonts from TTF (russian in my case).
>>>> After rebuilding vm I've got squeak with russian fonts.
>>>> I hope that things will be less complicated when m17n project will be included in core Squeak. Also, there are a plenty of German squeakers here - may be they will point the shortest path.
>>>> Danil
>>>>
>>>>> On Saturday 18 September 2004 at 23:13, Bernhard Pieber wrote:
>>>>>> Martin Kuball <Mar...@we...> wrote:
>>>>>>> Is it possible to enter non 7bit characters like german umlaute into squeak text fields? When I type one of these (s=DDS...) I only get an A with a ~ above it.
>>>>>>
>>>>>> What do you mean by squeak text fields? I just tried it in a workspace in 3.7 and 3.8alpha and there it works. Which version of Squeak and which font did you use?
>>>>>
>>>>> With text field I mean any morph where you can enter text. I tried with the new 3.7full and the standard font. By the way it has never worked for me. I even tried the Windows version once but it showed the same behaviour.
>>>>>
>>>>> Martin
|
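The symptom Martin reports in the quoted thread (a UTF-8 locale, a VM that keeps only the first byte, and the same wrong glyph for every umlaut) can be reproduced in a few lines. This is only an illustration: `first_byte_only` is a hypothetical stand-in for the behaviour he describes, not code from any Squeak VM, and rendering the surviving byte as Latin-1 is an assumption that happens to match his "A with a ~ above it".

```python
import unicodedata

# Illustration only: `first_byte_only` is a hypothetical stand-in for the
# behaviour described in the thread, not code taken from any Squeak VM.


def first_byte_only(typed: str, locale_encoding: str = "utf-8") -> int:
    """Encode one typed character and keep only the first byte."""
    return typed.encode(locale_encoding)[0]


# a-umlaut, o-umlaut and u-umlaut, written as escapes to keep the source ASCII.
UMLAUTS = ["\u00e4", "\u00f6", "\u00fc"]

for ch in UMLAUTS:
    utf8_bytes = ch.encode("utf-8")
    kept = first_byte_only(ch)
    # Assumption: the image shows the surviving byte as a Latin-1 character.
    shown = bytes([kept]).decode("latin-1")
    print(utf8_bytes.hex(), hex(kept), unicodedata.name(shown))
```

All three umlauts share the same UTF-8 lead byte and differ only in the second byte, so dropping the second byte makes them indistinguishable.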
|
From: John M M. <jo...@sm...> - 2004-10-03 00:03:51
|
On Oct 2, 2004, at 11:29 AM, Bert Freudenberg wrote:

> Hmm, can't comment on that ... John?

Well I did, but it only went to Yoshiki.

Begin forwarded message:

> From: John M McIntosh <jo...@sm...>
> Date: October 1, 2004 8:29:48 PM PDT
> To: Yoshiki Ohshima <Yos...@ac...>
> Subject: Re: [Squeak-VMdev] Proposal: VM string encoding
>
> Hi, have you tried altering the Info.plist in the Squeak VM?
>
> key is SqueakEncodingType
> value right now is macintosh
>
> but could it be
> ShiftJIS
> or
> UTF-8

--
===========================================================================
John M. McIntosh <jo...@sm...> 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================
|
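For readers who want to check which value a given Mac VM bundle carries, the key John mentions can be read with Python's standard `plistlib`. This is a rough sketch: the key name `SqueakEncodingType` and the value `macintosh` come from the message above, while the sample plist content and the helper name `squeak_encoding_type` are made up for illustration, and a real Info.plist contains many more keys.

```python
import plistlib

# Made-up minimal plist for illustration; only the key and value are taken
# from the thread.
SAMPLE_INFO_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>SqueakEncodingType</key>
    <string>macintosh</string>
</dict>
</plist>
"""


def squeak_encoding_type(plist_bytes: bytes, default: str = "macintosh") -> str:
    """Return the SqueakEncodingType entry, or `default` if the key is absent."""
    info = plistlib.loads(plist_bytes)
    return info.get("SqueakEncodingType", default)


print(squeak_encoding_type(SAMPLE_INFO_PLIST))
```

Falling back to "macintosh" when the key is missing mirrors the default proposed earlier in the thread; whether the VM itself behaves that way is not stated here.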
|
From: Bert F. <be...@im...> - 2004-10-03 09:51:53
|
On 03.10.2004 at 02:02, John M McIntosh wrote:

> On Oct 2, 2004, at 11:29 AM, Bert Freudenberg wrote:
>
>> Hmm, can't comment on that ... John?
>
> Well I did, but it only went to Yoshiki

Okay. I just noticed Yoshiki is subscribed here, too, so we do not need to cc: him.

> Begin forwarded message:
>
>> From: John M McIntosh <jo...@sm...>
>> Hi, have you tried altering the Info.plist in the Squeak VM?
>>
>> key is SqueakEncodingType
>> value right now is macintosh
>>
>> but could it be
>> ShiftJIS
>> or
>> UTF-8

And this is a single global value applying to all operations, right? I just grepped for gCurrentVMEncoding and it seems to be used all over the place.

- Bert -
|
|
From: John M M. <jo...@sm...> - 2004-10-04 02:49:31
|
On Oct 3, 2004, at 2:51 AM, Bert Freudenberg wrote:

>>> key is SqueakEncodingType
>>> value right now is macintosh
>>>
>>> but could it be
>>> ShiftJIS
>>> or
>>> UTF-8
>
> And this is a single global value applying to all operations, right? I just grep'ed for gCurrentVMEncoding and it seems to be used all over the place.
>
> - Bert -

Yes, but if you look closer you'll see it's only used for file name operations.

--
===========================================================================
John M. McIntosh <jo...@sm...> 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================
|
|
From: Yoshiki O. <Yos...@ac...> - 2004-10-05 00:35:28
|
John and Bert,

> > Hi, have you tried altering the Info.plist in the Squeak VM?
> >
> > key is SqueakEncodingType
> > value right now is macintosh
> >
> > but could it be
> > ShiftJIS
> > or
> > UTF-8

Ah, I think this is how the VM is distributed in the Japanese distribution for Mac. My understanding must have been out of date.

-- Yoshiki
|
|
From: Andreas R. <and...@gm...> - 2004-10-05 02:00:24
|
Hi Guys,

Please, keep it simple. The question at hand was whether we need to distinguish between various (potentially differing) encodings or not. Has this question been sufficiently answered?

Cheers,
- Andreas

----- Original Message -----
From: "Yoshiki Ohshima" <Yos...@ac...>
To: "Squeak VM Developers" <squ...@li...>
Sent: Monday, October 04, 2004 5:32 PM
Subject: Re: [Squeak-VMdev] Proposal: VM string encoding
|
|
From: Yoshiki O. <Yos...@ac...> - 2004-10-08 01:46:16
|
Andreas,

> Please, keep it simple. The question at hand was whether we need to distinguish between various (potentially differing) encodings or not. Has this question been sufficiently answered?

Ah, yes. A read-only attribute is OK, and different encodings for file names, clipboard, and keyboard input (and a default system encoding for file contents) are necessary.

-- Yoshiki
|
|
From: Andreas R. <and...@gm...> - 2004-09-28 12:51:24
|
Sounds good to me.

Cheers,
- A.

----- Original Message -----
From: "Bert Freudenberg" <be...@im...>
To: <squ...@li...>
Cc: "Michael Rüger" <mi...@im...>; "Yoshiki Ohshima" <Yos...@ac...>
Sent: Tuesday, September 28, 2004 3:20 AM
Subject: [Squeak-VMdev] Proposal: VM string encoding
|
|
From: Bert F. <be...@im...> - 2004-09-28 13:42:53
|
Glad to hear that :)

One thing I forgot to mention that speaks against our proposal is that a new primitive would allow the VM to know it's dealing with a new image. Old images could still be given the macintosh encoding they expect. But maybe the VM could detect that through other means. I don't know if that's even a real issue.

- Bert -

On 28.09.2004 at 14:50, Andreas Raab wrote:

> Sounds good to me.
>
> Cheers,
> - A.
|
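The fallback rule from the proposal ("if attribute 1006 is not supported, assume macintosh") is what keeps old VMs working with new images. A minimal sketch of how the image side might apply it follows, assuming the attribute table can be modelled as a plain mapping. The function names and the `IANA_TO_PYTHON` table are hypothetical; the explicit mapping is there only because Python's codec for the IANA name "macintosh" is called `mac_roman`.

```python
from typing import Mapping

VM_ENCODING_ATTRIBUTE = 1006   # attribute number from the proposal
LEGACY_ENCODING = "macintosh"  # fallback named in the proposal

# Assumption for this sketch: only names mentioned in the thread need an
# explicit translation to Python's codec names.
IANA_TO_PYTHON = {"macintosh": "mac_roman"}


def vm_string_encoding(attributes: Mapping[int, str]) -> str:
    """Return the IANA name reported by the VM, or the legacy default."""
    return attributes.get(VM_ENCODING_ATTRIBUTE) or LEGACY_ENCODING


def decode_from_vm(raw: bytes, attributes: Mapping[int, str]) -> str:
    """Decode bytes handed over by the VM using the reported encoding."""
    name = vm_string_encoding(attributes)
    return raw.decode(IANA_TO_PYTHON.get(name.lower(), name))


old_vm = {}                 # a VM that predates attribute 1006
new_vm = {1006: "UTF-8"}    # a VM that reports its encoding

print(vm_string_encoding(old_vm), vm_string_encoding(new_vm))
```

Note that this only covers the direction discussed in the proposal (a new image on an old VM); the reverse case Bert raises, an old image on a new VM, is not addressed by an attribute alone.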
|
From: Yoshiki O. <Yos...@ac...> - 2004-09-28 23:34:17
|
Hello,

At Tue, 28 Sep 2004 12:20:03 +0200, Bert Freudenberg wrote:
>
> Folks,
>
> I discussed this with Michael a bit and we came up with this proposal:
>
> * Add a new system attribute 1006 returning a string describing the expected VM string encoding (http://minnow.cc.gatech.edu/squeak/314).
> * Values are "UTF-8", "macintosh", "ISO-8859-1", etc. (exact spelling as in http://www.iana.org/assignments/character-sets).
> * If attribute 1006 is not supported, assume "macintosh".
>
> Using an attribute instead of a primitive seems simpler than a new primitive.
>
> I'm not sure but I think it's possible to have filesystem-specific file name encodings under Unix. If neccessary, we should add a primitive to FilePlugin to answer the encoding for a particular directory, using the same values as described above. Anyway, we probably can leave that to later.

We'll need a way to use possibly different encodings for:

* File name
* clipboard
* keyboard input

and possibly

* default system encoding

Assigning 1006 through 1008 would be good.

I like the symbolic name idea, and vm attribute sounds good to me.

Thank you for the proposal. I once signed up to implement it, but didn't quite get my hands on it.

-- Yoshiki
|
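Yoshiki's per-channel variant could look roughly like the sketch below. The numbers 1006 through 1008 come from his message, but which number belongs to which channel is not specified in the thread, so the assignment in `Channel` is an assumption, as is reusing the "macintosh" fallback for each channel individually.

```python
from enum import Enum
from typing import Mapping


class Channel(Enum):
    # The thread only says "1006 through 1008"; this particular pairing of
    # numbers and channels is an assumption made for the sketch.
    FILE_NAME = 1006
    CLIPBOARD = 1007
    KEYBOARD = 1008


def encoding_for(channel: Channel, attributes: Mapping[int, str]) -> str:
    """Return the encoding reported for one channel, or the legacy default."""
    return attributes.get(channel.value) or "macintosh"


# Hypothetical VM that reports two of the three attributes.
sample_attributes = {1006: "UTF-8", 1007: "EUC-JP"}

for channel in Channel:
    print(channel.name, encoding_for(channel, sample_attributes))
```

Keeping one attribute per channel lets the image convert file names, clipboard text and keyboard input independently, which is the case Yoshiki describes for mixed environments.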
|
From: Bert F. <be...@im...> - 2004-10-01 21:32:58
|
On 29.09.2004 at 01:33, Yoshiki Ohshima wrote:

> Hello,
>
> At Tue, 28 Sep 2004 12:20:03 +0200, Bert Freudenberg wrote:
>>
>> Folks,
>>
>> I discussed this with Michael a bit and we came up with this proposal:
>>
>> * Add a new system attribute 1006 returning a string describing the expected VM string encoding (http://minnow.cc.gatech.edu/squeak/314).
>> * Values are "UTF-8", "macintosh", "ISO-8859-1", etc. (exact spelling as in http://www.iana.org/assignments/character-sets).
>> * If attribute 1006 is not supported, assume "macintosh".
>>
>> Using an attribute instead of a primitive seems simpler than a new primitive.
>>
>> I'm not sure but I think it's possible to have filesystem-specific file name encodings under Unix. If neccessary, we should add a primitive to FilePlugin to answer the encoding for a particular directory, using the same values as described above. Anyway, we probably can leave that to later.
>
> We'll need a way to use possibly different encodings for:
>
> * File name
> * clipboard
> * keyboard input
> and possibly
> * default system encoding
>
> Assigning 1006 through 1008 would be good.

Can you describe the circumstances when these encodings are different from the default encoding?

Are they at least constant over a session?

Or can the encoding differ for distinct clipboard accesses or filename uses? If the latter, then a VM attribute would not be optimal.

- Bert -
|
|
From: Yoshiki O. <Yos...@ac...> - 2004-10-02 00:49:44
|
Bert,

> > We'll need a way to use possibly different encodings for:
> >
> > * File name
> > * clipboard
> > * keyboard input
> > and possibly
> > * default system encoding
> >
> > Assigning 1006 through 1008 would be good.
>
> Can you describe the circumstances when these encodings are different from the default encoding?

It is rather typical in this part of the world on Unices. Older systems tend to use EUC and newer ones usually try to use Unicode, but often those are mixed in one environment and possibly all combinations can exist. I don't know too much about Mac, but as far as I know, it has different sets of file access APIs and none of them is good enough to be used exclusively. That is why (as far as I heard) the Japanese Squeak running on Mac OS X cannot use the Japanese image file name because at first the VM uses the MacRoman version of the file name.

> Are they at least constant over a session?

Yes. We can assume this.

> Or can the encoding differ for distinct clipboard accesses or filename uses? If the latter, then a VM attribute would not be optimal.

For file name, it has to be set in the earlier stage of VM invocation, but the clipboard and keyboard input can be set later. So, potentially read-write VM attributes would do it.

-- Yoshiki
|
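To make the mixed EUC/Unicode situation concrete, the sketch below writes one file name in EUC-JP and in UTF-8 and reads each back with both codecs. The file name and the helper `read_back` are hypothetical; the point is only that the same text yields different bytes, so bytes read with the wrong assumption come back damaged.

```python
# Hypothetical Japanese image file name, written with escapes to keep the
# source ASCII ("nihongo" in kanji followed by ".image").
FILE_NAME = "\u65e5\u672c\u8a9e.image"


def read_back(text: str, written_as: str, read_as: str) -> str:
    """Encode with one codec and decode with another, replacing bad bytes."""
    return text.encode(written_as).decode(read_as, errors="replace")


for written_as in ("euc_jp", "utf-8"):
    raw = FILE_NAME.encode(written_as)
    for read_as in ("euc_jp", "utf-8"):
        intact = read_back(FILE_NAME, written_as, read_as) == FILE_NAME
        print(written_as, raw.hex(), read_as, "ok" if intact else "damaged")
```

Only the two matching combinations survive, which is why a single global encoding value is not enough when the file system and the clipboard disagree.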
|
From: Bert F. <be...@im...> - 2004-10-02 18:30:35
|
On 02.10.2004 at 02:47, Yoshiki Ohshima wrote:

> Bert,
>
>>> We'll need a way to use possibly different encodings for:
>>>
>>> * File name
>>> * clipboard
>>> * keyboard input
>>> and possibly
>>> * default system encoding
>>>
>>> Assigning 1006 through 1008 would be good.
>>
>> Can you describe the circumstances when these encodings are different from the default encoding?
>
> It is rather typical in this part of the world on Unices. Older systems tend to use EUC and newer ones usually trys to use Unicode, but often those are mixed in one environment and possibly all combinations can exist.

Okay. But how should the VM know which one is actually in use? I don't know any system call to get a character set.

> I don't know too much about Mac, but as far as I know, it has different set of file access APIs and none of them is good enough to be used exclusively. That is why (as far as I heard) the Japanese Squeak running on Mac OS X cannot use the Japanese image file name because at first the VM uses the MacRoman version of file name.

Hmm, can't comment on that ... John?

>> Are they at least constant over a session?
>
> Yes. We can assume this.

okay ...

>> Or can the encoding differ for distinct clipboard accesses or filename uses? If the latter, then a VM attribute would not be optimal.
>
> For file name, it has to be set in the earlier stage of VM invocation, but the clipboard and keyboard input can be set later. So, potentially read-write VM attributes would do it.

... so why would we need to *set* it if it's constant? Can't the conversion happen on the image side?

- Bert -
|
|
From: Yoshiki O. <Yos...@ac...> - 2004-10-05 00:35:27
|
Bert,

> Okay. But how should the VM know which one is actually in use? I don't know any system call to get a character set.

It is often the case that there is no perfectly reliable way. Reading the environment variables that the IM and window systems use is the common way.

(Did I write EUC for clipboard? The native encoding was "x-compound-text"...)

> >> Or can the encoding differ for distinct clipboard accesses or filename uses? If the latter, then a VM attribute would not be optimal.
> >
> > For file name, it has to be set in the earlier stage of VM invocation, but the clipboard and keyboard input can be set later. So, potentially read-write VM attributes would do it.
>
> ... so why would we need to *set* it if it's constant? Can't the conversion happen on the image side?

Yes, it does and it doesn't have to be a read-write attribute. (Note the word "potentially".) But if the VM and the underlying system support both X Compound Text and UTF-8 for clipboard encoding, it would be nice if we could explicitly choose UTF-8 so that a wider set of characters can be handled.

I admit that the concept of setting the encoding could introduce more headaches, though...

-- Yoshiki
|
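One common way to read the encoding from the environment, as Yoshiki suggests, is to look at the POSIX locale variables. The sketch below assumes that LC_ALL, LC_CTYPE and LANG are the variables that matter, checked in that order, and that their values follow the usual `language_TERRITORY.codeset@modifier` shape. The thread does not name specific variables, and input methods may consult others, so treat this as a heuristic; `codeset_from_environment` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Assumption: the standard POSIX locale variables, in precedence order.
LOCALE_VARIABLES = ("LC_ALL", "LC_CTYPE", "LANG")


def codeset_from_environment(env: Mapping[str, str]) -> Optional[str]:
    """Return the codeset of the first locale variable that is set, if any."""
    for variable in LOCALE_VARIABLES:
        value = env.get(variable)
        if not value:
            continue  # unset or empty: fall through to the next variable
        # Locale names look like "language_TERRITORY.codeset@modifier".
        _, dot, rest = value.partition(".")
        codeset = rest.partition("@")[0]
        # A set variable without a codeset (e.g. "C") still takes
        # precedence, so stop here instead of consulting later variables.
        return codeset if dot and codeset else None
    return None


print(codeset_from_environment(os.environ))
```

The result is spelled the way the platform spells it (for example "eucJP"), so it would still need to be mapped to the IANA names used in the proposal, and a missing codeset leaves the question open, which matches the remark that there is no perfectly reliable way.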