#180 extending coding cookie for sjis

Milestone: Completed
Status: closed
Labels: SciTE (236)
Priority: 2
Updated: 2011-11-02
Created: 2004-12-31
Creator: XhE
Private: No

Hi guys,

As I'm studying Japanese and using LaTeX, there is no way
around Shift-JIS, since the Japanese versions of LaTeX
(pLaTeX) don't support UTF-8/16 encoding. :((
I also don't want to define the encoding of my files in a
local SciTE.properties file, but on a per-file basis.

The introduction of the utf-8 cookie was therefore already
very useful. I would like to ask you to add a mapping of

-*- coding: sjis -*-

or, if you consider it better,

-*- coding: shift-jis -*-

to code.page=932
and character.set=128

That would be great.

If there is already a way to configure
SciTE.properties on a per-file basis, please let me
know.

Thanks

Discussion

  • Neil Hodgson

    Neil Hodgson - 2005-01-04
    • priority: 5 --> 2
    • assigned_to: nobody --> nyamatongwe
     
  • Neil Hodgson

    Neil Hodgson - 2005-01-04

    It is unlikely I will work on this feature. A well written patch
    would be considered.

     
  • XhE

    XhE - 2005-01-04

    OK, here is a patch.

    I just defined a new encoding type alongside uni8Bit and
    uniCookie, called uniSJIS. OpenFile now also checks for
    uniSJIS, and I also have to call ReadFontProperties in order
    to set the right characterSet. That's about all.

    Thanks

     
  • XhE

    XhE - 2005-01-04

    Sorry, it's not working.

    I suppose I forgot something. I'll write again when it's working.

     
  • XhE

    XhE - 2005-01-04

    Final Patch File for SJIS Cookie Support

     
  • XhE

    XhE - 2005-01-04

    OK, it's finally working.

    I hadn't paid attention to the fact that SJIS is also
    treated as uni8Bit, just with a different code page and
    character set.

    So there is now a new type, uni8BitSJIS, in the UniMode
    enumeration. When it is found, the code page and character
    set are changed and uniMode is set back to uni8Bit.

    That's all.
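
The approach described in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the attached patch: `Settings`, `CookieMode` and `ApplyCookieMode` are hypothetical names, only the enum values mentioned in this thread are listed (`uni8Bit = 0` is an assumption), and 65001 stands in for Scintilla's `SC_CP_UTF8`.

```cpp
#include <string>

// uniCookie and uni8BitSJIS follow the values used in this thread;
// uni8Bit = 0 is an assumption made for this sketch.
enum UniMode { uni8Bit = 0, uniCookie = 4, uni8BitSJIS = 5 };

// Hypothetical stand-in for the editor state touched when a file is opened.
struct Settings {
    int codePage = 0;
    int characterSet = 0;
    UniMode unicodeMode = uni8Bit;
};

const int kCodePageUTF8 = 65001;   // numeric value of SC_CP_UTF8
const int kCodePageSJIS = 932;     // code.page requested in the ticket
const int kCharacterSetSJIS = 128; // character.set requested in the ticket

// Map the (already lower-cased) value of a "coding:" cookie to a mode.
UniMode CookieMode(const std::string &code) {
    if (code == "utf-8") return uniCookie;
    if (code == "sjis") return uni8BitSJIS;
    return uni8Bit;
}

// uni8BitSJIS is only a transient marker: it selects the code page and
// character set, then the mode falls back to plain uni8Bit.
void ApplyCookieMode(Settings &s, UniMode mode) {
    if (mode == uni8BitSJIS) {
        s.codePage = kCodePageSJIS;
        s.characterSet = kCharacterSetSJIS;
        s.unicodeMode = uni8Bit;
    } else if (mode != uni8Bit) {
        // Any Unicode mode overrides the code page.
        s.codePage = kCodePageUTF8;
        s.unicodeMode = mode;
    } else {
        s.unicodeMode = uni8Bit;
    }
}
```

In the real code the font properties would presumably have to be re-read after the character set changes, which this sketch leaves out.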

     
  • Neil Hodgson

    Neil Hodgson - 2005-02-03

    The main problem with the patch is that if you switch to
    another buffer and back, the character set is reset
    and so the Japanese characters are no longer displayed.

    The patch also did not handle all the types of Unicode
    encoding correctly. A small change improved this. Diffed
    against CVS:

    Index: SciTEIO.cxx
    ===================================================================
    RCS file: /cvsroot/scintilla/scite/src/SciTEIO.cxx,v
    retrieving revision 1.106
    diff -u -r1.106 SciTEIO.cxx
    --- SciTEIO.cxx 22 Jul 2004 12:40:10 -0000 1.106
    +++ SciTEIO.cxx 3 Feb 2005 00:03:47 -0000
    @@ -378,6 +378,8 @@
    code.lowercase();
    if (code == "utf-8") {
    return uniCookie;
    + } else if (code == "sjis") {
    + return uni8BitSJIS;
    }
    }
    }
    @@ -479,7 +481,13 @@
    unicodeMode = CookieValue(l2);
    }
    }
    - if (unicodeMode != uni8Bit) {
    + if (unicodeMode == uni8BitSJIS) {
    + codePage = 932;
    + characterSet = 128;
    + unicodeMode = uni8Bit;
    + // Need to reset character styles to match modified character set
    + ReadFontProperties();
    + } else if (unicodeMode != uni8Bit) {
    // Override the code page if Unicode
    codePage = SC_CP_UTF8;
    } else {

     
  • XhE

    XhE - 2005-03-14

    I see. I think it is fixed now. The problem did not appear
    on my machine because I hadn't enabled buffers yet. I have
    now tested it, and it seems to be working. It might be
    even better to make a general fix that saves the characterSet
    on a per-buffer basis. Nonetheless, it's working. Please let
    me know if you will commit it to CVS.

    Diffed against CVS:

    Index: scite/src/SciTEBase.h
    ===================================================================
    RCS file: /cvsroot/scintilla/scite/src/SciTEBase.h,v
    retrieving revision 1.231
    diff -r1.231 SciTEBase.h
    160c160
    < uniCookie=4
    ---
    > uniCookie=4,uni8BitSJIS=5
    277a278
    > int characterSet;
    283c284
    < unicodeMode(uni8Bit), fileModTime(0), foldState() {
    ---
    > unicodeMode(uni8Bit), characterSet(-1), fileModTime(0), foldState() {
    Index: scite/src/SciTEBuffers.cxx
    ===================================================================
    RCS file: /cvsroot/scintilla/scite/src/SciTEBuffers.cxx,v
    retrieving revision 1.122
    diff -r1.122 SciTEBuffers.cxx
    202a203,206
    > if ( bufferNext.characterSet != -1 ) {
    > characterSet = bufferNext.characterSet;
    > ReadFontProperties();
    > }
    Index: scite/src/SciTEIO.cxx
    ===================================================================
    RCS file: /cvsroot/scintilla/scite/src/SciTEIO.cxx,v
    retrieving revision 1.106
    diff -r1.106 SciTEIO.cxx
    380,381c380,383
    < return uniCookie;
    < }
    ---
    > return uniCookie;
    > } else if (code == "sjis") {
    > return uni8BitSJIS;
    > }
    482,484c484,494
    < if (unicodeMode != uni8Bit) {
    < // Override the code page if Unicode
    < codePage = SC_CP_UTF8;
    ---
    >
    > if (unicodeMode == uni8BitSJIS) {
    > codePage = 932;
    > characterSet = 128;
    > buffers.buffers[buffers.Current()].characterSet = 128;
    > unicodeMode = uni8Bit;
    > // Need to reset character styles to match modified character set
    > ReadFontProperties();
    > } else if (unicodeMode != uni8Bit) {
    > // Override the code page if Unicode
    > codePage = SC_CP_UTF8;

     
  • Neil Hodgson

    Neil Hodgson - 2005-03-19

    Shouldn't the code page also be added to the buffer
    information? The character set also needs to be initialised
    in Buffer::Init.
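
One way to read this suggestion, as a minimal sketch: each buffer keeps both values, `Init` resets them to a "no override" marker, and switching buffers restores either the override or the configured default. `EncodingState`, `SetOverride` and `Activate` are hypothetical names for illustration, `Buffer` is reduced to the two fields under discussion, and -1 as the marker follows the `characterSet(-1)` initialiser in the patch above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, reduced stand-in for the per-buffer record.
struct Buffer {
    int codePage;
    int characterSet;
    Buffer() { Init(); }
    // -1 means "no per-file override; use the configured default".
    void Init() {
        codePage = -1;
        characterSet = -1;
    }
};

// Hypothetical holder for the configured defaults and the values in effect.
struct EncodingState {
    int defaultCodePage = 0;
    int defaultCharacterSet = 0;
    int codePage = 0;      // value currently in effect
    int characterSet = 0;  // value currently in effect
    std::vector<Buffer> buffers;

    // Record an override detected while opening the file in buffer `index`.
    void SetOverride(std::size_t index, int cp, int cs) {
        buffers.at(index).codePage = cp;
        buffers.at(index).characterSet = cs;
    }

    // Called on a buffer switch: restore the override or fall back to defaults.
    void Activate(std::size_t index) {
        const Buffer &b = buffers.at(index);
        codePage = (b.codePage != -1) ? b.codePage : defaultCodePage;
        characterSet = (b.characterSet != -1) ? b.characterSet : defaultCharacterSet;
    }
};
```

Storing both values together keeps the code page and character set from drifting apart when switching between a Shift-JIS buffer and an ordinary one.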

     
  • Neil Hodgson

    Neil Hodgson - 2011-11-02

    The command.discover.properties setting can be used for this.
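
As far as I understand the SciTE documentation, `command.discover.properties` names a command that is run when a file is opened, and its output is applied as properties. A minimal sketch of the logic such a command could use for the cookie requested in this ticket is shown below. `CookieValue` and `DiscoverProperties` are hypothetical helpers written for this illustration, the accepted spellings are an assumption, and only the first two lines are inspected, mirroring the cookie check in the diffs above.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Return the lower-cased value of a "coding: NAME" cookie in one line, or "".
std::string CookieValue(const std::string &line) {
    const std::string key = "coding";
    std::string::size_type pos = line.find(key);
    if (pos == std::string::npos) return "";
    pos += key.size();
    // The key must be followed directly by ':' or '='.
    if (pos >= line.size() || (line[pos] != ':' && line[pos] != '=')) return "";
    ++pos;
    // Skip blanks and quotes before the encoding name.
    while (pos < line.size() && (line[pos] == ' ' || line[pos] == '\t' ||
                                 line[pos] == '"' || line[pos] == '\'')) {
        ++pos;
    }
    std::string value;
    while (pos < line.size()) {
        const unsigned char ch = static_cast<unsigned char>(line[pos]);
        if (!(std::isalnum(ch) || ch == '-' || ch == '_')) break;
        value += static_cast<char>(std::tolower(ch));
        ++pos;
    }
    return value;
}

// Look at the first two lines of the stream and return the properties to
// apply, or an empty string when no Shift-JIS cookie is present.
std::string DiscoverProperties(std::istream &in) {
    std::string line;
    for (int i = 0; i < 2 && std::getline(in, line); ++i) {
        const std::string code = CookieValue(line);
        if (code == "sjis" || code == "shift-jis" || code == "shift_jis") {
            return "code.page=932\ncharacter.set=128\n";
        }
    }
    return "";
}
```

A small `main` that opens the file named in `argv[1]`, passes it to `DiscoverProperties` and writes the result to standard output would turn this into a command that could be referenced from `command.discover.properties` with `"$(FilePath)"` as its argument. For a quick check, a `std::istringstream` can be passed in place of a file.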

     
  • Neil Hodgson

    Neil Hodgson - 2011-11-02
    • milestone: --> Completed
    • labels: --> SciTE
    • status: open --> closed
     
