Custom charsets are supported as long as they are single-byte charsets. That is, every supported Unicode code point is represented by exactly one byte.

The character map of such a charset has to be defined in an ASCII text file (alternatively, in a UTF-8 text file without BOM).
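The encoding requirement can be checked before a map file is used. The following is a minimal sketch, not part of CONVERTCP: the function name `read_map_text` is hypothetical, and it only tests for the UTF-8 BOM bytes and for valid UTF-8 (ASCII being a subset of it).

```python
UTF8_BOM = b"\xef\xbb\xbf"


def read_map_text(raw: bytes) -> str:
    """Return the character map text, rejecting a leading UTF-8 BOM.

    Hypothetical helper for illustration; CONVERTCP's own checks may differ.
    """
    if raw.startswith(UTF8_BOM):
        raise ValueError("character map must not start with a UTF-8 BOM")
    # ASCII is a subset of UTF-8, so one decoder covers both allowed cases.
    # Bytes that are not valid UTF-8 raise UnicodeDecodeError here.
    return raw.decode("utf-8")
```

Reading the file in binary mode (`open(path, "rb").read()`) and passing the bytes to this function keeps the BOM visible instead of letting a text decoder hide it.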

The character map consists of one line for each supported code point.

Each line consists of three comma-separated numeric values:
[priority], [code point], [character value]

  • priority
    Either 0 or 1.
    A charset may require code points of similar-shaped glyphs to be represented by the same character. E.g. the MIK charset requires both the Greek lowercase beta (U+03B2, β) and the German sharp s (U+00DF, ß) to be represented by character 0xE1. When converting from the custom charset to another charset, such a character is translated to the code point marked with priority 0.
    Make sure that the character map contains exactly one code point marked with priority 0 for each character 0x00..0xFF.

  • code point
    Unicode code point that a character shall represent.

  • character value
    The value of the byte that represents the given code point in the custom charset.
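The three fields and the priority rule can be checked with a short script. This is a sketch under assumptions, not CONVERTCP's parser: the names `parse_line` and `bytes_without_single_default` are hypothetical, and numbers are read with Python's `int(text, 0)`, which accepts the `0x` hexadecimal notation used in this readme. Whether CONVERTCP accepts other notations is not stated here.

```python
from collections import Counter


def parse_line(line: str) -> tuple[int, int, int]:
    """Parse '[priority], [code point], [character value]' into three ints."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 comma-separated values: {line!r}")
    # Base 0 lets int() honor a 0x prefix (assumed notation, see lead-in).
    priority, code_point, byte_value = (int(field, 0) for field in fields)
    if priority not in (0, 1):
        raise ValueError(f"priority must be 0 or 1: {line!r}")
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"code point out of Unicode range: {line!r}")
    if not 0 <= byte_value <= 0xFF:
        raise ValueError(f"character value must fit in one byte: {line!r}")
    return priority, code_point, byte_value


def bytes_without_single_default(entries):
    """Return byte values 0x00..0xFF lacking exactly one priority-0 entry."""
    defaults = Counter(byte for priority, _, byte in entries if priority == 0)
    return [byte for byte in range(0x100) if defaults[byte] != 1]
```

An empty list from `bytes_without_single_default` means the map satisfies the "exactly one code point marked with priority 0" rule for every character 0x00..0xFF.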

Example lines for the mentioned β and ß in the MIK character map:

0, 0x0003B2, 0xE1
1, 0x0000DF, 0xE1
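
How these two lines affect each conversion direction can be illustrated with two lookup tables. The sketch below is an assumption about how such a map could be applied, not CONVERTCP's actual implementation; `build_tables` and the table names are hypothetical.

```python
MIK_LINES = [
    "0, 0x0003B2, 0xE1",  # Greek lowercase beta, priority 0 (default)
    "1, 0x0000DF, 0xE1",  # German sharp s, priority 1
]


def build_tables(lines):
    """Build (encode, decode) dictionaries from character map lines."""
    encode = {}  # code point -> byte, every listed code point is encodable
    decode = {}  # byte -> code point, only the priority-0 entry is used
    for line in lines:
        priority, code_point, byte_value = (
            int(field.strip(), 0) for field in line.split(",")
        )
        encode[code_point] = byte_value
        if priority == 0:
            decode[byte_value] = code_point
    return encode, decode


encode, decode = build_tables(MIK_LINES)
print(hex(encode[0x03B2]), hex(encode[0x00DF]))  # 0xe1 0xe1
print(f"U+{decode[0xE1]:04X}")                   # U+03B2
```

Both β and ß encode to 0xE1, while decoding 0xE1 yields only β, so a round trip through the custom charset turns ß into β.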

To use the custom charset, pass the name of the charset file in place of the code page ID.
For example, to convert from UTF-16 LE to MIK:

convertcp utf-16 "MIK.sbcs" /i "u16.txt" /o "mik.txt"

or vice versa:

convertcp "MIK.sbcs" utf-16 /i "mik.txt" /o "u16.txt" /b
Source: readme.md, updated 2019-12-23