
Encoding Chinese Characters in INPUT or TEXTAREA by RSA

Help
henryho_hk
2014-03-05
2014-03-07
  • henryho_hk

    henryho_hk - 2014-03-05

    I tried to follow the online demo at https://www.pidder.de/pidcrypt/?page=demo_rsa-encryption for RSA encryption and decryption. Unfortunately, it works only with ASCII characters, not with Chinese ones.

    First, I changed "crypted = rsa.encrypt(input);" to "crypted = rsa.encryptRaw(pidCryptUtil.encodeBase64(input,true));". It works, but the maximum input length becomes very short, e.g. only 62 supplementary characters for a 4096-bit key pair.

    Then, I changed the line to "crypted = rsa.encryptRaw(pidCryptUtil.encodeUTF8(input));". Again, it stops working with Chinese characters. There seems to be a bug in converting the BigInteger to the byte array, or the byte array to the string, due to sign extension when a byte is cast to an integer. I put a quick fix in pkcs1unpad2() of rsa.js [changing the culprit line to String.fromCharCode(b[i]&0xff)], and Unicode supplementary characters work again. The maximum input length also gets longer, at about 84 supplementary characters for a 4096-bit key pair.
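
    To illustrate the sign problem described above, here is a minimal sketch that does not use pidCrypt at all. It assumes the byte array holds signed values (bytes above 0x7f come back negative), which is how the symptom reads; the array contents are made up for the demonstration.

```javascript
// Signed bytes for the UTF-8 sequence F0 A9 B6 98, as a signed byte array
// would hold them (assumption: values above 0x7f are stored as negatives).
const signedBytes = [-16, -87, -74, -104];

// Without masking: String.fromCharCode converts its argument to an unsigned
// 16-bit value, so -16 becomes 0xFFF0 instead of 0xF0.
const unmasked = signedBytes.map((b) => String.fromCharCode(b)).join("");

// With masking: each value is reduced to its low 8 bits first.
const masked = signedBytes.map((b) => String.fromCharCode(b & 0xff)).join("");

// Helper for display: hex code of every character in a string.
const codes = (s) => Array.from(s, (c) => c.charCodeAt(0).toString(16));

console.log(codes(unmasked)); // [ 'fff0', 'ffa9', 'ffb6', 'ff98' ]
console.log(codes(masked));   // [ 'f0', 'a9', 'b6', '98' ]
```

    The unmasked string is no longer a sequence of byte values, so any UTF-8 decoding applied afterwards cannot recover the original character.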

    Further, I changed the line to "crypted = rsa.encryptRaw(unescape(encodeURIComponent(input)));". The maximum input length becomes 125 supplementary characters for a 4096-bit key pair. Anything longer makes the encryption fail silently, without any error (even in the browser's JavaScript console).
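
    For reference, this is what the unescape(encodeURIComponent(...)) idiom does to a supplementary character, together with its inverse. The helper names toUtf8ByteString and fromUtf8ByteString are made up for this sketch; escape and unescape are legacy functions but are still available in browsers and Node.js.

```javascript
// Turn a JavaScript (UTF-16) string into a "byte string": one character
// per UTF-8 byte, each with a char code in the range 0x00..0xff.
function toUtf8ByteString(text) {
  return unescape(encodeURIComponent(text));
}

// Inverse: interpret a byte string as UTF-8 and rebuild the original text.
function fromUtf8ByteString(byteString) {
  return decodeURIComponent(escape(byteString));
}

const original = "\u{29D98}"; // the supplementary character U+29D98
const byteString = toUtf8ByteString(original);

console.log(original.length);   // 2 (a UTF-16 surrogate pair)
console.log(byteString.length); // 4 (four UTF-8 bytes)
console.log(fromUtf8ByteString(byteString) === original); // true
```

    The byte string is what gets handed to the RSA step; after decryption, the inverse helper has to be applied to get the text back.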

    Questions:

    1) I did not dig deeply into the JS code, but I am puzzled why unescape(encodeURIComponent(input)) allows a longer input than pidCryptUtil.encodeUTF8(input), as the two look really similar to me. Is "unescape(encodeURIComponent(input))" really safe to use?

    2) Also, it looks odd that the encryption fails silently, without the message "Message too long for RSA". Is it due to an overflow somewhere in jsbn.js?
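
    One possible explanation for the three limits, as a back-of-the-envelope sketch. It rests on two assumptions that are not verified against the pidCrypt source: that the padding is PKCS#1 v1.5 (at least 11 bytes of overhead), and that the library's UTF-8 encoder works per UTF-16 code unit, so each surrogate costs 3 bytes. All function names here are hypothetical.

```javascript
// Assumption: PKCS#1 v1.5 encryption padding, at least 11 bytes of overhead.
function maxPayloadBytes(keyBits) {
  return Math.ceil(keyBits / 8) - 11;
}

// Standard UTF-8 length: a supplementary character takes 4 bytes.
function utf8Length(text) {
  return new TextEncoder().encode(text).length;
}

// Hypothetical encoder that handles every UTF-16 code unit separately,
// so each half of a surrogate pair is written as a 3-byte sequence.
function perCodeUnitLength(text) {
  let total = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    total += unit < 0x80 ? 1 : unit < 0x800 ? 2 : 3;
  }
  return total;
}

// Length of padded Base64 output for a given number of input bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Explicit guard, so an oversized input fails loudly instead of silently.
function assertFits(byteString, keyBits) {
  const limit = maxPayloadBytes(keyBits);
  if (byteString.length > limit) {
    throw new RangeError(`payload is ${byteString.length} bytes, limit is ${limit}`);
  }
}

const sample = "\u{29D98}";
const budget = maxPayloadBytes(4096);                           // 501 bytes
const viaUtf8 = Math.floor(budget / utf8Length(sample));        // 125
const viaUnits = Math.floor(budget / perCodeUnitLength(sample)); // 83

// Base64 on top of the per-code-unit bytes: count how many characters fit.
let viaBase64 = 0;
while (base64Length(perCodeUnitLength(sample) * (viaBase64 + 1)) <= budget) {
  viaBase64++;
}

console.log(budget, viaUtf8, viaUnits, viaBase64); // 501 125 83 62
```

    Under these assumptions the numbers come out at 125, 83 and 62, which matches the observed 125 and 62 and is close to the observed "about 84". Calling something like assertFits before encrypting is one way to avoid depending on the library's own error reporting.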

     

    Last edit: henryho_hk 2014-03-05
  • Jonah (pidder)

    Jonah (pidder) - 2014-03-06

    Again: Do not mess with the underlying crypto functions unless you are absolutely certain what you are doing.

    In general RSA encryption is not meant to encrypt long clear text messages.
    We recommend you use hybrid encryption, see this thread.
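
    As a rough illustration of the hybrid idea, here is a sketch using Node.js's built-in crypto module rather than pidCrypt (so none of the calls below are pidCrypt API). The message is encrypted with a random AES-256-CBC key, and only that 32-byte key goes through RSA. The function names hybridEncrypt and hybridDecrypt are made up, and the sketch leaves out integrity protection.

```javascript
const nodeCrypto = require("node:crypto");

// Demo key pair; 2048 bits keeps key generation fast for the example.
const { publicKey, privateKey } = nodeCrypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
});

function hybridEncrypt(text, rsaPublicKey) {
  const aesKey = nodeCrypto.randomBytes(32); // fresh AES-256 key per message
  const iv = nodeCrypto.randomBytes(16);     // fresh CBC initialisation vector
  const cipher = nodeCrypto.createCipheriv("aes-256-cbc", aesKey, iv);
  const ciphertext = Buffer.concat([cipher.update(text, "utf8"), cipher.final()]);
  // Only the short AES key is RSA-encrypted, so message length is not limited by RSA.
  const wrappedKey = nodeCrypto.publicEncrypt(rsaPublicKey, aesKey);
  return { wrappedKey, iv, ciphertext };
}

function hybridDecrypt({ wrappedKey, iv, ciphertext }, rsaPrivateKey) {
  const aesKey = nodeCrypto.privateDecrypt(rsaPrivateKey, wrappedKey);
  const decipher = nodeCrypto.createDecipheriv("aes-256-cbc", aesKey, iv);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// A message far longer than RSA alone could take, made of supplementary characters.
const longMessage = "\u{29D98}".repeat(1000);
const sealed = hybridEncrypt(longMessage, publicKey);
console.log(hybridDecrypt(sealed, privateKey) === longMessage); // true
```

    The RSA payload stays at 32 bytes regardless of how long the text is, which is why the character encoding no longer affects the size limit.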

     
  • henryho_hk

    henryho_hk - 2014-03-07

    I also prefer not to. ^__^

    It would be great if the RSA demo page could be enhanced to perform an RSA round trip on the Unicode character "𩶘" (UTF-32: 29D98; UTF-16: D867 DD98; UTF-8: F0 A9 B6 98).
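
    The encodings quoted above can be checked with a few lines of plain JavaScript. This is only a sanity check of the test character, not a test of the demo page; the hex helper is made up for the sketch.

```javascript
// Build the character from its code point.
const testChar = String.fromCodePoint(0x29d98);

// Helper: upper-case hex representation of a number.
const hex = (n) => n.toString(16).toUpperCase();

// UTF-32: the code point itself.
const utf32 = hex(testChar.codePointAt(0));

// UTF-16: the two code units of the surrogate pair.
const utf16 = [testChar.charCodeAt(0), testChar.charCodeAt(1)].map(hex).join(" ");

// UTF-8: the bytes produced by the standard TextEncoder.
const utf8 = Array.from(new TextEncoder().encode(testChar), hex).join(" ");

console.log(utf32); // 29D98
console.log(utf16); // D867 DD98
console.log(utf8);  // F0 A9 B6 98
```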

     

    Last edit: henryho_hk 2014-03-07
  • Jonah (pidder)

    Jonah (pidder) - 2014-03-07

    In the case of multi-byte characters, first encode the text, e.g. with UTF8 (see our UTF8-Demo), and then use the resulting UTF8-encoded string for the RSA encryption. This will give you a working round trip.

    The UTF8 encoding reduces the available clear text size even more, so the hybrid approach is still the best way to go in our opinion (since AES-CBC-256 encrypts and decrypts the "𩶘" character just fine).

     
