#27 punctuation interspersed with GB2312 data confuses wohenchan

open
None
5
2014-08-18
2000-11-14
peng jiang
No

This is especially a problem for GB2312 input.
If there is interpunction among the input characters, wohenchan can't return the correct character entries.
For example, "俱乐部! 参观" and "俱乐部 参观" return different results because of the "!".
How should wohenchan handle interpunction?

Discussion

  • Wesley Tanaka - 2000-11-15

    "Interpunction" is not a commonly used word

     
  • Wesley Tanaka - 2000-11-15
    • summary: searching method doesn't handle the interpunctions --> punctuation interspersed with GB2312 data confuses wohenchan
     
  • Wesley Tanaka - 2000-11-15

    Cleaning up the input depends on the input encoding. It is also a user-interface concern. Therefore, it should not happen inside ConverterTableInterface.

    Perhaps there should be an InputCleaner class hierarchy, with one subclass for each subclass of EncodingInfoInterface.
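    One way to picture the suggested InputCleaner hierarchy is the rough Python sketch below. It is only an illustration, not wohenchan's code: the `clean` method, the `GB2312InputCleaner` name, and the choice to replace each punctuation character with a space are all assumptions made here for the example.

```python
import unicodedata
from abc import ABC, abstractmethod


class InputCleaner(ABC):
    """Hypothetical base class: one subclass per input encoding."""

    @abstractmethod
    def clean(self, raw: bytes) -> str:
        """Decode raw input and return text ready for table lookup."""


class GB2312InputCleaner(InputCleaner):
    """Hypothetical cleaner for GB2312-encoded input."""

    def clean(self, raw: bytes) -> str:
        text = raw.decode("gb2312")
        # Assumption: any character in a Unicode punctuation category
        # (P*), ASCII or full-width, is treated as a word separator.
        spaced = "".join(
            " " if unicodedata.category(ch).startswith("P") else ch
            for ch in text
        )
        # Collapse runs of whitespace left behind by the replacement.
        return " ".join(spaced.split())


cleaner = GB2312InputCleaner()
with_mark = cleaner.clean("俱乐部! 参观".encode("gb2312"))
without_mark = cleaner.clean("俱乐部 参观".encode("gb2312"))
print(with_mark == without_mark)
```

    With a cleaner like this sitting in front of the lookup, both inputs from the original report would reach the converter table as the same string, and ConverterTableInterface itself would stay unaware of punctuation.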

     
  • peng jiang - 2000-11-16
    • assigned_to: nobody --> pjiang
     
