#157 Jackcess doesn't support surrogate characters

Milestone: 4.0.7
Status: closed
Labels: None
Priority: 1
Updated: 2024-05-14
Created: 2024-02-13
Creator: Adam Chen
Private: No

We use Jackcess to write to a Microsoft Access database. An IllegalStateException is thrown when the data to be saved contains surrogate characters, such as emoji. The call stack of the exception follows.

Caused by: java.lang.IllegalStateException: Surrogate pair chars are not handled
at com.healthmarketscience.jackcess.impl.GeneralLegacyIndexCodes$2.getInlineBytes(GeneralLegacyIndexCodes.java:253) ~[jackcess-2.1.0.jar:?]
at com.healthmarketscience.jackcess.impl.GeneralLegacyIndexCodes.writeNonNullIndexTextValue(GeneralLegacyIndexCodes.java:499) ~[jackcess-2.1.0.jar:?]
at com.healthmarketscience.jackcess.impl.IndexData$GenLegTextColumnDescriptor.writeNonNullValue(IndexData.java:1756) ~[jackcess-2.1.0.jar:?]
at com.healthmarketscience.jackcess.impl.IndexData$ColumnDescriptor.writeValue(IndexData.java:1523) ~[jackcess-2.1.0.jar:?]
at com.healthmarketscience.jackcess.impl.IndexData.createEntryBytes(IndexData.java:1244) ~[jackcess-2.1.0.jar:?]
at com.healthmarketscience.jackcess.impl.IndexData.prepareAddRow(IndexData.java:581) ~[jackcess-2.1.0.jar:?]
at com.healthmarketscience.jackcess.impl.IndexData.prepareAddRow(IndexData.java:559) ~[jackcess-2.1.0.jar:?]
at com.healthmarketscience.jackcess.impl.TableImpl.addRows(TableImpl.java:1586) ~[jackcess-2.1.0.jar:?]
... 21 more

We have confirmed that MS Access itself handles surrogate characters correctly, so Jackcess should be able to handle them as well.
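
For readers unfamiliar with the term: Java strings are UTF-16, so any code point above U+FFFF (most emoji, for example) is stored as a pair of surrogate `char` values. The following is a minimal sketch, using only the standard library, of how one might detect such values before writing them. The class and method names (`SurrogateCheck`, `containsSurrogate`) are hypothetical and are not part of Jackcess.

```java
public class SurrogateCheck {

    // Hypothetical helper (not a Jackcess API): returns true if the string
    // contains at least one UTF-16 surrogate code unit.
    public static boolean containsSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String plain = "hello";
        // U+1F600 (an emoji) written as its UTF-16 surrogate pair.
        String emoji = "hi \uD83D\uDE00";

        System.out.println(containsSurrogate(plain)); // false
        System.out.println(containsSurrogate(emoji)); // true

        // The emoji occupies two chars but is a single code point.
        System.out.println(emoji.length());                          // 5
        System.out.println(emoji.codePointCount(0, emoji.length())); // 4
    }
}
```

A check like this could be used to log or reroute affected rows rather than letting the write fail partway through a batch.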

Discussion

  • Adam Chen

    Adam Chen - 2024-02-13

    We are using Jackcess 2.2.3.

     
  • James Ahlborn

    James Ahlborn - 2024-02-13

    just to be clear, jackcess does support surrogate pairs in text columns in general. they are just not supported in columns which are indexed. access indexes use a special encoding for each character, and i haven't deciphered the encodings for surrogate pairs yet.

    one workaround you can use in a situation like this is to create the database without the indexes in jackcess and then use access to add the indexes afterwards. obviously, this may not be feasible for all use cases.

     

    Last edit: James Ahlborn 2024-02-13
  • Adam Chen

    Adam Chen - 2024-02-13

    Thanks for the prompt response, James. Does Jackcess have a plan to investigate and fix the problem, in addition to providing the workaround?

     
    • James Ahlborn

      James Ahlborn - 2024-02-13

      working on jackcess is always time dependent. i don't have any near term plans to try to expand the index encodings. but, i'll certainly keep this bug open in case some time opens up.

       
  • James Ahlborn

    James Ahlborn - 2024-05-14
    • assigned_to: James Ahlborn
    • Group: Unassigned --> 4.0.7
     
  • James Ahlborn

    James Ahlborn - 2024-05-14

    i got a little time to look at this and i think i'll be able to fix it.

     
  • James Ahlborn

    James Ahlborn - 2024-05-14

    this is fixed in the trunk and will be in the 4.0.7 release

     
  • James Ahlborn

    James Ahlborn - 2024-05-14
    • status: open --> closed
     
