
#107 Fonts with difference encoding can have ToUnicode, as well

SVN TRUNK
closed
nobody
None
2021-08-19
2020-11-05
No

PdfDifferenceEncoding needs to support explicit ToUnicode tables, too. PDF examples that require this can be created, for example, at https://www.canva.com/design/play?category=tACFat6uXco

1 Attachments

Discussion

  • zyx

    zyx - 2021-08-18

    I'm sorry, but I hate opening random sites with their cookie consents and whatnot. Would it be possible to attach such a file and state precisely what your patch fixes, please? Something like: without it, PoDoFo cannot ..., but with it PoDoFo can ...

    • PdfObject* pToUnicode = nullptr);

    Please use NULL instead, the same as the rest of the PoDoFo code.

        if( m_differences.Contains( static_cast<int>(pszInput[i]), name, value ) )
           pszUtf16[i] = value;
    
    • if(m_bToUnicodeIsLoaded)
    • {
    • value = GetUnicodeValue(pszInput[i]);

    The m_toUnicode can be empty (you may check m_toUnicode.empty()).
    When there are both differences and ToUnicode, the latter overwrites the value of the former. Can that happen? Should there be an else clause?

    Otherwise the patch looks good, though I have only read it, not tested it in action.
    
     
  • Christopher Creutzig

    The attached file should extract text like “Wear proper.” Without the patch, I am getting random-looking character substitutions, text like “Wlaeba cr lpot.”

    GetUnicodeValue should be able to handle requests for glyphs not in m_toUnicode, which includes the case where it is empty.

    When there are both differences and toUnicode, I expect toUnicode to take precedence, yes.
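
To illustrate the point about lookups in an empty or incomplete table, here is a minimal sketch. It is not the PoDoFo implementation: `ToUnicodeMap` and `LookupUnicodeValue` are hypothetical names, and returning 0 for "not found" is an assumption made only for this example.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for a ToUnicode table:
// character code -> UTF-16 code unit.
using ToUnicodeMap = std::map<std::uint16_t, std::uint16_t>;

// Returns the mapped value, or 0 when the code has no entry.
// An empty table needs no special case: find() simply returns end().
std::uint16_t LookupUnicodeValue(const ToUnicodeMap& toUnicode, std::uint16_t code)
{
    const auto it = toUnicode.find(code);
    return it == toUnicode.end() ? 0 : it->second;
}
```

With a lookup shaped like this, the caller can treat a zero result as "no ToUnicode entry" without checking `empty()` first.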

     
  • zyx

    zyx - 2021-08-19

    Thanks for the file. I checked the PDF ISO standard and, according to section 9.10.2 "Mapping Character Codes to Unicode Values", ToUnicode takes precedence over the differences. Your patch does both, but I think in a good way.

    I committed your patch (slightly modified) as [r2044].
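
As a rough illustration of the precedence discussed in this thread (ToUnicode first, then the differences), the per-character conversion could be sketched as below. This is not the code committed in [r2044]: `CodeMap` and `ConvertToUtf16` are hypothetical names, and the final fallback that copies the code unchanged is a simplification standing in for the base encoding.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical table type: single-byte character code -> UTF-16 code unit.
using CodeMap = std::map<unsigned char, char16_t>;

// Converts single-byte character codes to UTF-16 code units.
// Lookup order: ToUnicode entry, then differences entry, then the raw code.
std::u16string ConvertToUtf16(const std::string& input,
                              const CodeMap& toUnicode,
                              const CodeMap& differences)
{
    std::u16string out;
    out.reserve(input.size());
    for (const char ch : input)
    {
        const auto code = static_cast<unsigned char>(ch);
        const auto tu = toUnicode.find(code);
        const auto diff = differences.find(code);
        if (tu != toUnicode.end())
            out.push_back(tu->second);      // explicit ToUnicode mapping wins
        else if (diff != differences.end())
            out.push_back(diff->second);    // otherwise use the differences
        else
            out.push_back(static_cast<char16_t>(code)); // assumed fallback
    }
    return out;
}
```

The if/else chain gives the same result as assigning the differences value first and then overwriting it with the ToUnicode value, but it makes the precedence explicit.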

     

    Related

    Commit: [r2044]

  • zyx

    zyx - 2021-08-19
    • status: open --> closed
     