Allowing tex/dvi printing of Unicode chars

  • Giuliano Lancioni

    While Unicode characters beyond OT1 (e.g., accented characters) are handled properly in the current (Linux) version of openccg, they are discarded in the tex/dvi output, which is what you get, for instance, when you parse sentences after activating :vison in openccg.

    However, a very simple patch solves this problem: just add the following line of code below line 137 in (in src/opennlp/ccg/util):


    (it's to be inserted just below another \usepackage directive).
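
    For reference, a minimal sketch of how such a directive sits in a LaTeX preamble, assuming the standard inputenc package with the utf8 option mentioned in the follow-up; the document around it is purely illustrative and is not the file openccg generates:

    ```latex
    \documentclass{article}
    % Assumption: inputenc with the utf8 option, placed next to the other \usepackage lines
    \usepackage[utf8]{inputenc}
    \begin{document}
    % Accented characters typed directly as UTF-8 are now typeset instead of dropped
    café, città, Gödel
    \end{document}
    ```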

    Recompile by calling ccg-build in the main openccg folder, and everything works like a charm. Of course, the developers might choose to add this patch to the main openccg distribution if they deem it useful and harmless.

    Hope this helps,


  • Giuliano Lancioni

    A small update: to include characters in extended Unicode sets (e.g., Latin Extended Additional), utf8 in the previous message should be changed to utf8x.
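
    A sketch of the changed directive, assuming the ucs package (which provides the utf8x option) is installed; the sample word is only illustrative:

    ```latex
    \documentclass{article}
    % utf8x comes from the ucs package and covers more Unicode blocks than plain utf8
    \usepackage[utf8x]{inputenc}
    \begin{document}
    % The first letter below is U+1E0D, from Latin Extended Additional
    ḍamma
    \end{document}
    ```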


    A possible drawback of this patch is that the rendering engine hangs if characters in some non-Latin script (e.g., Arabic) are included. This is a problem for me, since I am especially interested in Arabic: :vison has to be disabled when a non-Latin script is included, but it can be enabled when a transcription with non-Latin1 characters is needed. Perhaps the older version should be left in the standard distribution, since it simply does not print the unknown character rather than hanging.

  • Michael White

    Michael White - 2012-04-25

    Thanks for this suggestion!  It seems we should wait on changing the main code base until a way can be found to avoid the hanging issue with non-Latin scripts.

