Problem: accents from LaTeX files are separated from their letter. Find a way to merge the letter and its accent.
Use the Unicode general category "Symbol, Modifier" (Sk).
If a token's category is Sk and it overlaps another token, combine them. How? Is there a solution in Unicode?
??: <token>content??</token>
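One possible answer on the Unicode side, as a minimal sketch: map the spacing accent to its combining counterpart and let NFC normalization compose it with the base letter. The mapping table and the names `SPACING_TO_COMBINING`, `is_spacing_accent` and `merge_tokens` are hypothetical, and the sketch assumes the accent belongs to the first letter of the overlapping token, as it appears to in the sample tokens of this ticket.

```python
import unicodedata

# Assumed mapping from spacing accents (category Sk) to combining marks.
# Illustrative only; extend it for the accents that actually occur.
SPACING_TO_COMBINING = {
    "`": "\u0300",       # grave accent
    "\u00b4": "\u0301",  # acute accent
    "^": "\u0302",       # circumflex
    "\u00a8": "\u0308",  # diaeresis
    "\u00b8": "\u0327",  # cedilla
}


def is_spacing_accent(text):
    """True if the token content is a single 'Symbol, Modifier' character we can map."""
    return (
        len(text) == 1
        and unicodedata.category(text) == "Sk"
        and text in SPACING_TO_COMBINING
    )


def merge_tokens(before, accent, after):
    """Attach the accent to the first letter of `after`, then join all three parts.

    Assumption: the accent overlaps the first letter of the following token.
    """
    if not is_spacing_accent(accent) or not after:
        # Nothing we know how to merge: plain concatenation.
        return before + accent + after
    combined = after[0] + SPACING_TO_COMBINING[accent] + after[1:]
    # NFC composes base letter + combining mark into one code point when one exists.
    return unicodedata.normalize("NFC", before + combined)


merged = merge_tokens("mati", "`", "eres")
```

With the three sample tokens this yields a single word in which "e" plus the combining grave accent is composed into U+00E8. Which letter the accent really belongs to still has to come from the token positions.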
The split occurs because the accented letter is positioned before the accent: the "eres" token starts at x=342.117, while the accent token starts at x=345.643.
Is this due to minDupBreakOverlap?
<TOKEN sid="p4_s98" id="p4_w9" font-name="alhcmd+helvetica" bold="no" italic="no" font-size="31.8805" font-color="#000000" rotation="0" angle="0" x="281.882" y="362.132" base="385.022" width="60.2223" height="29.4895">mati</TOKEN>
<TOKEN sid="p4_s99" id="p4_w10" font-name="alhcmd+helvetica" bold="no" italic="no" font-size="31.8805" font-color="#000000" rotation="0" angle="0" x="345.643" y="362.132" base="385.022" width="10.6162" height="29.4895">`</TOKEN>
</TEXT>
<TEXT width="79.7331" height="29.4895" id="p4_t8" x="342.117" y="362.132">
<TOKEN sid="p4_s100" id="p4_w11" font-name="alhcmd+helvetica" bold="no" italic="no" font-size="31.8805" font-color="#000000" rotation="0" angle="0" x="342.117" y="362.132" base="385.022" width="62.0076" height="29.4895">eres</TOKEN>
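The "overlap with other token" test could be sketched from the x and width attributes of the tokens above. This is only an illustration, not the tool's actual logic: `x_overlap` is a hypothetical helper, and the boxes are copied by hand from the three TOKEN elements rather than parsed from the XML.

```python
def x_overlap(a, b):
    """Width of the horizontal overlap between two boxes given as (x, width)."""
    left = max(a[0], b[0])
    right = min(a[0] + a[1], b[0] + b[1])
    return max(0.0, right - left)


# (x, width) values copied from the TOKEN elements in this ticket.
mati = (281.882, 60.2223)
accent = (345.643, 10.6162)
eres = (342.117, 62.0076)

overlap_with_mati = x_overlap(accent, mati)
overlap_with_eres = x_overlap(accent, eres)
```

On these values the accent box does not overlap "mati" at all and lies entirely inside the "eres" box, which is consistent with the accent belonging to the leading "e" of "eres".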