OmegaT does not display the German umlauts (öäü) correctly when using the font "Dialog Input" (see screenshot attached). If I switch to another font, everything looks fine.
I also attached the whole project for reproduction.
If it's a general issue with Dialog Input, this font should probably be removed from the list.
The strange thing is that I have been using this font for quite some time and never had this trouble.
Using OmegaT 3.1.9_1
Dialog Input is a logical font defined by your JRE's fontconfig files. Not all fonts contain all glyphs; it is not reasonable to remove fonts, especially JRE-defined ones, just because some glyphs aren't present.
You can adjust your fontconfigs if you like, or simply use a different font.
Since you say you were using this font without problem before, I suggest the issue is with your JRE or the fonts on your system.
Are you able to reproduce the behavior? The strange thing is that, e.g., "ä" is displayed correctly while "ü" is not. Both are in the same set of chars (ANSI). That's why I am asking myself whether it's a bug.
No, I'm not able to reproduce the issue on my system (OS X 10.10.4, Java 1.8, OmegaT 3.5.1 trunk).
I cannot reproduce it either: Windows 7, Java 1.8, OmegaT 3.5 update 1. I am closing this as "works for me".
Didier
I can reproduce here.
Windows 8.1 (Japanese) + Oracle Java 1.7.0_51 + OmegaT 3.5u1.
Could it be related to the source format? I can't reproduce with the HTML source here:
https://de.wikipedia.org/wiki/Der_Spiegel
Last edit: Aaron Madlon-Kay 2015-08-02
No. The only way it could have anything to do with the source format is if OmegaT was guessing the encoding wrong, but the provided test case is a .docx which requires no guessing.
Based on Aaron's findings, my understanding is that it doesn't happen with "real" characters with umlauts, such as the ones you have in your source documents or the ones I type on my Azerty keyboard, but with "composed" ones.
I can reproduce on Windows 8.1 (Java 1.8, OmegaT 3.5).
The Windows JRE defines DialogInput as
dialoginput.plain.alphabetic=Courier New
so the issue is really with Courier New. Checking other apps, I find that Courier New will correctly show ü (copied from the provided source) in most programs, but in jEdit we see the same issue. Thus it is likely Java-specific.
Looking at the source document internals, the difference between the ä and the ü is that the former is U+00E4 LATIN SMALL LETTER A WITH DIAERESIS while the latter is <U+0075 LATIN SMALL LETTER U, U+0308 COMBINING DIAERESIS>. For some reason the latter is getting split up, but only with this font.
I have no idea why Java and Courier New are misbehaving on Windows. But the solutions are:
There may be a case for having OmegaT normalize the text on parse as well.
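To make the normalization idea concrete, here is a minimal sketch using the JDK's java.text.Normalizer. It is not OmegaT's actual code: the class name NfcSketch and the helpers toNfc and codePoints are hypothetical, made up for illustration. It only shows how a decomposed "u + combining diaeresis" sequence can be composed into the single code point U+00FC (Unicode NFC), which is the form that renders correctly here.

```java
import java.text.Normalizer;

// Illustration only; names are hypothetical and not taken from OmegaT.
final class NfcSketch {

    // Compose base letter + combining mark sequences into precomposed
    // characters (Unicode Normalization Form C) where such a form exists.
    static String toNfc(String text) {
        if (Normalizer.isNormalized(text, Normalizer.Form.NFC)) {
            return text; // already composed, nothing to do
        }
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    // List the code points of a string as "U+XXXX" separated by spaces,
    // to make the difference between the two forms visible.
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String decomposed = "u\u0308"; // the "ü" as found in the test document
        String precomposed = "\u00E4"; // the "ä" as found in the test document

        System.out.println(codePoints(decomposed));        // U+0075 U+0308
        System.out.println(codePoints(toNfc(decomposed))); // U+00FC
        System.out.println(codePoints(precomposed));       // U+00E4
    }
}
```

Whether normalizing on parse is appropriate for every file format is a separate question; the sketch only demonstrates the transformation itself.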
Last edit: Aaron Madlon-Kay 2015-08-02
I created a new ticket about the normalization issue here: [#758]
Related
Bugs:
#758
Last edit: Aaron Madlon-Kay 2018-02-27
The "real" bug here (decomposed characters not rendered correctly in Courier New in Java on Windows) cannot be addressed in OmegaT, but the fix for [#758] effectively works around this particular issue.
Related
Bugs:
#758
Last edit: Aaron Madlon-Kay 2018-02-27