Preview function

Marc Prior

Adding a preview function to OmegaT

Background

In most modern CAT tools, the text being translated is displayed in segments rather than with its ultimate layout and formatting. However, it is often useful for the translator to be able to see the ultimate layout and formatting whilst translating. For this reason, many CAT tools provide a preview function (also termed "live preview" or "real-time preview") that shows the layout and formatting without the translator having to open the final document in a separate application, such as MS Word. OmegaT does not have an integral preview function, but it can be achieved with the use of additional software. This HowTo describes the general principles for setting up a preview function.

Function

Ideally, a viewer should do two things:

  • It should display the text with layout and formatting similar to that in the target file.
  • It should be updated (refreshed) when the text is changed in the CAT tool.

Principles

There are numerous ways of implementing the preview function, but they essentially require:

  • An external application suitable for displaying the translated text with its layout and formatting
  • If the preview of the translated file is to be displayed in a different format to that of the translated file itself, a means of converting the translated file to the desired format
  • A means of updating the text in the preview when a change is made in the CAT tool (in our case, in OmegaT)

In some CAT tools, the preview function is simply an automated routine for opening the translated file in its native application. (The preview function in the DGT fork of OmegaT is an example.) Often, however, as in the case of Microsoft Word, the native application has no refresh function. The view of the translated document is therefore not "live"; instead, the user must close and re-open the file, and navigate to the desired point in the text, each time he or she wishes to view a change made in the CAT tool. Although LibreOffice does have a "Reload" function, it is still not satisfactory, as it merely closes the file and re-opens it; the user must still navigate back to the previous point in the text, which is time-consuming.

For this reason, CAT tools often resort to a file viewer for a different format, and convert the text being translated into this format for display. PDF and HTML are file formats often used for this purpose by other CAT tools, and there is a wide choice of external applications, many of them free, that can be used as viewers.

In this context, a very useful feature of OmegaT is its post-processing command function. This function causes an external command, defined by the user, to be executed automatically each time the user creates the translated file(s) with Ctrl+d or Ctrl+Shift+d. This external command can for example convert a DOCX file to a PDF file for display in a PDF viewer. Examples of use of this command are provided below. (For a more comprehensive description of OmegaT's post-processing command, consult the relevant documentation.)

Finally, a mechanism must be provided to refresh the display in the external viewer automatically when the file it is displaying is modified on disk (i.e. by OmegaT). Here too, there are different means of achieving this. One is a viewer with built-in detection of a change in the file it is displaying. Another is an application that automatically refreshes the display at preset intervals.
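
The first approach, detecting a change in the file on disk, can be approximated by polling the file's modification time. The following is a minimal Python sketch of the idea, not part of OmegaT or of any of the viewers mentioned here; the function name wait_for_change and the default interval and timeout are illustrative assumptions.

```python
import os
import time


def wait_for_change(path, last_mtime, interval=1.0, timeout=30.0):
    """Poll `path` until its modification time differs from `last_mtime`.

    Returns the new modification time (in nanoseconds), or None if
    nothing changed before `timeout` seconds elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            mtime = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            mtime = None  # the file may vanish briefly while it is rewritten
        if mtime is not None and mtime != last_mtime:
            return mtime
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

A viewer with built-in change detection does something comparable internally and reloads the document when the change is seen; a refresh at preset intervals simply reloads without checking.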

Example 1: LibreOffice and Okular

This solution requires LibreOffice (to convert DOCX files to PDF) and the Okular file viewer (to display them), so these two applications must both be installed. Both are available for Windows (including from the Microsoft Store) and Linux.

LibreOffice is used in this example to convert DOCX files to PDF automatically from the command line. LibreOffice must be installed, but does not need to be opened. (In fact, the command-line conversion may not work if LibreOffice is already open, so close it before creating the translated file.)

To instruct LibreOffice to convert the translated text to PDF, enter the post-processing command in OmegaT as follows:

Select Options > Preferences > Saving and Output, and enter the command below in the External Post-processing Command text field. This will cause the function to be used in all your OmegaT projects. Alternatively, check the Also allow per-project external commands checkbox: you can then enter the command in the Project Properties dialog (Project > Properties > External Post-processing Command) of the project concerned. In this case, the function will take effect only in this project.

In the External Post-processing Command field, enter:

soffice --headless --convert-to pdf --outdir "${targetRoot}" "${targetRoot}${fileName}"

(if you are using Linux) or

"C:\Program Files\LibreOffice\program\soffice.com" --headless --convert-to pdf --outdir "${targetRoot}" "${targetRoot}\${fileName}"

(if you are using Windows).

${targetRoot} and ${fileName} are "template variables" created by OmegaT's post-processing command function, and you can select and insert them from the list in the dialog as an alternative to typing them in yourself.
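
To see what the finished command looks like once the variables are filled in, the substitution can be imitated with Python's string.Template, which happens to use the same ${name} notation. This is only an illustration: the project path and file name are invented, and OmegaT's own substitution may differ in detail. The trailing separator on the target folder is an assumption, inferred from the Linux command above, which joins ${targetRoot} and ${fileName} directly.

```python
from string import Template

# The Linux command from above, as entered in OmegaT.
COMMAND = Template(
    'soffice --headless --convert-to pdf '
    '--outdir "${targetRoot}" "${targetRoot}${fileName}"'
)

# Hypothetical values; in a real project OmegaT supplies them.
values = {
    "targetRoot": "/home/user/project/target/",  # assumed trailing slash
    "fileName": "report.docx",
}

expanded = COMMAND.substitute(values)
print(expanded)
```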

LibreOffice's command-line functionality has been improved in recent years, so use of a recent version of LibreOffice is recommended. One recent change is the addition of the soffice.com executable (for Windows), which is specifically intended for purposes such as this; previous Windows versions used soffice.exe. Previous Windows versions also differed slightly in the command-line syntax, e.g. using - rather than -- in the arguments, but this has now been brought into line with the Linux version.

If the location of the LibreOffice executable is not in your execution path, you must state the full path of the executable. "C:\Program Files\LibreOffice\program\soffice.com" is the default location for Windows 10; the location may differ on your system. Place the full path in double quotes.
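
Whether the executable is in your execution path can be checked before you decide between the short and the full form of the command. The sketch below uses Python's standard shutil.which for this; the helper names find_soffice and quoted are hypothetical, and the fallback is simply the Windows default location quoted above.

```python
import shutil

# Default Windows location quoted in the text; yours may differ.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\LibreOffice\program\soffice.com"


def find_soffice(fallback=DEFAULT_WINDOWS_PATH):
    """Return the soffice found on the execution path, else the fallback."""
    return shutil.which("soffice") or fallback


def quoted(path):
    """Wrap a path in double quotes, as needed when it contains spaces."""
    return f'"{path}"'


print(quoted(find_soffice()))
```

The printed value is what would go at the start of the post-processing command.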

Hit Ctrl+Shift+d in OmegaT to create the translated document of the active document. There is no need to save the document(s) first: Ctrl+Shift+d automatically causes any changes made to be saved before the target file is created. You can also simply use Ctrl+d, but this will create all the target documents rather than just the active document, which may slow down the process if your project contains many large files.

You should now see a PDF file appear in your target folder. Open this file with Okular (right-click with the mouse and select Okular). You can read through your translation in this file as you would read through an MS Word file. Whenever you wish to edit your translation, toggle back to OmegaT, make the change there, then hit Ctrl+Shift+d to create the translated file. OmegaT will then automatically instruct LibreOffice to create a new PDF file. Since this file is already open in Okular, Okular automatically refreshes the display, showing your change. The speed with which changes appear may vary according to your hardware and operating system, and of course the size of the file.

The LibreOffice converter and the command described above are also able to convert Excel (XLSX) and PowerPoint (PPTX) files to PDF format, and Okular is also able to display them.

Variants of this example are possible. Instead of LibreOffice, you can use a different command-line utility to convert DOCX files to PDF format, such as Abiword (Linux) or Pandoc (Linux and Windows). Each has its own particular command syntax, so the post-processing command you enter in OmegaT also differs depending upon your choice of utility.

You can also use a different PDF viewer in place of Okular. You can use any PDF viewer, but it should have an auto-refresh function in order to provide a "live preview". SumatraPDF (Windows only; less fully-featured than Okular and therefore possibly faster) and Evince are two such tools. Another is the ultra-simple Zathura (Linux only), with vi-style keybindings.

Example 2: Pandoc, Firefox and auto-refresh plug-in

An alternative to viewing your translated file in a PDF viewer is to convert it to HTML and view it in a web browser. Analogous to the example above, you enter a post-processing command in OmegaT which in turn instructs the conversion utility to perform the conversion to HTML.

This example uses Pandoc to convert the DOCX file in the OmegaT project to HTML. The Firefox browser is used as the target file viewer. Firefox does not (at the time of writing) have a built-in automatic refresh function, but a number of plug-ins (extensions) which provide this function are available from the Mozilla/Firefox website. Install one of these plug-ins and set a suitable refresh interval (such as 3 seconds).

Enter the Pandoc command for conversion to HTML as a post-processing command in OmegaT's Project Properties dialog, as described above. In this case, the command is:

pandoc -s -o "${targetRoot}${fileNameOnly}.html" "${targetRoot}${fileName}"

(for Linux) or

"C:\Program Files\Pandoc\pandoc.exe" -s -o "${targetRoot}${fileNameOnly}.html" "${targetRoot}\${fileName}"

(for Windows). Again, change the location of the Pandoc executable if necessary.
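
The same conversion can be sketched as a small Python helper, which may be easier to adapt than a one-line command if you later want to add further steps. It assumes that pandoc is in the execution path and, as the commands above imply, that ${fileNameOnly} is the file name without its extension; the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def pandoc_html_command(target_file, pandoc="pandoc"):
    """Build the argument list for converting `target_file` to HTML.

    The output is written next to the input, with the extension
    replaced by .html (e.g. report.docx -> report.html).
    """
    source = Path(target_file)
    output = source.with_suffix(".html")
    # -s: produce a standalone document; -o: name of the output file
    return [pandoc, "-s", "-o", str(output), str(source)]


def convert_to_html(target_file):
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_html_command(target_file), check=True)
```

Passing the arguments as a list rather than as a single string avoids the quoting problems that arise with paths containing spaces.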

Analogous to Example 1, creating the target file with Ctrl+Shift+d should then result in an HTML version of the active file in OmegaT being created, which you can then open in Firefox. (Since this is HTML, a number of other files, such as images, may also be created.) Edit your translation in OmegaT and hit Ctrl+Shift+d again, and after a brief delay (dependent upon your selected refresh interval) the change should appear in the viewer window.

As in Example 1, numerous variations on this solution are possible. You can use an alternative (command-line) utility to convert to HTML. Instead of using Firefox as your viewer, you can use Chrome or Opera, or any other browser for which an auto-refresh plug-in is available.

Overview of conversion utilities and display applications

Conversion utilities

| Utility | OS | Converts from | Converts to | Command-line syntax | Comments |
|---|---|---|---|---|---|
| LibreOffice | Linux, Windows | DOCX, XLSX, PPTX | PDF, HTML | "[soffice]" --headless --convert-to pdf --outdir "[output folder]" "[file to convert including path]" | Ensure that LibreOffice is not already running. |
| Pandoc | Linux, Windows | DOCX | PDF, HTML | "[pandoc]" -s -o "[converted file including path]" "[file to convert including path]" | Conversion to PDF requires an additional plug-in to be installed, e.g. MiKTeX https://miktex.org/ |
| Abiword | Linux | DOCX | PDF, HTML | "[abiword]" --to=PDF -o [converted file including path] "[file to convert including path]" | Faster than LibreOffice, but conversion of formatting not as comprehensive |

Display applications

| Application | OS | Display format | Plug-ins | Comments |
|---|---|---|---|---|
| Okular | Linux, Windows | PDF | | Windows users may prefer to use the portable version |
| Sumatra PDF | Windows | PDF | | |
| Evince | Linux, Windows | PDF | | |
| Zathura | Linux | PDF | | Ultra-basic |
| Firefox | Linux, Windows | HTML | Auto-refresh plug-ins | |
| Opera | Linux, Windows | HTML | Auto-refresh plug-ins | |
| Chrome | Linux, Windows | HTML | Auto-refresh plug-ins | |
| Edge | Windows | HTML | Auto-refresh plug-ins | |

