Autshumato ITE Frequently Asked Questions (FAQ)

This page contains the Autshumato ITE FAQ.

1. About Autshumato and the ITE

1.1. What is Autshumato?

The Autshumato project was initiated by the South African Department of Arts and Culture. Development is carried out by the Centre for Text Technology (CTexT®) at the North-West University (Potchefstroom Campus), in collaboration with the University of Pretoria.

The general aim of this project is the development of open source machine-aided translation tools and resources for South African languages. The term "open source" implies that every application developed in this project is freely available to the general public. This definition also extends to the source code of every application.

The objective of establishing this project as an open source project adheres to the South African National Government's policy and strategy for open source implementation. This policy specifies that all new software developed for government should be based on open standards. Furthermore, government also encourages and supports the use of open content and open standards within South Africa.

The main aims for this project are:

  • The development of machine translation systems for translation between South African languages;
  • The development of open source terminology and translation software;
  • Fostering human resource development in all areas of the project; and
  • Producing research outputs (scholarly papers and articles), technology outputs (selected core technologies), and resources (selected corpora) of international quality.

The translation tools developed in this project aim to meet the needs not only of the Department of Arts and Culture, but also of a wide variety of South African citizens at various levels in a developing Information Society. This project therefore contributes strongly to the more rapid promotion of a culture of multilingualism in South Africa. It also contributes to language pride, and to a consciousness of the importance of promoting, preserving and developing minority languages in South Africa. By involving native speakers of the indigenous languages in this project, the shortage of people who are knowledgeable about and trained in ICT is also partially addressed, as it empowers native speakers of local languages to partake in the growing local and global HLT industry.

1.2. What is the ITE?

Autshumato Integrated Translation Environment (ITE) is a free computer-aided translation (CAT) application. It provides a single translation environment that contains translation memory, machine translation and a glossary to facilitate the translation process. The Autshumato ITE is a derived work of the popular open source OmegaT CAT application.

Although Autshumato ITE is specifically developed for the eleven official South African languages, it is in essence language independent, and can be adapted for translation between any language pair. Autshumato ITE is implemented in the Java programming language, supports open file standards and is licensed under the GNU GPL version 2 or later.

1.3. What is OmegaT?

OmegaT is a free translation memory application written in Java. It is a tool intended for professional translators. However, it does not translate for you! (Software that does this is called "machine translation", and you will have to look elsewhere for it.) OmegaT has the following features:

  • Fuzzy matching
  • Match propagation
  • Simultaneous processing of multiple-file projects
  • Simultaneous use of multiple translation memories
  • User glossaries with recognition of inflected forms
  • Document file formats include:
    • Microsoft Word, Excel, PowerPoint (.docx, .xlsx, .pptx)
    • XHTML and HTML
    • Open Document Format (LibreOffice, OpenOffice.org)
    • MediaWiki (Wikipedia)
    • Plain text
    • ...and approximately 30 other file formats
  • Unicode (UTF-8) support: Can be used with non-Latin alphabets
  • Support for right-to-left languages
  • Integrated spelling checker
  • Compatible with other translation memory applications (TMX, TTX, TXML, XLIFF, SDLXLIFF)
  • Interface to Google Translate

Find out more at: http://www.omegat.org

1.4. How much does the Autshumato ITE cost?

Anybody may download and use the complete software free of charge.

1.5. What CAN the Autshumato ITE do?

The Autshumato ITE was developed as a truly South African CAT tool. The focus of the software is to make freely available an environment that will make it easier for translators to work in the South African languages and to equip them with resources that will make their job easier and their efforts more effective.

The Autshumato ITE offers a translation environment through the popular OmegaT interface, and includes resources such as glossaries to see suggested translations for frequently used words, and translation memories to show previous translations for phrases. The target documents that are generated with the Autshumato ITE are also formatted according to the source documents so any text in bold or italics, inserted pictures and bulleted or numbered lists will remain as such in the target document. All of these resources are combined in a user-friendly interface, with the possibility of adding additional resources such as spelling checkers and even machine translation tools.

1.6. What CAN'T the Autshumato ITE do?

The Autshumato ITE was designed as a resource to aid translators in their work. It is by no means a way of replacing the valuable skills of a translator, but rather a way to save time on the repetition of work. The Autshumato ITE cannot and will never be able to deliver perfect translations on its own, and the translator should always be on the look-out for mistakes or corrections that need to be made to the suggested translations.

The Autshumato ITE is not a machine translation system that translates documents for you automatically; it can, however, connect to such services in order to aid in the translation process (please refer to the Autshumato ITE user manual or contact the developers for additional help in this regard).

The Autshumato ITE cannot automatically share translation memories and glossaries between several users over a network or the Internet. It is up to users to share and distribute these files if so desired.

As the Autshumato ITE is distributed free of charge, it does not contain spelling or grammar checkers. Please read the Autshumato ITE spelling checker download and installation procedure for complete instructions on downloading and installing spelling checkers.

1.7. What are the minimum requirements in order to run the Autshumato ITE?

Personal Computer (PC) with at least:

  • 1GHz processor,
  • 1 GB RAM,
  • 500MB available disk space,
  • 15'' monitor.

Operating system:

  • Windows 7, Windows 8,
  • Linux,
  • Mac (OS 10.6 or later).

2. Download & Installation

2.1. Where can I download the Autshumato ITE?

Follow the Download and Installation Procedure to guide you through the application download and installation on your computer.

2.2. How to install Oracle™ Java

When you install the Autshumato ITE on Windows, the installer automatically checks that the correct version of Oracle™ Java is installed. If a correct version is not found, it is installed automatically.

Mac and Linux users have to download and install Oracle™ Java manually by navigating to http://java.com/en/download/manual.jsp and selecting the appropriate installer to download.

2.3. How to install the Autshumato ITE

Follow the Download and Installation Procedure to guide you through the application download and installation on your computer.

2.4. How to install spelling checkers for the ITE

Read the Autshumato ITE spelling checker download and installation procedure for complete instructions.

3. Running the Autshumato ITE

3.1. How do I insert my own or new languages?

The Autshumato ITE enables you to define easily which languages, language codes, country codes and diacritic characters should be available in the Autshumato ITE. To add more languages than those currently available in the Autshumato ITE interface, you will have to change the omegat.language.prefs file. Read the Autshumato ITE Reset Available Languages procedure for a basic guide to resetting the available languages. The procedures below provide a more detailed explanation of inserting, removing and altering the available languages.

3.1.1. Where do I find the omegat.language.prefs file?

This file can be found in the installation directory (the location in which you installed the Autshumato ITE).

  • On Windows 7 the default installation directory is: C:\Program Files (x86)\CTexT\Autshumato ITE

3.1.2. How do I edit the omegat.language.prefs file?

In the Autshumato ITE installation directory, you will find the "omegat.language.prefs" file. This file contains the languages that are present in the Autshumato ITE and the special characters that can easily be inserted using the Insert menu.

To customise the languages represented in the Autshumato ITE, you need to complete the following steps:

  1. Make a backup copy of the "omegat.language.prefs" file before editing.
  2. Open the original "omegat.language.prefs" file in any text editor.
  3. Edit the file and save it in UTF-8 format.

An example of the "omegat.language.prefs" file is given below.

# Here you set the Languages and Locale Codes used in the Autshumato ITE.
# The format of this file is as follows:
# Language Name  [tab]  Locale Code  [tab]  Locale Country   [tab]  Diacritic characters (Comma separated)
# EX: "Afrikaans    afr ZA  à,á,â,ã,ä,è,é,ê,ë,í,î,ï,ó,ô,ö,ù,ú,û,ü,ý"
# Remember to make a copy before editing this file and save in UTF-8 format.
# South African languages, ie the original A-ITE list of langs:
Afrikaans   AFR ZA  à,á,â,ã,ä,è,é,ê,ë,í,î,ï,ó,ô,ö,ù,ú,û,ü,ý
English ENG GB
IsiNdebele  NBL ZA
IsiZulu ZUL ZA
IsiXhosa    XHO ZA
Sesotho SOT ZA
Siswati SSW ZA
Setswana    TSN ZA  Š,š
Sepedi  NSO ZA  Š,š
Tshivenḓa   VEN ZA  Ḓ,Ḽ,Ṋ,Ṅ,Ṱ,ḓ,ḽ,ṋ,ṅ,ṱ
Xitsonga    TSO ZA

  • The first few lines starting with the # character are considered comments and ignored by the Autshumato ITE.
  • Each language to be represented in the Autshumato ITE is on a new line and consists of the following information:
    • Language name (ex. Tshivenḓa),
    • ISO 639-2 language code (ex. ven),
    • ISO 3166 country code (ex. ZA); and
    • Language diacritics (ex. Ḓ,Ḽ,Ṋ,Ṅ,Ṱ,ḓ,ḽ,ṋ,ṅ,ṱ) separated by commas.
  • Each of the fields is separated by a [Tab].
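
The layout described above can also be read programmatically. The following is a minimal Python sketch and not part of the Autshumato ITE itself: the names LanguageEntry and parse_language_prefs are hypothetical, and tolerating repeated tabs between fields is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageEntry:
    """One language line from an omegat.language.prefs-style file."""
    name: str
    language_code: str
    country_code: str
    diacritics: list = field(default_factory=list)


def parse_language_prefs(text):
    """Parse the text of a prefs file into a list of LanguageEntry objects."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Blank lines and lines starting with '#' carry no language data.
        if not line or line.startswith("#"):
            continue
        # Fields are tab-separated. Assumption: empty fields produced by
        # repeated tabs are ignored rather than treated as errors.
        fields = [part.strip() for part in line.split("\t") if part.strip()]
        if len(fields) < 3:
            raise ValueError(f"Expected at least 3 fields: {raw_line!r}")
        # The fourth field is optional and holds comma-separated diacritics.
        diacritics = fields[3].split(",") if len(fields) > 3 else []
        entries.append(LanguageEntry(fields[0], fields[1], fields[2], diacritics))
    return entries


sample = (
    "# comment line\n"
    "Afrikaans\tAFR\tZA\tà,á,â\n"
    "English\tENG\tGB\n"
)
entries = parse_language_prefs(sample)
```

A line with fewer than three fields raises a ValueError, which usually means that spaces were typed instead of tabs.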

3.1.3. How do I remove a language?

To remove a language, simply remove the line containing the language information from the "omegat.language.prefs" file; refer to section 3.1.2.

3.1.4. How do I add a language?

To add a language, create a new line in the "omegat.language.prefs" file; refer to section 3.1.2.

  1. Enter the language name as it should be displayed in the Autshumato ITE.
  2. Press the [Tab] on the keyboard.
  3. Enter the ISO 639-2 three-letter language code in lowercase.
  4. Press [Tab] again.
  5. Enter the ISO 3166 two-letter country code.
  6. If the language has specialised diacritic characters, press [Tab] again and then enter the diacritic characters, each separated by a comma.
  7. Save the file (make sure that the file is saved in the UTF-8 format), and launch the Autshumato ITE.

The newly entered language will now be available in the application, and its special characters will be available from the Insert menu. Note that even if a language is not in the list, you can always type in its code manually (for example, nr-ZA for isiNdebele).
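
The steps for adding a language can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, Portuguese is just an example language, and the demonstration writes to a temporary copy rather than to the real installation directory.

```python
import tempfile
from pathlib import Path


def format_language_line(name, language_code, country_code, diacritics=()):
    """Build one tab-separated language line."""
    # Lowercase language code and uppercase country code, as in the steps.
    fields = [name, language_code.lower(), country_code.upper()]
    if diacritics:
        # The optional fourth field lists the diacritics, comma-separated.
        fields.append(",".join(diacritics))
    return "\t".join(fields)


def append_language(prefs_path, line):
    """Append a language line to the file, always writing UTF-8."""
    path = Path(prefs_path)
    existing = path.read_text(encoding="utf-8")
    # Start a new line if the file does not already end with one.
    separator = "" if existing.endswith("\n") or not existing else "\n"
    path.write_text(existing + separator + line + "\n", encoding="utf-8")


# Demonstration on a temporary file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "omegat.language.prefs"
    demo_path.write_text("English\tENG\tGB", encoding="utf-8")
    new_line = format_language_line("Portuguese", "POR", "pt", ["ã", "õ", "ç"])
    append_language(demo_path, new_line)
    result = demo_path.read_text(encoding="utf-8")
```

Removing a language is the reverse operation: drop the matching line and save the file in UTF-8 again.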

3.1.5. How do I insert (diacritic) characters for a translation?

Go to Insert, choose the language and click on the diacritic or special character that you wish to use. The character is then inserted into your document.

3.2. Can I select the same source and target language when I am only editing or rewriting in plain language?


3.3. Which language combinations does the ITE support?

The ITE is language independent; it can be used to translate between any two languages.

3.4. What is Auto-propagation?

If there is an exact match for the current source segment in the translation memory, the ITE will insert the translation automatically. Disable this option in Project -> Properties if you do not want auto-propagation.
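
Conceptually, auto-propagation is an exact lookup of the source segment in the translation memory. The Python sketch below models the translation memory as a plain dictionary, which is a simplifying assumption and not how the ITE stores its data; the function name is hypothetical.

```python
def auto_propagate(source_segment, translation_memory):
    """Return the stored translation for an exact match, or None."""
    # Only an identical source segment counts; near matches are not propagated.
    return translation_memory.get(source_segment)


memory = {"Save the file.": "Stoor die lêer."}
exact = auto_propagate("Save the file.", memory)
near_miss = auto_propagate("Save the files.", memory)
```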

3.5. How do I set the auto-save function if there is any?

The ITE already auto-saves every three minutes and this can be adjusted by selecting Options -> Saving and Output....

3.6. How do I create a new project?

To create a project in the Autshumato ITE, select Project -> New....

The "Create a New Project" dialog appears. You can now navigate to the folder in which you would like to create the project and specify a name for the project. Click on Save once you are satisfied with the project name and location.

Useful tip: We recommend creating a central folder on your computer for all your translation work. You can then create sub-folders for each project and organise the additional resources effectively in this central folder.

3.7. Why am I experiencing problems when I try to create a project?

You should make sure that your project name is short, descriptive and to the point. Unnecessarily long project names will only cause problems.

3.8. How do I switch between different segments without having to use the mouse?

You can either press [Enter] on the keyboard, or select Options -> Use TAB to Advance to use the [Tab] keyboard key to advance to the next segment. Similarly, [Shift] + [Tab] will activate the previous segment.

3.9. Why does the source language not show in the active segment?

  • Go to Options -> Editing Behaviour.

  • Ensure that the "The source text" option is selected.

This option makes a copy of the source language in the active segment for easy editing. Changes will only be shown when you move to the next segment. Refer to the Autshumato ITE Altering the Editing Behaviour procedure for more information on the editing behaviour settings.

3.10. How do I solve the "Out of memory" issue?

This error occurs when the Autshumato ITE does not have enough memory assigned to open large translation memories. By default, only 512 MB is assigned to the application, which may cause problems if you use large translation memories.

An error dialog is shown when the error occurs, and the application closes once you close the dialog. Refer to the Autshumato ITE Resolving out of Memory Error procedure on how to resolve the issue.

3.11. Can I load documents that someone else translated and only edit them? Will the ITE update the TMs for the entire document?

Yes, you can. By copying the complete translation project from the other person and opening it with the ITE, you will be able to edit any of the translated segments and the translation memories will be updated.

4. Importing documents

4.1. How do I import .doc, .xls and .ppt (Microsoft® Office 2003 or earlier) documents?

Open the document in an application capable of reading such documents (Microsoft® Office 2007 or later, OpenOffice.org or LibreOffice) and then select Save As in the application. Now save the document as the relevant .docx, .xlsx or .pptx document. The saved document can now be opened in the ITE.
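
When many legacy documents need converting, the Save As step can be scripted with the LibreOffice command line instead of being done by hand. The Python sketch below is an optional alternative that the procedure above does not describe: it assumes that the soffice executable is on your PATH, the function names are hypothetical, and report.doc is a placeholder file name.

```python
import subprocess
from pathlib import Path

# Legacy extension -> format name passed to LibreOffice.
TARGET_FORMATS = {".doc": "docx", ".xls": "xlsx", ".ppt": "pptx"}


def build_convert_command(document, output_dir):
    """Build the LibreOffice command that converts one legacy document."""
    suffix = Path(document).suffix.lower()
    if suffix not in TARGET_FORMATS:
        raise ValueError(f"Not a legacy Office document: {document}")
    return [
        "soffice", "--headless",
        "--convert-to", TARGET_FORMATS[suffix],
        "--outdir", str(output_dir),
        str(document),
    ]


def convert(document, output_dir):
    """Run the conversion; requires LibreOffice to be installed."""
    subprocess.run(build_convert_command(document, output_dir), check=True)


command = build_convert_command("report.doc", "converted")
```

Check the converted file before importing it, as complex layouts may not convert cleanly.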

4.2. How do I import a PDF document?

The Autshumato PDF Extractor is a utility application that extracts text from PDF documents with the aim of making it translatable. It is also able to extract the pages of PDF documents as PNG images. It is free to anyone and is licensed under the Apache License Version 2.0.

4.3. Does the ITE replace the source file or create a copy?

A copy of the original source file is created and stored in the Source folder of the project directory.

4.4. How do I choose another file in the same project?

Go to Project -> Project Files and then choose the file you want to translate. The application will automatically open the next source document in the Editor if the last segment of the current document has been reached and you activate the next segment.

4.5. How does segmentation work?

Refer to Chapter 15 (Source Segmentation) in the OmegaT user manual.

5. Text formatting

5.1. How do I change the font size?

You can change the size or the font of the text by selecting Options -> Font.

5.2. Does the font size in the Editor have an effect on the translated document?

No. Setting the font size in the ITE only applies to the ITE editing environment; it does not affect the font type or size of the translated document.

5.3. How do I switch between upper and lower case?

Select Edit -> Switch Case To and choose the option that you require. Alternatively, use the [Shift] + [F3] keyboard shortcut to cycle between cases.

5.4. What is cycle case?

Cycle case steps through upper case, lower case and title case in turn, so you can repeat it until the text has the case you need.
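
The idea can be sketched in Python. The order used here (lower, then title, then upper) is an assumption for illustration; the ITE may cycle in a different order, and cycle_case is a hypothetical name.

```python
def cycle_case(text):
    """Return the text in the next case: lower -> Title -> UPPER -> lower."""
    if text.isupper():
        return text.lower()
    if text.islower():
        return text.title()
    # Title case or mixed case moves on to upper case.
    return text.upper()


step1 = cycle_case("hello world")
step2 = cycle_case(step1)
step3 = cycle_case(step2)
```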

5.5. Why does the spelling checker not automatically correct the Afrikaans "'n" as in Microsoft® Word?

CTexT® spelling checkers are only compatible with Microsoft® Office. Download the open source spelling checkers from http://extensions.openoffice.org/ to use with the ITE. Refer to the spelling checker download and installation procedure for more detailed instructions.

6. Formatting tags

6.1. I get an error when opening the generated translated document

If opening the target document produces an error about styles.xml, a parsing error or an error while opening, it is most likely a tag problem. This means that somewhere in the translation a tag was not copied correctly. This can be fixed by:

  • Removing the tags: Select Project -> Properties and check the Remove Tags option on the dialog; or
  • Using the Tag Validator (Tools -> Validate Tags) to check for tag errors in the translations.

Upon fixing all of the tag problems, the target document will generate correctly.

The following steps will aid in resolving any such issues by removing the tags:

  1. Ensure that the ITE is open and the document that is giving the problem is the current active document.
  2. Select Project -> Properties and check the Remove Tags option on the dialog.
  3. Translate the segments that have not yet been translated.
  4. Generate the translated documents (Project -> Create Translated Documents).
  5. You should now be able to open the translated document.

Alternatively, to keep the formatting tags and correct them instead:

  1. If the source text contains formatting tags (ex: <f1>some text to translate</f1>), carefully ensure that all the tags are copied to the translation. Use the Edit -> Insert Next Missing Tag option to aid in the process.
  2. Use the Tag Validator (Tools -> Validate Tags) to ensure that the tags were copied correctly.
  3. Generate the translated documents (Project -> Create Translated Documents).
  4. You should now be able to open the translated document.
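
The kind of check that tag validation performs can be illustrated with a short Python sketch. It is not the ITE's Tag Validator: the function name is hypothetical, and the tag pattern is an assumption based on the <f1>...</f1> example above.

```python
import re
from collections import Counter

# Assumption: tags look like <f1>, </f1> or <br1/> (letters, then a number).
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+\d+/?>")


def find_tag_problems(source, translation):
    """Return (missing, extra) tags of the translation relative to the source."""
    source_tags = Counter(TAG_PATTERN.findall(source))
    target_tags = Counter(TAG_PATTERN.findall(translation))
    # Counter subtraction keeps only the tags with a positive difference.
    missing = sorted((source_tags - target_tags).elements())
    extra = sorted((target_tags - source_tags).elements())
    return missing, extra


source = "<f1>some text to translate</f1>"
broken = find_tag_problems(source, "<f1>teks om te vertaal")
fixed = find_tag_problems(source, "<f1>teks om te vertaal</f1>")
```

This sketch only compares which tags are present; it does not check their order or nesting.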

6.2. How do I hide tags?

You can enable or disable this option by selecting Project -> Properties and checking the Remove Tags option on the dialog.

6.3. How to detect and fix tag problems

Go to Tools -> Validate Tags to open the Tag Validator. After you click on Validate Tags, the Tag Validator window appears.

6.4. Why do the tags not show for certain sentences?

Only sentences that contain in-line formatting, i.e. a single word in the sentence is bold, italic or underlined, will have formatting tags. It is possible for the application to extract the formatting of most sentences and as a result they need not have formatting tags.

6.5. If you can just remove the tags, why are they there?

The formatting tags are used for in-sentence formatting, i.e. any place in the source document where the application was unable to extract the explicit formatting. To ensure that the source formatting is not lost, it is assigned specific formatting tags. The tags are then used to insert the correct formatting when creating the translated document.

6.6. What are the implications if I remove the tags?

When you remove the tags, you need to check the formatting yourself in the generated document.

7. Glossaries

7.1. How do I insert a word found in Glossary Pane?

If the TransTips are enabled (Options -> TransTips -> Enable TransTips), words in the source text that could be found in the glossaries will be underlined with a blue line. You can simply right click on the underlined word and the possible translations will be shown in the context menu (beneath the Remove translation option). Selecting a word will insert it into the translation.

Alternatively, while translating you can press [Ctrl] + [Spacebar] to open the auto-complete panel which lists the glossary entries. Select an entry using the arrow keys and press [Enter] to select and insert an entry.

7.2. How do I create a glossary item?

Select Edit -> Create Glossary Entry on the main menu to open the Create Glossary Entry dialog. Here you enter the source text, the translation and optionally a comment. The entry will then be placed in your personal glossary file that is available in the project directory under the glossary folder.

Alternatively, you can press [Ctrl] + [Shift] + [G] to create a new glossary entry.

7.3. Why do some words not match exactly to the glossary words?

You have to set the option for exact word matching at Options -> TransTips -> Exact Match.

7.4. Can I incorporate my own glossary?

See the OmegaT user manual on how to create a glossary. If the glossary is built to OmegaT standards, save the glossary in the glossary folder of your project.
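
As a rough illustration, the Python sketch below writes a glossary as UTF-8 text with one entry per line: source term, target term and an optional comment, separated by tabs. This layout is an assumption about the OmegaT glossary format, so confirm it against the OmegaT user manual. The file name my_glossary.txt and the helper names are hypothetical, and the demonstration writes to a temporary folder instead of a real project.

```python
import tempfile
from pathlib import Path


def glossary_line(source_term, target_term, comment=""):
    """Build one tab-separated glossary entry."""
    fields = [source_term, target_term]
    if comment:
        fields.append(comment)
    return "\t".join(fields)


def write_glossary(path, entries):
    """Write all entries to a UTF-8 text file, one entry per line."""
    lines = [glossary_line(*entry) for entry in entries]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


terms = [
    ("translation", "vertaling"),
    ("language", "taal", "noun"),
]

# Demonstration in a temporary stand-in for a project's glossary folder.
with tempfile.TemporaryDirectory() as tmp:
    glossary_path = Path(tmp) / "my_glossary.txt"
    write_glossary(glossary_path, terms)
    written = glossary_path.read_text(encoding="utf-8")
```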

8. Translation memories

8.1. How do I insert a fuzzy match?

Select the appropriate fuzzy match by selecting Edit -> Select Match #1 on the main menu. To insert the selected fuzzy match, select Edit -> Insert Match on the main menu.
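
A fuzzy match is a stored segment that is similar, but not identical, to the segment being translated. The Python sketch below illustrates the idea with the standard difflib module; it is not the scoring method used by the ITE or OmegaT, and the 0.6 threshold and the function name are arbitrary assumptions.

```python
from difflib import SequenceMatcher


def rank_fuzzy_matches(segment, memory, threshold=0.6):
    """Return (score, source, translation) tuples, best match first."""
    scored = []
    for source, translation in memory.items():
        # ratio() is 1.0 for identical strings and lower for weaker matches.
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold:
            scored.append((score, source, translation))
    return sorted(scored, reverse=True)


memory = {
    "Save the file.": "Stoor die lêer.",
    "Open the project.": "Maak die projek oop.",
}
matches = rank_fuzzy_matches("Save the files.", memory)
best_score, best_source, best_translation = matches[0]
```

The best match still has to be reviewed and edited by the translator, since it was written for a slightly different source segment.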

8.2. How do I replace text with the fuzzy match?

Select the appropriate fuzzy match by selecting Edit -> Select Match #1 on the main menu. Select Edit -> Replace with Match on the main menu to replace the current translation with the selected match.

8.3. How do I insert the machine translated text?

A machine translated text can be inserted into the current active segment by selecting Edit -> Replace with Machine Translation on the main menu. Press [Ctrl] + [M] on the keyboard to achieve the same result.

8.4. Can multiple users share and access translation memories?

Refer to the OmegaT User's Manual (Appendix C) on how to set up and operate a Team project in which the translation memories, glossaries and translated documents can be shared between several users.

8.5. Will the ITE create a new translation memory when you create a new project, or does it use the previously created translation memory?

For every new project, a new translation memory is created. It is also possible to copy another translation memory to the new project's tm folder to include work you have previously done on other projects. Refer to the OmegaT user manual Chapter 14 for more information.

9. Machine translation

9.1. Why don't I get a machine translation?

Ensure that the option for machine translation is selected. Go to Options -> Machine Translate -> Choose machine translation system. If these options are selected, make sure that your computer can connect to the Internet. Be sure to enter your required API key for specific machine translation services; refer to the OmegaT user manual (Chapter 20) for more information.

The Autshumato machine translation systems are currently only available to the Department of Arts and Culture, Government of South Africa. We can, however, design and build customised machine translation systems to cater for your organisation or company. Contact us at authsumato@nwu.ac.za for more information.

10. Create the translated document

10.1. Where can I find my translated document?

We recommend creating a central folder on your computer for all your translation work. You can then create sub-folders for each project and organise the additional resources effectively in this central folder. The steps to find your translated work are as follows:

  1. Close the Autshumato ITE.
  2. Go to the central folder on your computer where you saved all your ITE projects.
  3. Open the folder of the project you created.
  4. Double-click on the target folder.
  5. Your translated document is in this folder.
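
The same lookup can be done with a few lines of Python. This is a sketch only: the function name is hypothetical, and the demonstration builds a temporary stand-in project with a placeholder file instead of reading a real project.

```python
import tempfile
from pathlib import Path


def list_translated_documents(project_dir):
    """Return the names of the files in the project's target folder."""
    target_dir = Path(project_dir) / "target"
    if not target_dir.is_dir():
        # No target folder yet: no translated documents have been created.
        return []
    return sorted(p.name for p in target_dir.iterdir() if p.is_file())


# Demonstration with a temporary stand-in for a project folder.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my_project"
    (project / "target").mkdir(parents=True)
    (project / "target" / "report.docx").write_bytes(b"")
    found = list_translated_documents(project)
    empty = list_translated_documents(Path(tmp) / "no_such_project")
```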

10.2. Can I save the translated document in a different format than that of the source file?

Read Chapter 9.2 (Other file formats) in the OmegaT user manual. The document can only be translated to the same format; you need to manually save it to the desired format.

11. Document Naming Service (DNS)

11.1. Do I need to use the DNS when using the ITE?

No, it is optional.

11.2. Why can't I rename documents?

  • Ensure that the DNS is configured. (Tools -> Configure Document Naming System)
  • If it is configured, make sure there are no mistakes. Read the help guide on how to configure the DNS (Advanced features -> The Autshumato DNS).
  • Ensure that all source and target documents are closed before trying to rename.

11.3. "Please enter a valid separator character" warning.

You may not use any alphabetic (a to z) or numeric (0 to 9) characters as separators.
The following characters are also not allowed, because they are not allowed in file names:
/ \ : * ? " < > |
Press OK and enter another character or use the default character ('.').
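
The rules above can be expressed as a small Python check. This sketch only mirrors the rules as stated in this answer and is not the DNS implementation; the function name is hypothetical, and rejecting anything other than exactly one character is an assumption.

```python
# Characters that are not allowed in file names, as listed above.
FORBIDDEN_IN_FILE_NAMES = set('/\\:*?"<>|')


def is_valid_separator(char):
    """Return True if char may be used as the separator character."""
    # Assumption: the separator must be exactly one character.
    if len(char) != 1:
        return False
    # Alphabetic (a to z) and numeric (0 to 9) characters are rejected.
    if char.isascii() and char.isalnum():
        return False
    return char not in FORBIDDEN_IN_FILE_NAMES


default_ok = is_valid_separator(".")
letter_ok = is_valid_separator("a")
slash_ok = is_valid_separator("/")
```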

11.4. "Field values may not contain the separator character" warning.

This error informs you that no character has been entered to serve as the separator character. When you press OK, the default separator character ('.') will be restored.

11.5. "Field value contains some illegal characters" warning.

This error informs you that no character has been entered to serve as the separator character. When you press OK, the default separator character ('.') will be restored.

