IcedTea-Web-Localization with OmegaT


1 Introduction

This is a short guide on using OmegaT for translation.

OmegaT is a free, open-source, multiplatform (written in Java) CAT (computer-aided translation) tool that can facilitate and speed up your translation work. It is not machine translation software (although you can use MT in OmegaT if you want). The application does not translate for you; rather, it lets you re-use your own (or someone else's) previous translations. It is useful for materials and documents in which the same or similar text tends to repeat (for example software, technical documentation, or product catalogs).

The basic idea behind all this is a translation memory, in which text units are stored so that they can be re-used. OmegaT and other CAT tools (such as Trados Studio, memoQ, Wordfast, and Transit) parse the text for translation and split it into segments (usually sentences). When you translate a segment, it is saved in the translation memory. When you later encounter the same or a similar segment, OmegaT offers you the existing translation of that segment from the translation memory. You then only verify that the translation is correct and appropriate in the new context; if it is, you can (but do not have to) use it again. In this way CAT tools can make your translation work easier, faster, and more consistent. Other features help improve the quality of your translations (spell checker, support for terminology databases, tag validation, etc.).
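The translation memory lookup described above can be illustrated with a minimal sketch. It is only an illustration of the idea, not OmegaT's actual matching algorithm: the similarity score comes from Python's difflib, and the memory entries, the `memory` dictionary, and the `best_match` function are made-up names for this example.

```python
from difflib import SequenceMatcher

# A toy translation memory: source segment -> stored translation.
# The entries are invented illustration data, not taken from a real project.
memory = {
    "Click OK to continue.": "Klicken Sie auf OK, um fortzufahren.",
    "The file could not be opened.": "Die Datei konnte nicht geöffnet werden.",
}


def best_match(segment, tm):
    """Return (score_percent, source, translation) for the closest stored
    segment, or None if the memory is empty."""
    best = None
    for source, translation in tm.items():
        # ratio() is 1.0 for identical strings and lower for partial overlap.
        score = round(SequenceMatcher(None, segment, source).ratio() * 100)
        if best is None or score > best[0]:
            best = (score, source, translation)
    return best


# An identical segment yields a 100% match; a similar one yields a fuzzy match.
print(best_match("Click OK to continue.", memory))
print(best_match("Click OK to proceed.", memory))
```

As in a real CAT tool, the suggestion is only offered; the translator still decides whether it fits the new context.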

You can download the latest versions of OmegaT for free here.

Or you can use IcedTea-Web itself. For experienced users, there is a less stable version with the newest features:


or, for everyone else, a more stable version with a user manual:


Unfortunately, SourceForge does not have the fastest connection, but once you have downloaded OmegaT, it works fine with the -Xoffline switch.

2 Creating a project

Before you can start translating with OmegaT, you must first create a project. Run OmegaT, click Project > New..., navigate to the location where you wish to keep the project files, and create the project. In the next window, select the language combination, review or specify the file locations, and enable sentence-level segmentation (recommended, unless you want paragraph-level segmentation for some reason). You can also turn auto-propagation of translations into repeated segments on or off (turning it on is recommended).

Then click OK. In the next window, import the files for translation. A list of supported file formats can be found in the documentation.

In the previously specified location, OmegaT will then create the project package with folders for source files, terminology glossaries, translation memories, dictionaries, and translated target files. If you have any translation memories, terminology glossaries, or dictionaries you would like to use, copy the files into the corresponding folders.
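If you add resources to projects often, the copying step can be scripted. The following is a minimal sketch, assuming the subfolder names "tm" and "glossary" used later in this guide; the `add_resource` function and the extension-to-folder mapping are hypothetical helpers written for this example and are not part of OmegaT.

```python
import shutil
from pathlib import Path

# Assumed mapping from file extension to project subfolder
# (folder names "tm" and "glossary" as used in this guide).
RESOURCE_FOLDERS = {
    ".tmx": "tm",
    ".tbx": "glossary",
    ".txt": "glossary",
    ".utf8": "glossary",
}


def add_resource(project_dir, resource_file):
    """Copy a translation memory or glossary file into the matching
    subfolder of the project and return the path of the copy."""
    resource = Path(resource_file)
    folder = RESOURCE_FOLDERS.get(resource.suffix.lower())
    if folder is None:
        raise ValueError(f"unsupported resource type: {resource.name}")
    destination = Path(project_dir) / folder
    # The folder normally exists already; create it if it does not.
    destination.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(resource, destination))
```

For example, `add_resource("my-project", "old-translations.tmx")` would place the file in "my-project/tm" (both names are placeholders).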

3 Translating

By default, the translation window is in the left part of the translation environment. Translation memory suggestions (where you can see 100% matches or partial, so-called fuzzy, matches with your previous translations) are in the top right part, and terminology suggestions from your glossaries are in the bottom right part.

You can customize the layout simply by dragging the windows (Glossary, TM, etc.) and dropping them where you wish them to be. You can also adjust what is displayed in the translation window, and how, in the View menu (it is useful to turn most of the display options on).

Your customized translation environment can therefore look quite different.

Before you start translating, go to Options > Preferences > Spellchecker. Check the Automatically check the spelling of text checkbox and install a dictionary for your language using the Install new dictionary... button.

Then go to Tag Processing and adjust the tag verification settings. The following settings are recommended:

You can also adjust OmegaT's editing behavior in Editor. Please check at least the Allow translation to be equal to source option.

Now you can go back to the translation environment and start translating. Open the first segment in the translation window and translate the source text. Then move to the next segment with the ctrl+N shortcut (move to next segment) or, alternatively, ctrl+U (move to next untranslated segment). All available editing actions and their shortcuts are listed under the Edit and Go to menus. Shortcuts can be customized in the appropriate configuration file.

Please note that OmegaT auto-propagates the translation of a non-unique segment to all other repetitions of that segment! If you need to use a different translation for some of the repetitions due to different context, right-click on the respective segment and select Create alternative translation.

Using the Edit > Search Project... function (ctrl+F) you can search in translation memories (concordance), source and target segments and notes. You can adjust the search settings (exact search, keyword search, regular expressions, case sensitive, etc.) according to your needs and use wildcards.

Regularly save your work by clicking Project > Save (ctrl+S).

After you translate all segments, run the QA verification with Tools > Check Issues..., review all potential errors in translation and correct the real ones (ignore false alarms).

4 Post-processing

When you finish your translation, make sure to save your work (Project > Save or ctrl+S). Then click Project > Create Translated Documents to create final translated documents. You will find them in the "target" folder in the location with project files (you specified it when creating the project).

5 Tips and tricks

5.1 Translation memories

OmegaT supports translation memories in the standard TMX format. If you want to use an existing translation memory, simply copy the TMX file(s) into the "tm" subfolder in the folder with the project files. You can add several translation memories this way. If you want to pre-translate the file(s) for translation with translations from a translation memory, copy the TMX file into the "tm/auto" subfolder. If one of the translation memories is less reliable, you can penalize matches from it when they are displayed in the OmegaT interface. To do this, put the translation memory into the "tm/penalty-xx" subfolder, where "xx" is the desired penalty value (between 0 and 100) - so 100% matches from a translation memory in the "tm/penalty-10" subfolder will appear as 90% matches in the OmegaT interface.
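The penalty arithmetic can be sketched as follows. This is an illustration only: the `displayed_match` function is a hypothetical helper, and the assumption that the penalty is simply subtracted from any match score (not only from 100% matches) and never drops below zero is ours, extrapolated from the "penalty-10" example above.

```python
import re


def displayed_match(folder_name, raw_score):
    """Return the match percentage shown for a TM stored in `folder_name`.

    Assumes the penalty is subtracted from the raw score; folders without a
    "penalty-xx" name carry no penalty.
    """
    found = re.fullmatch(r"penalty-(\d+)", folder_name)
    penalty = int(found.group(1)) if found else 0
    if not 0 <= penalty <= 100:
        raise ValueError("penalty must be between 0 and 100")
    return max(0, raw_score - penalty)


print(displayed_match("penalty-10", 100))  # 90, as in the example above
print(displayed_match("auto", 100))        # 100, no penalty applied
```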

Once you finish your translation and create the final translated files, OmegaT creates three TMX files in the root directory of the project package. If you want to re-use your translation in another project, or for example in an update of the translated project, use one of these three TMX files: if you will be translating in OmegaT, use the translation memory with the "-omegat" suffix; if you will be working with another CAT tool, use the translation memory with the "-level2" suffix.
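Because TMX is plain XML, the exported files can also be inspected outside of a CAT tool. The sketch below assumes the usual TMX layout of translation units (tu) containing one language variant (tuv) per language, each with a seg element; the `read_tmx` function and the tiny sample document are made up for this example, and a real exported file carries a fuller header.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xml:lang attribute as ElementTree reports it.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx(tmx_text):
    """Return one {language: text} dictionary per translation unit."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.iter("tuv"):
            # Accept both xml:lang and the plain lang attribute.
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if lang and seg is not None:
                # itertext() also collects text nested inside inline tags.
                unit[lang] = "".join(seg.itertext())
        units.append(unit)
    return units


# Minimal invented sample, not a real OmegaT export.
SAMPLE = """<tmx version="1.4">
  <header srclang="en"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Click OK to continue.</seg></tuv>
      <tuv xml:lang="de"><seg>Klicken Sie auf OK, um fortzufahren.</seg></tuv>
    </tu>
  </body>
</tmx>"""

for unit in read_tmx(SAMPLE):
    print(unit)
```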

More details can be found in the documentation.

5.2 Terminology glossaries

For terminology glossaries, OmegaT supports the standard TBX format (read-only). Additionally, OmegaT supports glossaries in tab-delimited text files (with a .txt or .utf8 extension). There can be up to three columns in the file (1. source term, 2. target term, 3. comment). The CSV format is also supported (it is the same as the tab-delimited format, but with a comma as the separator). Again, to use a terminology glossary, copy the TBX/TAB/UTF8/TXT file into the "glossary" subfolder (or a deeper subfolder of the "glossary" folder) in the OmegaT project package. Terms in the source text that are found in these glossaries will be displayed in the "Glossary" pane. There can be several glossaries in the project, but only one of them can be writable, which means that you can add new terms to this file during translation from the OmegaT user interface (Edit > Create Glossary Entry, or ctrl+shift+G). Of course, you can add new terms to any of the glossaries by editing the appropriate file in your favorite text editor.
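A tab-delimited glossary with the three columns described above can also be generated by a script. The following is a minimal sketch: the `glossary_line` and `write_glossary` functions, the sample terms, and the file name in the usage note are hypothetical, and UTF-8 is assumed as the encoding for a .utf8 file.

```python
from pathlib import Path


def glossary_line(source, target, comment=""):
    """Build one tab-delimited glossary line: source, target, optional comment."""
    fields = [source, target] + ([comment] if comment else [])
    for field in fields:
        # Tabs and line breaks inside a field would break the column layout.
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or line breaks")
    return "\t".join(fields)


def write_glossary(path, entries):
    """Write (source, target, comment) tuples to a UTF-8 glossary file."""
    lines = [glossary_line(*entry) for entry in entries]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


# Invented sample terms for illustration.
print(glossary_line("applet", "Applet", "Java term"))
print(glossary_line("browser", "Browser"))
```

Calling `write_glossary("glossary/icedtea.utf8", [("applet", "Applet", "Java term")])` from the project folder would then create a glossary file that OmegaT can pick up (the file name is a placeholder).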

More details can be found in the documentation.

Tip: IT terminology as used by Microsoft is freely available in the TBX format on the Microsoft Language Portal. You can easily download it and use it when translating IcedTea-Web.
