Apertium: machine translation toolbox / News: Recent posts

Apertium-tagger-training-tools for Apertium 3 and UTF-8

A new version of the apertium-tagger-training-tools has been released (version 1.0.0).

Apertium-tagger-training-tools provides a set of tools to train (in an unsupervised way) the hidden-Markov-model-based part-of-speech taggers used by the Apertium MT platform; to that end, information not only from the source language, but also from the target language and from the remaining modules of the Apertium MT platform is used during training.

Posted by Felipe Sánchez Martínez 2007-11-14

apertium-es-pt 1.0.3 released

Spanish-Portuguese data have been adapted to Apertium 3.0 and released as apertium-es-pt 1.0.3.
Note that, as in previous releases, apertium-es-pt supports both Brazilian and European Portuguese.

Posted by Mikel L. Forcada 2007-10-03

Apertium 3.0 released

Apertium 3.0, the latest version of the open-source machine translation platform Apertium (packages apertium and lttoolbox), has just been released, together with language-pair data for the Spanish-Catalan pair (version 1.0.4), adapted for use with the new version.

Earlier versions worked with single-byte character sets such as ISO-8859-1, which restricted their use to some Western languages. Version 3.0 of Apertium has been completely reworked to be fully Unicode-capable: both the text to be translated and the language-pair data needed to translate it can now be encoded in Unicode.
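
The limitation of a single-byte character set can be illustrated with a small sketch. It is not part of Apertium; the helper name `encodable` and the sample words are assumptions chosen for illustration, and UTF-8 is used here simply as one common Unicode encoding.

```python
def encodable(text, encoding):
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample words; any text in these languages would do.
samples = {
    "Spanish": "año",
    "Catalan": "col·legi",
    "Polish": "łódź",
    "Russian": "перевод",
}

for language, word in samples.items():
    print(language, encodable(word, "iso-8859-1"), encodable(word, "utf-8"))
```

In this sketch the Spanish and Catalan words fit in ISO-8859-1, while the Polish and Russian words can only be represented in a Unicode encoding.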

Posted by Mikel L. Forcada 2007-10-01

First release of the apertium-transfer-tools package

apertium-transfer-tools provides a set of tools for the automatic generation of shallow-transfer machine translation (MT) rules from parallel corpora. The generated transfer rules (in XML format) can be directly used by the Apertium MT platform (http://www.apertium.org).

Although this package is aimed at the generation of Apertium transfer rules, it can be adapted to generate shallow-transfer rules for other MT platforms. Moreover, some of the tools it provides can be used for other purposes, such as the extraction of bilingual phrase pairs or the symmetrization of previously computed alignments.

Posted by Felipe Sánchez Martínez 2007-07-05

apertium-tagger-training-tools for Apertium 2.0

A new version of the apertium-tagger-training-tools has been released (version 0.9.2).

This release does not add new features; minor changes have been made to make it compatible with the latest version of lttoolbox and apertium (version 2.0).

Posted by Felipe Sánchez Martínez 2007-01-07

Apertium 2 packages: new engine and language-pair data

A set of new packages has been released as part of the Apertium project, as a result of work funded by the Generalitat de Catalunya (Catalan autonomous government):

* lttoolbox-2.0 and apertium-2.0: a new version of the Apertium machine translation toolbox (with backward compatibility with Apertium 1.0 language-pair data files). The new package features:

- An enhanced structural-transfer ("translation rules") module able to perform more complicated operations in order to treat less-related language pairs such as English-Catalan
- An experimental, optional lexical selection module which tries to deal with polysemous words (under development, not currently used by any language package)
- A new format for dictionaries that allows for a more powerful definition of inflection paradigms and supports multiple equivalents in bilingual dictionaries (to be used in connection with the lexical selection module).
- A new translation script that allows for easy maintenance.

Posted by Sergio Ortiz 2006-12-22

A new release of Apertium (version 1.9) is launched!

This new release of Apertium includes:

* An enhanced structural-transfer module able to perform more complicated operations in order to treat less-related language pairs.

* A new translation script that allows for easy maintenance.

This is the preview of the upcoming Apertium 2.0 release that will include other improvements.

Posted by Sergio Ortiz 2006-12-15

New oc-ca release 1.0

A new version (1.0) of the Aranese Occitan - Catalan linguistic data for the apertium open-source machine translation system (www.sf.net/projects/apertium/) has just been released.

Posted by Sergio Ortiz 2006-12-11

New fr-ca release 0.8

A new version (0.8) of the French - Catalan linguistic data for the apertium open-source machine translation system (www.sf.net/projects/apertium/) has just been released.

Posted by Sergio Ortiz 2006-12-04

Version 0.9 of apertium-oc-ca released

A new version (0.9) of the Occitan (Aranese) - Catalan linguistic data for the apertium open-source machine translation system (www.sf.net/projects/apertium/) has just been released.

Posted by Mikel L. Forcada 2006-10-23

Apertium packages available on Debian

Thanks to the work of Fran Tyers, Sergio Talens, and other Debian people, Apertium (http://www.apertium.org) packages have started to be available as Debian Linux packages ready to be installed as binaries for a number of platforms. They are part of the "unstable" distribution.

Currently, Debian distributes version 1.0 of the apertium and lttoolbox packages. Updated versions (1.0.3) will soon be available.

Posted by Mikel L. Forcada 2006-10-18

Incomplete Swedish-Danish data available in CVS

The CVS tree of Apertium contains incomplete linguistic data to build a Swedish-Danish MT system (module apertium-sv-da). These data come from an abandoned project and have not been tested, so they may contain a number of errors. The Apertium team welcomes developers for this pair or other Scandinavian language pairs. The relatedness of these languages makes it feasible to build a reasonable MT system based on Apertium.

Posted by Mikel L. Forcada 2006-10-05

apertium-eval-translator package released

This package contains a simple Perl script to evaluate Apertium-based machine translation (MT) systems.

The evaluation consists of the computation (at document level) of the word error rate (WER) and the position-independent word error rate (PER) between a translation performed by the Apertium MT system and a reference translation obtained by post-editing the system output.
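
As an illustration of the two measures, here is a minimal Python sketch. It is not the package's Perl script: the function names are hypothetical, tokenization is plain whitespace splitting, and PER follows one common definition (bag-of-words matches, normalized by reference length), which may differ in detail from what apertium-eval-translator computes.

```python
from collections import Counter


def word_error_rate(hypothesis, reference):
    """WER: word-level edit distance divided by the reference length."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def position_independent_error_rate(hypothesis, reference):
    """PER: like WER, but word order is ignored (bag-of-words comparison)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Multiset intersection counts words shared by both sentences.
    matches = sum((Counter(hyp) & Counter(ref)).values())
    return (max(len(hyp), len(ref)) - matches) / len(ref)


mt_output = "the cat on the mat sat"
post_edited = "the cat sat on the mat"
print(word_error_rate(mt_output, post_edited))
print(position_independent_error_rate(mt_output, post_edited))
```

In this example the MT output contains the right words in the wrong order, so WER penalizes it while PER does not; the gap between the two rates gives a rough indication of how many errors are due to word order.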

This package can be easily adapted to evaluate other MT systems.

Posted by Felipe Sánchez Martínez 2006-10-04

lttoolbox-1.0.3 and apertium-1.0.3 now available

lttoolbox-1.0.3 and apertium-1.0.3 have been uploaded. They fix some minor bugs that had been observed.

Bug fixed in lttoolbox-1.0.3:
- Blank characters other than ISO-8859-1 number 32 (" ") broke multiword expressions. [Fixed]

Bug fixed in apertium-1.0.3:
- Preference rules in the tagger definition file (.tsx) did not work properly. [Fixed]

Part-of-speech taggers affected should be recompiled in order to take advantage of the new apertium release.

Posted by Felipe Sánchez Martínez 2006-10-03

First version of French-Catalan language-pair data released

First release of French-Catalan language-pair data for Apertium (6500 lemmata, 60 transfer rules). The French-Catalan language pair is actively being developed with the support of the Generalitat de Catalunya; new versions will be released before the end of 2006.

Posted by Mikel L. Forcada 2006-09-22

apertium-tagger-training-tools package released

Apertium-tagger-training-tools is a new software package useful to train the part-of-speech taggers used within the open-source machine translation system Apertium.

Using this package, you will be able to train in an unsupervised way the part-of-speech tagger for a given language using information from another language by means of the Apertium MT toolbox. In particular, the package may simplify the initial building of a machine translation system for a new pair of languages.

Posted by Felipe Sánchez Martínez 2006-08-07

New language pairs; a more powerful translation engine

Apertium has recently received funding from the Generalitat de Catalunya (the government of the autonomous community of Catalonia in Spain) to develop new language pairs (Occitan-Catalan, French-Catalan) and an improved transfer architecture to include more difficult pairs such as English–Catalan. The new transfer architecture, which will be released in late 2006, will deal with polysemic words having more than one possible translation and will be able to do more extensive syntactical transformations.

Posted by Mikel L. Forcada 2006-07-25

Occitan-Catalan language data

Package apertium-oc-ca provides linguistic data for translation between the Aranese variant of Occitan and the Catalan language; this is the fourth language-pair package available for Apertium. This language pair is actively being developed; therefore, new releases with more data will soon be made available.

Posted by Mikel L. Forcada 2006-07-20

New version of lttoolbox available (1.0.2)

A new version of the lttoolbox lexical processing package (1.0.2) has been uploaded. It corrects some observed bugs.

Posted by Mikel L. Forcada 2006-07-20

New apertium components released

In recent weeks the Apertium team has released:

* Version 1.0.1 of the Apertium engine (packages lttoolbox and apertium)

* Version 0.9 of the Spanish-Portuguese package (package apertium-es-pt; developers welcome!)

* Documentation on how to install apertium and how to add data to an existing language pair.

Posted by Mikel L. Forcada 2006-05-16