Introduction
NIST 2003 Open Machine Translation (OpenMT) Evaluation is a
package containing source data, reference translations, and
scoring software used in the NIST 2003 OpenMT evaluation. It is
designed to help evaluate the effectiveness of machine
translation systems. The package was compiled, and the scoring
software developed, by researchers at NIST, using newswire source
data and reference translations collected and developed by LDC.
The objective of the NIST OpenMT
evaluation series is to support research in, and help advance
the state of the art of, machine translation (MT) technologies
-- technologies that translate text between human
languages. Input may include all forms of text. The goal is for
the output to be an adequate and fluent translation of the
original.
The MT evaluation series started in 2001 as part of the DARPA
TIDES (Translingual Information Detection, Extraction and
Summarization) program.
Beginning with the 2006 evaluation, the evaluations have been
driven and coordinated by NIST as NIST OpenMT. These
evaluations provide an important contribution to the direction
of research efforts and the calibration of technical
capabilities in MT. The OpenMT evaluations are intended to be
of interest to all researchers working on the general problem
of automatic translation between human languages. To this end,
they are designed to be simple, to focus on core technology
issues, and to be fully supported. The 2003 task was to
evaluate translation from Chinese to English and from Arabic to
English.
Additional information about these evaluations may be found at
the NIST Open Machine Translation (OpenMT) Evaluation web site.
Scoring Tools
This evaluation kit includes a single Perl script (mteval-v09c.pl)
that may be used to produce a translation quality score for one or
more MT systems. The script works by comparing the system output
translation with a set of (expert) reference translations of the same
source text. Comparison is based on finding sequences of words in the
reference translations that match word sequences in the system output
translation. More information on the evaluation algorithm may be
found in the paper that describes it, BLEU: a Method for Automatic
Evaluation of Machine Translation (Papineni et al., 2002).
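The n-gram matching described above can be illustrated with a minimal Python sketch of a BLEU-style score in the spirit of Papineni et al. (2002). This is not the code in mteval-v09c.pl. The function names are hypothetical, tokenization is assumed to be plain whitespace splitting, only a single segment is scored, and the brevity penalty uses the reference length closest to the system output, which is only one of several conventions in use.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_matches(hyp, refs, n):
    """Return (matched, total) n-gram counts for one system output.

    Each system n-gram count is clipped to the largest count of that
    n-gram found in any single reference translation.
    """
    hyp_counts = ngrams(hyp, n)
    max_ref = Counter()
    for ref in refs:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return matched, sum(hyp_counts.values())


def bleu(hyp, refs, max_n=4):
    """Illustrative single-segment BLEU-style score (assumed conventions)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        matched, total = clipped_matches(hyp, refs, n)
        if matched == 0 or total == 0:
            # No smoothing in this sketch: any empty n-gram order gives 0.
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Assumed convention: use the reference length closest to the output
    # length, preferring the shorter reference on ties.
    ref_len = min((len(r) for r in refs), key=lambda length: (abs(length - len(hyp)), length))
    brevity_penalty = 1.0 if len(hyp) >= ref_len else math.exp(1 - ref_len / len(hyp))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)
```

With this sketch, a system output that is identical to one of its references scores 1.0, and an output that shares no words with any reference scores 0.0. Scores intended for comparison with the original evaluation should be produced with the included script rather than with a reimplementation such as this one.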
The included scoring script was released with the original
evaluation and is intended for use with SGML-formatted data files. It
is provided to ensure that users' scoring results are compatible with
results from the original evaluation. An updated scoring software
package (mteval-v13a-20091001.tar.gz), with XML support, additional
options, bug fixes, documentation, and example translations, may be
downloaded from the NIST Multimodal Information Group Tools website.
Data
The Chinese-language and Arabic-language source text included in
this corpus is a reorganization of data initially released to the
public as Multiple-Translation Chinese (MTC) Part 4 (LDC2006T04) and
Multiple-Translation Arabic (MTA) Part 2 (LDC2005T05), respectively.
The reference translations are a reorganized subset of data from these
same Multiple-Translation corpora. All source data for this corpus is
newswire text collected in January and February of 2003 from Agence
France-Presse and Xinhua News Agency. For details on the methodology
of the source data
collection and production of reference translations, see the
documentation for the above-mentioned corpora.
For each language, the test set consists of two files: a source file
and a reference file. Each reference file contains four independent
translations of the data set. The file name reflects the evaluation
year, the source language, the test set (by default, "evalset"), the
version of the data, and whether the file is a source or a reference
file (reference files are marked with "-ref").
DARPA TIDES MT and NIST OpenMT evaluations used SGML-formatted test
data until 2008 and XML-formatted test data thereafter. The files in
this package are provided in both formats.
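As a rough illustration of working with the XML-formatted files, the following Python sketch collects segment text from a reference file. The layout it expects is an assumption and is not documented in this description: a root element containing set elements, each holding doc elements with a docid attribute and seg elements with an id attribute, and a refid attribute on each reference set. The inline sample and the function name read_segments are hypothetical. Check the actual files in the package before relying on any of these names.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature reference file. Element and attribute names are
# assumed for illustration and may differ from the files in this package.
SAMPLE = """<mteval>
  <refset setid="example_set" srclang="Arabic" trglang="English" refid="ref1">
    <doc docid="doc_001">
      <seg id="1">First reference sentence.</seg>
      <seg id="2">Second reference sentence.</seg>
    </doc>
  </refset>
</mteval>"""


def read_segments(xml_text):
    """Map (refid, docid, segid) to segment text.

    refid is None for sets that carry no refid attribute, such as a
    source set under the assumed layout.
    """
    root = ET.fromstring(xml_text)
    segments = {}
    for set_elem in root:  # one child element per source or reference set
        for doc in set_elem.iter("doc"):
            for seg in doc.iter("seg"):
                key = (set_elem.get("refid"), doc.get("docid"), seg.get("id"))
                segments[key] = "".join(seg.itertext()).strip()
    return segments


if __name__ == "__main__":
    for key, text in read_segments(SAMPLE).items():
        print(key, text)
```

Keying on the reference identifier keeps the four independent translations of a segment apart. The SGML-formatted files are not necessarily well-formed XML, so this sketch should not be expected to read them.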
Sample
The sample is a text file containing excerpts from several XML files included in this corpus, including reference translations and source text for a single newswire document. The file is encoded in UTF-8.
Updates
There are no updates available at this time.
Content Copyright
Portions © 2003 Agence France-Presse, © 2003 Xinhua
News Agency, © 2004-2006, 2010 Trustees of the University of
Pennsylvania.