Entity-Aware Machine Translation Leaderboard

Overview

This leaderboard showcases the performance of various systems on the EA-MT shared task, which has been organized as part of the SemEval 2025 workshop.

  • The results are still provisional and subject to change.

Task Description

The task is to translate a given input sentence from the source language (English) into the target language. Each input sentence contains named entities that may be challenging for machine translation systems to handle: entities that are rare, ambiguous, or unknown to the system. Participants must develop machine translation systems that accurately translate these named entities into the target language.

Scoring

The leaderboard is based on three main scores:

  • M-ETA Score: A score that evaluates the translation quality of named entities in the input sentence.
  • COMET Score: A score that evaluates the translation quality at the sentence level.
  • Overall Score: The harmonic mean of the M-ETA and COMET scores.
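The combination of the two metrics can be sketched as a small helper. This is a minimal illustration of a harmonic mean, assuming both scores are reported on the same scale (as on this leaderboard, 0–100); the function name `overall_score` is ours, not part of the official evaluation code.

```python
def overall_score(m_eta: float, comet: float) -> float:
    """Harmonic mean of the M-ETA and COMET scores.

    The harmonic mean is dominated by the lower of the two values,
    so a system must do well on BOTH entity-level translation (M-ETA)
    and sentence-level quality (COMET) to get a high overall score.
    Assumes both inputs are on the same scale (e.g., 0-100).
    """
    if m_eta <= 0 or comet <= 0:
        return 0.0
    return 2 * m_eta * comet / (m_eta + comet)
```

For example, a system with a perfect COMET score but an M-ETA score of 50 gets an overall score of about 66.7, not 75: the harmonic mean penalizes the imbalance more than an arithmetic mean would.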

Legend

  • 🟠: Uses gold data, i.e., the gold Wikidata ID or information derived from it, at test time.
  • 🔍: Uses RAG (Retrieval-Augmented Generation) for named entity translation.
  • 🤖: Uses an LLM (Large Language Model) for named entity translation.
  • 📚: The system (LLM and/or MT model) is finetuned on additional data.

Filters and Controls

Use the dropdowns and checkboxes to filter the leaderboard scores by:

  • Team Name
  • System Name
  • LLM Name

Leaderboard Scores

You can view the leaderboard scores for each system using the three metrics described above (M-ETA, COMET, and Overall). Switch between the tabs to view the scores for each metric.

Note: You can sort the leaderboard by clicking on the column headers. For example, click on the "it_IT" column to sort by the Italian language scores.

Overall Score Leaderboard

| Rank | Team | System | Uses Gold | Uses RAG | Uses LLM | LLM Name | Finetuned | ar_AE | de_DE | es_ES | fr_FR | it_IT | ja_JP | ko_KR | th_TH | tr_TR | zh_TW | overall |
|------|------|--------|-----------|----------|----------|----------|-----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|---------|
| 10 | The Five Forbidden Entities | LoRA-nllb-distilled-200-distilled-600M | 🟠 | 🔍 | 🤖 | Llama-3.3-70B-Instruct + DeepSeek-R1 | 📚 | 92.68 | 90.03 | 92.54 | 92.92 | 94.39 | 93.34 | 92.77 | 92.35 | 89.54 | 87.36 | 91.79 |