Motivation: Data-Lexicon Mismatch
For low-resource languages that lack labeled data, we can use bilingual lexicons to translate existing labeled task data from high-resource languages into low-resource languages, word by word.
We observe that the words in existing task data often have low lexical overlap with the words in task-agnostic bilingual lexicons.
Therefore, we propose generating lexicon-compatible task data (Figure (a)) for translation into low-resource languages. This increases the number of words translated (Figure (b), left) and maximizes the use of the semantic information in bilingual lexicons (Figure (b), right).
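To make the mismatch concrete, the following is a minimal sketch of word-to-word translation and of one simple way to measure lexical overlap. It is an illustration under stated assumptions, not the method's actual implementation: the function names, the whitespace tokenization, the lowercase lookup, and the toy lexicon entries (placeholder strings, not real translations) are all hypothetical.

```python
def word_translate(tokens, lexicon):
    """Replace each token that has a lexicon entry; leave the others unchanged."""
    return [lexicon.get(tok.lower(), tok) for tok in tokens]


def lexicon_coverage(tokens, lexicon):
    """Fraction of tokens that have an entry in the bilingual lexicon."""
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok.lower() in lexicon)
    return hits / len(tokens)


# Hypothetical toy lexicon: the target-side strings are placeholders.
toy_lexicon = {"good": "tgt_good", "movie": "tgt_movie", "very": "tgt_very"}

# Naive whitespace tokenization, assumed here for simplicity.
tokens = "The movie was very good".split()

translated = word_translate(tokens, toy_lexicon)
coverage = lexicon_coverage(tokens, toy_lexicon)
```

Tokens without a lexicon entry ("The" and "was" here) stay in the source language, which is why task data with low lexicon overlap translates poorly and why data generated to be lexicon-compatible should translate more completely.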