upcarta

On the Role of Incidental Bilingualism in PaLM’s Translation Capability

  • Paper
  • May 17, 2023
  • #NaturalLanguageProcessing
Eleftheria Briakou
@ebriakou
(Author)
arxiv.org
Large, multilingual language models exhibit surprisingly good zero- or few-shot machine translation capabilities, despite having never seen the intentionally-included translation examples provided to typical neural translation systems. We investigate the role of incidental bilingualism — the unintentional consumption of bilingual signals, including translation examples — in explaining the translation capabilities of large language models, taking the Pathways Language Model (PaLM) as a case study. We introduce a mixed-method approach to measure and understand incidental bilingualism at scale. We show that PaLM is exposed to over 30 million translation pairs across at least 44 languages. Furthermore, the amount of incidental bilingual content is highly correlated with the amount of monolingual in-language content for non-English languages. We relate incidental bilingual content to zero-shot prompts and show that it can be used to mine new prompts to improve PaLM's out-of-English zero-shot translation quality. Finally, in a series of small-scale ablations, we show that its presence has a substantial impact on translation capabilities, although this impact diminishes with model scale.
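The paper's pipeline for measuring incidental bilingualism at scale is more involved than can be shown here, but the core idea — flagging documents that mix text from more than one language — can be illustrated with a toy heuristic. The sketch below is an assumption for illustration only, not the authors' method: it uses Unicode script membership as a crude language proxy, so it can separate, say, English from Greek or Chinese, but not same-script pairs like English and French (which a real pipeline would handle with a language classifier).

```python
import unicodedata
from collections import Counter

def script_histogram(text):
    """Count alphabetic characters by coarse Unicode script,
    using the first word of each character's Unicode name
    (e.g. 'LATIN SMALL LETTER A' -> 'LATIN')."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts

def looks_bilingual(doc, min_share=0.05):
    """Toy detector: flag a document whose alphabetic characters
    span at least two scripts, each making up at least `min_share`
    of the alphabetic text. Misses same-script language pairs."""
    counts = script_histogram(doc)
    total = sum(counts.values())
    if total == 0:
        return False
    major = [s for s, c in counts.items() if c / total >= min_share]
    return len(major) >= 2
```

For example, a document interleaving English and Greek sentences would be flagged, while a purely English one would not. A production version of this idea would run per-sentence language identification and then look for aligned sentence pairs, which is closer in spirit to what the paper describes.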

Mentions
Lucas Beyer @giffmana · May 19, 2023
  • Post
  • From Twitter
See, LLMs don’t magically get skills out of thin air, as some papers suggest. This is a very nice paper taking a deep dive into one of them (translation skill) and it clearly comes from it being in the data. I think that’s great and a good motivator for training on everything!
  • upcarta ©2025