Chris Mair

Local machine translations between German and Italian

Recently, Enrico Pirozzi (of Learn PostgreSQL fame) pointed me to an intriguing set of open-source machine translation models.

These models, developed by the Language Technology Research Group at the University of Helsinki, support numerous language pairs and are available on Hugging Face:

https://huggingface.co/Helsinki-NLP

Coincidentally, I was exploring German ⇆ Italian machine translations, so I decided to give these models a try.

First, I needed to set up a Python virtual environment. I’m using a plain Ubuntu 22.04 machine:

sudo apt-get install python3-venv

python3 -m venv .venv
source .venv/bin/activate

pip install transformers torch sentencepiece sacremoses

This process installs 5 GB 😱 of Python libraries in a subdirectory called .venv. The large size is primarily due to the inclusion of torch (PyTorch).

Once the environment is ready, you can run either of two simple example scripts, mt-deit.py (German to Italian) and mt-itde.py (Italian to German), each of which translates an arbitrary paragraph.
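The scripts themselves are not reproduced in this text, so here is a minimal sketch of what such a script could look like. It assumes the models are published on Hugging Face under IDs of the form Helsinki-NLP/opus-mt-de-it and Helsinki-NLP/opus-mt-it-de, and it uses the Marian classes from transformers. The helper names model_name and translate are made up for illustration and are not taken from the original scripts.

```python
# Sketch only: the model ID scheme and the helper names are assumptions,
# not the contents of the original mt-deit.py / mt-itde.py.
HUB_ORG = "Helsinki-NLP"


def model_name(src: str, tgt: str) -> str:
    """Build a Hub model ID such as 'Helsinki-NLP/opus-mt-de-it'."""
    return f"{HUB_ORG}/opus-mt-{src}-{tgt}"


def translate(paragraphs, src="de", tgt="it"):
    """Translate a list of strings; the model is downloaded on first use."""
    # Imported here so the heavy libraries load only when translating.
    from transformers import MarianMTModel, MarianTokenizer

    name = model_name(src, tgt)
    tokenizer = MarianTokenizer.from_pretrained(name)
    model = MarianMTModel.from_pretrained(name)

    # Tokenize all paragraphs as one padded batch of PyTorch tensors.
    batch = tokenizer(paragraphs, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**batch)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling translate(["Guten Morgen!"]) would return a list containing one Italian string. Passing src="it", tgt="de" switches the direction.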

To run these:

source .venv/bin/activate
python3 mt-deit.py
python3 mt-itde.py

When you run these programs for the first time, they will download the relevant model (de-it or it-de). Each model is approximately 300 MB and will be cached in ~/.cache/huggingface/hub/.
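To see how much disk space the cached models (or the .venv directory) actually take, a small helper like the following may be useful. It is only a sketch: the function name dir_size_bytes is made up, and symbolic links are skipped on the assumption that the Hugging Face cache links snapshot entries to shared blob files, which would otherwise be counted twice.

```python
from pathlib import Path


def dir_size_bytes(root) -> int:
    """Sum the sizes of all regular files below root, skipping symlinks."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return 0
    total = 0
    for path in root.rglob("*"):
        # Skip symlinks so linked files are not counted a second time.
        if path.is_symlink() or not path.is_file():
            continue
        total += path.stat().st_size
    return total
```

For example, dir_size_bytes("~/.cache/huggingface/hub") / 1e6 gives the cache size in megabytes, and dir_size_bytes(".venv") does the same for the virtual environment.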
