Local machine translations between German and Italian
Recently, Enrico Pirozzi (of Learn PostgreSQL fame) pointed me to an intriguing set of open-source machine translation models.
These models, developed by the Language Technology Research Group at the University of Helsinki, support numerous language pairs and are available on Hugging Face:
https://huggingface.co/Helsinki-NLP
Coincidentally, I was exploring German ⇆ Italian machine translations, so I decided to give these models a try.
First, I needed to set up a Python virtual environment. I’m using a plain Ubuntu 22.04 machine:
sudo apt-get install python3-venv
python3 -m venv .venv
source .venv/bin/activate
pip install transformers torch sentencepiece sacremoses
This process installs 5 GB 😱 of Python libraries in a subdirectory called .venv. The large size is primarily due to the inclusion of torch (PyTorch).
Once the environment is ready, you can run one of these two simple examples that translate a random paragraph:
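As a rough illustration of what such a script can look like, here is a minimal sketch. It is not the original mt-deit.py: it assumes the model IDs Helsinki-NLP/opus-mt-de-it and Helsinki-NLP/opus-mt-it-de and the MarianMTModel and MarianTokenizer classes from transformers, and the helper names model_id and translate are made up for this example. Unlike the scripts above, it takes the text to translate from the command line.

```python
import sys


def model_id(src: str, tgt: str) -> str:
    """Build a Hugging Face model ID for a language pair, e.g. de -> it.

    Assumption: the pair follows the opus-mt-<src>-<tgt> naming scheme.
    Not every language pair is published under this pattern.
    """
    return f"Helsinki-NLP/opus-mt-{src}-{tgt}"


def translate(sentences, src="de", tgt="it"):
    """Translate a list of sentences; the model is downloaded on first use."""
    # Imported here so the slow torch/transformers import only happens
    # when a translation is actually requested.
    from transformers import MarianMTModel, MarianTokenizer

    name = model_id(src, tgt)
    tokenizer = MarianTokenizer.from_pretrained(name)
    model = MarianMTModel.from_pretrained(name)

    # Tokenize all sentences as one padded batch of PyTorch tensors.
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    output = model.generate(**batch)
    return tokenizer.batch_decode(output, skip_special_tokens=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Each command-line argument is treated as one German sentence.
    for line in translate(sys.argv[1:]):
        print(line)
```

For the opposite direction, calling translate(sentences, src="it", tgt="de") should be enough under the same naming assumption.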
To run these:
source .venv/bin/activate
python3 mt-deit.py
python3 mt-itde.py
When you run these programs for the first time, they will download the relevant model (de-it or it-de). Each model is approximately 300 MB and will be cached in ~/.cache/huggingface/hub/.
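To check how much disk space the virtual environment and the model cache actually take on your machine, something like the following sketch can help. It uses only the standard library. The function name dir_size_bytes is made up for this example, and the two paths are the ones mentioned in this post, so adjust them if your setup differs.

```python
from pathlib import Path


def dir_size_bytes(path) -> int:
    """Sum the sizes of all regular files below path, in bytes.

    Symlinks are skipped so that a file reachable both directly and
    through a link is only counted once.
    """
    root = Path(path).expanduser()
    return sum(
        f.stat().st_size
        for f in root.rglob("*")
        if not f.is_symlink() and f.is_file()
    )


if __name__ == "__main__":
    # Paths as used in this post; both are skipped if they do not exist.
    for location in (".venv", "~/.cache/huggingface/hub"):
        if Path(location).expanduser().is_dir():
            size_mb = dir_size_bytes(location) / 1_000_000
            print(f"{location}: {size_mb:.0f} MB")
```

Run it from the directory that contains .venv so that the relative path resolves.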