Translating Python texts for multi-language use
Background
Programming in Python is straightforward, and producing text output is easy.
Preparing an application for international use is not. However, there are good tools and references:
https://docs.python.org/3/library/i18n.html
https://en.wikipedia.org/wiki/Gettext
https://www.gnu.org/software/gettext/manual/gettext.html#Introduction
Installation
Download and install 64-bit Git for Windows Setup from https://git-scm.com/download/win
appi18n
Use our appi18n library from the tispy git repository, or download the files
appi18n\__init__.py
appi18n\agettext.py
appi18n\alocale.py
from https://ximes-jira2.atlassian.net/secure/bbb.gp.gitviewer.BrowseGit.jspa?repoId=17&branchName=master&path=lib%2Fappi18n and place them all in a subdirectory called appi18n in your Python application folder.
Setup locale
appi18n.setlocale(category=locale.LC_ALL, locale=None)
If locale is given and not None, setlocale() modifies the locale setting for the category. The available categories are listed at https://docs.python.org/3/library/locale.html#locale.LC_CTYPE.
locale may be a string, or an iterable of two strings (language code and encoding). If it’s an iterable, it’s converted to a locale name using the locale aliasing engine. An empty string specifies the user’s default settings. If the modification of the locale fails, the exception Error is raised. If successful, the new locale setting is returned.
Example:
import appi18n
import locale

appi18n.setlocale(locale='de-DE')  # returns 'de-DE'
locale.str(1.2345)  # returns '1,2345'
locale.format_string('%.2f%%', 1.2345 * 100)  # returns '123,45%'

## ALTERNATIVE
appi18n.setlocale(locale='')  # sets whatever has been set in the environment (e.g., TIS)
locale.str(1.2345)  # returns '1,2345' when locale setting is de-DE (i.e., LC_ALL='de-DE' environment variable is set)
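Since a failed locale change raises an exception, it can help to guard the call. The following is a minimal sketch that uses the standard library's locale.setlocale() as a stand-in, on the assumption that appi18n.setlocale() behaves the same way; the helper name set_locale_or_default and the locale name 'de_DE.UTF-8' are illustrative only, and locale names differ between platforms.

```python
import locale


def set_locale_or_default(name):
    """Try to activate `name`; fall back to the portable 'C' locale.

    Hypothetical helper, not part of appi18n.
    """
    try:
        return locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        # The requested locale is not installed on this machine.
        return locale.setlocale(locale.LC_ALL, 'C')


active = set_locale_or_default('de_DE.UTF-8')
formatted = locale.format_string('%.2f%%', 1.2345 * 100)
print(active)
print(formatted)  # decimal separator depends on which locale was activated
```

With the German locale active the decimal separator is a comma; after the fallback it is a period.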
Setup translation
appi18n.find(domain, localedir=None, languages=None, selectall=False)
This function implements the standard .mo file search algorithm. It takes a domain, identical to what gettext.textdomain() takes. Optional localedir is as in gettext.bindtextdomain(). Optional languages is a list of strings, where each string is a language code.
If localedir is not given, then the local appfolder or python zipapp file is used to look for .mo files. If languages is not given, then the following environment variables are searched: LANGUAGE, LC_ALL, LC_MESSAGES, and LANG. The first one returning a non-empty value is used for the languages variable. The environment variables should contain a colon-separated list of languages, which will be split on the colon to produce the expected list of language code strings.
appi18n.find() then expands and normalizes the languages, and then iterates through them, searching for an existing file built of these components:
localedir/language/LC_MESSAGES/domain.mo
The first such file name that exists is returned by appi18n.find(). If no such file is found, then None is returned. If selectall is given, it returns a list of all file names, in the order in which they appear in the languages list or the environment variables.
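The search described above can be tried out with the standard library's gettext.find(), used here as a stand-in on the assumption that appi18n.find() follows the same algorithm (the stdlib parameter all would then correspond to selectall). The domain name demo and the temporary directory are made up for this sketch.

```python
import gettext
import os
import tempfile

with tempfile.TemporaryDirectory() as localedir:
    # Create an empty placeholder at localedir/de/LC_MESSAGES/demo.mo.
    # find() only checks that the file exists; it does not parse it.
    target_dir = os.path.join(localedir, 'de', 'LC_MESSAGES')
    os.makedirs(target_dir)
    open(os.path.join(target_dir, 'demo.mo'), 'wb').close()

    # 'de_DE' is expanded to several candidates, one of which is plain 'de'.
    first_hit = gettext.find('demo', localedir, languages=['fr', 'de_DE'])
    all_hits = gettext.find('demo', localedir, languages=['fr', 'de_DE'], all=True)
    no_hit = gettext.find('demo', localedir, languages=['fr'])

print(first_hit)  # path ending in de/LC_MESSAGES/demo.mo
print(all_hits)   # list containing that single path
print(no_hit)     # None, because no French catalog exists
```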
appi18n.translation(domain, localedir=None, languages=None, class_=None, fallback=False, codeset=None)
Return a Translations instance based on the domain, localedir, and languages, which are first passed to appi18n.find() to get a list of the associated .mo file paths (located in the local appfolder / python zipapp file if localedir is None, otherwise under localedir). Instances with identical .mo file names are cached. The actual class instantiated is class_ if provided, otherwise gettext.GNUTranslations. The class’s constructor must take a single file object argument. If provided, codeset will change the charset used to encode translated strings in the gettext.GNUTranslations.lgettext() and gettext.GNUTranslations.lngettext() methods.
If multiple files are found, later files are used as fallbacks for earlier ones. To allow setting the fallback, copy.copy() is used to clone each translation object from the cache; the actual instance data is still shared with the cache.
If no .mo file is found, this function raises OSError if fallback is false (which is the default), and returns a gettext.NullTranslations instance if fallback is true.
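The effect of the fallback argument can be sketched with the standard library's gettext.translation(), again assuming that appi18n.translation() mirrors it. The empty temporary directory guarantees that no .mo file is found.

```python
import gettext
import tempfile

with tempfile.TemporaryDirectory() as empty_localedir:
    # fallback=True: no catalog is found, so a NullTranslations instance is returned.
    null = gettext.translation('gettexter', localedir=empty_localedir,
                               languages=['de'], fallback=True)

    # fallback=False (the default): the missing catalog raises OSError.
    try:
        gettext.translation('gettexter', localedir=empty_localedir, languages=['de'])
        raised = False
    except OSError:
        raised = True

print(type(null).__name__)
print(null.gettext('Hello translatable world.'))  # returned unchanged
print(raised)
```

A NullTranslations object simply returns every message as it was passed in, which is why fallback=True keeps an application usable before any translation exists.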
An example translation
Strings that should be translated are marked like this: _('text')
import appi18n

appi18n.translation('gettexter', languages=['de'], fallback=True).install()

print(_('Hello translatable world.'))
print(_('Hello untranslatable world.'))

xxx = _("This needs translation.")
print(xxx)

yyy = "This won't get translated."
print(yyy)
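The example never defines _ itself; the name becomes available through install(). The sketch below shows this mechanism with the standard library's gettext.NullTranslations, assuming the object returned by appi18n.translation() offers the same install() method.

```python
import builtins
import gettext

translations = gettext.NullTranslations()
translations.install()  # binds the name _ in the builtins namespace

# From here on, every module in the process can call _() without importing it.
greeting = _('Hello translatable world.')
print(greeting)
print(builtins._ == translations.gettext)
```

Because _ lives in builtins after install(), the call should happen once, early in the application's start-up code.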
Step | Execute | Produces |
Start of example | Running gettexter.py without translation: | Hello translatable world. |
Step 1: Extraction of texts | | Produces the template file from the strings marked with _('text'). In our example, the content of the template file includes: #: gettexter.py:5 #: gettexter.py:6 msgid "Hello untranslatable world." msgstr "" #: gettexter.py:8 msgid "This needs translation." msgstr "" |
Step 2: Translation | Edit with Visual Studio Code, Notepad++, or a similarly capable editor | Example: [...] "Content-Type: text/plain; charset=utf-8\n" #: gettexter.py:5 #: gettexter.py:6 #: gettexter.py:8 |
Step 3: Set encoding | Replace in your file (e.g. "Content-Type: text/plain; charset=CHARSET\n" with "Content-Type: text/plain; charset=utf-8\n" and save | |
Step 4: Compilation | Depending on file name, e.g. C:\>"c:\Program Files\Git\usr\bin\msgfmt.exe" --statistics --output= | 2 translated messages, 1 untranslated message. The binary file ( |
Now, running gettexter.py with translation file | | Produces: Hallo übersetzbare Welt. |
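To make the compilation step less of a black box, here is a rough pure-Python sketch of what a compiled catalog contains. It is not a replacement for msgfmt: the function build_mo is hypothetical, it handles no plural forms or contexts, and it skips entries with an empty msgstr, which matches the untranslated message in the statistics above. The result is loaded with the standard library's gettext.GNUTranslations.

```python
import gettext
import io
import struct


def build_mo(catalog):
    """Serialize {msgid: msgstr} into the bytes of a little-endian .mo file."""
    # Untranslated entries are left out; sorting puts the header entry ('') first.
    entries = sorted((msgid.encode('utf-8'), msgstr.encode('utf-8'))
                     for msgid, msgstr in catalog.items() if msgstr)
    count = len(entries)
    ids = b''.join(msgid + b'\0' for msgid, _msgstr in entries)
    strs = b''.join(msgstr + b'\0' for _msgid, msgstr in entries)

    id_pos = 7 * 4 + 16 * count  # 28-byte header plus two index tables
    str_pos = id_pos + len(ids)
    id_index, str_index = [], []
    for msgid, msgstr in entries:
        id_index += [len(msgid), id_pos]     # (length, offset) of the original
        str_index += [len(msgstr), str_pos]  # (length, offset) of the translation
        id_pos += len(msgid) + 1
        str_pos += len(msgstr) + 1

    # magic, version, entry count, offsets of both tables, empty hash table
    header = struct.pack('<7I', 0x950412DE, 0, count,
                         7 * 4, 7 * 4 + 8 * count, 0, 0)
    index = struct.pack('<%dI' % (4 * count), *id_index, *str_index)
    return header + index + ids + strs


catalog = {
    '': 'Content-Type: text/plain; charset=utf-8\n',  # header entry sets the charset
    'Hello translatable world.': 'Hallo übersetzbare Welt.',
    'Hello untranslatable world.': '',
    'This needs translation.': 'Das braucht eine Übersetzung.',
}
translations = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))
print(translations.gettext('Hello translatable world.'))
print(translations.gettext('Hello untranslatable world.'))
```

The header entry matters: without charset=utf-8 the loader would try to decode the German umlauts as ASCII and fail, which is the reason for Step 3.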
PO File Maintenance
Adapt gettexter.py from above by changing it to:
import appi18n

appi18n.translation('gettexter', languages=['de'], fallback=True).install()

print(_('Hello translatable world.'))
print(_('Hello untranslatable world.'))

xxx = _("This needs more translation.")
print(xxx)

yyy = _("This won't get translated.")
print(yyy)
xxx has been changed to _("This needs more translation.")
yyy has been changed to _("This won't get translated.")
Now we can update the German catalog locale\de\LC_MESSAGES\gettexter.po as follows:
Execute xgettext again and overwrite messages.pot:
> "c:\Program Files\Git\usr\bin\xgettext.exe" --from-code=utf-8 --language=python --output=messages.pot gettexter.py
messages.pot now contains:
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2018-09-28 14:19+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"
#: gettexter.py:5
msgid "Hello translatable world."
msgstr ""
#: gettexter.py:6
msgid "Hello untranslatable world."
msgstr ""
#: gettexter.py:8
msgid "This needs more translation."
msgstr ""
#: gettexter.py:11
msgid "This won't get translated."
msgstr ""
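The "#: gettexter.py:N" references in the template point at the lines where marked strings occur. As an illustration only (xgettext has its own parser and handles many more cases), the following sketch finds the same calls with Python's ast module; the function name marked_strings is made up for this example.

```python
import ast

SOURCE = '''import appi18n

appi18n.translation('gettexter', languages=['de'], fallback=True).install()

print(_('Hello translatable world.'))
print(_('Hello untranslatable world.'))

xxx = _("This needs more translation.")
print(xxx)

yyy = _("This won't get translated.")
print(yyy)
'''


def marked_strings(source):
    """Return sorted (line number, msgid) pairs for every _('literal') call."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == '_'
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append((node.lineno, node.args[0].value))
    return sorted(found)


for lineno, msgid in marked_strings(SOURCE):
    print('#: gettexter.py:%d  msgid "%s"' % (lineno, msgid))
```

The source is only parsed, never executed, so appi18n does not need to be installed to run the sketch.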
Merge the new messages.pot with the old locale\de\LC_MESSAGES\gettexter.po:
> "c:\Program Files\Git\usr\bin\msgmerge.exe" --backup=t --update locale\de\LC_MESSAGES\gettexter.po messages.pot
... done.
Now locale\de\LC_MESSAGES\gettexter.po has been updated and contains:
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2018-09-28 14:19+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
#: gettexter.py:5
msgid "Hello translatable world."
msgstr "Hallo übersetzbare Welt."
#: gettexter.py:6
msgid "Hello untranslatable world."
msgstr ""
#: gettexter.py:8
#, fuzzy
msgid "This needs more translation."
msgstr "Das braucht eine Übersetzung."
#: gettexter.py:11
#, fuzzy
msgid "This won't get translated."
msgstr "Das braucht eine Übersetzung."
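The fuzzy entries in the listing above arise because msgmerge looks for an old msgid that resembles each new one and carries its translation over as a suggestion. msgmerge uses its own similarity measure; the sketch below only illustrates the idea with the standard library's difflib, so its scores should not be expected to match msgmerge's decisions.

```python
import difflib

# msgids present in the old gettexter.po
old_msgids = [
    'Hello translatable world.',
    'Hello untranslatable world.',
    'This needs translation.',
]

# A msgid that exists only in the new messages.pot
new_msgid = 'This needs more translation.'

# Pick the most similar old msgid; its msgstr would be reused and flagged fuzzy.
closest = difflib.get_close_matches(new_msgid, old_msgids, n=1)
print(closest)
```

The reused German text still says "Das braucht eine Übersetzung." although the English source changed, which is exactly why a human needs to review fuzzy entries.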
- Note that entries marked with "#, fuzzy" are not in sync with gettexter.py and potentially need fixing.
Let's pretend that the above locale\de\LC_MESSAGES\gettexter.po is fine and we just want to create locale\de\LC_MESSAGES\gettexter.mo:
> "c:\Program Files\Git\usr\bin\msgfmt.exe" --statistics --output=locale\de\LC_MESSAGES\gettexter.mo locale\de\LC_MESSAGES\gettexter.po
1 translated message, 2 fuzzy translations, 1 untranslated message.
Now we can run gettexter.py again and get:
>python gettexter.py
Hallo übersetzbare Welt.
Hello untranslatable world.
This needs more translation.
This won't get translated.
- Only the first entry is translated: the two fuzzy translations are ignored, and one entry has an empty translation.
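The rule behind this result can be sketched in a few lines: an entry is used only if it is not flagged fuzzy and has a non-empty msgstr. The parser below is a deliberately simplified assumption (single-line msgid and msgstr, no escape sequences, no header entry) and the function name classify is made up; it is not how msgfmt is implemented.

```python
PO_TEXT = '''\
#: gettexter.py:5
msgid "Hello translatable world."
msgstr "Hallo übersetzbare Welt."

#: gettexter.py:6
msgid "Hello untranslatable world."
msgstr ""

#: gettexter.py:8
#, fuzzy
msgid "This needs more translation."
msgstr "Das braucht eine Übersetzung."

#: gettexter.py:11
#, fuzzy
msgid "This won't get translated."
msgstr "Das braucht eine Übersetzung."
'''


def classify(po_text):
    """Count entries per state and collect the ones that would be used."""
    counts = {'translated': 0, 'fuzzy': 0, 'untranslated': 0}
    usable = {}
    fuzzy = False
    msgid = None
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith('#,'):
            fuzzy = 'fuzzy' in line          # flag applies to the next entry
        elif line.startswith('msgid '):
            msgid = line[len('msgid '):][1:-1]   # drop the surrounding quotes
        elif line.startswith('msgstr '):
            msgstr = line[len('msgstr '):][1:-1]
            if fuzzy:
                counts['fuzzy'] += 1
            elif not msgstr:
                counts['untranslated'] += 1
            else:
                counts['translated'] += 1
                usable[msgid] = msgstr
            fuzzy = False                    # reset for the following entry
    return counts, usable


counts, usable = classify(PO_TEXT)
print(counts)
print(usable)
```

The counts agree with the msgfmt statistics shown above: one translated message, two fuzzy translations, and one untranslated message.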
So we fix locale\de\LC_MESSAGES\gettexter.po now as follows:
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2018-09-17 13:12+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
#: gettexter.py:13
msgid "Hello translatable world."
msgstr "Hallo übersetzbare Welt."
#: gettexter.py:14
msgid "Hello untranslatable world."
msgstr ""
#: gettexter.py:16
msgid "This needs more translation."
msgstr "Das braucht eine weitere Übersetzung."
#: gettexter.py:19
#, fuzzy
msgid "This won't get translated."
msgstr "Das braucht eine Übersetzung."
Note that we have removed "#, fuzzy" from the third translation and fixed the string, but left the fourth translation as is. After invoking msgfmt we obtain:
> "c:\Program Files\Git\usr\bin\msgfmt.exe" --statistics --output=locale\de\LC_MESSAGES\gettexter.mo locale\de\LC_MESSAGES\gettexter.po
2 translated messages, 1 fuzzy translation, 1 untranslated message.
Now if we run gettexter.py we get the output:
>python gettexter.py
Hallo übersetzbare Welt.
Hello untranslatable world.
Das braucht eine weitere Übersetzung.
This won't get translated.