Background

Programming in Python is straightforward, and it is easy to produce text output.
Preparing an application for international use is harder, but good tools exist:

https://docs.python.org/3/library/i18n.html

https://en.wikipedia.org/wiki/Gettext

https://www.gnu.org/software/gettext/manual/gettext.html#Introduction

Installation

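Download and install the 64-bit Git for Windows Setup from https://git-scm.com/download/win. It ships the gettext command-line tools used below (xgettext.exe, msgfmt.exe, msgmerge.exe) in C:\Program Files\Git\usr\bin.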

appi18n

Use our appi18n library from the tispy git repository, or download

  • appi18n\__init__.py
  • appi18n\agettext.py
  • appi18n\alocale.py

from https://ximes-jira2.atlassian.net/secure/bbb.gp.gitviewer.BrowseGit.jspa?repoId=17&branchName=master&path=lib%2Fappi18n and place all files into a subdirectory called appi18n in your python application folder.

Setup locale

appi18n
appi18n.setlocale(category=locale.LC_ALL, locale=None)

If locale is given and not None, setlocale() modifies the locale setting for the category. The available categories are listed at https://docs.python.org/3/library/locale.html#locale.LC_CTYPE. locale may be a string, or an iterable of two strings (language code and encoding). If it’s an iterable, it’s converted to a locale name using the locale aliasing engine. An empty string specifies the user’s default settings. If the modification of the locale fails, the exception Error is raised. If successful, the new locale setting is returned.

Example:

setlocale
import appi18n
import locale
appi18n.setlocale(locale='de-DE') # returns 'de-DE'
locale.str(1.2345) # returns '1,2345'
locale.format_string('%.2f%%', 1.2345 * 100) # returns '123,45%'
 
## ALTERNATIVE
appi18n.setlocale(locale='') # sets whatever has been set in the environment (e.g., TIS)
locale.str(1.2345) # returns '1,2345' when locale setting is de-DE (i.e., LC_ALL='de-DE' environment variable is set)
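
The formatting calls above come from the standard-library locale module, so the effect can be tried without appi18n. The following is a minimal sketch under that assumption; the helper name first_available_locale and the list of candidate locale names are hypothetical and only illustrate that German locale names are spelled differently on different platforms.

```python
import locale


def first_available_locale(candidates):
    """Try each locale name in turn; return the accepted setting, or None."""
    for name in candidates:
        try:
            return locale.setlocale(locale.LC_ALL, name)
        except locale.Error:
            continue  # this spelling is not known on the current platform
    return None


# Assumed candidate names: Windows style first, then POSIX style, then the "C" fallback.
active = first_available_locale(["de-DE", "de_DE.UTF-8", "de_DE", "C"])
print(active)
print(locale.str(1.2345))  # decimal comma only if a German locale was accepted
print(locale.format_string("%.2f%%", 1.2345 * 100))
```

If none of the German spellings is installed, the sketch ends up in the "C" locale and prints a decimal point instead of a comma.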

Setup translation

appi18n.find
appi18n.find(domain, localedir=None, languages=None, selectall=False)

This function implements the standard .mo file search algorithm. It takes a domain, identical to what gettext.textdomain() takes. Optional localedir is as in gettext.bindtextdomain(). Optional languages is a list of strings, where each string is a language code. If localedir is not given, then the local appfolder or python zipapp file is used to look for .mo files. If languages is not given, then the following environment variables are searched:

LANGUAGE, LC_ALL, LC_MESSAGES, and LANG.

The first one returning a non-empty value is used for the languages variable. The environment variables should contain a colon separated list of languages, which will be split on the colon to produce the expected list of language code strings. appi18n.find() then expands and normalizes the languages, and then iterates through them, searching for an existing file built of these components:

localedir/language/LC_MESSAGES/domain.mo

The first such file name that exists is returned by appi18n.find(). If no such file is found, then None is returned. If selectall is given, it returns a list of all file names, in the order in which they appear in the languages list or the environment variables.
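
The search order described above can be sketched in a few lines. This is a simplified illustration, not the appi18n implementation: it skips the expansion and normalization of language codes, requires an explicit localedir, and the function name find_mo is hypothetical.

```python
import os
import tempfile


def find_mo(domain, localedir, languages=None, selectall=False):
    """Simplified .mo search without expansion or normalization of languages."""
    if languages is None:
        languages = []
        for var in ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG"):
            value = os.environ.get(var)
            if value:  # the first non-empty variable wins
                languages = value.split(":")
                break
    matches = []
    for lang in languages:
        candidate = os.path.join(localedir, lang, "LC_MESSAGES", domain + ".mo")
        if os.path.exists(candidate):
            if not selectall:
                return candidate
            matches.append(candidate)
    return matches if selectall else None


# Usage with a throwaway directory that contains only a German catalog
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "de", "LC_MESSAGES")
    os.makedirs(target)
    open(os.path.join(target, "gettexter.mo"), "wb").close()
    print(find_mo("gettexter", root, languages=["fr", "de"]))  # path of the German file
    print(find_mo("gettexter", root, languages=["fr"]))        # nothing found
```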


appi18n.translation
appi18n.translation(domain, localedir=None, languages=None, class_=None, fallback=False, codeset=None)

Return a Translations instance based on the domain, localedir, and languages, which are first passed to appi18n.find() to get a list of the associated .mo file paths (either located in the local appfolder / python zipapp file if localedir is None, otherwise located under localedir). Instances with identical .mo file names are cached. The actual class instantiated is either class_ if provided, otherwise gettext.GNUTranslations. The class’s constructor must take a single file object argument. If provided, codeset will change the charset used to encode translated strings in the gettext.GNUTranslations.lgettext() and gettext.GNUTranslations.lngettext() methods. If multiple files are found, later files are used as fallbacks for earlier ones. To allow setting the fallback, copy.copy() is used to clone each translation object from the cache; the actual instance data is still shared with the cache. If no .mo file is found, this function raises OSError if fallback is false (which is the default), and returns a gettext.NullTranslations instance if fallback is true.
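
The effect of the fallback parameter can be tried with gettext.translation from the standard library, which takes the same domain, localedir, languages and fallback arguments. The sketch below assumes that appi18n.translation behaves the same way in this respect; it uses an empty temporary directory so that no .mo file can be found.

```python
import gettext
import tempfile

with tempfile.TemporaryDirectory() as empty_localedir:
    # fallback=True: no catalog is found, so a NullTranslations object is returned
    null = gettext.translation("gettexter", localedir=empty_localedir,
                               languages=["de"], fallback=True)
    print(type(null).__name__)
    print(null.gettext("Hello translatable world."))  # msgid comes back unchanged

    # fallback=False (the default): the missing catalog raises OSError
    try:
        gettext.translation("gettexter", localedir=empty_localedir,
                            languages=["de"])
    except OSError as exc:
        print("no catalog:", exc)
```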

An example translation

Strings that should be translated are marked like this: _('text')

gettexter.py
import appi18n
 
appi18n.translation('gettexter', languages=['de'], fallback=True).install()
 
print(_('Hello translatable world.'))
print(_('Hello untranslatable world.'))
 
xxx = _("This needs translation.")
print(xxx)
 
yyy = "This won't get translated."
print(yyy)
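
The script above never imports or defines _. With the standard gettext classes, install() binds the object's gettext method to the name _ in Python's builtins, which makes it visible in every module. The sketch below shows this with gettext.NullTranslations as a stand-in; that the object returned by appi18n.translation installs itself the same way is an assumption.

```python
import builtins
import gettext

# NullTranslations stands in for a real catalog; it returns every msgid unchanged.
gettext.NullTranslations().install()

print(hasattr(builtins, "_"))          # the name _ now lives in builtins
print(_("Hello translatable world."))  # resolved through builtins, no import needed
```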


Start of example

Running gettexter.py without translation:

C:\>python gettexter.py

Hello translatable world.
Hello untranslatable world.
This needs translation.
This won't get translated.

Step 1: Extraction of Texts

C:\>"c:\Program Files\Git\usr\bin\xgettext.exe"  --from-code=utf-8 --language=python --output=messages.pot gettexter.py


 


This produces the template file C:\messages.pot, which contains all translatable strings from gettexter.py that are written like this:

_('text')

In our example, the content of messages.pot is:

[...]

#: gettexter.py:5
msgid "Hello translatable world."
msgstr ""


#: gettexter.py:6
msgid "Hello untranslatable world."
msgstr ""

#: gettexter.py:8
msgid "This needs translation."
msgstr ""
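
To illustrate which strings end up in the template and where the line references come from, the following sketch collects _('text') calls with Python's ast module. It is only an illustration of the idea, not how xgettext works internally; the function name marked_strings is hypothetical, and quoting of special characters in msgid is ignored.

```python
import ast

SOURCE = '''import appi18n

appi18n.translation('gettexter', languages=['de'], fallback=True).install()

print(_('Hello translatable world.'))
print(_('Hello untranslatable world.'))

xxx = _("This needs translation.")
print(xxx)

yyy = "This won't get translated."
print(yyy)
'''


def marked_strings(source):
    """Return (line, text) for every call of the form _('literal')."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name) and node.func.id == "_"
                and len(node.args) == 1
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append((node.lineno, node.args[0].value))
    return sorted(found)


for line, text in marked_strings(SOURCE):
    print(f"#: gettexter.py:{line}")
    print(f'msgid "{text}"')
    print('msgstr ""\n')
```

The string assigned to yyy is not reported because it is not wrapped in _().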

Step 2: Translation

  • copy messages.pot to locale\de\LC_MESSAGES\gettexter.po or locale\en\LC_MESSAGES\gettexter.po (de = German, en = English; .po = Portable Object)
  • add translations to the respective msgstr
  • save in UTF-8 encoding

Edit the file with Visual Studio Code, Notepad++ or a similarly capable editor.
Avoid Windows Notepad, which does not reliably save plain UTF-8 (older versions prepend a byte order mark).

Example:

[...]

"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: gettexter.py:5
msgid "Hello translatable world."
msgstr "Hallo übersetzbare Welt."


#: gettexter.py:6
msgid "Hello untranslatable world."
msgstr ""


#: gettexter.py:8
msgid "This needs translation."
msgstr "Das braucht eine Übersetzung."

Step 3: Set encoding to UTF-8

Replace in your file (e.g. locale\de\LC_MESSAGES\gettexter.po): 

"Content-Type: text/plain; charset=CHARSET\n"

with

"Content-Type: text/plain; charset=utf-8\n"

and save locale\de\LC_MESSAGES\gettexter.po in UTF-8 encoding.
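
The replacement can also be scripted. The sketch below assumes the .po file already contains only UTF-8 (or plain ASCII) text; the function name set_po_charset is hypothetical. It rewrites the header placeholder and saves the file as UTF-8 without a byte order mark.

```python
import tempfile
from pathlib import Path


def set_po_charset(po_path, charset="utf-8"):
    """Replace the CHARSET placeholder in the header and save as UTF-8 without BOM."""
    path = Path(po_path)
    text = path.read_bytes().decode("utf-8-sig")  # drops a leading BOM if present
    text = text.replace("charset=CHARSET", "charset=" + charset, 1)
    path.write_bytes(text.encode("utf-8"))
    return text


# Usage with a throwaway file
with tempfile.TemporaryDirectory() as root:
    po = Path(root) / "gettexter.po"
    po.write_text('"Content-Type: text/plain; charset=CHARSET\\n"\n', encoding="utf-8")
    print(set_po_charset(po), end="")
```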

Step 4: Compilation

Compile the .po file into a .mo file, e.g. for locale\de\LC_MESSAGES\gettexter.po:

C:\>"c:\Program Files\Git\usr\bin\msgfmt.exe" --statistics --output=locale\de\LC_MESSAGES\gettexter.mo locale\de\LC_MESSAGES\gettexter.po


2 translated messages, 1 untranslated message.

This produces the binary file locale\de\LC_MESSAGES\gettexter.mo (.mo = Machine Object).

Now, running gettexter.py with the translation file locale\de\LC_MESSAGES\gettexter.mo in place:

C:\>python gettexter.py

produces:

Hallo übersetzbare Welt.
Hello untranslatable world.
Das braucht eine Übersetzung.
This won't get translated.
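
The output above can be reproduced without msgfmt.exe to see how the lookup behaves. The sketch below writes a minimal catalog in the GNU .mo layout in memory and loads it with gettext.GNUTranslations from the standard library. The function build_mo is a hypothetical stand-in for msgfmt that handles only simple entries (no plural forms, no message contexts); like msgfmt, it leaves entries with an empty msgstr out of the catalog.

```python
import gettext
import io
import struct


def build_mo(catalog):
    """Serialize {msgid: msgstr} into the little-endian GNU .mo layout."""
    # Skip untranslated entries; the empty msgid carries the header.
    items = sorted((k.encode("utf-8"), v.encode("utf-8"))
                   for k, v in catalog.items() if v)
    count = len(items)
    ids_table = 7 * 4                    # msgid index follows the 7-word file header
    strs_table = ids_table + count * 8   # msgstr index follows the msgid index
    offset = strs_table + count * 8      # first byte of the string data
    id_index, id_data = b"", b""
    for msgid, _msgstr in items:
        id_index += struct.pack("<II", len(msgid), offset + len(id_data))
        id_data += msgid + b"\0"
    offset += len(id_data)
    str_index, str_data = b"", b""
    for _msgid, msgstr in items:
        str_index += struct.pack("<II", len(msgstr), offset + len(str_data))
        str_data += msgstr + b"\0"
    header = struct.pack("<7I", 0x950412DE, 0, count, ids_table, strs_table, 0, 0)
    return header + id_index + str_index + id_data + str_data


catalog = {
    "": "Content-Type: text/plain; charset=utf-8\n",
    "Hello translatable world.": "Hallo übersetzbare Welt.",
    "Hello untranslatable world.": "",
    "This needs translation.": "Das braucht eine Übersetzung.",
}
translations = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))
for text in ("Hello translatable world.", "Hello untranslatable world.",
             "This needs translation.", "This won't get translated."):
    print(translations.gettext(text))
```

A msgid that is missing from the catalog is returned unchanged, which is why the second and fourth lines stay in English.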

PO File Maintenance

Adapt gettexter.py from above by changing it to:

gettexter.py
import appi18n
 
appi18n.translation('gettexter', languages=['de'], fallback=True).install()
 
print(_('Hello translatable world.'))
print(_('Hello untranslatable world.'))
 
xxx = _("This needs more translation.")
print(xxx)
 
yyy = _("This won't get translated.")
print(yyy)

xxx has been changed to _("This needs more translation.")

yyy has been changed to _("This won't get translated.")


Now we can update locale\de\LC_MESSAGES\gettexter.po as follows:

  1. Run xgettext again, overwriting messages.pot:

    >"c:\Program Files\Git\usr\bin\xgettext.exe" --from-code=utf-8 --language=python --output=messages.pot gettexter.py
  2. messages.pot now contains

    messages.pot
    # SOME DESCRIPTIVE TITLE.
    # Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
    # This file is distributed under the same license as the PACKAGE package.
    # FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
    #
    #, fuzzy
    msgid ""
    msgstr ""
    "Project-Id-Version: PACKAGE VERSION\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2018-09-28 14:19+0200\n"
    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
    "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
    "Language-Team: LANGUAGE <LL@li.org>\n"
    "Language: \n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=CHARSET\n"
    "Content-Transfer-Encoding: 8bit\n"
     
    #: gettexter.py:5
    msgid "Hello translatable world."
    msgstr ""
     
    #: gettexter.py:6
    msgid "Hello untranslatable world."
    msgstr ""
     
    #: gettexter.py:8
    msgid "This needs more translation."
    msgstr ""
     
    #: gettexter.py:11
    msgid "This won't get translated."
    msgstr ""
  3. Merge new messages.pot with old locale\de\LC_MESSAGES\gettexter.po

    "c:\Program Files\Git\usr\bin\msgmerge.exe" --backup=t --update locale\de\LC_MESSAGES\gettexter.po messages.pot
    ... done.
  4. Now locale\de\LC_MESSAGES\gettexter.po has been updated and contains:

    locale\de\LC_MESSAGES\gettexter.po
    # SOME DESCRIPTIVE TITLE.
    # Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
    # This file is distributed under the same license as the PACKAGE package.
    # FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
    #
    #, fuzzy
    msgid ""
    msgstr ""
    "Project-Id-Version: PACKAGE VERSION\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2018-09-28 14:19+0200\n"
    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
    "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
    "Language-Team: LANGUAGE <LL@li.org>\n"
    "Language: \n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: 8bit\n"
     
    #: gettexter.py:5
    msgid "Hello translatable world."
    msgstr "Hallo übersetzbare Welt."
     
    #: gettexter.py:6
    msgid "Hello untranslatable world."
    msgstr ""
     
    #: gettexter.py:8
    #, fuzzy
    msgid "This needs more translation."
    msgstr "Das braucht eine Übersetzung."
     
    #: gettexter.py:11
    #, fuzzy
    msgid "This won't get translated."
    msgstr "Das braucht eine Übersetzung."
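  5. Note that entries marked with "#, fuzzy" were matched only approximately by msgmerge: their msgstr was taken from a similar, older msgid, so they are out of sync with gettexter.py and need to be reviewed.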

  6. Let's pretend that the file locale\de\LC_MESSAGES\gettexter.po above is fine and we just want to create locale\de\LC_MESSAGES\gettexter.mo. We execute:

    >"c:\Program Files\Git\usr\bin\msgfmt.exe" --statistics --output=locale\de\LC_MESSAGES\gettexter.mo locale\de\LC_MESSAGES\gettexter.po
    1 translated message, 2 fuzzy translations, 1 untranslated message.
  7. Now we can run gettexter.py again and get:

    >python gettexter.py
    Hallo übersetzbare Welt.
    Hello untranslatable world.
    This needs more translation.
    This won't get translated.
  8. Only the first entry is translated. msgfmt leaves the two fuzzy entries out of the .mo file, and the entry with an empty msgstr has no translation, so these three strings appear in the original English.

  9. So we fix locale\de\LC_MESSAGES\gettexter.po now as follows:

    locale\de\LC_MESSAGES\gettexter.po
    # SOME DESCRIPTIVE TITLE.
    # Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
    # This file is distributed under the same license as the PACKAGE package.
    # FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
    #
    #, fuzzy
    msgid ""
    msgstr ""
    "Project-Id-Version: PACKAGE VERSION\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2018-09-28 14:19+0200\n"
    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
    "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
    "Language-Team: LANGUAGE <LL@li.org>\n"
    "Language: \n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: 8bit\n"
     
    #: gettexter.py:5
    msgid "Hello translatable world."
    msgstr "Hallo übersetzbare Welt."
     
    #: gettexter.py:6
    msgid "Hello untranslatable world."
    msgstr ""
     
    #: gettexter.py:8
    msgid "This needs more translation."
    msgstr "Das braucht eine weitere Übersetzung."
     
    #: gettexter.py:11
    #, fuzzy
    msgid "This won't get translated."
    msgstr "Das braucht eine Übersetzung."
  10. Note that we have removed "#, fuzzy" from the third entry and corrected its translation, but left the fourth entry as is. After invoking msgfmt we obtain:

    >"c:\Program Files\Git\usr\bin\msgfmt.exe" --statistics --output=locale\de\LC_MESSAGES\gettexter.mo locale\de\LC_MESSAGES\gettexter.po
    2 translated messages, 1 fuzzy translation, 1 untranslated message.
  11. Now if we run gettexter.py we get the output

    >python gettexter.py
    Hallo übersetzbare Welt.
    Hello untranslatable world.
    Das braucht eine weitere Übersetzung.
    This won't get translated.
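
The counts that msgfmt --statistics reports in the steps above can be reproduced with a small parser. This is a minimal sketch that only handles single-line msgid and msgstr entries as they appear in this example; the function name po_statistics is hypothetical. The header entry (empty msgid) is not counted.

```python
PO_TEXT = '''
#, fuzzy
msgid ""
msgstr ""
"Content-Type: text/plain; charset=utf-8\\n"

#: gettexter.py:5
msgid "Hello translatable world."
msgstr "Hallo übersetzbare Welt."

#: gettexter.py:6
msgid "Hello untranslatable world."
msgstr ""

#: gettexter.py:8
#, fuzzy
msgid "This needs more translation."
msgstr "Das braucht eine Übersetzung."

#: gettexter.py:11
#, fuzzy
msgid "This won't get translated."
msgstr "Das braucht eine Übersetzung."
'''


def po_statistics(po_text):
    """Count (translated, fuzzy, untranslated) entries in simple PO text."""
    translated = fuzzy = untranslated = 0
    is_fuzzy = False
    msgid = None
    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("#,") and "fuzzy" in line:
            is_fuzzy = True              # flag applies to the next entry
        elif line.startswith("msgid "):
            msgid = line[len("msgid "):].strip('"')
        elif line.startswith("msgstr ") and msgid is not None:
            msgstr = line[len("msgstr "):].strip('"')
            if msgid == "":
                pass                     # header entry, not counted
            elif not msgstr:
                untranslated += 1
            elif is_fuzzy:
                fuzzy += 1
            else:
                translated += 1
            is_fuzzy, msgid = False, None
    return translated, fuzzy, untranslated


print(po_statistics(PO_TEXT))  # (1, 2, 1), as reported after the merge
```

Removing the "#, fuzzy" line from the third entry changes the result to (2, 1, 1), which matches the second msgfmt run.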