Languages
rigour.langs
Language code handling
This library normalises the ISO 639 codes used to describe languages, converting two-letter codes to three-letter codes and vice versa.
import rigour.langs as languagecodes
assert 'eng' == languagecodes.iso_639_alpha3('en')
assert 'eng' == languagecodes.iso_639_alpha3('ENG ')
assert 'en' == languagecodes.iso_639_alpha2('ENG ')
Uses data from: https://iso639-3.sil.org/
See also: https://www.loc.gov/standards/iso639-2/php/code_list.php
LangStr
Bases: str
A type of string that includes language metadata. This is useful for handling multilingual content.
The class does not override any operators or functions, so instances behave like regular strings.
Source code in rigour/langs/text.py
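To illustrate what a string subclass that carries language metadata looks like, here is a minimal sketch. `TaggedStr` and its `lang` attribute are hypothetical names chosen for this example; they are not taken from `rigour/langs/text.py`, and the real `LangStr` constructor may differ.

```python
from typing import Optional


class TaggedStr(str):
    """Hypothetical stand-in for a language-tagged string (not rigour's LangStr)."""

    lang: Optional[str]

    def __new__(cls, content: str, lang: Optional[str] = None) -> "TaggedStr":
        # str is immutable, so the extra attribute is attached in __new__.
        instance = super().__new__(cls, content)
        instance.lang = lang
        return instance


title = TaggedStr("Hello", lang="eng")

# No operators are overridden, so comparison and hashing work as for str.
same_as_plain = title == "Hello"

# Built-in str methods return plain str objects, so the tag is not carried over.
upper = title.upper()
```

One consequence of not overriding any string methods is visible in the last line: derived values such as `upper` are ordinary `str` objects, so the language tag has to be re-attached by the caller if it is still needed.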
is_lang_better(candidate, baseline)
Decide if the candidate language code is 'better' than the baseline language code, according to the preferred languages list.
>>> is_lang_better('eng', 'deu')
True
>>> is_lang_better('fra', 'eng')
False
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| candidate | str | The candidate language code. | required |
| baseline | str | The baseline language code. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if the candidate is better than the baseline. |
Source code in rigour/langs/__init__.py
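A comparison against a preferred languages list can be sketched as a rank lookup. The list contents, the names `PREFERRED_LANGS`, `rank` and `is_better`, and the handling of codes that are not on the list are all assumptions made for this example; the library's actual list and tie-breaking rules are not shown in this page.

```python
from typing import Sequence

# Hypothetical preference order, most preferred first.
PREFERRED_LANGS: Sequence[str] = ("eng", "fra", "deu")


def rank(code: str, preferred: Sequence[str] = PREFERRED_LANGS) -> int:
    """Position of the code in the list; unlisted codes rank after all listed ones."""
    try:
        return list(preferred).index(code)
    except ValueError:
        return len(preferred)


def is_better(candidate: str, baseline: str) -> bool:
    """True only if the candidate ranks strictly ahead of the baseline."""
    return rank(candidate) < rank(baseline)
```

With this assumed ordering, `is_better('eng', 'deu')` is true and `is_better('fra', 'eng')` is false, matching the documented examples. Equal codes and pairs of unlisted codes compare as not better, which is a design choice of the sketch.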
iso_639_alpha2(code)
Convert a language identifier to an ISO 639 Part 1 code, such as "en"
or "de". For languages that do not have a two-letter identifier, and for
invalid language codes, None is returned.
Source code in rigour/langs/__init__.py
iso_639_alpha3(code)
Convert a given language identifier into an ISO 639 Part 2 code, such
as "eng" or "deu". This will accept language codes in the two- or three-
letter format, and some language names. If the given string cannot be
converted, None will be returned.
>>> iso_639_alpha3('en')
'eng'
Source code in rigour/langs/__init__.py
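The normalisation described for both conversion functions can be approximated with a lookup table plus whitespace and case cleanup. This is a sketch under stated assumptions: the four-entry table and the helper names `to_alpha3` and `to_alpha2` are hypothetical, the real data comes from the ISO 639-3 tables, and language-name matching is left out.

```python
from typing import Dict, Optional

# Tiny illustrative sample. Hawaiian ("haw") has no two-letter code.
ALPHA3_TO_ALPHA2: Dict[str, Optional[str]] = {
    "eng": "en",
    "deu": "de",
    "fra": "fr",
    "haw": None,
}
ALPHA2_TO_ALPHA3: Dict[str, str] = {
    a2: a3 for a3, a2 in ALPHA3_TO_ALPHA2.items() if a2 is not None
}


def to_alpha3(code: str) -> Optional[str]:
    """Normalise a two- or three-letter code to three letters, or return None."""
    norm = code.strip().lower()
    if norm in ALPHA3_TO_ALPHA2:
        return norm
    return ALPHA2_TO_ALPHA3.get(norm)


def to_alpha2(code: str) -> Optional[str]:
    """Normalise to two letters; None if unknown or no two-letter code exists."""
    alpha3 = to_alpha3(code)
    if alpha3 is None:
        return None
    return ALPHA3_TO_ALPHA2[alpha3]
```

Because both helpers can return `None`, callers should check the result before using it, for example `code = to_alpha2(value) or "und"` if a fallback is acceptable in the calling code.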
list_to_alpha3(languages, synonyms=True)
Parse all the language codes in a given list into ISO 639 Part 2 codes and optionally expand them with synonyms (i.e. other names for the same language).
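Converting a list and expanding synonyms might look like the sketch below. Everything here is an assumption for illustration: the helper names, the tiny tables, returning a set, and silently dropping entries that cannot be parsed. The synonym pairs use the ISO 639-2 bibliographic codes ("ger", "fre") as examples of alternative codes for the same language; the library's own synonym data may differ.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical sample data.
ALPHA2_TO_ALPHA3: Dict[str, str] = {"en": "eng", "de": "deu", "fr": "fra"}
SYNONYMS: Dict[str, Set[str]] = {"deu": {"ger"}, "fra": {"fre"}}


def to_alpha3(code: str) -> Optional[str]:
    """Normalise a two- or three-letter code to three letters, or return None."""
    norm = code.strip().lower()
    if norm in ALPHA2_TO_ALPHA3.values():
        return norm
    return ALPHA2_TO_ALPHA3.get(norm)


def codes_to_alpha3(languages: Iterable[str], synonyms: bool = True) -> Set[str]:
    """Convert every parseable entry and optionally add synonym codes."""
    result: Set[str] = set()
    for language in languages:
        code = to_alpha3(language)
        if code is None:
            continue  # assumption: unparseable entries are skipped
        result.add(code)
        if synonyms:
            result.update(SYNONYMS.get(code, set()))
    return result
```

In this sketch, `codes_to_alpha3(["de", "EN", "zz"])` yields the codes for German (with its synonym) and English, while `synonyms=False` restricts the result to the primary three-letter codes.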