SquirrelMail internationalization

This document should explain how SquirrelMail internationalization works
and provide information about some aspects of implementation.

  1. Supported languages
  2. $languages array
  3. XTRA_CODE functions
  4. Display of different charsets
  5. IMAP folder names
  6. Plural forms

----------------------
1. Supported languages
----------------------

Valid language codes are:

  * ar - Arabic, windows-1256 charset
  * bg_BG - Bulgarian, windows-1251 charset
  * bn_IN - Bengali, utf-8 charset
  * ca_ES - Catalan, iso-8859-1 charset
  * cs_CZ - Czech, iso-8859-2 charset
  * cy_GB - Welsh, iso-8859-1 charset
  * da_DK - Danish, iso-8859-1 charset
  * de_DE - German, iso-8859-1 charset
  * el_GR - Greek, iso-8859-7 charset
  * en_GB - British, iso-8859-15 charset
  * en_US - English, charset depends on $default_charset
  * es_ES - Spanish, iso-8859-1 charset
  * et_EE - Estonian, iso-8859-15 charset
  * eu_ES - Basque, iso-8859-1 charset
  * fa_IR - Farsi, utf-8 charset
  * fi_FI - Finnish, iso-8859-1 charset
  * fo_FO - Faroese, iso-8859-1 charset
  * fr_FR - French, iso-8859-1 charset
  * he_IL - Hebrew, windows-1255 charset
  * hr_HR - Croatian, iso-8859-2 charset
  * hu_HU - Hungarian, iso-8859-2 charset
  * id_ID - Indonesian, iso-8859-1 charset
  * is_IS - Icelandic, iso-8859-1 charset
  * it_IT - Italian, iso-8859-1 charset
  * ja_JP - Japanese, euc-jp charset (emails are created in iso-2022-jp)
  * ko_KR - Korean, euc-kr charset
  * lt_LT - Lithuanian, utf-8 charset
  * ms_MY - Malay, iso-8859-1 charset
  * nb_NO - Norwegian (Bokmal), iso-8859-1 charset
  * nl_NL - Dutch, iso-8859-1 charset
  * nn_NO - Norwegian (Nynorsk), iso-8859-1 charset
  * pl_PL - Polish, iso-8859-2 charset
  * pt_BR - Portuguese (Brazil), iso-8859-1 charset
  * pt_PT - Portuguese (Portugal), iso-8859-1 charset
  * ro_RO - Romanian, iso-8859-2 charset
  * ru_UA - Ukrainian Russian, koi8-r charset
  * ru_RU - Russian, utf-8 charset
  * sk_SK - Slovak, iso-8859-2 charset
  * sl_SI - Slovenian, iso-8859-2 charset
  * sr_YU - Serbian, iso-8859-2 charset
  * sv_SE - Swedish, iso-8859-1 charset
  * ug - Uighur, utf-8 charset
    (some systems don't support Uighur system locale)
  * th_TH - Thai, tis-620 charset
  * tl_PH - Tagalog, iso-8859-1 charset
    (main translation is missing, only some plugins are translated)
  * tr_TR - Turkish, iso-8859-9 charset
  * uk_UA - Ukrainian, koi8-u charset
  * zh_CN - Chinese Simplified, gb2312 charset
  * zh_TW - Chinese Traditional, big5 charset

Charset totals:

  * iso-8859-1 = 21
  * iso-8859-2 = 8
  * utf-8 = 5
  * iso-8859-15 = 2
  * iso-8859-7 = 1
  * iso-8859-9 = 1
  * koi8-r = 1
  * koi8-u = 1
  * windows-1251 = 1
  * windows-1255 = 1
  * windows-1256 = 1
  * tis-620 = 1
  * gb2312 = 1
  * big5 = 1
  * euc-jp = 1
  * euc-kr = 1

-------------------
2. $languages array
-------------------

$languages array is stored in functions/i18n.php and defines
translations that are enabled in SquirrelMail.

Format of array:

  $languages['language_code']['key'] = 'value'

Possible array key names:

  * NAME - Translation name in English. Any 8bit symbols must be html
    encoded.
  * CHARSET - Charset used by translation
  * ALIAS - 'language_code' should contain short language name
    (iso-639). 'value' should contain name of other 'language_code'
    that defines translation with NAME and CHARSET keys. Entry links
    short language form with long form (language+country).
    See: http://www.loc.gov/standards/iso639-2/langhome.html
    and http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html
  * ALTNAME - Native translation name. Any 8bit symbols must be html
    encoded. Name is visible when $show_alternative_names is enabled.
  * LOCALE - Full locale name (in xx_XX.charset format or other format
    required by php gettext functions). From 1.4.4/1.5.1 'value' can
    contain array. If php version is older than 4.3.0, system uses only
    first locale name listed in array. First locale name must be
    compatible with FreeBSD system locale names.
  * DIR - Text direction. Used to define Right-to-Left languages.
    Possible values are 'rtl' or 'ltr'. If undefined, defaults to 'ltr'.
  * XTRA_CODE - Translation uses special functions
    (see chapter 3. XTRA_CODE functions).

Each 'language_code' definition requires either the NAME and CHARSET
keys or the ALIAS key. Other keys are optional.

----------------------
3. XTRA_CODE functions
----------------------

XTRA_CODE functions provide a way to change interface behavior when a
translation requires special handling of some SquirrelMail functions.
Functions are enabled by setting the XTRA_CODE option in the $languages
array and including the appropriate functions in functions/i18n.php.

The first part of the function name is the word listed in the
$languages['language_code']['XTRA_CODE'] value. The second part is one
of the special keywords. For example, a hypothetical XTRA_CODE value of
'foo_xtra' combined with the _decode keyword gives the function name
foo_xtra_decode.

Possible keywords:

  * _decode
    Used in src/compose.php, src/i18n.php, src/view_text.php,
    functions/mime.php
    Requires mbstring support
  * _encode
    Used in src/compose.php, src/read_body.php
  * _encodeheader
    Used in functions/mime.php
    Returning function
  * _decodeheader
    Used in functions/mime.php
    Returning function
  * _downloadfilename
    Used in functions/mime.php
  * _utf7_imap_encode
    Used in functions/imap_utf7_local.php
    Returning function
  * _utf7_imap_decode
    Used in functions/imap_utf7_local.php
    Returning function
  * _strimwidth
    Used in functions/mailbox_display.php
    Returning function
  * _wordwrap
    Used in functions/strings.php (sqWordWrap)

--------------------------------
4. Display of different charsets
--------------------------------

When SquirrelMail generates html pages, it uses the charset set by the
translation that the end user selected. The interface can display emails
encoded in different charsets. In order to display characters that might
be unsupported by the user's charset, SquirrelMail uses decoding
functions that convert non us-ascii symbols into html entities.

All decoding functions are stored in the functions/decode/ directory.
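
The conversion of non us-ascii symbols into html entities can be
illustrated with a minimal sketch. The function name
sketch_decode_iso_8859_1 is hypothetical and is not the code shipped in
functions/decode/. The sketch assumes iso-8859-1 input, where every byte
value is equal to its Unicode code point, so each byte in the 0xA0-0xFF
range can be written as a numeric entity.

```php
<?php
// Hypothetical helper for illustration only; it is not the decoder
// distributed with SquirrelMail.
// Assumes $string is iso-8859-1 encoded, so a byte value is also the
// Unicode code point of the character.
function sketch_decode_iso_8859_1($string) {
    // Replace every byte in the 0xA0-0xFF range with a numeric html
    // entity. The pattern has no /u modifier, so it matches raw bytes.
    return preg_replace_callback(
        '/[\xA0-\xFF]/',
        function ($match) {
            return '&#' . ord($match[0]) . ';';
        },
        $string
    );
}

// "caf\xE9" is the word "cafe" with e-acute (byte 0xE9) in iso-8859-1.
echo sketch_decode_iso_8859_1("caf\xE9"), "\n"; // prints caf&#233;
```

The resulting string contains only us-ascii bytes, so it can be placed
in an html page regardless of the charset used by the selected
translation.
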
By default SquirrelMail includes decoding functions that support the
iso-8859-x, windows-125x, utf-8, us-ascii, koi8-r, koi8-u, tis-620,
ns-4551_1, iso-ir-111, cp855 and cp866 charsets. Other decoding
functions are distributed in separate packages. Separate packaging of
decoding functions is supported from SquirrelMail 1.4.4 and 1.5.0.

Some decoding functions might require php recode support. If your php
installation does not support recode, SquirrelMail might be using
slower functions that need more cpu time and memory.

--------------------
5. IMAP folder names
--------------------

IMAP folder names use the UTF7-IMAP charset. Folder names that are
stored in conf.pl must be encoded in the UTF7-IMAP charset.

SquirrelMail uses internal functions that convert folder names to and
from the utf7-imap charset. By default those functions work with the
iso-8859-1 charset. Other charsets are supported only when the php
mbstring extension supports them.

TODO: write independent implementation of charset to utf7-imap
conversion.

---------------
6. Plural forms
---------------

From v.1.5.1 SquirrelMail includes support for plural forms. It allows
the correct translation form to be used with numbers. For example,
"We have %s squirrel on the roof." and "We have %s squirrels on the
roof." can be written in one function call without checking the actual
number of squirrels. Gettext functions also deal with non English
languages that might use different word forms for two, five, ten or
more units.

Support is provided by the ngettext functions that exist in the php
gettext extension from php 4.2.0 and by the ngettext function
replacements from the php-gettext classes
(http://savannah.nongnu.org/projects/php-gettext). In order to use
plural forms correctly when the php gettext extension does not have
ngettext support, SquirrelMail uses bindtextdomain and textdomain
wrappers that load the missing functions.

If plugins want to use ngettext functions without increasing php
requirements to 4.2.0 with gettext support, they should require
SquirrelMail 1.5.1, use the sq_bindtextdomain function instead of
bindtextdomain and use the sq_textdomain function instead of the
textdomain function. If the SquirrelMail wrapper functions are used,
there is no need to issue sq_bindtextdomain when a plugin reverts to
the SquirrelMail domain.

More information about ngettext and plural forms can be found at:
http://www.gnu.org/software/gettext/manual/html_chapter/gettext_10.html#SEC150
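
The squirrel example from this chapter can be sketched as a single
ngettext call. The helper name squirrels_on_roof is hypothetical. The
fallback definition at the top is an assumption made so that the sketch
also runs without the php gettext extension; it applies only the English
rule and is not the replacement code that SquirrelMail loads.

```php
<?php
// Fallback for this sketch only, used when the php gettext extension
// is not available. It implements the English rule: singular form for
// exactly one unit, plural form otherwise.
if (!function_exists('ngettext')) {
    function ngettext($singular, $plural, $number) {
        return ($number == 1) ? $singular : $plural;
    }
}

// Hypothetical helper: picks the message form for $count and then
// inserts the number into the %s placeholder.
function squirrels_on_roof($count) {
    $format = ngettext("We have %s squirrel on the roof.",
                       "We have %s squirrels on the roof.",
                       $count);
    return sprintf($format, $count);
}

echo squirrels_on_roof(1), "\n"; // We have 1 squirrel on the roof.
echo squirrels_on_roof(3), "\n"; // We have 3 squirrels on the roof.
```

The calling code never compares the number itself. When a translation
is loaded, the choice between forms is made according to the plural
rules of that translation, which is what allows languages with more
than two forms to work with the same call.
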