*************************************
* SquirrelMail internationalization *
*************************************

$Date$
$Revision$

This document explains how SquirrelMail internationalization works and
provides information about some aspects of its implementation.

1. Supported languages
2. $languages array
3. XTRA_CODE functions
4. Display of different charsets
5. IMAP folder names
6. Plural forms
7. Language setup
8. Time zones

----------------------
1. Supported languages
----------------------
Valid language codes are:
* ar - Arabic, windows-1256 charset
* bg_BG - Bulgarian, windows-1251 charset
* bn_IN - Bengali, utf-8 charset
* ca_ES - Catalan, iso-8859-1 charset
* cs_CZ - Czech, iso-8859-2 charset
* cy_GB - Welsh, iso-8859-1 charset
* da_DK - Danish, iso-8859-1 charset
* de_DE - German, iso-8859-1 charset
* el_GR - Greek, iso-8859-7 charset
* en_GB - British, iso-8859-15 charset
* en_US - English, charset depends on $default_charset
* es_ES - Spanish, iso-8859-1 charset
* et_EE - Estonian, iso-8859-15 charset
* eu_ES - Basque, iso-8859-1 charset
* fa_IR - Farsi, utf-8 charset
* fi_FI - Finnish, iso-8859-1 charset
* fo_FO - Faroese, iso-8859-1 charset
* fr_FR - French, iso-8859-1 charset
* he_IL - Hebrew, windows-1255 charset
* hr_HR - Croatian, iso-8859-2 charset
* hu_HU - Hungarian, iso-8859-2 charset
* id_ID - Indonesian, iso-8859-1 charset
* is_IS - Icelandic, iso-8859-1 charset
* it_IT - Italian, iso-8859-1 charset
* ja_JP - Japanese, euc-jp charset (emails are created in iso-2022-jp)
* ko_KR - Korean, euc-kr charset
* lt_LT - Lithuanian, utf-8 charset
* ms_MY - Malay, iso-8859-1 charset
* nb_NO - Norwegian (Bokmal), iso-8859-1 charset
* nl_NL - Dutch, iso-8859-1 charset
* nn_NO - Norwegian (Nynorsk), iso-8859-1 charset
* pl_PL - Polish, iso-8859-2 charset
* pt_BR - Portuguese (Brazil), iso-8859-1 charset
* pt_PT - Portuguese (Portugal), iso-8859-1 charset
* ro_RO - Romanian, iso-8859-2 charset
* ru_UA - Ukrainian Russian, koi8-r charset
* ru_RU - Russian, utf-8 charset
* sk_SK - Slovak, iso-8859-2 charset
* sl_SI - Slovenian, iso-8859-2 charset
* sr_YU - Serbian, iso-8859-2 charset
* sv_SE - Swedish, iso-8859-1 charset
* ug - Uighur, utf-8 charset (some systems don't support Uighur system locale)
* th_TH - Thai, tis-620 charset
* tl_PH - Tagalog, iso-8859-1 charset (main translation is missing, only some plugins are translated)
* tr_TR - Turkish, iso-8859-9 charset
* uk_UA - Ukrainian, koi8-u charset
* zh_CN - Chinese Simplified, gb2312 charset
* zh_TW - Chinese Traditional, big5 charset

Charset totals:
* iso-8859-1 = 21
* iso-8859-2 = 8
* utf-8 = 5
* iso-8859-15 = 2
* iso-8859-7 = 1
* iso-8859-9 = 1
* koi8-r = 1
* koi8-u = 1
* windows-1251 = 1
* windows-1255 = 1
* windows-1256 = 1
* tis-620 = 1
* gb2312 = 1
* big5 = 1
* euc-jp = 1
* euc-kr = 1

-------------------
2. $languages array
-------------------
The $languages array is stored in functions/i18n.php and defines the
translations that are enabled in SquirrelMail.

Format of array:
  $languages['language_code']['key'] = 'value'

Possible array key names:
* NAME - Translation name in English. Any 8bit symbols must be html encoded.
* CHARSET - Charset used by the translation.
* ALIAS - Used when 'language_code' is a short language name (iso-639).
  'value' must contain the name of another 'language_code' that defines
  a translation with NAME and CHARSET keys. The entry links the short
  language form with the long form (language+country).
  See: http://www.loc.gov/standards/iso639-2/langhome.html and
  http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html
* ALTNAME - Native translation name. Any 8bit symbols must be html encoded.
  The name is visible when $show_alternative_names is enabled.
* LOCALE - Full locale name (in xx_XX.charset format or another format
  required by the php gettext functions). From 1.4.4/1.5.1 'value' can
  contain an array. If the php version is older than 4.3.0, the system uses
  only the first locale name listed in the array. The first locale name must
  be compatible with FreeBSD system locale names.
* DIR - Text direction. Used to define right-to-left languages. Possible
  values are 'rtl' and 'ltr'. Defaults to 'ltr' if undefined.
* XTRA_CODE - Translation uses special functions (see chapter 3. XTRA_CODE
  functions).

Each 'language_code' definition requires either the NAME and CHARSET keys or
the ALIAS key. Other keys are optional.
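
The key layout described above might look like the following sketch. The
entries are illustrative values and are not copied from functions/i18n.php.
The helper functions example_resolve_language() and example_text_direction()
are hypothetical names used only to show how ALIAS and the DIR default could
be read.

```php
<?php
// Illustrative entries only (assumed values, not taken from functions/i18n.php).
$languages = array();
$languages['lt_LT']['NAME']    = 'Lithuanian';
$languages['lt_LT']['CHARSET'] = 'utf-8';
$languages['lt_LT']['LOCALE']  = array('lt_LT.UTF-8', 'lt_LT.utf8');
$languages['lt']['ALIAS']      = 'lt_LT';

$languages['he_IL']['NAME']    = 'Hebrew';
$languages['he_IL']['CHARSET'] = 'windows-1255';
$languages['he_IL']['DIR']     = 'rtl';

// Hypothetical helper: follow an ALIAS entry to the full definition.
// Returns the language code that carries NAME and CHARSET, or false.
function example_resolve_language($languages, $code) {
    if (isset($languages[$code]['ALIAS'])) {
        $code = $languages[$code]['ALIAS'];
    }
    if (isset($languages[$code]['NAME']) && isset($languages[$code]['CHARSET'])) {
        return $code;
    }
    return false;
}

// Hypothetical helper: text direction, with 'ltr' as the default.
function example_text_direction($languages, $code) {
    return isset($languages[$code]['DIR']) ? $languages[$code]['DIR'] : 'ltr';
}
```

In this sketch the short code 'lt' resolves to 'lt_LT', and a language
without a DIR key is treated as left-to-right.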

----------------------
3. XTRA_CODE functions
----------------------
XTRA_CODE functions provide a way to change interface behavior when a
translation requires special handling of some SquirrelMail functions. Functions
are enabled by setting the XTRA_CODE option in the $languages array and
including the appropriate functions in functions/i18n.php. The first part of
the function name is the word listed in the
$languages['language_code']['XTRA_CODE'] value. The second part is one of the
special keywords. Possible keywords:
* _decode
Used in src/compose.php, src/i18n.php, src/view_text.php, functions/mime.php
Requires mbstring support

* _encode
Used in src/compose.php, src/read_body.php

* _encodeheader
Used in functions/mime.php
Returning function

* _decodeheader
Used in functions/mime.php
Returning function

* _downloadfilename
Used in functions/mime.php

* _utf7_imap_encode
Used in functions/imap_utf7_local.php
Returning function

* _utf7_imap_decode
Used in functions/imap_utf7_local.php
Returning function

* _strimwidth
Used in functions/mailbox_display.php
Returning function

* _wordwrap
Used in functions/strings.php (sqWordWrap)
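
The naming rule above (XTRA_CODE value followed by a keyword) can be sketched
as follows. This is a minimal illustration, not SquirrelMail's own code: the
'xx_XX' language entry, the 'example_xtra' value and both functions are
hypothetical, and the _wordwrap body is only a stand-in.

```php
<?php
// Hypothetical language entry; 'example_xtra' is an invented XTRA_CODE value.
$languages = array();
$languages['xx_XX']['NAME']      = 'Example';
$languages['xx_XX']['CHARSET']   = 'utf-8';
$languages['xx_XX']['XTRA_CODE'] = 'example_xtra';

// Function name = XTRA_CODE value + keyword ('_wordwrap' here).
// The body is a placeholder, not a real translation-specific routine.
function example_xtra_wordwrap($text) {
    return wordwrap($text, 20, "\n", true);
}

// Hypothetical dispatcher: call the XTRA_CODE function for a keyword when the
// language defines one, otherwise return the argument unchanged.
function example_call_xtra($languages, $code, $keyword, $arg) {
    if (isset($languages[$code]['XTRA_CODE'])) {
        $function = $languages[$code]['XTRA_CODE'] . $keyword;
        if (function_exists($function)) {
            return call_user_func($function, $arg);
        }
    }
    return $arg;
}
```

Checking function_exists() before the call means a translation only has to
implement the keywords it actually needs.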

--------------------------------
4. Display of different charsets
--------------------------------
When SquirrelMail generates html pages, it uses the charset defined in the
translation selected by the end user. The interface can display emails encoded
in different charsets. In order to display characters that might be unsupported
by the user's charset, SquirrelMail uses decoding functions that convert non
us-ascii symbols into html entities. All decoding functions are stored in the
functions/decode/ directory.

By default SquirrelMail includes decoding functions that support the
iso-8859-x, windows-125x, utf-8, us-ascii, koi8-r, koi8-u, tis-620, ns-4551_1,
iso-ir-111, cp855 and cp866 charsets. Other decoding functions are distributed
in separate packages. Separate packaging of decoding functions is supported
from SquirrelMail 1.4.4 and 1.5.0. The us-ascii decoding function replaces all
8bit symbols with question marks. The utf-8 decoding function does not decode
five and six byte utf-8 symbols by default (the code is commented out) and
replaces all incorrectly formatted 8bit symbols with question marks.

Some decoding functions might require the php recode extension or the php 4.3+
mbstring extension. If your php installation does not support them, you might
be using slower and cpu/memory intensive functions.
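
The idea of converting 8bit symbols into html entities can be illustrated with
the sketch below. The function names are hypothetical and the code is not the
implementation shipped in functions/decode/. It only shows the technique for
iso-8859-1, where bytes 0xA0-0xFF have the same numbers as their Unicode code
points, and the question mark replacement described for us-ascii.

```php
<?php
// Hypothetical names; not the functions shipped in functions/decode/.

// iso-8859-1: bytes 0xA0-0xFF map directly to code points U+00A0-U+00FF,
// so each one can be written as a numeric html entity. Bytes 0x80-0x9F
// (control characters) are left untouched in this sketch.
function example_decode_iso8859_1($string) {
    return preg_replace_callback(
        '/[\xA0-\xFF]/',
        function ($match) {
            return '&#' . ord($match[0]) . ';';
        },
        $string
    );
}

// us-ascii: every 8bit byte is outside the charset and becomes a question mark.
function example_decode_us_ascii($string) {
    return preg_replace('/[\x80-\xFF]/', '?', $string);
}
```

Because the output contains only us-ascii characters and entities, it can be
embedded in a page regardless of the charset used by the selected translation.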

--------------------
5. IMAP folder names
--------------------
IMAP folder names use the UTF7-IMAP charset. Folder names that are stored in
conf.pl must be encoded in the UTF7-IMAP charset. SquirrelMail uses internal
functions that convert folder names to and from the utf7-imap charset. By
default those functions work with the iso-8859-1 charset. Other charsets are
supported only when the php mbstring extension supports them.
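
For installations with the php mbstring extension, the conversion can be
sketched with mb_convert_encoding(), which accepts 'UTF7-IMAP' as an encoding
name. The wrapper names below are hypothetical and this is not SquirrelMail's
internal conversion code.

```php
<?php
// Requires the php mbstring extension. Wrapper names are hypothetical.

// Convert a folder name from the given charset to UTF7-IMAP.
function example_folder_to_utf7_imap($name, $charset) {
    return mb_convert_encoding($name, 'UTF7-IMAP', $charset);
}

// Convert a UTF7-IMAP folder name back to the given charset.
function example_folder_from_utf7_imap($name, $charset) {
    return mb_convert_encoding($name, $charset, 'UTF7-IMAP');
}
```

For example, the iso-8859-1 string "Entw\xFCrfe" is stored on the IMAP server
as "Entw&APw-rfe".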

TODO: write independent implementation of charset to utf7-imap conversion.

---------------
6. Plural forms
---------------
From v.1.5.1 SquirrelMail includes support for plural forms. It allows the
correct translation form to be used with numbers. For example, "We have %s
squirrel on the roof." and "We have %s squirrels on the roof." can be written
in one function call without checking the actual number of squirrels. Gettext
functions also deal with non-English languages that might use different word
forms for two, five, ten or more units.

Support is provided by the ngettext functions that exist in the php gettext
extension from php 4.2.0 and by the ngettext function replacements from the
php-gettext classes (http://savannah.nongnu.org/projects/php-gettext). In order
to use them correctly when the php gettext extension does not have ngettext
support, SquirrelMail uses bindtextdomain and textdomain wrappers that load the
missing functions.

If plugins want to use ngettext functions without increasing php requirements
to 4.2.0 with gettext support, they should require SquirrelMail 1.5.1, use the
sq_bindtextdomain function instead of bindtextdomain and use the sq_textdomain
function instead of textdomain. If SquirrelMail wrapper functions are used,
there is no need to call sq_bindtextdomain when a plugin reverts to the
SquirrelMail domain.

More information about ngettext and plural forms can be found at:
http://www.gnu.org/software/gettext/manual/html_chapter/gettext_10.html#SEC150
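
The squirrel example from this chapter could be written as in the sketch
below. The helper name example_squirrel_message() is hypothetical. The
fallback definition of ngettext() is an assumption made only to keep the
sketch runnable without the php gettext extension; it knows just the English
rule and is not the php-gettext replacement loaded by SquirrelMail.

```php
<?php
// Fallback for installs without the php gettext extension. It implements only
// the English rule (singular for 1, plural otherwise) and is an assumption of
// this sketch, not the php-gettext replacement class.
if (!function_exists('ngettext')) {
    function ngettext($singular, $plural, $number) {
        return ($number == 1) ? $singular : $plural;
    }
}

// Hypothetical helper: one call selects the form and sprintf() fills in
// the number.
function example_squirrel_message($count) {
    return sprintf(
        ngettext(
            'We have %s squirrel on the roof.',
            'We have %s squirrels on the roof.',
            $count
        ),
        $count
    );
}
```

When a translation is loaded, the same call lets gettext pick among however
many plural forms the target language defines.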

-----------------
7. Language setup
-----------------
SquirrelMail uses the set_up_language() function to set up the language
environment. The environment is set up automatically when include/validate.php
is loaded.

SquirrelMail gets the interface language from three places:
 a) user preference. It is set in Options -> Display Preferences -> Language.
    The preference uses the language key. If the user's preferences are not
    available (user is not logged in), the system tries to extract the
    language value from the 'squirrelmail_language' cookie.
 b) default SquirrelMail language that is set in configuration
    ($squirrelmail_default_language variable).
 c) preferred language setting provided by the browser. It is used only when
    the default SquirrelMail language is set to an empty string.

If language information is not available, SquirrelMail falls back to the US
English translation.
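
The lookup order described above can be sketched as follows. This is not the
body of set_up_language(): example_pick_language() and its parameters are
hypothetical, and the handling of the browser setting is simplified (q-values
are ignored and entries are tried in the order they are listed).

```php
<?php
// Hypothetical sketch of the a) / b) / c) order; not set_up_language() itself.
// $pref    - language from user preferences (null when not logged in)
// $cookie  - value of the 'squirrelmail_language' cookie (null when absent)
// $default - configured default language ('' means "use browser setting")
// $accept  - browser language list, for example 'lt-LT,en;q=0.5'
function example_pick_language($languages, $pref, $cookie, $default, $accept) {
    // a) user preference, or the cookie when preferences are unavailable
    $candidates = array($pref, $cookie);
    if ($default !== '') {
        // b) configured default language
        $candidates[] = $default;
    } else {
        // c) browser setting, used only when the default is an empty string
        foreach (explode(',', (string) $accept) as $item) {
            $parts = explode(';', $item);
            $candidates[] = str_replace('-', '_', trim($parts[0]));
        }
    }
    foreach ($candidates as $code) {
        if (is_string($code) && $code !== '' && isset($languages[$code])) {
            return $code;
        }
    }
    // No usable language information: fall back to US English.
    return 'en_US';
}
```

Only codes that exist in the $languages array are accepted, so an unknown
cookie or browser value simply falls through to the next source.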

-------------
8. Time zones
-------------
If the php install allows modifying the TZ environment variable, SquirrelMail
allows end users to select a different time zone in their preferences. It can
be set in Options -> Personal Information -> Your current timezone. The time
zone is set up automatically when include/validate.php is loaded.

If the TZ variable can't be modified (php is running in safe mode and the
variable is not listed in php safe_mode_allowed_env_vars), the user's time zone
options are not visible and the interface uses the webserver's default time
zone.

SquirrelMail 1.5.0 and older store the list of available time zones in
locale/timezones.cfg. Since 1.5.1 standard time zones are moved to
include/timezones/standard.php and time zone handling differs from older
SquirrelMail versions. Time zone configuration is controlled in the
SquirrelMail configuration utility (conf.pl), 4. General Options -> 15. Time
zone configuration menu option. The administrator can select standard, strict,
custom or custom strict time zone handling.

Standard handling does not differ from previous SquirrelMail versions:
SquirrelMail uses GNU C geographical location based time zone names. Strict
handling uses time zone codes with an offset from GMT. Strict time zones should
work on systems that don't support GNU C time zone naming. Custom and custom
strict handling use the config/timezones.php file instead of
include/timezones/standard.php.

The config/timezones.php file should store an $aTimeZones array with a
different set of time zones. See the default time zone set in
include/timezones/standard.php. For example:

<?php
// World outside US border is a mirage

$aTimeZones=array();
$aTimeZones['America/New_York']['NAME']='US Eastern standard time';
$aTimeZones['America/New_York']['TZ']='EST5EDT';

$aTimeZones['America/Chicago']['NAME']='US Central standard time';
$aTimeZones['America/Chicago']['TZ']='CST6CDT';

// Oliver County, ND
$aTimeZones['America/North_Dakota/Center']['NAME']='US, Oliver County [ND]';
$aTimeZones['America/North_Dakota/Center']['TZ']='CST6CDT'; // CST since 1992

$aTimeZones['America/Denver']['NAME']='US Mountain standard time';
$aTimeZones['America/Denver']['TZ']='MST7MDT';

$aTimeZones['America/Los_Angeles']['NAME']='US Pacific standard time';
$aTimeZones['America/Los_Angeles']['TZ']='PST8PDT';

// Alaska
$aTimeZones['America/Juneau']['NAME']='Alaska, Juneau';
$aTimeZones['America/Juneau']['TZ']='NAST9NADT';
$aTimeZones['America/Yakutat']['NAME']='Alaska, Yakutat';
$aTimeZones['America/Yakutat']['TZ']='NAST9NADT';
$aTimeZones['America/Anchorage']['NAME']='Alaska, Anchorage';
$aTimeZones['America/Anchorage']['TZ']='NAST9NADT';
$aTimeZones['America/Nome']['NAME']='Alaska, Nome';
$aTimeZones['America/Nome']['TZ']='NAST9NADT';
$aTimeZones['America/Adak']['NAME']='US, Aleutian Islands';
$aTimeZones['America/Adak']['TZ']='AST10ADT';

$aTimeZones['Pacific/Honolulu']['NAME']='US, Hawaii';
$aTimeZones['Pacific/Honolulu']['TZ']='UCT10';
$aTimeZones['America/Phoenix']['NAME']='US, Arizona';
$aTimeZones['America/Phoenix']['TZ']='MST7'; // gmt-7
$aTimeZones['America/Shiprock']['LINK']='America/Denver';

$aTimeZones['America/Boise']['NAME']='US, South Idaho';
$aTimeZones['America/Boise']['TZ']='MST7MDT';
$aTimeZones['America/Indianapolis']['NAME']='US, Indiana';
$aTimeZones['America/Indianapolis']['TZ']='EST5';
$aTimeZones['America/Indiana/Indianapolis']['LINK']='America/Indianapolis';
// Crawford County, Indiana
$aTimeZones['America/Indiana/Marengo']['NAME']='US, Crawford County [IN]';
$aTimeZones['America/Indiana/Marengo']['TZ']='EST5';
// Starke County, Indiana
$aTimeZones['America/Indiana/Knox']['NAME']='US, Starke County [IN]';
$aTimeZones['America/Indiana/Knox']['TZ']='EST5';
// Switzerland County, Indiana
$aTimeZones['America/Indiana/Vevay']['NAME']='US, Switzerland County [IN]';
$aTimeZones['America/Indiana/Vevay']['TZ']='EST5';
$aTimeZones['America/Louisville']['NAME']='US, Louisville [KY]';
$aTimeZones['America/Louisville']['TZ']='EST5EDT';
$aTimeZones['America/Kentucky/Louisville']['LINK']='America/Louisville';
// Wayne, Clinton, and Russell Counties, Kentucky
$aTimeZones['America/Kentucky/Monticello']['NAME']='US, Wayne, Clinton, and Russell Counties [KY]';
$aTimeZones['America/Kentucky/Monticello']['TZ']='EST5EDT';
// Michigan
$aTimeZones['America/Detroit']['NAME']='US, Michigan';
$aTimeZones['America/Detroit']['TZ']='EST5EDT';
// The Michigan border with Wisconsin switched from EST to CST/CDT in 1973.
$aTimeZones['America/Menominee']['NAME']='US, Menominee [MI]';
$aTimeZones['America/Menominee']['TZ']='CST6CDT';
?>
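
How an entry of this array could be turned into a TZ value is sketched below.
The helper example_tz_value() is hypothetical and rests on two assumptions
that this document does not spell out: a LINK entry is treated as an alias of
the entry it names, and strict handling uses the TZ key while standard
handling uses the geographical array key.

```php
<?php
// Reduced copy of the array layout from the example above.
$aTimeZones = array();
$aTimeZones['America/Denver']['NAME']   = 'US Mountain standard time';
$aTimeZones['America/Denver']['TZ']     = 'MST7MDT';
$aTimeZones['America/Shiprock']['LINK'] = 'America/Denver';

// Hypothetical helper. Follows LINK entries (assumed to be aliases) and
// returns the TZ code when $strict is true, or the geographical name when it
// is false. Returns false for unknown names and for LINK loops.
function example_tz_value($aTimeZones, $name, $strict) {
    $seen = array();
    while (isset($aTimeZones[$name]['LINK']) && !isset($seen[$name])) {
        $seen[$name] = true;
        $name = $aTimeZones[$name]['LINK'];
    }
    if (!isset($aTimeZones[$name]) || isset($aTimeZones[$name]['LINK'])) {
        return false;
    }
    if ($strict && isset($aTimeZones[$name]['TZ'])) {
        return $aTimeZones[$name]['TZ'];
    }
    return $name;
}
```

The returned value could then be passed to putenv('TZ=' . $value), provided
the php install allows modifying the TZ variable.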

GNU C time zone naming should be supported by many Unix OSes. It is the
recommended way of setting the time zone, because it handles historical changes
and daylight saving rules specific to the selected geographical location.
Strict time zones might provide inaccurate or outdated time zone settings.

If modifications of the TZ environment variable are visible in your webserver's
logs (time offset is changed), make sure that you can reproduce such behavior
in the latest php version and report the bug to the php developers. The issue
can be worked around by blocking use of time zones (php safe mode with TZ not
listed in the safe_mode_allowed_env_vars setting, or the forced_prefs plugin)
or by attaching a special php script with a putenv('TZ=some time zone') call in
the php auto_append_file setting (this suggestion is not tested and you might
have to fix all SquirrelMail exit calls).

Please note that use of auto_append_file provides only a temporary workaround
and does not fix your php setup. A script that runs as an unprivileged user
should be unable to affect the webserver's logging system.
