Commit | Line | Data |
---|---|---|
84024b72 | 1 | $Cambridge: exim/doc/doc-txt/README.SIEVE,v 1.11 2007/03/21 15:15:12 ph10 Exp $ |
495ae4b0 PH |
2 | |
3 | Notes on the Sieve implementation for Exim | |
4 | ||
5 | Exim Filter Versus Sieve Filter | |
6 | ||
7 | Exim supports two incompatible filters: The traditional Exim filter and | |
8 | the Sieve filter. Since Sieve is an extensible language, it is important | |
9 | to understand "Sieve" in this context as "the specific implementation | |
10 | of Sieve for Exim". | |
11 | ||
12 | The Exim filter contains more features, such as variable expansion, and | |
13 | better integration with the host environment, like external processes | |
14 | and pipes. | |
15 | ||
16 | Sieve is a standard for interoperable filters, defined in RFC 3028, | |
17 | with multiple implementations available. If interoperability is important, | |
18 | Sieve is the only choice of the two. | |
19 | ||
20 | ||
21 | Exim Implementation | |
22 | ||
bfad5236 | 23 | The Exim Sieve implementation offers the core as defined by |
84024b72 PH |
24 | draft-ietf-sieve-3028bis-10.txt (next version of RFC 3028 that |
25 | fixes specification mistakes), the "envelope" test (3028bis), the | |
26 | "fileinto" action (3028bis), the "copy" parameter (RFC 3894), the | |
27 | "vacation" action (draft-ietf-sieve-vacation-06), the "notify" action | |
28 | (draft-ietf-sieve-notify-06), the "i;ascii-numeric" comparator (RFC 2244) | |
29 | and the subaddress parameter (draft-ietf-sieve-rfc3598bis-05). | |
495ae4b0 PH |
30 | |
31 | The Sieve filter is integrated in Exim and works very similarly to the | |
32 | Exim filter: Sieve scripts are recognized by the first line containing | |
33 | "# sieve filter". When using "keep" or "fileinto" to save a mail into a | |
34 | folder, the resulting string is available as the variable $address_file | |
1c59d63b PH |
35 | in the transport that stores it. The following routers and transport |
36 | show a typical use of Sieve: | |
37 | ||
38 | begin routers | |
39 | ||
40 | localuser_verify: | |
41 | driver = accept | |
42 | domains = +localdomains | |
43 | local_part_suffix = "-*" | |
44 | local_part_suffix_optional | |
45 | check_local_user | |
46 | require_files = $home/.forward | |
47 | verify_only = true | |
48 | ||
49 | localuser_deliver: | |
50 | driver = redirect | |
51 | domains = +localdomains | |
52 | local_part_suffix = "-*" | |
53 | local_part_suffix_optional | |
54 | sieve_subaddress = "${sg{$local_part_suffix}{^-}{}}" | |
55 | sieve_useraddress = "$local_part" | |
56 | check_local_user | |
57 | require_files = $home/.forward | |
58 | file = $home/.forward | |
59 | check_ancestor | |
60 | allow_filter | |
61 | file_transport = localuser | |
62 | reply_transport = vacation | |
63 | sieve_vacation_directory = $home/mail/vacation | |
64 | verify = false | |
65 | ||
66 | begin transports | |
495ae4b0 PH |
67 | |
68 | localuser: | |
69 | driver = appendfile | |
70 | file = ${if eq{$address_file}{inbox} \ | |
71 | {/var/mail/$local_part} \ | |
72 | {${if eq{${substr_0_1:$address_file}}{/} \ | |
73 | {$address_file} \ | |
1c59d63b | 74 | {$home/mail/$address_file} \ |
495ae4b0 PH |
75 | }} \ |
76 | } | |
77 | delivery_date_add | |
78 | envelope_to_add | |
79 | return_path_add | |
80 | mode = 0600 | |
81 | ||
1c59d63b PH |
82 | vacation: |
83 | driver = autoreply | |
495ae4b0 | 84 | |
1c59d63b PH |
85 | Absolute files are stored where specified, relative files are stored |
86 | relative to $home/mail and "inbox" goes to the standard mailbox location. | |
87 | To enable "vacation", set sieve_vacation_directory to the directory | |
88 | where vacation databases are held (don't put anything else in that | |
89 | directory) and point reply_transport to an autoreply transport. | |
90 | Setting sieve_useraddress and sieve_subaddress enables use of the | |
91 | subaddress extension. | |
495ae4b0 PH |
92 | |
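The file expansion in the "localuser" transport above can be read as a
three-way decision. The following Python sketch restates that decision
for illustration only; the function name and its parameters are
hypothetical, and Exim itself evaluates the ${if ...} string shown in the
transport, not this code.

```python
def resolve_mailbox(address_file: str, home: str, local_part: str) -> str:
    """Restate the nested ${if ...} expansion of the example transport."""
    # "inbox" goes to the standard mailbox location
    if address_file == "inbox":
        return f"/var/mail/{local_part}"
    # an absolute file (first character is "/") is stored where specified
    if address_file.startswith("/"):
        return address_file
    # anything else is stored relative to $home/mail
    return f"{home}/mail/{address_file}"
```

For example, a script that runs fileinto "lists/exim" for a user whose
home directory is /home/jo would, under this configuration, deliver to
/home/jo/mail/lists/exim.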
93 | ||
94 | RFC Compliance | |
95 | ||
96 | Exim requires the first line to be "# sieve filter". Of course the RFC | |
97 | does not enforce that line. Don't expect examples to work without adding | |
98 | it, though. | |
99 | ||
100 | RFC 3028 requires using CRLF to terminate a line. | |
101 | The rationale was that CRLF is universally used in network protocols | |
102 | to mark the end of the line. This implementation does not embed Sieve | |
103 | in a network protocol, but uses Sieve scripts as part of the Exim MTA. | |
104 | Since all parts of Exim use \n as newline character, this implementation | |
105 | does, too. You can change this by defining the macro RFC_EOL at compile | |
106 | time to enforce CRLF being used. | |
107 | ||
495ae4b0 PH |
108 | Sieve scripts cannot contain NUL characters in strings, but mail |
109 | headers could contain MIME encoded NUL characters, which could never | |
110 | be matched by Sieve scripts using exact comparisons. For that reason, | |
111 | this implementation extends the Sieve quoted string syntax with \0 | |
112 | to describe a NUL character. This deviates from RFC 3028, where \0 is |
1c59d63b | 113 | the same as 0. |
495ae4b0 PH |
114 | |
115 | The folder specified by "fileinto" must not contain the character | |
1c59d63b | 116 | sequence ".." to avoid security problems. RFC 3028 does not specify the |
495ae4b0 PH |
117 | syntax of folders apart from keep being equivalent to fileinto "INBOX". |
118 | This implementation uses "inbox" instead. | |
119 | ||
120 | Sieve script errors currently cause messages to be silently filed into | |
121 | "inbox". RFC 3028 requires that the user be notified of that condition. | |
122 | This may be implemented in the future by adding a header line to mails that | |
123 | are filed into "inbox" due to an error in the filter. | |
124 | ||
87fcc8b9 PH |
125 | The automatic replies generated by "vacation" do not contain an updated |
126 | "references" header field. | |
127 | ||
495ae4b0 | 128 | |
495ae4b0 PH |
129 | Semantics Of Keep |
130 | ||
131 | The keep command is equivalent to fileinto "inbox": It saves the | |
132 | message and resets the implicit keep flag. It does not set the | |
133 | implicit keep flag; there is no command to set it once it has | |
134 | been reset. | |
135 | ||
136 | ||
024bd3c2 | 137 | Semantics Of Fileinto |
495ae4b0 PH |
138 | |
139 | RFC 3028 does not specify whether "fileinto" tries to create a mail | |
140 | folder if it does not exist. This implementation makes that behaviour | |
141 | configurable using the appendfile transport options "create_directory", | |
142 | "create_file" and "file_must_exist". See the appendfile transport in | |
143 | the Exim specification for details. | |
144 | ||
145 | ||
024bd3c2 PH |
146 | Allof And Anyof Test |
147 | ||
148 | RFC 3028 does not specify if these tests use shortcut/lazy evaluation. | |
149 | Exim uses shortcut evaluation. | |
150 | ||
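What shortcut evaluation means for a script can be shown with a small
Python sketch. It is not Exim's code; the names allof, anyof and record
are made up for this illustration. Later tests in a list are not
evaluated once the result is known.

```python
from typing import Callable, Iterable, List

Test = Callable[[], bool]

def allof(tests: Iterable[Test]) -> bool:
    # all() stops at the first test that yields False
    return all(test() for test in tests)

def anyof(tests: Iterable[Test]) -> bool:
    # any() stops at the first test that yields True
    return any(test() for test in tests)

calls: List[str] = []

def record(name: str, result: bool) -> Test:
    """Build a test that logs its own evaluation (illustration only)."""
    def test() -> bool:
        calls.append(name)
        return result
    return test

# Only "a" is evaluated: its False result already decides allof.
allof([record("a", False), record("b", True)])
```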
151 | ||
152 | Action Reordering | |
153 | ||
154 | RFC 3028 does not specify if actions may be executed out of order. | |
155 | Exim may execute them out of order, e.g. messages may be filed to | |
156 | folders or forwarded in a different order than specified, because | |
157 | those actions only set up delivery, but do not execute it themselves. | |
158 | ||
159 | ||
160 | Wildcard Matching | |
161 | ||
162 | RFC 3028 is not exactly clear whether comparators act on Unicode characters | |
163 | or on octets containing their UTF-8 representation. As it turns out, | |
164 | many implementations go the second way. This makes no difference | |
165 | except for wildcard matching and octet-wise comparison. Working on Unicode | |
166 | means "?" matches a character. Working on UTF-8 means "?" matches | |
167 | a single octet of a multi-octet sequence. For octet-wise comparisons, | |
168 | working on UTF-8 means arbitrary byte sequences in headers cannot be | |
169 | matched, as they are rarely correct UTF-8 sequences and thus cannot be | |
170 | expressed as a string literal. This implementation works on Unicode, but | |
171 | this may change if RFC3028bis settles the issue unambiguously. | |
172 | ||
173 | ||
174 | Sieve Syntax And Semantics | |
495ae4b0 PH |
175 | |
176 | RFC 3028 confuses syntax and semantics sometimes. It uses a generic | |
1c59d63b PH |
177 | grammar as syntax for commands and tests and performs many checks during |
178 | semantic analysis. Syntax is specified by grammar rules, semantics | |
179 | by natural language, despite the latter often talking about syntax. | |
495ae4b0 PH |
180 | The intention was to provide a framework for the syntax that describes |
181 | current commands as well as future extensions, and describing commands | |
31c4e005 | 182 | by semantics. |
495ae4b0 | 183 | |
1c59d63b | 184 | The following replacement for section 8.2 gives two grammars, one for |
495ae4b0 PH |
185 | the framework, and one for specific commands, thus removing most of the |
186 | semantic analysis. Since the parser cannot parse unsupported extensions, | |
1c59d63b PH |
187 | the result is strict error checking of any executed and not executed code |
188 | until "stop" is executed or the end of the script is reached. | |
495ae4b0 PH |
189 | |
190 | 8.2. Grammar | |
191 | ||
192 | The atoms of the grammar are lexical tokens. White space or comments may | |
193 | appear anywhere between lexical tokens; they are not part of the grammar. | |
194 | The grammar is specified in ABNF with two extensions to describe tagged | |
195 | arguments that can be reordered and grammar extensions: { } denotes a | |
196 | sequence of symbols that may appear in any order. Example: | |
197 | ||
1c59d63b PH |
198 | options = a b c |
199 | start = { options } | |
495ae4b0 PH |
200 | |
201 | is equivalent to: | |
202 | ||
1c59d63b | 203 | start = ( a b c ) / ( a c b ) / ( b a c ) / ( b c a ) / ( c a b ) / ( c b a ) |
495ae4b0 PH |
204 | |
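The { } notation is shorthand for the alternation of all orderings of
the enclosed symbols. As a sketch of that reading (the helper name
expand_unordered is hypothetical and not part of any Sieve tool), the
expansion can be generated mechanically:

```python
from itertools import permutations
from typing import Sequence

def expand_unordered(symbols: Sequence[str]) -> str:
    """Return the plain ABNF alternation equivalent to { symbols }."""
    # one parenthesised group per ordering of the symbols
    groups = ("( " + " ".join(order) + " )" for order in permutations(symbols))
    return " / ".join(groups)

print("start = " + expand_unordered(["a", "b", "c"]))
```

For three symbols this yields the six alternatives listed above; the
number of alternatives grows factorially, which is why the shorthand is
used.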
205 | The symbol =) is used to append to a rule: | |
206 | ||
207 | start = a | |
208 | start =) b | |
209 | ||
210 | is equivalent to | |
211 | ||
212 | start = a b | |
213 | ||
214 | All Sieve commands, including extensions, MUST be words of the following | |
215 | generic grammar with the start symbol "start". They SHOULD be specified | |
216 | using a specific grammar, though. | |
217 | ||
218 | argument = string-list / number / tag | |
219 | arguments = *argument [test / test-list] | |
220 | block = "{" commands "}" | |
221 | commands = *command | |
222 | string = quoted-string / multi-line | |
223 | string-list = "[" string *("," string) "]" / string | |
224 | test = identifier arguments | |
225 | test-list = "(" test *("," test) ")" | |
226 | command = identifier arguments ( ";" / block ) | |
227 | start = command | |
228 | ||
229 | The basic Sieve commands are specified using the following grammar, whose | |
230 | language is a subset of that of the generic grammar above. The start symbol is | |
231 | "start". | |
232 | ||
233 | address-part = ":localpart" / ":domain" / ":all" | |
234 | comparator = ":comparator" string | |
235 | match-type = ":is" / ":contains" / ":matches" | |
236 | string = quoted-string / multi-line | |
237 | string-list = "[" string *("," string) "]" / string | |
238 | address-test = "address" { [address-part] [comparator] [match-type] } | |
239 | string-list string-list | |
240 | test-list = "(" test *("," test) ")" | |
241 | allof-test = "allof" test-list | |
242 | anyof-test = "anyof" test-list | |
243 | exists-test = "exists" string-list | |
244 | false-test = "false" | |
245 | true-test = "true" | |
246 | header-test = "header" { [comparator] [match-type] } | |
247 | string-list string-list | |
248 | not-test = "not" test | |
249 | relop = ":over" / ":under" | |
250 | size-test = "size" relop number | |
251 | block = "{" commands "}" | |
252 | if-command = "if" test block *( "elsif" test block ) [ "else" block ] | |
253 | stop-command = "stop" { stop-options } ";" | |
254 | stop-options = | |
255 | keep-command = "keep" { keep-options } ";" | |
256 | keep-options = | |
257 | discard-command = "discard" { discard-options } ";" | |
258 | discard-options = | |
259 | redirect-command = "redirect" { redirect-options } string ";" | |
260 | redirect-options = | |
261 | require-command = "require" { require-options } string-list ";" | |
262 | require-options = | |
263 | test = address-test / allof-test / anyof-test / exists-test | |
264 | / false-test / true-test / header-test / not-test | |
265 | / size-test | |
266 | command = if-command / stop-command / keep-command | |
267 | / discard-command / redirect-command | |
268 | commands = *command | |
269 | start = *require-command commands | |
270 | ||
271 | The extensions "envelope" and "fileinto" are specified using the following | |
272 | grammar extension. | |
273 | ||
274 | envelope-test = "envelope" { [comparator] [address-part] [match-type] } | |
275 | string-list string-list | |
276 | test =/ envelope-test | |
277 | ||
278 | fileinto-command = "fileinto" { fileinto-options } string ";" | |
279 | fileinto-options = | |
280 | command =/ fileinto-command | |
281 | ||
282 | The extension "copy" is specified as: | |
283 | ||
284 | fileinto-options =) ":copy" | |
285 | redirect-options =) ":copy" | |
286 | ||
287 | ||
288 | The i;ascii-numeric Comparator | |
289 | ||
290 | RFC 2244 describes this comparator and specifies that non-numeric strings | |
291 | are considered equal with an ordinal value higher than any numeric string. | |
292 | Although not stated explicitly, this includes the empty string. A range | |
293 | of at least 2^31 is required. This implementation does not limit the | |
294 | range, because it does not convert numbers to binary representation | |
295 | before comparing them. | |
296 | ||
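Comparing numbers without converting them to a binary representation can
be done on the digit strings themselves. The following Python sketch
shows one way to do that; it is not the Exim source. It assumes that a
string counts as numeric only if it is non-empty and consists entirely of
ASCII digits, and the function names are hypothetical.

```python
def _is_numeric(s: str) -> bool:
    # Assumption: numeric means non-empty and ASCII digits only
    return s != "" and s.isascii() and s.isdigit()

def ascii_numeric_cmp(a: str, b: str) -> int:
    """Return -1, 0 or 1 for a < b, a == b, a > b."""
    a_num, b_num = _is_numeric(a), _is_numeric(b)
    if not a_num and not b_num:
        return 0          # all non-numeric strings are equal
    if not a_num:
        return 1          # non-numeric sorts above any number
    if not b_num:
        return -1
    # Strip leading zeros; then more digits means a larger value,
    # and equal-length digit strings compare lexicographically.
    a_digits, b_digits = a.lstrip("0"), b.lstrip("0")
    if len(a_digits) != len(b_digits):
        return -1 if len(a_digits) < len(b_digits) else 1
    return (a_digits > b_digits) - (a_digits < b_digits)
```

Because only string lengths and characters are compared, values far
beyond 2^31 are handled without overflow.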
297 | ||
298 | The vacation extension | |
299 | ||
300 | The extension "vacation" is specified using the following grammar | |
301 | extension. | |
302 | ||
303 | vacation-command = "vacation" { vacation-options } <reason: string> | |
304 | vacation-options = [":days" number] | |
495ae4b0 | 305 | [":subject" string] |
f656d135 PH |
306 | [":from" string] |
307 | [":addresses" string-list] | |
495ae4b0 | 308 | [":mime"] |
f656d135 | 309 | [":handle" string] |
495ae4b0 PH |
310 | command =/ vacation-command |
311 | ||
312 | ||
313 | Semantics Of ":mime" | |
314 | ||
f656d135 PH |
315 | The draft does not specify how strings using MIME entities are used |
316 | to compose messages. As a result, different implementations generate | |
317 | different mails. The Exim Sieve implementation splits the reason into | |
318 | header and body. It adds the header to the mail header and uses the body | |
319 | as mail body. Be aware that other implementations compose a multipart | |
320 | structure with the reason as only part. Both conform to the specification | |
321 | (or lack thereof). | |
495ae4b0 PH |
322 | |
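The split of a ":mime" reason into header and body can be pictured with
the Python sketch below. It assumes the split happens at the first empty
line and that a reason without an empty line is treated as body only;
neither detail is stated above, and the function name is hypothetical.

```python
from typing import Tuple

def split_mime_reason(reason: str) -> Tuple[str, str]:
    """Split a ":mime" reason into (header, body) at the first empty line."""
    # accept both CRLF and LF line endings
    normalized = reason.replace("\r\n", "\n")
    header, separator, body = normalized.partition("\n\n")
    if not separator:
        # Assumption: no empty line means there is no header part
        return "", normalized
    return header, body

header, body = split_mime_reason("Content-Type: text/plain\n\nGone fishing.")
```

Here the header part would be added to the headers of the reply and the
body part would become the reply body.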
323 | ||
324 | Semantics Of Not Using ":mime" | |
325 | ||
326 | Sieve scripts are written in UTF-8, and so is the reason string in this | |
327 | case. This implementation adds MIME headers to indicate that. This | |
328 | is not required by the vacation draft, which does not specify how | |
329 | the UTF-8 reason is processed to compose the resulting message. | |
330 | ||
331 | ||
495ae4b0 PH |
332 | Default Subject |
333 | ||
5ea81592 PH |
334 | The draft specifies that the default message subject is "Auto: " plus |
335 | the old subject. Using this subject is dangerous, because many mailing | |
336 | lists verify addresses by sending a secret key in the subject of a | |
337 | message, asking to reply to the message for confirmation. Using the | |
338 | default vacation subject confirms any subscription request of this kind, | |
339 | allowing anyone to subscribe a third party to any mailing list, either to | |
340 | annoy the user or to pass off spam as legitimate mail by claiming opt-in. | |
495ae4b0 PH |
341 | |
342 | ||
343 | Rate Limiting Responses | |
344 | ||
f656d135 PH |
345 | In absence of a handle, this implementation hashes the reason, |
346 | ":subject" option, ":mime" option and ":from" option and uses the hex | |
347 | string representation as filename within the "sieve_vacation_directory" | |
348 | to store the recipient addresses for this vacation parameter set. | |
495ae4b0 PH |
349 | |
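How a parameter set could map to a file name is sketched below in Python.
This only illustrates the idea; the hash function (SHA-256 here), the
field order and the length prefixes are assumptions made for the example
and are not taken from the Exim source.

```python
import hashlib

def vacation_db_name(reason: str, subject: str = "",
                     mime: bool = False, from_addr: str = "") -> str:
    """Derive a hex file name from the vacation parameters (no handle)."""
    digest = hashlib.sha256()  # assumption: the real hash may differ
    for field in (reason, subject, "mime" if mime else "", from_addr):
        data = field.encode("utf-8")
        # length prefix keeps ("ab", "c") distinct from ("a", "bc")
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Two vacation commands with identical parameters share one recipient
database, while changing the reason or any of the options yields a new
file and therefore a fresh set of remembered recipients.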
350 | The draft specifies that sites may define a minimum ":days" value greater | |
351 | than 1. This implementation uses 1. The maximum value MUST be greater than 7, | |
352 | and SHOULD be greater than 30. This implementation uses a maximum of 31. | |
353 | ||
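The limits above amount to a range check on ":days". A minimal Python
sketch follows, assuming that out-of-range values are clamped to the
nearest limit and not rejected; the constant and function names are
hypothetical.

```python
VACATION_MIN_DAYS = 1    # minimum used by this implementation
VACATION_MAX_DAYS = 31   # maximum used by this implementation

def effective_days(requested: int) -> int:
    """Clamp a requested ":days" value into the supported range."""
    # Assumption: values outside the range are clamped, not rejected
    return max(VACATION_MIN_DAYS, min(requested, VACATION_MAX_DAYS))
```

Under this assumption ":days 0" behaves like ":days 1" and ":days 90"
behaves like ":days 31".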
354 | Vacation recipient address databases older than 31 days are automatically | |
355 | removed. Users do not have to remove them manually when modifying their | |
356 | scripts. Don't put anything but vacation databases in that directory | |
357 | or you risk that it will be removed, too! | |
358 | ||
359 | ||
360 | Global Reply Address Blacklist | |
361 | ||
362 | The draft requires that each implementation offers a global black list | |
363 | of addresses that will never be replied to. Exim offers this as option | |
364 | "never_mail" in the autoreply transport. | |
84024b72 PH |
365 | |
366 | ||
367 | The enotify extension | |
368 | ||
369 | The extension "enotify" is specified using the following grammar | |
370 | extension. | |
371 | ||
372 | notify-command = "notify" { notify-options } <method: string> | |
373 | notify-options = [":from" string] | |
374 | [":importance" <"1" / "2" / "3">] | |
375 | [":options" 1*(string-list / number)] | |
376 | [":message" string] | |
377 | ||
378 | command =/ notify-command | |
379 | ||
380 | valid_notify_method = "valid_notify_method" | |
381 | <notification-uris: string-list> | |
382 | ||
383 | test =/ valid_notify_method | |
384 | ||
385 | Only the mailto URI scheme is implemented. |