$Cambridge: exim/doc/doc-txt/README.SIEVE,v 1.9 2005/11/21 10:09:13 ph10 Exp $

Notes on the Sieve implementation for Exim

Exim Filter Versus Sieve Filter

Exim supports two incompatible filters: The traditional Exim filter and
the Sieve filter. Since Sieve is an extensible language, it is important
to understand "Sieve" in this context as "the specific implementation
of Sieve for Exim".

The Exim filter contains more features, such as variable expansion, and
better integration with the host environment, like external processes
and pipes.

Sieve is a standard for interoperable filters, defined in RFC 3028,
with multiple implementations around. If interoperability is important,
then there is no way around it.


Exim Implementation

The Exim Sieve implementation offers the core as defined by draft
3028bis-4 (next version of RFC 3028 that fixes specification mistakes),
the "envelope" (3028bis), "fileinto" (3028bis), "copy" (RFC 3894) and
"vacation" (draft-ietf-sieve-vacation-04.txt) extensions, and the
"i;ascii-numeric" comparator (RFC 2244).

The Sieve filter is integrated in Exim and works very similarly to the
Exim filter: Sieve scripts are recognized by the first line containing
"# sieve filter". When using "keep" or "fileinto" to save a mail into a
folder, the resulting string is available as the variable $address_file
in the transport that stores it. The following routers and transports
show a typical use of Sieve:

  begin routers

  localuser_verify:
    driver = accept
    domains = +localdomains
    local_part_suffix = "-*"
    local_part_suffix_optional
    check_local_user
    require_files = $home/.forward
    verify_only = true

  localuser_deliver:
    driver = redirect
    domains = +localdomains
    local_part_suffix = "-*"
    local_part_suffix_optional
    sieve_subaddress = "${sg{$local_part_suffix}{^-}{}}"
    sieve_useraddress = "$local_part"
    check_local_user
    require_files = $home/.forward
    file = $home/.forward
    check_ancestor
    allow_filter
    file_transport = localuser
    reply_transport = vacation
    sieve_vacation_directory = $home/mail/vacation
    verify = false

  begin transports

  localuser:
    driver = appendfile
    file = ${if eq{$address_file}{inbox} \
                {/var/mail/$local_part} \
                {${if eq{${substr_0_1:$address_file}}{/} \
                      {$address_file} \
                      {$home/mail/$address_file} \
                }} \
           }
    delivery_date_add
    envelope_to_add
    return_path_add
    mode = 0600

  vacation:
    driver = autoreply

Absolute files are stored where specified, relative files are stored
relative to $home/mail and "inbox" goes to the standard mailbox location.
To enable "vacation", set sieve_vacation_directory to the directory
where vacation databases are held (don't put anything else in that
directory) and point reply_transport to an autoreply transport.
Setting the Sieve useraddress and subaddress makes it possible to use
the subaddress extension.


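The mailbox selection described above can be restated outside the Exim
expansion language. The following Python sketch only mirrors the three
cases of the example "file" option; the function name and its
parameters are hypothetical and stand in for $address_file, $local_part
and $home.

```python
def resolve_mailbox(address_file: str, local_part: str, home: str) -> str:
    """Restate the nested ${if ...} of the example "localuser" transport."""
    if address_file == "inbox":
        # "inbox" goes to the standard mailbox location.
        return f"/var/mail/{local_part}"
    if address_file[:1] == "/":
        # Absolute names are stored where specified.
        return address_file
    # Everything else is stored relative to $home/mail.
    return f"{home}/mail/{address_file}"
```

For example, a folder named "lists/exim" for a user whose home is
/home/alice would end up in /home/alice/mail/lists/exim.

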
RFC Compliance

Exim requires the first line to be "# sieve filter". Of course the RFC
does not enforce that line. Don't expect examples to work without adding
it, though.

RFC 3028 requires CRLF to terminate each line.
The rationale was that CRLF is universally used in network protocols
to mark the end of the line. This implementation does not embed Sieve
in a network protocol, but uses Sieve scripts as part of the Exim MTA.
Since all parts of Exim use \n as newline character, this implementation
does, too. You can change this by defining the macro RFC_EOL at compile
time to enforce CRLF being used.

Sieve scripts cannot contain NUL characters in strings, but mail
headers could contain MIME encoded NUL characters, which could never
be matched by Sieve scripts using exact comparisons. For that reason,
this implementation extends the Sieve quoted string syntax with \0
to describe a NUL character. This deviates from RFC 3028, where \0 is
the same as 0.

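To illustrate the difference, here is a minimal Python sketch of
decoding the inside of a quoted string, with and without the \0
extension. It is not the Exim parser; the function name and the
"nul_extension" switch are hypothetical, and the input is assumed to be
the text between the surrounding double quotes.

```python
def decode_quoted(body: str, nul_extension: bool = True) -> str:
    """Resolve backslash escapes in the body of a Sieve quoted string."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        escaped = next(chars, None)
        if escaped is None:
            raise ValueError("dangling backslash at end of string")
        if escaped == "0" and nul_extension:
            # Extension described above: \0 stands for a NUL character.
            out.append("\x00")
        else:
            # Plain RFC 3028 behaviour: the backslash is simply dropped.
            out.append(escaped)
    return "".join(out)
```
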
The folder specified by "fileinto" must not contain the character
sequence ".." to avoid security problems. RFC 3028 does not specify the
syntax of folders apart from keep being equivalent to fileinto "INBOX".
This implementation uses "inbox" instead.

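The folder restriction amounts to a plain substring test. A minimal
Python sketch, with a hypothetical function name, could look like this:

```python
def folder_is_acceptable(folder: str) -> bool:
    """Reject any folder name that contains the character sequence ".."."""
    return ".." not in folder
```

Because the test looks at the character sequence and not at path
components, a name such as "a..b" is rejected as well.
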
Sieve script errors currently cause messages to be silently filed into
"inbox". RFC 3028 requires that the user is notified of that condition.
This may be implemented in future by adding a header line to mails that
are filed into "inbox" due to an error in the filter.

The automatic replies generated by "vacation" do not contain an updated
"references" header field.


Semantics Of Keep

The keep command is equivalent to fileinto "inbox": It saves the
message and resets the implicit keep flag. It does not set the
implicit keep flag; there is no command to set it once it has
been reset.


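The flag handling can be modelled in a few lines. The following Python
sketch is an illustration only: the class and method names are
hypothetical, and the final step (saving to "inbox" when the flag is
still set at the end of the script) is the usual Sieve implicit keep
rather than something spelled out in this document.

```python
class Actions:
    """Collect deliveries and track the implicit keep flag."""

    def __init__(self) -> None:
        self.implicit_keep = True
        self.deliveries = []

    def fileinto(self, folder: str) -> None:
        self.deliveries.append(folder)
        self.implicit_keep = False  # reset; nothing ever sets it again

    def keep(self) -> None:
        # keep is equivalent to fileinto "inbox".
        self.fileinto("inbox")

    def finish(self) -> list:
        if self.implicit_keep:
            self.deliveries.append("inbox")
        return self.deliveries
```

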
Semantics Of Fileinto

RFC 3028 does not specify if "fileinto" tries to create a mail folder,
in case it does not exist. This implementation allows that aspect to be
configured using the appendfile transport options "create_directory",
"create_file" and "file_must_exist". See the appendfile transport in
the Exim specification for details.


Allof And Anyof Test

RFC 3028 does not specify if these tests use shortcut/lazy evaluation.
Exim uses shortcut evaluation.


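Shortcut evaluation means that later tests are not run once the result
is known. A Python sketch of the idea, with hypothetical helper names
and tests modelled as functions without arguments:

```python
from typing import Callable, Iterable, List

Test = Callable[[], bool]

def allof(tests: Iterable[Test]) -> bool:
    # all() stops at the first test that returns False.
    return all(test() for test in tests)

def anyof(tests: Iterable[Test]) -> bool:
    # any() stops at the first test that returns True.
    return any(test() for test in tests)

def traced(name: str, result: bool, log: List[str]) -> Test:
    """Build a test that records its name in log when it is evaluated."""
    def run() -> bool:
        log.append(name)
        return result
    return run
```

With a log list, allof([traced("a", False, log), traced("b", True, log)])
evaluates only "a".

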
Action Reordering

RFC 3028 does not specify if actions may be executed out of order.
Exim may execute them out of order, e.g. messages may be filed to
folders or forwarded in a different order than specified, because
those actions only set up delivery, but do not execute it themselves.


Wildcard Matching

RFC 3028 is not exactly clear if comparators act on Unicode characters
or on octets containing their UTF-8 representation. As it turns out,
many implementations go the second way. This does not make a difference
but for wildcard matching and octet-wise comparison. Working on Unicode
means the "?" wildcard matches a character. Working on UTF-8 means it
matches a single octet of a multi-octet sequence. For octet-wise
comparisons, working on UTF-8 means arbitrary byte sequences in headers
cannot be matched, as they are rarely correct UTF-8 sequences and thus
cannot be expressed as a string literal. This implementation works on
Unicode, but this may be changed if RFC 3028bis settles this issue.


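The difference shows up as soon as a single-character wildcard meets a
non-ASCII character. The Python sketch below translates a simplified
wildcard pattern ("?" and "*" only, no backslash escapes) into a
regular expression and applies it once to characters and once to UTF-8
octets. All function names are hypothetical and this is not how Exim
implements matching.

```python
import re

def wildcard_to_regex(pattern: str) -> str:
    """Translate "?" and "*" into regex syntax; escape everything else."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)

def matches_unicode(pattern: str, text: str) -> bool:
    # "?" consumes one Unicode character.
    return re.fullmatch(wildcard_to_regex(pattern), text, re.DOTALL) is not None

def matches_octets(pattern: str, text: str) -> bool:
    # "?" consumes one octet of the UTF-8 representation.
    regex = wildcard_to_regex(pattern).encode("utf-8")
    return re.fullmatch(regex, text.encode("utf-8"), re.DOTALL) is not None
```

With the text "caf\u00e9", the pattern "caf?" matches on characters but
not on octets, where "caf??" is needed because the last character takes
two octets in UTF-8.

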
Sieve Syntax And Semantics

RFC 3028 confuses syntax and semantics sometimes. It uses a generic
grammar as syntax for commands and tests and performs many checks during
semantic analysis. Syntax is specified by grammar rules, semantics
by natural language, despite the latter often talking about syntax.
The intention was to provide a framework for the syntax that describes
current commands as well as future extensions, and describing commands
by semantics.

The following replacement for section 8.2 gives two grammars, one for
the framework, and one for specific commands, thus removing most of the
semantic analysis. Since the parser can not parse unsupported extensions,
the result is strict error checking of any executed and not executed code
until "stop" is executed or the end of the script is reached.

8.2. Grammar

The atoms of the grammar are lexical tokens. White space or comments may
appear anywhere between lexical tokens; they are not part of the grammar.
The grammar is specified in ABNF with two extensions to describe tagged
arguments that can be reordered and grammar extensions: { } denotes a
sequence of symbols that may appear in any order. Example:

  options = a b c
  start = { options }

is equivalent to:

  start = ( a b c ) / ( a c b ) / ( b a c ) / ( b c a ) / ( c a b ) / ( c b a )

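The { } notation is shorthand for all permutations of the enclosed
symbols. As a small illustration, the following Python sketch (with a
hypothetical function name) expands a symbol list into the equivalent
plain ABNF alternatives:

```python
from itertools import permutations
from typing import Sequence

def expand_any_order(symbols: Sequence[str]) -> str:
    """Spell out { ... } as an ABNF alternation over every ordering."""
    return " / ".join(
        "( " + " ".join(order) + " )" for order in permutations(symbols)
    )
```

Calling expand_any_order(["a", "b", "c"]) yields the six alternatives
of the example rule.
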
The symbol =) is used to append to a rule:

  start = a
  start =) b

is equivalent to

  start = a b

All Sieve commands, including extensions, MUST be words of the following
generic grammar with the start symbol "start". They SHOULD be specified
using a specific grammar, though.

  argument    = string-list / number / tag
  arguments   = *argument [test / test-list]
  block       = "{" commands "}"
  commands    = *command
  string      = quoted-string / multi-line
  string-list = "[" string *("," string) "]" / string
  test        = identifier arguments
  test-list   = "(" test *("," test) ")"
  command     = identifier arguments ( ";" / block )
  start       = command

The basic Sieve commands are specified using the following grammar, whose
language is a subset of the generic grammar above. The start symbol is
"start".

  address-part     = ":localpart" / ":domain" / ":all"
  comparator       = ":comparator" string
  match-type       = ":is" / ":contains" / ":matches"
  string           = quoted-string / multi-line
  string-list      = "[" string *("," string) "]" / string
  address-test     = "address" { [address-part] [comparator] [match-type] }
                     string-list string-list
  test-list        = "(" test *("," test) ")"
  allof-test       = "allof" test-list
  anyof-test       = "anyof" test-list
  exists-test      = "exists" string-list
  false-test       = "false"
  true-test        = "true"
  header-test      = "header" { [comparator] [match-type] }
                     string-list string-list
  not-test         = "not" test
  relop            = ":over" / ":under"
  size-test        = "size" relop number
  block            = "{" commands "}"
  if-command       = "if" test block *( "elsif" test block ) [ "else" block ]
  stop-command     = "stop" { stop-options } ";"
  stop-options     =
  keep-command     = "keep" { keep-options } ";"
  keep-options     =
  discard-command  = "discard" { discard-options } ";"
  discard-options  =
  redirect-command = "redirect" { redirect-options } string ";"
  redirect-options =
  require-command  = "require" { require-options } string-list ";"
  require-options  =
  test             = address-test / allof-test / anyof-test / exists-test
                     / false-test / true-test / header-test / not-test
                     / size-test
  command          = if-command / stop-command / keep-command
                     / discard-command / redirect-command
  commands         = *command
  start            = *require-command commands

The extensions "envelope" and "fileinto" are specified using the following
grammar extension.

  envelope-test    = "envelope" { [comparator] [address-part] [match-type] }
                     string-list string-list
  test             =/ envelope-test

  fileinto-command = "fileinto" { fileinto-options } string ";"
  fileinto-options =
  command          =/ fileinto-command

The extension "copy" is specified as:

  fileinto-options =) ":copy"
  redirect-options =) ":copy"


The i;ascii-numeric Comparator

RFC 2244 describes this comparator and specifies that non-numeric strings
are considered equal with an ordinal value higher than any numeric string.
Although not stated explicitly, this includes the empty string. A range
of at least 2^31 is required. This implementation does not limit the
range, because it does not convert numbers to binary representation
before comparing them.


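Comparing digit strings without converting them to binary can be done
by dropping leading zeros and then comparing length before content. The
Python sketch below illustrates that idea and is not the Exim code.
Function names are hypothetical, and treating the leading run of digits
as the number (so that a string not starting with a digit counts as
non-numeric) is an assumption of this sketch.

```python
def numeric_prefix(value: str) -> str:
    """Return the leading run of ASCII digits, possibly empty."""
    end = 0
    while end < len(value) and value[end] in "0123456789":
        end += 1
    return value[:end]

def ascii_numeric_compare(left: str, right: str) -> int:
    """Return -1, 0 or 1 without ever converting to a binary integer."""
    a, b = numeric_prefix(left), numeric_prefix(right)
    if not a and not b:
        return 0            # all non-numeric strings are equal
    if not a:
        return 1            # non-numeric sorts above any number
    if not b:
        return -1
    a, b = a.lstrip("0"), b.lstrip("0")
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    return (a > b) - (a < b)  # same length: digit order is numeric order
```

Since only string lengths and characters are compared, the numbers can
be arbitrarily long.

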
The Vacation Extension

The extension "vacation" is specified using the following grammar
extension.

  vacation-command = "vacation" { vacation-options } <reason: string> ";"
  vacation-options = [":days" number]
                     [":subject" string]
                     [":from" string]
                     [":addresses" string-list]
                     [":mime"]
                     [":handle" string]
  command          =/ vacation-command


Semantics Of ":mime"

The draft does not specify how strings using MIME entities are used
to compose messages. As a result, different implementations generate
different mails. The Exim Sieve implementation splits the reason into
header and body. It adds the header to the mail header and uses the body
as mail body. Be aware that other implementations compose a multipart
structure with the reason as the only part. Both conform to the
specification (or lack thereof).


Semantics Of Not Using ":mime"

Sieve scripts are written in UTF-8, so is the reason string in this
case. This implementation adds MIME headers to indicate that. This
is not required by the vacation draft, which does not specify how
the UTF-8 reason is processed to compose the resulting message.


Default Subject

The draft specifies that the default message subject is "Auto: " plus
the old subject. Using this subject is dangerous, because many mailing
lists verify addresses by sending a secret key in the subject of a
message, asking to reply to the message for confirmation. Using the
default vacation subject confirms any subscription request of this kind,
allowing an attacker to subscribe a third party to any mailing list,
either to annoy the user or to pass spam off as legitimate mail by
appearing to prove opt-in.


Rate Limiting Responses

In the absence of a handle, this implementation hashes the reason,
":subject" option, ":mime" option and ":from" option and uses the hex
string representation as filename within the "sieve_vacation_directory"
to store the recipient addresses for this vacation parameter set.

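The naming scheme can be pictured with the following Python sketch. It
only covers the case without a handle. The choice of MD5, the way the
fields are fed into the hash, and all names are assumptions made for
illustration and are not taken from the Exim source.

```python
import hashlib
import os

def vacation_db_path(directory: str, reason: str, subject: str = "",
                     mime: bool = False, sender: str = "") -> str:
    """Derive a database file name from the vacation parameter set."""
    digest = hashlib.md5()  # placeholder hash, an assumption of this sketch
    for field in (reason, subject, "1" if mime else "0", sender):
        data = field.encode("utf-8")
        # A length prefix keeps adjacent fields from running together.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return os.path.join(directory, digest.hexdigest())
```

Two vacation commands with identical parameters map to the same file,
so they share one list of recipients. Changing any parameter starts a
new list.
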
The draft specifies that sites may define a minimum ":days" value.
This implementation uses 1. The maximum value MUST be greater than 7,
and SHOULD be greater than 30. This implementation uses a maximum of 31.

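As a small illustration of these limits, the Python sketch below forces
a requested ":days" value into the range 1 to 31. The constant and
function names are hypothetical, and clamping out-of-range values
(rather than rejecting the script) is an assumption of this sketch.

```python
VACATION_MIN_DAYS = 1   # minimum used by this implementation
VACATION_MAX_DAYS = 31  # maximum used by this implementation

def effective_days(requested: int) -> int:
    """Clamp a requested ":days" value to the supported range."""
    return max(VACATION_MIN_DAYS, min(requested, VACATION_MAX_DAYS))
```
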
Vacation recipient address databases older than 31 days are automatically
removed. Users do not have to remove them manually when modifying their
scripts. Don't put anything but vacation databases in that directory
or you risk that it will be removed, too!


Global Reply Address Blacklist

The draft requires that each implementation offers a global black list
of addresses that will never be replied to. Exim offers this as option
"never_mail" in the autoreply transport.