$Cambridge: exim/doc/doc-txt/README.SIEVE,v 1.10 2006/04/25 10:44:57 ph10 Exp $

Notes on the Sieve implementation for Exim

Exim Filter Versus Sieve Filter

Exim supports two incompatible filters: the traditional Exim filter and
the Sieve filter. Since Sieve is an extensible language, it is important
to understand "Sieve" in this context as "the specific implementation
of Sieve for Exim".

The Exim filter offers more features, such as variable expansion, and
better integration with the host environment, such as external processes
and pipes.

Sieve is a standard for interoperable filters, defined in RFC 3028,
with multiple implementations available. If interoperability is important,
Sieve is the only option.


Exim Implementation

The Exim Sieve implementation offers the core as defined by
draft-ietf-sieve-3028bis-05.txt (the next version of RFC 3028, which fixes
specification mistakes), the "envelope" test (3028bis), the "fileinto"
action (3028bis), the "copy" action (RFC 3894), the "vacation" action
(draft-ietf-sieve-vacation-05.txt) and the "i;ascii-numeric" comparator
extension (RFC 2244).

The Sieve filter is integrated in Exim and works very similarly to the
Exim filter: Sieve scripts are recognized by the first line containing
"# sieve filter". When using "keep" or "fileinto" to save a mail into a
folder, the resulting string is available as the variable $address_file
in the transport that stores it. The following routers and transports
show a typical use of Sieve:

begin routers

localuser_verify:
  driver = accept
  domains = +localdomains
  local_part_suffix = "-*"
  local_part_suffix_optional
  check_local_user
  require_files = $home/.forward
  verify_only = true

localuser_deliver:
  driver = redirect
  domains = +localdomains
  local_part_suffix = "-*"
  local_part_suffix_optional
  sieve_subaddress = "${sg{$local_part_suffix}{^-}{}}"
  sieve_useraddress = "$local_part"
  check_local_user
  require_files = $home/.forward
  file = $home/.forward
  check_ancestor
  allow_filter
  file_transport = localuser
  reply_transport = vacation
  sieve_vacation_directory = $home/mail/vacation
  verify = false

begin transports

localuser:
  driver = appendfile
  file = ${if eq{$address_file}{inbox} \
              {/var/mail/$local_part} \
              {${if eq{${substr_0_1:$address_file}}{/} \
                    {$address_file} \
                    {$home/mail/$address_file} \
              }} \
         }
  delivery_date_add
  envelope_to_add
  return_path_add
  mode = 0600

vacation:
  driver = autoreply

Absolute files are stored where specified, relative files are stored
relative to $home/mail and "inbox" goes to the standard mailbox location.
To enable "vacation", set sieve_vacation_directory to the directory
where vacation databases are held (don't put anything else in that
directory) and point reply_transport to an autoreply transport.
Setting the Sieve useraddress and subaddress makes it possible to use the
subaddress extension.
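
As an illustration only, the nested ${if ...} expansion of the "localuser"
transport above can be restated in Python. The function name
resolve_mailbox and its parameters are hypothetical; they stand in for
$address_file, $local_part and $home and are not part of Exim.

```python
def resolve_mailbox(address_file: str, local_part: str, home: str) -> str:
    """Restate the file expansion of the example "localuser" transport."""
    if address_file == "inbox":
        # "inbox" goes to the standard mailbox location.
        return "/var/mail/" + local_part
    if address_file[:1] == "/":
        # Absolute names are stored where specified.
        return address_file
    # Relative names are stored relative to $home/mail.
    return home + "/mail/" + address_file
```

With a home directory of /home/alice, the folder "lists/exim" would end up
in /home/alice/mail/lists/exim under this configuration.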


RFC Compliance

Exim requires the first line to be "# sieve filter". The RFC, of course,
does not require that line, but do not expect examples to work without
adding it.

RFC 3028 requires CRLF to terminate a line. The rationale was that CRLF
is universally used in network protocols to mark the end of a line. This
implementation does not embed Sieve in a network protocol, but uses Sieve
scripts as part of the Exim MTA. Since all parts of Exim use \n as the
newline character, this implementation does, too. You can change this by
defining the macro RFC_EOL at compile time to enforce the use of CRLF.

Sieve scripts cannot contain NUL characters in strings, but mail
headers could contain MIME encoded NUL characters, which could never
be matched by Sieve scripts using exact comparisons. For that reason,
this implementation extends the Sieve quoted string syntax with \0
to describe a NUL character, violating the rule in RFC 3028 that \0 is
the same as 0.
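
The effect of the \0 extension can be sketched as a small decoder for the
text between the double quotes of a quoted string. This is an illustration
in Python, not Exim's parser; the function name decode_quoted_string is
hypothetical, and rejecting a trailing backslash is an assumption.

```python
def decode_quoted_string(body: str) -> str:
    """Decode the contents of a Sieve quoted string (without the quotes)."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        escaped = next(chars, None)
        if escaped is None:
            # Assumption: a lone backslash at the end is an error.
            raise ValueError("dangling backslash at end of string")
        # The extension described above: backslash-zero yields NUL.
        # Any other escaped character stands for itself.
        out.append("\0" if escaped == "0" else escaped)
    return "".join(out)
```

Under plain RFC 3028 rules the same input would decode to the digit 0
instead of a NUL character.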

The folder specified by "fileinto" must not contain the character
sequence ".." to avoid security problems. RFC 3028 does not specify the
syntax of folders apart from keep being equivalent to fileinto "INBOX".
This implementation uses "inbox" instead.
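
The folder restriction above amounts to a simple substring check. The
following Python sketch uses a hypothetical function name and only shows
the rule as stated; it makes no claim about where Exim performs the check.

```python
def fileinto_folder_allowed(folder: str) -> bool:
    """Reject any folder name containing the character sequence ".."."""
    return ".." not in folder
```

Note that the rule, as worded, also rejects harmless-looking names such as
"a..b", because the sequence may appear anywhere in the name.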

Sieve script errors currently cause messages to be silently filed into
"inbox". RFC 3028 requires that the user be notified of that condition.
This may be implemented in the future by adding a header line to mails
that are filed into "inbox" due to an error in the filter.

The automatic replies generated by "vacation" do not contain an updated
"references" header field.


Semantics Of Keep

The keep command is equivalent to fileinto "inbox": it saves the
message and resets the implicit keep flag. It does not set the
implicit keep flag; there is no command to set it once it has
been reset.
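
The interaction between keep and the implicit keep flag can be modelled
in a few lines. This is a minimal sketch with hypothetical names
(ScriptRun, finish); it only models the flag and the list of target
folders, not actual delivery.

```python
class ScriptRun:
    """Track the implicit keep flag while a script executes."""

    def __init__(self):
        self.implicit_keep = True   # set at the start of every run
        self.folders = []           # folders the message is filed into

    def fileinto(self, folder):
        self.folders.append(folder)
        self.implicit_keep = False  # reset; nothing can set it again

    def keep(self):
        # keep is equivalent to fileinto "inbox"
        self.fileinto("inbox")

    def finish(self):
        # The implicit keep only applies if the flag was never reset.
        if self.implicit_keep:
            self.folders.append("inbox")
        return self.folders
```

A script that runs keep therefore files the message into "inbox" once,
not twice, because the explicit keep has already reset the flag.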


Semantics Of Fileinto

RFC 3028 does not specify whether "fileinto" tries to create a mail folder
if it does not exist. This implementation lets you configure that
aspect using the appendfile transport options "create_directory",
"create_file" and "file_must_exist". See the appendfile transport in
the Exim specification for details.


Allof And Anyof Test

RFC 3028 does not specify whether these tests use shortcut/lazy evaluation.
Exim uses shortcut evaluation.
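
Shortcut evaluation means that later tests are not evaluated once the
result is known. The Python sketch below records which tests actually run;
make_test, allof and anyof are hypothetical helpers written for this
illustration only.

```python
def make_test(name, result, log):
    """Build a test that records its name in log when it is evaluated."""
    def test():
        log.append(name)
        return result
    return test

def allof(tests):
    # all() stops at the first test that returns False.
    return all(t() for t in tests)

def anyof(tests):
    # any() stops at the first test that returns True.
    return any(t() for t in tests)

log = []
result = allof([make_test("first", False, log), make_test("second", True, log)])
# result is False and log == ["first"]: the second test never ran.
```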


Action Reordering

RFC 3028 does not specify whether actions may be executed out of order.
Exim may execute them out of order, e.g. messages may be filed to
folders or forwarded in a different order than specified, because
those actions only set up delivery, but do not execute it themselves.


Wildcard Matching

RFC 3028 is not entirely clear whether comparators act on Unicode
characters or on octets containing their UTF-8 representation. As it turns
out, many implementations go the second way. This makes a difference only
for wildcard matching and octet-wise comparison. Working on Unicode
means a single-character wildcard matches a character. Working on UTF-8
means it matches a single octet of a multi-octet sequence. For octet-wise
comparisons, working on UTF-8 means arbitrary byte sequences in headers
cannot be matched, as they are rarely correct UTF-8 sequences and thus
cannot be expressed as string literals. This implementation works on
Unicode, but this may change if RFC3028bis settles the issue.
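
The difference between the two interpretations shows up as soon as a
non-ASCII character meets a single-character wildcard. The Python sketch
below uses the regular expression "." as a stand-in for such a wildcard;
it is not Exim's matcher, and the function name is hypothetical.

```python
import re

def single_wildcard_matches(value) -> bool:
    """True if one "match exactly one unit" wildcard covers the whole value.

    For str the unit is a character, for bytes it is an octet.
    """
    pattern = b"." if isinstance(value, bytes) else "."
    return re.fullmatch(pattern, value, flags=re.DOTALL) is not None

text = "\u00e4"                  # a with diaeresis: one character
octets = text.encode("utf-8")    # the same text as two octets, C3 A4
```

Matching on characters accepts text, while matching on octets rejects
octets, because the wildcard would cover only the first of the two octets.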


Sieve Syntax And Semantics

RFC 3028 sometimes confuses syntax and semantics. It uses a generic
grammar as syntax for commands and tests and performs many checks during
semantic analysis. Syntax is specified by grammar rules, semantics
by natural language, despite the latter often talking about syntax.
The intention was to provide a framework for the syntax that describes
current commands as well as future extensions, and to describe commands
by semantics.

The following replacement for section 8.2 gives two grammars, one for
the framework, and one for specific commands, thus removing most of the
semantic analysis. Since the parser cannot parse unsupported extensions,
the result is strict error checking of all code, whether executed or not,
until "stop" is executed or the end of the script is reached.

8.2. Grammar

The atoms of the grammar are lexical tokens. White space or comments may
appear anywhere between lexical tokens; they are not part of the grammar.
The grammar is specified in ABNF with two extensions to describe tagged
arguments that can be reordered and grammar extensions: { } denotes a
sequence of symbols that may appear in any order. Example:

  options = a b c
  start   = { options }

is equivalent to:

  start   = ( a b c ) / ( a c b ) / ( b a c ) / ( b c a ) / ( c a b ) / ( c b a )
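
The { } notation is shorthand for all orderings of the enclosed symbols.
As a sketch, the expansion shown above can be generated in Python with
itertools.permutations; the function name expand_any_order is
hypothetical.

```python
from itertools import permutations

def expand_any_order(symbols):
    """Rewrite { s1 s2 ... } as plain ABNF alternatives, one per ordering."""
    return " / ".join(
        "( " + " ".join(ordering) + " )" for ordering in permutations(symbols)
    )

rule = "start = " + expand_any_order(["a", "b", "c"])
```

The number of alternatives grows factorially with the number of symbols,
which is why the notation is worth having for tagged arguments.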

The symbol =) is used to append to a rule:

  start = a
  start =) b

is equivalent to

  start = a b

All Sieve commands, including extensions, MUST be words of the following
generic grammar with the start symbol "start". They SHOULD be specified
using a specific grammar, though.

  argument    = string-list / number / tag
  arguments   = *argument [test / test-list]
  block       = "{" commands "}"
  commands    = *command
  string      = quoted-string / multi-line
  string-list = "[" string *("," string) "]" / string
  test        = identifier arguments
  test-list   = "(" test *("," test) ")"
  command     = identifier arguments ( ";" / block )
  start       = command

The basic Sieve commands are specified using the following grammar, whose
language is a subset of the generic grammar above. The start symbol is
"start".

  address-part     = ":localpart" / ":domain" / ":all"
  comparator       = ":comparator" string
  match-type       = ":is" / ":contains" / ":matches"
  string           = quoted-string / multi-line
  string-list      = "[" string *("," string) "]" / string
  address-test     = "address" { [address-part] [comparator] [match-type] }
                     string-list string-list
  test-list        = "(" test *("," test) ")"
  allof-test       = "allof" test-list
  anyof-test       = "anyof" test-list
  exists-test      = "exists" string-list
  false-test       = "false"
  true-test        = "true"
  header-test      = "header" { [comparator] [match-type] }
                     string-list string-list
  not-test         = "not" test
  relop            = ":over" / ":under"
  size-test        = "size" relop number
  block            = "{" commands "}"
  if-command       = "if" test block *( "elsif" test block ) [ "else" block ]
  stop-command     = "stop" { stop-options } ";"
  stop-options     =
  keep-command     = "keep" { keep-options } ";"
  keep-options     =
  discard-command  = "discard" { discard-options } ";"
  discard-options  =
  redirect-command = "redirect" { redirect-options } string ";"
  redirect-options =
  require-command  = "require" { require-options } string-list ";"
  require-options  =
  test             = address-test / allof-test / anyof-test / exists-test
                     / false-test / true-test / header-test / not-test
                     / size-test
  command          = if-command / stop-command / keep-command
                     / discard-command / redirect-command
  commands         = *command
  start            = *require-command commands

The extensions "envelope" and "fileinto" are specified using the following
grammar extension.

  envelope-test    = "envelope" { [comparator] [address-part] [match-type] }
                     string-list string-list
  test             =/ envelope-test

  fileinto-command = "fileinto" { fileinto-options } string ";"
  fileinto-options =
  command          =/ fileinto-command

The extension "copy" is specified as:

  fileinto-options =) ":copy"
  redirect-options =) ":copy"


The i;ascii-numeric Comparator

RFC 2244 describes this comparator and specifies that non-numeric strings
are considered equal with an ordinal value higher than any numeric string.
Although not stated explicitly, this includes the empty string. A range
of at least 2^31 is required. This implementation does not limit the
range, because it does not convert numbers to binary representation
before comparing them.
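
Comparing numbers without converting them to binary can be done on the
digit strings themselves: after dropping leading zeros, the longer string
is the larger number, and strings of equal length compare digit by digit.
The Python sketch below illustrates that idea and is not Exim's code. The
name ascii_numeric_key is hypothetical, and treating a string with a
numeric prefix such as "12abc" as non-numeric is an assumption of this
sketch.

```python
def ascii_numeric_key(value: str):
    """Return a sort key that orders strings like i;ascii-numeric."""
    is_numeric = value != "" and all("0" <= ch <= "9" for ch in value)
    if not is_numeric:
        # All non-numeric strings, including "", are equal to each other
        # and rank above every numeric string.
        return (1, 0, "")
    digits = value.lstrip("0") or "0"
    # Length first, then digit-wise order: no integer conversion needed,
    # so the range of representable numbers is unlimited.
    return (0, len(digits), digits)
```

Two values are compared by comparing their keys, for example
ascii_numeric_key("9") < ascii_numeric_key("10").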


The Vacation Extension

The extension "vacation" is specified using the following grammar
extension.

  vacation-command = "vacation" { vacation-options } <reason: string>
  vacation-options = [":days" number]
                     [":subject" string]
                     [":from" string]
                     [":addresses" string-list]
                     [":mime"]
                     [":handle" string]
  command          =/ vacation-command


Semantics Of ":mime"

The draft does not specify how strings using MIME entities are used
to compose messages. As a result, different implementations generate
different mails. The Exim Sieve implementation splits the reason into
header and body. It adds the header to the mail header and uses the body
as mail body. Be aware that other implementations compose a multipart
structure with the reason as the only part. Both conform to the
specification (or lack thereof).
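
Splitting a ":mime" reason into header and body can be pictured as
follows. This Python sketch assumes the split happens at the first empty
line and that a reason without an empty line consists of header lines
only; neither detail is stated above, and the function name is
hypothetical.

```python
def split_mime_reason(reason: str):
    """Split a MIME entity into its header lines and its body."""
    normalized = reason.replace("\r\n", "\n")
    # Assumption: the first empty line separates header from body.
    header, _separator, body = normalized.partition("\n\n")
    return header.split("\n"), body

header_lines, body = split_mime_reason(
    "Content-Type: text/plain; charset=utf-8\n\nI am away until Monday.\n"
)
```

An implementation that wraps the reason in a multipart structure would
instead keep the whole string together as a single part.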


Semantics Of Not Using ":mime"

Sieve scripts are written in UTF-8, and so is the reason string in this
case. This implementation adds MIME headers to indicate that. This
is not required by the vacation draft, which does not specify how
the UTF-8 reason is processed to compose the resulting message.


Default Subject

The draft specifies that the default message subject is "Auto: " plus
the old subject. Using this subject is dangerous, because many mailing
lists verify addresses by sending a secret key in the subject of a
message, asking to reply to the message for confirmation. Using the
default vacation subject confirms any subscription request of this kind,
allowing an attacker to subscribe a third party to any mailing list,
either to annoy the user or to pass off spam as legitimate mail by
claiming opt-in confirmation.
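
The problem is easy to see once the default subject is written out. In
the Python sketch below, the function name and the confirmation subject
are made up for illustration; only the "Auto: " prefix comes from the
text above.

```python
def default_vacation_subject(original_subject: str) -> str:
    """Build the default subject as described: "Auto: " plus the old one."""
    return "Auto: " + original_subject

# Hypothetical confirmation request from a mailing list manager.
incoming_subject = "confirm 3f9c1a"
reply_subject = default_vacation_subject(incoming_subject)
```

The secret key from the incoming subject is still present in
reply_subject, so a list manager that searches replies for the key
treats the automatic reply as a confirmation.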


Rate Limiting Responses

In the absence of a handle, this implementation hashes the reason,
":subject" option, ":mime" option and ":from" option and uses the hex
string representation as the filename within the "sieve_vacation_directory"
to store the recipient addresses for this vacation parameter set.
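
Deriving a database filename from the vacation parameters might look like
the following Python sketch. The hash function, the field order and the
way fields are delimited are all assumptions made for this illustration
(SHA-256 is used simply because it is always available in hashlib); the
text above does not specify them, and the function name is hypothetical.

```python
import hashlib

def vacation_db_name(reason, subject=None, mime=False, from_addr=None):
    """Map one vacation parameter set to a hex filename."""
    digest = hashlib.sha256()  # assumption: stand-in hash function
    for field in (reason, subject or "", "1" if mime else "0", from_addr or ""):
        data = field.encode("utf-8")
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Because the name depends on the parameters, editing the reason or subject
in a script starts a fresh recipient database rather than reusing the old
one.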

The draft specifies that sites may define a minimum ":days" value greater
than 1. This implementation uses 1. The maximum value MUST be greater
than 7, and SHOULD be greater than 30. This implementation uses a maximum
of 31.
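
The limits above can be expressed as a range check. The Python sketch
below assumes that out-of-range values are clamped to the nearest limit
rather than rejected, which is not stated here; the names are
hypothetical.

```python
MIN_DAYS = 1    # minimum used by this implementation
MAX_DAYS = 31   # maximum used by this implementation

def effective_days(requested: int) -> int:
    """Clamp a requested ":days" value into the supported range."""
    return max(MIN_DAYS, min(MAX_DAYS, requested))
```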

Vacation recipient address databases older than 31 days are automatically
removed. Users do not have to remove them manually when modifying their
scripts. Don't put anything but vacation databases in that directory,
or you risk it being removed, too!


Global Reply Address Blacklist

The draft requires that each implementation offer a global blacklist
of addresses that will never be replied to. Exim offers this as the option
"never_mail" in the autoreply transport.