Commit | Line | Data |
---|---|---|
e05f33e0 PH |
1 | Conformance with RFCs |
2 | --------------------- | |
3 | ||
4 | Exim is written to follow the rules laid down in the RFCs. However, there are | |
5 | some circumstances where it either extends what is specified, or chooses not to | |
6 | follow them strictly, for various reasons. Sometimes variations are controlled | |
7 | by an option, which may default on or off. This document lists the variations | |
8 | from the latest email RFCs, and discusses their background and implications. | |
9 | ||
10 | Last Updated: 25 January 1999 | |
11 | ||
12 | ||
13 | 1. RFC 822 | |
14 | ---------- | |
15 | ||
16 | The original specification of the format of Internet mail messages is RFC 822, | |
17 | later clarified and modified by RFC 1123. At the time of writing (January 1999) | |
18 | a new RFC (currently known as draft-ietf-drums-msg-fmt-07) which updates and | |
19 | consolidates all the material related to the message format is at a late stage | |
20 | of drafting, and is expected to become an Internet Standard in due course. | |
21 | ||
22 | The following is (I hope) a complete list of major variations from the draft | |
23 | RFC. References in square brackets are to the -07 draft. | |
24 | ||
25 | ||
26 | 1.1 Line termination [2.1, 2.3] | |
27 | ------------------------------- | |
28 | ||
29 | [Lines are terminated by CRLF; isolated CR and LF are not permitted.] | |
30 | ||
31 | The CRLF requirement has to be interpreted carefully, because the RFC also says | |
32 | that it does not cover the internal format "used by sites". Exim keeps messages | |
33 | on its spool in Unix format, using only LF as the line terminator, and also | |
34 | does local deliveries using only LF. I believe this is compliant with the RFC, | |
35 | as these are both "internal formats". | |
36 | ||
37 | Messages sent out by SMTP have CRLF line terminators. However, isolated CR | |
38 | characters are treated as any other data characters, because Exim is eight-bit | |
39 | clean (see 1.2 below). | |
40 | ||
41 | See 2.1 below for a discussion of line terminators in incoming messages. | |
42 | ||
43 | ||
44 | 1.2 Eight-bit characters [2.1] | |
45 | ------------------------------ | |
46 | ||
47 | [Messages consist of 7-bit characters.] | |
48 | ||
49 | Exim is eight-bit clean. It does not do any processing of the characters in the | |
50 | body of a message. | |
51 | ||
52 | ||
53 | 1.3 Maximum line length [2.1, 2.3] | |
54 | ---------------------------------- | |
55 | ||
56 | [The maximum length of a line is 998 characters.] | |
57 | ||
58 | Exim does not enforce any limit on line length. | |
59 | ||
60 | ||
61 | 1.4 The "phrase" part of an address [3.4] | |
62 | ----------------------------------------- | |
63 | ||
64 | [The phrase is a sequence of "words"; a word is an "atom" or a quoted string.] | |
65 | ||
66 | The characters that can be used in an "atom" do not include the full stop | |
67 | (dot, period). Thus a header line such as | |
68 | ||
69 | To: John Q. Public <jqp@anywhere.org> | |
70 | ||
71 | is syntactically invalid under a strict interpretation of the RFC because the | |
72 | dot in the phrase part is not quoted. However, many MTAs do not enforce this | |
73 | restriction, so Exim was changed to be relaxed about it as well. In fact, the | |
74 | draft RFC is moving towards allowing this. In section [4.1], which is defining | |
75 | "obsolete" syntax that programs must accept (but not generate), it says this: | |
76 | ||
77 | The period character is added to obs-phrase. | |
78 | ||
79 | Note: The period character in obs-phrase is not a form that was allowed | |
80 | in earlier versions of this or any other standard. Period (nor any other | |
81 | character from specials) was not allowed in phrase because it introduced | |
82 | a parsing difficulty distinguishing between phrases and portions of an | |
83 | addr-spec (see section 4.4). It appears here because the period | |
84 | character is currently used in many messages in the display-name portion | |
85 | of addresses, especially for initials in names, and therefore must be | |
86 | interpreted properly. In the future, period may appear in the regular | |
87 | syntax of phrase. | |
88 | ||
89 | ||
90 | 1.5 Source routed addresses [4.4] | |
91 | --------------------------------- | |
92 | ||
93 | [Source routed addresses are always enclosed in <>.] | |
94 | ||
95 | Source routed addresses are declared obsolete in the draft RFC, but MTAs are | |
96 | still required to handle them. Strictly, a source-routed address must be | |
97 | enclosed in <> characters, so a header such as | |
98 | ||
99 | From: @a,@b:c@d | |
100 | ||
101 | is syntactically invalid. Exim does not enforce this restriction. | |
102 | ||
103 | ||
104 | 1.6 Local parts [3.4.1] | |
105 | ----------------------- | |
106 | ||
107 | [Dots in unquoted local parts may not be consecutive or at either end.] | |
108 | ||
109 | Exim allows unquoted local parts to begin or end with a dot (period, full | |
110 | stop), and it also permits two consecutive dots in a local part. | |
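
As an illustration only (this is not Exim's code), the difference between the
strict rule and the relaxed behaviour can be sketched as below. The function
names and the "relaxed" flag are invented for this sketch, and so is the rule
that at least one non-empty component must remain.

```python
# Characters that RFC 822 calls "specials"; they may not appear in an atom.
SPECIALS = set('()<>@,;:\\".[]')


def is_atom(text: str) -> bool:
    """Return True for a non-empty run of printable ASCII without specials."""
    return bool(text) and all(
        33 <= ord(ch) <= 126 and ch not in SPECIALS for ch in text
    )


def local_part_ok(local_part: str, relaxed: bool = True) -> bool:
    """Check an unquoted local part, strictly or in the relaxed manner."""
    components = local_part.split(".")
    if relaxed:
        # Null components (from leading, trailing, or doubled dots) are
        # tolerated, so simply ignore them.
        components = [c for c in components if c]
        if not components:
            return False  # assumption of this sketch: one real component
    # In strict mode an empty component fails is_atom() and is rejected.
    return all(is_atom(c) for c in components)
```

With relaxed=False the local parts ".john" and "john..public" are refused; with
relaxed=True both are accepted.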
111 | ||
112 | ||
113 | ||
114 | 2. RFC 821 | |
115 | ---------- | |
116 | ||
117 | The original specification of SMTP is RFC 821, later clarified and modified by | |
118 | RFC 1123. Domain name system requirements and their implications for mail are | |
119 | covered in RFCs 1035 and 974. A scheme for extending the SMTP protocol is | |
120 | described in RFC 1869, and there are subsequent RFCs specifying particular | |
121 | extensions. | |
122 | ||
123 | At the time of writing (January 1999) a new RFC (currently known as | |
124 | draft-ietf-drums-smtpupd-09) which updates and consolidates all the material | |
125 | connected with SMTP message transmission is at a late stage of drafting, and is | |
126 | expected to become an Internet Standard in due course. | |
127 | ||
128 | The new draft is written using the terms MUST, SHOULD, and MAY, which, when | |
129 | written in capital letters, have precise meanings. To quote from the draft: | |
130 | ||
131 | "MUST" or "MUST NOT" identify absolute requirements for conformance to | |
132 | this specification. Implementations that do not conform to them lie | |
133 | outside the scope of this specification and often will not | |
134 | interoperate properly with SMTP implementations that do conform. | |
135 | Implementations that are fully conforming also adhere to all "SHOULD" | |
136 | and "SHOULD NOT" requirements. Implementations that adhere to all | |
137 | "MUST" ("MUST NOT") but not to all of these are considered to be | |
138 | partially conforming. Such implementations may interoperate properly | |
139 | with fully conforming ones and with each other, but this will | |
140 | typically be the case only if great care is taken. Consequently, an | |
141 | implementation should violate "SHOULD" ("SHOULD NOT") requirements | |
142 | only under exceptional and well-understood circumstances. | |
143 | ||
144 | The implementation of Exim is intended to conform to the spirit of this | |
145 | paragraph. The following is (I hope) a complete list of major variations | |
146 | from the draft RFC. In addition to the items listed here, there are other minor | |
147 | extensions such as the tolerance of white space in places where it is not | |
148 | strictly permitted by the RFC. References in square brackets are to the -09 | |
149 | draft sections, and brief summaries of the RFC requirement are also given in | |
150 | square brackets. | |
151 | ||
152 | ||
153 | 2.1 Line termination [2.3.7, 4.1.1.4] | |
154 | ------------------------------------- | |
155 | ||
156 | [SMTP lines are terminated by CRLF.] | |
157 | ||
158 | Exim recognizes LF without CR as a line terminator in all forms of input. For | |
159 | SMTP input, any preceding CR is discarded. An early version of Exim followed | |
160 | the RFC strictly, and did not recognize LF without CR in SMTP input. However, | |
161 | it seems that sites on the net send out messages with just LF terminators, | |
162 | despite the warnings in the RFCs, and other MTAs handle this, so Exim was | |
163 | changed. A compile-time macro called STRICT_CRLF can be set to restore the | |
164 | strict behaviour, though this is undocumented. | |
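
The two ways of recognizing line ends can be sketched roughly as follows. This
is a hypothetical illustration, not Exim's implementation: the function name
and the strict_crlf parameter are invented here (in Exim the strict behaviour
is selected at compile time, not by a run-time flag), and the sketch discards
at most one CR before each LF.

```python
from typing import List


def split_lines(data: bytes, strict_crlf: bool = False) -> List[bytes]:
    """Split SMTP input into lines, returned without their terminators."""
    if strict_crlf:
        # Strict reading of the RFC: only CRLF ends a line, so a bare LF
        # stays inside the line as ordinary data.
        lines = data.split(b"\r\n")
    else:
        # Relaxed reading: LF ends a line, and a CR immediately before it
        # is discarded.
        lines = [
            line[:-1] if line.endswith(b"\r") else line
            for line in data.split(b"\n")
        ]
    # Input that ends with a terminator leaves an empty final element.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines
```

In the relaxed mode an isolated CR that is not followed by LF is left alone,
consistent with treating it as any other data character.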
165 | ||
166 | ||
167 | 2.2 Eight-bit characters [2.4.1] | |
168 | -------------------------------- | |
169 | ||
170 | [SMTP transmits only 7-bit characters.] | |
171 | ||
172 | Exim is eight-bit clean, and makes no attempt to modify the data in a message | |
173 | in any way. In particular, for messages containing characters with the top bit | |
174 | set, it neither tries to negotiate 8-bit transmission, nor converts such | |
175 | characters into an encoded form. In other words, it adopts the "just send 8" | |
176 | strategy. It can be configured to send out 8BITMIME in its response to EHLO | |
177 | (which it does not do by default), and it recognizes the 8BITMIME keyword on | |
178 | incoming messages, but neither of these affects its handling of message data. | |
179 | "Just send 8" is the strategy of a number of MTAs; it is argued that it | |
180 | achieves what the user wants more often than other strategies. | |
181 | ||
182 | ||
8a330e5a | 183 | 2.3 Closing the connection [4.1.1.10] |
e05f33e0 PH |
184 | ------------------------------------- |
185 | ||
186 | [Client must wait for response to QUIT before closing the connection.] | |
187 | ||
188 | Exim closes the connection immediately after sending QUIT, without waiting for | |
189 | the reply. There was a lot of discussion about this on one of the mailing | |
190 | lists. The conclusion was that this behaviour is fine on Unix systems, which | |
191 | have TCP/IP implementations that close down the underlying channel tidily even | |
192 | when the associated process has terminated. Indeed, not waiting may be | |
193 | beneficial, as it moves the TIME_WAIT state (waiting to ensure there's no more | |
194 | data in transit) from the server to the client system. On some other operating | |
195 | systems (I understand) it is a disaster to terminate the sending process | |
196 | without waiting for the QUIT response, because all the data about the | |
197 | connection lives in the client's process space, and is therefore thrown away | |
198 | before the response arrives. The subsequent arrival of the response then causes | |
199 | bad behaviour. | |
200 | ||
201 | ||
8a330e5a | 202 | 2.4 IPv6 address literals [4.1.2] |
e05f33e0 PH |
203 | --------------------------------- |
204 | ||
205 | [IPv6 address literals are introduced by "IPv6".] | |
206 | ||
207 | Exim recognizes IPv6 literals as just the colon-separated hexadecimal form of | |
208 | an IPv6 address, for example 1080:0:0:0:8:800:200C:417A, without the need for a | |
209 | prefix. At present, it does not even recognize the prefix. When IPv6 becomes | |
210 | more widespread, Exim will follow whatever the common usage is. | |
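
A minimal sketch of recognizing the unprefixed form, using Python's standard
ipaddress module rather than anything taken from Exim (the function name is
invented here):

```python
import ipaddress


def is_ipv6_literal(text: str) -> bool:
    """Return True if text is a bare colon-separated IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
    except ValueError:
        # Raised for anything that is not a plain IPv6 address, which
        # includes a literal that carries the "IPv6:" prefix.
        return False
    return True
```

With this check, 1080:0:0:0:8:800:200C:417A is accepted and the same address
preceded by "IPv6:" is not, which mirrors the behaviour described above. Note
that the library also accepts the compressed "::" notation.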
211 | ||
212 | ||
8a330e5a | 213 | 2.5 Underscores in domain names [4.1.2] |
e05f33e0 PH |
214 | --------------------------------------- |
215 | ||
216 | [Underscores are not legal in domain names.] | |
217 | ||
218 | RFC 822 allows all characters except specials, space, and controls in domain | |
219 | names, but the SMTP RFCs are stricter, allowing only letters, digits, and | |
220 | hyphen. Exim is compliant when checking incoming addresses in SMTP commands, | |
221 | but it is more relaxed by default when checking domain names that are supplied | |
222 | by EHLO or HELO commands, because many client workstations get set up with | |
223 | underscores in their names. There is an option that can be set to cause Exim to | |
224 | refuse underscores. (There are also options to specify certain hosts from which | |
225 | it will accept any old junk after EHLO or HELO. Such is the woeful state of | |
226 | some SMTP clients.) | |
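
The two levels of strictness might be pictured like this. It is a sketch under
assumptions: the function name and the allow_underscores parameter are
invented for illustration and do not correspond to a real Exim option name,
and limits on label length are ignored.

```python
import re

# Letters, digits, and hyphen only: the SMTP rule.
STRICT_LABEL = re.compile(r"[A-Za-z0-9-]+")
# The same, but tolerating underscores as many workstation names need.
RELAXED_LABEL = re.compile(r"[A-Za-z0-9_-]+")


def domain_ok(domain: str, allow_underscores: bool = False) -> bool:
    """Check every dot-separated label of a domain name."""
    pattern = RELAXED_LABEL if allow_underscores else STRICT_LABEL
    # fullmatch() fails on an empty label, so "a..b" is rejected too.
    return all(pattern.fullmatch(label) for label in domain.split("."))
```

An address in a MAIL or RCPT command would be checked with the default, while
the argument of EHLO or HELO would be checked with allow_underscores=True.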
227 | ||
228 | ||
8a330e5a | 229 | 2.6 Removal of return-path headers [4.4] |
e05f33e0 PH |
230 | ---------------------------------------- |
231 | ||
232 | [Relaying MTAs should not remove return-path.] | |
233 | ||
234 | Exim removes Return-Path: headers from all messages, if return_path_remove is | |
235 | set (the default). It does not attempt to determine if it is being a relay or | |
236 | not. Indeed, for some messages it might be both a relay and a final destination | |
237 | MTA for the same message. | |
238 | ||
239 | ||
8a330e5a | 240 | 2.7 Randomizing the order of addresses of multihomed hosts [5] |
e05f33e0 PH |
241 | -------------------------------------------------------------- |
242 | ||
243 | [Multihomed host addresses should not be randomized.] | |
244 | ||
245 | Exim does randomize a list of several addresses for a single host, because | |
246 | caching in resolvers will defeat the round-robinning that many nameservers | |
247 | use. (Note: this is not the same as randomizing equal-valued MX records. That | |
248 | is required by the RFC.) | |
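
In outline, the randomizing amounts to no more than a shuffle of the address
list before it is tried. The sketch below is illustrative only; the function
name and the optional rng argument are assumptions, not Exim internals.

```python
import random
from typing import List, Optional


def randomize_addresses(
    addresses: List[str], rng: Optional[random.Random] = None
) -> List[str]:
    """Return the addresses of one host in a random order."""
    rng = rng or random.Random()
    shuffled = list(addresses)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled
```

Because the order is chosen afresh each time, a resolver that always hands
back the same cached order no longer pins every connection to the first
address.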
249 | ||
250 | ||
8a330e5a | 251 | 2.8 Handling "MX points to self" [5] |
e05f33e0 PH |
252 | ------------------------------------ |
253 | ||
254 | [MX points to self must be treated as an error.] | |
255 | ||
256 | The RFC doesn't allow for the possibility of special-purpose routing in the | |
257 | case when the lowest numbered MX record points to the local host. The default | |
258 | Exim configuration is compliant, but it is possible to configure Exim to behave | |
259 | differently, and there are several situations where this can be useful. | |
260 | ||
261 | ||
8a330e5a | 262 | 2.9 Source routing [6.1] |
e05f33e0 PH |
263 | ------------------------ | |
264 | ||
265 | [Source routes should be stripped.] | |
266 | ||
267 | The new RFC has moved forward in deprecating source-routed email addresses. | |
268 | Exim does not strip them down by default, but can be made to do so by setting | |
269 | collapse_source_routes. However, even when it is not stripping them down, it | |
270 | does not add host routing to reverse-paths when processing a source-routed | |
271 | forward-path. | |
272 | ||
273 | ||
8a330e5a | 274 | 2.10 Loop detection [6.2] |
e05f33e0 PH |
275 | ------------------------- |
276 | ||
277 | [Loop count for Received: headers should be at least 100.] | |
278 | ||
279 | Exim's default setting of the received_headers_max option is 30. Most messages | |
280 | these days seem to accumulate fewer than half a dozen Received: headers, and | |
281 | even a couple of forwardings don't bring this anywhere near 30. | |
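
The check itself is a simple count, which might be sketched as below. The
function name is invented, and treating only a count strictly greater than the
limit as a loop is an assumption of this sketch.

```python
from typing import Iterable


def too_many_received(
    header_lines: Iterable[str], received_headers_max: int = 30
) -> bool:
    """Return True if the message appears to be looping."""
    # Continuation lines begin with white space, so only the first line
    # of each Received: header is counted.
    count = sum(
        1 for line in header_lines if line.lower().startswith("received:")
    )
    return count > received_headers_max
```

A message carrying five or six Received: headers is far below the limit, as
noted above.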
282 | ||
283 | ||
8a330e5a | 284 | 2.11 Addition of missing headers [6.3] |
e05f33e0 PH |
285 | -------------------------------------- |
286 | ||
287 | [Missing headers may be added, and domains qualified, only if client is | |
288 | identified.] | |
289 | ||
290 | Exim always adds Message-Id: and Date: headers if these are missing, whatever | |
291 | the source of the message, and likewise when it expands non-fully-qualified | |
292 | domains, it does so independently of the message's source. | |
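
The unconditional addition of the two headers can be pictured as follows. This
sketch uses Python's standard email.utils helpers, so the generated values do
not have the format Exim itself produces; the function name and the
"example.org" default are placeholders.

```python
from email.utils import formatdate, make_msgid
from typing import List, Tuple

Header = Tuple[str, str]


def add_missing_headers(
    headers: List[Header], domain: str = "example.org"
) -> List[Header]:
    """Append Message-Id: and Date: if absent, whatever the source."""
    present = {name.lower() for name, _ in headers}
    result = list(headers)
    if "message-id" not in present:
        result.append(("Message-Id", make_msgid(domain=domain)))
    if "date" not in present:
        result.append(("Date", formatdate(localtime=True)))
    return result
```

Note that nothing in the function looks at where the message came from, which
is exactly the point of variance from the draft.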
293 | ||
294 | ||
8a330e5a | 295 | 2.12 Syntax of MAIL and RCPT commands [4.1.1.2, 4.1.1.3] |
e05f33e0 PH |
296 | -------------------------------------------------------- |
297 | ||
298 | Exim is more relaxed than the RFC requires: | |
299 | ||
300 | (1) Trailing white space is ignored. | |
301 | ||
302 | (2) It permits white space after the "FROM" and "TO" keywords. | |
303 | ||
304 | (3) It does not insist on the address being enclosed in <> characters. In fact, | |
305 | it recognizes addresses in RFC 822 format here, except that domain | |
306 | components are restricted to containing only letters, digits, and hyphens. | |
307 | ||
308 | (4) Local parts are permitted to contain null components, that is, may start or | |
309 | end with an unquoted full stop (period) or contain two consecutive | |
310 | unquoted full stops. | |
311 | ||
312 | ||
8a330e5a | 313 | 2.13 Non-fully-qualified domains [2.3.5] |
e05f33e0 PH |
314 | ---------------------------------------- |
315 | ||
316 | [All domains must be fully qualified.] | |
317 | ||
318 | A domain that is not fully qualified has some of its trailing components | |
319 | missing, and is normally a local alias of some sort, for example, just a | |
320 | single-component host name. | |
321 | ||
322 | Exim can be configured to "widen" non-fully-qualified domains, either by using | |
323 | the facilities of the DNS resolver, or by an explicit list of widening strings. | |
324 | When this is done, it applies to addresses received by SMTP from other hosts, | |
325 | as well as to locally-originated addresses. Address re-writing could also be | |
326 | used for this purpose. | |
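
Widening by an explicit list of strings can be sketched as below. The
function, its parameters, and the idea of trying the domain unchanged first
are all assumptions made for illustration; "exists" stands in for whatever
lookup decides that a name is known.

```python
from typing import Callable, Iterable, Optional


def widen_domain(
    domain: str,
    widen_strings: Iterable[str],
    exists: Callable[[str], bool],
) -> Optional[str]:
    """Return the first form of the domain that is known, or None."""
    if exists(domain):
        return domain
    for suffix in widen_strings:
        candidate = domain + "." + suffix.lstrip(".")
        if exists(candidate):
            return candidate
    return None
```

For example, with the widening strings "example" and "cam.example", the
single-component name "mailhost" becomes "mailhost.cam.example" if that is the
first candidate the lookup recognizes.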
327 | ||
328 | ||
8a330e5a | 329 | 2.14 Unqualified addresses [4.1.2] |
e05f33e0 PH |
330 | ---------------------------------- |
331 | ||
332 | [Addresses in SMTP commands must include domains.] | |
333 | ||
334 | An unqualified address consists of a local part without a domain. Do not | |
335 | confuse "qualified address" and "qualified domain". A qualified address may | |
336 | include a non-fully-qualified domain. | |
337 | ||
338 | There is one exception to the RFC rule: it is required that the unqualified | |
339 | address "<postmaster>" always be accepted. Apart from this, Exim rejects | |
340 | domainless addresses in SMTP commands by default, but it can be configured with | |
341 | a list of hosts and/or networks that are permitted to send addresses without | |
342 | domains in SMTP commands. Any such address that is accepted (including | |
343 | <postmaster>) is qualified by adding the value of the qualify_domain option. | |
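
The rule just described might be sketched like this. It is a hypothetical
illustration: the function name and the sender_may_omit_domain flag (standing
for "the sending host is on the permitted list") are invented, and quoted
local parts that contain "@" are ignored.

```python
from typing import Optional


def qualify_address(
    address: str,
    qualify_domain: str,
    sender_may_omit_domain: bool = False,
) -> Optional[str]:
    """Return the qualified address, or None if it is to be rejected."""
    if "@" in address:
        return address  # already has a domain
    if address.lower() == "postmaster" or sender_may_omit_domain:
        # Accepted domainless addresses are qualified with qualify_domain.
        return address + "@" + qualify_domain
    return None
```

So "postmaster" is always accepted and qualified, while any other domainless
address is refused unless the sending host has been given permission.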
344 | ||
345 | ||
8a330e5a | 346 | 2.15 VRFY and EXPN [3.5.1, 3.5.2, 3.5.3, 7.3] |
e05f33e0 PH |
347 | --------------------------------------------- |
348 | ||
349 | [VRFY and EXPN should be supported.] | |
350 | ||
351 | Exim does not support VRFY and EXPN by default, but a list of hosts and | |
352 | networks for which they are permitted can be given. | |
353 | ||
354 | ||
8a330e5a | 355 | 2.16 Checking of EHLO/HELO commands [4.1.4] |
e05f33e0 PH |
356 | ------------------------------------------- |
357 | ||
358 | [Client must send EHLO. Server must not refuse message if EHLO/HELO check | |
359 | fails.] | |
360 | ||
361 | Exim, as a client, always sends EHLO or HELO (see 2.3 above). As a server, it | |
362 | does not insist on there having been a valid EHLO or HELO command before the | |
363 | start of a message transaction. Any EHLO or HELO command that is received is | |
364 | rejected only if it contains a syntax error. That is, it is never rejected on | |
365 | the basis of any validation checking that may be performed on the data it | |
366 | contains. | |
367 | ||
368 | However, Exim can be configured to insist that (a) there is a valid EHLO/HELO | |
369 | command before any message transaction and (b) the domain in that command | |
370 | matches the domain obtained by looking up the IP address of the sending host. | |
371 | It is possible to specify exception lists of hosts and/or networks for which | |
372 | this check does not apply. | |
373 | ||
374 | ||
375 | 2.17 Format of delivery error messages [3.7] | |
376 | -------------------------------------------- | |
377 | ||
378 | [Standard report formats should be used if possible.] | |
379 | ||
380 | Exim's delivery failure reports do not conform to the format described in RFC | |
381 | 1894. | |
382 | ||
383 | ||
384 | ## End ## |