Commit | Line | Data |
---|---|---|
e05f33e0 PH |
1 | Conformance with RFCs |
2 | --------------------- | |
3 | ||
4 | Exim is written to follow the rules laid down in the RFCs. However, there are | |
5 | some circumstances where it either extends what is specified, or chooses not to | |
6 | follow them strictly, for various reasons. Sometimes variations are controlled | |
7 | by an option, which may default on or off. This document lists the variations | |
8 | from the latest email RFCs, and discusses their background and implications. | |
9 | ||
10 | Last Updated: 25 January 1999 | |
11 | ||
12 | ||
13 | 1. RFC 822 | |
14 | ---------- | |
15 | ||
16 | The original specification of the format of Internet mail messages is RFC 822, | |
17 | later clarified and modified by RFC 1123. At the time of writing (January 1999) | |
18 | a new RFC (currently known as draft-ietf-drums-msg-fmt-07) which updates and | |
19 | consolidates all the material related to the message format is at a late stage | |
20 | of drafting, and is expected to become an Internet Standard in due course. | |
21 | ||
22 | The following is (I hope) a complete list of major variations from the draft | |
23 | RFC. References in square brackets are to the -07 draft. | |
24 | ||
25 | ||
26 | 1.1 Line termination [2.1, 2.3] | |
27 | ------------------------------- | |
28 | ||
29 | [Lines are terminated by CRLF; isolated CR and LF are not permitted.] | |
30 | ||
31 | The CRLF requirement has to be interpreted carefully, because the RFC also says | |
32 | that it does not cover the internal format "used by sites". Exim keeps messages | |
33 | on its spool in Unix format, using only LF as the line terminator, and also | |
34 | does local deliveries using only LF. I believe this is compliant with the RFC, | |
35 | as these are both "internal formats". | |
36 | ||
37 | Messages sent out by SMTP have CRLF line terminators. However, isolated CR | |
38 | characters are treated as any other data characters, because Exim is eight-bit | |
39 | clean (see 1.2 below). | |
40 | ||
41 | See 2.1 below for a discussion of line terminators in incoming messages. | |
42 | ||
43 | ||
44 | 1.2 Eight-bit characters [2.1] | |
45 | ------------------------------ | |
46 | ||
47 | [Messages consist of 7-bit characters.] | |
48 | ||
49 | Exim is eight-bit clean. It does not do any processing of the characters in the | |
50 | body of a message. | |
51 | ||
52 | ||
53 | 1.3 Maximum line length [2.1, 2.3] | |
54 | ---------------------------------- | |
55 | ||
56 | [The maximum length of a line is 998 characters.] | |
57 | ||
58 | Exim does not enforce any limit on line length. | |
59 | ||
60 | ||
61 | 1.4 The "phrase" part of an address [3.4] | |
62 | ----------------------------------------- | |
63 | ||
64 | [The phrase is a sequence of "words"; a word is an "atom" or a quoted string.] | |
65 | ||
66 | The characters that can be used in an "atom" do not include the full stop | |
67 | (dot, period). Thus a header line such as | |
68 | ||
69 | To: John Q. Public <jqp@anywhere.org> | |
70 | ||
71 | is syntactically invalid under a strict interpretation of the RFC because the | |
72 | dot in the phrase part is not quoted. However, many MTAs do not enforce this | |
73 | restriction, so Exim was changed to be relaxed about it as well. In fact, the | |
74 | draft RFC is moving towards allowing this. In section [4.1], which is defining | |
75 | "obsolete" syntax that programs must accept (but not generate), it says this: | |
76 | ||
77 | The period character is added to obs-phrase. | |
78 | ||
79 | Note: The period character in obs-phrase is not a form that was allowed | |
80 | in earlier versions of this or any other standard. Period (nor any other | |
81 | character from specials) was not allowed in phrase because it introduced | |
82 | a parsing difficulty distinguishing between phrases and portions of an | |
83 | addr-spec (see section 4.4). It appears here because the period | |
84 | character is currently used in many messages in the display-name portion | |
85 | of addresses, especially for initials in names, and therefore must be | |
86 | interpreted properly. In the future, period may appear in the regular | |
87 | syntax of phrase. | |
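
The relaxed rule can be illustrated with a short Python sketch. It is not
Exim's code: the names SPECIALS, is_atom and phrase_ok are invented for the
example, and it assumes a phrase made up only of unquoted words (no quoted
strings or comments).

```python
# Characters that RFC 822 calls "specials"; none may appear in an atom.
SPECIALS = set('()<>@,;:\\".[]')


def is_atom(word: str, allow_dot: bool = False) -> bool:
    """Return True if word is an atom, optionally tolerating dots."""
    forbidden = SPECIALS - {"."} if allow_dot else SPECIALS
    # Atoms are printable ASCII with no space, controls, or forbidden specials.
    return bool(word) and all(
        32 < ord(ch) < 127 and ch not in forbidden for ch in word
    )


def phrase_ok(phrase: str, allow_dot: bool = False) -> bool:
    """Check a phrase consisting only of space-separated unquoted words."""
    words = phrase.split()
    return bool(words) and all(is_atom(w, allow_dot) for w in words)
```

With allow_dot left unset, the phrase "John Q. Public" is rejected, which is
the strict reading; with allow_dot set it is accepted, which corresponds to
the relaxed behaviour described above.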
88 | ||
89 | ||
90 | 1.5 Source routed addresses [4.4] | |
91 | --------------------------------- | |
92 | ||
93 | [Source routed addresses are always enclosed in <>.] | |
94 | ||
95 | Source routed addresses are declared obsolete in the draft RFC, but MTAs are | |
96 | still required to handle them. Strictly, a source-routed address must be | |
97 | enclosed in <> characters, so a header such as | |
98 | ||
99 | From: @a,@b:c@d | |
100 | ||
101 | is syntactically invalid. Exim does not enforce this restriction. | |
102 | ||
103 | ||
104 | 1.6 Local parts [3.4.1] | |
105 | ----------------------- | |
106 | ||
107 | [Dots in unquoted local parts may not be consecutive or at either end.] | |
108 | ||
109 | Exim allows unquoted local parts to begin or end with a dot (period, full | |
110 | stop), and it also permits two consecutive dots in a local part. | |
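
The difference between the strict and relaxed rules comes down to whether
empty ("null") components between dots are tolerated. A minimal Python
sketch follows, with invented function names, assuming an unquoted local
part and checking nothing except the dots:

```python
def has_null_component(local_part: str) -> bool:
    """True if a dot starts or ends the local part, or two dots are adjacent."""
    return "" in local_part.split(".")


def local_part_acceptable(local_part: str, strict: bool = False) -> bool:
    """Apply the dot rule only; other character checks are out of scope here."""
    if not local_part:
        return False
    # The strict rule forbids null components; the relaxed one ignores them.
    return not (strict and has_null_component(local_part))
```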
111 | ||
112 | ||
113 | ||
114 | 2. RFC 821 | |
115 | ---------- | |
116 | ||
117 | The original specification of SMTP is RFC 821, later clarified and modified by | |
118 | RFC 1123. Domain name system requirements and their implications for mail are | |
119 | covered in RFCs 1035 and 974. A scheme for extending the SMTP protocol is | |
120 | described in RFC 1869, and there are subsequent RFCs specifying particular | |
121 | extensions. | |
122 | ||
123 | At the time of writing (January 1999) a new RFC (currently known as | |
124 | draft-ietf-drums-smtpupd-09) which updates and consolidates all the material | |
125 | connected with SMTP message transmission is at a late stage of drafting, and is | |
126 | expected to become an Internet Standard in due course. | |
127 | ||
128 | The new draft is written using the terms MUST, SHOULD, and MAY, which, when | |
129 | written in capital letters, have precise meanings. To quote from the draft: | |
130 | ||
131 | "MUST" or "MUST NOT" identify absolute requirements for conformance to | |
132 | this specification. Implementations that do not conform to them lie | |
133 | outside the scope of this specification and often will not | |
134 | interoperate properly with SMTP implementations that do conform. | |
135 | Implementations that are fully conforming also adhere to all "SHOULD" | |
136 | and "SHOULD NOT" requirements. Implementations that adhere to all | |
137 | "MUST" ("MUST NOT") but not to all of these are considered to be | |
138 | partially conforming. Such implementations may interoperate properly | |
139 | with fully conforming ones and with each other, but this will | |
140 | typically be the case only if great care is taken. Consequently, an | |
141 | implementation should violate "SHOULD" ("SHOULD NOT") requirements | |
142 | only under exceptional and well-understood circumstances. | |
143 | ||
144 | The implementation of Exim is intended to conform to the spirit of this | |
145 | paragraph. The following is (I hope) a complete list of major variations | |
146 | from the draft RFC. In addition to the items listed here, there are other minor | |
147 | extensions such as the tolerance of white space in places where it is not | |
148 | strictly permitted by the RFC. References in square brackets are to the -09 | |
149 | draft sections, and brief summaries of the RFC requirement are also given in | |
150 | square brackets. | |
151 | ||
152 | ||
153 | 2.1 Line termination [2.3.7, 4.1.1.4] | |
154 | ------------------------------------- | |
155 | ||
156 | [SMTP lines are terminated by CRLF.] | |
157 | ||
158 | Exim recognizes LF without CR as a line terminator in all forms of input. For | |
159 | SMTP input, any preceding CR is discarded. An early version of Exim followed | |
160 | the RFC strictly, and did not recognize LF without CR in SMTP input. However, | |
161 | it seems that sites on the net send out messages with just LF terminators, | |
162 | despite the warnings in the RFCs, and other MTAs handle this, so Exim was | |
163 | changed. However, there is a compile time macro called STRICT_CRLF which can be | |
164 | set to restore the strict behaviour, though this is undocumented. | |
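
As an illustration only (this is Python, not Exim's C code, and the function
names are invented), the two conversions might look like this. The sketch
assumes that exactly one CR immediately before an LF is dropped on input;
any other CR is left alone as ordinary data.

```python
def smtp_input_to_spool(data: bytes) -> bytes:
    """Treat LF as the line terminator and drop a CR directly before it."""
    return data.replace(b"\r\n", b"\n")


def spool_to_smtp(data: bytes) -> bytes:
    """Terminate each LF-delimited spool line with CRLF for SMTP output."""
    return data.replace(b"\n", b"\r\n")
```

An isolated CR in the middle of a line passes through both functions
unchanged, which matches the eight-bit-clean behaviour described in 1.1.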
165 | ||
166 | ||
167 | 2.2 Eight-bit characters [2.4.1] | |
168 | -------------------------------- | |
169 | ||
170 | [SMTP transmits only 7-bit characters.] | |
171 | ||
172 | Exim is eight-bit clean, and makes no attempt to modify the data in a message | |
173 | in any way. In particular, for messages containing characters with the top bit | |
174 | set, it neither tries to negotiate 8-bit transmission, nor converts such | |
175 | characters into an encoded form. In other words, it adopts the "just send 8" | |
176 | strategy. It can be configured to send out 8BITMIME in its response to EHLO | |
177 | (which it does not do by default), and it recognizes the 8BITMIME keyword on | |
178 | incoming messages, but neither of these affects its handling of message data. | |
179 | "Just send 8" is the strategy of a number of MTAs; it is argued that it | |
180 | achieves what the user wants more often than other strategies. | |
181 | ||
182 | ||
183 | 2.3 Use of EHLO/HELO [3.2] | |
184 | -------------------------- | |
185 | ||
186 | [Client MTAs should always start with EHLO, not HELO.] | |
187 | ||
188 | Exim sends EHLO only when it finds the string "ESMTP" in an SMTP greeting | |
189 | message. If EHLO is refused with a 5xx return code, it then reverts to HELO as | |
190 | required, but it does not contain logic for converting to HELO on other errors | |
191 | such as loss of connection or timeout after EHLO. That is one reason why it | |
192 | doesn't always send EHLO; there are reported to be ancient SMTP servers out | |
193 | there which collapse on receiving EHLO. (There is also at least one server | |
194 | whose banner reads "<host name> ignores ESMTP", but it is RFC 821 compliant in | |
195 | that it responds with 500 to EHLO, so Exim successfully reverts to HELO.) | |
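
The decision can be sketched in Python as follows. The function names are
invented, and the case-sensitive search for "ESMTP" is an assumption made
for the example rather than a statement about Exim's code.

```python
from typing import Optional


def choose_greeting(banner: str) -> str:
    """Pick the first command from the text of the server's greeting."""
    return "EHLO" if "ESMTP" in banner else "HELO"


def command_after_ehlo_reply(reply_code: int) -> Optional[str]:
    """Return "HELO" if EHLO was refused with a 5xx code, otherwise None."""
    return "HELO" if 500 <= reply_code <= 599 else None
```

A banner such as "<host name> ignores ESMTP" still leads to EHLO being sent,
and the 500 response then triggers the fallback to HELO.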
196 | ||
197 | ||
198 | 2.4 Closing the connection [4.1.1.10] | |
199 | ------------------------------------- | |
200 | ||
201 | [Client must wait for response to QUIT before closing the connection.] | |
202 | ||
203 | Exim closes the connection immediately after sending QUIT, without waiting for | |
204 | the reply. There was a lot of discussion about this on one of the mailing | |
205 | lists. The conclusion was that this behaviour is fine on Unix systems, which | |
206 | have TCP/IP implementations that close down the underlying channel tidily even | |
207 | when the associated process has terminated. Indeed, not waiting may be | |
208 | beneficial, as it moves the TIME_WAIT state (waiting to ensure there's no more | |
209 | data in transit) from the server to the client system. On some other operating | |
210 | systems (I understand) it is a disaster to terminate the sending process | |
211 | without waiting for the QUIT response, because all the data about the | |
212 | connection lives in the client's process space, and is therefore thrown away | |
213 | before the response arrives. The subsequent arrival of the response then causes | |
214 | bad behaviour. | |
215 | ||
216 | ||
217 | 2.5 IPv6 address literals [4.1.2] | |
218 | --------------------------------- | |
219 | ||
220 | [IPv6 address literals are introduced by "IPv6".] | |
221 | ||
222 | Exim recognizes IPv6 literals as just the colon-separated hexadecimal form of | |
223 | an IPv6 address, for example 1080:0:0:0:8:800:200C:417A, without the need for a | |
224 | prefix. At present, it does not even recognize the prefix. When IPv6 becomes | |
225 | more widespread, Exim will follow whatever the common usage is. | |
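
For illustration, the same "bare address only" behaviour can be reproduced
with Python's standard ipaddress module. The function name is invented, and
the module's parser may accept forms (such as compressed or zone-qualified
addresses) that say nothing about what Exim itself accepts.

```python
import ipaddress
from typing import Optional


def parse_ipv6_literal(text: str) -> Optional[ipaddress.IPv6Address]:
    """Parse a bare colon-separated IPv6 address; no prefix is recognized."""
    try:
        return ipaddress.IPv6Address(text)
    except ValueError:
        # Anything that is not a plain IPv6 address, including a string that
        # carries the draft's "IPv6" prefix, is rejected.
        return None
```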
226 | ||
227 | ||
228 | 2.6 Underscores in domain names [4.1.2] | |
229 | --------------------------------------- | |
230 | ||
231 | [Underscores are not legal in domain names.] | |
232 | ||
233 | RFC 822 allows all characters except specials, space, and controls in domain | |
234 | names, but the SMTP RFCs are stricter, allowing only letters, digits, and | |
235 | hyphen. Exim is compliant when checking incoming addresses in SMTP commands, | |
236 | but it is more relaxed by default when checking domain names that are supplied | |
237 | by EHLO or HELO commands, because many client workstations get set up with | |
238 | underscores in their names. There is an option that can be set to cause Exim to | |
239 | refuse underscores. (There are also options to specify certain hosts from which | |
240 | it will accept any old junk after EHLO or HELO. Such is the woeful state of | |
241 | some SMTP clients.) | |
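
The two levels of strictness can be sketched in Python like this. The
function name and the allow_underscore parameter are invented for the
example and do not correspond to any Exim option name.

```python
import re


def domain_ok(domain: str, allow_underscore: bool = False) -> bool:
    """Check that every dot-separated component uses only permitted characters."""
    pattern = r"[A-Za-z0-9_-]+" if allow_underscore else r"[A-Za-z0-9-]+"
    # An empty component (for example from "a..b") never matches.
    return all(re.fullmatch(pattern, label) for label in domain.split("."))
```

The strict form corresponds to the check on addresses in SMTP commands; the
form with allow_underscore set corresponds to the default, more relaxed
check on EHLO and HELO arguments.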
242 | ||
243 | ||
244 | 2.7 Removal of return-path headers [4.4] | |
245 | ---------------------------------------- | |
246 | ||
247 | [Relaying MTAs should not remove return-path.] | |
248 | ||
249 | Exim removes Return-Path: headers from all messages, if return_path_remove is | |
250 | set (the default). It does not attempt to determine if it is being a relay or | |
251 | not. Indeed, for some messages it might be both a relay and a final destination | |
252 | MTA for the same message. | |
253 | ||
254 | ||
255 | 2.8 Randomizing the order of addresses of multihomed hosts [5] | |
256 | -------------------------------------------------------------- | |
257 | ||
258 | [Multihomed host addresses should not be randomized.] | |
259 | ||
260 | Exim does randomize a list of several addresses for a single host, because | |
261 | caching in resolvers will defeat the round-robinning that many nameservers | |
262 | use. (Note: this is not the same as randomizing equal-valued MX records. That | |
263 | is required by the RFC.) | |
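
A minimal Python sketch of the idea, with an invented function name: the
list of addresses found for one host is shuffled before use, so that the
order delivered by a caching resolver does not determine which address is
tried first.

```python
import random
from typing import List


def randomized_addresses(addresses: List[str]) -> List[str]:
    """Return the host's addresses in random order, leaving the input intact."""
    shuffled = list(addresses)
    random.shuffle(shuffled)
    return shuffled
```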
264 | ||
265 | ||
266 | 2.9 Handling "MX points to self" [5] | |
267 | ------------------------------------ | |
268 | ||
269 | [MX points to self must be treated as an error.] | |
270 | ||
271 | The RFC doesn't allow for the possibility of special-purpose routing in the | |
272 | case when the lowest numbered MX record points to the local host. The default | |
273 | Exim configuration is compliant, but it is possible to configure Exim to behave | |
274 | differently, and there are several situations where this can be useful. | |
275 | ||
276 | ||
277 | 2.10 Source routing [6.1] | |
278 | ------------------------- | |
279 | ||
280 | [Source routes should be stripped.] | |
281 | ||
282 | The new RFC has moved forward in deprecating source-routed email addresses. | |
283 | Exim does not strip them down by default, but can be made to do so by setting | |
284 | collapse_source_routes. However, even when it is not stripping them down, it | |
285 | does not add host routing to reverse-paths when processing a source-routed | |
286 | forward-path. | |
287 | ||
288 | ||
289 | 2.11 Loop detection [6.2] | |
290 | ------------------------- | |
291 | ||
292 | [Loop count for Received: headers should be at least 100.] | |
293 | ||
294 | Exim's default setting of the received_headers_max option is 30. Most messages | |
295 | these days seem to accumulate fewer than half a dozen Received: headers, and | |
296 | even a couple of forwardings don't bring this anywhere near 30. | |
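
The check amounts to counting Received: headers and comparing the count
with the limit. The following Python sketch uses an invented function name
and assumes that a loop is declared when the count is strictly greater than
the limit; the exact comparison Exim makes is not stated here.

```python
from typing import Iterable


def too_many_hops(header_lines: Iterable[str],
                  received_headers_max: int = 30) -> bool:
    """Count Received: headers and compare the total with the limit."""
    count = sum(
        1 for line in header_lines
        # Continuation lines begin with white space, so they are not counted.
        if line.lower().startswith("received:")
    )
    return count > received_headers_max
```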
297 | ||
298 | ||
299 | 2.12 Addition of missing headers [6.3] | |
300 | -------------------------------------- | |
301 | ||
302 | [Missing headers may be added, and domains qualified, only if client is | |
303 | identified.] | |
304 | ||
305 | Exim always adds Message-Id: and Date: headers if these are missing, whatever | |
306 | the source of the message, and likewise when it expands non-fully-qualified | |
307 | domains, it does so independently of the message's source. | |
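
For illustration, the unconditional addition of the two headers can be
written with Python's standard email package. The function name and the
domain parameter are invented for the example; Exim's own message id format
is different.

```python
from email.message import Message
from email.utils import formatdate, make_msgid


def add_missing_headers(msg: Message, domain: str) -> None:
    """Add Message-Id: and Date: if absent, whatever the message's source."""
    if msg["Message-Id"] is None:
        msg["Message-Id"] = make_msgid(domain=domain)
    if msg["Date"] is None:
        msg["Date"] = formatdate(localtime=True)
```

Headers that are already present are left untouched; the function makes no
attempt to find out who submitted the message.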
308 | ||
309 | ||
310 | 2.13 Syntax of MAIL and RCPT commands [4.1.1.2, 4.1.1.3] | |
311 | -------------------------------------------------------- | |
312 | ||
313 | Exim is more relaxed than the RFC requires: | |
314 | ||
315 | (1) Trailing white space is ignored. | |
316 | ||
317 | (2) It permits white space after the "FROM" and "TO" keywords. | |
318 | ||
319 | (3) It does not insist on the address being enclosed in <> characters. In fact, | |
320 | it recognizes addresses in RFC 822 format here, except that domain | |
321 | components are restricted to containing only letters, digits, and hyphens. | |
322 | ||
323 | (4) Local parts are permitted to contain null components, that is, may start or | |
324 | end with an unquoted full stop (period) or contain two consecutive | |
325 | unquoted full stops. | |
326 | ||
327 | ||
328 | 2.14 Non-fully-qualified domains [2.3.5] | |
329 | ---------------------------------------- | |
330 | ||
331 | [All domains must be fully qualified.] | |
332 | ||
333 | A domain that is not fully qualified has some of its trailing components | |
334 | missing, and is normally a local alias of some sort, for example, just a | |
335 | single-component host name. | |
336 | ||
337 | Exim can be configured to "widen" non-fully-qualified domains, either by using | |
338 | the facilities of the DNS resolver, or by an explicit list of widening strings. | |
339 | When this is done, it applies to addresses received by SMTP from other hosts, | |
340 | as well as to locally-originated addresses. Address re-writing could also be | |
341 | used for this purpose. | |
342 | ||
343 | ||
344 | 2.15 Unqualified addresses [4.1.2] | |
345 | ---------------------------------- | |
346 | ||
347 | [Addresses in SMTP commands must include domains.] | |
348 | ||
349 | An unqualified address consists of a local part without a domain. Do not | |
350 | confuse "qualified address" and "qualified domain". A qualified address may | |
351 | include a non-fully-qualified domain. | |
352 | ||
353 | There is one exception to the RFC rule: it is required that the unqualified | |
354 | address "<postmaster>" always be accepted. Apart from this, Exim rejects | |
355 | domainless addresses in SMTP commands by default, but it can be configured with | |
356 | a list of hosts and/or networks that are permitted to send addresses without | |
357 | domains in SMTP commands. Any such address that is accepted (including | |
358 | <postmaster>) is qualified by adding the value of the qualify_domain option. | |
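
These rules can be summarized in a short Python sketch. The function and
parameter names are invented, except that qualify_domain stands for the
value of the option of that name; matching "postmaster" without regard to
case is an assumption made for the example.

```python
def qualify_address(address: str, qualify_domain: str,
                    host_may_send_unqualified: bool = False) -> str:
    """Return a qualified address, or raise ValueError if it must be rejected."""
    if "@" in address:
        return address  # already has a domain
    if address.lower() == "postmaster" or host_may_send_unqualified:
        return address + "@" + qualify_domain
    raise ValueError("unqualified address not permitted: " + address)
```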
359 | ||
360 | ||
361 | 2.16 VRFY and EXPN [3.5.1, 3.5.2, 3.5.3, 7.3] | |
362 | --------------------------------------------- | |
363 | ||
364 | [VRFY and EXPN should be supported.] | |
365 | ||
366 | Exim does not support VRFY and EXPN by default, but a list of hosts and | |
367 | networks for which they are permitted can be given. | |
368 | ||
369 | ||
370 | 2.17 Checking of EHLO/HELO commands [4.1.4] | |
371 | ------------------------------------------- | |
372 | ||
373 | [Client must send EHLO. Server must not refuse message if EHLO/HELO check | |
374 | fails.] | |
375 | ||
376 | Exim, as a client, always sends EHLO or HELO (see 2.3 above). As a server, it | |
377 | does not insist on there having been a valid EHLO or HELO command before the | |
378 | start of a message transaction. Any EHLO or HELO command that is received is | |
379 | rejected only if it contains a syntax error. That is, it is never rejected on | |
380 | the basis of any validation checking that may be performed on the data it | |
381 | contains. | |
382 | ||
383 | However, Exim can be configured to insist that (a) there is a valid EHLO/HELO | |
384 | command before any message transaction and (b) the domain in that command | |
385 | matches the domain obtained by looking up the IP address of the sending host. | |
386 | It is possible to specify exception lists of hosts and/or networks for which | |
387 | this check does not apply. | |
388 | ||
389 | ||
390 | 2.18 Format of delivery error messages [3.7] | |
391 | -------------------------------------------- | |
392 | ||
393 | [Standard report formats should be used if possible.] | |
394 | ||
395 | Exim's delivery failure reports do not conform to the format described in RFC | |
396 | 1894. | |
397 | ||
398 | ||
399 | ## End ## |