Commit | Line | Data |
---|---|---|
e05f33e0 PH |
1 | $Cambridge: exim/doc/doc-misc/RFC.conform,v 1.1 2004/10/08 10:38:47 ph10 Exp $ |
2 | ||
3 | Conformance with RFCs | |
4 | --------------------- | |
5 | ||
6 | Exim is written to follow the rules laid down in the RFCs. However, there are | |
7 | some circumstances where it either extends what is specified, or chooses not to | |
8 | follow them strictly, for various reasons. Sometimes variations are controlled | |
9 | by an option, which may default on or off. This document lists the variations | |
10 | from the latest email RFCs, and discusses their background and implications. | |
11 | ||
12 | Last Updated: 25 January 1999 | |
13 | ||
14 | ||
15 | 1. RFC 822 | |
16 | ---------- | |
17 | ||
18 | The original specification of the format of Internet mail messages is RFC 822, | |
19 | later clarified and modified by RFC 1123. At the time of writing (January 1999) | |
20 | a new RFC (currently known as draft-ietf-drums-msg-fmt-07) which updates and | |
21 | consolidates all the material related to the message format is at a late stage | |
22 | of drafting, and is expected to become an Internet Standard in due course. | |
23 | ||
24 | The following is (I hope) a complete list of major variations from the draft | |
25 | RFC. References in square brackets are to the -07 draft. | |
26 | ||
27 | ||
28 | 1.1 Line termination [2.1, 2.3] | |
29 | ------------------------------- | |
30 | ||
31 | [Lines are terminated by CRLF; isolated CR and LF are not permitted.] | |
32 | ||
33 | The CRLF requirement has to be interpreted carefully, because the RFC also says | |
34 | that it does not cover the internal format "used by sites". Exim keeps messages | |
35 | on its spool in Unix format, using only LF as the line terminator, and also | |
36 | does local deliveries using only LF. I believe this is compliant with the RFC, | |
37 | as these are both "internal formats". | |
38 | ||
39 | Messages sent out by SMTP have CRLF line terminators. However, isolated CR | |
40 | characters are treated as any other data characters, because Exim is eight-bit | |
41 | clean (see 1.2 below). | |
42 | ||
43 | See 2.1 below for a discussion of line terminators in incoming messages. | |
44 | ||
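The outgoing conversion described in 1.1 can be sketched as follows. This is an illustrative Python sketch and not Exim's own code; the function name to_wire_format is an assumption made for the example. It shows the LF-terminated spool format being turned into CRLF for SMTP, while an isolated CR is passed through as ordinary data.

```python
def to_wire_format(spooled: bytes) -> bytes:
    """Convert LF-terminated spool data to CRLF-terminated SMTP data.

    Hypothetical helper for illustration only. It assumes the spool holds
    bare LF terminators, as section 1.1 describes, and leaves any isolated
    CR untouched.
    """
    return spooled.replace(b"\n", b"\r\n")
```

For example, to_wire_format(b"a\nb\rc\n") leaves the CR between "b" and "c" alone and only rewrites the two line terminators.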
45 | ||
46 | 1.2 Eight-bit characters [2.1] | |
47 | ------------------------------ | |
48 | ||
49 | [Messages consist of 7-bit characters.] | |
50 | ||
51 | Exim is eight-bit clean. It does not do any processing of the characters in the | |
52 | body of a message. | |
53 | ||
54 | ||
55 | 1.3 Maximum line length [2.1, 2.3] | |
56 | ---------------------------------- | |
57 | ||
58 | [The maximum length of a line is 998 characters.] | |
59 | ||
60 | Exim does not enforce any limit on line length. | |
61 | ||
62 | ||
63 | 1.4 The "phrase" part of an address [3.4] | |
64 | ----------------------------------------- | |
65 | ||
66 | [The phrase is a sequence of "words"; a word is an "atom" or a quoted string.] | |
67 | ||
68 | The characters that can be used in an "atom" do not include the full stop | |
69 | (dot, period). Thus a header line such as | |
70 | ||
71 | To: John Q. Public <jqp@anywhere.org> | |
72 | ||
73 | is syntactically invalid under a strict interpretation of the RFC because the | |
74 | dot in the phrase part is not quoted. However, many MTAs do not enforce this | |
75 | restriction, so Exim was changed to be relaxed about it as well. In fact, the | |
76 | draft RFC is moving towards allowing this. In section [4.1], which defines | |
77 | "obsolete" syntax that programs must accept (but not generate), it says this: | |
78 | ||
79 | The period character is added to obs-phrase. | |
80 | ||
81 | Note: The period character in obs-phrase is not a form that was allowed | |
82 | in earlier versions of this or any other standard. Period (nor any other | |
83 | character from specials) was not allowed in phrase because it introduced | |
84 | a parsing difficulty distinguishing between phrases and portions of an | |
85 | addr-spec (see section 4.4). It appears here because the period | |
86 | character is currently used in many messages in the display-name portion | |
87 | of addresses, especially for initials in names, and therefore must be | |
88 | interpreted properly. In the future, period may appear in the regular | |
89 | syntax of phrase. | |
90 | ||
91 | ||
92 | 1.5 Source routed addresses [4.4] | |
93 | --------------------------------- | |
94 | ||
95 | [Source routed addresses are always enclosed in <>.] | |
96 | ||
97 | Source routed addresses are declared obsolete in the draft RFC, but MTAs are | |
98 | still required to handle them. Strictly, a source-routed address must be | |
99 | enclosed in <> characters, so a header such as | |
100 | ||
101 | From: @a,@b:c@d | |
102 | ||
103 | is syntactically invalid. Exim does not enforce this restriction. | |
104 | ||
105 | ||
106 | 1.6 Local parts [3.4.1] | |
107 | ----------------------- | |
108 | ||
109 | [Dots in unquoted local parts may not be consecutive or at either end.] | |
110 | ||
111 | Exim allows unquoted local parts to begin or end with a dot (period, full | |
112 | stop), and it also permits two consecutive dots in a local part. | |
113 | ||
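To make the rule in 1.6 concrete, here is a small Python sketch (not Exim code; the function name has_null_component is an assumption). It reports whether an unquoted local part has a leading dot, a trailing dot, or two consecutive dots. A strict reading of the draft would reject such a local part, whereas Exim accepts it.

```python
def has_null_component(local_part: str) -> bool:
    """Return True if an unquoted local part contains an empty component.

    Hypothetical helper for illustration. Splitting on "." yields an empty
    string for a leading dot, a trailing dot, or two consecutive dots.
    """
    return "" in local_part.split(".")
```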
114 | ||
115 | ||
116 | 2. RFC 821 | |
117 | ---------- | |
118 | ||
119 | The original specification of SMTP is RFC 821, later clarified and modified by | |
120 | RFC 1123. Domain name system requirements and their implications for mail are | |
121 | covered in RFCs 1035 and 974. A scheme for extending the SMTP protocol is | |
122 | described in RFC 1869, and there are subsequent RFCs specifying particular | |
123 | extensions. | |
124 | ||
125 | At the time of writing (January 1999) a new RFC (currently known as | |
126 | draft-ietf-drums-smtpupd-09) which updates and consolidates all the material | |
127 | connected with SMTP message transmission is at a late stage of drafting, and is | |
128 | expected to become an Internet Standard in due course. | |
129 | ||
130 | The new draft is written using the terms MUST, SHOULD, and MAY, which, when | |
131 | written in capital letters, have precise meanings. To quote from the draft: | |
132 | ||
133 | "MUST" or "MUST NOT" identify absolute requirements for conformance to | |
134 | this specification. Implementations that do not conform to them lie | |
135 | outside the scope of this specification and often will not | |
136 | interoperate properly with SMTP implementations that do conform. | |
137 | Implementations that are fully conforming also adhere to all "SHOULD" | |
138 | and "SHOULD NOT" requirements. Implementations that adhere to all | |
139 | "MUST" ("MUST NOT") but not to all of these are considered to be | |
140 | partially conforming. Such implementations may interoperate properly | |
141 | with fully conforming ones and with each other, but this will | |
142 | typically be the case only if great care is taken. Consequently, an | |
143 | implementation should violate "SHOULD" ("SHOULD NOT") requirements | |
144 | only under exceptional and well-understood circumstances. | |
145 | ||
146 | The implementation of Exim is intended to conform to the spirit of this | |
147 | paragraph. The following is (I hope) a complete list of major variations | |
148 | from the draft RFC. In addition to the items listed here, there are other minor | |
149 | extensions such as the tolerance of white space in places where it is not | |
150 | strictly permitted by the RFC. References in square brackets are to the -09 | |
151 | draft sections, and brief summaries of the RFC requirement are also given in | |
152 | square brackets. | |
153 | ||
154 | ||
155 | 2.1 Line termination [2.3.7, 4.1.1.4] | |
156 | ------------------------------------- | |
157 | ||
158 | [SMTP lines are terminated by CRLF.] | |
159 | ||
160 | Exim recognizes LF without CR as a line terminator in all forms of input. For | |
161 | SMTP input, any preceding CR is discarded. An early version of Exim followed | |
162 | the RFC strictly, and did not recognize LF without CR in SMTP input. However, | |
163 | it seems that sites on the net send out messages with just LF terminators, | |
164 | despite the warnings in the RFCs, and other MTAs handle this, so Exim was | |
165 | changed. There is an undocumented compile-time macro called STRICT_CRLF which | |
166 | can be set to restore the strict behaviour. | |
167 | ||
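The tolerant line splitting described in 2.1 can be sketched as follows. This is an illustrative Python sketch, not Exim's C implementation. The function name split_smtp_lines is an assumption, and the strict flag is only a stand-in for the effect of the STRICT_CRLF macro, whose exact behaviour is not described here.

```python
from typing import List


def split_smtp_lines(data: bytes, strict: bool = False) -> List[bytes]:
    """Split raw SMTP input into lines without their terminators.

    Hypothetical helper for illustration. With strict=False a bare LF ends
    a line and one CR immediately before it is discarded. With strict=True
    (an assumed analogue of STRICT_CRLF) only CRLF ends a line.
    """
    separator = b"\r\n" if strict else b"\n"
    pieces = data.split(separator)
    lines, tail = pieces[:-1], pieces[-1]
    if not strict:
        # Discard a single CR that directly precedes the LF terminator.
        lines = [p[:-1] if p.endswith(b"\r") else p for p in lines]
    if tail:
        # Keep an unterminated final fragment as it stands.
        lines.append(tail)
    return lines
```

In the relaxed mode, input that mixes CRLF and bare LF terminators yields the same lines as input that uses CRLF throughout. In the strict mode a bare LF stays inside the line as data.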
168 | ||
169 | 2.2 Eight-bit characters [2.4.1] | |
170 | -------------------------------- | |
171 | ||
172 | [SMTP transmits only 7-bit characters.] | |
173 | ||
174 | Exim is eight-bit clean, and makes no attempt to modify the data in a message | |
175 | in any way. In particular, for messages containing characters with the top bit | |
176 | set, it neither tries to negotiate 8-bit transmission, nor converts such | |
177 | characters into an encoded form. In other words, it adopts the "just send 8" | |
178 | strategy. It can be configured to send out 8BITMIME in its response to EHLO | |
179 | (which it does not do by default), and it recognizes the 8BITMIME keyword on | |
180 | incoming messages, but neither of these affects its handling of message data. | |
181 | "Just send 8" is the strategy of a number of MTAs; it is argued that it | |
182 | achieves what the user wants more often than other strategies. | |
183 | ||
184 | ||
185 | 2.3 Use of EHLO/HELO [3.2] | |
186 | -------------------------- | |
187 | ||
188 | [Client MTAs should always start with EHLO, not HELO.] | |
189 | ||
190 | Exim sends EHLO only when it finds the string "ESMTP" in an SMTP greeting | |
191 | message. If EHLO is refused with a 5xx return code, it then reverts to HELO as | |
192 | required, but it does not contain logic for converting to HELO on other errors | |
193 | such as loss of connection or timeout after EHLO. That is one reason why it | |
194 | doesn't always send EHLO; there are reported to be ancient SMTP servers out | |
195 | there which collapse on receiving EHLO. (There is also at least one server | |
196 | whose banner reads "<host name> ignores ESMTP", but it is RFC 821 compliant in | |
197 | that it responds with 500 to EHLO, so Exim successfully reverts to HELO.) | |
198 | ||
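The client-side choice described in 2.3 can be sketched as follows. This is an illustrative Python sketch, not Exim code. The function names are assumptions, as is the case-sensitive match on "ESMTP". The "abandon attempt" outcome for replies that are neither 2xx nor 5xx is also an assumption; the text above says only that Exim has no logic for falling back to HELO in such cases.

```python
def choose_greeting(banner: str) -> str:
    """Pick the first command verb from the server's greeting line.

    Hypothetical helper: EHLO is chosen only when the greeting contains
    the string "ESMTP" (matched case-sensitively here, an assumption).
    """
    return "EHLO" if "ESMTP" in banner else "HELO"


def next_step_after_ehlo(reply_code: int) -> str:
    """Decide what to do with the server's reply to EHLO.

    A 5xx reply triggers the fallback to HELO. A 2xx reply means the
    session continues. Anything else is treated as a failed attempt,
    which is an assumption made for this sketch.
    """
    if 500 <= reply_code <= 599:
        return "send HELO"
    if 200 <= reply_code <= 299:
        return "proceed"
    return "abandon attempt"
```

With the banner quoted above ("<host name> ignores ESMTP"), choose_greeting selects EHLO because the string is present, and the server's 500 reply then leads to HELO.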
199 | ||
200 | 2.4 Closing the connection [4.1.1.10] | |
201 | ------------------------------------- | |
202 | ||
203 | [Client must wait for response to QUIT before closing the connection.] | |
204 | ||
205 | Exim closes the connection immediately after sending QUIT, without waiting for | |
206 | the reply. There was a lot of discussion about this on one of the mailing | |
207 | lists. The conclusion was that this behaviour is fine on Unix systems, which | |
208 | have TCP/IP implementations that close down the underlying channel tidily even | |
209 | when the associated process has terminated. Indeed, not waiting may be | |
210 | beneficial, as it moves the TIME_WAIT state (waiting to ensure there's no more | |
211 | data in transit) from the server to the client system. On some other operating | |
212 | systems (I understand) it is a disaster to terminate the sending process | |
213 | without waiting for the QUIT response, because all the data about the | |
214 | connection lives in the client's process space, and is therefore thrown away | |
215 | before the response arrives. The subsequent arrival of the response then causes | |
216 | bad behaviour. | |
217 | ||
218 | ||
219 | 2.5 IPv6 address literals [4.1.2] | |
220 | --------------------------------- | |
221 | ||
222 | [IPv6 address literals are introduced by "IPv6".] | |
223 | ||
224 | Exim recognizes IPv6 literals as just the colon-separated hexadecimal form of | |
225 | an IPv6 address, for example 1080:0:0:0:8:800:200C:417A, without the need for a | |
226 | prefix. At present, it does not even recognize the prefix. When IPv6 becomes | |
227 | more widespread, Exim will follow whatever the common usage is. | |
228 | ||
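As an illustration of 2.5, the following Python sketch uses the standard ipaddress module to accept the bare colon-separated form and to reject a literal carrying an "IPv6:" prefix. It is not Exim code, and the function name parse_ipv6_literal is an assumption. Note that the Python parser also accepts compressed forms using "::", which goes beyond what is stated above about Exim.

```python
import ipaddress
from typing import Optional


def parse_ipv6_literal(text: str) -> Optional[ipaddress.IPv6Address]:
    """Return an IPv6Address for a bare IPv6 literal, or None.

    Hypothetical helper for illustration. No "IPv6:" prefix is stripped,
    so a prefixed literal is not recognized, mirroring the behaviour
    described in section 2.5.
    """
    try:
        return ipaddress.IPv6Address(text)
    except ipaddress.AddressValueError:
        return None
```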
229 | ||
230 | 2.6 Underscores in domain names [4.1.2] | |
231 | --------------------------------------- | |
232 | ||
233 | [Underscores are not legal in domain names.] | |
234 | ||
235 | RFC 822 allows all characters except specials, space, and controls in domain | |
236 | names, but the SMTP RFCs are stricter, allowing only letters, digits, and | |
237 | hyphen. Exim is compliant when checking incoming addresses in SMTP commands, | |
238 | but it is more relaxed by default when checking domain names that are supplied | |
239 | by EHLO or HELO commands, because many client workstations get set up with | |
240 | underscores in their names. There is an option that can be set to cause Exim to | |
241 | refuse underscores. (There are also options to specify certain hosts from which | |
242 | it will accept any old junk after EHLO or HELO. Such is the woeful state of | |
243 | some SMTP clients.) | |
244 | ||
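The two levels of checking in 2.6 can be sketched as follows. This is an illustrative Python sketch, not Exim code. The function name domain_chars_ok and the allow_underscore flag are assumptions, and the sketch checks only the permitted character set, not label length or hyphen placement.

```python
import re

# Strict set: letters, digits, and hyphen, as the SMTP RFCs require.
_STRICT_LABEL = re.compile(r"[A-Za-z0-9-]+")
# Relaxed set: the same, plus underscore.
_RELAXED_LABEL = re.compile(r"[A-Za-z0-9_-]+")


def domain_chars_ok(domain: str, allow_underscore: bool = False) -> bool:
    """Check that every dot-separated label uses only permitted characters.

    Hypothetical helper for illustration. An empty label (for example
    from two consecutive dots) fails the check.
    """
    label = _RELAXED_LABEL if allow_underscore else _STRICT_LABEL
    return all(label.fullmatch(part) is not None for part in domain.split("."))
```

A name such as my_pc.example.com fails the strict check but passes with allow_underscore=True, which corresponds to the relaxed default for EHLO and HELO arguments.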
245 | ||
246 | 2.7 Removal of return-path headers [4.4] | |
247 | ---------------------------------------- | |
248 | ||
249 | [Relaying MTAs should not remove return-path.] | |
250 | ||
251 | Exim removes Return-Path: headers from all messages, if return_path_remove is | |
252 | set (the default). It does not attempt to determine if it is being a relay or | |
253 | not. Indeed, for some messages it might be both a relay and a final destination | |
254 | MTA for the same message. | |
255 | ||
256 | ||
257 | 2.8 Randomizing the order of addresses of multihomed hosts [5] | |
258 | -------------------------------------------------------------- | |
259 | ||
260 | [Multihomed host addresses should not be randomized.] | |
261 | ||
262 | Exim does randomize a list of several addresses for a single host, because | |
263 | caching in resolvers will defeat the round-robinning that many nameservers | |
264 | use. (Note: this is not the same as randomizing equal-valued MX records. That | |
265 | is required by the RFC.) | |
266 | ||
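The randomizing described in 2.8 amounts to shuffling the address list of one host before trying the addresses in turn. A minimal Python sketch follows; it is not Exim code, and the function name randomize_host_addresses and the optional rng parameter are assumptions made so that the example can be tested repeatably.

```python
import random
from typing import List, Optional, Sequence


def randomize_host_addresses(
    addresses: Sequence[str], rng: Optional[random.Random] = None
) -> List[str]:
    """Return the addresses of a single host in random order.

    Hypothetical helper for illustration. The input is copied, so the
    caller's list is left unchanged.
    """
    shuffled = list(addresses)
    # Use the supplied generator if given, else the module-level one.
    (rng or random).shuffle(shuffled)
    return shuffled
```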
267 | ||
268 | 2.9 Handling "MX points to self" [5] | |
269 | ------------------------------------ | |
270 | ||
271 | [MX points to self must be treated as an error.] | |
272 | ||
273 | The RFC doesn't allow for the possibility of special-purpose routing in the | |
274 | case when the lowest numbered MX record points to the local host. The default | |
275 | Exim configuration is compliant, but it is possible to configure Exim to behave | |
276 | differently, and there are several situations where this can be useful. | |
277 | ||
278 | ||
279 | 2.10 Source routing [6.1] | |
280 | ------------------------- | |
281 | ||
282 | [Source routes should be stripped.] | |
283 | ||
284 | The new RFC has moved forward in deprecating source-routed email addresses. | |
285 | Exim does not strip them down by default, but can be made to do so by setting | |
286 | collapse_source_routes. However, even when it is not stripping them down, it | |
287 | does not add host routing to reverse-paths when processing a source-routed | |
288 | forward-path. | |
289 | ||
290 | ||
291 | 2.11 Loop detection [6.2] | |
292 | ------------------------- | |
293 | ||
294 | [Loop count for Received: headers should be at least 100.] | |
295 | ||
296 | Exim's default setting of the received_headers_max option is 30. Most messages | |
297 | these days seem to accumulate fewer than half a dozen Received: headers, and | |
298 | even a couple of forwardings don't bring this anywhere near 30. | |
299 | ||
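The loop check in 2.11 is a count of Received: headers compared against a limit. The following Python sketch illustrates the idea; it is not Exim code. The function name too_many_hops is an assumption, and so is the choice to treat a message as looping only when the count exceeds the limit rather than when it reaches it.

```python
from typing import Iterable


def too_many_hops(header_lines: Iterable[str], received_headers_max: int = 30) -> bool:
    """Return True if the Received: header count exceeds the limit.

    Hypothetical helper for illustration. Only lines that begin a header
    are counted; continuation lines start with white space and so do not
    match. The default of 30 follows the text above.
    """
    count = sum(
        1 for line in header_lines if line.lower().startswith("received:")
    )
    return count > received_headers_max
```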
300 | ||
301 | 2.12 Addition of missing headers [6.3] | |
302 | -------------------------------------- | |
303 | ||
304 | [Missing headers may be added, and domains qualified, only if client is | |
305 | identified.] | |
306 | ||
307 | Exim always adds Message-Id: and Date: headers if these are missing, whatever | |
308 | the source of the message, and likewise when it expands non-fully-qualified | |
309 | domains, it does so independently of the message's source. | |
310 | ||
311 | ||
312 | 2.13 Syntax of MAIL and RCPT commands [4.1.1.2, 4.1.1.3] | |
313 | -------------------------------------------------------- | |
314 | ||
315 | Exim is more relaxed than the RFC requires: | |
316 | ||
317 | (1) Trailing white space is ignored. | |
318 | ||
319 | (2) It permits white space after the "FROM" and "TO" keywords. | |
320 | ||
321 | (3) It does not insist on the address being enclosed in <> characters. In fact, | |
322 | it recognizes addresses in RFC 822 format here, except that domain | |
323 | components are restricted to containing only letters, digits, and hyphens. | |
324 | ||
325 | (4) Local parts are permitted to contain null components, that is, may start or | |
326 | end with an unquoted full stop (period) or contain two consecutive | |
327 | unquoted full stops. | |
328 | ||
329 | ||
330 | 2.14 Non-fully-qualified domains [2.3.5] | |
331 | ---------------------------------------- | |
332 | ||
333 | [All domains must be fully qualified.] | |
334 | ||
335 | A domain that is not fully qualified has some of its trailing components | |
336 | missing, and is normally a local alias of some sort, for example, just a | |
337 | single-component host name. | |
338 | ||
339 | Exim can be configured to "widen" non-fully-qualified domains, either by using | |
340 | the facilities of the DNS resolver, or by an explicit list of widening strings. | |
341 | When this is done, it applies to addresses received by SMTP from other hosts, | |
342 | as well as to locally-originated addresses. Address re-writing could also be | |
343 | used for this purpose. | |
344 | ||
345 | ||
346 | 2.15 Unqualified addresses [4.1.2] | |
347 | ---------------------------------- | |
348 | ||
349 | [Addresses in SMTP commands must include domains.] | |
350 | ||
351 | An unqualified address consists of a local part without a domain. Do not | |
352 | confuse "qualified address" and "qualified domain". A qualified address may | |
353 | include a non-fully-qualified domain. | |
354 | ||
355 | There is one exception to the RFC rule: it is required that the unqualified | |
356 | address "<postmaster>" always be accepted. Apart from this, Exim rejects | |
357 | domainless addresses in SMTP commands by default, but it can be configured with | |
358 | a list of hosts and/or networks that are permitted to send addresses without | |
359 | domains in SMTP commands. Any such address that is accepted (including | |
360 | <postmaster>) is qualified by adding the value of the qualify_domain option. | |
361 | ||
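The handling of unqualified addresses in 2.15 can be sketched as follows. This is an illustrative Python sketch, not Exim code. The function name accept_smtp_address and the unqualified_hosts parameter are assumptions; only qualify_domain is taken from the text above. The test for "@" is deliberately naive and ignores quoted local parts.

```python
from typing import FrozenSet, Optional


def accept_smtp_address(
    address: str,
    sender_host: str,
    qualify_domain: str,
    unqualified_hosts: FrozenSet[str] = frozenset(),
) -> Optional[str]:
    """Return the address to use, or None if it is rejected.

    Hypothetical helper for illustration. An address that already has a
    domain is returned unchanged. A domainless address is qualified if it
    is "postmaster" or if the sending host is on the permitted list;
    otherwise it is rejected.
    """
    if "@" in address:
        return address
    if address.lower() == "postmaster" or sender_host in unqualified_hosts:
        return f"{address}@{qualify_domain}"
    return None
```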
362 | ||
363 | 2.16 VRFY and EXPN [3.5.1, 3.5.2, 3.5.3, 7.3] | |
364 | --------------------------------------------- | |
365 | ||
366 | [VRFY and EXPN should be supported.] | |
367 | ||
368 | Exim does not support VRFY and EXPN by default, but a list of hosts and | |
369 | networks for which they are permitted can be given. | |
370 | ||
371 | ||
372 | 2.17 Checking of EHLO/HELO commands [4.1.4] | |
373 | ------------------------------------------- | |
374 | ||
375 | [Client must send EHLO. Server must not refuse message if EHLO/HELO check | |
376 | fails.] | |
377 | ||
378 | Exim, as a client, always sends EHLO or HELO (see 2.3 above). As a server, it | |
379 | does not insist on there having been a valid EHLO or HELO command before the | |
380 | start of a message transaction. Any EHLO or HELO command that is received is | |
381 | rejected only if it contains a syntax error. That is, it is never rejected on | |
382 | the basis of any validation checking that may be performed on the data it | |
383 | contains. | |
384 | ||
385 | However, Exim can be configured to insist that (a) there is a valid EHLO/HELO | |
386 | command before any message transaction and (b) the domain in that command | |
387 | matches the domain obtained by looking up the IP address of the sending host. | |
388 | It is possible to specify exception lists of hosts and/or networks for which | |
389 | this check does not apply. | |
390 | ||
391 | ||
392 | 2.18 Format of delivery error messages [3.7] | |
393 | -------------------------------------------- | |
394 | ||
395 | [Standard report formats should be used if possible.] | |
396 | ||
397 | Exim's delivery failure reports do not conform to the format described in RFC | |
398 | 1894. | |
399 | ||
400 | ||
401 | ## End ## |