$Cambridge: exim/doc/doc-misc/RFC.conform,v 1.1 2004/10/08 10:38:47 ph10 Exp $

Conformance with RFCs
---------------------

Exim is written to follow the rules laid down in the RFCs. However, there are
some circumstances where it either extends what is specified, or chooses not to
follow them strictly, for various reasons. Sometimes variations are controlled
by an option, which may default on or off. This document lists the variations
from the latest email RFCs, and discusses their background and implications.

Last Updated: 25 January 1999


1. RFC 822
----------

The original specification of the format of Internet mail messages is RFC 822,
later clarified and modified by RFC 1123. At the time of writing (January 1999)
a new RFC (currently known as draft-ietf-drums-msg-fmt-07) which updates and
consolidates all the material related to the message format is at a late stage
of drafting, and is expected to become an Internet Standard in due course.

The following is (I hope) a complete list of major variations from the draft
RFC. References in square brackets are to the -07 draft.


1.1 Line termination [2.1, 2.3]
-------------------------------

[Lines are terminated by CRLF; isolated CR and LF are not permitted.]

The CRLF requirement has to be interpreted carefully, because the RFC also says
that it does not cover the internal format "used by sites". Exim keeps messages
on its spool in Unix format, using only LF as the line terminator, and also
does local deliveries using only LF. I believe this is compliant with the RFC,
as these are both "internal formats".

Messages sent out by SMTP have CRLF line terminators. However, isolated CR
characters are treated as any other data characters, because Exim is eight-bit
clean (see 1.2 below).

See 2.1 below for a discussion of line terminators in incoming messages.


1.2 Eight-bit characters [2.1]
------------------------------

[Messages consist of 7-bit characters.]

Exim is eight-bit clean. It does not do any processing of the characters in the
body of a message.


1.3 Maximum line length [2.1, 2.3]
----------------------------------

[The maximum length of a line is 998 characters.]

Exim does not enforce any limit on line length.


1.4 The "phrase" part of an address [3.4]
-----------------------------------------

[The phrase is a sequence of "words"; a word is an "atom" or a quoted string.]

The characters that can be used in an "atom" do not include the full stop
(dot, period). Thus a header line such as

   To: John Q. Public <jqp@anywhere.org>

is syntactically invalid under a strict interpretation of the RFC because the
dot in the phrase part is not quoted. However, many MTAs do not enforce this
restriction, so Exim was changed to be relaxed about it as well. In fact, the
draft RFC is moving towards allowing this. In section [4.1], which is defining
"obsolete" syntax that programs must accept (but not generate), it says this:

   The period character is added to obs-phrase.

   Note: The period character in obs-phrase is not a form that was allowed
   in earlier versions of this or any other standard. Period (nor any other
   character from specials) was not allowed in phrase because it introduced
   a parsing difficulty distinguishing between phrases and portions of an
   addr-spec (see section 4.4). It appears here because the period
   character is currently used in many messages in the display-name portion
   of addresses, especially for initials in names, and therefore must be
   interpreted properly. In the future, period may appear in the regular
   syntax of phrase.


1.5 Source routed addresses [4.4]
---------------------------------

[Source routed addresses are always enclosed in <>.]

Source routed addresses are declared obsolete in the draft RFC, but MTAs are
still required to handle them. Strictly, a source-routed address must be
enclosed in <> characters, so a header such as

   From: @a,@b:c@d

is syntactically invalid. Exim does not enforce this restriction.


1.6 Local parts [3.4.1]
-----------------------

[Dots in unquoted local parts may not be consecutive or at either end.]

Exim allows unquoted local parts to begin or end with a dot (period, full
stop), and it also permits two consecutive dots in a local part.

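The difference between the strict rule and the relaxed one can be sketched in
a few lines of Python. This is only an illustration, not Exim's own parser:
the function names are hypothetical, and the ATOM pattern is a simplified
stand-in for the RFC definition (any run of characters other than specials,
white space, and controls).

```python
import re

# Simplified "atom" (an assumption): one or more characters that are not
# specials, white space, or control characters.
ATOM = r'[^\s()<>@,;:\\".\[\]\x00-\x1f\x7f]+'

# Strict rule: atoms separated by single dots, no dot at either end.
STRICT_LOCAL_PART = re.compile(rf"{ATOM}(?:\.{ATOM})*")

# Relaxed rule: any component between dots may be empty.
RELAXED_LOCAL_PART = re.compile(rf"(?:{ATOM})?(?:\.(?:{ATOM})?)*")


def is_strict_local_part(text: str) -> bool:
    """True if text is an unquoted local part under the strict rule."""
    return STRICT_LOCAL_PART.fullmatch(text) is not None


def is_relaxed_local_part(text: str) -> bool:
    """True if text is acceptable once null components are tolerated."""
    return bool(text) and RELAXED_LOCAL_PART.fullmatch(text) is not None
```

Under these assumptions "john.smith" passes both checks, whereas ".john",
"john." and "john..smith" pass only the relaxed one.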


2. RFC 821
----------

The original specification of SMTP is RFC 821, later clarified and modified by
RFC 1123. Domain name system requirements and their implications for mail are
covered in RFCs 1035 and 974. A scheme for extending the SMTP protocol is
described in RFC 1869, and there are subsequent RFCs specifying particular
extensions.

At the time of writing (January 1999) a new RFC (currently known as
draft-ietf-drums-smtpupd-09) which updates and consolidates all the material
connected with SMTP message transmission is at a late stage of drafting, and is
expected to become an Internet Standard in due course.

The new draft is written using the terms MUST, SHOULD, and MAY, which, when
written in capital letters, have precise meanings. To quote from the draft:

   "MUST" or "MUST NOT" identify absolute requirements for conformance to
   this specification. Implementations that do not conform to them lie
   outside the scope of this specification and often will not
   interoperate properly with SMTP implementations that do conform.
   Implementations that are fully conforming also adhere to all "SHOULD"
   and "SHOULD NOT" requirements. Implementations that adhere to all
   "MUST" ("MUST NOT") but not to all of these are considered to be
   partially conforming. Such implementations may interoperate properly
   with fully conforming ones and with each other, but this will
   typically be the case only if great care is taken. Consequently, an
   implementation should violate "SHOULD" ("SHOULD NOT") requirements
   only under exceptional and well-understood circumstances.

The implementation of Exim is intended to conform to the spirit of this
paragraph. The following is (I hope) a complete list of major variations
from the draft RFC. In addition to the items listed here, there are other minor
extensions such as the tolerance of white space in places where it is not
strictly permitted by the RFC. References in square brackets are to the -09
draft sections, and brief summaries of the RFC requirement are also given in
square brackets.


2.1 Line termination [2.3.7, 4.1.1.4]
-------------------------------------

[SMTP lines are terminated by CRLF.]

Exim recognizes LF without CR as a line terminator in all forms of input. For
SMTP input, any preceding CR is discarded. An early version of Exim followed
the RFC strictly, and did not recognize LF without CR in SMTP input. However,
it seems that some sites on the net send out messages with just LF
terminators, despite the warnings in the RFCs, and other MTAs handle this, so
Exim was changed. There is, however, an undocumented compile-time macro called
STRICT_CRLF that can be set to restore the strict behaviour.

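As a rough illustration of this policy (and of the outgoing CRLF rule in 1.1),
the following Python sketch splits incoming data on LF, drops a CR that
immediately precedes it, and leaves any other CR alone as ordinary data. The
function names are hypothetical, and the code is an assumption about how the
described behaviour could be expressed, not a transcription of Exim's source.

```python
from typing import List


def split_smtp_lines(data: bytes) -> List[bytes]:
    """Split input on LF, discarding one CR immediately before each LF."""
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # the data ended with a line terminator
    # An isolated CR elsewhere in a line is kept as ordinary data.
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]


def to_smtp_wire(lines: List[bytes]) -> bytes:
    """Join internally stored (LF-style) lines with CRLF for SMTP output."""
    return b"".join(line + b"\r\n" for line in lines)
```

With this sketch, input that mixes CRLF and bare LF terminators yields the
same list of lines as input that uses CRLF throughout.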

2.2 Eight-bit characters [2.4.1]
--------------------------------

[SMTP transmits only 7-bit characters.]

Exim is eight-bit clean, and makes no attempt to modify the data in a message
in any way. In particular, for messages containing characters with the top bit
set, it neither tries to negotiate 8-bit transmission, nor converts such
characters into an encoded form. In other words, it adopts the "just send 8"
strategy. It can be configured to send out 8BITMIME in its response to EHLO
(which it does not do by default), and it recognizes the 8BITMIME keyword on
incoming messages, but neither of these affects its handling of message data.
"Just send 8" is the strategy of a number of MTAs; it is argued that it
achieves what the user wants more often than other strategies.


2.3 Use of EHLO/HELO [3.2]
--------------------------

[Client MTAs should always start with EHLO, not HELO.]

Exim sends EHLO only when it finds the string "ESMTP" in an SMTP greeting
message. If EHLO is refused with a 5xx return code, it then reverts to HELO as
required, but it does not contain logic for converting to HELO on other errors
such as loss of connection or timeout after EHLO. That is one reason why it
doesn't always send EHLO; there are reported to be ancient SMTP servers out
there which collapse on receiving EHLO. (There is also at least one server
whose banner reads "<host name> ignores ESMTP", but it is RFC 821 compliant in
that it responds with 500 to EHLO, so Exim successfully reverts to HELO.)

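The decision described above amounts to two small tests, sketched here in
Python. The function names are hypothetical and the code only models the
choice of command; it does not open a connection or handle timeouts, which
matches the statement that there is no fallback on errors other than 5xx.

```python
from typing import Optional


def initial_command(banner: str) -> str:
    """Pick EHLO only if the greeting contains the string "ESMTP"."""
    return "EHLO" if "ESMTP" in banner else "HELO"


def fallback_after_ehlo(reply_code: int) -> Optional[str]:
    """Return "HELO" if EHLO was refused with a 5xx code, otherwise None."""
    return "HELO" if 500 <= reply_code <= 599 else None
```

A banner such as "<host name> ignores ESMTP" still contains the string, so
initial_command() chooses EHLO; the server's 500 reply then makes
fallback_after_ehlo() return "HELO".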

2.4 Closing the connection [4.1.1.10]
-------------------------------------

[Client must wait for response to QUIT before closing the connection.]

Exim closes the connection immediately after sending QUIT, without waiting for
the reply. There was a lot of discussion about this on one of the mailing
lists. The conclusion was that this behaviour is fine on Unix systems, which
have TCP/IP implementations that close down the underlying channel tidily even
when the associated process has terminated. Indeed, not waiting may be
beneficial, as it moves the TIME_WAIT state (waiting to ensure there's no more
data in transit) from the server to the client system. On some other operating
systems (I understand) it is a disaster to terminate the sending process
without waiting for the QUIT response, because all the data about the
connection lives in the client's process space, and is therefore thrown away
before the response arrives. The subsequent arrival of the response then causes
bad behaviour.


2.5 IPv6 address literals [4.1.2]
---------------------------------

[IPv6 address literals are introduced by "IPv6".]

Exim recognizes IPv6 literals as just the colon-separated hexadecimal form of
an IPv6 address, for example 1080:0:0:0:8:800:200C:417A, without the need for a
prefix. At present, it does not even recognize the prefix. When IPv6 becomes
more widespread, Exim will follow whatever the common usage is.

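A minimal Python sketch of "bare form only" recognition follows. It uses the
standard library's ipaddress module as a stand-in for Exim's own recognizer,
which is an assumption: the two may disagree on edge cases such as compressed
or scoped addresses. The function name is hypothetical.

```python
import ipaddress


def is_bare_ipv6_literal(text: str) -> bool:
    """True if text parses as an IPv6 address with no prefix attached."""
    try:
        ipaddress.IPv6Address(text)
    except ValueError:
        # Covers malformed addresses, IPv4 addresses, and any string that
        # carries a prefix in front of the address.
        return False
    return True
```

The example address from the text is accepted, while the same address with a
prefix in front of it is rejected, mirroring the behaviour described.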

2.6 Underscores in domain names [4.1.2]
---------------------------------------

[Underscores are not legal in domain names.]

RFC 822 allows all characters except specials, space, and controls in domain
names, but the SMTP RFCs are stricter, allowing only letters, digits, and
hyphen. Exim is compliant when checking incoming addresses in SMTP commands,
but it is more relaxed by default when checking domain names that are supplied
by EHLO or HELO commands, because many client workstations get set up with
underscores in their names. There is an option that can be set to cause Exim to
refuse underscores. (There are also options to specify certain hosts from which
it will accept any old junk after EHLO or HELO. Such is the woeful state of
some SMTP clients.)

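The two levels of checking can be pictured with the Python sketch below. The
function name and the allow_underscore parameter are hypothetical and do not
correspond to an Exim option name; the strict pattern follows the "letters,
digits, and hyphen" rule quoted above.

```python
import re

STRICT_LABEL = re.compile(r"[A-Za-z0-9-]+")
RELAXED_LABEL = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_domain(domain: str, allow_underscore: bool = False) -> bool:
    """Check every dot-separated component of a domain name.

    With allow_underscore=False this models the check on addresses in SMTP
    commands; with True it models the relaxed check on EHLO/HELO names.
    """
    pattern = RELAXED_LABEL if allow_underscore else STRICT_LABEL
    return all(pattern.fullmatch(label) for label in domain.split("."))
```

A name such as "my_pc.example.com" fails the strict check but passes when
underscores are allowed.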

2.7 Removal of return-path headers [4.4]
----------------------------------------

[Relaying MTAs should not remove return-path.]

Exim removes Return-Path: headers from all messages, if return_path_remove is
set (the default). It does not attempt to determine if it is being a relay or
not. Indeed, for some messages it might be both a relay and a final destination
MTA for the same message.


2.8 Randomizing the order of addresses of multihomed hosts [5]
--------------------------------------------------------------

[Multihomed host addresses should not be randomized.]

Exim does randomize a list of several addresses for a single host, because
caching in resolvers will defeat the round-robinning that many nameservers
use. (Note: this is not the same as randomizing equal-valued MX records. That
is required by the RFC.)

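In outline, the randomization is just a shuffle of the address list for one
host before delivery is attempted. The Python sketch below shows the idea; the
function name and the optional rng parameter are hypothetical, and the actual
method Exim uses to pick an order is not described in this document.

```python
import random
from typing import List, Optional


def randomized_addresses(
    addresses: List[str], rng: Optional[random.Random] = None
) -> List[str]:
    """Return the addresses of one host in a random order.

    The input list is left untouched; rng can be supplied to make the
    order reproducible in tests.
    """
    rng = rng or random.Random()
    shuffled = list(addresses)
    rng.shuffle(shuffled)
    return shuffled
```

Because the shuffle happens on every use, a cached resolver answer that always
lists the addresses in the same order no longer pins all traffic to the first
one.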

2.9 Handling "MX points to self" [5]
------------------------------------

[MX points to self must be treated as an error.]

The RFC doesn't allow for the possibility of special-purpose routing in the
case when the lowest numbered MX record points to the local host. The default
Exim configuration is compliant, but it is possible to configure Exim to behave
differently, and there are several situations where this can be useful.


2.10 Source routing [6.1]
-------------------------

[Source routes should be stripped.]

The new RFC has moved forward in deprecating source-routed email addresses.
Exim does not strip them down by default, but can be made to do so by setting
collapse_source_routes. However, even when it is not stripping them down, it
does not add host routing to reverse-paths when processing a source-routed
forward-path.


2.11 Loop detection [6.2]
-------------------------

[Loop count for Received: headers should be at least 100.]

Exim's default setting of the received_headers_max option is 30. Most messages
these days seem to accumulate fewer than half a dozen Received: headers, and
even a couple of forwardings don't bring this anywhere near 30.

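Loop detection of this kind comes down to counting Received: header lines and
comparing the count with the limit. The Python sketch below assumes that a
count strictly above the limit signals a loop; the text does not state the
exact comparison Exim makes, and the function names are hypothetical.

```python
from typing import Iterable

RECEIVED_HEADERS_MAX = 30  # default of received_headers_max quoted above


def count_received(header_lines: Iterable[str]) -> int:
    """Count header lines that start a Received: header.

    Folded continuation lines begin with white space, so they are not
    counted a second time.
    """
    return sum(
        1 for line in header_lines if line.lower().startswith("received:")
    )


def looks_like_loop(
    header_lines: Iterable[str], limit: int = RECEIVED_HEADERS_MAX
) -> bool:
    """Assumption: more Received: headers than the limit means a loop."""
    return count_received(header_lines) > limit
```

With the default limit, a message carrying half a dozen Received: headers is
nowhere near being flagged.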

2.12 Addition of missing headers [6.3]
--------------------------------------

[Missing headers may be added, and domains qualified, only if client is
identified.]

Exim always adds Message-Id: and Date: headers if these are missing, whatever
the source of the message, and likewise when it expands non-fully-qualified
domains, it does so independently of the message's source.


2.13 Syntax of MAIL and RCPT commands [4.1.1.2, 4.1.1.3]
--------------------------------------------------------

Exim is more relaxed than the RFC requires:

(1) Trailing white space is ignored.

(2) It permits white space after the "FROM" and "TO" keywords.

(3) It does not insist on the address being enclosed in <> characters. In fact,
    it recognizes addresses in RFC 822 format here, except that domain
    components are restricted to containing only letters, digits, and hyphens.

(4) Local parts are permitted to contain null components, that is, may start or
    end with an unquoted full stop (period) or contain two consecutive
    unquoted full stops.

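Relaxations (1) to (3) can be illustrated with a small Python parser. This is
a sketch under stated assumptions, not Exim's parser: the function name is
hypothetical, white space is tolerated on either side of the colon because the
text does not say exactly where it may fall, ESMTP parameters after the
address are not handled, and the address itself is not validated.

```python
import re
from typing import Optional

# Keyword, optional white space around the colon, optional angle brackets,
# then optional trailing white space.
RELAXED_COMMAND = re.compile(
    r"(MAIL\s+FROM|RCPT\s+TO)\s*:\s*(<?)([^<>\s]*)(>?)\s*",
    re.IGNORECASE,
)


def parse_path_command(line: str) -> Optional[str]:
    """Return the address from a MAIL or RCPT command, or None if invalid."""
    match = RELAXED_COMMAND.fullmatch(line)
    if match is None:
        return None
    opening, address, closing = match.group(2), match.group(3), match.group(4)
    if bool(opening) != bool(closing):
        return None  # brackets must be both present or both absent
    return address
```

Under these assumptions "MAIL FROM:<a@example.com>" and
"MAIL FROM: a@example.com" followed by trailing spaces both yield the same
address.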

2.14 Non-fully-qualified domains [2.3.5]
----------------------------------------

[All domains must be fully qualified.]

A domain that is not fully qualified has some of its trailing components
missing, and is normally a local alias of some sort, for example, just a
single-component host name.

Exim can be configured to "widen" non-fully-qualified domains, either by using
the facilities of the DNS resolver, or by an explicit list of widening strings.
When this is done, it applies to addresses received by SMTP from other hosts,
as well as to locally-originated addresses. Address re-writing could also be
used for this purpose.


2.15 Unqualified addresses [4.1.2]
----------------------------------

[Addresses in SMTP commands must include domains.]

An unqualified address consists of a local part without a domain. Do not
confuse "qualified address" and "qualified domain". A qualified address may
include a non-fully-qualified domain.

There is one exception to the RFC rule: it is required that the unqualified
address "<postmaster>" always be accepted. Apart from this, Exim rejects
domainless addresses in SMTP commands by default, but it can be configured with
a list of hosts and/or networks that are permitted to send addresses without
domains in SMTP commands. Any such address that is accepted (including
<postmaster>) is qualified by adding the value of the qualify_domain option.

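The acceptance rule can be summarized in a short Python sketch. The function
name and the sender_may_omit_domain flag are hypothetical; the flag stands for
"the sending host is on the permitted list", and the case-insensitive match on
"postmaster" is an assumption.

```python
from typing import Optional


def qualify(
    address: str, qualify_domain: str, sender_may_omit_domain: bool
) -> Optional[str]:
    """Return the address to use, or None if it should be rejected."""
    if "@" in address:
        return address  # already has a domain; nothing to add
    if address.lower() == "postmaster" or sender_may_omit_domain:
        return f"{address}@{qualify_domain}"
    return None  # domainless address from a host that is not permitted
```

For example, with qualify_domain set to "example.com", "postmaster" from any
host becomes "postmaster@example.com", whereas "fred" is accepted only from a
permitted host.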

2.16 VRFY and EXPN [3.5.1, 3.5.2, 3.5.3, 7.3]
---------------------------------------------

[VRFY and EXPN should be supported.]

Exim does not support VRFY and EXPN by default, but a list of hosts and
networks for which they are permitted can be given.


2.17 Checking of EHLO/HELO commands [4.1.4]
-------------------------------------------

[Client must send EHLO. Server must not refuse message if EHLO/HELO check
fails.]

Exim, as a client, always sends EHLO or HELO (see 2.3 above). As a server, it
does not insist on there having been a valid EHLO or HELO command before the
start of a message transaction. Any EHLO or HELO command that is received is
rejected only if it contains a syntax error. That is, it is never rejected on
the basis of any validation checking that may be performed on the data it
contains.

However, Exim can be configured to insist that (a) there is a valid EHLO/HELO
command before any message transaction and (b) the domain in that command
matches the domain obtained by looking up the IP address of the sending host.
It is possible to specify exception lists of hosts and/or networks for which
this check does not apply.


2.18 Format of delivery error messages [3.7]
--------------------------------------------

[Standard report formats should be used if possible.]

Exim's delivery failure reports do not conform to the format described in RFC
1894.


## End ##