Conformance with RFCs
---------------------

Exim is written to follow the rules laid down in the RFCs. However, there are
some circumstances where it either extends what is specified, or chooses not to
follow them strictly, for various reasons. Sometimes variations are controlled
by an option, which may default on or off. This document lists the variations
from the latest email RFCs, and discusses their background and implications.

Last Updated: 25 January 1999


1. RFC 822
----------

The original specification of the format of Internet mail messages is RFC 822,
later clarified and modified by RFC 1123. At the time of writing (January 1999)
a new RFC (currently known as draft-ietf-drums-msg-fmt-07) which updates and
consolidates all the material related to the message format is at a late stage
of drafting, and is expected to become an Internet Standard in due course.

The following is (I hope) a complete list of major variations from the draft
RFC. References in square brackets are to the -07 draft.


1.1 Line termination [2.1, 2.3]
-------------------------------

[Lines are terminated by CRLF; isolated CR and LF are not permitted.]

The CRLF requirement has to be interpreted carefully, because the RFC also says
that it does not cover the internal format "used by sites". Exim keeps messages
on its spool in Unix format, using only LF as the line terminator, and also
does local deliveries using only LF. I believe this is compliant with the RFC,
as these are both "internal formats".

Messages sent out by SMTP have CRLF line terminators. However, isolated CR
characters are treated as any other data characters, because Exim is eight-bit
clean (see 1.2 below).

See 2.1 below for a discussion of line terminators in incoming messages.

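As an illustration only, the outgoing half of 1.1 might be sketched as below.
This is Python, not Exim's own C code, and the function name is invented for
the example. Only LF is treated as a line terminator, so an isolated CR passes
through as ordinary data. SMTP dot-stuffing is deliberately left out.

```python
def spool_to_wire(spool_data: bytes) -> bytes:
    """Convert LF-terminated spool data to CRLF-terminated SMTP data.

    Hypothetical helper: only LF counts as a line terminator. A CR that is
    already in the data is just another byte and is copied unchanged.
    """
    return spool_data.replace(b"\n", b"\r\n")
```

With this sketch, spool_to_wire(b"a\nb\rc\n") leaves the CR between "b" and "c"
alone and ends each of the two lines with CRLF.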

1.2 Eight-bit characters [2.1]
------------------------------

[Messages consist of 7-bit characters.]

Exim is eight-bit clean. It does not do any processing of the characters in the
body of a message.


1.3 Maximum line length [2.1, 2.3]
----------------------------------

[The maximum length of a line is 998 characters.]

Exim does not enforce any limit on line length.


1.4 The "phrase" part of an address [3.4]
-----------------------------------------

[The phrase is a sequence of "words"; a word is an "atom" or a quoted string.]

The characters that can be used in an "atom" do not include the full stop
(dot, period). Thus a header line such as

  To: John Q. Public <jqp@anywhere.org>

is syntactically invalid under a strict interpretation of the RFC because the
dot in the phrase part is not quoted. However, many MTAs do not enforce this
restriction, so Exim was changed to be relaxed about it as well. In fact, the
draft RFC is moving towards allowing this. In section [4.1], which is defining
"obsolete" syntax that programs must accept (but not generate), it says this:

  The period character is added to obs-phrase.

  Note: The period character in obs-phrase is not a form that was allowed
  in earlier versions of this or any other standard. Period (nor any other
  character from specials) was not allowed in phrase because it introduced
  a parsing difficulty distinguishing between phrases and portions of an
  addr-spec (see section 4.4). It appears here because the period
  character is currently used in many messages in the display-name portion
  of addresses, especially for initials in names, and therefore must be
  interpreted properly. In the future, period may appear in the regular
  syntax of phrase.

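The difference between the strict and the relaxed reading of a phrase can be
shown with a small Python sketch. It is an illustration under simplifying
assumptions, not Exim's parser: the names is_atom, phrase_ok and allow_dot are
made up here, and quoted strings are not handled at all. The set of "specials"
is the one given in RFC 822.

```python
# The "specials" of RFC 822; the full stop is one of them.
SPECIALS = set('()<>@,;:\\".[]')


def is_atom(word: str, allow_dot: bool = False) -> bool:
    """Return True if word is an acceptable atom.

    With allow_dot=True the full stop is tolerated, which models the
    relaxed behaviour described above.
    """
    if not word:
        return False
    for ch in word:
        if ch == "." and allow_dot:
            continue
        # Atoms exclude specials, space and controls. Message text is
        # 7-bit in the RFC, so characters above 126 are rejected too.
        if ch in SPECIALS or not (32 < ord(ch) < 127):
            return False
    return True


def phrase_ok(phrase: str, allow_dot: bool = False) -> bool:
    """Check a phrase that consists only of unquoted words."""
    words = phrase.split()
    return bool(words) and all(is_atom(w, allow_dot) for w in words)
```

Here phrase_ok("John Q. Public") fails because of the unquoted dot, while the
same call with allow_dot=True succeeds.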

1.5 Source routed addresses [4.4]
---------------------------------

[Source routed addresses are always enclosed in <>.]

Source routed addresses are declared obsolete in the draft RFC, but MTAs are
still required to handle them. Strictly, a source-routed address must be
enclosed in <> characters, so a header such as

  From: @a,@b:c@d

is syntactically invalid. Exim does not enforce this restriction.


1.6 Local parts [3.4.1]
-----------------------

[Dots in unquoted local parts may not be consecutive or at either end.]

Exim allows unquoted local parts to begin or end with a dot (period, full
stop), and it also permits two consecutive dots in a local part.

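The dot rule for local parts amounts to saying that no dot-separated component
may be empty. A minimal Python sketch of the strict rule and of the relaxed
behaviour follows. The function names and the "strict" parameter are
hypothetical, and no other aspect of local part syntax is checked.

```python
def has_null_component(local_part: str) -> bool:
    """True if the local part starts or ends with a dot, or contains two
    consecutive dots (that is, has an empty component)."""
    return "" in local_part.split(".")


def local_part_dots_ok(local_part: str, strict: bool = False) -> bool:
    """Apply the dot rule strictly (as the RFC has it) or not at all."""
    if not local_part:
        return False
    return not (strict and has_null_component(local_part))
```

For example, "john..doe" and ".john" pass with the relaxed default but fail when
strict=True is given.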


2. RFC 821
----------

The original specification of SMTP is RFC 821, later clarified and modified by
RFC 1123. Domain name system requirements and their implications for mail are
covered in RFCs 1035 and 974. A scheme for extending the SMTP protocol is
described in RFC 1869, and there are subsequent RFCs specifying particular
extensions.

At the time of writing (January 1999) a new RFC (currently known as
draft-ietf-drums-smtpupd-09) which updates and consolidates all the material
connected with SMTP message transmission is at a late stage of drafting, and is
expected to become an Internet Standard in due course.

The new draft is written using the terms MUST, SHOULD, and MAY, which, when
written in capital letters, have precise meanings. To quote from the draft:

  "MUST" or "MUST NOT" identify absolute requirements for conformance to
  this specification. Implementations that do not conform to them lie
  outside the scope of this specification and often will not
  interoperate properly with SMTP implementations that do conform.
  Implementations that are fully conforming also adhere to all "SHOULD"
  and "SHOULD NOT" requirements. Implementations that adhere to all
  "MUST" ("MUST NOT") but not to all of these are considered to be
  partially conforming. Such implementations may interoperate properly
  with fully conforming ones and with each other, but this will
  typically be the case only if great care is taken. Consequently, an
  implementation should violate "SHOULD" ("SHOULD NOT") requirements
  only under exceptional and well-understood circumstances.

The implementation of Exim is intended to conform to the spirit of this
paragraph. The following is (I hope) a complete list of major variations
from the draft RFC. In addition to the items listed here, there are other minor
extensions such as the tolerance of white space in places where it is not
strictly permitted by the RFC. References in square brackets are to the -09
draft sections, and brief summaries of the RFC requirement are also given in
square brackets.


2.1 Line termination [2.3.7, 4.1.1.4]
-------------------------------------

[SMTP lines are terminated by CRLF.]

Exim recognizes LF without CR as a line terminator in all forms of input. For
SMTP input, any preceding CR is discarded. An early version of Exim followed
the RFC strictly, and did not recognize LF without CR in SMTP input. However,
it seems that some sites on the net send out messages with just LF terminators,
despite the warnings in the RFCs, and other MTAs handle this, so Exim was
changed. There is a compile-time macro called STRICT_CRLF which can be set to
restore the strict behaviour, though this is undocumented.

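The two ways of recognizing line ends in SMTP input can be contrasted in a
short Python sketch. It is only a model of the behaviour described above: the
function name is invented, the strict_crlf parameter merely stands in for the
effect of the STRICT_CRLF macro, and "any preceding CR is discarded" is taken
here to mean the single CR directly before the LF, which is an assumption.

```python
def split_smtp_lines(data: bytes, strict_crlf: bool = False) -> list:
    """Split complete lines out of a block of received data.

    Relaxed mode: LF ends a line, and one CR directly before it is dropped.
    Strict mode: only CRLF ends a line; a bare LF stays in the data.
    """
    if strict_crlf:
        parts = data.split(b"\r\n")
    else:
        parts = [p[:-1] if p.endswith(b"\r") else p
                 for p in data.split(b"\n")]
    # The final element is whatever follows the last terminator, that is,
    # an incomplete line. It is dropped to keep the sketch short.
    return parts[:-1]
```

Given b"EHLO a\r\nMAIL FROM:<x@y>\nQUIT\r\n", the relaxed mode yields three
lines, whereas the strict mode yields two, the second of which still contains
the bare LF.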

2.2 Eight-bit characters [2.4.1]
--------------------------------

[SMTP transmits only 7-bit characters.]

Exim is eight-bit clean, and makes no attempt to modify the data in a message
in any way. In particular, for messages containing characters with the top bit
set, it neither tries to negotiate 8-bit transmission, nor converts such
characters into an encoded form. In other words, it adopts the "just send 8"
strategy. It can be configured to send out 8BITMIME in its response to EHLO
(which it does not do by default), and it recognizes the 8BITMIME keyword on
incoming messages, but neither of these affects its handling of message data.
"Just send 8" is the strategy of a number of MTAs; it is argued that it
achieves what the user wants more often than other strategies.


2.3 Closing the connection [4.1.1.10]
-------------------------------------

[Client must wait for response to QUIT before closing the connection.]

Exim closes the connection immediately after sending QUIT, without waiting for
the reply. There was a lot of discussion about this on one of the mailing
lists. The conclusion was that this behaviour is fine on Unix systems, which
have TCP/IP implementations that close down the underlying channel tidily even
when the associated process has terminated. Indeed, not waiting may be
beneficial, as it moves the TIME_WAIT state (waiting to ensure there's no more
data in transit) from the server to the client system. On some other operating
systems (I understand) it is a disaster to terminate the sending process
without waiting for the QUIT response, because all the data about the
connection lives in the client's process space, and is therefore thrown away
before the response arrives. The subsequent arrival of the response then causes
bad behaviour.


2.4 IPv6 address literals [4.1.2]
---------------------------------

[IPv6 address literals are introduced by "IPv6".]

Exim recognizes IPv6 literals as just the colon-separated hexadecimal form of
an IPv6 address, for example 1080:0:0:0:8:800:200C:417A, without the need for a
prefix. At present, it does not even recognize the prefix. When IPv6 becomes
more widespread, Exim will follow whatever the common usage is.

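To make the distinction concrete, here is a small Python sketch that accepts a
bare IPv6 literal and rejects one carrying the "IPv6" prefix. It leans on
Python's standard ipaddress module rather than on anything in Exim, and the
function name is made up. Note that the module also accepts forms such as the
compressed "::" notation, which go beyond the plain form shown above.

```python
import ipaddress


def is_bare_ipv6_literal(text: str) -> bool:
    """True if text parses as an IPv6 address with no prefix in front."""
    try:
        ipaddress.IPv6Address(text)
    except ValueError:
        # AddressValueError is a subclass of ValueError.
        return False
    return True
```

The example address from the text is accepted as it stands, but not when
written as IPv6:1080:0:0:0:8:800:200C:417A.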

2.5 Underscores in domain names [4.1.2]
---------------------------------------

[Underscores are not legal in domain names.]

RFC 822 allows all characters except specials, space, and controls in domain
names, but the SMTP RFCs are stricter, allowing only letters, digits, and
hyphen. Exim is compliant when checking incoming addresses in SMTP commands,
but it is more relaxed by default when checking domain names that are supplied
by EHLO or HELO commands, because many client workstations get set up with
underscores in their names. There is an option that can be set to cause Exim to
refuse underscores. (There are also options to specify certain hosts from which
it will accept any old junk after EHLO or HELO. Such is the woeful state of
some SMTP clients.)

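The two levels of checking might be pictured with the following Python sketch.
It is an illustration, not Exim's code: the function name and the
allow_underscore parameter are hypothetical, and only the character set is
checked (nothing about label length or the position of hyphens).

```python
import re


def domain_syntax_ok(domain: str, allow_underscore: bool = False) -> bool:
    """Check that every dot-separated component uses only permitted
    characters: letters, digits and hyphen, plus underscore on request."""
    extra = "_" if allow_underscore else ""
    label = re.compile(r"[A-Za-z0-9" + extra + r"-]+")
    return all(label.fullmatch(part) for part in domain.split("."))
```

A name such as my_pc.example.com fails the strict check that suits addresses in
SMTP commands, but passes with allow_underscore=True, which corresponds to the
relaxed treatment of EHLO and HELO arguments.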

2.6 Removal of return-path headers [4.4]
----------------------------------------

[Relaying MTAs should not remove return-path.]

Exim removes Return-Path: headers from all messages, if return_path_remove is
set (the default). It does not attempt to determine if it is being a relay or
not. Indeed, for some messages it might be both a relay and a final destination
MTA for the same message.


2.7 Randomizing the order of addresses of multihomed hosts [5]
--------------------------------------------------------------

[Multihomed host addresses should not be randomized.]

Exim does randomize a list of several addresses for a single host, because
caching in resolvers will defeat the round-robinning that many nameservers
use. (Note: this is not the same as randomizing equal-valued MX records. That
is required by the RFC.)

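In outline, the randomizing is no more than a shuffle of the addresses found
for one host. The Python sketch below shows the idea. The function name is
invented, the addresses are documentation examples, and how Exim actually
chooses its ordering is not specified here.

```python
import random


def order_host_addresses(addresses, rng=None):
    """Return the addresses of a single host in random order.

    The caller's list is left untouched. An rng may be passed in so that
    the result is repeatable in tests.
    """
    if rng is None:
        rng = random.Random()
    shuffled = list(addresses)
    rng.shuffle(shuffled)
    return shuffled


addresses = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
print(order_host_addresses(addresses))
```

Every address is still tried; only the order in which they are tried varies
from one delivery attempt to the next.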

2.8 Handling "MX points to self" [5]
------------------------------------

[MX points to self must be treated as an error.]

The RFC doesn't allow for the possibility of special-purpose routing in the
case when the lowest numbered MX record points to the local host. The default
Exim configuration is compliant, but it is possible to configure Exim to behave
differently, and there are several situations where this can be useful.


2.9 Source routing [6.1]
------------------------

[Source routes should be stripped.]

The new RFC has moved forward in deprecating source-routed email addresses.
Exim does not strip them down by default, but can be made to do so by setting
collapse_source_routes. However, even when it is not stripping them down, it
does not add host routing to reverse-paths when processing a source-routed
forward-path.

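What "stripping down" means can be shown with the example address from 1.5.
The Python sketch below is a simplification under stated assumptions: the
function name is made up, the address is taken without its enclosing <>, and
the route is assumed to contain no colons of its own (so domain literals in the
route are not catered for).

```python
def collapse_source_route(address: str) -> str:
    """Reduce a source-routed address to its final destination.

    "@a,@b:c@d" becomes "c@d". Addresses without a route are returned
    unchanged.
    """
    if address.startswith("@"):
        route, sep, rest = address.partition(":")
        if sep:
            return rest
    return address
```

So collapse_source_route("@a,@b:c@d") gives "c@d", and an ordinary address such
as "c@d" comes back as it went in.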

2.10 Loop detection [6.2]
-------------------------

[Loop count for Received: headers should be at least 100.]

Exim's default setting of the received_headers_max option is 30. Most messages
these days seem to accumulate fewer than half a dozen Received: headers, and
even a couple of forwardings don't bring this anywhere near 30.

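The check itself is a simple count. The Python sketch below illustrates it with
the default of 30 quoted above. The function name is invented, and whether a
loop is declared when the count reaches the limit or only when it exceeds it is
an assumption made for the example.

```python
RECEIVED_HEADERS_MAX = 30  # the default value quoted above


def too_many_hops(header_lines, limit=RECEIVED_HEADERS_MAX):
    """True if the message carries more Received: headers than the limit.

    Continuation lines begin with white space, so they are not counted.
    """
    count = sum(1 for line in header_lines
                if line.lower().startswith("received:"))
    return count > limit
```

A message with half a dozen Received: headers is far below the threshold, which
is the point made in the paragraph above.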

2.11 Addition of missing headers [6.3]
--------------------------------------

[Missing headers may be added, and domains qualified, only if client is
identified.]

Exim always adds Message-Id: and Date: headers if these are missing, whatever
the source of the message, and likewise when it expands non-fully-qualified
domains, it does so independently of the message's source.


2.12 Syntax of MAIL and RCPT commands [4.1.1.2, 4.1.1.3]
--------------------------------------------------------

Exim is more relaxed than the RFC requires:

(1) Trailing white space is ignored.

(2) It permits white space after the "FROM" and "TO" keywords.

(3) It does not insist on the address being enclosed in <> characters. In fact,
    it recognizes addresses in RFC 822 format here, except that domain
    components are restricted to containing only letters, digits, and hyphens.

(4) Local parts are permitted to contain null components, that is, may start or
    end with an unquoted full stop (period) or contain two consecutive
    unquoted full stops.

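Points (1) to (3) in the list above can be illustrated for the MAIL command
with a short Python sketch. It is not Exim's parser: the names are made up,
ESMTP parameters after the address are not handled, and white space is accepted
on either side of the colon because the text does not say exactly where it is
tolerated. The address itself is returned without further validation.

```python
import re

# White space is tolerated around the colon (an assumption, see above).
MAIL_RE = re.compile(r"MAIL\s+FROM\s*:\s*(.*)", re.IGNORECASE)


def parse_mail_from(line: str):
    """Extract the address from a MAIL command, or return None."""
    match = MAIL_RE.fullmatch(line.rstrip())  # (1) trailing space ignored
    if match is None:
        return None
    address = match.group(1)
    # (3) the enclosing <> characters are optional
    if address.startswith("<") and address.endswith(">"):
        address = address[1:-1]
    return address
```

Under these assumptions "MAIL FROM:<a@b.example>", "mail from: a@b.example  "
and "MAIL FROM : <a@b.example>" all yield the same address.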

2.13 Non-fully-qualified domains [2.3.5]
----------------------------------------

[All domains must be fully qualified.]

A domain that is not fully qualified has some of its trailing components
missing, and is normally a local alias of some sort, for example, just a
single-component host name.

Exim can be configured to "widen" non-fully-qualified domains, either by using
the facilities of the DNS resolver, or by an explicit list of widening strings.
When this is done, it applies to addresses received by SMTP from other hosts,
as well as to locally-originated addresses. Address re-writing could also be
used for this purpose.


2.14 Unqualified addresses [4.1.2]
----------------------------------

[Addresses in SMTP commands must include domains.]

An unqualified address consists of a local part without a domain. Do not
confuse "qualified address" and "qualified domain". A qualified address may
include a non-fully-qualified domain.

There is one exception to the RFC rule: it is required that the unqualified
address "<postmaster>" always be accepted. Apart from this, Exim rejects
domainless addresses in SMTP commands by default, but it can be configured with
a list of hosts and/or networks that are permitted to send addresses without
domains in SMTP commands. Any such address that is accepted (including
<postmaster>) is qualified by adding the value of the qualify_domain option.

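The decision described in the last paragraph might be sketched in Python as
follows. The function and the sender_permitted parameter are hypothetical; the
latter stands for "the sending host is on the configured list", however that
is determined. The qualify_domain argument corresponds to the option of the
same name, and treating "postmaster" case-insensitively is an assumption.

```python
def qualify(address: str, qualify_domain: str,
            sender_permitted: bool = False):
    """Return the address to use, or None if it must be rejected.

    Qualified addresses pass through. An unqualified address is accepted
    only if it is "postmaster" or the sending host is permitted to send
    such addresses; it is then qualified with qualify_domain.
    """
    if "@" in address:
        return address
    if address.lower() == "postmaster" or sender_permitted:
        return address + "@" + qualify_domain
    return None
```

For instance, with qualify_domain set to "example.com", "postmaster" becomes
"postmaster@example.com" from any host, while "fred" is rejected unless the
host is permitted.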

2.15 VRFY and EXPN [3.5.1, 3.5.2, 3.5.3, 7.3]
---------------------------------------------

[VRFY and EXPN should be supported.]

Exim does not support VRFY and EXPN by default, but a list of hosts and
networks for which they are permitted can be given.


2.16 Checking of EHLO/HELO commands [4.1.4]
-------------------------------------------

[Client must send EHLO. Server must not refuse message if EHLO/HELO check
fails.]

Exim, as a client, always sends EHLO or HELO. As a server, it does not insist
on there having been a valid EHLO or HELO command before the start of a message
transaction. Any EHLO or HELO command that is received is rejected only if it
contains a syntax error. That is, it is never rejected on the basis of any
validation checking that may be performed on the data it contains.

However, Exim can be configured to insist that (a) there is a valid EHLO/HELO
command before any message transaction and (b) the domain in that command
matches the domain obtained by looking up the IP address of the sending host.
It is possible to specify exception lists of hosts and/or networks for which
this check does not apply.


2.17 Format of delivery error messages [3.7]
--------------------------------------------

[Standard report formats should be used if possible.]

Exim's delivery failure reports are in MIME format, and might conform to
RFC 1894, but this has not been verified.


## End ##