| 1 | Conformance with RFCs |
| 2 | --------------------- |
| 3 | |
| 4 | Exim is written to follow the rules laid down in the RFCs. However, there are |
| 5 | some circumstances where it either extends what is specified, or chooses not to |
| 6 | follow them strictly, for various reasons. Sometimes variations are controlled |
| 7 | by an option, which may default on or off. This document lists the variations |
| 8 | from the latest email RFCs, and discusses their background and implications. |
| 9 | |
| 10 | Last Updated: 25 January 1999 |
| 11 | |
| 12 | |
| 13 | 1. RFC 822 |
| 14 | ---------- |
| 15 | |
| 16 | The original specification of the format of Internet mail messages is RFC 822, |
| 17 | later clarified and modified by RFC 1123. At the time of writing (January 1999) |
| 18 | a new RFC (currently known as draft-ietf-drums-msg-fmt-07) which updates and |
| 19 | consolidates all the material related to the message format is at a late stage |
| 20 | of drafting, and is expected to become an Internet Standard in due course. |
| 21 | |
| 22 | The following is (I hope) a complete list of major variations from the draft |
| 23 | RFC. References in square brackets are to the -07 draft. |
| 24 | |
| 25 | |
| 26 | 1.1 Line termination [2.1, 2.3] |
| 27 | ------------------------------- |
| 28 | |
| 29 | [Lines are terminated by CRLF; isolated CR and LF are not permitted.] |
| 30 | |
| 31 | The CRLF requirement has to be interpreted carefully, because the RFC also says |
| 32 | that it does not cover the internal format "used by sites". Exim keeps messages |
| 33 | on its spool in Unix format, using only LF as the line terminator, and also |
| 34 | does local deliveries using only LF. I believe this is compliant with the RFC, |
| 35 | as these are both "internal formats". |
| 36 | |
| 37 | Messages sent out by SMTP have CRLF line terminators. However, isolated CR |
| 38 | characters are treated as any other data characters, because Exim is eight-bit |
| 39 | clean (see 1.2 below). |
| 40 | |
| 41 | See 2.1 below for a discussion of line terminators in incoming messages. |
| 42 | |
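The outgoing conversion described in 1.1 can be pictured with a small sketch.
This is Python written for illustration only, not Exim's own C code, and the
function name is invented here. It assumes the spooled text uses LF alone as
the terminator; SMTP dot-stuffing is deliberately left out.

```python
def spool_to_smtp(body: bytes) -> bytes:
    """Turn LF-terminated spool text into CRLF-terminated SMTP text."""
    # Every LF becomes CRLF. An isolated CR is ordinary data and is
    # copied through unchanged.
    return body.replace(b"\n", b"\r\n")
```

Because CR is treated as data, a CR that happens to precede an LF on the spool
is copied through as well, so the sketch makes no attempt to "repair" it.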
| 43 | |
| 44 | 1.2 Eight-bit characters [2.1] |
| 45 | ------------------------------ |
| 46 | |
| 47 | [Messages consist of 7-bit characters.] |
| 48 | |
| 49 | Exim is eight-bit clean. It does not do any processing of the characters in the |
| 50 | body of a message. |
| 51 | |
| 52 | |
| 53 | 1.3 Maximum line length [2.1, 2.3] |
| 54 | ---------------------------------- |
| 55 | |
| 56 | [The maximum length of a line is 998 characters.] |
| 57 | |
| 58 | Exim does not enforce any limit on line length. |
| 59 | |
| 60 | |
| 61 | 1.4 The "phrase" part of an address [3.4] |
| 62 | ----------------------------------------- |
| 63 | |
| 64 | [The phrase is a sequence of "words"; a word is an "atom" or a quoted string.] |
| 65 | |
| 66 | The characters that can be used in an "atom" do not include the full stop |
| 67 | (dot, period). Thus a header line such as |
| 68 | |
| 69 | To: John Q. Public <jqp@anywhere.org> |
| 70 | |
| 71 | is syntactically invalid under a strict interpretation of the RFC because the |
| 72 | dot in the phrase part is not quoted. However, many MTAs do not enforce this |
| 73 | restriction, so Exim was changed to be relaxed about it as well. In fact, the |
| 74 | draft RFC is moving towards allowing this. Section [4.1], which defines the |
| 75 | "obsolete" syntax that programs must accept (but not generate), says this: |
| 76 | |
| 77 | The period character is added to obs-phrase. |
| 78 | |
| 79 | Note: The period character in obs-phrase is not a form that was allowed |
| 80 | in earlier versions of this or any other standard. Period (nor any other |
| 81 | character from specials) was not allowed in phrase because it introduced |
| 82 | a parsing difficulty distinguishing between phrases and portions of an |
| 83 | addr-spec (see section 4.4). It appears here because the period |
| 84 | character is currently used in many messages in the display-name portion |
| 85 | of addresses, especially for initials in names, and therefore must be |
| 86 | interpreted properly. In the future, period may appear in the regular |
| 87 | syntax of phrase. |
| 88 | |
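The difference between the strict and the relaxed reading of a phrase (1.4) can
be sketched as follows. This is an illustrative Python fragment, not Exim's
parser: the function names and the allow_dot parameter are invented for the
example, only 7-bit printable characters are considered, and quoted strings
are not handled at all.

```python
# The RFC 822 "specials", none of which may appear in an atom.
SPECIALS = set('()<>@,;:\\".[]')

def is_atom(word: str, allow_dot: bool = False) -> bool:
    """True if word is a non-empty run of atom characters."""
    banned = SPECIALS - {"."} if allow_dot else SPECIALS
    # Codes 33..126 are the printable ASCII characters other than space.
    return bool(word) and all(
        33 <= ord(ch) <= 126 and ch not in banned for ch in word
    )

def phrase_ok(phrase: str, allow_dot: bool = False) -> bool:
    """Check a phrase that consists only of atoms."""
    words = phrase.split()
    return bool(words) and all(is_atom(w, allow_dot) for w in words)
```

With these definitions, phrase_ok("John Q. Public") is false under the strict
rule, and becomes true when allow_dot is set.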
| 89 | |
| 90 | 1.5 Source routed addresses [4.4] |
| 91 | --------------------------------- |
| 92 | |
| 93 | [Source routed addresses are always enclosed in <>.] |
| 94 | |
| 95 | Source routed addresses are declared obsolete in the draft RFC, but MTAs are |
| 96 | still required to handle them. Strictly, a source-routed address must be |
| 97 | enclosed in <> characters, so a header such as |
| 98 | |
| 99 | From: @a,@b:c@d |
| 100 | |
| 101 | is syntactically invalid. Exim does not enforce this restriction. |
| 102 | |
| 103 | |
| 104 | 1.6 Local parts [3.4.1] |
| 105 | ----------------------- |
| 106 | |
| 107 | [Dots in unquoted local parts may not be consecutive or at either end.] |
| 108 | |
| 109 | Exim allows unquoted local parts to begin or end with a dot (period, full |
| 110 | stop), and it also permits two consecutive dots in a local part. |
| 111 | |
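The rule about dots in local parts (1.6) amounts to a test for empty
components. The following Python sketch is for illustration only; the names
are invented, and it looks at dot placement alone, not at which other
characters are legal in a local part.

```python
def has_null_component(local_part: str) -> bool:
    """True if a dot is at either end or two dots are adjacent."""
    return "" in local_part.split(".")

def unquoted_local_part_ok(local_part: str, strict: bool = False) -> bool:
    """Apply the dot rule strictly (per the RFC) or in the relaxed way."""
    if not local_part:
        return False
    # The relaxed check tolerates null components; the strict one does not.
    return not (strict and has_null_component(local_part))
```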
| 112 | |
| 113 | |
| 114 | 2. RFC 821 |
| 115 | ---------- |
| 116 | |
| 117 | The original specification of SMTP is RFC 821, later clarified and modified by |
| 118 | RFC 1123. Domain name system requirements and their implications for mail are |
| 119 | covered in RFCs 1035 and 974. A scheme for extending the SMTP protocol is |
| 120 | described in RFC 1869, and there are subsequent RFCs specifying particular |
| 121 | extensions. |
| 122 | |
| 123 | At the time of writing (January 1999) a new RFC (currently known as |
| 124 | draft-ietf-drums-smtpupd-09) which updates and consolidates all the material |
| 125 | connected with SMTP message transmission is at a late stage of drafting, and is |
| 126 | expected to become an Internet Standard in due course. |
| 127 | |
| 128 | The new draft is written using the terms MUST, SHOULD, and MAY, which, when |
| 129 | written in capital letters, have precise meanings. To quote from the draft: |
| 130 | |
| 131 | "MUST" or "MUST NOT" identify absolute requirements for conformance to |
| 132 | this specification. Implementations that do not conform to them lie |
| 133 | outside the scope of this specification and often will not |
| 134 | interoperate properly with SMTP implementations that do conform. |
| 135 | Implementations that are fully conforming also adhere to all "SHOULD" |
| 136 | and "SHOULD NOT" requirements. Implementations that adhere to all |
| 137 | "MUST" ("MUST NOT") but not to all of these are considered to be |
| 138 | partially conforming. Such implementations may interoperate properly |
| 139 | with fully conforming ones and with each other, but this will |
| 140 | typically be the case only if great care is taken. Consequently, an |
| 141 | implementation should violate "SHOULD" ("SHOULD NOT") requirements |
| 142 | only under exceptional and well-understood circumstances. |
| 143 | |
| 144 | The implementation of Exim is intended to conform to the spirit of this |
| 145 | paragraph. The following is (I hope) a complete list of major variations |
| 146 | from the draft RFC. In addition to the items listed here, there are other minor |
| 147 | extensions such as the tolerance of white space in places where it is not |
| 148 | strictly permitted by the RFC. References in square brackets are to the -09 |
| 149 | draft sections, and brief summaries of the RFC requirement are also given in |
| 150 | square brackets. |
| 151 | |
| 152 | |
| 153 | 2.1 Line termination [2.3.7, 4.1.1.4] |
| 154 | ------------------------------------- |
| 155 | |
| 156 | [SMTP lines are terminated by CRLF.] |
| 157 | |
| 158 | Exim recognizes LF without CR as a line terminator in all forms of input. For |
| 159 | SMTP input, any preceding CR is discarded. An early version of Exim followed |
| 160 | the RFC strictly, and did not recognize LF without CR in SMTP input. However, |
| 161 | some sites on the net send out messages with just LF terminators, despite the |
| 162 | warnings in the RFCs, and other MTAs handle this, so Exim was changed. There is |
| 163 | an undocumented compile-time macro called STRICT_CRLF which can be set to |
| 164 | restore the strict behaviour. |
| 165 | |
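The two ways of recognizing line ends in SMTP input (2.1) can be sketched like
this. It is an illustration in Python rather than Exim's C code; the function
name is invented, the strict_crlf parameter merely stands in for the
compile-time macro, and the sketch assumes that only a single CR immediately
before the LF is discarded.

```python
def split_smtp_lines(data: bytes, strict_crlf: bool = False) -> list[bytes]:
    """Split raw SMTP input into lines, dropping the terminators."""
    terminator = b"\r\n" if strict_crlf else b"\n"
    pieces = data.split(terminator)
    # Everything before the last terminator is a complete line; what is
    # left over is an unterminated tail (possibly empty).
    complete, tail = pieces[:-1], pieces[-1]
    if not strict_crlf:
        # Relaxed mode: a CR immediately before the LF is discarded.
        complete = [p[:-1] if p.endswith(b"\r") else p for p in complete]
    return complete + ([tail] if tail else [])
```

In relaxed mode a mixture of CRLF and bare LF terminators yields the same lines
as consistent CRLF would; in strict mode a bare LF stays inside the line as
data.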
| 166 | |
| 167 | 2.2 Eight-bit characters [2.4.1] |
| 168 | -------------------------------- |
| 169 | |
| 170 | [SMTP transmits only 7-bit characters.] |
| 171 | |
| 172 | Exim is eight-bit clean, and makes no attempt to modify the data in a message |
| 173 | in any way. In particular, for messages containing characters with the top bit |
| 174 | set, it neither tries to negotiate 8-bit transmission, nor converts such |
| 175 | characters into an encoded form. In other words, it adopts the "just send 8" |
| 176 | strategy. It can be configured to send out 8BITMIME in its response to EHLO |
| 177 | (which it does not do by default), and it recognizes the 8BITMIME keyword on |
| 178 | incoming messages, but neither of these affects its handling of message data. |
| 179 | "Just send 8" is the strategy of a number of MTAs; it is argued that it |
| 180 | achieves what the user wants more often than other strategies. |
| 181 | |
| 182 | |
| 183 | 2.3 Closing the connection [4.1.1.10] |
| 184 | ------------------------------------- |
| 185 | |
| 186 | [Client must wait for response to QUIT before closing the connection.] |
| 187 | |
| 188 | Exim closes the connection immediately after sending QUIT, without waiting for |
| 189 | the reply. There was a lot of discussion about this on one of the mailing |
| 190 | lists. The conclusion was that this behaviour is fine on Unix systems, which |
| 191 | have TCP/IP implementations that close down the underlying channel tidily even |
| 192 | when the associated process has terminated. Indeed, not waiting may be |
| 193 | beneficial, as it moves the TIME_WAIT state (waiting to ensure there's no more |
| 194 | data in transit) from the server to the client system. On some other operating |
| 195 | systems (I understand) it is a disaster to terminate the sending process |
| 196 | without waiting for the QUIT response, because all the data about the |
| 197 | connection lives in the client's process space, and is therefore thrown away |
| 198 | before the response arrives. The subsequent arrival of the response then causes |
| 199 | bad behaviour. |
| 200 | |
| 201 | |
| 202 | 2.4 IPv6 address literals [4.1.2] |
| 203 | --------------------------------- |
| 204 | |
| 205 | [IPv6 address literals are introduced by "IPv6".] |
| 206 | |
| 207 | Exim recognizes IPv6 literals as just the colon-separated hexadecimal form of |
| 208 | an IPv6 address, for example 1080:0:0:0:8:800:200C:417A, without the need for a |
| 209 | prefix. At present, it does not even recognize the prefix. When IPv6 becomes |
| 210 | more widespread, Exim will follow whatever the common usage is. |
| 211 | |
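As an illustration of 2.4, the following Python sketch accepts the bare
colon-separated form and, only if asked, a leading prefix. It is not Exim's
code. The function name and the accept_prefix parameter are invented, and the
prefix is assumed to be the string "IPv6:" with a trailing colon, which the
summary above does not spell out. Python's ipaddress module also accepts
compressed forms such as "::1", which may be more than Exim itself accepts.

```python
import ipaddress

def parse_ipv6_literal(text: str, accept_prefix: bool = False):
    """Return an IPv6Address for a literal, or None if it is not one."""
    # Hypothetical handling of the draft's prefix; off by default, to
    # mirror the behaviour described above.
    if accept_prefix and text[:5].lower() == "ipv6:":
        text = text[5:]
    try:
        return ipaddress.IPv6Address(text)
    except ValueError:
        return None
```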
| 212 | |
| 213 | 2.5 Underscores in domain names [4.1.2] |
| 214 | --------------------------------------- |
| 215 | |
| 216 | [Underscores are not legal in domain names.] |
| 217 | |
| 218 | RFC 822 allows all characters except specials, space, and controls in domain |
| 219 | names, but the SMTP RFCs are stricter, allowing only letters, digits, and |
| 220 | hyphen. Exim is compliant when checking incoming addresses in SMTP commands, |
| 221 | but it is more relaxed by default when checking domain names that are supplied |
| 222 | by EHLO or HELO commands, because many client workstations get set up with |
| 223 | underscores in their names. There is an option that can be set to cause Exim to |
| 224 | refuse underscores. (There are also options to specify certain hosts from which |
| 225 | it will accept any old junk after EHLO or HELO. Such is the woeful state of |
| 226 | some SMTP clients.) |
| 227 | |
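The two levels of checking in 2.5 can be sketched as below. This is a Python
illustration with invented names, not Exim's implementation, and it checks
only the character set of each component; it does not enforce rules about
where a hyphen may appear or about component length.

```python
import re

# Letters, digits and hyphen only, as the SMTP RFCs require.
STRICT_LABEL = re.compile(r"[A-Za-z0-9-]+\Z")
# The same, but tolerating underscores as well.
RELAXED_LABEL = re.compile(r"[A-Za-z0-9_-]+\Z")

def domain_ok(domain: str, allow_underscore: bool = False) -> bool:
    """Check every dot-separated component of a domain name."""
    label = RELAXED_LABEL if allow_underscore else STRICT_LABEL
    return all(label.match(part) for part in domain.split("."))
```

In terms of the text above, the strict form corresponds to the check on
addresses in SMTP commands, and the relaxed form to the default check on the
argument of EHLO or HELO.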
| 228 | |
| 229 | 2.6 Removal of return-path headers [4.4] |
| 230 | ---------------------------------------- |
| 231 | |
| 232 | [Relaying MTAs should not remove return-path.] |
| 233 | |
| 234 | Exim removes Return-Path: headers from all messages if return_path_remove is |
| 235 | set (the default). It does not attempt to determine whether it is acting as a |
| 236 | relay. Indeed, it might be both a relay and a final destination MTA for the |
| 237 | same message. |
| 238 | |
| 239 | |
| 240 | 2.7 Randomizing the order of addresses of multihomed hosts [5] |
| 241 | -------------------------------------------------------------- |
| 242 | |
| 243 | [Multihomed host addresses should not be randomized.] |
| 244 | |
| 245 | Exim does randomize a list of several addresses for a single host, because |
| 246 | caching in resolvers will defeat the round-robinning that many nameservers |
| 247 | use. (Note: this is not the same as randomizing equal-valued MX records. That |
| 248 | is required by the RFC.) |
| 249 | |
| 250 | |
| 251 | 2.8 Handling "MX points to self" [5] |
| 252 | ------------------------------------ |
| 253 | |
| 254 | [MX points to self must be treated as an error.] |
| 255 | |
| 256 | The RFC doesn't allow for the possibility of special-purpose routing in the |
| 257 | case when the lowest numbered MX record points to the local host. The default |
| 258 | Exim configuration is compliant, but it is possible to configure Exim to behave |
| 259 | differently, and there are several situations where this can be useful. |
| 260 | |
| 261 | |
| 262 | 2.9 Source routing [6.1] |
| 263 | ------------------------ |
| 264 | |
| 265 | [Source routes should be stripped.] |
| 266 | |
| 267 | The new RFC has moved forward in deprecating source-routed email addresses. |
| 268 | Exim does not strip them down by default, but can be made to do so by setting |
| 269 | collapse_source_routes. However, even when it is not stripping them down, it |
| 270 | does not add host routing to reverse-paths when processing a source-routed |
| 271 | forward-path. |
| 272 | |
| 273 | |
| 274 | 2.10 Loop detection [6.2] |
| 275 | ------------------------- |
| 276 | |
| 277 | [Loop count for Received: headers should be at least 100.] |
| 278 | |
| 279 | Exim's default setting of the received_headers_max option is 30. Most messages |
| 280 | these days seem to accumulate fewer than half a dozen Received: headers, and |
| 281 | even a couple of forwardings don't bring this anywhere near 30. |
| 282 | |
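The loop check in 2.10 is essentially a count of Received: headers. The Python
sketch below is illustrative only; the function name is invented, and it
assumes that a message is treated as looping when the count is strictly
greater than the limit, which the text above does not state.

```python
def too_many_hops(header_lines, received_headers_max=30):
    """True if the Received: count exceeds the configured limit."""
    # Continuation lines begin with white space, so they are not counted.
    count = sum(
        1 for line in header_lines if line.lower().startswith("received:")
    )
    return count > received_headers_max
```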
| 283 | |
| 284 | 2.11 Addition of missing headers [6.3] |
| 285 | -------------------------------------- |
| 286 | |
| 287 | [Missing headers may be added, and domains qualified, only if client is |
| 288 | identified.] |
| 289 | |
| 290 | Exim always adds Message-Id: and Date: headers if these are missing, whatever |
| 291 | the source of the message, and likewise when it expands non-fully-qualified |
| 292 | domains, it does so independently of the message's source. |
| 293 | |
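The header fix-up in 2.11 can be sketched in Python as follows. This only
illustrates the idea and is not how Exim builds its headers: the function
name, the representation of headers as (name, value) pairs, and the default
domain are all assumptions made for the example.

```python
from email.utils import formatdate, make_msgid

def add_missing_headers(headers, domain="example.com"):
    """Return a copy of headers with Message-Id: and Date: present."""
    present = {name.lower() for name, _ in headers}
    result = list(headers)
    # Nothing here depends on where the message came from.
    if "message-id" not in present:
        result.append(("Message-Id", make_msgid(domain=domain)))
    if "date" not in present:
        result.append(("Date", formatdate(localtime=True)))
    return result
```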
| 294 | |
| 295 | 2.12 Syntax of MAIL and RCPT commands [4.1.1.2, 4.1.1.3] |
| 296 | -------------------------------------------------------- |
| 297 | |
| 298 | Exim is more relaxed than the RFC requires: |
| 299 | |
| 300 | (1) Trailing white space is ignored. |
| 301 | |
| 302 | (2) It permits white space after the "FROM" and "TO" keywords. |
| 303 | |
| 304 | (3) It does not insist on the address being enclosed in <> characters. In fact, |
| 305 | it recognizes addresses in RFC 822 format here, except that domain |
| 306 | components are restricted to containing only letters, digits, and hyphens. |
| 307 | |
| 308 | (4) Local parts are permitted to contain null components, that is, may start or |
| 309 | end with an unquoted full stop (period) or contain two consecutive |
| 310 | unquoted full stops. |
| 311 | |
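Relaxations (1) to (3) above can be illustrated with a small Python sketch. It
is not Exim's parser. The names are invented, white space is tolerated on
either side of the colon (one possible reading of item (2)), ESMTP parameters
such as SIZE= are not handled, and the address itself is not validated.

```python
import re

ENVELOPE_COMMAND = re.compile(
    r"(MAIL FROM|RCPT TO)\s*:\s*(.*?)\s*\Z", re.IGNORECASE
)

def parse_envelope_command(line: str):
    """Return (verb, address) for a MAIL or RCPT command, else None."""
    m = ENVELOPE_COMMAND.match(line)
    if m is None:
        return None
    verb = m.group(1)[:4].upper()  # "MAIL" or "RCPT"
    address = m.group(2)
    # The angle brackets are accepted but not insisted upon.
    if address.startswith("<") and address.endswith(">"):
        address = address[1:-1]
    return verb, address
```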
| 312 | |
| 313 | 2.13 Non-fully-qualified domains [2.3.5] |
| 314 | ---------------------------------------- |
| 315 | |
| 316 | [All domains must be fully qualified.] |
| 317 | |
| 318 | A domain that is not fully qualified has some of its trailing components |
| 319 | missing, and is normally a local alias of some sort, for example, just a |
| 320 | single-component host name. |
| 321 | |
| 322 | Exim can be configured to "widen" non-fully-qualified domains, either by using |
| 323 | the facilities of the DNS resolver, or by an explicit list of widening strings. |
| 324 | When this is done, it applies to addresses received by SMTP from other hosts, |
| 325 | as well as to locally-originated addresses. Address re-writing could also be |
| 326 | used for this purpose. |
| 327 | |
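Widening by an explicit list of strings (2.13) might be pictured as below. This
Python sketch is a guess at the general shape and not a description of Exim's
algorithm. The function name is invented, the exists argument is a
hypothetical stand-in for a DNS lookup, and trying the name as given before
any widening is an assumption.

```python
def widen(domain, widening_strings, exists):
    """Return the first form of domain that exists, or None."""
    if exists(domain):
        return domain
    for suffix in widening_strings:
        # Append each widening string in turn until one gives a hit.
        candidate = domain + "." + suffix.lstrip(".")
        if exists(candidate):
            return candidate
    return None
```

For testing, exists can simply be the membership test of a set of known names.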
| 328 | |
| 329 | 2.14 Unqualified addresses [4.1.2] |
| 330 | ---------------------------------- |
| 331 | |
| 332 | [Addresses in SMTP commands must include domains.] |
| 333 | |
| 334 | An unqualified address consists of a local part without a domain. Do not |
| 335 | confuse "qualified address" and "qualified domain". A qualified address may |
| 336 | include a non-fully-qualified domain. |
| 337 | |
| 338 | There is one exception to the RFC rule: it is required that the unqualified |
| 339 | address "<postmaster>" always be accepted. Apart from this, Exim rejects |
| 340 | domainless addresses in SMTP commands by default, but it can be configured with |
| 341 | a list of hosts and/or networks that are permitted to send addresses without |
| 342 | domains in SMTP commands. Any such address that is accepted (including |
| 343 | <postmaster>) is qualified by adding the value of the qualify_domain option. |
| 344 | |
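The treatment of unqualified addresses in 2.14 can be sketched as follows. The
Python below is illustrative only: the function and parameter names are
invented (apart from qualify_domain, which echoes the option named above), and
the test for a domain is simply the presence of "@", which ignores quoted
local parts.

```python
def qualify(address, qualify_domain, sender_may_omit_domain=False):
    """Return the qualified address, or None if it must be rejected."""
    if "@" in address:
        return address  # already has a domain
    # <postmaster> is always accepted; other domainless addresses only
    # from hosts that have been permitted to send them.
    if address.lower() == "postmaster" or sender_may_omit_domain:
        return f"{address}@{qualify_domain}"
    return None
```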
| 345 | |
| 346 | 2.15 VRFY and EXPN [3.5.1, 3.5.2, 3.5.3, 7.3] |
| 347 | --------------------------------------------- |
| 348 | |
| 349 | [VRFY and EXPN should be supported.] |
| 350 | |
| 351 | Exim does not support VRFY and EXPN by default, but a list of hosts and |
| 352 | networks for which they are permitted can be given. |
| 353 | |
| 354 | |
| 355 | 2.16 Checking of EHLO/HELO commands [4.1.4] |
| 356 | ------------------------------------------- |
| 357 | |
| 358 | [Client must send EHLO. Server must not refuse message if EHLO/HELO check |
| 359 | fails.] |
| 360 | |
| 361 | Exim, as a client, always sends EHLO or HELO (see 2.3 above). As a server, it |
| 362 | does not insist on there having been a valid EHLO or HELO command before the |
| 363 | start of a message transaction. Any EHLO or HELO command that is received is |
| 364 | rejected only if it contains a syntax error. That is, it is never rejected on |
| 365 | the basis of any validation checking that may be performed on the data it |
| 366 | contains. |
| 367 | |
| 368 | However, Exim can be configured to insist that (a) there is a valid EHLO/HELO |
| 369 | command before any message transaction and (b) the domain in that command |
| 370 | matches the domain obtained by looking up the IP address of the sending host. |
| 371 | It is possible to specify exception lists of hosts and/or networks for which |
| 372 | this check does not apply. |
| 373 | |
| 374 | |
| 375 | 2.17 Format of delivery error messages [3.7] |
| 376 | -------------------------------------------- |
| 377 | |
| 378 | [Standard report formats should be used if possible.] |
| 379 | |
| 380 | Exim's delivery failure reports are in MIME format, and might be RFC 1894 |
| 381 | conformant, but this has not been verified. |
| 382 | |
| 383 | |
| 384 | ## End ## |