| 1 | The following information is from the maildir man page of qmail. |
| 2 | |
| 3 | INTRODUCTION |
| 4 | maildir is a structure for directories of incoming mail |
| 5 | messages. It solves the reliability problems that plague |
| 6 | mbox files and mh folders. |
| 7 | |
| 8 | RELIABILITY ISSUES |
| 9 | A machine may crash while it is delivering a message. For |
| 10 | both mbox files and mh folders this means that the message |
| 11 | will be silently truncated. Even worse: for mbox format, |
| 12 | if the message is truncated in the middle of a line, it |
| 13 | will be silently joined to the next message. The mail |
| 14 | transport agent will try again later to deliver the |
| 15 | message, but it is unacceptable that a corrupted message |
| 16 | should show up at all. In maildir, every message is |
| 17 | guaranteed complete upon delivery. |
| 18 | |
| 19 | A machine may have two programs simultaneously delivering |
| 20 | mail to the same user. The mbox and mh formats require |
| 21 | the programs to update a single central file. If the |
| 22 | programs do not use some locking mechanism, the central file |
| 23 | will be corrupted. There are several mbox and mh locking |
| 24 | mechanisms, none of which work portably and reliably. In |
| 25 | contrast, in maildir, no locks are ever necessary. |
| 26 | Different delivery processes never touch the same file. |
| 27 | |
| 28 | A user may try to delete messages from his mailbox at the |
| 29 | same moment that the machine delivers a new message. For |
| 30 | mbox and mh formats, the user's mail-reading program must |
| 31 | know what locking mechanism the mail-delivery programs |
| 32 | use. In contrast, in maildir, any delivered message can |
| 33 | be safely updated or deleted by a mail-reading program. |
| 34 | |
| 35 | Many sites use Sun's Network Failure System (NFS), |
| 36 | presumably because the operating system vendor does not offer |
| 37 | anything else. NFS exacerbates all of the above problems. |
| 38 | Some NFS implementations don't provide any reliable |
| 39 | locking mechanism. With mbox and mh formats, if two machines |
| 40 | deliver mail to the same user, or if a user reads mail |
| 41 | anywhere except the delivery machine, the user's mail is |
| 42 | at risk. maildir works without trouble over NFS. |
| 43 | |
| 44 | THE MAILDIR STRUCTURE |
| 45 | A directory in maildir format has three subdirectories, |
| 46 | all on the same filesystem: tmp, new, and cur. |
| 47 | |
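A minimal sketch of setting up this layout, assuming a POSIX system. The helper name make_maildir and the 0700 permission mode are illustrative assumptions, not part of the man page. The device-number comparison is one way to check the "same filesystem" requirement.

```python
import os

SUBDIRS = ("tmp", "new", "cur")

def make_maildir(path):
    """Create path with tmp, new, and cur subdirectories (hypothetical helper)."""
    for sub in SUBDIRS:
        # Mode 0o700 is an assumption; the man page does not specify permissions.
        os.makedirs(os.path.join(path, sub), mode=0o700, exist_ok=True)
    # All three subdirectories must live on one filesystem, because delivery
    # relies on link() between tmp and new.
    devices = {os.stat(os.path.join(path, sub)).st_dev for sub in SUBDIRS}
    if len(devices) != 1:
        raise OSError("tmp, new, and cur must be on the same filesystem")
    return path
```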
| 48 | Each file in new is a newly delivered mail message. The |
| 49 | modification time of the file is the delivery date of the |
| 50 | message. The message is delivered without an extra |
| 51 | UUCP-style From_ line, without any >From quoting, and without |
| 52 | an extra blank line at the end. The message is normally |
| 53 | in RFC 822 format, starting with a Return-Path line and a |
| 54 | Delivered-To line, but it could contain arbitrary binary |
| 55 | data. It might not even end with a newline. |
| 56 | |
| 57 | Files in cur are just like files in new. The big |
| 58 | difference is that files in cur are no longer new mail: they |
| 59 | have been seen by the user's mail-reading program. |
| 60 | |
| 61 | HOW A MESSAGE IS DELIVERED |
| 62 | The tmp directory is used to ensure reliable delivery, as |
| 63 | discussed here. |
| 64 | |
| 65 | A program delivers a mail message in six steps. First, it |
| 66 | chdir()s to the maildir directory. Second, it stat()s the |
| 67 | name tmp/time.pid.host, where time is the number of |
| 68 | seconds since the beginning of 1970 GMT, pid is the program's |
| 69 | process ID, and host is the host name. Third, if stat() |
| 70 | returned anything other than ENOENT, the program sleeps |
| 71 | for two seconds, updates time, and tries the stat() again, |
| 72 | a limited number of times. Fourth, the program creates |
| 73 | tmp/time.pid.host. Fifth, the program NFS-writes the |
| 74 | message to the file. Sixth, the program link()s the file to |
| 75 | new/time.pid.host. At that instant the message has been |
| 76 | successfully delivered. |
| 77 | |
| 78 | The delivery program is required to start a 24-hour timer |
| 79 | before creating tmp/time.pid.host, and to abort the |
| 80 | delivery if the timer expires. Upon error, timeout, or normal |
| 81 | completion, the delivery program may attempt to unlink() |
| 82 | tmp/time.pid.host. |
| 83 | |
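The six delivery steps and the optional unlink() can be sketched roughly as follows, assuming a POSIX system. The function name deliver, the retries default, the 0600 file mode, and the use of O_EXCL when creating the file are assumptions made for illustration. The 24-hour timer is left out to keep the sketch short, so this is not a complete delivery agent.

```python
import os
import socket
import time

def deliver(maildir, message, retries=5):
    """Deliver message (bytes) into maildir; returns the name used (sketch)."""
    os.chdir(maildir)                              # step 1 (changes the process cwd)
    for _ in range(retries):
        name = "%d.%d.%s" % (int(time.time()), os.getpid(), socket.gethostname())
        tmp_path = os.path.join("tmp", name)
        try:
            os.stat(tmp_path)                      # step 2
        except FileNotFoundError:
            break                                  # ENOENT: the name is unused
        except OSError:
            pass                                   # any other result: retry
        time.sleep(2)                              # step 3
    else:
        raise RuntimeError("no unused name found in tmp")

    # Step 4. O_EXCL and mode 0o600 are assumptions, not stated in the man page.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        try:
            view = memoryview(message)             # step 5: NFS-write
            while len(view) > 0:
                written = os.write(fd, view)       # may be a short write
                view = view[written:]
            os.fsync(fd)                           # raises OSError on failure
        finally:
            os.close(fd)                           # raises OSError on failure
        os.link(tmp_path, os.path.join("new", name))   # step 6: delivered
    finally:
        try:
            os.unlink(tmp_path)                    # optional cleanup
        except OSError:
            pass
    return name
```

A caller would pass the raw message bytes, for example deliver("/home/user/Maildir", raw_bytes), where the path is hypothetical.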
| 84 | NFS-writing means (1) as usual, checking the number of |
| 85 | bytes returned from each write() call; (2) calling fsync() |
| 86 | and checking its return value; (3) calling close() and |
| 87 | checking its return value. (Standard NFS implementations |
| 88 | handle fsync() incorrectly but make up for it by abusing |
| 89 | close().) |
| 90 | |
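As a rough illustration of the three NFS-writing checks, the sketch below uses a hypothetical helper named nfs_write that takes an already open file descriptor. In Python the os functions raise OSError instead of returning an error code, so "checking the return value" of fsync() and close() becomes letting that exception propagate.

```python
import os

def nfs_write(fd, data):
    """Write all of data to fd, then fsync and close it (hypothetical helper).

    Any failure surfaces as OSError, so the caller can abort the delivery.
    """
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)   # (1) check the byte count each time
            view = view[written:]          # retry with whatever was not written
        os.fsync(fd)                       # (2) raises OSError if fsync fails
    finally:
        os.close(fd)                       # (3) raises OSError if close fails
```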
| 91 | HOW A MESSAGE IS READ |
| 92 | A mail reader operates as follows. |
| 93 | |
| 94 | It looks through the new directory for new messages. Say |
| 95 | there is a new message, new/unique. The reader may freely |
| 96 | display the contents of new/unique, delete new/unique, or |
| 97 | rename new/unique as cur/unique:info. See |
| 98 | http://pobox.com/~djb/maildir.html for the meaning of |
| 99 | info. |
| 100 | |
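The rename from new to cur might look like the sketch below, assuming a POSIX filesystem where a colon is legal in filenames. The function name mark_seen is hypothetical, and the info string is treated as an opaque value supplied by the caller, since its meaning is defined at the URL above rather than here.

```python
import os

def mark_seen(maildir, unique, info):
    """Rename new/unique to cur/unique:info and return the new path (sketch)."""
    src = os.path.join(maildir, "new", unique)
    dst = os.path.join(maildir, "cur", unique + ":" + info)
    os.rename(src, dst)   # both directories are on the same filesystem
    return dst
```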
| 101 | The reader is also expected to look through the tmp |
| 102 | directory and to clean up any old files found there. A file in |
| 103 | tmp may be safely removed if it has not been accessed in |
| 104 | 36 hours. |
| 105 | |
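One possible shape for the tmp cleanup is sketched below. The function name clean_tmp and its now parameter are assumptions added for illustration. The 36-hour threshold comes from the text above, and "accessed" is read here as the file's access time (st_atime).

```python
import os
import time

MAX_AGE = 36 * 3600  # 36 hours, in seconds

def clean_tmp(maildir, now=None):
    """Remove files in tmp not accessed for 36 hours; return their names."""
    now = time.time() if now is None else now
    tmp = os.path.join(maildir, "tmp")
    removed = []
    for name in os.listdir(tmp):
        path = os.path.join(tmp, name)
        try:
            if now - os.stat(path).st_atime > MAX_AGE:
                os.unlink(path)
                removed.append(name)
        except FileNotFoundError:
            continue  # another process removed the file first
    return removed
```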
| 106 | It is a good idea for readers to skip all filenames in new |
| 107 | and cur starting with a dot. Other than this, readers |
| 108 | should not attempt to parse filenames. |
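A reader's directory scan that follows this advice could be as small as the sketch below. The function name list_messages is hypothetical. Apart from the leading-dot test, filenames are returned untouched and never parsed.

```python
import os

def list_messages(maildir, sub="new"):
    """Return filenames in maildir/sub, skipping names that start with a dot."""
    directory = os.path.join(maildir, sub)
    return sorted(n for n in os.listdir(directory) if not n.startswith("."))
```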
| 109 | ### |