| 1 | Maildir++ |
| 2 | |
| 3 | In this document: |
| 4 | * HOWTO.maildirquota |
| 5 | * Mission statement |
| 6 | * Definitions and goals |
| 7 | * Contents of maildirsize |
| 8 | * Calculating maildirsize |
| 9 | * Calculating the quota for a Maildir++ |
| 10 | * Delivering to a Maildir++ |
| 11 | * Reading from a Maildir++ |
| 12 | * Bugs |
| 13 | |
| 14 | HOWTO.maildirquota |
| 15 | |
| 16 | The remaining portion of this document is a technical description of |
| 17 | the maildir quota extension. This section is a brief overview of this |
| 18 | extension. |
| 19 | |
| 20 | What is a maildirquota? |
| 21 | |
| 22 | If you would like to have a quota on your maildir mailboxes, the best |
| 23 | solution is to always use filesystem-based quotas: per-user usage |
| 24 | quotas that are enforced by the operating system. |
| 25 | |
| 26 | This is the best solution when the default Maildir is located in each |
| 27 | account's home directory. This solution will NOT work if Maildirs are |
| 28 | stored elsewhere, or if you have a large virtual domain setup where a |
| 29 | single userid is used to hold many individual Maildirs, one for each |
| 30 | virtual user. |
| 31 | |
| 32 | This extension to the maildir format allows a "voluntary" maildir |
| 33 | quota implementation that does not rely on filesystem-based quotas. |
| 34 | |
| 35 | When maildirquota will not work. |
| 36 | |
| 37 | For this quota mechanism to work, all software that accesses a maildir |
| 38 | must observe this quota protocol. It follows that this quota mechanism |
| 39 | can be easily circumvented if users have direct (shell) access to the |
| 40 | filesystem containing the users' maildirs. |
| 41 | |
| 42 | Furthermore, this quota mechanism is not 100% effective. It is |
| 43 | possible to have a situation where someone may go over quota. This |
| 44 | quota implementation makes a deliberate trade-off. It is necessary to |
| 45 | use some form of locking in order to have completely bulletproof quota |
| 46 | enforcement, but maildir mail stores were explicitly designed to |
| 47 | avoid any kind of locking. This quota approach does not use locking, |
| 48 | and the trade-off is that sometimes it is possible for a few extra |
| 49 | messages to be delivered to the maildir before the door is |
| 50 | permanently shut. |
| 51 | |
| 52 | For best performance, all maildir clients should support this quota |
| 53 | extension; however, there is a wide degree of tolerance here. As long as |
| 54 | the mail delivery agent that puts new messages into a Maildir uses |
| 55 | this extension, the quota will be enforced without excessive |
| 56 | degradation. |
| 57 | |
| 58 | In the worst case scenario, quotas are automatically recalculated |
| 59 | every fifteen minutes. If a maildir goes over quota, and a mail client |
| 60 | that does not support this quota extension removes enough mail from |
| 61 | the maildir, the mail delivery agent will not be immediately informed |
| 62 | that the maildir is now under quota. However, eventually the correct |
| 63 | quota will be recalculated and mail delivery will resume. |
| 64 | |
| 65 | Mail user agents sometimes put messages into the maildir themselves. |
| 66 | Messages added to a maildir by a mail user agent that does not |
| 67 | understand the quota extension will not be immediately counted towards |
| 68 | the overall quota, and may not be counted for an extended period of |
| 69 | time. Additionally, if there are a lot of messages that have been |
| 70 | added to a maildir from these mail user agents, quota recalculation |
| 71 | may impose non-trivial load on the system, as the quota recalculator |
| 72 | will have to issue the stat system call for each message. |
| 73 | |
| 74 | How to implement the quota |
| 75 | |
| 76 | The best way to do that is to modify your mail server to implement the |
| 77 | protocol defined by this document. Not everyone, of course, has this |
| 78 | ability. Therefore, an alternate approach is available. |
| 79 | |
| 80 | This package creates a very short utility called "deliverquota". It |
| 81 | will NOT be installed anywhere by default, unless this maildir quota |
| 82 | implementation is a part of a larger package, in which case the parent |
| 83 | package may install this utility somewhere. If you obtained the |
| 84 | maildir package separately, you will need to compile it by running the |
| 85 | configure script, then by running make. |
| 86 | |
| 87 | deliverquota takes two arguments. deliverquota reads the message from |
| 88 | standard input, then delivers it to the maildir specified by the first |
| 89 | argument to deliverquota. The second argument specifies the actual |
| 90 | quota for this maildir, as defined elsewhere in this document. |
| 91 | deliverquota will deliver the message to the maildir, making a best |
| 92 | effort not to exceed the stated quota. If the maildir is over quota, |
| 93 | deliverquota terminates with exit code 77. Otherwise, it delivers the |
| 94 | message, updates the quota, and terminates with exit code 0. |
| 95 | |
| 96 | Therefore, proceed as follows: |
| 97 | * Copy deliverquota to some convenient location, say /usr/local/bin. |
| 98 | * Configure your mail server to use deliverquota. For example, if |
| 99 | you use Qmail and your maildirs are all located in $HOME/Maildir, |
| 100 | replace the './Maildir/' argument to qmail-start with the |
| 101 | following: |
| 102 | '| /usr/local/bin/deliverquota ./Maildir 1000000S' |
| 103 | |
| 104 | |
| 105 | |
| 106 | |
| 107 | This sets a one million byte limit on all Maildirs. As I |
| 108 | mentioned, this is meaningless if login access is available, |
| 109 | because the individual account owner can create his own |
| 110 | $HOME/.qmail file, and ignore deliverquota. Note that in this |
| 111 | case, you MUST use apostrophes on the qmail-start command line, in |
| 112 | order to quote this as one argument. |
| 113 | |
| 114 | If you would like to use different quotas for different users, you |
| 115 | will have to put together a separate process or a script that looks up |
| 116 | the appropriate quota for the recipient, and runs deliverquota |
| 117 | specifying the quota. If no login access to the mail server is |
| 118 | available, you can simply create a separate $HOME/.qmail for every |
| 119 | recipient. |
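
A minimal sketch of such a lookup script follows. The QUOTAS table, the default quota, and the function names are hypothetical; the deliverquota location is the one assumed in the example above. Only the two arguments and the exit codes described earlier are relied upon.

```python
import subprocess

# Hypothetical per-user table; a real setup might read a file or a database.
QUOTAS = {"alice": "10000000S,1000C", "bob": "5000000S"}
DEFAULT_QUOTA = "1000000S"
DELIVERQUOTA = "/usr/local/bin/deliverquota"  # install location assumed above


def build_command(user, maildir="./Maildir"):
    """Return the deliverquota argv: the maildir first, then the quota."""
    return [DELIVERQUOTA, maildir, QUOTAS.get(user, DEFAULT_QUOTA)]


def run_deliverquota(user, message_bytes):
    """Feed the message on standard input and return the exit code.

    Per the description above, 0 means delivered and 77 means over quota.
    """
    return subprocess.run(build_command(user), input=message_bytes).returncode
```

The caller would pass the exit code back to the mail server unchanged, so that an over-quota maildir is reported the same way as when deliverquota is invoked directly.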
| 120 | |
| 121 | That's pretty much it. If you handle a moderate amount of mail, I have |
| 122 | one more suggestion. For the first couple of weeks, run deliverquota |
| 123 | setting the second argument to an empty string. This disables quota |
| 124 | enforcement; however, it still activates certain optimizations that |
| 125 | permit very fast quota recalculation. Messages delivered by |
| 126 | deliverquota have their message size encoded in their filename; this |
| 127 | makes it possible to avoid stat-ing the message in the Maildir, when |
| 128 | recalculating the quota. Then, after most messages in your maildirs |
| 129 | have been delivered by deliverquota, activate the quotas. |
| 130 | |
| 131 | maildirquota-enhanced applications |
| 132 | |
| 133 | This is a list of applications that have been enhanced to support the |
| 134 | maildirquota extension: |
| 135 | * maildrop - mail delivery agent/mail filter. |
| 136 | * SqWebmail - webmail CGI binary. |
| 137 | |
| 138 | These applications fall into two classes: |
| 139 | * Mail delivery agents. These applications read some externally |
| 140 | defined table of mail recipients and their maildir quota. |
| 141 | * Mail clients. These applications read maildir quota information |
| 142 | that has been defined by the mail delivery agent. |
| 143 | |
| 144 | Mail clients generally do not need any additional setup in order to |
| 145 | use the maildirquota extension. They will automatically read and |
| 146 | implement any quota specification set by the mail delivery agent. |
| 147 | |
| 148 | On the other hand, mail delivery agents will require some kind of |
| 149 | configuration in order to activate the maildirquota extension for some |
| 150 | or all recipients. The instructions for doing that depend upon the |
| 151 | mail delivery agent. The documentation for the mail delivery agent |
| 152 | should be consulted for additional information. |
| 153 | _________________________________________________________________ |
| 154 | |
| 155 | Mission statement |
| 156 | |
| 157 | Maildir++ is a mail storage structure that's based on the Maildir |
| 158 | structure, first used in the Qmail mail server. Actually, Maildir++ is |
| 159 | just a minor extension to the standard Maildir structure. |
| 160 | |
| 161 | For more information, see http://www.qmail.org/man/man5/maildir.html. |
| 162 | I am not going to include the definition of a Maildir in this |
| 163 | document. Consider it included right here. This document only |
| 164 | describes the differences. |
| 165 | |
| 166 | Maildir++ adds a couple of things to a standard Maildir: folders and |
| 167 | quotas. |
| 168 | |
| 169 | Quotas enforce a maximum allowable size of a Maildir. In many |
| 170 | situations, using the quota mechanism of the underlying filesystem |
| 171 | won't work very well. If a filesystem quota mechanism is used, then |
| 172 | when a Maildir goes over quota, Qmail does not bounce additional mail, |
| 173 | but keeps it queued, changing one bad situation into another bad |
| 174 | situation. Not only do you have an account that's backed up, but now |
| 175 | your queue starts to back up too. |
| 176 | |
| 177 | Definitions and goals |
| 178 | |
| 179 | Maildir++ and Maildir shall be completely interchangeable. A Maildir++ |
| 180 | client will be able to use a standard Maildir, automatically |
| 181 | "upgrading" it in the process. A Maildir client will be able to use a |
| 182 | Maildir++ just like a regular Maildir. Of course, a plain Maildir |
| 183 | client won't be able to enforce a quota, and won't be able to access |
| 184 | messages stored in folders. |
| 185 | |
| 186 | Folders are created as subdirectories under the main Maildir. The name |
| 187 | of the subdirectory always starts with a period. For example, a folder |
| 188 | named "Important" will be a subdirectory called ".Important". You |
| 189 | can't have subdirectories that start with two periods. |
| 190 | |
| 191 | A Maildir++ client ignores anything in the main Maildir that starts |
| 192 | with a period, but is not a subdirectory. |
| 193 | |
| 194 | Each subdirectory is a fully-fledged Maildir of its own; that is, you |
| 195 | have .Important/tmp, .Important/new, and .Important/cur. Everything |
| 196 | that applies to the main Maildir applies equally well to the |
| 197 | subdirectory, including automatically cleaning up old files in tmp. A |
| 198 | Maildir++ enhancement is that a message can be moved between folders |
| 199 | and/or the main Maildir simply by moving/renaming the file (into the |
| 200 | cur subdirectory of the destination folder). Therefore, the entire |
| 201 | Maildir++ must reside on the same filesystem. |
| 202 | |
| 203 | Within each subdirectory there's an empty file, maildirfolder. Its |
| 204 | existence tells the mail delivery agent that this Maildir is really |
| 205 | a folder underneath a parent Maildir++. |
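
As an illustration, creating a folder could look like the sketch below. The function name and the validation rules are assumptions of mine; the directory layout and the empty maildirfolder file are as described above.

```python
import os


def create_folder(maildir, name):
    """Create folder `name` as a Maildir of its own under `maildir`."""
    if not name or name.startswith(".") or "/" in name:
        # Assumption: reject names that would yield "..name" or a nested path.
        raise ValueError("invalid folder name: %r" % name)
    folder = os.path.join(maildir, "." + name)
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(folder, sub), exist_ok=True)
    # Empty marker file: tells delivery agents this is a folder,
    # not a top-level Maildir++.
    open(os.path.join(folder, "maildirfolder"), "a").close()
    return folder
```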
| 206 | |
| 207 | Only one special folder is reserved: Trash (subdirectory .Trash). |
| 208 | Instead of marking deleted messages with the D flag, Maildir++ clients |
| 209 | move the message into the Trash folder. Maildir++ readers are |
| 210 | responsible for expunging messages from Trash after a system-defined |
| 211 | retention interval. |
| 212 | |
| 213 | When a Maildir++ reader sees a message marked with a D flag it may at |
| 214 | its option: remove the message immediately, move it into Trash, or |
| 215 | ignore it. |
| 216 | |
| 217 | Can folders have subfolders, defined in a recursive fashion? The |
| 218 | answer is no. If you want to have a client with a hierarchy of |
| 219 | folders, emulate it. Pick a hierarchy separator character, say ":". |
| 220 | Then, folder foo/bar is subdirectory .foo:bar. |
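
A client could do that mapping with something like the following sketch. The function name is hypothetical, and ":" is just the separator suggested above.

```python
SEPARATOR = ":"  # hierarchy separator suggested above; any agreed character works


def folder_to_subdir(folder_path):
    """Map a client-side path such as 'foo/bar' to its Maildir++ subdirectory."""
    parts = folder_path.split("/")
    if any(not part or part.startswith(".") for part in parts):
        # Assumption: reject empty components and leading periods, so the
        # result never starts with two periods.
        raise ValueError("invalid folder name: %r" % folder_path)
    return "." + SEPARATOR.join(parts)
```

With this, folder_to_subdir("foo/bar") gives ".foo:bar", a single flat subdirectory of the main Maildir.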
| 221 | |
| 222 | That is all there is to say about folders. The rest of this |
| 223 | document deals with quotas. |
| 224 | |
| 225 | The purpose of quotas is to temporarily disable a Maildir, if it goes |
| 226 | over the quota. There is one and only one major goal that this quota |
| 227 | implementation tries to achieve: |
| 228 | * Place as little overhead as possible on the mail system that's |
| 229 | delivering to the Maildir++ |
| 230 | |
| 231 | That's it. To achieve that goal, certain compromises are made: |
| 232 | * Mail delivery will stop as soon as possible after Maildir++'s size |
| 233 | goes over quota. In rare circumstances, race conditions may let a |
| 234 | Maildir++ go well over quota. That is taken into |
| 235 | account, and the situation will eventually resolve itself, but you |
| 236 | should not simply take your systemwide quota, multiply it by the |
| 237 | number of mail accounts, and allocate that much disk space. Always |
| 238 | leave room to spare. |
| 239 | * How well the quota mechanism will work will depend on whether or |
| 240 | not everything that accesses the Maildir++ is a Maildir++ client. |
| 241 | You can have a transition period where some of your mail clients |
| 242 | are just Maildir clients, and things should run more or less well. |
| 243 | There will be some additional load because the size of the Maildir |
| 244 | will be recalculated more often, but the additional load shouldn't |
| 245 | be noticeable. |
| 246 | |
| 247 | This won't be a perfect solution, but it will hopefully be good |
| 248 | enough. Maildirs are simply designed to rely on the filesystem to |
| 249 | enforce individual quotas. If a filesystem-based quota works for you, |
| 250 | use it. |
| 251 | |
| 252 | A Maildir++ may contain the following additional file: maildirsize. |
| 253 | |
| 254 | Contents of maildirsize |
| 255 | |
| 256 | maildirsize contains two or more lines terminated by newline |
| 257 | characters. |
| 258 | |
| 259 | The first line contains a copy of the quota definition as used by the |
| 260 | system's mail server. Each application that uses the maildir must know |
| 261 | what its quota is. Instead of configuring each application with the |
| 262 | quota logic, and making sure that every application's quota definition |
| 263 | for the same maildir is exactly the same, the quota specification used |
| 264 | by the system mail server is saved as the first line of the |
| 265 | maildirsize file. All other applications that enforce the maildir quota |
| 266 | simply read the first line of maildirsize. |
| 267 | |
| 268 | The quota definition is a list, separated by commas. Each member of the |
| 269 | list consists of an integer followed by a letter, specifying the |
| 270 | nature of the quota. Currently defined quota types are 'S' - total |
| 271 | size of all messages, and 'C' - the maximum count of messages in the |
| 272 | maildir. For example, 10000000S,1000C specifies a quota of 10,000,000 |
| 273 | bytes or 1,000 messages, whichever comes first. |
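
A minimal parser for this definition might look as follows. The function name is hypothetical, and silently ignoring unknown or malformed members is my assumption, not something this document specifies.

```python
def parse_quota(definition):
    """Parse a quota definition such as '10000000S,1000C'.

    Returns a dict mapping the type letter to its integer limit.
    """
    limits = {}
    for member in definition.strip().split(","):
        if not member:
            continue
        value, kind = member[:-1], member[-1]
        if kind in ("S", "C") and value.isdigit():
            limits[kind] = int(value)
        # Assumption: anything else is ignored rather than rejected.
    return limits
```

A missing letter in the result simply means that kind of quota is not set.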
| 274 | |
| 275 | All remaining lines contain two integers separated by a single |
| 276 | space. The first integer is interpreted as a byte count. The second |
| 277 | integer is interpreted as a file count. A Maildir++ writer can add up |
| 278 | all byte counts and file counts from maildirsize and enforce a quota |
| 279 | based either on number of messages or the total size of all the |
| 280 | messages. |
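
Adding up the counts can be sketched like this. The function name is hypothetical, and skipping malformed lines is an assumption; negative entries (recorded when messages are removed) are summed like any other.

```python
def read_maildirsize(text):
    """Return (quota_definition, total_bytes, total_count) from maildirsize contents."""
    lines = text.split("\n")
    quota_definition = lines[0]
    total_bytes = total_count = 0
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != 2:
            continue  # assumption: skip blank or malformed lines
        try:
            size, count = int(fields[0]), int(fields[1])
        except ValueError:
            continue
        total_bytes += size
        total_count += count
    return quota_definition, total_bytes, total_count
```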
| 281 | |
| 282 | Calculating maildirsize |
| 283 | |
| 284 | In most cases, changes to maildirsize are recorded by appending an |
| 285 | additional line. Under some conditions maildirsize has to be |
| 286 | recalculated from scratch. These conditions are defined later. This is |
| 287 | the procedure that's used to recalculate maildirsize: |
| 288 | 1. If we find a maildirfolder within the directory, we're delivering |
| 289 | to a folder, so back up to the parent directory, and start again. |
| 290 | 2. Read the contents of the new and cur subdirectories. Also, read |
| 291 | the contents of the new and cur subdirectories in each Maildir++ |
| 292 | folder, except Trash. Before reading each subdirectory, stat() the |
| 293 | subdirectory itself, and keep track of the latest timestamp you |
| 294 | get. |
| 295 | 3. If the filename of each message is of the form xxxxx,S=nnnnn or |
| 296 | xxxxx,S=nnnnn:xxxxx where "xxxxx" represents arbitrary text, then |
| 297 | use nnnnn as the size of the file (which will be conveniently |
| 298 | recorded in the filename by a Maildir++ writer, within the |
| 299 | conventions of filename naming in a Maildir). If the message was |
| 300 | not written by a Maildir++ writer, stat() it to obtain the message |
| 301 | size. If stat() fails, a race condition removed the file, so just |
| 302 | ignore it and move on to the next one. |
| 303 | 4. When done, you have the grand total of the number of messages and |
| 304 | their total size. Create a new maildirsize by: creating the file |
| 305 | in the tmp subdirectory, observing the conventions for writing to |
| 306 | a Maildir. Then rename the file as maildirsize. Afterwards, stat |
| 307 | all new and cur subdirectories again. If you find a timestamp |
| 308 | later than the saved timestamp, REMOVE maildirsize. |
| 309 | 5. Before running this calculation procedure, the Maildir++ user |
| 310 | wanted to know the size of the Maildir++, so return the calculated |
| 311 | values. This is done even if maildirsize was removed. |
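
The five steps above can be sketched roughly as follows. All function names are hypothetical, and the temporary file name is a simplification: a real writer would follow the full Maildir naming conventions for files in tmp.

```python
import os
import re

# ",S=nnnnn" at the end of the name, or just before the ":" suffix.
SIZE_RE = re.compile(r",S=(\d+)(?::|$)")


def message_size(path):
    """Size from the filename if recorded there, else from stat()."""
    match = SIZE_RE.search(os.path.basename(path))
    if match:
        return int(match.group(1))
    try:
        return os.stat(path).st_size
    except FileNotFoundError:
        return None  # removed by a concurrent client: skip it


def message_dirs(maildir):
    """Yield new/ and cur/ of the main Maildir and every folder except Trash."""
    roots = [maildir]
    for name in sorted(os.listdir(maildir)):
        full = os.path.join(maildir, name)
        if name.startswith(".") and name != ".Trash" and os.path.isdir(full):
            roots.append(full)
    for root in roots:
        for sub in ("new", "cur"):
            path = os.path.join(root, sub)
            if os.path.isdir(path):
                yield path


def scan(maildir):
    """Return (total_bytes, total_count, latest_mtime)."""
    total_bytes = total_count = 0
    latest = 0.0
    for directory in message_dirs(maildir):
        latest = max(latest, os.stat(directory).st_mtime)
        for name in os.listdir(directory):
            size = message_size(os.path.join(directory, name))
            if size is not None:
                total_bytes += size
                total_count += 1
    return total_bytes, total_count, latest


def recalculate(maildir, quota_definition):
    """Rebuild maildirsize and return (total_bytes, total_count)."""
    if os.path.exists(os.path.join(maildir, "maildirfolder")):
        maildir = os.path.dirname(os.path.abspath(maildir))  # step 1
    total_bytes, total_count, latest = scan(maildir)  # steps 2 and 3
    # Simplified temporary name (assumption).
    tmp_path = os.path.join(maildir, "tmp", "maildirsize.%d" % os.getpid())
    with open(tmp_path, "w") as f:
        f.write("%s\n%d %d\n" % (quota_definition, total_bytes, total_count))
    final = os.path.join(maildir, "maildirsize")
    os.rename(tmp_path, final)  # step 4
    if any(os.stat(d).st_mtime > latest for d in message_dirs(maildir)):
        os.remove(final)  # something changed while we were counting
    return total_bytes, total_count  # step 5
```

Note that the totals are returned even when maildirsize ends up being removed, as step 5 requires.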
| 312 | |
| 313 | Calculating the quota for a Maildir++ |
| 314 | |
| 315 | This is the procedure for reading the contents of maildirsize for the |
| 316 | purpose of determining whether the Maildir++ is over quota. |
| 317 | 1. If maildirsize does not exist, or if its size is at least 5120 |
| 318 | bytes, recalculate it using the procedure defined above, and use |
| 319 | the recalculated numbers. Otherwise, read the contents of |
| 320 | maildirsize, and add up the totals. |
| 321 | 2. The most efficient way of doing this is to: open maildirsize, then |
| 322 | start reading it into a 5120 byte buffer (some broken NFS |
| 323 | implementations may return less than 5120 bytes read even before |
| 324 | reaching the end of the file). If we fill the buffer, which, when it |
| 325 | happens, usually takes just one read, close the file and run the |
| 326 | recalculation procedure. |
| 327 | 3. In many cases the quota calculation is for the purpose of adding |
| 328 | or removing messages from a Maildir++, so keep the file descriptor |
| 329 | to maildirsize open. A file descriptor will not be available if |
| 330 | quota recalculation ended up removing maildirsize due to a race |
| 331 | condition, so the caller may or may not get a file descriptor |
| 332 | together with the Maildir++ size. |
| 333 | 4. If the numbers we got indicate that the Maildir++ is over quota, |
| 334 | some additional logic is in order. If we did not recalculate |
| 335 | maildirsize, so that the over-quota totals came from the existing |
| 336 | file, and if maildirsize is more than one line long or its |
| 337 | timestamp indicates that it is at least 15 minutes old, throw out |
| 338 | the totals and recalculate maildirsize |
| 339 | from scratch. |
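
The decision logic in steps 1 and 4 can be condensed into one predicate, sketched below. The function and parameter names are hypothetical; line_count is whatever the implementation uses to judge "more than one line long".

```python
import time

MAX_SIZE = 5120    # bytes: at or above this size, always recalculate
MAX_AGE = 15 * 60  # seconds


def needs_recalculation(exists, file_size, line_count, mtime, over_quota, now=None):
    """Decide whether maildirsize must be rebuilt from scratch."""
    if not exists or file_size >= MAX_SIZE:
        return True  # step 1
    if over_quota:
        # Step 4: do not trust over-quota totals unless the file is a
        # single fresh line.
        now = time.time() if now is None else now
        return line_count > 1 or now - mtime >= MAX_AGE
    return False
```

The under-quota case never forces a recalculation on its own; it is only the 5120 byte limit that eventually does.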
| 340 | |
| 341 | Eventually the 5120 byte limitation will always cause maildirsize to |
| 342 | be recalculated, which will compensate for any race conditions which |
| 343 | previously threw off the totals. Each time a message is delivered or |
| 344 | removed from a Maildir++, one line is added to maildirsize (this is |
| 345 | described below in greater detail). Most messages are less than 10K |
| 346 | long, so each line appended to maildirsize will be between |
| 347 | seven and nine bytes long (four bytes for message size, space, digit |
| 348 | 1, newline, optional minus sign in front of both counts if the message |
| 349 | was removed). This results in about 640 Maildir++ operations before a |
| 350 | recalculation is forced. Since most messages are added once and |
| 351 | removed once from a Maildir, expect recalculation to happen |
| 352 | approximately every 320 messages, keeping the overhead of a |
| 353 | recalculation to a minimum. Even if most messages include large |
| 354 | attachments, most attachments are less than 100K long, which brings |
| 355 | down the average recalculation frequency to about 150 messages. |
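
The estimate for messages under 10K can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope check; it ignores the first line of maildirsize, and the names are mine.

```python
LIMIT = 5120  # bytes


def line_length(message_size, removed=False):
    """Bytes in one maildirsize line for a delivery or a removal."""
    if removed:
        line = "%d %d\n" % (-message_size, -1)
    else:
        line = "%d %d\n" % (message_size, 1)
    return len(line)


add_len = line_length(9999)           # four digits, space, "1", newline
remove_len = line_length(9999, True)  # the same, plus two minus signs
average = (add_len + remove_len) // 2
operations = LIMIT // average         # operations before a forced recalculation
messages = operations // 2            # each message is added once, removed once
```

With four-digit sizes this gives lines of 7 and 9 bytes, 640 operations, and 320 messages, matching the figures above.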
| 356 | |
| 357 | Also, the effect of having non-Maildir++ clients accessing the |
| 358 | Maildir++ is reduced by forcing a recalculation when we're potentially |
| 359 | over quota. Even if non-Maildir++ clients are used to remove messages |
| 360 | from the Maildir, the fact that the Maildir++ is still over quota will |
| 361 | be verified every 15 minutes. |
| 362 | |
| 363 | Delivering to a Maildir++ |
| 364 | |
| 365 | Delivering to a Maildir++ is like delivering to a Maildir, with the |
| 366 | following exceptions: |
| 367 | 1. Follow the usual Maildir conventions for naming the file used |
| 368 | to store the message, but append ,S=nnnnn to the name of |
| 369 | the file, where nnnnn is the size of the file. This eliminates the |
| 370 | need to stat() most messages when calculating the quota. If the |
| 371 | size of the message is not known at the beginning, append ,S=nnnnn |
| 372 | when renaming the message from tmp to new. |
| 373 | 2. As soon as the size of the message is known (hopefully before it |
| 374 | is written into tmp), calculate Maildir++'s quota, using the |
| 375 | procedure defined previously. If the Maildir++ is over quota, back |
| 376 | out, cleaning up anything that was created in tmp. |
| 377 | 3. If a file descriptor to maildirsize was opened for us, after |
| 378 | moving the file from tmp to new append a line to the file |
| 379 | containing the message size, and "1". |
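
Steps 1 and 3 might be sketched as below, assuming the caller has already done the quota check of step 2 and, if maildirsize exists, holds a descriptor opened for appending. The function name and the simplified unique file name are assumptions; a real writer follows the full Maildir naming rules.

```python
import os
import socket
import time


def deliver_message(maildir, message, maildirsize_fd=None):
    """Write message bytes into maildir/new, with ,S=size in the name."""
    size = len(message)
    # Simplified unique name (assumption), not the full Maildir convention.
    unique = "%d.%d.%s" % (int(time.time()), os.getpid(), socket.gethostname())
    name = "%s,S=%d" % (unique, size)
    tmp_path = os.path.join(maildir, "tmp", name)
    new_path = os.path.join(maildir, "new", name)
    with open(tmp_path, "wb") as f:
        f.write(message)
    os.rename(tmp_path, new_path)
    if maildirsize_fd is not None:
        # One appended line per delivery: "<bytes> 1".
        os.write(maildirsize_fd, b"%d 1\n" % size)
    return new_path
```

Nothing is written to maildirsize when no descriptor is available, for example after a recalculation removed the file.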
| 380 | |
| 381 | Reading from a Maildir++ |
| 382 | |
| 383 | Maildir++ readers should mind the following additional tasks: |
| 384 | 1. Make sure to create the maildirfolder file in any new folders |
| 385 | created within the Maildir++. |
| 386 | 2. When moving a message to the Trash folder, append a line to |
| 387 | maildirsize, containing a negative message size and a '-1'. |
| 388 | 3. When moving a message from the Trash folder, follow the steps |
| 389 | described in "Delivering to a Maildir++", as far as quota logic |
| 390 | goes. That is, refuse to move messages out of Trash if the |
| 391 | Maildir++ is over quota. |
| 392 | 4. Moving a message between other folders carries no additional |
| 393 | requirements. |
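
Task 2 could be sketched as follows, assuming the Trash folder already exists and the caller knows the message size (from the ,S= part of its name, or from stat). The function name is hypothetical.

```python
import os


def move_to_trash(maildir, message_path, size, maildirsize_fd=None):
    """Rename a message into .Trash/cur and record the removal."""
    destination = os.path.join(
        maildir, ".Trash", "cur", os.path.basename(message_path)
    )
    os.rename(message_path, destination)
    if maildirsize_fd is not None:
        # Negative size and count: Trash is not counted against the quota.
        os.write(maildirsize_fd, b"-%d -1\n" % size)
    return destination
```

Moving a message back out of Trash is the reverse operation, but it goes through the quota check first, exactly like a fresh delivery.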
| 394 | |