Commit | Line | Data |
---|---|---|
e05f33e0 PH |
1 | Maildir++ |
2 | ||
3 | In this document: | |
4 | * HOWTO.maildirquota | |
5 | * Mission statement | |
6 | * Definitions and goals | |
7 | * Contents of maildirsize | |
8 | * Calculating maildirsize | |
9 | * Calculating the quota for a Maildir++ | |
10 | * Delivering to a Maildir++ | |
11 | * Reading from a Maildir++ | |
12 | * Bugs | |
13 | ||
14 | HOWTO.maildirquota | |
15 | ||
16 | The remaining portion of this document is a technical description of | |
17 | the maildir quota extension. This section is a brief overview of this | |
18 | extension. | |
19 | ||
20 | What is a maildirquota? | |
21 | ||
22 | If you would like to have a quota on your maildir mailboxes, the best | |
23 | solution is to always use filesystem-based quotas: per-user usage | |
24 | quotas that are enforced by the operating system. | |
25 | ||
26 | This is the best solution when the default Maildir is located in each | |
27 | account's home directory. This solution will NOT work if Maildirs are | |
28 | stored elsewhere, or if you have a large virtual domain setup where a | |
29 | single userid is used to hold many individual Maildirs, one for each | |
30 | virtual user. | |
31 | ||
32 | This extension to the maildir format allows a "voluntary" maildir | |
33 | quota implementation that does not rely on filesystem-based quotas. | |
34 | ||
35 | When maildirquota will not work. | |
36 | ||
37 | For this quota mechanism to work, all software that accesses a maildir | |
38 | must observe this quota protocol. It follows that this quota mechanism | |
39 | can be easily circumvented if users have direct (shell) access to the | |
40 | filesystem containing the users' maildirs. | |
41 | ||
42 | Furthermore, this quota mechanism is not 100% effective. It is | |
43 | possible to have a situation where someone may go over quota. This | |
44 | quota implementation makes a deliberate trade-off. It is necessary to | |
45 | use some form of locking in order to have completely bulletproof quota | |
46 | enforcement, but maildir mail stores were explicitly designed to | |
47 | avoid any kind of locking. This quota approach does not use locking, | |
48 | and the trade-off is that sometimes it is possible for a few extra | |
49 | messages to be delivered to the maildir before the door is | |
50 | permanently shut. | |
51 | ||
52 | For best performance, all maildir clients should support this quota | |
53 | extension; however, there's a wide degree of tolerance here. As long as | |
54 | the mail delivery agent that puts new messages into a Maildir uses | |
55 | this extension, the quota will be enforced without excessive | |
56 | degradation. | |
57 | ||
58 | In the worst case scenario, quotas are automatically recalculated | |
59 | every fifteen minutes. If a maildir goes over quota, and a mail client | |
60 | that does not support this quota extension removes enough mail from | |
61 | the maildir, the mail delivery agent will not be immediately informed | |
62 | that the maildir is now under quota. However, eventually the correct | |
63 | quota will be recalculated and mail delivery will resume. | |
64 | ||
65 | Mail user agents sometimes put messages into the maildir themselves. | |
66 | Messages added to a maildir by a mail user agent that does not | |
67 | understand the quota extension will not be immediately counted towards | |
68 | the overall quota, and may not be counted for an extensive period of | |
69 | time. Additionally, if there are a lot of messages that have been | |
70 | added to a maildir from these mail user agents, quota recalculation | |
71 | may impose non-trivial load on the system, as the quota recalculator | |
72 | will have to issue the stat system call for each message. | |
73 | ||
74 | How to implement the quota | |
75 | ||
76 | The best way to do that is to modify your mail server to implement the | |
77 | protocol defined by this document. Not everyone, of course, has this | |
78 | ability. Therefore, an alternate approach is available. | |
79 | ||
80 | This package creates a very short utility called "deliverquota". It | |
81 | will NOT be installed anywhere by default, unless this maildir quota | |
82 | implementation is a part of a larger package, in which case the parent | |
83 | package may install this utility somewhere. If you obtained the | |
84 | maildir package separately, you will need to compile it by running the | |
85 | configure script, then by running make. | |
86 | ||
87 | deliverquota takes two arguments. deliverquota reads the message from | |
88 | standard input, then delivers it to the maildir specified by the first | |
89 | argument to deliverquota. The second argument specifies the actual | |
90 | quota for this maildir, as defined elsewhere in this document. | |
91 | deliverquota will deliver the message to the maildir, making a best | |
92 | effort not to exceed the stated quota. If the maildir is over quota, | |
93 | deliverquota terminates with exit code 77. Otherwise, it delivers the | |
94 | message, updates the quota, and terminates with exit code 0. | |
95 | ||
96 | Therefore, proceed as follows: | |
97 | * Copy deliverquota to some convenient location, say /usr/local/bin. | |
98 | * Configure your mail server to use deliverquota. For example, if | |
99 | you use Qmail and your maildirs are all located in $HOME/Maildir, | |
100 | replace the './Maildir/' argument to qmail-start with the | |
101 | following: | |
102 | '| /usr/local/bin/deliverquota ./Maildir 1000000S' | |
103 | ||
104 | ||
105 | ||
106 | ||
107 | This sets a one million byte limit on all Maildirs. As I | |
108 | mentioned, this is meaningless if login access is available, | |
109 | because the individual account owner can create his own | |
110 | $HOME/.qmail file, and ignore deliverquota. Note that in this | |
111 | case, you MUST use apostrophes on the qmail-start command line, in | |
112 | order to quote this as one argument. | |
113 | ||
114 | If you would like to use different quotas for different users, you | |
115 | will have to put together a separate process or a script that looks up | |
116 | the appropriate quota for the recipient, and runs deliverquota | |
117 | specifying the quota. If no login access to the mail server is | |
118 | available, you can simply create a separate $HOME/.qmail for every | |
119 | recipient. | |
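The per-recipient lookup described above could be sketched roughly as follows. This is only an illustration: the QUOTAS table, the default quota, and the function names are hypothetical, and the deliverquota location is the /usr/local/bin example used earlier. The only behavior taken from this document is that deliverquota receives the maildir and the quota as its two arguments, reads the message from standard input, and exits with 0 on delivery or 77 when the maildir is over quota.

```python
import subprocess

# Hypothetical per-recipient quota table; a real setup would read this
# from a file or a database.
QUOTAS = {"alice": "5000000S", "bob": "10000000S,1000C"}
DEFAULT_QUOTA = "1000000S"
DELIVERQUOTA = "/usr/local/bin/deliverquota"  # install location is an assumption


def build_command(recipient, maildir, quotas=QUOTAS, default=DEFAULT_QUOTA):
    """Return the deliverquota argument list for one recipient."""
    return [DELIVERQUOTA, maildir, quotas.get(recipient, default)]


def deliver(recipient, maildir, message_bytes):
    """Pipe the message to deliverquota and return its exit code.

    Exit code 0 means the message was delivered; 77 means the maildir
    is over quota.
    """
    result = subprocess.run(build_command(recipient, maildir), input=message_bytes)
    return result.returncode
```

A wrapper script would call deliver() with the recipient name, the maildir path, and the message read from standard input, then exit with the returned code so that the mail server sees the same result deliverquota reported.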
120 | ||
121 | That's pretty much it. If you handle a moderate amount of mail, I have | |
122 | one more suggestion. For the first couple of weeks, run deliverquota | |
123 | setting the second argument to an empty string. This disables quota | |
124 | enforcement; however, it still activates certain optimizations that | |
125 | permit very fast quota recalculation. Messages delivered by | |
126 | deliverquota have their message size encoded in their filename; this | |
127 | makes it possible to avoid stat-ing the message in the Maildir, when | |
128 | recalculating the quota. Then, after most messages in your maildirs | |
129 | have been delivered by deliverquota, activate the quotas. | |
130 | ||
131 | maildirquota-enhanced applications | |
132 | ||
133 | This is a list of applications that have been enhanced to support the | |
134 | maildirquota extension: | |
135 | * maildrop - mail delivery agent/mail filter. | |
136 | * SqWebmail - webmail CGI binary. | |
137 | ||
138 | These applications fall into two classes: | |
139 | * Mail delivery agents. These applications read some externally | |
140 | defined table of mail recipients and their maildir quota. | |
141 | * Mail clients. These applications read maildir quota information | |
142 | that has been defined by the mail delivery agent. | |
143 | ||
144 | Mail clients generally do not need any additional setup in order to | |
145 | use the maildirquota extension. They will automatically read and | |
146 | implement any quota specification set by the mail delivery agent. | |
147 | ||
148 | On the other hand, mail delivery agents will require some kind of | |
149 | configuration in order to activate the maildirquota extension for some | |
150 | or all recipients. The instructions for doing that depend upon the | |
151 | mail delivery agent. The documentation for the mail delivery agent | |
152 | should be consulted for additional information. | |
153 | _________________________________________________________________ | |
154 | ||
155 | Mission statement | |
156 | ||
157 | Maildir++ is a mail storage structure that's based on the Maildir | |
158 | structure, first used in the Qmail mail server. Actually, Maildir++ is | |
159 | just a minor extension to the standard Maildir structure. | |
160 | ||
161 | For more information, see http://www.qmail.org/man/man5/maildir.html. | |
162 | I am not going to include the definition of a Maildir in this | |
163 | document. Consider it included right here. This document only | |
164 | describes the differences. | |
165 | ||
166 | Maildir++ adds a couple of things to a standard Maildir: folders and | |
167 | quotas. | |
168 | ||
169 | Quotas enforce a maximum allowable size of a Maildir. In many | |
170 | situations, using the quota mechanism of the underlying filesystem | |
171 | won't work very well. If a filesystem quota mechanism is used, then | |
172 | when a Maildir goes over quota, Qmail does not bounce additional mail, | |
173 | but keeps it queued, changing one bad situation into another bad | |
174 | situation. Not only do you have an account that's backed up, but now | |
175 | your queue starts to back up too. | |
176 | ||
177 | Definitions and goals | |
178 | ||
179 | Maildir++ and Maildir shall be completely interchangeable. A Maildir++ | |
180 | client will be able to use a standard Maildir, automatically | |
181 | "upgrading" it in the process. A Maildir client will be able to use a | |
182 | Maildir++ just like a regular Maildir. Of course, a plain Maildir | |
183 | client won't be able to enforce a quota, and won't be able to access | |
184 | messages stored in folders. | |
185 | ||
186 | Folders are created as subdirectories under the main Maildir. The name | |
187 | of the subdirectory always starts with a period. For example, a folder | |
188 | named "Important" will be a subdirectory called ".Important". You | |
189 | can't have subdirectories that start with two periods. | |
190 | ||
191 | A Maildir++ client ignores anything in the main Maildir that starts | |
192 | with a period, but is not a subdirectory. | |
193 | ||
194 | Each subdirectory is a fully-fledged Maildir of its own; that is, you | |
195 | have .Important/tmp, .Important/new, and .Important/cur. Everything | |
196 | that applies to the main Maildir applies equally well to the | |
197 | subdirectory, including automatically cleaning up old files in tmp. A | |
198 | Maildir++ enhancement is that a message can be moved between folders | |
199 | and/or the main Maildir simply by moving/renaming the file (into the | |
200 | cur subdirectory of the destination folder). Therefore, the entire | |
201 | Maildir++ must reside on the same filesystem. | |
202 | ||
203 | Within each subdirectory there's an empty file, maildirfolder. Its | |
204 | existence tells the mail delivery agent that this Maildir is really | |
205 | a folder underneath a parent Maildir++. | |
206 | ||
207 | Only one special folder is reserved: Trash (subdirectory .Trash). | |
208 | Instead of marking deleted messages with the D flag, Maildir++ clients | |
209 | move the message into the Trash folder. Maildir++ readers are | |
210 | responsible for expunging messages from Trash after a system-defined | |
211 | retention interval. | |
212 | ||
213 | When a Maildir++ reader sees a message marked with a D flag it may at | |
214 | its option: remove the message immediately, move it into Trash, or | |
215 | ignore it. | |
216 | ||
217 | Can folders have subfolders, defined in a recursive fashion? The | |
218 | answer is no. If you want to have a client with a hierarchy of | |
219 | folders, emulate it. Pick a hierarchy separator character, say ":". | |
220 | Then, folder foo/bar is subdirectory .foo:bar. | |
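As an illustration of the folder conventions above, here is a minimal sketch of how a client might map a hierarchical folder name onto a Maildir++ subdirectory and create it. The function names are hypothetical, and ":" is used as the separator only because it is the example given in the text.

```python
import os

SEPARATOR = ":"  # hierarchy separator chosen by the client; ":" follows the text


def folder_to_subdir(folder_path):
    """Map a client-side folder path such as "foo/bar" to ".foo:bar"."""
    parts = [p for p in folder_path.split("/") if p]
    if not parts:
        raise ValueError("empty folder name")
    name = "." + SEPARATOR.join(parts)
    if name.startswith(".."):
        raise ValueError("folder subdirectories cannot start with two periods")
    return name


def create_folder(maildir, folder_path):
    """Create the folder as a Maildir of its own, marked with maildirfolder."""
    root = os.path.join(maildir, folder_to_subdir(folder_path))
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    # The empty maildirfolder file tells the mail delivery agent that
    # this Maildir is a folder underneath a parent Maildir++.
    open(os.path.join(root, "maildirfolder"), "a").close()
    return root
```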
221 | ||
222 | This is all there is to say about folders. The rest of this | |
223 | document deals with quotas. | |
224 | ||
225 | The purpose of quotas is to temporarily disable a Maildir, if it goes | |
226 | over the quota. There is one and only one major goal that this quota | |
227 | implementation tries to achieve: | |
228 | * Place as little overhead as possible on the mail system that's | |
229 | delivering to the Maildir++ | |
230 | ||
231 | That's it. To achieve that goal, certain compromises are made: | |
232 | * Mail delivery will stop as soon as possible after Maildir++'s size | |
233 | goes over quota. Certain race conditions may happen with Maildir++ | |
234 | going a lot over quota, in rare circumstances. That is taken into | |
235 | account, and the situation will eventually resolve itself, but you | |
236 | should not simply take your systemwide quota, multiply it by the | |
237 | number of mail accounts, and allocate that much disk space. Always | |
238 | leave room to spare. | |
239 | * How well the quota mechanism will work will depend on whether or | |
240 | not everything that accesses the Maildir++ is a Maildir++ client. | |
241 | You can have a transition period where some of your mail clients | |
242 | are just Maildir clients, and things should run more or less well. | |
243 | There will be some additional load because the size of the Maildir | |
244 | will be recalculated more often, but the additional load shouldn't | |
245 | be noticeable. | |
246 | ||
247 | This won't be a perfect solution, but it will hopefully be good | |
248 | enough. Maildirs are simply designed to rely on the filesystem to | |
249 | enforce individual quotas. If a filesystem-based quota works for you, | |
250 | use it. | |
251 | ||
252 | A Maildir++ may contain the following additional file: maildirsize. | |
253 | ||
254 | Contents of maildirsize | |
255 | ||
256 | maildirsize contains two or more lines terminated by newline | |
257 | characters. | |
258 | ||
259 | The first line contains a copy of the quota definition as used by the | |
260 | system's mail server. Each application that uses the maildir must know | |
261 | what its quota is. Instead of configuring each application with the | |
262 | quota logic, and making sure that every application's quota definition | |
263 | for the same maildir is exactly the same, the quota specification used | |
264 | by the system mail server is saved as the first line of the | |
265 | maildirsize file. All other applications that enforce the maildir quota | |
266 | simply read the first line of maildirsize. | |
267 | ||
268 | The quota definition is a list, separated by commas. Each member of the | |
269 | list consists of an integer followed by a letter, specifying the | |
270 | nature of the quota. Currently defined quota types are 'S' - total | |
271 | size of all messages, and 'C' - the maximum count of messages in the | |
272 | maildir. For example, 10000000S,1000C specifies a quota of 10,000,000 | |
273 | bytes or 1,000 messages, whichever comes first. | |
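A quota definition of this form could be parsed along the following lines. This is a sketch, not a reference implementation: the function names are hypothetical, and treating a total that exactly equals the limit as still within quota is an assumption, since the text does not say which side of the boundary counts.

```python
def parse_quota(spec):
    """Parse a quota definition such as "10000000S,1000C".

    Returns a dict mapping the quota letter to its integer limit.
    Letters other than 'S' and 'C' are kept as-is; what to do with
    them is left to the caller.
    """
    limits = {}
    for member in spec.strip().split(","):
        member = member.strip()
        if not member:
            continue  # an empty definition means no limits at all
        number, letter = member[:-1], member[-1]
        if not number.isdigit():
            raise ValueError(f"bad quota member: {member!r}")
        limits[letter] = int(number)
    return limits


def over_quota(limits, total_bytes, total_count):
    """True if the size ('S') or the count ('C') limit is exceeded."""
    size_limit = limits.get("S")
    count_limit = limits.get("C")
    return ((size_limit is not None and total_bytes > size_limit)
            or (count_limit is not None and total_count > count_limit))
```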
274 | ||
275 | All remaining lines contain two integers separated by a single | |
276 | space. The first integer is interpreted as a byte count. The second | |
277 | integer is interpreted as a file count. A Maildir++ writer can add up | |
278 | all byte counts and file counts from maildirsize and enforce a quota | |
279 | based either on number of messages or the total size of all the | |
280 | messages. | |
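Adding up the contents of maildirsize might look like this. The sketch takes the file contents as a string; the function name is hypothetical, and silently skipping blank or malformed lines is an assumption rather than something this document specifies.

```python
def read_maildirsize(text):
    """Return (quota_line, total_bytes, total_count) from maildirsize contents."""
    lines = text.split("\n")
    quota_line = lines[0]
    total_bytes = 0
    total_count = 0
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != 2:
            continue  # blank or malformed line: skipped (an assumption)
        # Both integers may be negative when a message was removed.
        total_bytes += int(fields[0])
        total_count += int(fields[1])
    return quota_line, total_bytes, total_count
```

For example, a file holding "10000000S,1000C", "4096 2", "1500 1" and "-1500 -1" on successive lines adds up to 4096 bytes in 2 messages.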
281 | ||
282 | Calculating maildirsize | |
283 | ||
284 | In most cases, changes to maildirsize are recorded by appending an | |
285 | additional line. Under some conditions maildirsize has to be | |
286 | recalculated from scratch. These conditions are defined later. This is | |
287 | the procedure that's used to recalculate maildirsize: | |
288 | 1. If we find a maildirfolder within the directory, we're delivering | |
289 | to a folder, so back up to the parent directory, and start again. | |
290 | 2. Read the contents of the new and cur subdirectories. Also, read | |
291 | the contents of the new and cur subdirectories in each Maildir++ | |
292 | folder, except Trash. Before reading each subdirectory, stat() the | |
293 | subdirectory itself, and keep track of the latest timestamp you | |
294 | get. | |
295 | 3. If the filename of each message is of the form xxxxx,S=nnnnn or | |
296 | xxxxx,S=nnnnn:xxxxx where "xxxxx" represents arbitrary text, then | |
297 | use nnnnn as the size of the file (which will be conveniently | |
298 | recorded in the filename by a Maildir++ writer, within the | |
299 | conventions of filename naming in a Maildir). If the message was | |
300 | not written by a Maildir++ writer, stat() it to obtain the message | |
301 | size. If stat() fails, a race condition removed the file, so just | |
302 | ignore it and move on to the next one. | |
303 | 4. When done, you have the grand total of the number of messages and | |
304 | their total size. Create a new maildirsize by: creating the file | |
305 | in the tmp subdirectory, observing the conventions for writing to | |
306 | a Maildir. Then rename the file as maildirsize. Afterwards, stat | |
307 | all new and cur subdirectories again. If you find a timestamp | |
308 | later than the saved timestamp, REMOVE maildirsize. | |
309 | 5. Before running this calculation procedure, the Maildir++ user | |
310 | wanted to know the size of the Maildir++, so return the calculated | |
311 | values. This is done even if maildirsize was removed. | |
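The recalculation steps above can be sketched as follows. This is a simplified illustration under several assumptions: the function names are hypothetical, the quota definition is passed in as a string, the temporary filename is only a stand-in for a proper unique Maildir name, and directory modification times are used as the timestamps being compared.

```python
import os
import re
import time

# Matches xxxxx,S=nnnnn and xxxxx,S=nnnnn:xxxxx as described in step 3.
SIZE_IN_NAME = re.compile(r",S=(\d+)(?::|$)")


def message_dirs(maildir):
    """List new/ and cur/ of the main Maildir and of every folder except Trash."""
    roots = [maildir]
    for entry in os.listdir(maildir):
        path = os.path.join(maildir, entry)
        if entry.startswith(".") and entry != ".Trash" and os.path.isdir(path):
            roots.append(path)
    return [os.path.join(r, sub) for r in roots for sub in ("new", "cur")]


def recalculate(maildir, quota_line):
    """Rebuild maildirsize; return (total_bytes, total_count)."""
    # Step 1: if this is a folder, back up to the parent Maildir++.
    while os.path.exists(os.path.join(maildir, "maildirfolder")):
        maildir = os.path.dirname(os.path.abspath(maildir))

    latest = 0.0
    total_bytes = total_count = 0
    dirs = message_dirs(maildir)
    for d in dirs:
        try:
            # Step 2: stat the subdirectory before reading it.
            latest = max(latest, os.stat(d).st_mtime)
            names = os.listdir(d)
        except FileNotFoundError:
            continue
        for name in names:
            # Step 3: prefer the size recorded in the filename.
            match = SIZE_IN_NAME.search(name)
            if match:
                size = int(match.group(1))
            else:
                try:
                    size = os.stat(os.path.join(d, name)).st_size
                except FileNotFoundError:
                    continue  # removed in a race; ignore it
            total_bytes += size
            total_count += 1

    # Step 4: write in tmp, rename into place, then re-check timestamps.
    tmp_name = os.path.join(
        maildir, "tmp", f"{time.time()}.{os.getpid()}.maildirsize")
    with open(tmp_name, "w") as f:
        f.write(f"{quota_line}\n{total_bytes} {total_count}\n")
    final = os.path.join(maildir, "maildirsize")
    os.rename(tmp_name, final)
    for d in dirs:
        try:
            changed = os.stat(d).st_mtime > latest
        except FileNotFoundError:
            continue
        if changed:
            os.remove(final)  # something changed meanwhile; do not trust it
            break

    # Step 5: return the totals even if maildirsize was removed.
    return total_bytes, total_count
```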
312 | ||
313 | Calculating the quota for a Maildir++ | |
314 | ||
315 | This is the procedure for reading the contents of maildirsize for the | |
316 | purpose of determining if the Maildir++ is over quota. | |
317 | 1. If maildirsize does not exist, or if its size is at least 5120 | |
318 | bytes, recalculate it using the procedure defined above, and use | |
319 | the recalculated numbers. Otherwise, read the contents of | |
320 | maildirsize, and add up the totals. | |
321 | 2. The most efficient way of doing this is to: open maildirsize, then | |
322 | start reading it into a 5120 byte buffer (some broken NFS | |
323 | implementations may return less than 5120 bytes read even before | |
324 | reaching the end of the file). If we fill it, which, in most | |
325 | cases, will happen with one read, close it, and run the | |
326 | recalculation procedure. | |
327 | 3. In many cases the quota calculation is for the purpose of adding | |
328 | or removing messages from a Maildir++, so keep the file descriptor | |
329 | to maildirsize open. A file descriptor will not be available if | |
330 | quota recalculation ended up removing maildirsize due to a race | |
331 | condition, so the caller may or may not get a file descriptor | |
332 | together with the Maildir++ size. | |
333 | 4. If the numbers we got indicate that the Maildir++ is over quota, | |
334 | some additional logic is in order: if we did not recalculate | |
335 | maildirsize, and the numbers in maildirsize indicate that we are | |
336 | over quota, then if maildirsize is more than one line long, or if | |
337 | the timestamp on maildirsize indicates that it's at least 15 | |
338 | minutes old, throw out the totals, and recalculate maildirsize | |
339 | from scratch. | |
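The decision of when to recalculate can be condensed into a small predicate. This is a sketch with hypothetical names. One interpretation is assumed and should be checked against your implementation: "more than one line long" is read here as more than one size/count line following the quota definition, since a valid maildirsize always has at least two lines in total.

```python
MAX_MAILDIRSIZE = 5120        # bytes, from step 1
FIFTEEN_MINUTES = 15 * 60     # seconds, from step 4


def needs_recalculation(size_bytes, entry_lines, age_seconds, over_quota):
    """Decide whether maildirsize must be rebuilt from scratch.

    size_bytes   size of maildirsize, or None if the file does not exist
    entry_lines  number of size/count lines after the quota definition
    age_seconds  time since maildirsize was last modified
    over_quota   whether the totals read from maildirsize exceed the quota
    """
    if size_bytes is None or size_bytes >= MAX_MAILDIRSIZE:
        return True
    if over_quota and (entry_lines > 1 or age_seconds >= FIFTEEN_MINUTES):
        return True
    return False
```

Under this reading, a freshly recalculated maildirsize that says the Maildir++ is over quota is trusted for up to fifteen minutes, which keeps an over-quota mailbox from triggering a full rescan on every delivery attempt.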
340 | ||
341 | Eventually the 5120 byte limitation will always cause maildirsize to | |
342 | be recalculated, which will compensate for any race conditions which | |
343 | previously threw off the totals. Each time a message is delivered to or | |
344 | removed from a Maildir++, one line is added to maildirsize (this is | |
345 | described below in greater detail). Most messages are less than 10K | |
346 | long, so each line appended to maildirsize will be between | |
347 | seven and nine bytes long (four bytes for message size, space, digit | |
348 | 1, newline, optional minus sign in front of both counts if the message | |
349 | was removed). At roughly eight bytes per line, that is about 640 Maildir++ operations before a | |
350 | recalculation is forced. Since most messages are added once and | |
351 | removed once from a Maildir, expect recalculation to happen | |
352 | approximately every 320 messages, keeping the overhead of a | |
353 | recalculation to a minimum. Even if most messages include large | |
354 | attachments, most attachments are less than 100K long, which brings | |
355 | down the average recalculation frequency to about 150 messages. | |
356 | ||
357 | Also, the effect of having non-Maildir++ clients accessing the | |
358 | Maildir++ is reduced by forcing a recalculation when we're potentially | |
359 | over quota. Even if non-Maildir++ clients are used to remove messages | |
360 | from the Maildir, the fact that the Maildir++ is still over quota will | |
361 | be verified every 15 minutes. | |
362 | ||
363 | Delivering to a Maildir++ | |
364 | ||
365 | Delivering to a Maildir++ is like delivering to a Maildir, with the | |
366 | following exceptions: | |
367 | 1. Follow the usual Maildir conventions for naming the filename used | |
368 | to store the message, except that you append ,S=nnnnn to the name of | |
369 | the file, where nnnnn is the size of the file. This eliminates the | |
370 | need to stat() most messages when calculating the quota. If the | |
371 | size of the message is not known at the beginning, append ,S=nnnnn | |
372 | when renaming the message from tmp to new. | |
373 | 2. As soon as the size of the message is known (hopefully before it | |
374 | is written into tmp), calculate Maildir++'s quota, using the | |
375 | procedure defined previously. If the Maildir++ is over quota, back | |
376 | out, cleaning up anything that was created in tmp. | |
377 | 3. If a file descriptor to maildirsize was opened for us, after | |
378 | moving the file from tmp to new append a line to the file | |
379 | containing the message size, and "1". | |
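The two Maildir++-specific parts of delivery, naming the file and recording the delivery, might be sketched like this. The function names are hypothetical, and the time.pid.host base name is a simplified stand-in for whatever unique-name scheme your delivery agent already uses. The sketch assumes the maildirsize file descriptor was opened for appending.

```python
import os
import socket
import time


def delivery_name(size, now=None, pid=None, host=None):
    """Build a message filename that ends in ,S=nnnnn (step 1)."""
    now = time.time() if now is None else now
    pid = os.getpid() if pid is None else pid
    host = socket.gethostname() if host is None else host
    return f"{int(now)}.{pid}.{host},S={size}"


def record_delivery(maildirsize_fd, size):
    """Append "size 1" to maildirsize after the move from tmp to new (step 3).

    The descriptor is assumed to have been opened with O_APPEND, so the
    whole line is added at the end of the file in a single write.
    """
    os.write(maildirsize_fd, f"{size} 1\n".encode("ascii"))
```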
380 | ||
381 | Reading from a Maildir++ | |
382 | ||
383 | Maildir++ readers should mind the following additional tasks: | |
384 | 1. Make sure to create the maildirfolder file in any new folders | |
385 | created within the Maildir++. | |
386 | 2. When moving a message to the Trash folder, append a line to | |
387 | maildirsize, containing a negative message size and a '-1'. | |
388 | 3. When moving a message from the Trash folder, follow the steps | |
389 | described in "Delivering to a Maildir++", as far as quota logic | |
390 | goes. That is, refuse to move messages out of Trash if the | |
391 | Maildir++ is over quota. | |
392 | 4. Moving a message between other folders carries no additional | |
393 | requirements. | |
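Moving a message to Trash, as in item 2, could be sketched as follows. The function names are hypothetical. The sketch keeps the message's filename unchanged and places it in the cur subdirectory of .Trash; it only appends to maildirsize when that file already exists, on the assumption that a missing file will be rebuilt by the next recalculation.

```python
import os
import re


def message_size(path):
    """Size from ,S=nnnnn in the filename if present, else from stat()."""
    match = re.search(r",S=(\d+)(?::|$)", os.path.basename(path))
    return int(match.group(1)) if match else os.stat(path).st_size


def move_to_trash(maildir, message_path):
    """Rename the message into .Trash/cur and record its removal."""
    size = message_size(message_path)
    dest = os.path.join(maildir, ".Trash", "cur", os.path.basename(message_path))
    os.rename(message_path, dest)
    try:
        # No O_CREAT: never create a maildirsize that lacks its quota line.
        fd = os.open(os.path.join(maildir, "maildirsize"),
                     os.O_WRONLY | os.O_APPEND)
    except FileNotFoundError:
        return dest
    try:
        # A negative message size and a '-1', as described in item 2.
        os.write(fd, f"-{size} -1\n".encode("ascii"))
    finally:
        os.close(fd)
    return dest
```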
394 |