1 Maildir++
2
3 In this document:
4 * HOWTO.maildirquota
5 * Mission statement
6 * Definitions and goals
7 * Contents of maildirsize
8 * Calculating maildirsize
9 * Calculating the quota for a Maildir++
10 * Delivering to a Maildir++
11 * Reading from a Maildir++
12 * Bugs
13
14HOWTO.maildirquota
15
16 The remaining portion of this document is a technical description of
17 the maildir quota extension. This section is a brief overview of this
18 extension.
19
20 What is a maildirquota?
21
22 If you would like to have a quota on your maildir mailboxes, the best
23 solution is to always use filesystem-based quotas: per-user usage
24 quotas that are enforced by the operating system.
25
26 This is the best solution when the default Maildir is located in each
27 account's home directory. This solution will NOT work if Maildirs are
28 stored elsewhere, or if you have a large virtual domain setup where a
29 single userid is used to hold many individual Maildirs, one for each
30 virtual user.
31
32 This extension to the maildir format allows a "voluntary" maildir
33 quota implementation that does not rely on filesystem-based quotas.
34
35 When maildirquota will not work.
36
37 For this quota mechanism to work, all software that accesses a maildir
38 must observe this quota protocol. It follows that this quota mechanism
39 can be easily circumvented if users have direct (shell) access to the
40 filesystem containing the users' maildirs.
41
42 Furthermore, this quota mechanism is not 100% effective. It is
43 possible to have a situation where someone may go over quota. This
44 quota implementation makes a deliberate trade-off. It is necessary to
45 use some form of locking in order to have completely bulletproof quota
46 enforcement, but maildir mail stores were explicitly designed to
47 avoid any kind of locking. This quota approach does not use locking,
48 and the trade-off is that sometimes it is possible for a few extra
49 messages to be delivered to the maildir before the door is
50 permanently shut.
51
52 For best performance, all maildir clients should support this quota
53 extension; however, there's a wide degree of tolerance here. As long as
54 the mail delivery agent that puts new messages into a Maildir uses
55 this extension, the quota will be enforced without excessive
56 degradation.
57
58 In the worst case scenario, quotas are automatically recalculated
59 every fifteen minutes. If a maildir goes over quota, and a mail client
60 that does not support this quota extension removes enough mail from
61 the maildir, the mail delivery agent will not be immediately informed
62 that the maildir is now under quota. However, eventually the correct
63 quota will be recalculated and mail delivery will resume.
64
65 Mail user agents sometimes put messages into the maildir themselves.
66 Messages added to a maildir by a mail user agent that does not
67 understand the quota extension will not be immediately counted towards
68 the overall quota, and may not be counted for an extended period of
69 time. Additionally, if there are a lot of messages that have been
70 added to a maildir from these mail user agents, quota recalculation
71 may impose non-trivial load on the system, as the quota recalculator
72 will have to issue the stat system call for each message.
73
74 How to implement the quota
75
76 The best way to do that is to modify your mail server to implement the
77 protocol defined by this document. Not everyone, of course, has this
78 ability. Therefore, an alternate approach is available.
79
80 This package creates a very short utility called "deliverquota". It
81 will NOT be installed anywhere by default, unless this maildir quota
82 implementation is a part of a larger package, in which case the parent
83 package may install this utility somewhere. If you obtained the
84 maildir package separately, you will need to compile it by running the
85 configure script, then by running make.
86
87 deliverquota takes two arguments. deliverquota reads the message from
88 standard input, then delivers it to the maildir specified by the first
89 argument to deliverquota. The second argument specifies the actual
90 quota for this maildir, as defined elsewhere in this document.
91 deliverquota will deliver the message to the maildir, making a best
92 effort not to exceed the stated quota. If the maildir is over quota,
93 deliverquota terminates with exit code 77. Otherwise, it delivers the
94 message, updates the quota, and terminates with exit code 0.
95
96 Therefore, proceed as follows:
97 * Copy deliverquota to some convenient location, say /usr/local/bin.
98 * Configure your mail server to use deliverquota. For example, if
99 you use Qmail and your maildirs are all located in $HOME/Maildir,
100 replace the './Maildir/' argument to qmail-start with the
101 following:
102'| /usr/local/bin/deliverquota ./Maildir 1000000S'
103
104
105
106
107 This sets a one million byte limit on all Maildirs. As I
108 mentioned, this is meaningless if login access is available,
109 because the individual account owner can create his own
110 $HOME/.qmail file, and ignore deliverquota. Note that in this
111 case, you MUST use apostrophes on the qmail-start command line, in
112 order to quote this as one argument.
113
114 If you would like to use different quotas for different users, you
115 will have to put together a separate process or a script that looks up
116 the appropriate quota for the recipient, and runs deliverquota
117 specifying the quota. If no login access to the mail server is
118 available, you can simply create a separate $HOME/.qmail for every
119 recipient.
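
A per-recipient wrapper of the kind described above might be sketched as follows, in Python. The quota table, the fallback quota, and the function names are hypothetical and exist only for illustration; the argument order (maildir first, quota second) and the exit codes 0 and 77 are taken from the description of deliverquota above.

```python
import subprocess

# Hypothetical per-recipient quota table; a real setup would read this
# from a file or a database.
QUOTAS = {"alice": "10000000S,1000C", "bob": "5000000S"}
DEFAULT_QUOTA = "1000000S"  # assumed fallback, not part of the specification

DELIVERQUOTA = "/usr/local/bin/deliverquota"


def build_command(recipient, maildir):
    """Return the argv list for deliverquota: maildir first, quota second."""
    quota = QUOTAS.get(recipient, DEFAULT_QUOTA)
    return [DELIVERQUOTA, maildir, quota]


def deliver(recipient, maildir, message_bytes):
    """Pipe the message to deliverquota on stdin and return its exit code.

    Per the text above, 0 means delivered and 77 means over quota.
    """
    result = subprocess.run(build_command(recipient, maildir), input=message_bytes)
    return result.returncode
```

The mail server would then invoke this wrapper instead of deliverquota itself, and pass the exit code through unchanged.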
120
121 That's pretty much it. If you handle a moderate amount of mail, I have
122 one more suggestion. For the first couple of weeks, run deliverquota
123 setting the second argument to an empty string. This disables quota
124 enforcement; however, it still activates certain optimizations that
125 permit very fast quota recalculation. Messages delivered by
126 deliverquota have their message size encoded in their filename; this
127 makes it possible to avoid stat-ing the message in the Maildir, when
128 recalculating the quota. Then, after most messages in your maildirs
129 have been delivered by deliverquota, activate the quotas!!!
130
131 maildirquota-enhanced applications
132
133 This is a list of applications that have been enhanced to support the
134 maildirquota extension:
135 * maildrop - mail delivery agent/mail filter.
136 * SqWebmail - webmail CGI binary.
137
138 These applications fall into two classes:
139 * Mail delivery agents. These applications read some externally
140 defined table of mail recipients and their maildir quota.
141 * Mail clients. These applications read maildir quota information
142 that has been defined by the mail delivery agent.
143
144 Mail clients generally do not need any additional setup in order to
145 use the maildirquota extension. They will automatically read and
146 implement any quota specification set by the mail delivery agent.
147
148 On the other hand, mail delivery agents will require some kind of
149 configuration in order to activate the maildirquota extension for some
150 or all recipients. The instructions for doing that depend upon the
151 mail delivery agent. The documentation for the mail delivery agent
152 should be consulted for additional information.
153 _________________________________________________________________
154
155Mission statement
156
157 Maildir++ is a mail storage structure that's based on the Maildir
158 structure, first used in the Qmail mail server. Actually, Maildir++ is
159 just a minor extension to the standard Maildir structure.
160
161 For more information, see http://www.qmail.org/man/man5/maildir.html.
162 I am not going to include the definition of a Maildir in this
163 document. Consider it included right here. This document only
164 describes the differences.
165
166 Maildir++ adds a couple of things to a standard Maildir: folders and
167 quotas.
168
169 Quotas enforce a maximum allowable size of a Maildir. In many
170 situations, using the quota mechanism of the underlying filesystem
171 won't work very well. If a filesystem quota mechanism is used, then
172 when a Maildir goes over quota, Qmail does not bounce additional mail,
173 but keeps it queued, changing one bad situation into another bad
174 situation. Not only do you now have an account that's backed up, but
175 your queue starts to back up too.
176
177Definitions and goals
178
179 Maildir++ and Maildir shall be completely interchangeable. A Maildir++
180 client will be able to use a standard Maildir, automatically
181 "upgrading" it in the process. A Maildir client will be able to use a
182 Maildir++ just like a regular Maildir. Of course, a plain Maildir
183 client won't be able to enforce a quota, and won't be able to access
184 messages stored in folders.
185
186 Folders are created as subdirectories under the main Maildir. The name
187 of the subdirectory always starts with a period. For example, a folder
188 named "Important" will be a subdirectory called ".Important". You
189 can't have subdirectories that start with two periods.
190
191 A Maildir++ client ignores anything in the main Maildir that starts
192 with a period, but is not a subdirectory.
193
194 Each subdirectory is a fully-fledged Maildir of its own; that is, you
195 have .Important/tmp, .Important/new, and .Important/cur. Everything
196 that applies to the main Maildir applies equally well to the
197 subdirectory, including automatically cleaning up old files in tmp. A
198 Maildir++ enhancement is that a message can be moved between folders
199 and/or the main Maildir simply by moving/renaming the file (into the
200 cur subdirectory of the destination folder). Therefore, the entire
201 Maildir++ must reside on the same filesystem.
202
203 Within each subdirectory there's an empty file, maildirfolder. Its
204 existence tells the mail delivery agent that this Maildir is really
205 a folder underneath a parent Maildir++.
206
207 Only one special folder is reserved: Trash (subdirectory .Trash).
208 Instead of marking deleted messages with the D flag, Maildir++ clients
209 move the message into the Trash folder. Maildir++ readers are
210 responsible for expunging messages from Trash after a system-defined
211 retention interval.
212
213 When a Maildir++ reader sees a message marked with a D flag it may at
214 its option: remove the message immediately, move it into Trash, or
215 ignore it.
216
217 Can folders have subfolders, defined in a recursive fashion? The
218 answer is no. If you want to have a client with a hierarchy of
219 folders, emulate it. Pick a hierarchy separator character, say ":".
220 Then, folder foo/bar is subdirectory .foo:bar.
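
The emulation described above can be sketched in a few lines of Python. The function name and the choice of "/" as the client-side path separator are assumptions made for illustration; only the leading period and the ":" separator come from the text.

```python
SEPARATOR = ":"  # the separator suggested in the text; any agreed character works


def folder_to_subdir(folder_path):
    """Map a client-side folder path such as 'foo/bar' to its subdirectory name."""
    parts = [part for part in folder_path.split("/") if part]
    if not parts:
        raise ValueError("empty folder name")
    if parts[0].startswith("."):
        # The result would start with two periods, which is not allowed.
        raise ValueError("folder name may not start with a period")
    return "." + SEPARATOR.join(parts)
```

With this mapping, folder_to_subdir("foo/bar") gives ".foo:bar", a single flat subdirectory of the main Maildir.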
221
222 That is all there is to say about folders. The rest of this
223 document deals with quotas.
224
225 The purpose of quotas is to temporarily disable a Maildir, if it goes
226 over the quota. There is one and only one major goal that this quota
227 implementation tries to achieve:
228 * Place as little overhead as possible on the mail system that's
229 delivering to the Maildir++
230
231 That's it. To achieve that goal, certain compromises are made:
232 * Mail delivery will stop as soon as possible after Maildir++'s size
233 goes over quota. Certain race conditions may happen with Maildir++
234 going a lot over quota, in rare circumstances. That is taken into
235 account, and the situation will eventually resolve itself, but you
236 should not simply take your systemwide quota, multiply it by the
237 number of mail accounts, and allocate that much disk space. Always
238 leave room to spare.
239 * How well the quota mechanism will work will depend on whether or
240 not everything that accesses the Maildir++ is a Maildir++ client.
241 You can have a transition period where some of your mail clients
242 are just Maildir clients, and things should run more or less well.
243 There will be some additional load because the size of the Maildir
244 will be recalculated more often, but the additional load shouldn't
245 be noticeable.
246
247 This won't be a perfect solution, but it will hopefully be good
248 enough. Maildirs are simply designed to rely on the filesystem to
249 enforce individual quotas. If a filesystem-based quota works for you,
250 use it.
251
252 A Maildir++ may contain the following additional file: maildirsize.
253
254Contents of maildirsize
255
256 maildirsize contains two or more lines terminated by newline
257 characters.
258
259 The first line contains a copy of the quota definition as used by the
260 system's mail server. Each application that uses the maildir must know
261 what its quota is. Instead of configuring each application with the
262 quota logic, and making sure that every application's quota definition
263 for the same maildir is exactly the same, the quota specification used
264 by the system mail server is saved as the first line of the
265 maildirsize file. All other applications that enforce the maildir quota
266 simply read the first line of maildirsize.
267
268 The quota definition is a list, separated by commas. Each member of the
269 list consists of an integer followed by a letter, specifying the
270 nature of the quota. Currently defined quota types are 'S' - total
271 size of all messages, and 'C' - the maximum count of messages in the
272 maildir. For example, 10000000S,1000C specifies a quota of 10,000,000
273 bytes or 1,000 messages, whichever comes first.
274
275 All remaining lines contain two integers separated by a single
276 space. The first integer is interpreted as a byte count. The second
277 integer is interpreted as a file count. A Maildir++ writer can add up
278 all byte counts and file counts from maildirsize and enforce a quota
279 based either on number of messages or the total size of all the
280 messages.
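
As an illustration of the layout just described, the following Python sketch parses the text of a maildirsize file. The function names are hypothetical, and silently skipping malformed lines is an assumption; the text above does not say how such lines should be handled.

```python
def parse_quota(definition):
    """Parse a quota definition such as '10000000S,1000C' into {letter: limit}."""
    limits = {}
    for member in definition.strip().split(","):
        if not member:
            continue  # an empty definition means no limits
        limits[member[-1]] = int(member[:-1])
    return limits


def parse_maildirsize(text):
    """Return (limits, total_bytes, total_count) for the contents of maildirsize."""
    lines = text.splitlines()
    limits = parse_quota(lines[0]) if lines else {}
    total_bytes = 0
    total_count = 0
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != 2:
            continue  # assumption: ignore malformed lines
        total_bytes += int(fields[0])
        total_count += int(fields[1])
    return limits, total_bytes, total_count
```

For example, a file containing the quota line 10000000S,1000C followed by the lines "5000 2", "1200 1" and "-1200 -1" adds up to 5000 bytes in 2 messages.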
281
282Calculating maildirsize
283
284 In most cases, changes to maildirsize are recorded by appending an
285 additional line. Under some conditions maildirsize has to be
286 recalculated from scratch. These conditions are defined later. This is
287 the procedure that's used to recalculate maildirsize:
288 1. If we find a maildirfolder within the directory, we're delivering
289 to a folder, so back up to the parent directory, and start again.
290 2. Read the contents of the new and cur subdirectories. Also, read
291 the contents of the new and cur subdirectories in each Maildir++
292 folder, except Trash. Before reading each subdirectory, stat() the
293 subdirectory itself, and keep track of the latest timestamp you
294 get.
295 3. If the filename of each message is of the form xxxxx,S=nnnnn or
296 xxxxx,S=nnnnn:xxxxx where "xxxxx" represents arbitrary text, then
297 use nnnnn as the size of the file (which will be conveniently
298 recorded in the filename by a Maildir++ writer, within the
299 conventions of filename naming in a Maildir). If the message was
300 not written by a Maildir++ writer, stat() it to obtain the message
301 size. If stat() fails, a race condition removed the file, so just
302 ignore it and move on to the next one.
303 4. When done, you have the grand total of the number of messages and
304 their total size. Create a new maildirsize by: creating the file
305 in the tmp subdirectory, observing the conventions for writing to
306 a Maildir. Then rename the file as maildirsize. Afterwards, stat
307 all new and cur subdirectories again. If you find a timestamp
308 later than the saved timestamp, REMOVE maildirsize.
309 5. Before running this calculation procedure, the Maildir++ user
310 wanted to know the size of the Maildir++, so return the calculated
311 values. This is done even if maildirsize was removed.
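
Steps 2 and 3 of the procedure above could look roughly like this in Python. This is a sketch under stated assumptions, not a reference implementation: the function names are hypothetical, step 1 (backing up to the parent when maildirfolder is present) and step 4 (writing the new file through tmp and re-checking timestamps) are left out, and only the folder named exactly .Trash is excluded.

```python
import os
import re

# Matches ,S=nnnnn at the end of the name or immediately before a ":".
SIZE_IN_NAME = re.compile(r",S=(\d+)(?=:|$)")


def message_size(directory, filename):
    """Size from the ,S=nnnnn field if present, else stat(); None if the file is gone."""
    match = SIZE_IN_NAME.search(filename)
    if match:
        return int(match.group(1))
    try:
        return os.stat(os.path.join(directory, filename)).st_size
    except FileNotFoundError:
        return None  # removed by a concurrent client; ignore it


def scan_maildir(maildir):
    """Return (total_bytes, total_count, latest_mtime) over new and cur."""
    roots = [maildir]
    for entry in sorted(os.listdir(maildir)):
        path = os.path.join(maildir, entry)
        # Folders are subdirectories whose names start with a period.
        if entry.startswith(".") and entry != ".Trash" and os.path.isdir(path):
            roots.append(path)
    total_bytes = 0
    total_count = 0
    latest = 0.0
    for root in roots:
        for sub in ("new", "cur"):
            directory = os.path.join(root, sub)
            try:
                # stat the subdirectory before reading it, as step 2 requires
                latest = max(latest, os.stat(directory).st_mtime)
                names = os.listdir(directory)
            except FileNotFoundError:
                continue
            for name in names:
                size = message_size(directory, name)
                if size is not None:
                    total_bytes += size
                    total_count += 1
    return total_bytes, total_count, latest
```

The returned timestamp is the value that step 4 compares against a second round of stat calls to detect a concurrent change.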
312
313Calculating the quota for a Maildir++
314
315 This is the procedure for reading the contents of maildirsize for the
316 purpose of determining whether the Maildir++ is over quota.
317 1. If maildirsize does not exist, or if its size is at least 5120
318 bytes, recalculate it using the procedure defined above, and use
319 the recalculated numbers. Otherwise, read the contents of
320 maildirsize, and add up the totals.
321 2. The most efficient way of doing this is to: open maildirsize, then
322 start reading it into a 5120 byte buffer (some broken NFS
323 implementations may return less than 5120 bytes read even before
324 reaching the end of the file). If we fill it, which, in most
325 cases, will happen with one read, close it, and run the
326 recalculation procedure.
327 3. In many cases the quota calculation is for the purpose of adding
328 or removing messages from a Maildir++, so keep the file descriptor
329 to maildirsize open. A file descriptor will not be available if
330 quota recalculation ended up removing maildirsize due to a race
331 condition, so the caller may or may not get a file descriptor
332 together with the Maildir++ size.
333 4. If the numbers we got indicated that the Maildir++ is over quota,
334 some additional logic is in order: if we did not recalculate
335 maildirsize, if the numbers in maildirsize indicated that we are
336 over quota, then if maildirsize was more than one line long, or if
337 the timestamp on maildirsize indicated that it's at least 15
338 minutes old, throw out the totals, and recalculate maildirsize
339 from scratch.
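
The decisions in steps 1 and 4 can be condensed into a few predicates. The Python sketch below rests on three assumptions that the text does not settle: "over quota" is taken to mean strictly greater than the limit, a quota type missing from the definition means no limit of that type, and "more than one line long" is read as more than one size line after the quota definition. All names are hypothetical.

```python
MAX_MAILDIRSIZE = 5120  # bytes; at or above this size, always recalculate
MAX_AGE = 15 * 60       # seconds


def must_recalculate_first(file_size):
    """Step 1: a missing file (None) or one of at least 5120 bytes forces recalculation."""
    return file_size is None or file_size >= MAX_MAILDIRSIZE


def is_over_quota(limits, total_bytes, total_count):
    """Compare totals with the 'S' (bytes) and 'C' (count) limits, if defined."""
    size_limit = limits.get("S")
    count_limit = limits.get("C")
    over_size = size_limit is not None and total_bytes > size_limit
    over_count = count_limit is not None and total_count > count_limit
    return over_size or over_count


def should_distrust_totals(over_quota, size_line_count, age_seconds):
    """Step 4: totals read from maildirsize that say 'over quota' are thrown out
    when the file holds more than one size line or is at least 15 minutes old."""
    if not over_quota:
        return False
    return size_line_count > 1 or age_seconds >= MAX_AGE
```

Only totals that were read from the file go through the step 4 check; totals that come straight from a recalculation are used as they are.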
340
341 Eventually the 5120 byte limitation will always cause maildirsize to
342 be recalculated, which will compensate for any race conditions which
343 previously threw off the totals. Each time a message is delivered or
344 removed from a Maildir++, one line is added to maildirsize (this is
345 described below in greater detail). Most messages are less than 10K
346 long, so each line appended to maildirsize will be between
347 seven and nine bytes long (four bytes for message size, space, digit
348 1, newline, optional minus sign in front of both counts if the message
349 was removed). This results in about 640 Maildir++ operations before a
350 recalculation is forced. Since most messages are added once and
351 removed once from a Maildir, expect recalculation to happen
352 approximately every 320 messages, keeping the overhead of a
353 recalculation to a minimum. Even if most messages include large
354 attachments, most attachments are less than 100K long, which brings
355 down the average recalculation frequency to about 150 messages.
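
The first estimate above (messages under 10K) can be reproduced with a little arithmetic. The Python sketch below is only a worked example; the function names are hypothetical, and the average line length of eight bytes is simply the midpoint of the seven-to-nine byte range.

```python
RECALC_THRESHOLD = 5120  # bytes, from the procedure above


def line_length(size, removed=False):
    """Bytes in one maildirsize line: size digits, space, '1', newline,
    plus a minus sign in front of each number when the message was removed."""
    line = f"-{size} -1\n" if removed else f"{size} 1\n"
    return len(line)


def operations_before_recalc(average_line_bytes):
    """How many appended lines fit before the file reaches the threshold."""
    return RECALC_THRESHOLD // average_line_bytes


shortest = line_length(9999)               # four-digit size, delivered: 7 bytes
longest = line_length(9999, removed=True)  # four-digit size, removed: 9 bytes
operations = operations_before_recalc((shortest + longest) // 2)
messages = operations // 2  # each message is added once and removed once
```

With these inputs, operations comes out at 640 and messages at 320, matching the figures in the text.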
356
357 Also, the effect of having non-Maildir++ clients accessing the
358 Maildir++ is reduced by forcing a recalculation when we're potentially
359 over quota. Even if non-Maildir++ clients are used to remove messages
360 from the Maildir, the fact that the Maildir++ is still over quota will
361 be verified every 15 minutes.
362
363Delivering to a Maildir++
364
365 Delivering to a Maildir++ is like delivering to a Maildir, with the
366 following exceptions:
367 1. Follow the usual Maildir conventions for naming the filename used
368 to store the message, except that you append ,S=nnnnn to the name of
369 the file, where nnnnn is the size of the file. This eliminates the
370 need to stat() most messages when calculating the quota. If the
371 size of the message is not known at the beginning, append ,S=nnnnn
372 when renaming the message from tmp to new.
373 2. As soon as the size of the message is known (hopefully before it
374 is written into tmp), calculate Maildir++'s quota, using the
375 procedure defined previously. If the message is over quota, back
376 out, cleaning up anything that was created in tmp.
377 3. If a file descriptor to maildirsize was opened for us, after
378 moving the file from tmp to new append a line to the file
379 containing the message size, and "1".
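
A minimal Python sketch of the three delivery steps follows. The function names are hypothetical, and the quota check assumes that the incoming message is counted before comparing with the limits, which the text does not spell out.

```python
import os


def with_size(unique_name, size):
    """Step 1: append ,S=nnnnn to the usual Maildir filename."""
    return f"{unique_name},S={size}"


def would_exceed(limits, total_bytes, total_count, size):
    """Step 2: would delivering one message of this size go over quota?

    limits maps 'S' and 'C' to integers; a missing key means no limit
    (an assumption for this sketch).
    """
    size_limit = limits.get("S")
    count_limit = limits.get("C")
    if size_limit is not None and total_bytes + size > size_limit:
        return True
    if count_limit is not None and total_count + 1 > count_limit:
        return True
    return False


def append_delivery_record(fd, size):
    """Step 3: append '<size> 1' to maildirsize through an open descriptor.

    The descriptor is assumed to have been opened with O_APPEND.
    """
    os.write(fd, f"{size} 1\n".encode("ascii"))
```

A caller would compute the size, call would_exceed with the current totals, and only then move the file from tmp to new and append the record.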
380
381Reading from a Maildir++
382
383 Maildir++ readers should mind the following additional tasks:
384 1. Make sure to create the maildirfolder file in any new folders
385 created within the Maildir++.
386 2. When moving a message to the Trash folder, append a line to
387 maildirsize, containing a negative message size and a '-1'.
388 3. When moving a message from the Trash folder, follow the steps
389 described in "Delivering to a Maildir++", as far as quota logic
390 goes. That is, refuse to move messages out of Trash if the
391 Maildir++ is over quota.
392 4. Moving a message between other folders carries no additional
393 requirements.
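
Items 2 to 4 above reduce to a small rule about which line, if any, a reader appends to maildirsize when moving a message. The Python sketch below uses hypothetical names and identifies folders by plain name, with "Trash" standing for the reserved folder. It assumes the quota check for moves out of Trash has already been made by the caller.

```python
TRASH = "Trash"


def record_for_move(source_folder, destination_folder, size):
    """Return the line to append to maildirsize for a move, or None.

    Into Trash: negative size and -1. Out of Trash: treated like a
    delivery, so positive size and 1. Any other move: nothing to record.
    """
    if source_folder == destination_folder:
        return None
    if destination_folder == TRASH:
        return f"-{size} -1\n"
    if source_folder == TRASH:
        return f"{size} 1\n"
    return None
```

For instance, moving a 2048-byte message from Important to Trash yields the line "-2048 -1", while moving it between Important and another ordinary folder yields nothing.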
394