$Cambridge: exim/doc/doc-txt/dbm.discuss.txt,v 1.1 2004/10/07 15:04:35 ph10 Exp $

DBM Libraries for use with Exim
-------------------------------

Background
----------

Exim uses direct-access (so-called "dbm") files for a number of different
purposes. These are files arranged so that the data they contain is indexed and
can quickly be looked up by quoting an appropriate key. They are used as
follows:

1. Exim keeps its "hints" databases in dbm files.

2. The configuration can specify that certain things (e.g. aliases) be looked
   up in dbm files.

3. The configuration can contain expansion strings that involve lookups in dbm
   files.

4. The filter commands "mail" and "vacation" have a facility for replying only
   once to each incoming address. The record of which addresses have already
   received replies may be kept in a dbm file, depending on the configuration
   option once_file_size.

The runtime configuration can be set up without specifying 2 or 3, but Exim
always requires the availability of a dbm library, for 1 (and 4 if configured
that way).


DBM Libraries
-------------

The original library that provided the dbm facility in Unix was called "dbm".
This seems to have been superseded quite some time ago by a new version called
"ndbm" which permits several dbm files to be open at once. Several operating
systems, including those from Sun, contain ndbm as standard.

A number of alternative libraries also exist, the most common of which seems to
be Berkeley DB (just called DB hereinafter). Release 1.85 was around for
some time, and various releases 2.x began to appear towards the end of 1997. In
November 1999, version 3.0 was released, and the ending of support for 2.7.7,
the last 2.x release, was announced for November 2000. (Support for 1.85 has
already ceased.) There were further 3.x releases, but by the end of 2001, the
current release was 4.0.14.

There are major differences in implementation and interface between the DB 1.x
and 2.x/3.x/4.x releases, and they are best considered as two independent dbm
libraries. Changes to the API were made for 3.0 and again for 3.1.

Another DBM library is the GNU library, gdbm, though this does not seem to be
very widespread.

Yet another dbm library is tdb (Trivial Data Base) which has come out of the
Samba project. The first releases seem to have been in mid-2000.

Some older Linux releases contain gdbm as standard, while others contain no dbm
library. More recent releases contain DB 1.85 or 2.x or later, and presumably
will track the development of the DB library. Some BSD versions of Unix include
DB 1.85 or later. All of the non-ndbm libraries except tdb contain
compatibility interfaces so that programs written to call the ndbm functions
should, in theory, work with them, but there are some potential pitfalls which
have caught out Exim users in the past.

Exim has been tested with ndbm, gdbm, DB 1.85, DB 2.x, DB 3.1, DB 4.0.14, and
tdb 1.0.2, in various different modes in some cases, and is believed to work
with all of them if it and they are properly configured.

I have considered the possibility of calling different dbm libraries for
different functions from a single Exim binary. However, because all bar one of
the libraries provide ndbm compatibility interfaces (and therefore the same
function names) it would require a lot of complicated, error-prone trickery to
achieve this. Exim therefore uses a single library for all its dbm activities.

However, Exim does also support cdb (Constant Data Base), an efficient file
arrangement for indexed data that does not change incrementally (for example,
alias files). This is independent of any dbm library and can be used alongside
any of them.


Locking
-------

The configuration option EXIMDB_LOCK_TIMEOUT controls how long Exim waits to
get a lock on a hints database. From version 1.80 onwards, Exim does not
attempt to take out a lock on an actual database file (this caused problems in
the past). Instead, it takes out an fcntl() lock on a separate file whose name
ends in ".lockfile". This ensures that Exim has exclusive access to the
database before even attempting to open it. Exim creates the lock file the
first time it needs it. It should never be removed.
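
The scheme described above can be sketched in C as follows. This is only an
illustration under stated assumptions, not Exim's own code: the function names
lock_hints_db() and unlock_hints_db() are hypothetical, and the once-a-second
polling loop is an assumed way of honouring a timeout. Only fcntl() and the
".lockfile" suffix come from the text.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not Exim's code: take an exclusive fcntl() lock on
   "<db_path>.lockfile", retrying once a second for up to timeout_secs.
   Returns an open descriptor that holds the lock, or -1 on error/timeout. */
int lock_hints_db(const char *db_path, int timeout_secs)
{
  char lockname[1024];
  struct flock fl;
  int fd, n, waited;

  assert(db_path != NULL);
  n = snprintf(lockname, sizeof(lockname), "%s.lockfile", db_path);
  if (n < 0 || n >= (int)sizeof(lockname)) return -1;   /* name too long */

  /* Create the lock file on first use; it is never removed afterwards. */
  fd = open(lockname, O_RDWR | O_CREAT, 0640);
  if (fd < 0) return -1;

  memset(&fl, 0, sizeof(fl));
  fl.l_type = F_WRLCK;          /* exclusive lock */
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;                 /* zero length means the whole file */

  for (waited = 0; ; waited++)
    {
    if (fcntl(fd, F_SETLK, &fl) == 0) return fd;        /* lock obtained */
    if ((errno != EAGAIN && errno != EACCES) || waited >= timeout_secs)
      break;                    /* hard error, or the timeout has expired */
    sleep(1);
    }

  close(fd);
  return -1;
}

/* Closing the descriptor releases the lock; the lock file itself stays. */
int unlock_hints_db(int fd)
{
  return close(fd);
}
```

A caller would obtain the descriptor first, open the database itself only once
the lock is held, and call unlock_hints_db() after closing the database.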


Main Pitfall
------------

The OS-specific configuration files that are used to build Exim specify the use
of Berkeley DB on those systems where it is known to be standard. In the
absence of any special configuration options, Exim uses the ndbm set of
functions to control its dbm databases. This should work with any of the dbm
libraries because those that are not ndbm have compatibility interfaces.
However, there is one awful pitfall:

Exim #includes a header file called ndbm.h which defines the functions and the
interface data block. gdbm and DB 1.x provide their own versions of this header
file; later DB versions do not. If it should happen that the wrong version of
ndbm.h is seen by Exim, it may compile without error, but fail to operate
correctly at runtime.

This situation can easily arise when more than one dbm library is installed on
a single host. For example, if you decide to use DB 1.x on a system where gdbm
is the standard library, unless you are careful in setting up the include
directories for Exim, it may see gdbm's ndbm.h file instead of DB's. The
situation is even worse with later versions of DB, which do not provide an
ndbm.h file at all.

One way out of this for gdbm and any of the versions of DB is to configure Exim
to call the DBM library in its native mode instead of via the ndbm
compatibility interface, thus avoiding the use of ndbm.h. This is done by
setting the USE_DB configuration option if you are using Berkeley DB, or
USE_GDBM if you are using gdbm. This is the recommended approach.
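
The choice described above is made at compile time. The following C sketch
shows one way a source file could key its choice of header on USE_DB and
USE_GDBM. Those two option names come from the text; the macro DBM_HEADER and
both functions are hypothetical, and the way Exim's own sources test the
options is an assumption. The sketch records only the header's name rather
than including it, so it compiles on a host with none of the libraries.

```c
#include <assert.h>
#include <string.h>

/* USE_DB and USE_GDBM are the build options named in the text. Everything
   else here is illustrative and is not taken from Exim's sources. */
#if defined(USE_DB)
  #define DBM_HEADER "db.h"      /* Berkeley DB, native interface */
#elif defined(USE_GDBM)
  #define DBM_HEADER "gdbm.h"    /* gdbm, native interface */
#else
  #define DBM_HEADER "ndbm.h"    /* default: ndbm compatibility interface */
#endif

/* Hypothetical helper: name the header that this build would include. */
const char *dbm_header_in_use(void)
{
  assert(sizeof(DBM_HEADER) > 1);        /* some header name was selected */
  return DBM_HEADER;
}

/* Returns 1 if this build depends on the compiler finding the right ndbm.h,
   which is the situation the pitfall above is about, and 0 otherwise. */
int needs_ndbm_header(void)
{
  return strcmp(DBM_HEADER, "ndbm.h") == 0;
}
```

Compiled with neither option defined, the sketch reports ndbm.h; compiling
with -DUSE_DB or -DUSE_GDBM takes ndbm.h out of the picture altogether.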


NDBM
----

The ndbm library holds its data in two files, with extensions .dir and .pag.
This makes atomic updating of, for example, alias files, difficult, because
simple renaming cannot be used without some risk. However, if your system has
ndbm installed, Exim should compile and run without any problems.
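
The renaming risk mentioned above can be made concrete with a short C sketch.
The function replace_ndbm_pair() and its helper are hypothetical and are not
part of Exim or of ndbm; the sketch assumes a new database has already been
built under a temporary base name and must now replace the live one.

```c
#include <assert.h>
#include <stdio.h>

/* Build "<base><ext>" in buf. Returns 0 on success, -1 if it will not fit. */
static int make_name(char *buf, size_t size, const char *base, const char *ext)
{
  int n = snprintf(buf, size, "%s%s", base, ext);
  return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: move "<newbase>.dir" and "<newbase>.pag" over
   "<base>.dir" and "<base>.pag". Returns 0 on success, -1 on failure. */
int replace_ndbm_pair(const char *newbase, const char *base)
{
  char from[1024], to[1024];

  assert(newbase != NULL && base != NULL);

  if (make_name(from, sizeof(from), newbase, ".dir") != 0 ||
      make_name(to, sizeof(to), base, ".dir") != 0 ||
      rename(from, to) != 0)
    return -1;

  /* Window of risk: a process that opens the database at this point sees the
     new .dir file alongside the old .pag file. Each rename() is atomic on
     its own, but nothing makes the two renames atomic as a pair. */

  if (make_name(from, sizeof(from), newbase, ".pag") != 0 ||
      make_name(to, sizeof(to), base, ".pag") != 0 ||
      rename(from, to) != 0)
    return -1;           /* worse still: the pair is now left mismatched */

  return 0;
}
```

With a library that keeps everything in one file, the same update needs a
single rename() and has no such window.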


GDBM
----

The gdbm library, when called via the ndbm compatibility interface, makes two
hard links to a single file, with extensions .dir and .pag. As mentioned above,
gdbm provides its own version of the ndbm.h header, and you must ensure that
this is seen by Exim rather than any other version. This is not likely to be a
problem if gdbm is the only dbm library on your system.

If gdbm is called via the native interface (by setting USE_GDBM in your
Local/Makefile), it uses a single file, with no extension on the name, and the
ndbm.h header is not required.

The gdbm library does its own locking of the single file that it uses. From
version 1.80 onwards, Exim locks on an entirely separate file before accessing
a hints database, so gdbm's locking should always succeed.


Berkeley DB 1.8x
----------------

1.85 was the most widespread DB 1.x release; there is also a 1.86 bug-fix
release, but the belief is that the bugs it fixes will not affect Exim.
However, maintenance for 1.x releases has been phased out.

This dbm library can be called by Exim in one of two ways: via the ndbm
compatibility interface, or via its own native interface. There are two
advantages to doing the latter: (1) you don't run the risk of Exim's seeing the
"wrong" version of the ndbm.h header, as described above, and (2) the
performance is better. It is therefore recommended that you set USE_DB=yes in
an appropriate Local/Makefile-xxx file. (If you are compiling for just one OS,
it can go in Local/Makefile itself.)

When called via the compatibility interface, DB 1.x creates a single file with
a .db extension. When called via its native interface, no extension is added to
the file name handed to it.

DB 1.x does not do any locking of its own.


Berkeley DB 2.x
---------------

DB 2.x was released in 1997. It is a major re-implementation and its native
interface is incompatible with DB 1.x, though a compatibility interface was
introduced in DB 2.1.0, and there is also an ndbm.h compatibility interface.

Like 1.x, it can be called from Exim via the ndbm compatibility interface or
via its native interface, and once again setting USE_DB in order to get the
native interface is recommended. If USE_DB is *not* set, then you will have to
provide a suitable version of ndbm.h, because one does not come with the DB 2.x
distribution. A suitable version is:

  /*************************************************
  *            ndbm.h header for DB 2.x            *
  *************************************************/

  /* This header should replace any other version of ndbm.h when Berkeley DB
  version 2.x is in use via the ndbm compatibility interface. Otherwise, any
  extant version of ndbm.h may cause programs to misbehave. There doesn't seem
  to be a version of ndbm.h supplied with DB 2.x, so I made this for myself.

  Philip Hazel 12/Jun/97
  */

  #define DB_DBM_HSEARCH
  #include <db.h>

  /* End */

When called via the compatibility interface, DB 2.x creates a single file with
a .db extension. When called via its native interface, no extension is added to
the file name handed to it.

DB 2.x does not do any automatic locking of its own; it does have a set of
functions for various forms of locking, but Exim does not use them.


Berkeley DB 3.x
---------------

DB 3.0 was released in November 1999 and 3.1 in June 2000. The 3.x series is a
development of the 2.x series and the above comments apply. Exim can
automatically distinguish between the different versions, so it copes with the
changes to the API without needing any special configuration.

When Exim creates a DBM file using DB 3.x (e.g. when creating one of its hints
databases), it specifies the "hash" format. However, when it opens a DB 3 file
for reading only, it specifies "unknown". This means that it can read DB 3
files in other formats that are created by other programs.


Berkeley DB 4.x
---------------

The 4.x series is a development of the 2.x and 3.x series, and the above
comments apply.


tdb
---

tdb 1.0.2 was released in September 2000. Its origin is the database functions
that are used by the Samba project.


Testing Exim's dbm handling
---------------------------

Because there have been problems with dbm file locking in the past, I built
some testing code for Exim's dbm functions. This is very much a lash-up, but it
is documented here so that anybody who wants to check that their configuration
is locking properly can do so. Now that Exim does the locking on an entirely
separate file, locking problems are much less likely, but this code still
exists, just in case. Proceed as follows:

. Build Exim in the normal way. This ensures that all the makefiles etc. get
  set up.

. From within the build directory, obey "make test_dbfn". This makes a binary
  file called test_dbfn. If you are experimenting with different configurations
  you *must* do "make makefile" after changing anything, before obeying "make
  test_dbfn" again, because the make target for test_dbfn isn't integrated
  with the making of the makefile.

. Identify a scratch directory where you have write access. Create a
  subdirectory called "db" in the scratch directory.

. Type the command "test_dbfn <scratch-directory>". This will output some
  general information such as

    Exim's db functions tester: interface type is db (v2)
    DBM library: Berkeley DB: Sleepycat Software: DB 2.1.0: (6/13/97)
    USE_DB is defined

  It then says

    Test the functions
    >

. At this point you can type commands to open a dbm file and read and write
  data in it. First type the command "open <name>", e.g. "open junk". The
  response should look like this

    opened DB file <scratch-directory>/db/junk: flags=102
    Locked
    opened 0
    >

  The tester will have created a dbm file within the db directory of the
  scratch directory. It will also have created a file with the extension
  ".lockfile" in the same directory. Unlike Exim itself, it will not create
  the db directory for itself if it does not exist.

. To test the locking, don't type anything more for the moment. You now need to
  set up another process running the same test_dbfn command, e.g. from a
  different logon to the same host. This time, when you attempt to open the
  file it should fail after a minute with a timeout error because it is
  already in use.

. If the second process doesn't produce any error message, but gets back to the
  > prompt, then the locking is not working properly.

. You can check that the second process gets the lock when the first process
  releases it by exiting from the first process with ^D, q, or quit; or by
  typing the command "close".

. There are some other commands available that are not related to locking:

    write <key> <data>
    e.g.
    write abcde the quick brown fox

  writes a record to the database,

    read <key>
    delete <key>

  read and delete a record, respectively, and

    scan

  scans the entire database. Note that the database is purely for testing the
  dbm functions. It is *not* one of Exim's regular databases, and you should
  not try running this testing program on any of Exim's real database
  files.
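
The two-process locking check in the steps above can also be expressed as a
small self-contained C sketch. It is not part of test_dbfn, and the function
second_process_is_locked_out() is hypothetical. It also differs from the
manual procedure in one respect: the second process tries for the lock once
and gives up immediately instead of waiting for a timeout.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try once, without waiting, for an exclusive fcntl() lock on a whole file.
   Returns 0 if the lock was obtained, -1 otherwise. */
static int try_lock(int fd)
{
  struct flock fl;
  memset(&fl, 0, sizeof(fl));
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;       /* start 0, length 0: the whole file */
  return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical check: the parent locks lockfile, then a child process tries
   for the same lock. Returns 1 if the child was refused (locking works),
   0 if the child got the lock (locking is broken), and -1 if the check
   itself could not be carried out. */
int second_process_is_locked_out(const char *lockfile)
{
  int fd, status;
  pid_t pid;

  assert(lockfile != NULL);
  fd = open(lockfile, O_RDWR | O_CREAT, 0640);
  if (fd < 0) return -1;
  if (try_lock(fd) != 0) { close(fd); return -1; }

  pid = fork();
  if (pid < 0) { close(fd); return -1; }

  if (pid == 0)
    {
    /* Child: fcntl() locks are not inherited across fork(), so this request
       conflicts with the lock that the parent still holds. */
    int cfd = open(lockfile, O_RDWR);
    if (cfd < 0) _exit(2);
    _exit(try_lock(cfd) == 0 ? 1 : 0);
    }

  if (waitpid(pid, &status, 0) < 0) { close(fd); return -1; }
  close(fd);                    /* this releases the parent's lock */

  if (!WIFEXITED(status) || WEXITSTATUS(status) == 2) return -1;
  return WEXITSTATUS(status) == 0;
}
```

A result of 0 corresponds to the case above where the second process gets
back to the > prompt without an error, meaning the locking is not working.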

Philip Hazel
Last update: June 2002