$Cambridge: exim/doc/doc-txt/dbm.discuss.txt,v 1.1 2004/10/07 15:04:35 ph10 Exp $

DBM Libraries for use with Exim
-------------------------------

Background
----------

Exim uses direct-access (so-called "dbm") files for a number of different
purposes. These are files arranged so that the data they contain is indexed and
can quickly be looked up by quoting an appropriate key. They are used as
follows:

1. Exim keeps its "hints" databases in dbm files.

2. The configuration can specify that certain things (e.g. aliases) be looked
   up in dbm files.

3. The configuration can contain expansion strings that involve lookups in dbm
   files.

4. The filter commands "mail" and "vacation" have a facility for replying only
   once to each incoming address. The record of which addresses have already
   received replies may be kept in a dbm file, depending on the configuration
   option once_file_size.

The runtime configuration can be set up without specifying 2 or 3, but Exim
always requires the availability of a dbm library, for 1 (and 4 if configured
that way).


DBM Libraries
-------------

The original library that provided the dbm facility in Unix was called "dbm".
This seems to have been superseded quite some time ago by a new version called
"ndbm" which permits several dbm files to be open at once. Several operating
systems, including those from Sun, contain ndbm as standard.

A number of alternative libraries also exist, the most common of which seems to
be Berkeley DB (just called DB hereinafter). Release 1.85 was around for
some time, and various releases 2.x began to appear towards the end of 1997. In
November 1999, version 3.0 was released, and the ending of support for 2.7.7,
the last 2.x release, was announced for November 2000. (Support for 1.85 has
already ceased.) There were further 3.x releases, but by the end of 2001, the
current release was 4.0.14.

There are major differences in implementation and interface between the DB 1.x
and 2.x/3.x/4.x releases, and they are best considered as two independent dbm
libraries. Changes to the API were made for 3.0 and again for 3.1.

Another DBM library is the GNU library, gdbm, though this does not seem to be
very widespread.

Yet another dbm library is tdb (Trivial Data Base) which has come out of the
Samba project. The first releases seem to have been in mid-2000.

Some older Linux releases contain gdbm as standard, while others contain no dbm
library. More recent releases contain DB 1.85 or 2.x or later, and presumably
will track the development of the DB library. Some BSD versions of Unix include
DB 1.85 or later. All of the non-ndbm libraries except tdb contain
compatibility interfaces so that programs written to call the ndbm functions
should, in theory, work with them, but there are some potential pitfalls which
have caught out Exim users in the past.

Exim has been tested with ndbm, gdbm, DB 1.85, DB 2.x, DB 3.1, DB 4.0.14, and
tdb 1.0.2, in various different modes in some cases, and is believed to work
with all of them if it and they are properly configured.

I have considered the possibility of calling different dbm libraries for
different functions from a single Exim binary. However, because all bar one of
the libraries provide ndbm compatibility interfaces (and therefore the same
function names) it would require a lot of complicated, error-prone trickery to
achieve this. Exim therefore uses a single library for all its dbm activities.

However, Exim does also support cdb (Constant Data Base), an efficient file
arrangement for indexed data that does not change incrementally (for example,
alias files). This is independent of any dbm library and can be used alongside
any of them.


Locking
-------

The configuration option EXIMDB_LOCK_TIMEOUT controls how long Exim waits to
get a lock on a hints database. From version 1.80 onwards, Exim does not
attempt to take out a lock on an actual database file (this caused problems in
the past). Instead, it takes out an fcntl() lock on a separate file whose name
ends in ".lockfile". This ensures that Exim has exclusive access to the
database before even attempting to open it. Exim creates the lock file the
first time it needs it. It should never be removed.

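The separate-lock-file scheme described above can be sketched roughly as
follows. This is only an illustration and is not taken from the Exim source:
the function name lock_hints_db, its arguments, and the once-a-second polling
loop are all assumptions, and Exim's real code may obtain its timeout in a
different way. Only the POSIX calls open(), fcntl(), sleep() and close() are
relied upon.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not Exim's own code: take an exclusive fcntl() lock on
   a separate lock file, retrying once a second until timeout_secs have passed.
   Returns an open descriptor that holds the lock, or -1 on error or timeout. */
int lock_hints_db(const char *lockfile, int timeout_secs)
{
    struct flock lk;
    int fd = open(lockfile, O_RDWR | O_CREAT, 0640);  /* create on first use */
    if (fd < 0) return -1;

    lk.l_type = F_WRLCK;      /* exclusive (write) lock */
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;             /* zero length means "up to end of file" */

    for (;;) {
        if (fcntl(fd, F_SETLK, &lk) == 0) return fd;  /* lock obtained */
        /* EACCES or EAGAIN means another process holds the lock */
        if ((errno != EACCES && errno != EAGAIN) || timeout_secs-- <= 0) break;
        sleep(1);             /* wait a little and try again */
    }
    close(fd);
    return -1;
}
```

In this sketch the caller would open the database only after the function
returns a descriptor, and would release the lock simply by closing that
descriptor. The lock file itself is left in place, as recommended above.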

Main Pitfall
------------

The OS-specific configuration files that are used to build Exim specify the use
of Berkeley DB on those systems where it is known to be standard. In the
absence of any special configuration options, Exim uses the ndbm set of
functions to control its dbm databases. This should work with any of the dbm
libraries because those that are not ndbm have compatibility interfaces.
However, there is one awful pitfall:

Exim #includes a header file called ndbm.h which defines the functions and the
interface data block; gdbm and DB 1.x provide their own versions of this header
file, but later DB versions do not. If it should happen that the wrong version
of ndbm.h is seen by Exim, it may compile without error, but fail to operate
correctly at runtime.

This situation can easily arise when more than one dbm library is installed on
a single host. For example, if you decide to use DB 1.x on a system where gdbm
is the standard library, unless you are careful in setting up the include
directories for Exim, it may see gdbm's ndbm.h file instead of DB's. The
situation is even worse with later versions of DB, which do not provide an
ndbm.h file at all.

One way out of this for gdbm and any of the versions of DB is to configure Exim
to call the DBM library in its native mode instead of via the ndbm
compatibility interface, thus avoiding the use of ndbm.h. This is done by
setting the USE_DB configuration option if you are using Berkeley DB, or
USE_GDBM if you are using gdbm. This is the recommended approach.


NDBM
----

The ndbm library holds its data in two files, with extensions .dir and .pag.
This makes atomic updating of, for example, alias files, difficult, because
simple renaming cannot be used without some risk. However, if your system has
ndbm installed, Exim should compile and run without any problems.

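The renaming risk can be illustrated with a short sketch. The helper names
publish_single_file and publish_ndbm_pair, and the idea of building a new
database under a temporary name before publishing it, are assumptions made
for this illustration; nothing here comes from Exim itself. The only library
calls used are the standard rename() and snprintf().

```c
#include <stdio.h>

/* Hypothetical helper: publish a newly built single-file database. A POSIX
   rename() within one filesystem replaces live_path atomically, so a reader
   opening live_path gets either the old file or the new one. */
int publish_single_file(const char *new_path, const char *live_path)
{
    return rename(new_path, live_path);
}

/* Hypothetical helper: the same idea for an ndbm database, which consists of
   a .dir file and a .pag file. Two separate renames are needed, and they
   cannot be made atomic as a pair. */
int publish_ndbm_pair(const char *new_base, const char *live_base)
{
    char from[1024], to[1024];

    snprintf(from, sizeof from, "%s.dir", new_base);
    snprintf(to, sizeof to, "%s.dir", live_base);
    if (rename(from, to) != 0) return -1;

    /* Risky window: the live .dir file is new, the live .pag file is old. */

    snprintf(from, sizeof from, "%s.pag", new_base);
    snprintf(to, sizeof to, "%s.pag", live_base);
    return rename(from, to);
}
```

A process that opens the ndbm database between the two renames in the second
function sees a mismatched pair of files, which is the risk referred to above.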

GDBM
----

The gdbm library, when called via the ndbm compatibility interface, makes two
hard links to a single file, with extensions .dir and .pag. As mentioned above,
gdbm provides its own version of the ndbm.h header, and you must ensure that
this is seen by Exim rather than any other version. This is not likely to be a
problem if gdbm is the only dbm library on your system.

If gdbm is called via the native interface (by setting USE_GDBM in your
Local/Makefile), it uses a single file, with no extension on the name, and the
ndbm.h header is not required.

The gdbm library does its own locking of the single file that it uses. From
version 1.80 onwards, Exim locks on an entirely separate file before accessing
a hints database, so gdbm's locking should always succeed.


Berkeley DB 1.8x
----------------

1.85 was the most widespread DB 1.x release; there is also a 1.86 bug-fix
release, but the belief is that the bugs it fixes will not affect Exim.
However, maintenance for 1.x releases has been phased out.

This dbm library can be called by Exim in one of two ways: via the ndbm
compatibility interface, or via its own native interface. There are two
advantages to doing the latter: (1) you don't run the risk of Exim's seeing the
"wrong" version of the ndbm.h header, as described above, and (2) the
performance is better. It is therefore recommended that you set USE_DB=yes in
an appropriate Local/Makefile-xxx file. (If you are compiling for just one OS,
it can go in Local/Makefile itself.)

When called via the compatibility interface, DB 1.x creates a single file with
a .db extension. When called via its native interface, no extension is added to
the file name handed to it.

DB 1.x does not do any locking of its own.


Berkeley DB 2.x
---------------

DB 2.x was released in 1997. It is a major re-implementation and its native
interface is incompatible with DB 1.x, though a compatibility interface was
introduced in DB 2.1.0, and there is also an ndbm.h compatibility interface.

Like 1.x, it can be called from Exim via the ndbm compatibility interface or
via its native interface, and once again setting USE_DB in order to get the
native interface is recommended. If USE_DB is *not* set, then you will have to
provide a suitable version of ndbm.h, because one does not come with the DB 2.x
distribution. A suitable version is:

  /*************************************************
  *              ndbm.h header for DB 2.x          *
  *************************************************/

  /* This header should replace any other version of ndbm.h when Berkeley DB
  version 2.x is in use via the ndbm compatibility interface. Otherwise, any
  extant version of ndbm.h may cause programs to misbehave. There doesn't seem
  to be a version of ndbm.h supplied with DB 2.x, so I made this for myself.

  Philip Hazel 12/Jun/97
  */

  #define DB_DBM_HSEARCH
  #include <db.h>

  /* End */

When called via the compatibility interface, DB 2.x creates a single file with
a .db extension. When called via its native interface, no extension is added to
the file name handed to it.

DB 2.x does not do any automatic locking of its own; it does have a set of
functions for various forms of locking, but Exim does not use them.


Berkeley DB 3.x
---------------

DB 3.0 was released in November 1999 and 3.1 in June 2000. The 3.x series is a
development of the 2.x series and the above comments apply. Exim can
automatically distinguish between the different versions, so it copes with the
changes to the API without needing any special configuration.

When Exim creates a DBM file using DB 3.x (e.g. when creating one of its hints
databases), it specifies the "hash" format. However, when it opens a DB 3 file
for reading only, it specifies "unknown". This means that it can read DB 3
files in other formats that are created by other programs.


Berkeley DB 4.x
---------------

The 4.x series is a development of the 2.x and 3.x series, and the above
comments apply.


tdb
---

tdb 1.0.2 was released in September 2000. Its origin is the database functions
that are used by the Samba project.



Testing Exim's dbm handling
---------------------------

Because there have been problems with dbm file locking in the past, I built
some testing code for Exim's dbm functions. This is very much a lash-up, but it
is documented here so that anybody who wants to check that their configuration
is locking properly can do so. Now that Exim does the locking on an entirely
separate file, locking problems are much less likely, but this code still
exists, just in case. Proceed as follows:

. Build Exim in the normal way. This ensures that all the makefiles etc. get
  set up.

. From within the build directory, obey "make test_dbfn". This makes a binary
  file called test_dbfn. If you are experimenting with different configurations
  you *must* do "make makefile" after changing anything, before obeying "make
  test_dbfn" again, because the make target for test_dbfn isn't integrated
  with the making of the makefile.

. Identify a scratch directory where you have write access. Create a
  sub-directory called "db" in the scratch directory.

. Type the command "test_dbfn <scratch-directory>". This will output some
  general information such as

    Exim's db functions tester: interface type is db (v2)
    DBM library: Berkeley DB: Sleepycat Software: DB 2.1.0: (6/13/97)
    USE_DB is defined

  It then says

    Test the functions
    >

. At this point you can type commands to open a dbm file and read and write
  data in it. First type the command "open <name>", e.g. "open junk". The
  response should look like this

    opened DB file <scratch-directory>/db/junk: flags=102
    Locked
    opened 0
    >

  The tester will have created a dbm file within the db directory of the
  scratch directory. It will also have created a file with the extension
  ".lockfile" in the same directory. Unlike Exim itself, it will not create
  the db directory for itself if it does not exist.

. To test the locking, don't type anything more for the moment. You now need to
  set up another process running the same test_dbfn command, e.g. from a
  different logon to the same host. This time, when you attempt to open the
  file it should fail after a minute with a timeout error because it is
  already in use.

. If the second process doesn't produce any error message, but gets back to the
  > prompt, then the locking is not working properly.

. You can check that the second process gets the lock when the first process
  releases it by exiting from the first process with ^D, q, or quit; or by
  typing the command "close".

. There are some other commands available that are not related to locking:

    write <key> <data>
    e.g.
    write abcde the quick brown fox

  writes a record to the database,

    read <key>
    delete <key>

  read and delete a record, respectively, and

    scan

  scans the entire database. Note that the database is purely for testing the
  dbm functions. It is *not* one of Exim's regular databases, and you should
  not try running this testing program on any of Exim's real database
  files.

Philip Hazel
Last update: June 2002