| 1 | DBM Libraries for use with Exim |
| 2 | ------------------------------- |
| 3 | |
| 4 | Background |
| 5 | ---------- |
| 6 | |
| 7 | Exim uses direct-access (so-called "dbm") files for a number of different |
| 8 | purposes. These are files arranged so that the data they contain is indexed and |
| 9 | can quickly be looked up by quoting an appropriate key. They are used as |
| 10 | follows: |
| 11 | |
| 12 | 1. Exim keeps its "hints" databases in dbm files. |
| 13 | |
| 14 | 2. The configuration can specify that certain things (e.g. aliases) be looked |
| 15 | up in dbm files. |
| 16 | |
| 17 | 3. The configuration can contain expansion strings that involve lookups in dbm |
| 18 | files. |
| 19 | |
| 20 | 4. The filter commands "mail" and "vacation" have a facility for replying only |
| 21 | once to each incoming address. The record of which addresses have already |
| 22 | received replies may be kept in a dbm file, depending on the configuration |
| 23 | option once_file_size. |
| 24 | |
| 25 | The runtime configuration can be set up without specifying 2 or 3, but Exim |
| 26 | always requires the availability of a dbm library, for 1 (and 4 if configured |
| 27 | that way). |
| 28 | |
| 29 | |
| 30 | DBM Libraries |
| 31 | ------------- |
| 32 | |
| 33 | The original library that provided the dbm facility in Unix was called "dbm". |
| 34 | This seems to have been superseded quite some time ago by a new version called |
| 35 | "ndbm" which permits several dbm files to be open at once. Several operating |
| 36 | systems, including those from Sun, contain ndbm as standard. |
| 37 | |
| 38 | A number of alternative libraries also exist, the most common of which seems to |
| 39 | be Berkeley DB (just called DB hereinafter). Release 1.85 was around for |
| 40 | some time, and various releases 2.x began to appear towards the end of 1997. In |
| 41 | November 1999, version 3.0 was released, and the ending of support for 2.7.7, |
| 42 | the last 2.x release, was announced for November 2000. (Support for 1.85 has |
| 43 | already ceased.) There were further 3.x releases, but by the end of 2001, the |
| 44 | current release was 4.0.14. |
| 45 | |
| 46 | There are major differences in implementation and interface between the DB 1.x |
| 47 | and 2.x/3.x/4.x releases, and they are best considered as two independent dbm |
| 48 | libraries. Changes to the API were made for 3.0 and again for 3.1. |
| 49 | |
| 50 | Another DBM library is the GNU library, gdbm, though this does not seem to be |
| 51 | very widespread. |
| 52 | |
| 53 | Yet another dbm library is tdb (Trivial Data Base) which has come out of the |
| 54 | Samba project. The first releases seem to have been in mid-2000. |
| 55 | |
| 56 | Some older Linux releases contain gdbm as standard, while others contain no dbm |
| 57 | library. More recent releases contain DB 1.85 or 2.x or later, and presumably |
| 58 | will track the development of the DB library. Some BSD versions of Unix include |
| 59 | DB 1.85 or later. All of the non-ndbm libraries except tdb contain |
| 60 | compatibility interfaces so that programs written to call the ndbm functions |
| 61 | should, in theory, work with them, but there are some potential pitfalls which |
| 62 | have caught out Exim users in the past. |
| 63 | |
| 64 | Exim has been tested with ndbm, gdbm, DB 1.85, DB 2.x, DB 3.1, DB 4.0.14, and |
| 65 | tdb 1.0.2, in various different modes in some cases, and is believed to work |
| 66 | with all of them if it and they are properly configured. |
| 67 | |
| 68 | I have considered the possibility of calling different dbm libraries for |
| 69 | different functions from a single Exim binary. However, because all bar one of |
| 70 | the libraries provide ndbm compatibility interfaces (and therefore the same |
| 71 | function names) it would require a lot of complicated, error-prone trickery to |
| 72 | achieve this. Exim therefore uses a single library for all its dbm activities. |
| 73 | |
| 74 | However, Exim does also support cdb (Constant Data Base), an efficient file |
| 75 | arrangement for indexed data that does not change incrementally (for example, |
| 76 | alias files). This is independent of any dbm library and can be used alongside |
| 77 | any of them. |
| 78 | |
| 79 | |
| 80 | Locking |
| 81 | ------- |
| 82 | |
| 83 | The configuration option EXIMDB_LOCK_TIMEOUT controls how long Exim waits to |
| 84 | get a lock on a hints database. From version 1.80 onwards, Exim does not |
| 85 | attempt to take out a lock on an actual database file (this caused problems in |
| 86 | the past). Instead, it takes out an fcntl() lock on a separate file whose name |
| 87 | ends in ".lockfile". This ensures that Exim has exclusive access to the |
| 88 | database before even attempting to open it. Exim creates the lock file the |
| 89 | first time it needs it. It should never be removed. |
| 90 | |
| 91 | |
| 92 | Main Pitfall |
| 93 | ------------ |
| 94 | |
| 95 | The OS-specific configuration files that are used to build Exim specify the use |
| 96 | of Berkeley DB on those systems where it is known to be standard. In the |
| 97 | absence of any special configuration options, Exim uses the ndbm set of |
| 98 | functions to control its dbm databases. This should work with any of the dbm |
| 99 | libraries because those that are not ndbm have compatibility interfaces. |
| 100 | However, there is one awful pitfall: |
| 101 | |
| 102 | Exim #includes a header file called ndbm.h which defines the functions and the |
| 103 | interface data block. gdbm and DB 1.x provide their own versions of this header |
| 104 | file, but later DB versions do not. If the wrong version of ndbm.h is seen by |
| 105 | Exim, it may compile without error, but fail to operate correctly at |
| 106 | runtime. |
| 107 | |
| 108 | This situation can easily arise when more than one dbm library is installed on |
| 109 | a single host. For example, if you decide to use DB 1.x on a system where gdbm |
| 110 | is the standard library, unless you are careful in setting up the include |
| 111 | directories for Exim, it may see gdbm's ndbm.h file instead of DB's. The |
| 112 | situation is even worse with later versions of DB, which do not provide an |
| 113 | ndbm.h file at all. |
| 114 | |
| 115 | One way out of this for gdbm and any of the versions of DB is to configure Exim |
| 116 | to call the DBM library in its native mode instead of via the ndbm |
| 117 | compatibility interface, thus avoiding the use of ndbm.h. This is done by |
| 118 | setting the USE_DB configuration option if you are using Berkeley DB, or |
| 119 | USE_GDBM if you are using gdbm. This is the recommended approach. |
| 120 | |
| 121 | |
| 122 | NDBM |
| 123 | ---- |
| 124 | |
| 125 | The ndbm library holds its data in two files, with extensions .dir and .pag. |
| 126 | This makes atomic updating of, for example, alias files, difficult, because |
| 127 | simple renaming cannot be used without some risk. However, if your system has |
| 128 | ndbm installed, Exim should compile and run without any problems. |
| 129 | |
| 130 | |
| 131 | GDBM |
| 132 | ---- |
| 133 | |
| 134 | The gdbm library, when called via the ndbm compatibility interface, makes two |
| 135 | hard links to a single file, with extensions .dir and .pag. As mentioned above, |
| 136 | gdbm provides its own version of the ndbm.h header, and you must ensure that |
| 137 | this is seen by Exim rather than any other version. This is not likely to be a |
| 138 | problem if gdbm is the only dbm library on your system. |
| 139 | |
| 140 | If gdbm is called via the native interface (by setting USE_GDBM in your |
| 141 | Local/Makefile), it uses a single file, with no extension on the name, and the |
| 142 | ndbm.h header is not required. |
| 143 | |
| 144 | The gdbm library does its own locking of the single file that it uses. From |
| 145 | version 1.80 onwards, Exim locks on an entirely separate file before accessing |
| 146 | a hints database, so gdbm's locking should always succeed. |
| 147 | |
| 148 | |
| 149 | Berkeley DB 1.8x |
| 150 | ---------------- |
| 151 | |
| 152 | 1.85 was the most widespread DB 1.x release; there is also a 1.86 bug-fix |
| 153 | release, but the belief is that the bugs it fixes will not affect Exim. |
| 154 | However, maintenance for 1.x releases has been phased out. |
| 155 | |
| 156 | This dbm library can be called by Exim in one of two ways: via the ndbm |
| 157 | compatibility interface, or via its own native interface. There are two |
| 158 | advantages to doing the latter: (1) you don't run the risk of Exim's seeing the |
| 159 | "wrong" version of the ndbm.h header, as described above, and (2) the |
| 160 | performance is better. It is therefore recommended that you set USE_DB=yes in an |
| 161 | appropriate Local/Makefile-xxx file. (If you are compiling for just one OS, it |
| 162 | can go in Local/Makefile itself.) |
| 163 | |
| 164 | When called via the compatibility interface, DB 1.x creates a single file with |
| 165 | a .db extension. When called via its native interface, no extension is added to |
| 166 | the file name handed to it. |
| 167 | |
| 168 | DB 1.x does not do any locking of its own. |
| 169 | |
| 170 | |
| 171 | Berkeley DB 2.x |
| 172 | --------------- |
| 173 | |
| 174 | DB 2.x was released in 1997. It is a major re-implementation and its native |
| 175 | interface is incompatible with DB 1.x, though a compatibility interface was |
| 176 | introduced in DB 2.1.0, and there is also an ndbm compatibility interface. |
| 177 | |
| 178 | Like 1.x, it can be called from Exim via the ndbm compatibility interface or |
| 179 | via its native interface, and once again setting USE_DB in order to get the |
| 180 | native interface is recommended. If USE_DB is *not* set, then you will have to |
| 181 | provide a suitable version of ndbm.h, because one does not come with the DB 2.x |
| 182 | distribution. A suitable version is: |
| 183 | |
| 184 | /************************************************* |
| 185 | * ndbm.h header for DB 2.x * |
| 186 | *************************************************/ |
| 187 | |
| 188 | /* This header should replace any other version of ndbm.h when Berkeley DB |
| 189 | version 2.x is in use via the ndbm compatibility interface. Otherwise, any |
| 190 | extant version of ndbm.h may cause programs to misbehave. There doesn't seem |
| 191 | to be a version of ndbm.h supplied with DB 2.x, so I made this for myself. |
| 192 | |
| 193 | Philip Hazel 12/Jun/97 |
| 194 | */ |
| 195 | |
| 196 | #define DB_DBM_HSEARCH |
| 197 | #include <db.h> |
| 198 | |
| 199 | /* End */ |
| 200 | |
| 201 | When called via the compatibility interface, DB 2.x creates a single file with |
| 202 | a .db extension. When called via its native interface, no extension is added to |
| 203 | the file name handed to it. |
| 204 | |
| 205 | DB 2.x does not do any automatic locking of its own; it does have a set of |
| 206 | functions for various forms of locking, but Exim does not use them. |
| 207 | |
| 208 | |
| 209 | Berkeley DB 3.x |
| 210 | --------------- |
| 211 | |
| 212 | DB 3.0 was released in November 1999 and 3.1 in June 2000. The 3.x series is a |
| 213 | development of the 2.x series and the above comments apply. Exim can |
| 214 | automatically distinguish between the different versions, so it copes with the |
| 215 | changes to the API without needing any special configuration. |
| 216 | |
| 217 | When Exim creates a DBM file using DB 3.x (e.g. when creating one of its hints |
| 218 | databases), it specifies the "hash" format. However, when it opens a DB 3 file |
| 219 | for reading only, it specifies "unknown". This means that it can read DB 3 |
| 220 | files in other formats that are created by other programs. |
| 221 | |
| 222 | |
| 223 | Berkeley DB 4.x |
| 224 | --------------- |
| 225 | |
| 226 | The 4.x series is a development of the 2.x and 3.x series, and the above |
| 227 | comments apply. |
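
The on-disk file names described in the ndbm, gdbm, and Berkeley DB sections
above can be summarised in a small C sketch. The enum, the constant, and the
function name are hypothetical and exist only for this illustration; they are
not Exim identifiers. tdb is left out because no naming rule is given for it
here.

```c
#include <string.h>

/* Illustrative names only; none of these come from Exim. */
enum dbm_mode {
  MODE_NDBM,         /* ndbm library */
  MODE_GDBM_COMPAT,  /* gdbm via the ndbm compatibility interface */
  MODE_GDBM_NATIVE,  /* gdbm via its native interface (USE_GDBM) */
  MODE_DB_COMPAT,    /* Berkeley DB via the ndbm compatibility interface */
  MODE_DB_NATIVE     /* Berkeley DB via its native interface (USE_DB) */
};

#define DBM_NAME_LEN 256

/* Fill names[] with the file name(s) that appear on disk for a database
   whose base name is "base". Returns the number of names written (1 or 2),
   or 0 if the mode is unknown or a name would not fit. */
int dbm_file_names(enum dbm_mode mode, const char *base,
                   char names[2][DBM_NAME_LEN])
{
  const char *ext[2] = { "", "" };
  size_t blen = strlen(base);
  int count, i;

  switch (mode) {
    case MODE_NDBM:          /* two separate files */
    case MODE_GDBM_COMPAT:   /* two hard links to a single file */
      ext[0] = ".dir"; ext[1] = ".pag"; count = 2;
      break;
    case MODE_DB_COMPAT:     /* single file with .db added */
      ext[0] = ".db"; count = 1;
      break;
    case MODE_GDBM_NATIVE:   /* single file, name used unchanged */
    case MODE_DB_NATIVE:
      count = 1;
      break;
    default:
      return 0;
  }

  for (i = 0; i < count; i++) {
    if (blen + strlen(ext[i]) >= DBM_NAME_LEN) return 0;
    memcpy(names[i], base, blen);
    strcpy(names[i] + blen, ext[i]);
  }
  return count;
}
```

For example, dbm_file_names(MODE_DB_COMPAT, "aliases", names) yields the
single name "aliases.db", while MODE_NDBM yields "aliases.dir" and
"aliases.pag". This matters when renaming or replacing a lookup file, since
only the single-file modes allow a simple atomic rename.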
| 228 | |
| 229 | |
| 230 | tdb |
| 231 | --- |
| 232 | |
| 233 | tdb 1.0.2 was released in September 2000. Its origin is the database functions |
| 234 | that are used by the Samba project. |
| 235 | |
| 236 | |
| 237 | |
| 238 | Testing Exim's dbm handling |
| 239 | --------------------------- |
| 240 | |
| 241 | Because there have been problems with dbm file locking in the past, I built |
| 242 | some testing code for Exim's dbm functions. This is very much a lash-up, but it |
| 243 | is documented here so that anybody who wants to check that their configuration |
| 244 | is locking properly can do so. Now that Exim does the locking on an entirely |
| 245 | separate file, locking problems are much less likely, but this code still |
| 246 | exists, just in case. Proceed as follows: |
| 247 | |
| 248 | . Build Exim in the normal way. This ensures that all the makefiles etc. get |
| 249 | set up. |
| 250 | |
| 251 | . From within the build directory, obey "make test_dbfn". This makes a binary |
| 252 | file called test_dbfn. If you are experimenting with different configurations |
| 253 | you *must* do "make makefile" after changing anything, before obeying "make |
| 254 | test_dbfn" again, because the make target for test_dbfn isn't integrated |
| 255 | with the making of the makefile. |
| 256 | |
| 257 | . Identify a scratch directory where you have write access. Create a sub- |
| 258 | directory called "db" in the scratch directory. |
| 259 | |
| 260 | . Type the command "test_dbfn <scratch-directory>". This will output some |
| 261 | general information such as |
| 262 | |
| 263 | Exim's db functions tester: interface type is db (v2) |
| 264 | DBM library: Berkeley DB: Sleepycat Software: DB 2.1.0: (6/13/97) |
| 265 | USE_DB is defined |
| 266 | |
| 267 | It then says |
| 268 | |
| 269 | Test the functions |
| 270 | > |
| 271 | |
| 272 | . At this point you can type commands to open a dbm file and read and write |
| 273 | data in it. First type the command "open <name>", e.g. "open junk". The |
| 274 | response should look like this |
| 275 | |
| 276 | opened DB file <scratch-directory>/db/junk: flags=102 |
| 277 | Locked |
| 278 | opened 0 |
| 279 | > |
| 280 | |
| 281 | The tester will have created a dbm file within the db directory of the |
| 282 | scratch directory. It will also have created a file with the extension |
| 283 | ".lockfile" in the same directory. Unlike Exim itself, it will not create |
| 284 | the db directory for itself if it does not exist. |
| 285 | |
| 286 | . To test the locking, don't type anything more for the moment. You now need to |
| 287 | set up another process running the same test_dbfn command, e.g. from a |
| 288 | different logon to the same host. This time, when you attempt to open the |
| 289 | file it should fail after a minute with a timeout error because it is |
| 290 | already in use. |
| 291 | |
| 292 | . If the second process doesn't produce any error message, but gets back to the |
| 293 | > prompt, then the locking is not working properly. |
| 294 | |
| 295 | . You can check that the second process gets the lock when the first process |
| 296 | releases it by exiting from the first process with ^D, q, or quit; or by |
| 297 | typing the command "close". |
| 298 | |
| 299 | . There are some other commands available that are not related to locking: |
| 300 | |
| 301 | write <key> <data> |
| 302 | e.g. |
| 303 | write abcde the quick brown fox |
| 304 | |
| 305 | writes a record to the database, |
| 306 | |
| 307 | read <key> |
| 308 | delete <key> |
| 309 | |
| 310 | read and delete a record, respectively, and |
| 311 | |
| 312 | scan |
| 313 | |
| 314 | scans the entire database. Note that the database is purely for testing the |
| 315 | dbm functions. It is *not* one of Exim's regular databases, and you should |
| 316 | not try running this testing program on any of Exim's real database |
| 317 | files. |
| 318 | |
| 319 | Philip Hazel |
| 320 | Last update: June 2002 |