ncdbm(3) NcFTP Software ncdbm(3)

Name

NcDBM_Open, NcDBM_Close, NcDBM_Query, NcDBM_Insert, NcDBM_Delete, NcDBM_FirstKey, NcDBM_NextKey - flatfile database routines

Synopsis

#include <ncdbm.h>
int NcDBM_Open(NcDBMFile *const db, const char *const dbPathName, const int how, const int mode, const int hsize);
int NcDBM_Close(NcDBMFile *const db);
int NcDBM_Query(NcDBMFile *const db, const NcDBMDatum key, NcDBMDatum *const content);
int NcDBM_Exists(NcDBMFile *const db, const NcDBMDatum key);
int NcDBM_Insert(NcDBMFile *const db, NcDBMDatum key, NcDBMDatum content, NcDBMInsertMode flags);
int NcDBM_ExclusiveLock(NcDBMFile *const db, int lockmode);
int NcDBM_Delete(NcDBMFile *const db, const NcDBMDatum key);
int NcDBM_NextKey(NcDBMFile *const db, NcDBMDatum *const nxtkey, NcDBMDatum *const content, NcDBMIterState *itp);
int NcDBM_FirstKey(NcDBMFile *const db, NcDBMDatum *const firstkey, NcDBMDatum *const content, NcDBMIterState *const itp);
const char *NcDBM_Strerror(int e);
void NcDBM_SetHashFunction(NcDBMFile *const db, NcDBMHashFunction func);

Description

These functions create and maintain a simple single-key, flat-file database of key/content pairs.  The functions handle databases up to 2 GB in size, with content records limited only by available memory and disk space, and key fields up to 255 characters.  This package is intended to serve as an alternative to the ndbm(3) and gdbm(3) libraries.  In particular, it provides a uniform interface across multiple platforms (ndbm, for example, is not available on some platforms) and is suitable for commercial applications (gdbm is restricted by the GNU General Public License).

This implementation is essentially a disk-based hash-table, which provides good performance as long as the hash function produces a decent distribution over the hash buckets.  For most applications, the default hash function used internally provides a good spread.
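
One way to judge whether a hash function gives a decent distribution for your keys is to count how many keys fall into the fullest bucket.  The sketch below does that in memory.  It assumes that a bucket is chosen as the hash value modulo the table size, which this page does not state, and it uses 32-bit FNV-1a as a stand-in hash (not the library's default).  The names sketch_hash and longest_chain are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in hash (32-bit FNV-1a); not the library's default. */
static uint32_t sketch_hash(const unsigned char *ucp, int dsize)
{
	uint32_t h = 2166136261u;
	int i;

	for (i = 0; i < dsize; i++) {
		h ^= ucp[i];
		h *= 16777619u;
	}
	return (h);
}

/*
 * Return the number of keys in the fullest bucket when nkeys C strings
 * are spread over hsize buckets (assumed mapping: hash % hsize).
 * Returns -1 if hsize is 0 or memory cannot be allocated.
 */
static int longest_chain(const char *keys[], size_t nkeys, size_t hsize)
{
	size_t *counts;
	size_t i, b;
	int longest = 0;

	if (hsize == 0)
		return (-1);
	counts = calloc(hsize, sizeof(*counts));
	if (counts == NULL)
		return (-1);
	for (i = 0; i < nkeys; i++) {
		b = sketch_hash((const unsigned char *) keys[i], (int) strlen(keys[i])) % hsize;
		counts[b]++;
		if ((int) counts[b] > longest)
			longest = (int) counts[b];
	}
	free(counts);
	return (longest);
}
```

A result close to nkeys / hsize (rounded up) suggests an even spread; a much larger result suggests that many lookups would have to walk a long bucket.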

The database file is composed of a header, a fixed-length hash-table, and data records.  Although the hash-table is of fixed size, this size can be configured before the database is created.  The database file therefore requires only one file, unlike ndbm which uses one file for a record map and another file for the record data itself.

Keys and contents are described by the NcDBMDatum datatype.  This datatype is a structure containing an arbitrary data field (named dptr, for similarity with ndbm) and a length field (named dsize, for similarity with ndbm) which specifies the length of the data field in bytes.
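
The following sketch shows how a pointer/length pair of this kind can describe either a C string or an arbitrary object.  ExampleDatum is a hypothetical stand-in so that the fragment compiles by itself; real code would use NcDBMDatum from ncdbm.h, and the field types shown here are assumptions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for NcDBMDatum.  The real definition comes from
 * ncdbm.h; the field types used here are assumptions.
 */
typedef struct ExampleDatum {
	char *dptr;	/* start of the data */
	size_t dsize;	/* length of the data in bytes */
} ExampleDatum;

/* Describe a C string as a datum; the terminating NUL is not counted. */
static ExampleDatum datum_from_cstr(char *s)
{
	ExampleDatum d;

	d.dptr = s;
	d.dsize = strlen(s);
	return (d);
}

/* Describe an arbitrary object (for example a struct) as a datum. */
static ExampleDatum datum_from_bytes(void *p, size_t n)
{
	ExampleDatum d;

	d.dptr = (char *) p;
	d.dsize = n;
	return (d);
}
```

Neither helper copies anything: the datum only points at memory owned by the caller, which must remain valid for as long as the datum is in use.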

Opening a database

Before a database can be accessed, it must be opened using NcDBM_Open.  You provide a pointer to an uninitialized NcDBMFile structure, a pathname for the database file (usually suffixed with .db), open flags, a file creation mode, and the size of the hash-table (if you're creating a new database).

NcDBM_Open's "open flags" are a bitwise OR of the NCDBM_O_ constants defined in ncdbm.h, such as NCDBM_O_WRONLY and NCDBM_O_CREAT.

These correspond to the open flags of the open() system call.  The mode flag also corresponds to that of open() -- along with the current umask, this will be used to set the permission bits on the database file, if created.  This parameter is ignored if you're not creating a database, so use 0 in those cases.  The last parameter specifies the size of the hash-table; generally you should use the predefined symbol NCDBM_DEFAULT_HSIZE unless you really want to tune the size.
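
As with open(), the permission bits a new file actually receives are the requested mode with the umask bits cleared.  The sketch below works that out so you can check what a given mode will produce; the function names are hypothetical and are not part of the library.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Permission bits a newly created file receives for a requested mode. */
static mode_t effective_mode(mode_t requested, mode_t mask)
{
	return (requested & ~mask & 0777);
}

/*
 * Read the process umask.  umask() can only be read by setting it, so
 * the old value is restored at once; this is not safe if another thread
 * creates files at the same moment.
 */
static mode_t current_umask(void)
{
	mode_t m = umask(0);

	(void) umask(m);
	return (m);
}
```

For instance, effective_mode(0666, 022) yields 0644, so a database created with mode 0666 under a umask of 022 is writable only by its owner.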

For example, to create a new database for adding new records, you might do:

NcDBMFile db;
int err;

err = NcDBM_Open(&db, "/tmp/new.db", NCDBM_O_WRONLY|NCDBM_O_CREAT, 00666, NCDBM_DEFAULT_HSIZE);

When you finish using a database, use NcDBM_Close to release the resources associated with it.

Adding records

Use the NcDBM_Insert function to add or replace records.  If a record with the same key already exists, the function returns an error unless you ask for the record to be overwritten.  To overwrite, specify NCDBM_REPLACE as the last parameter; otherwise specify NCDBM_INSERT, which fails on a matching key.  For example, to insert a new record for the key "gleason" with the content "402-555-1234,Omaha,NE" you would do:

NcDBMDatum key, content;
int err;

key.dptr = "gleason";
key.dsize = strlen(key.dptr);
content.dptr = "402-555-1234,Omaha,NE";
content.dsize = strlen(content.dptr);
err = NcDBM_Insert(&db, key, content, NCDBM_INSERT);

Searching for records

Eventually you'll want to look up previously inserted records.  To do that, use the NcDBM_Query function.  You specify the key datum to search on, and a pointer to a content datum to write to if a match is found.  If a match is found, the content datum is populated with data allocated dynamically using the malloc() library function.  It is then your responsibility to free() the data when you're finished using it.  The NcDBM_Exists function is provided if you only want to determine whether a certain key exists, and don't care about the record contents associated with it.
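
This page does not say whether returned content is NUL-terminated, and the insert example above stores only strlen() bytes.  If you want to treat content as a C string, a cautious approach is to copy it using its length.  The helper below is a sketch of that; datum_to_cstr is a hypothetical name, and it takes the pointer and length separately so that it does not depend on the exact definition of NcDBMDatum.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy dsize bytes starting at dptr into a newly allocated,
 * NUL-terminated string.  Returns NULL if allocation fails.
 * The caller must free() the result.
 */
static char *datum_to_cstr(const void *dptr, size_t dsize)
{
	char *s = malloc(dsize + 1);

	if (s == NULL)
		return (NULL);
	if (dsize > 0)
		memcpy(s, dptr, dsize);
	s[dsize] = '\0';
	return (s);
}
```

After a successful NcDBM_Query you would pass content.dptr and content.dsize to this helper, then free() both the copy and the original content.dptr once you are done with them.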

Iterating through the database

To pass through all the database records (in undefined order), use the NcDBM_FirstKey and NcDBM_NextKey functions.  To get the first record, use NcDBM_FirstKey, which initializes an NcDBMIterState structure.  NcDBM_NextKey then uses that structure to maintain the iteration state.  Here's an example of how to iterate over an entire database:

NcDBMDatum key;
NcDBMDatum d;
NcDBMIterState itstate;
int e;

e = NcDBM_FirstKey(gDBptr, &key, &d, &itstate);
if (e == NCDBM_ERR_NO_MORE_RECORDS) {
	fprintf(gStdErr, "(database is empty)\n");
	return (-1);
} else if (e != NCDBM_NO_ERR) {
	fprintf(gStdErr, "Export failed : %s.\n", NcDBM_Strerror(e));
	return (-1);
}

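/* Print first record; the precision guards against data that is not NUL-terminated. */
fprintf(stdout, "%.*s%c%.*s\n", (int) key.dsize, (const char *) key.dptr, gDelim, (int) d.dsize, (const char *) d.dptr);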
free(d.dptr);
free(key.dptr);

for (;;) {
	e = NcDBM_NextKey(gDBptr, &key, &d, &itstate);
	if (e == NCDBM_ERR_NO_MORE_RECORDS)
		break;
	if (e != NCDBM_NO_ERR) {
		fprintf(gStdErr, "Export failed : %s.\n", NcDBM_Strerror(e));
		return (-1);
	}

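	/* Print each record; the precision guards against data that is not NUL-terminated. */
	fprintf(stdout, "%.*s%c%.*s\n", (int) key.dsize, (const char *) key.dptr, gDelim, (int) d.dsize, (const char *) d.dptr);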
	free(d.dptr);
	free(key.dptr);
}

Deleting records

To remove records, use the NcDBM_Delete function.  Note that (like ndbm) the data file does not get smaller when a record is deleted; rather, the database simply marks the record as deleted and changes the database structure so that future searches won't match this record.

Error Checking

Functions return an integer result code which is 0 (NCDBM_NO_ERR) when no error occurs, or a negative error code when an error is reported.  Refer to the symbols defined in ncdbm.h for the complete listing of error codes, or use the NcDBM_Strerror() function to get a textual error description.
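
Because every function follows the same convention, the check can be wrapped once.  The helper below is a sketch with a hypothetical name (report_result).  It takes the string-lookup function as a parameter so that the fragment compiles without ncdbm.h; in real code you would pass NcDBM_Strerror, whose prototype in the synopsis matches the parameter type.

```c
#include <stdio.h>

/*
 * Report a failed call on 'out' and pass the result code through.
 * 'strerr' may be NcDBM_Strerror, or NULL to print only the number.
 */
static int report_result(int e, const char *what, const char *(*strerr)(int), FILE *out)
{
	if (e < 0) {
		if (strerr != NULL)
			fprintf(out, "%s failed: %s\n", what, strerr(e));
		else
			fprintf(out, "%s failed: error %d\n", what, e);
	}
	return (e);
}
```

A call would then look like: if (report_result(NcDBM_Close(&db), "close", NcDBM_Strerror, stderr) != NCDBM_NO_ERR) return (-1);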

Miscellaneous

To use a different hashing function, use NcDBM_SetHashFunction.  If you use a different function, you must be consistent (all applications using the database must use the same function) and you must specify the function before accessing the data.  A good time to do that is immediately after opening the database.  The function should be declared like:

NcDBM_hash32_t NcDBM_DefaultHashFunction(const unsigned char *ucp, int dsize);

The NcDBM_hash32_t datatype is simply a 32-bit unsigned integral type.  For most systems, that is identical to an unsigned int.  The function should take the binary data indicated by "ucp" whose length is "dsize" bytes and return a hash value.
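
As an illustration, here is a replacement hash with the required shape, using the well-known 32-bit FNV-1a algorithm.  This is only a sketch: example_hash32_t stands in for NcDBM_hash32_t so that the fragment compiles without ncdbm.h, ExampleFnv1aHash is a hypothetical name, and nothing here implies that the library's default hash is FNV-1a.

```c
#include <stdint.h>

/* Stand-in for NcDBM_hash32_t, which ncdbm.h defines in real code. */
typedef uint32_t example_hash32_t;

/* 32-bit FNV-1a over dsize bytes starting at ucp. */
static example_hash32_t ExampleFnv1aHash(const unsigned char *ucp, int dsize)
{
	example_hash32_t h = 2166136261u;	/* FNV offset basis */
	int i;

	for (i = 0; i < dsize; i++) {
		h ^= (example_hash32_t) ucp[i];
		h *= 16777619u;			/* FNV prime */
	}
	return (h);
}
```

With the return type changed to NcDBM_hash32_t, such a function could be passed to NcDBM_SetHashFunction right after NcDBM_Open, provided that NcDBMHashFunction is a pointer to a function of this shape, as the declaration above suggests.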

Notes

The database was designed to operate best in a mostly read-only environment.  That is, performance is best when doing many queries and few inserts.

The database uses file locking to allow multiple processes to safely access the same database file, as long as the processes use library functions to manipulate the database contents (the library uses advisory record locks for UNIX platforms).  Note that ndbm does not do locking of any kind and is therefore unsuitable for concurrent use by multiple processes.
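
For readers unfamiliar with advisory record locks, the sketch below shows the general POSIX mechanism using fcntl().  It illustrates the technique only and is not the library's code; which byte ranges the library actually locks is not described here.  The name lock_whole_file is hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Apply an advisory lock of the given type (F_RDLCK, F_WRLCK or F_UNLCK)
 * to the whole file, waiting if another process holds a conflicting lock.
 * Returns 0 on success and -1 on error, with errno set by fcntl().
 */
static int lock_whole_file(int fd, short type)
{
	struct flock fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "through end of file" */
	return (fcntl(fd, F_SETLKW, &fl) == -1 ? -1 : 0);
}
```

Such locks are advisory: they only coordinate processes that also ask for them, which is why every process must go through the library functions rather than writing to the file directly.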

The database files are currently dependent on the host machine's byte-ordering.  As such, database files may not be portable to different platforms.
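
If database files may travel between machines, an application can record the byte order of the host that created a file (for example in the file name) and refuse to open a file whose tag differs from the current host's.  The sketch below shows one way to detect the host byte order; host_byte_order_tag is a hypothetical helper, and the library itself is not known to perform such a check.

```c
#include <stdint.h>
#include <string.h>

/* Return 'L' on a little-endian host and 'B' on a big-endian host. */
static char host_byte_order_tag(void)
{
	const uint32_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);	/* lowest-addressed byte of probe */
	return (first == 1 ? 'L' : 'B');
}
```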

Database files are not "sparse" files like ndbm's .pag files.  However, the files are compact so they wouldn't benefit from using sparse files.

A hash-table bucket may contain a potentially unlimited number of records.  This is different from ndbm, which limits the number of records matching a particular hash-bucket, since it places all matching records into a fixed-length contiguous portion of the .pag file.