As I have ranted about previously, I’m not a big fan of relational databases.  They have their uses, but they also have some very serious drawbacks, and I think they should be employed selectively.  There are lots of other ways you can handle persistence for your application.  This post kicks off what I hope will be a short series about some of the alternatives that I’ve used.  We’ll see how long it is before I get derailed into something more interesting.

Today, we’re going to look at ESENT, an embedded database that actually ships with every recent version of Windows (dating back, I believe, to Windows Server 2000, though ). ESENT is an extremely performant engine that is used by Microsoft Exchange and Active Directory (and probably other things, too).  Ayende is using it to power DivanDB, a .NET document-oriented database.  Since Ayende first blogged about it, there’s been a considerable rise in interest for ESENT.  There’s even talk of a LINQ to ESENT API, though I would agree that the underlying API needs a lot of work first.

ESENT has a fairly solid feature set:

  • ACID transactions with savepoints, lazy commits and robust crash recovery.
  • Snapshot isolation.
  • Record-level locking — multi-versioning provides non-blocking reads.
  • Highly concurrent database access.
  • Flexible meta-data (tens of thousands of columns, tables and indexes are possible).
  • Indexing support for integer, floating point, ASCII, Unicode and binary columns.
  • Sophisticated index types including conditional, tuple and multi-valued.
  • Individual columns can be up to 2GB in size. A database can be up to 16TB in size.
  • Can be configured for high performance or low resource usage.
  • No administration required (even the database cache size can adjust itself automatically).
  • No download. Your application uses the esent.dll which comes with the operating system.

One thing relational database lovers will notice is the lack of SQL support.  Indeed, you won’t be writing “SELECT * FROM…” with ESENT.  The API it exposes is much lower-level.  Another thing that will probably cause many developers to turn and run away screaming is the word “Jet”.  Yes, ESENT is indeed also known as JET Blue, but this is not the Jet that Microsoft Access uses.  Access actually uses a completely separate thing, JET Red.  The two are not related as far as code base goes. 

Using ESENT from .NET is very easy.  There’s a managed API available here.  Unfortunately, the API does resemble the underlying C API very closely, but that appears to be by design.  Still, the API is quite usable, just a bit verbose. 

Creating a database

Before you can use a database, you have to create it.  For this example, I’m going to be making a very simple document database that simply stores an XML-serialized Document object.  I’m just using ESENT as a key-value repository and not taking advantage of many of its other features.

using (Instance instance = new Instance("createdb"))
{
    instance.Init();

    using (Session session = new Session(instance))
    {
        JET_DBID dbid;

        //Overwrite the existing DB.
        Api.JetCreateDatabase(session, DATABASE_NAME, null, out dbid, CreateDatabaseGrbit.OverwriteExisting);

        using (Transaction transaction = new Transaction(session))
        {
            JET_TABLEID tableid;

            Api.JetCreateTable(session, dbid, TABLE, 16, 100, out tableid);

            JET_COLUMNID columnid;
            
            JET_COLUMNDEF idColumn = new JET_COLUMNDEF { coltyp = JET_coltyp.Binary, cp = JET_CP.None};
            JET_COLUMNDEF docColumn = new JET_COLUMNDEF {coltyp = JET_coltyp.LongText, cp = JET_CP.Unicode};

            Api.JetAddColumn(session, tableid, DOC_ID_COLUMN, idColumn, null, 0, out columnid);
            Api.JetAddColumn(session, tableid, DOC_BODY_COLUMN, docColumn, null, 0, out columnid);

            string keyDescriptor = string.Format("+{0}\0\0", DOC_ID_COLUMN);

            Api.JetCreateIndex(session, tableid, "DocumentID", CreateIndexGrbit.IndexPrimary, keyDescriptor,
                               keyDescriptor.Length, 100);

            Api.JetCloseTable(session, tableid);

            transaction.Commit(CommitTransactionGrbit.None);
        }

    }

}

Yeah, like I said, the API is very verbose.  You have your instance creation (the string parameter there can be anything you want to name your session), your session, the creation of the physical database file (note in this case that I’m overwriting the DB if it already exists), and then a transaction.  Inside the transaction, a single table is created with two columns.  A primary key is then created across the ID column. 

Inserting a document

Assuming our database exists, the next thing we will probably want to do is put some data into it.  (Side note: a subcontractor once said, and this is an exact quote, “So you put data in your database?” in response to me describing how we persisted objects in the project.  It was one of the funniest things I’ve ever heard anyone say.  Anyway…  )  This method assumes that you have an Instance object as a field that’s been initialized previously.  Instances are thread-safe, but sessions are not, so be sure to create a new session even if you are reusing an instance. 

protected override void InsertDocument(Document document)
{
    using (Session session = new Session(mInstance))
    {
        JET_DBID dbid;

        Api.JetAttachDatabase(session, DATABASE_NAME, AttachDatabaseGrbit.None);
        Api.JetOpenDatabase(session, DATABASE_NAME, null, out dbid, OpenDatabaseGrbit.None);

        using (Table table = new Table(session, dbid, TABLE, OpenTableGrbit.None))
        {
            var columnIds = Api.GetColumnDictionary(session, table);

            using (Transaction transaction = new Transaction(session))
            {
                using (Update update = new Update(session, table, JET_prep.Insert))
                {
                    Api.SetColumn(session, table, columnIds[DOC_ID_COLUMN], document.DocumentID);
                    Api.SetColumn(session, table, columnIds[DOC_BODY_COLUMN], document.ToXml(), Encoding.Unicode);

                    update.Save();
                }

                transaction.Commit(CommitTransactionGrbit.None);
            }
        }
    }
}

Again, we open a session, then we both attach and open the database.  After that, it’s safe to request access to the table, where we can create a transaction to perform the actual insert.  You need the underlying JET_COLUMNID objects when you are populating columns, which can be obtained using Api.GetColumnDictionary.  The basic pattern is still the same as you would use for a relational DB though: open a connection, begin a transaction, make a command (in this case, an insert), set some properties, save and commit.  Did I mention that the ESENT API was verbose?

Retrieving documents

You will find no SQL in ESENT.  In the unlikely event that you want your data back (<—– JOKE), you will have to use a lower-level API than you are probably used to.

protected override Document SelectDocument(Guid documentId)
{
    using (Session session = new Session(mInstance))
    {
        JET_DBID dbid;

        Api.JetAttachDatabase(session, DATABASE_NAME, AttachDatabaseGrbit.None);
        Api.JetOpenDatabase(session, DATABASE_NAME, null, out dbid, OpenDatabaseGrbit.None);

        using (Table table = new Table(session, dbid, TABLE, OpenTableGrbit.None))
        {
            using (Transaction transaction = new Transaction(session))
            {
                Api.JetSetCurrentIndex(session, table, null);

                Api.MakeKey(session, table, documentId, MakeKeyGrbit.NewKey);

                //NOTE: This will throw an exception if the key isn't in the index.  It might
                //be better to use JetTrySeek instead.
                Api.JetSeek(session, table, SeekGrbit.SeekEQ);

                string xml = Api.RetrieveColumnAsString(session, table, Api.GetColumnDictionary(session, table)[DOC_BODY_COLUMN]);

                return XmlSerialization.CreateInstance<Document>(xml);
            }
        }
    }
}

Again, we start by opening a session, attaching and opening the database, then requesting access to the table.  To do a look-up, we have to use an index, which is what Api.JetSetCurrentIndex does.  The last parameter is the name of the index; passing in null tells it to use the primary index.  Next, you have to make a key for the lookup, which I’m doing based on the document ID field.  After that, you have to seek to the record.  JetSeek will throw an exception if it can’t find a match, so use JetTrySeek if you aren’t positive the record is going to be present.  After jumping through all those hoops, you can finally get your data back using one of the Api.RetrieveColumnAs methods.  The final step there just uses a helper class I’ve made to go from the XML-serialized representation of my document back to an actual Document object.

Gotchas

I ran into a threading problem while I was prototyping with ESENT.  I decided to be clever, and I started trying to use the various OpenTableGrbit and OpenDatabaseGrbit enum members to tell the database what kind of access I needed (read-only, write, etc).  It turns out that I fail at being clever though, because apparently that causes it to go from record-level locking to table-level locking.  That’s not a problem if you have a single-threaded application, but it’s going to bite you hard if you have a few threads trying to work.  Instead, just play it safe and use the default settings (None) unless you know what you’re doing.

The End… FOR NOW

I really like ESENT.  I’m running some tests now to assess its feasibility on a project that we’re working on where we need a super-fast key-value storage engine.  Depending on how that goes, you may be hearing more about ESENT in the future.  In any case, I’d *love* to see a more generic API created around ESENT.  Maybe I’ll tackle that in my spare time.  Oh right, I don’t have any spare time. :(  Oh well, back to work…