Click here to Skip to main content
15,881,559 members
Articles / Database Development / SQL Server
Article

Database local cache

Rate me:
Please Sign up or sign in to vote.
3.33/5 (4 votes)
14 Dec 2007CPOL8 min read 55.4K   679   29   4
A C# class that uses the local file system to cache binary objects stored in a database.

Introduction

Let's assume you are developing a desktop application for managing a fitness center. The members information is, of course, to be stored in a database, so you define a "Members" table with fields such as name, birth date, telephone, and the like. But, your happiness ends when you are faced with the requirement of recording the member's photo together with the rest of the data. Then, the second most transcendental question in human history (right after the meaning of life) comes to your mind: Should I store the photo in the database for the sake of data consistency, or in the local file system for the sake of performance?

This article presents a C# class that will allow you to grab the best of two worlds.

Background

There are times when we need to store some large binary data as part of a larger, relational set of data. The debate about where this kind of data should be stored is not new. Basically, there are two alternatives, each one offering its own advantages: using a database, which offers transactional integrity and allows to tie all the data together in a consistent way; and using disk files, which can't be beaten when data retrieval speed is a must, and can indeed save a lot of network traffic.

This article presents a C# class, DatabaseFileCache, that aims to get the best of the two approaches. The basic idea is to store the authoritative binary data in the database, and once retrieved, to cache this data in the local file system.

So, the first time a given binary object is requested by the application, it is retrieved from the database and cached in a local file. The next time the same object is requested, the cached file is returned instead, unless the object has changed since the last request. To keep track of changes, timestamp columns are used; the timestamp value is stored as part of the file name in the local cache.

How it works

To use the class, we first need a database table to store the binary objects (a "binary object" is any opaque sequence of bytes). The default name for this table is Objects ("default names" are the names used by the class unless you explicitly specify different ones), and it has the following columns:

  • A string (varchar or similar) column to store the object name (default name is Name).
  • An image, varbinary, or similar type column to store the actual binary data of the object (default name is Value).
  • A timestamp column to keep track of object changes (default name is timestamp).

Some flexibility is allowed here. The value column may be of any type that ADO.NET can convert to a byte array, and the timestamp column may be of any type, provided that it automatically changes whenever the value column changes (the timestamp data type was created exactly for this purpose, by the way). Extra columns may be present in the table as long as they either allow null values, have a default value, or are automatically filled by the database engine when a new record is created.

We also need a directory in the local file system to store the cached files. The default is a directory named DatabaseCache placed under the directory specified by the DataDirectory application domain property (yes, that's the same DataDirectory used by SQL Server 2005 Express connection strings; you can set it with the AppDomain.CurrentDomain.SetData method). If DataDirectory is not set, the DatabaseCache directory is assumed to be in the main application executable directory (Assembly.GetEntryAssembly().Location is used to obtain this information).

The main operation provided by the class is to retrieve a binary object, given its name. This is the pseudo-code algorithm for this operation:

Is there a file in the local cache with a matching name?
  No
    Retrieve the object from the database
    If no object with such name exists in database, return NULL
    Store the object in a file in the local cache
    Return the cached file
  Yes
    Obtain the current timestamp for the object in the database
    If no object with such name exists in database
      Delete the cached file
      Return NULL
    Obtain the timestamp of the cached file
    Do both timestamps match?
      Yes
        Return the cached file
      No
        Delete the cached file
        Retrieve the object from the database
        Store the object in a file in the local cache
        Return the cached file

As you can see, the binary object is retrieved from the database only when needed; that is, when the object is requested for the first time, and when it has changed since the time it was cached; in all other cases, only the timestamp (a few bytes) is retrieved from the database, and the cached file is directly returned.

Note also that dummy cached files (objects that exist in the local cache but not in the database) are appropriately detected and deleted. These dummies may appear, for example, when the database is accessed by multiple users, and one of them deletes an object in the database while another user had already cached the object.

About the object names

To name your binary objects in the database, any string can be used. When storing these objects in the local file system, the file name is composed from the object name in the following way. First, characters that are not valid for a file name (such as "*" or "/") are replaced by the sequence %uuuu, where uuuu is the Unicode representation of the character in hexadecimal; the "%" character itself and the dot are replaced as well. Next, the timestamp value is converted to its hexadecimal representation, and appended to the file name as the extension.

For example, assume you are naming your object as follows:

Fast retransmit *really* increases speed in 20% over TCP/IP.pdf

Then, the cached file name would be the following, assuming a fictitious 4 byte timestamp:

Fast retransmit %002Areally%002A increases speed 
         in 20%0025 over TCP%002FIP%002Epdf.0045AB7F

Such encoded strings can be easily converted back to their original un-encoded versions.

Using the code

To use this code, you need to create an instance of the DatabaseFileCache class. Each instance is associated with a given table in a given database and with a given local cache path (you can specify these parameters in the class constructor, or later via class properties). Once the instance is created, you use its methods to store objects in the database and to retrieve them using the cache mechanism explained.

To instantiate the class, you have three constructors:

  • A simple one which admits a single parameter: the connection object for the database. All other parameters (database table and column names, and the local cache location) are set to their default values.
  • A mid-complex one, which admits as parameters the connection object, the database table name, and the local cache location. The table column names are set to their default values.
  • The complete one, which admits as parameters the connection object, the database table name, the table column names, and the local cache location.

You can change all of the class parameters at any time, by accessing these public properties whose names are self-explaining: Connection, CachePath, TableName, NameColumn, ValueColumn, and TimestampColumn. Note that if you supply a relative path for CachePath, it will be combined with either the DataDirectory path or with the application executable path, as explained above.

There is an extra property that you may want to adjust: CommandTimeout, which controls the maximum time the SQL commands execution may take. It defaults to 30 seconds; you may want to set it to a larger value if you are dealing with very large objects or very slow networks.

As for the public methods exposed by the class, these are the ones:

  • SaveObject: Will store in the database the specified byte array or the contents of the specified file, with the specified name or with the same name as the file itself (there are three method overloads). If there is already an object with the same name in the database, its contents will be overwritten. This method does not access the local cache at all.
  • GetObject: Will retrieve an object from either the database or the local cache, following the algorithm explained above. Returns the path to the cached file, or null if the object with the specified name does not exist in the database.
  • GetCachedFile: Will return the path of the cached file for the specified object name, if it exists, or null otherwise, without accessing the database at all. This method can be used as a "life vest" if the database becomes unreachable, but only if it is acceptable to use data that may be out of date.
  • DeleteObject: Deletes the object with the specified name from both the database and the local cache, if it exists.
  • RenameObject: Changes the name of an object in both the database and the local cache, if it exists.
  • ObjectExists: Returns a boolean value indicating whether an object exists in the database with the specified name or not.
  • GetObjectNames: Returns an array of strings with the names of all the objects stored in the database.
  • PurgeCache: Deletes from the local cache all the files that have no matching object in the database. If the database is accessed by multiple users, it may be convenient to execute this method once when the application execution begins and/or ends.

Note that you can, at any time, manually delete some or all of the cached files, besides/instead of using the PurgeCache method. The class does not maintain any state information about the cached files, it only searches for existing files as needed.

Here are some simple examples of how to use the class. For example, for creating an instance, having the database table named Photos and using C:\PhotosCache as the cache directory, you would do:

C#
SqlConnection connection=new SqlConnection(@"Data Source" + 
     @"=.\SQLEXPRESS;AttachDbFilename="|DataDirectory" + 
     @"|Data\MyDatabase.mdf";Integrated security=true;");
DatabaseFileCache cache=new DatabaseFileCache(connection, "Photos", @"C:\PhotosCache");

Then, to store objects in the database:

C#
cache.SaveObject(@"C:\Photos\DSCF0100.jpg", "Kaito's first birtday.jpg");
// Using an alternative method overload:
cache.SaveObject(new byte[] { 1,3,5,7,11 }, "Some primes");

And to retrieve the data:

C#
string filePath=cache.GetObject("Kaito's first birtday.jpg")
MyForm.MyPictureBox.Load(filePath);

filePath=cache.GetObject("Some primes");
byte[] somePrimes=File.ReadAllBytes(filePath);

Points of interest

Nobody is perfect (not even me!). The DatabaseCache class could be improved/extended in a number of ways. I'll point here the ones that seem more interesting to me.

The class uses SQL Server as the database engine, hence all the connection, command, and parameter objects used belong to the System.Data.SqlClient namespace. It should not be difficult to adapt it for another provider, or even better, to use the ADO.NET factory classes for supporting any provider with one single code base.

Binary data is stored to and retrieved from the database the crude way: by directly using parameters of byte[] type in a single command execution. This will work fine for relatively small objects, but for very large objects, it would be more convenient to use some method for sending and retrieving data in small chunks.

Final note

This is my first submission to The Code Project, so please don't beat me too hard.

History

  • 14 December 2007: First version.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer SunHotels
Spain Spain
Under the secret identity of a C# programmer, a freaky guy who loves MSX computers and japanese culture is hidden. Beware!

Comments and Discussions

 
GeneralGoogle "C# memcached" and that is what you want. Pin
Zhefu Zhang20-Dec-07 7:36
Zhefu Zhang20-Dec-07 7:36 
Generalcache.SaveObject(new byte[] { 1,3,5,7,11 }, "Some primes"); Pin
Axel Rietschin16-Dec-07 2:34
professionalAxel Rietschin16-Dec-07 2:34 
GeneralRe: cache.SaveObject(new byte[] { 1,3,5,7,11 }, "Some primes"); Pin
konamiman.MSX16-Dec-07 21:01
konamiman.MSX16-Dec-07 21:01 
GeneralRe: cache.SaveObject(new byte[] { 1,3,5,7,11 }, "Some primes"); Pin
Axel Rietschin17-Dec-07 3:07
professionalAxel Rietschin17-Dec-07 3:07 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.