Monday, April 8, 2013

A fixed capacity cache with an LRU eviction policy in C#

One of the critical strategies for building a system that performs well at high scale is effective caching.  In distributed systems, caching can be your best defense against overloading a database, and an effective way to ride out temporary database failures, so you can keep servicing requests during downtime.  Our services often provide a cache timeout per endpoint, depending on how fresh the data needs to be for the system to function reliably.  However, the cache can grow unwieldy if there are many unique calls to an endpoint in a short period of time, when only a subset of those calls are likely to benefit from caching.

The solution is a cache of fixed size, which evicts items using a least recently used policy when capacity is reached.  The requirements are that the cache support O(1) inserts and reads, and manage the freshness of the data internally.  The user should be able to specify the capacity of the cache, as well as a staleness timeout.  The interface in my example is a string-to-string key-value store, with functions set(key, value) and get(key).  Lookups for expired or missing items return null.

What makes this a fun problem is understanding how to mix data structures to achieve these goals.  Constant-time lookups and inserts mean you want a hash table as your primary storage.  To maintain recency order, we need to store our entries in a list... but it is important that removing a node from anywhere in the list, as well as adding a node to the head or tail, happen in constant time.  A doubly linked list is appropriate for this goal.  In my design, I'll store the list nodes in the hash table indexed by the key, and a tuple containing the key, value, and expiration time inside each list node.

My solution is below:
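The following is a minimal sketch of the design described above, using .NET's Dictionary and LinkedList (a doubly linked list); the class and member names here are illustrative assumptions, not necessarily the original listing:

```csharp
using System;
using System.Collections.Generic;

public class LruCache
{
    private sealed class Entry
    {
        public string Key;
        public string Value;
        public DateTime Expires;
    }

    private readonly int _capacity;
    private readonly TimeSpan _timeout;
    // Hash table maps each key to its list node for O(1) lookup.
    private readonly Dictionary<string, LinkedListNode<Entry>> _map =
        new Dictionary<string, LinkedListNode<Entry>>();
    // Head of the list is the most recently used entry; tail is the LRU.
    private readonly LinkedList<Entry> _order = new LinkedList<Entry>();

    public LruCache(int capacity, TimeSpan timeout)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
        _timeout = timeout;
    }

    public void Set(string key, string value)
    {
        if (_map.TryGetValue(key, out var node))
        {
            // Refresh the existing entry and mark it most recently used.
            node.Value.Value = value;
            node.Value.Expires = DateTime.UtcNow + _timeout;
            _order.Remove(node);     // O(1): we hold the node itself
            _order.AddFirst(node);
            return;
        }

        if (_map.Count >= _capacity)
        {
            // Evict the least recently used entry from the tail.
            var lru = _order.Last;
            _order.RemoveLast();
            _map.Remove(lru.Value.Key);
        }

        var entry = new Entry { Key = key, Value = value, Expires = DateTime.UtcNow + _timeout };
        _map[key] = _order.AddFirst(entry);
    }

    public string Get(string key)
    {
        if (!_map.TryGetValue(key, out var node))
            return null;

        if (DateTime.UtcNow >= node.Value.Expires)
        {
            // Stale entry: drop it and report a miss.
            _order.Remove(node);
            _map.Remove(key);
            return null;
        }

        // Mark as most recently used.
        _order.Remove(node);
        _order.AddFirst(node);
        return node.Value.Value;
    }
}
```

The key design point is that the dictionary stores LinkedListNode references rather than values, so both the move-to-front on access and the eviction from the tail avoid any list traversal.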
