Caching library

Internal Cache Structure

Cache can be divided into several parts – each should be identified with the symbolic string – 2-level cache. If cache consists of only one type of data, then it isn’t divided – 1 level cache. Let's call each part of the 2-level cache an “entry”. As entries would be registered very seldom – basically during cache initialization – they would be stored in the sorted (by entry name) array.

Let’s consider that we have 2-level cache (all information provided for 2-level successfully applies to 1-level cache). Each part is an entry, where similar types of data are stored (for example, getpwnam function results).

Entry is the lowest level of cache. Each entry consists of its elements. Each element of entry is the pair of symbolic key of arbitrary length, some amount of arbitrary user data and cache specific data (the time of the last request, the total number of requests and the request frequency). With such structure, I’d like to implement each entry as a hash table with overridable hashing function.

So far, we have cache storage architecture basically defined. But maybe even more important part of the cache is its policies. There are 3 policies that must be implemented:

  1. LRU (Least Recently Used) – the element with the oldest request time would be deleted the first.

  2. LFU (Least Frequently Used) – the element with the worst frequency-of-request value would be deleted the first.

  3. FIFO (First In First Out) – the oldest element in the cache would be deleted the first.

To implement these policies we should have an array of pointers to the cache elements. This array will be sorted with qsort, customized by specific function pointer (one for each policy). Memory for this array will be managed in the way to avoid multiple calls of malloc or realloc.

Let's call elements removal from cache a “transformation”. Transformations can be made when the cache size is exceeded or by the user action. Cache flushing is also a transformation and it only can be made by the user request.

Transformation can be applied to every cache entry. If transformation is applied to the whole cache, it is iteratively applied to each cache entry. During cache flushing all elements from cache are removed. When cache size is exceeded some amount of elements should be deleted (this magic number will be 15% by default and would be overridable by user).

Multipart caching

get*ent families of functions retrieve data iteratively. And we can only cache the whole snapshot of these data. So we should have the notion of caching session when caching such type of data. User initializes the session, then adds each part of the snapshot. And then closes the session. After that the whole snapshot is cached. There would be only limited number of sessions and each session would have the limited lifetime. Multipart data can be stored in special type of entries – multipart entries.

To work with multipart entry, user will have to use special functions.

Caching library interface

/* cache library */

struct cache; // I'll be absolutely clear about it during the development process
struct cache_multipart_session; // I'll be absolutely clear about it during the development process

enum cache_entry_type { CEL_MULTIPART, CEL_NORMAL }:
enum cache_transformation_type { CTT_FLUSH, CTT_CLEAN };
enum cache_policy_type { CPT_LRU, CPT_LFU, CPT_FIFO };

struct cache_params
{
        unsigned char num_levels; /* can be 1 or 2 */
        size_t maximum_memsize; /* maximum cache total size in bytes; if 0 then no such check is made */
        size_t maximum_elemsize; /* maximum cache total size in cache elements; if 0 then no such check is made */
};

/* base structure - normal_cache_entry_params and multipart_cache_entry_params are inherited from it */
struct cache_entry_params
{
        enum cache_entry_type entry_type; 
};

/* params, used for most entrues */
struct normal_cache_entry_params
{
        /* inherited fields */
        enum cache_entry_type entry_type; 
        
        /* unique fields */
        const char * entry_name;
        size_t maximum_memsize; /* maximum entry size in bytes; if 0 then no such check is made */
        size_t maximum_elemsize; /* maximum entry size in elements; if 0 then no such check is made */
        time_t maximum_lifetime;
        enum cache_policy_type policy; /* policy, that should be applied during this entry transformation */
};

/* params, used for multipart entries */
struct multipart_cache_entry_params
{
        /* inherited fields */
        enum cache_entry_type entry_type;
        
        const char * entry_name;
        size_t maximum_memsize; /* maximum entry size in bytes; if 0 then no such check is made */
        size_t maximum_elemsize; /* maximum entry size in elements; if 0 then no such check is made */
        time_t maximum_lifetime;
};

/* initialization/destruction routines */
cache * init_cache(cache_params * params);
void destroy_cache(cache * the_cache);

/* cache entries registration/unregistration */
int register_cache_entry(cache * the_cache, struct cache_entry_params * entry_params);
int unregister_cache_entry(cache * the_cache, const char * entry_name);

/* store/retrieve operations used on normal entries */
int cache_store(cache * the_cache, const char * entry_name, const char * cache_key, void * data, size_t data_size);
int cache_retrieve(cache * the_cache, const char * entry_name, const char * cache_key, void * data, size_t * data_size);

/* store/retrieve operations used on multipart entries
 * 
 * entry data, filled with cache_multipart_store are committed only if cache_multipart_session_end
 * function was called
 */
cache_multipart_session * cache_multipart_session_begin(cache * the_cache, const char * entry_name);
int cache_multipart_store(cache_multipart_session * session, void * data, size_t data_size);
int cache_multipart_read(cache_multipart_session * session, size_t part_offset, void * data, size_t * data_size);
int cache_multipart_session_end(cache_multipart_session * session);

/* transforms the specified cache entry, or all entries if the entry_name is NULL */
int transform_cache(cache * the_cache, const char * entry_name, enum cache_transformation_type transformation_type);

Deliverables of the caching library functionality

Caching library can supply the local cache, and will provide all the functionality, described in the Internal Cache Structure paragraph. It will be compiled via standard FreeBSD makefiles into the static library, which will be used by the caching daemon and by the libc in case of in-process caching.

Caching daemon architecture

Communication protocol

Communication protocol is defined below as the amount of structures. The sequence of fields in each structure is equal to sequence in which, they would be sent. Each data buffer is preceeded by its size. Each request structure is preceeded by the unsigned char, which uniquely identifies it.

struct cache_store_request
{
        size_t entry_length;
        char * entry; // ignored if entry_length is 0
        
        size_t cache_key_length;
        char * cache_key;

        size_t data_size;
        void * data;
};

struct cache_store_response
{
        int error_code;
};

struct cache_read_request
{
        size_t entry_length;
        char * entry; // ignored if error_code is 0
        
        size_t cache_key_length;
        char * cache_key;
};

struct cache_read_response
{
        int error_code;

        size_t data_size; // ignored if error_code is 0
        void * data; // ignored if error_code is 0
};

struct cache_session_begin_request
{
        size_t entry_length;
        char * entry; // ignored if entry_length is 0
};

struct cache_session_begin_response
{
        int error;
        int session_id; // ignored if error is 0
};

struct cache_session_end_request
{
        int session_id;
};

struct cache_session_end_response
{
        int error;
};

struct cache_multipart_read_request
{
        int session_id;
        size_t offset;
};

struct cache_multipart_read_response
{
        int error_code;

        size_t data_size; // ignored if error_code is 0
        void * data; // ignored if error_code is 0
};

struct cache_multipart_write_request
{
        int session_id;

        size_t data_size;
        void * data;
};

struct cache_multipart_write_response
{
        int error_code;
};

struct cache_transformation_request
{
        size_t entry_length;
        char * entry; // ignored if entry_length is 0

        enum cache_transformation_type transformation_type;
};

struct cache_transformation_response
{
        int error_code;
};

Communication library

Here is the definition of the communication library, which will communicate with the caching daemon.

/* caching daemon communication library */
enum cached_transformation_type { CDTT_FLUSH, CDTT_CLEAN };

struct cached_connection;
struct cached_multipart_session;

struct cached_connection_params
{
        char * socket_path; /* path to the unix socket */
        time_t timeout; /* maximum allowed transaction time; 0 means infinite */
};

/* connection routines */
cached_connection * open_cached_connection(cached_connection_params * params);
void close_cached_connection(cached_connection * connection);

/* normal entries store/retrieve routines */
int cached_store(cached_connection * connection, const char * cache_key, void * data, size_t data_size);
int cached_retrieve(cache_connection * connection, const char * cache_key, void * data, size_t * data_size);

/* multipart entries store/retrieve operations
 * 
 * they work just by the same rules as in the local version
 * cache_multipart_session can exist even if the cached_connection is closed
 */
cached_multipart_session * cached_multipart_session_begin(cached_connection * connection, const char * entry_name);
int cached_multipart_store(cached_multipart_session * session, void * data, size_t data_size);
int cached_multipart_read(cached_multipart_session * session, size_t part_offset, void * data, size_t * data_size);
int cached_multipart_session_end(cached_multipart_session * session);

/* requests the cache transformation in the daemon */
int cached_transform(cached_connection * connection, const char * entry_name, enum cached_transformation_type transformation_type);

Communication library interface is very similar to the caching library interface. We can even define some macro, to make the calls to the remote cache and to the local cache look similar. I think, it's unnecessary, though.

Daemon workflow

Daemon consists of several threads, which work in concurrent mode. There is a global kqueue, tuned to work with the unix socket. Each request to the daemon is the state-machine. The request life usually looks this way: socket IO operation -> cache IO operation -> socket IO operation. Each state is represented with function pointer.

Here is the approximate definition of structure, which represents request state:

struct query_state;
struct query_headers;

typedef void *(*query_state_func)(struct query_state * qstate);
typedef void * query_process_func;

struct query_state
{
        int sockfd; // socket to read/write
        int kevent_filter; // kevent filter, which can be EVFILT_WRITE or EVFILT_READ
        size_t kevent_watermark; // the amount of bytes we need to be able to send/receive

        query_state_func current_func; // function, which represents current state
        query_state_func end_func; // function, which should be called, when the state is about to be destroyed

        struct query_headers * headers_data; // data, which should be used to process the query
        
        void * data; // data, which were retrieved from cache, or which should be placed into the cache
        size_t data_size; // size of the data

        int status_code;
        uid_t euid; // euid of the process which send the request, received from the socket credentials
};

When request arrives, daemon creates the query_state and adds event to kqueue to read data from socket. After all data are read, daemon processes the request using the caching library. After that it adds event to kqueue to send the response.

Besides request-handling threads there should be the “maintenance” thread. This thread should register all cache sessions and delete “stalled” sessions (sessions, where cache_session_end function was not called).

Caching daemon configuration file

Caching daemon configuration file will be called cached.conf and will be most probably placed in the /etc folder. It will consist of 2 types of elements. First is the key=value pair. Possible pairs are:

max_cache_memsize=n
max_cache_elemsize=n
cache_session_timeout=n
max_cache_sessions=n

The 2nd type of configuration file's element is this:

entry “entry_name” {
multipart=yes|no
max_memsize=n
max_elemsize
policy=LFU|LRU|FIFIO
max_lifetime=n
}

The configuration file will have some pairs, describing global cache structure and then there will be the sequence of entry elements, describing each cache entry. The configuration file will be read by the caching daemon on its startup. It will also be re-read on SIGHUP signal.

Integrating nsswitch and caching

Nsswitch and caching intergration will be made this way: We must implement the “cache” module. This module will contain 1 function. It will use the mdata argument. Function will assume that mdata is the pointer to the structure of this type:

struct cache_mdata
{
        id_func_t id_func;
        marshal_func_t marshal_func;
        unmarshal_func_t unmarshal_func;
};

Function pointers typedefs will look this way:

int (*id_func_t)( char * entry, size_t * entry_size, char * key, size_t * key_size,va_list * ap);
int (*marshal_func_t)(void * buffer, size_t * buffer_size, void * retval, va_list *ap);
int (*unmarshal_func_t)(void * buffer, size_t buffer_size, void * retval, va_list * ap);

The cache_mdata structure will be specified in the dtab array before the nsdispatch call. id_func_t build entry name and cache key regarding the ap argument. marshal_func_t marshals the call result to the buffer. And unmarshal_func_t unmarshals the buffer to the result.

If the cache module is present in the nsswitch.conf then the function from it is called. This function places the data, which were passed in the mdata argument, as the retval. Then nsdispatch uses id_func_t and marshal_func_t and tries to find the call result in the cache. If successful it will be treated as if the cache module returns NS_SUCCESS. In other case it will be treated as if the cache module returns NS_NOTFOUND. However, if in the end of modules chain the successful result appears and it is not from the cache, than it will be stored in the cache. Caching library/caching daemon communication library will be used only inside of the nsdispatch.

Note: may be this approach will change during the implementation. However the basic concept will be the same, I think. Cache module must be treated in the special way – not as the module itself but as the rule of processing query results.

Note: the ability of caching NS_NOTFOUND results is not described here. I will add its description a bit later. It won't need the caching library/daemon architecture to change significantly.

Nsswitch extensions

For now nsswitch should be extended to use these database:

  1. services
  2. protocols
  3. rpc
  4. ssh host keys (openssh port should be patched)
  5. globus grid security files (port should be patched)

When the above is done, I'll be able to implement some other features:

  1. New "user" source that looked in a ~/.hosts file if the binary met certain conditions (not-suid etc.)
  2. nsswitch support for MAC and audit related configuration files

Project dates

I'd like to work on caching daemon and nsswitch extensions in parallel. I'll divide all the time, that I have (july and august) in 4 periods – each will last 2 weeks.

1st period (week 1,2). OpenSSH patch and services patch. 50% of the caching library implementation.

2nd period (week 3,4) Protocols and rpc patch. Caching library completed.

3rd period (week 5,6) Globus grid security files. 50% of the caching daemon and communication library implementation. Nsdispatch integration.

4th period (week 7,8) Caching daemon and communication library completed. All patches are verified, the project is completed.

This schedule doesn't include “user” source implementation and nsswitch support for MAC and audit related configuration files, but I hope that I'll have some time left to make these features. However if I am not able to do it before september, I'll be able to do it after the Summer Of Code.

NsswitchAndCachingTechnicalDetails (last edited 2008-06-17 21:37:36 by localhost)