Extending the nsswitch implementation and adding caching support to nsswitch.
Nsswitch (name service switch) is a subsystem of the FreeBSD OS (http://www.freebsd.org/cgi/man.cgi?query=nsswitch.conf&apropos=0&sektion=0&manpath=FreeBSD+5.3-RELEASE+and+Ports&format=html), similar to the nsswitch subsystem in SunOS (http://docs.sun.com/app/docs/doc/816-5174/6mbb98uhu?a=view). It is used by many system functions. Nsswitch has 2 basic concepts: database and data source. The currently supported databases are passwords (user accounts), groups, hosts, networks and shells. Information from these databases is obtained through the getpw*, getgr*, gethost*, getnet* and getusershell families of functions. Nsswitch lets the user assign different data sources to each database. The standard data sources include files, nis and dns. Each source can be implemented as a plugin; the LDAP source plugin, for example, is widely used. However, nsswitch can be used with only these 5 databases. In SunOS, for example, the number is significantly larger. In FreeBSD the following databases could work with nsswitch, but this support is not implemented:
- SSH host keys
- Globus grid security files
To make these databases work with nsswitch, we need to patch several libc source files (see the project description part for details). So the first goal of the project is to extend the implementation of the nsswitch subsystem in FreeBSD.
One of the great things about nsswitch is that all work with databases is done in a similar way: the system calls the nsdispatch (http://www.freebsd.org/cgi/man.cgi?query=nsdispatch&apropos=0&sektion=0&manpath=FreeBSD+5.3-RELEASE+and+Ports&format=html) function, which then executes the corresponding functions in the data sources (a data source can be implemented in libc or in an external .so plugin). Because every query passes through this one function, we can cache the results of all queries to the databases. There are 2 basic strategies for implementing caching: in-process caching (caching mechanisms integrated into libc) or a special caching daemon. With in-process caching, the cached data is visible only to the process that made the calls, and the cache is lost when the process exits. With a caching daemon, the cache is system-wide and shared by all processes, but each lookup is slower because of the connection overhead (we must communicate with the daemon through unix sockets, for example). So the second goal of the project is to implement caching in nsswitch in such a way that the user can choose which strategy to use (by setting a variable in rc.conf, for example).
Benefits to the FreeBSD community
Caching can significantly improve the performance of several system function calls, especially with sources such as NIS and LDAP. My lookupd project demonstrates this (see the Biography part for more details). But caching can be used only where nsswitch is used, which is why the nsswitch implementation needs to be extended. Besides, after this extension more system databases will work in a uniform way, as they already do in SunOS, for example. Users will also be able to choose different data sources for these databases, such as LDAP for the services database.
To extend the nsswitch implementation in FreeBSD, we must make several system functions work via nsdispatch. Their current code will be moved into the data source implementations (the files or NIS data source, for example), and the function body will only call nsdispatch with the appropriate arguments. Where NIS and files functionality is mixed in the code, it will be split between the files source and the NIS source. For databases such as "services", a "compat" source will be implemented (so that we can use "+" in the /etc/services file). Here is the approximate list of libc files to be patched:
When this stage of work is done, users will be able to configure all these databases through /etc/nsswitch.conf.
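The restructuring described above can be sketched as follows. This is a simplified stand-in for the dispatch pattern, not the real nsdispatch API: every sk_* name, the status codes and the hard-coded source order are assumptions made for illustration. The code that used to live in the libc function moves into a per-source method, and the public function only dispatches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes, loosely modelled on the NS_* values. */
enum { SK_SUCCESS, SK_NOTFOUND, SK_UNAVAIL };

/* One data-source method: fills *retval and returns a status. */
typedef int (*sk_method)(void *retval, const char *key);

/* One entry of the dispatch table: source name -> method. */
struct sk_dtab {
    const char *src;
    sk_method   method;
};

/* "files" source: stands in for the code moved out of the libc function. */
static int files_servbyname(void *retval, const char *key)
{
    if (strcmp(key, "ssh") == 0) {
        *(int *)retval = 22;
        return SK_SUCCESS;
    }
    return SK_NOTFOUND;
}

/* "nis" source: always unavailable in this sketch. */
static int nis_servbyname(void *retval, const char *key)
{
    (void)retval;
    (void)key;
    return SK_UNAVAIL;
}

/* Simplified dispatcher: try the sources in the configured order. */
static int sk_dispatch(void *retval, const struct sk_dtab *dtab, size_t ndtab,
                       const char *const *order, size_t norder,
                       const char *key)
{
    assert(retval != NULL && key != NULL);
    for (size_t i = 0; i < norder; i++) {
        for (size_t j = 0; j < ndtab; j++) {
            if (strcmp(order[i], dtab[j].src) != 0)
                continue;
            if (dtab[j].method(retval, key) == SK_SUCCESS)
                return SK_SUCCESS;
        }
    }
    return SK_NOTFOUND;
}

/* The public function body shrinks to a single dispatch call. */
int sk_getservport(const char *name)
{
    static const struct sk_dtab dtab[] = {
        { "files", files_servbyname },
        { "nis",   nis_servbyname },
    };
    /* Order as it might be read from a "services: nis files" line. */
    static const char *const order[] = { "nis", "files" };
    int port = -1;

    if (sk_dispatch(&port, dtab, 2, order, 2, name) != SK_SUCCESS)
        return -1;
    return port;
}
```

In this sketch the "nis" source is tried first and reports itself unavailable, so the lookup falls through to "files". In the real subsystem the order would come from /etc/nsswitch.conf rather than from a static array.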
I plan to implement the caching feature in 3 steps:
1st step. Implement the caching library - a simple library with 5 basic functions for working with the cache. Here are their approximate definitions:
- cache * init_cache(struct cache_params params);
- void store(cache * the_cache, const char * cache_key, void * data, release_func_t release_func); (memory for each cache entry can be freed with its own specific release function)
- void * retrieve(cache * the_cache, const char * cache_key);
- void transform_cache(cache_transform_enum operation); (possible transforms are: flushing and removing old entries)
- void release_cache(cache * the_cache);
Note: these definitions are only approximate and may change during the work.
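To make the interface above more concrete, here is a minimal sketch of how it could be backed by a chained hash table. Several details are assumptions that the definitions above do not settle: the nbuckets field of struct cache_params, the CT_FLUSH value, and the extra cache argument to transform_cache. Expiry of old entries is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*release_func_t)(void *data);
struct cache_params { size_t nbuckets; };        /* assumed field */
typedef enum { CT_FLUSH } cache_transform_enum;  /* assumed value */

struct cache_entry {
    char *key;
    void *data;
    release_func_t release_func;
    struct cache_entry *next;
};

typedef struct cache {
    size_t nbuckets;
    struct cache_entry **buckets;
} cache;

/* djb2 string hash, reduced to a bucket index. */
static size_t bucket_of(const cache *c, const char *key)
{
    size_t h = 5381;
    while (*key != '\0')
        h = h * 33 + (unsigned char)*key++;
    return h % c->nbuckets;
}

cache *init_cache(struct cache_params params)
{
    cache *c = malloc(sizeof(*c));
    if (c == NULL || params.nbuckets == 0) { free(c); return NULL; }
    c->nbuckets = params.nbuckets;
    c->buckets = calloc(c->nbuckets, sizeof(*c->buckets));
    if (c->buckets == NULL) { free(c); return NULL; }
    return c;
}

/* Prepend a new entry; a repeated key shadows the older entry until a flush. */
void store(cache *the_cache, const char *cache_key, void *data,
           release_func_t release_func)
{
    size_t b = bucket_of(the_cache, cache_key);
    size_t len = strlen(cache_key) + 1;
    struct cache_entry *e = malloc(sizeof(*e));

    if (e != NULL && (e->key = malloc(len)) != NULL) {
        memcpy(e->key, cache_key, len);
        e->data = data;
        e->release_func = release_func;
        e->next = the_cache->buckets[b];
        the_cache->buckets[b] = e;
        return;
    }
    /* Out of memory: release the data instead of caching it. */
    free(e);
    if (release_func != NULL)
        release_func(data);
}

/* Return the newest data stored under cache_key, or NULL on a miss. */
void *retrieve(cache *the_cache, const char *cache_key)
{
    struct cache_entry *e = the_cache->buckets[bucket_of(the_cache, cache_key)];
    for (; e != NULL; e = e->next)
        if (strcmp(e->key, cache_key) == 0)
            return e->data;
    return NULL;
}

/* Only flushing is sketched: every entry is released and unlinked. */
void transform_cache(cache *the_cache, cache_transform_enum operation)
{
    assert(operation == CT_FLUSH);
    for (size_t b = 0; b < the_cache->nbuckets; b++) {
        struct cache_entry *e = the_cache->buckets[b];
        while (e != NULL) {
            struct cache_entry *next = e->next;
            if (e->release_func != NULL)
                e->release_func(e->data);
            free(e->key);
            free(e);
            e = next;
        }
        the_cache->buckets[b] = NULL;
    }
}

void release_cache(cache *the_cache)
{
    transform_cache(the_cache, CT_FLUSH);
    free(the_cache->buckets);
    free(the_cache);
}
```

The cache takes ownership of the stored data: the release function passed to store is called when the entry is flushed or the cache is released, so a caller can pass free for heap-allocated results.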
2nd step. Implement the caching daemon. The caching daemon will use the pthreads library - it will consist of several concurrent threads that listen on the unix socket by using kevent (http://www.freebsd.org/cgi/man.cgi?query=kevent&apropos=0&sektion=0&manpath=FreeBSD+5.3-RELEASE+and+Ports&format=html). The daemon will work as a state machine - each query will have a state. The state will consist of the number of bytes read/written, the cache key, the cache entry that was found (or the fact that none was found) and the user credentials (which will be obtained through the socket credentials mechanism). I used a similar model (kevent + state machine) in the lookupd project, and it has shown excellent results. For the caching itself, the daemon will use the caching library.
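The per-query state described above can be sketched as a small state machine that is advanced every time the socket delivers more bytes. The wire format used here (one length byte followed by the key) and all names are assumptions made for illustration; the kevent loop and the socket credentials call are left out, so the uid is simply passed in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define KEY_MAX 255

enum query_phase { QS_READ_LEN, QS_READ_KEY, QS_READY, QS_ERROR };

/* State kept for one query between two readiness notifications. */
struct query_state {
    enum query_phase phase;
    size_t key_len;           /* key length announced by the client */
    size_t bytes_read;        /* key bytes received so far */
    char   key[KEY_MAX + 1];  /* cache key, NUL-terminated once complete */
    void  *entry;             /* cache entry found, or NULL if none */
    uid_t  uid;               /* credentials of the peer */
};

void query_init(struct query_state *q, uid_t uid)
{
    memset(q, 0, sizeof(*q));
    q->phase = QS_READ_LEN;
    q->uid = uid;
}

/*
 * Feed the bytes returned by one non-blocking read into the query.
 * Returns how many bytes were consumed; the caller checks q->phase to
 * see whether the key is complete and the cache lookup can be done.
 */
size_t query_feed(struct query_state *q, const unsigned char *buf, size_t n)
{
    size_t used = 0;

    assert(q != NULL && (buf != NULL || n == 0));
    while (used < n) {
        switch (q->phase) {
        case QS_READ_LEN:
            q->key_len = buf[used++];
            q->phase = (q->key_len == 0) ? QS_ERROR : QS_READ_KEY;
            break;
        case QS_READ_KEY: {
            size_t want = q->key_len - q->bytes_read;
            size_t take = (n - used < want) ? n - used : want;

            memcpy(q->key + q->bytes_read, buf + used, take);
            q->bytes_read += take;
            used += take;
            if (q->bytes_read == q->key_len) {
                q->key[q->key_len] = '\0';
                q->phase = QS_READY;
            }
            break;
        }
        default:
            return used;  /* QS_READY or QS_ERROR: stop consuming */
        }
    }
    return used;
}
```

Because the state records how many bytes have arrived, a request that is split across several reads resumes where it stopped, and one thread can interleave many queries without blocking on any of them.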
3rd step. Make the system functions that use nsswitch able to use caching. This will be done by patching the nsdispatch function. Nsdispatch will have to recognize a "cache" source, so that it can be specified in the nsswitch.conf file. Nsdispatch will use either the in-process caching model or the caching daemon (this will be configured by an rc.conf variable). Users will be able to turn caching off at compile time by defining a special macro.
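The effect of placing a "cache" source in front of a slow source can be sketched like this. The single-slot cache and the backend_lookup function are hypothetical stand-ins for the caching library (or daemon) and for a source such as NIS or LDAP; the real change would live inside nsdispatch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Single-slot stand-in for the caching library or the caching daemon. */
static char cached_key[64];
static int  cached_value;
static int  cache_valid;

static int backend_calls;   /* counts trips to the slow source */

/* Stand-in for a slow source such as NIS or LDAP. */
static int backend_lookup(const char *key, int *value)
{
    backend_calls++;
    if (strcmp(key, "root") == 0) {
        *value = 0;
        return 1;
    }
    return 0;
}

/* Consult the cache first; on a miss ask the backend and remember a hit. */
int cached_lookup(const char *key, int *value)
{
    assert(key != NULL && value != NULL);
    if (cache_valid && strcmp(cached_key, key) == 0) {
        *value = cached_value;
        return 1;
    }
    if (!backend_lookup(key, value))
        return 0;
    if (strlen(key) < sizeof(cached_key)) {
        strcpy(cached_key, key);
        cached_value = *value;
        cache_valid = 1;
    }
    return 1;
}
```

Only successful lookups are remembered in this sketch; whether failed lookups should be cached as well is a separate design decision that the sketch does not address.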
I can start at any time in June, so I'll have about 2 months to finish the project. Here is the approximate schedule:
- Extending the nsswitch implementation - 2 weeks
- Implementing the caching library - 1.5 weeks
- Implementing the caching daemon - 2.5 weeks
- Patching nsdispatch (to use caching from system functions) - 2 weeks
So I'll have the project completed by the end of August 2005.