Berkeley DB Cleanup in Ports
The purpose of this task is to reduce the number of Berkeley DB versions in the FreeBSD ports tree from ten to two. We currently offer Berkeley DB versions 4.0, 4.1, 4.2, 4.3, 4.4, 4.6, 4.7, 4.8, 5.3 and 6.0.
In the ideal case, we will get along with just db5 (which is 5.3) and db6.
db6 uses the Affero GPL v3 (reference), which restricts use more than the Sleepycat license does: basically, the source must be offered for software running as a service.
User upgrade documentation
Please do not add questions to this Wiki page. Instead, send them to the FreeBSD ports mailing list with a Cc: to mandree@FreeBSD.org!
As of December 2013, Berkeley DB ports 4.0 to 4.7 (inclusive) are marked deprecated. Do not use them for new installs; this saves you the work of upgrading when they are removed sometime in 2014. A schedule may be fixed and announced in 2014, and will leave sufficient time for upgrades.
Before you upgrade an application to use a newer Berkeley DB version, you may need to make some preparations so that your applications' databases can be used with the new Berkeley DB version.
Upgrade warnings and hints
Do not force-delete older Berkeley DB packages. Doing so appears to confuse the upgrade tools and can cause failed application rebuilds. Instead, first rebuild the ports that depend on the old version, then remove the old Berkeley DB versions. Ports that use Berkeley DB should be able to build and install with both an older and a newer version installed if WITH_BDB_VER is set. If a port then grabs the old Berkeley DB version, that is an issue with the port that requires Berkeley DB, and should be reported to its maintainer.
When you are using transactional databases (those that store log.* files), you must make sure that the databases are consistent. A newer Berkeley DB version (even a minor one) usually cannot recover from logs that an older version has written.
BEFORE YOU UPGRADE, YOU MUST RECOVER INCONSISTENT TRANSACTIONAL DATABASES WITH THE OLD BERKELEY DB VERSION. This does not apply to databases not using transactions.
If you have been using queue databases in db 4, 4.0 or 4.1 with certain features, you may need to dump them before the upgrade and reload them after the upgrade. Check http://docs.oracle.com/cd/E17076_03/html/upgrading/changelog_4_2_52.html#idp51324064 to see whether that applies to your application.
If you have been using hash databases with version 4.5 or older and are upgrading to 4.6 or newer, dumping and reloading the databases can improve performance.
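Several of the warnings above depend on whether a database is transactional and whether it uses an environment. The following is a minimal sketch of how one might check a database directory by file name. The function name classify_db_dir is hypothetical, and the check is only a heuristic: it assumes the log.* and __db.* files live in the same directory as the *.db files, which an application may configure differently.

```python
from pathlib import Path


def classify_db_dir(directory):
    """Report which kinds of Berkeley DB files are present in a directory.

    Heuristic only: it looks at file names in this one directory
    (non-recursively) and does not open or validate any file.
    """
    d = Path(directory)
    logs = sorted(p.name for p in d.glob("log.*") if p.is_file())
    env = sorted(p.name for p in d.glob("__db.*") if p.is_file())
    dbs = sorted(p.name for p in d.glob("*.db") if p.is_file())
    return {
        # log.* files indicate a transactional database
        "transactional": bool(logs),
        # __db.* files indicate a Berkeley DB environment
        "has_environment": bool(env),
        "databases": dbs,
        "log_files": logs,
        "environment_files": env,
    }
```

If "transactional" is true for a directory, the recovery warning above applies to it before the upgrade.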
Authoritative Oracle documentation
This is reference material. You may wish to read through the summary below first and consult the Oracle documentation if you have particular questions.
The upgrade process in general is described at http://docs.oracle.com/cd/E17076_03/html/upgrading/upgrade_process.html
Detailed information on upgrading from 4.0 through 4.7 is at http://docs.oracle.com/cd/E17076_03/html/upgrading/index.html
Detailed information on upgrading from 4.7 to 4.8, 5.x and 6 is at http://docs.oracle.com/cd/E17076_03/html/installation/index.html
Again: if the application uses Berkeley DB for transactional databases, i.e. when transaction log.* files are used, you must use the old Berkeley DB tools or the application itself to recover the database from its logs in case it is corrupted. Logs are usually incompatible between one Berkeley DB version and the next, so the new tools cannot read logs written by older Berkeley DB versions.
The upgrade process is:
- ALL: shut down all application instances (processes) in an orderly fashion
- ONLY for transactional databases (those that write log.* files): make sure that all databases are consistent, and run recovery if necessary. If the application offers no means to do so, use the dbXX_recover utility (where XX indicates the Berkeley DB version) or, for some versions, the db_recover-XX utility. Make sure the application is not running while you run dbXX_recover or similar tools.
- ALL: back up the databases (*.db files) and, where present, the environments (__db.* files) and log.* files
- ALL: if the application offers a "database dump to text"-like utility, use it and back up the result. Otherwise, you can use the db*_dump* utility (the asterisks stand for the version, which is part of the utility's name).
- ONLY for databases with Berkeley DB environments (__db.* files): when in doubt, remove the database environment (i.e. the __db.* files, and only those!); the application would normally recreate it on start
- ALL: keep the old Berkeley DB version installed, and install the new Berkeley DB version (depending on the license conditions, choose whether to upgrade to db5 or db6).
- ALL: rebuild the applications to use the new Berkeley DB version. You can set WITH_BDB_VER=5 or WITH_BDB_VER=6 in /etc/make.conf, or uniquename_WITH_BDB_VER=5 (where uniquename is your port's UNIQUENAME; for instance, bogofilter_WITH_BDB_VER=6)
- CHECK: if you had to, or chose to, dump databases (for compatibility as described in the warnings above, or for performance), move the original database files away to a safe place (do not delete them yet), and reload the databases from the dumps. Either use the application's features or, lacking those, the db_load* utility of the new version.
- ALL: restart applications
- ALL: finally, if all applications that used to require the old Berkeley DB version have been upgraded and you are sure you do not need the old Berkeley DB versions' tools to recover databases, you can remove the old Berkeley DB version.
- If applications fail to restart after the upgrade and complain about incompatible options, you may need to edit the DB_CONFIG files if one of the options you were using was renamed or removed.
- If applications fail to restart after the upgrade and complain that they are unable to join an environment, removing the environment (__db.* files) usually suffices.
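The backup step of the process above can be sketched as follows. This is an illustration under stated assumptions, not a tested migration tool: the function name backup_bdb_files and the PATTERNS tuple are hypothetical, all files are assumed to sit directly in one directory, and the application must already be shut down when it runs.

```python
import shutil
from pathlib import Path

# File name patterns from the upgrade process: databases,
# environment files, and transaction logs.
PATTERNS = ("*.db", "__db.*", "log.*")


def backup_bdb_files(src_dir, backup_dir):
    """Copy Berkeley DB files from src_dir into a new backup_dir.

    Raises FileExistsError if backup_dir already exists, so that an
    earlier backup is never overwritten. Returns the copied file names.
    """
    src = Path(src_dir)
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=False)
    copied = []
    for pattern in PATTERNS:
        for path in sorted(src.glob(pattern)):
            if path.is_file():
                # copy2 also preserves timestamps and permission bits
                shutil.copy2(path, dest / path.name)
                copied.append(path.name)
    return copied
```

Refusing to reuse an existing backup directory is a deliberate choice here: the originals should stay untouched until the upgraded application has been verified.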
Ports system documentation (For FreeBSD porters/port maintainers and committers)
Statistics: see Ports/BerkeleyDBStats
Note that db6 changed its license from the former Sleepycat license to the Affero GPL, so we will have to keep db5 in the tree. The Affero GPL requires that "if you run the program on a server and let other users communicate with it there, your server must also allow them to download the source code corresponding to the program that it's running. If what's running there is your modified version of the program, the server's users must get the source code as you modified it." <http://www.gnu.org/licenses/why-affero-gpl.en.html>
- check that there are no dependencies on particular versions for replication; meaning that all applications using an old Berkeley DB version can be migrated and still communicate with their (possibly off-site, or other OS) peers (requested by stas@)
- leave sufficient time (2 - 3 months) so that dependent ports can be fixed
- change the autodetection and defaults so that they prefer the "newest installed" version, or db5 for installs from scratch. Very few ports that work with db 4.8 cannot work with db 5.
- fix the autodetection in ports/Mk/bsd.database.mk so that port maintainers and users can each specify a range of permitted/supported versions, rather than only a minimum version and INVALID_BDB_VER. The latter might work on the assumption that we only ever need to exclude db5 to force a port to use db48, but we already have better examples of constraining versions in bsd.port.mk (b.p.m) that we can reuse
- do an exp-run or similar experiments and/or get maintainers of ports requiring db* on board this project