Parallelization in the ports collection and pkgng utility
Contents
- Parallelization in the ports collection and pkgng utility
- Short description
- The Ports Collection
- General development approach
- Parallel installation of several ports
- Problems
- Approach to solving
- Locking technique
- Port's directory locking
- ${PKG_DBDIR} locking
- ${DISTDIR} locking
- ${PORT_DBDIR} locking
- Prevent parallel ''make install'' process of the same port or same port dependency
- Port X fails to be installed no matter for what reason
- Continuation of the DEFAULT sequence of targets
- Redesign of Conflict checking
- Parallel installation of port's dependencies
- non-default make targets and limitations
- The Code
- pkgng utility
- Deliverables
- Milestones
Short description
The FreeBSD Ports Collection has been the primary system for building and installing software on FreeBSD since FreeBSD 1.0. Nevertheless, it does not provide a safe way to build several ports simultaneously, and a port's dependencies are built sequentially. To keep the system in a consistent state while building several ports, it is necessary to prevent concurrent access to shared files and directories from multiple processes. My approach is based on lock files that serve both as barriers and as critical-section triggers for several concurrent processes. An important aspect of my project is dealing with various failures and unexpected terminations of a port's build process, to avoid deadlocks and an inconsistent state of the ports system. Firstly, my changes to the ports framework allow several ports to be built and installed safely at the same time. Secondly, I designed a convenient approach for building a port's dependencies in parallel. The main aim of this project is to make system updates faster and easier. My modifications to the ports collection allow multicore servers to use their full potential, both by installing several ports and several of a port's dependencies simultaneously. The same goes for the Tinderbox and pointyhat systems used by port committers. Another benefit of my project is package building with pkgng, since the build systems can build packages in parallel.
This project consists of several main parts:
1. Parallelization in the FreeBSD Ports Collection.
- Parallel installation of several ports at the same time.
- Parallel installation of the port's dependencies.
2. Parallelization in pkgng, which involves parallel installation of several binary packages at the same time.
This project also considers that the first port which starts doing something gets priority and no other port is able to interrupt it.
The Ports Collection
General development approach
bsd.parallel.mk
It is appropriate not to clutter the big Kahuna (bsd.port.mk) with features unrelated to a port's build. All necessary targets and parallel-related techniques are concentrated in the bsd.parallel.mk file, which also contains the parallel-related variables.
Important
bsd.parallel.mk triggers definitions of several global variables that are defined in bsd.port.mk (especially ${UNIQUENAME}). It is necessary to include bsd.parallel.mk in the "include section" of bsd.port.mk, just before the .BEGIN target. All other bsd.port.mk magic uses the global variables defined in bsd.parallel.mk, starting from the .BEGIN target.
${_parv_WANT_PARALLEL_BUILD}
This variable, if set to any value, acts as a trigger for parallel execution of a port's build. Example:
.if defined(_parv_WANT_PARALLEL_BUILD)
.include "${PORTSDIR}/Mk/bsd.parallel.mk"
.endif
It is impossible to put all the parallel-related code in bsd.parallel.mk. Thus, parallel-related behavior of the port's build process is also triggered by this variable.
Important
Define as many parallel related targets, variables and algorithms in bsd.parallel.mk as possible.
Add as few lines of code into triggered sections of bsd.port.mk as possible.
- Use clear naming for parallel-related variables and targets. It must be obvious to a developer which variables are related to parallel port evaluation.
"Infinite" loops
- All parallel applications usually face various barriers: an application has to do some work but cannot, because locking (or a similar mechanism that prevents concurrent access to shared resources) forces it to take the progress of other processes into account. The widespread approach is to stop the process/thread, with the assumption that it will later be awakened by another process/thread or by a process manager.
But this approach is inappropriate in the scope of the Ports Collection.
Hence, at several stages of a port's build/install, long-running loops may occur (directory locking, dependency spawning, checking currently building dependencies, etc.). I propose to avoid wasting CPU time during these stages. In general, such loops look like the following:
- explore the current state
- attempt to change the current state
- modify the exit flag
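As a sketch, such a loop can be written in plain sh. The helper below is hypothetical (acquire_slot, slot_dir, and the retry limit are my illustration, not part of the framework); it uses mkdir as the atomic state change and sleeps between attempts instead of busy-waiting:

```shell
#!/bin/sh
# Hypothetical polling loop: mkdir(1) is atomic, so it both explores and
# attempts to change the state in one step; sleep yields the CPU between
# attempts instead of spinning.
slot_dir="${TMPDIR:-/tmp}/parv_demo_slot.$$"

acquire_slot() {
    attempts=0
    while true; do
        if mkdir "$slot_dir" 2>/dev/null; then
            return 0                    # state changed: we own the slot
        fi
        attempts=$((attempts + 1))
        if [ "$attempts" -ge 3 ]; then
            return 1                    # exit flag: give up after 3 tries
        fi
        sleep 1                         # _parv_LOCK_ATTEMPT_TIMEOUT-style delay
    done
}
```

acquire_slot returns 0 on success; the caller removes $slot_dir (rmdir) to release the slot again.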
Termination of the process tree
- While working with a huge process tree of spawned background dependency builds, it is something of a challenge to terminate the whole tree in case of failures or user interrupts.
The problem is that termination of the non-parallel parts of a port's build works fine using keyboard interrupts or kill utilities. But terminating the parent make process while dependency builds are still running leads to the following problems:
Soft termination. The parent make process will not respond to "soft" kill signals until all background dependency builds have been processed.
Hard termination. Keyboard interrupts and the hard kill signal (KILL) will kill the parent make process, but this action will leave lots of background child processes alive. These processes do not have an assigned terminal; moreover, they will waste CPU time.
I developed a rather efficient technique for correct termination of the whole process tree during the parallel stages of a port's build.
All parallel stages have an sh script that controls the evaluation of background processes. This script is in charge of everything that happens in the background processes, and the parent make process has such a script as well. It is sufficient to use the sh built-in command trap to listen for several signals and act accordingly. We are able to distinguish the parent make script from a dependency script using the make global variable ${INSTALLS_DEPENDS}. Hence, only the parent make script uses trap to process various interrupts and failures. This is appropriate, since only the parent make process has an assigned terminal, and the user expects to interact with this process. ${TERMINATE_PROCESS_TREE} is the sequence of commands to be evaluated by trap.
${TERMINATE_PROCESS_TREE}
- This script contains all the magic related to termination of the whole process tree, starting from the parent ${.MAKE.PID}.
This script implements a breadth-first traversal of the process tree. It prevents the processes at the current level of the tree from evaluating any commands, using the STOP signal. Then it determines the children of the processes at the current level of the process tree, stops them, and so forth. It is necessary to stop the processes to avoid new untracked PIDs (otherwise some heavy gcc appears). Finally, this script kills all tracked PIDs.
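A standalone sketch of this traversal (the kill_tree name and the use of pgrep are my illustration, not the actual ${TERMINATE_PROCESS_TREE} code) could look like:

```shell
#!/bin/sh
# Breadth-first termination sketch: STOP each level first so no process can
# fork new, untracked children, then KILL every PID collected.
kill_tree() {
    level=$1        # current level of the tree, starting at the root PID
    all=""
    while [ -n "$level" ]; do
        next=""
        for p in $level; do
            kill -STOP "$p" 2>/dev/null || true   # freeze before enumerating
            all="$all $p"
            next="$next $(pgrep -P "$p" 2>/dev/null || true)"
        done
        level=$(echo $next)                       # descend one level
    done
    kill -KILL $all 2>/dev/null || true           # kill all tracked PIDs
}
```

Because every level is frozen before its children are listed, no "heavy gcc" can appear between enumeration and the final KILL.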
The only limitation: it is the make process's sh script that is responsible for killing the whole tree, not the make process itself. Hence, when using kill utilities, it is necessary to send the signal not to the parent make process but to its script. There is no need to search for the right script! Just use pkill:
pkill -P PID
where PID is pid of parent make process.
Variables available to the user
_parv_WANT_PARALLEL_BUILD
    Trigger for parallel ports installation. Set this variable to any value to enable parallel ports build/install; the assigned value does not matter.
    Example: _parv_WANT_PARALLEL_BUILD=yes
_parv_CHECK_ACTIVE_TIMEOUT
    Timeout in seconds before the next check of active builds, in case the port is prohibited from spawning another background process.
    Default: 2
_parv_WAIT_FOR_LOCK_TIME
    Time in seconds to wait if the lock file is locked by lockf(1) during directory locking.
    Default: 5
_parv_WAIT_FOR_UNLOCK_TIME
    Time in seconds to wait if the lock file is locked by lockf(1) during directory unlocking.
    Default: 15
_parv_LOCK_ATTEMPT_TIMEOUT
    While trying to lock a directory in a "while" loop, if the directory is locked, this variable specifies the delay in seconds before the next attempt to lock the directory.
    Default: 2
_parv_ON_LOCK_FEEDBACK_TIMEOUT
    While trying to lock a directory in "while" loops, if the directory is locked, user feedback is printed once every ${_parv_ON_LOCK_FEEDBACK_TIMEOUT} attempts.
    Default: 2
_parv_PARALLEL_BUILDS_NUMBER
    Number of parallel dependency builds for the current port. If the value of this variable is greater than ${_parv_DEFAULT_PAR_BUILDS_NUM}, it will be set to ${_parv_DEFAULT_PAR_BUILDS_NUM}.
    Default: ${_parv_DEFAULT_PAR_BUILDS_NUM}
_parv_PORTS_LOGS_DIR
    Directory that contains the dependency ports' log files.
    Default: /tmp/portslogs
Targets available to the user
check-license-depends
    License checking for the port's dependencies. Does not lock any directory. If any dependency needs to ask for confirmation, the port's build stops and the user is shown all ports that will ask for license confirmation. The user then has to run "make patch" for the above-mentioned ports. Only if no dependencies require license confirmation is a parallel ports build allowed.
locking-config-recursive
    Configures options for the current port and all dependencies recursively, while holding a lock on ${_parv_PORT_DBDIR_LOCK_LOOP}. Considers dynamic changes in the port's dependencies. Skips already checked ports.
Usage practice
If their default values do not suit the user's needs, it is sufficient to redefine the user-specific variables in the /etc/make.conf file.
In some cases the user may need to disable a parallel port's build. To avoid modifying /etc/make.conf, one can use the ${_parv_WANT_NON_PARALLEL_BUILD} variable. This variable has higher priority than ${_parv_WANT_PARALLEL_BUILD}.
Example:
% cat /etc/make.conf
_parv_WANT_PARALLEL_BUILD=yes
_parv_PARALLEL_BUILDS_NUMBER=3
% make -D_parv_WANT_NON_PARALLEL_BUILD patch
Parallel installation of several ports
This section covers cases when a user has already called make build/install in one port's directory and decides to call make build/install in another port's directory while the first port is still building.
Problems
1. Prevent concurrent access to shared directories.
It is necessary to enable parallel installation of several ports at the same time.
On calling
make all install
in the port's directory, various directories and files are involved. The following directories are of main interest to us:
- ${PKG_DBDIR}
- ${PORT_DBDIR}
- ${DISTDIR}/${DIST_SUBDIR}
- ${PACKAGES}
- ${PREFIX}
- ${WRKDIR}
- files that are listed in pkg-plist, ${PLIST_FILES} and ${PLIST_DIRS}
Also, several important files may be modified, e.g. if it is necessary to add new users or groups (${USERS}, ${GROUPS}). Moreover, there may be several other ports this port depends on, so all of the above also applies to all dependency ports.
==> It is necessary to prevent concurrent access to the above-mentioned files/directories, so that each port's data will not be spoiled by another port.
2. Prevent parallel make install/build process of the same port or same port's dependency.
Scenario:
Port A is installing its dependency port X
Another port B determines that it needs to install its dependency port X, while X and A are not finished yet.
==> It is necessary to prevent port B from doing this.
==> Is it possible to force port B to start installing another dependency instead of just waiting for port X to be installed?
==> What if port X fails to be installed no matter for what reason?
3. Port is unable to use a dependency.
Scenario:
A user called make build (not install) of some port A.
The user now calls make install of some port B that has a dependency on port A, which is being processed at the moment.
Thus port B is unable to start the installation of its dependency port A. Moreover, port B is unable to use port A after it has been processed.
==> Is it possible to determine which target was called in port A's directory?
==> Is port B responsible for calling make install after port A has been processed?
4. Redesign of Conflict checking.
While a port is being installed, conflicts are checked at various stages (extract, install), unless the user disabled this option (${DISABLE_CONFLICTS}). But in the parallel approach, when port A is being installed and another port B starts its installation, the following applies:
- it is necessary to check conflicts with port A and its dependencies (with the assumption that port A will be installed successfully);
- every dependency of port B needs to check conflicts with port A and its dependencies as well.
Pitfall. Port A was evaluated as follows:
% cd /usr/ports/some_cat/A
% make fetch
While fetch is running, port B starts its installation. Naturally, the directory of port A is locked. Thus port B concludes that port A is doing something, but there is no way to determine which target was called in port A's ${.CURDIR}.
==> Does port B consider port A a conflicting port, in spite of the fact that port A is just evaluating the fetch target?
Approach to solving
The main approach to preventing concurrent access to shared directories and files is to lock them. Only the first process that locked the file/directory is able to perform its sequence of actions. As soon as the process finishes its work with this file/directory, it must unlock it, so that other processes can use it. If a process determines that some directory is locked, it must wait for the directory to be unlocked, or act accordingly.
The most appropriate technique for this purpose is to use LOCK files: if a directory contains a specific file, the directory is assumed to be locked.
Locking technique
Directory locking
To check whether a directory is locked, two operations must be evaluated: look for a specific LOCK file in the directory and, if the directory is not locked, add a LOCK file there. It is important for these two operations to be atomic, so that no other process can do anything between them. Since mv(1) is atomic when moving files within a single file system, it suits our purpose. The -n option is extremely important, as it prevents overwriting the file if it already exists. Thus directory locking is implemented as follows:
First, create a LOCK file in some local directory, then try to lock the shared directory using this lock file.
% mkdir dir_to_lock
% touch lock_file
% ls
dir_to_lock lock_file
% mv -n lock_file dir_to_lock/
% echo $?
0
% ls
dir_to_lock
% ls dir_to_lock/
lock_file
% touch lock_file
% ls
dir_to_lock lock_file
% mv -n lock_file dir_to_lock/
% echo $?
0
% ls
dir_to_lock lock_file
%
As you can see, the exit status in both cases is 0, which is not very good, as we need to determine whether the file was moved into the directory or not. It is possible to use the -v option and examine mv's stdout. However, if the file was not moved, it is still located in the original directory and obviously needs to be removed. So one can simply try to remove this lock file from the local directory and examine the exit status of the rm command.
% rm lock_file
% echo $?
0                                       #shared dir is locked by another process
% rm lock_file
rm: lock_file: No such file or directory
% echo $?
1                                       #shared dir is locked by this process
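The two steps can be wrapped into a single helper. try_lock below is a hypothetical sketch of this mv -n / rm combination (the function and file names are mine), returning 0 when the lock was obtained and 1 when another process holds it:

```shell
#!/bin/sh
# Hypothetical try_lock: place a lock file into a shared directory with the
# atomic mv -n, then use rm's exit status to learn whether the move happened.
try_lock() {
    shared_dir=$1
    tmp="${TMPDIR:-/tmp}/lock_file.$$"
    touch "$tmp"
    mv -n "$tmp" "$shared_dir/lock_file"   # atomic within one file system
    if rm "$tmp" 2>/dev/null; then
        return 1    # file still here: mv did not move it, directory is locked
    else
        return 0    # file gone: we placed the lock file
    fi
}
```

Unlocking is then simply removing $shared_dir/lock_file.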
Pitfall
- It is a pain to implement atomic checks considering stalled locks.
E.g. some port locks a directory and exits unexpectedly, leaving the LOCK file behind. In this case it is necessary not only to check for the existence of the LOCK file, but also to check the validity of the lock.
lockf(1) utility
lockf(1) is a suitable utility for file locking. Fortunately, it knows how to perform a cleanup if the process died unexpectedly.
lockf also provides the -t option, which specifies the timeout for waiting for the lock. Snippet:
_parv_DO_LOCK= \
	lockf -k -t ${_parv_WAIT_FOR_LOCK_TIME} ${_parv_LOCK_FILE} ${SH} -c '${_parv_LOCK_SEQ}'
Thus ${_parv_LOCK_SEQ} is atomic, as no other process is able to evaluate any sequence of commands while locking on the same ${_parv_LOCK_FILE}.
Stalled locks
To handle stalled locks, it is appropriate to write the locker process's PID to the LOCK file. This makes it possible to check the validity of the PID and to conclude whether the lock is stalled or not.
ps -p $${pid} > /dev/null && status=$$? || status=$$?; \
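Wrapped into a function, the check might look as follows (lock_is_stalled is a hypothetical name; the lock file is assumed to contain only the locker's PID):

```shell
#!/bin/sh
# Hypothetical stalled-lock check: the lock file stores the locker's PID;
# if that PID is no longer alive, the lock is considered stalled.
lock_is_stalled() {
    lock_file=$1
    pid=$(cat "$lock_file" 2>/dev/null || true)
    if [ -z "$pid" ]; then
        return 0                 # unreadable or empty lock: treat as stalled
    fi
    if ps -p "$pid" > /dev/null 2>&1; then
        return 1                 # locker still running: the lock is valid
    fi
    return 0                     # locker is gone: stalled lock
}
```

A stalled lock can then be removed and the directory re-locked by the current process.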
File locking
As mentioned in the Problems section, it might be necessary to prevent parallel modification of several important files, especially files that contain information about users/groups. All operations and logic related to user/group modification are concentrated in the create-users-groups: target of the bsd.port.mk file. There is no need to worry about concurrent user/group modification. This stage is evaluated as follows: test whether the user/group already exists, and add the user/group otherwise. All user/group additions are implemented using the ${PW} utility.
create-users-groups:
	...
	${PW} groupadd $$group -g $$gid;
	...
	${PW} groupadd $$group -g $$gid; else \
	echo \"Using existing group '$$group'.\"; fi" >> ${TMPPLIST};
	...
	eval ${PW} useradd $$login -u $$uid -g $$gid $$class -c \"$$gecos\" -d $$homedir -s $$shell;
	...
	${PW} useradd $$login -u $$uid -g $$gid $$class -c \"$$gecos\" -d $$homedir -s $$shell;
	...
	${PW} groupmod ${_group} -m $${_login};
	...
	${PW} groupmod ${_group} -m $${_login}; fi" >> ${TMPPLIST};
pw will prevent different processes from adding the same users/groups, and the relevant files will remain consistent, so we do not need to worry about them. Other targets or usage scenarios do not seem to have any negative impact on the system's users/groups files.
This section gives us a convenient approach to directory locking. But, obviously, simply LOCKING all shared directories while a port and all its dependencies are being installed would prevent another parallel port installation from doing most of its work. Thus it is necessary to find out which directories (at which stages) must be LOCKED and UNLOCKED. It is also necessary to make the LOCKING phases as short as possible, so that other parallel port installations can be as efficient as possible.
Port's directory locking
Ideally, the most convenient way to signal that some port is being installed is to lock ${WRKDIR} as the make process starts and unlock it as soon as the install target has been evaluated (.BEGIN and .END targets). Unfortunately, ${WRKDIR} is not the best directory to place the LOCK file in: it is created only in the do-extract: target, which is unsuitable for parallel ports installation. Additionally, if a user decides to change ${WRKDIRPREFIX}, another port will be unable to find the LOCK file in the right place.
==> The most suitable place for the LOCK file is a separate directory, specified by the global variable ${LOCK_DIR}. This directory collects the lock files of all currently building ports.
Pitfall
In several circumstances, sub-make calls are used in the same port's directory while building a port. But according to the above, this directory is already locked, so the sub-make call is unable to do its work.
I propose to act as follows. Let's assume that sub-make processes are "good" processes; thus we are able to allow a child process to use the port's directory if it was locked by a parent process.
Snippet:
cur_pid=${.MAKE.PID}; \
while true; do \
	ppid=$$( ps -o ppid -p $${cur_pid} | ${AWK} "NR==2" ); \
	if [ $${ppid} -eq $${pid} ]; then \
		${ECHO_CMD} "===> ${${_lock_dir}}/${_parv_${_lock_dir}_LOCK_FILE} is locked by parent make process."; \
		${ECHO_CMD} "     We are allowed to work here."; \
		break; \
	elif [ $${ppid} -eq 0 ]; then \
		exit ${_parv_ON_LOCK_EXIT_STATUS}; \
	else \
		cur_pid=$${ppid}; \
	fi; \
done; \
Fortunately, the process tree depth is rather small for such checks.
==> This is valid only for the port's directory locking.
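The same walk can be expressed as a standalone sh function (is_ancestor is a hypothetical name; ps -o ppid= prints the parent PID without a header, which replaces the awk filtering in the make snippet):

```shell
#!/bin/sh
# Hypothetical ancestor check: walk up from $1 via each process's PPID and
# report whether $2 (the PID stored in the lock file) is an ancestor.
is_ancestor() {
    cur=$1
    locker=$2
    while true; do
        ppid=$(ps -o ppid= -p "$cur" 2>/dev/null | tr -d ' ')
        if [ -z "$ppid" ]; then
            return 1             # process vanished: cannot prove ancestry
        fi
        if [ "$ppid" -eq "$locker" ]; then
            return 0             # lock held by an ancestor: allowed to work
        elif [ "$ppid" -le 1 ]; then
            return 1             # reached init without finding the locker
        fi
        cur=$ppid
    done
}
```

A sub-make would call it with its own PID and the PID read from the LOCK file.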
Naming of the LOCK file is also a major problem. I propose to use ${PKGNAME} as the LOCK file name, since only this file name may be used while processing the ${XXX_CONFLICTS} patterns of currently building ports during the conflict checking phase.
Locking phases
Lock - .BEGIN target.
Unlock - .END target.
${PKG_DBDIR} locking
- Luckily, we can assume that this variable is not changed by the port.
Nevertheless, the implementation of ${PKG_DBDIR} locking is much more complicated than the port's directory locking. Locking phases for this directory are discussed further in this section and below.
It is necessary to consider the following scenario:
- Port A starts installation, and has a conflict port B
- Port B starts installation
- Port A finds out that current system does not include conflicting port B
- Port B ends installation
- Port A checks current installation of conflicting port B
- Port A ends installation
Hence, it is necessary to use ${PKG_DBDIR} locking as a kind of barrier for parallel ports builds.
Locking phases
fake-pkg target:
It is sufficient to integrate the ${PKG_DBDIR} locking technique into ${_INSTALL_SUSEQ} and to surround the fake-pkg target with locking targets:
install-ldconfig-file lock-pkg-dbdir fake-pkg unlock-pkg-dbdir security-check
This approach also supports pkgng usage in the ports collection.
XXX-depends target:
It is impossible to implement ${PKG_DBDIR} locking just by surrounding the XXX-depends targets, because in that case the locking phase would be too long, which would block the parallel execution flow. It is necessary to integrate the locking technique inside the XXX-depends targets.
Lock - before searching for a dependency port.
Unlock - after searching for a dependency port. This also covers the assumption that this port is currently being built.
${DISTDIR} locking
- It seems that there is no need to add any locking technique for this directory, although there are some circumstances in which improper use of this directory may cause port installation failures. These situations are discussed below.
To avoid any collisions related to ${DISTDIR}, it is enough to force the "${DISTDIR} clean"-related targets to lock the port's directory. This prevents potential deletion of a distfile that is currently in use.
${PORT_DBDIR} locking
It is also reasonable to lock ${PORT_DBDIR} to prevent concurrent access problems. Thus only one port at a time is able to configure options. One may think that this is a rather long locking period when configuring some huge port, like x11*, blender, or xorg: it may take from a couple of minutes to half an hour (depending on the user's experience) to process all OPTIONS recursively, which blocks OPTIONS processing for other ports. But populating ${PORT_DBDIR} is a stage that potentially requires user interaction. Hence, it is suitable for the user to process the OPTIONS of one port (which unlocks ${PORT_DBDIR}) and then process the OPTIONS of another port.
Locking phases
locking-config-recursive: locking-config-message lock-port-dbdir config-recursive unlock-port-dbdir
Prevent parallel ''make install'' process of the same port or same port dependency
- So far we have developed a convenient approach to directory locking. It is our basis for parallel port installation.
Now we need to prevent parallel installation of the same port and change the default dependency checking algorithm. The first part is rather obvious, since we know how to lock the port: if the port has already been locked, it is necessary simply to exit the make process. The second part is more complicated. The current dependency checking acts as follows:
The great dynamic target ${deptype:L}-depends: creates the following targets: pkg-depends extract-depends patch-depends lib-depends fetch-depends build-depends run-depends, which serve as dependency targets in the default install sequence: check-sanity fetch checksum extract patch configure build and maybe install.
For our approach, it is necessary not only to check the dependencies by path (e.g. EXTRACT_DEPENDS - a list of "path:dir[:target]" tuples), but also to assume that this dependency port might be being installed right now. Thus, as we know where to find this port (dir) in the ports tree, it is necessary to check for the LOCK file, as described above. If the port is locked, then it is installing and there is no need to install it once more. It is absolutely necessary to lock ${PKG_DBDIR} _before_ each of the ${deptype:L}-depends: targets and to unlock the directory after each target has been evaluated. The following sample situation shows why this is important:
Pitfall
Port A is being installed. Port B starts its installation and finds out that it has port A as one of its dependency ports. It searches for this port in various directories and, maybe, in ${PKG_DBDIR} (the details are not very important for us). Port B does not find port A as an installed port and assumes that it might be being installed now. Then port A finishes its installation, populates ${PKG_DBDIR} and unlocks its ${.CURDIR}. Port B, with its assumption, checks port A's directory in the ports tree and finds out that it is unlocked. As a result, port B starts the installation of port A again.
But with proper locking of ${PKG_DBDIR}, port A will be unable to finish its installation, and port B will thus find out that port A is being installed.
I also propose to redesign the default dependency installation behavior for parallel purposes. At the moment it works as follows:
${deptype:L}-depends:
.if defined(${deptype}_DEPENDS)
.if !defined(NO_DEPENDS)
	@for i in `${ECHO_CMD} "${${deptype}_DEPENDS}"`; do \
		... # to build or not to build ... \
		if [ $$notfound != 0 ]; then \
			${ECHO_MSG} "===> Verifying $$target for $$prog in $$dir"; \
			if [ ! -d "$$dir" ]; then \
				${ECHO_MSG} " => No directory for $$prog. Skipping.."; \
			else \
				${_INSTALL_DEPENDS} \
			fi; \
		fi; \
	done
.endif
.else
	@${DO_NADA}
.endif
.endfor
This kills the parallel benefits, since some port will potentially wait while one of its dependencies is being built by another port. Let's force this port to leave that dependency for further processing and start working on another dependency. Thus the redesign is a kind of round-robin algorithm. We need a temporary variable for the port's dependencies, so that we can delete already installed ports from the list and retry ports that have not been found. This will not change the execution flow and behavior of non-parallel builds, but it provides solid acceleration for parallel dependency builds (details of the implementation are below).
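The round-robin idea can be sketched in plain sh (process_deps, the demo LOCK_DIR and the echo stand-in for ${_INSTALL_DEPENDS} are all illustrative, not the real framework code):

```shell
#!/bin/sh
# Round-robin sketch: locked dependencies are pushed back onto the list for
# a later pass instead of blocking the whole build on them.
LOCK_DIR="${TMPDIR:-/tmp}/parv_locks.$$"   # demo stand-in for ${LOCK_DIR}
mkdir -p "$LOCK_DIR"

process_deps() {
    deps=$1
    while [ -n "$deps" ]; do
        remaining=""
        for dep in $deps; do
            if [ -e "$LOCK_DIR/$dep" ]; then
                remaining="$remaining $dep"   # busy: retry on a later pass
            else
                echo "installing $dep"        # stand-in for ${_INSTALL_DEPENDS}
            fi
        done
        deps=$(echo $remaining)
        if [ -n "$deps" ]; then
            sleep 1                           # delay before the next pass
        fi
    done
}
```

A dependency locked by another port is simply skipped for now; the while loop comes back to it once the remaining list has been drained of buildable entries.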
Important
- If a port determines that one of its dependencies is being built by another port, this is not a reason to assume that this dependency has been processed!
Possible Risk
- This may be done only if the order of dependency processing does not have any impact on the ports installation.
Port X fails to be installed no matter for what reason
- For parallel ports installation, a problem occurs if some port fails.
This port's directory (and potentially ${PKG_DBDIR}) will still be locked. If necessary, we can try to reinstall this port with the following mechanism. The locking technique, based on LOCK files, enables us to write the PID of the make process to the LOCK file. Thus, the lock file test consists of the following steps (pseudo code):
_parv_CHECK_SEQ= \
	pid=\$$(${CAT} ${LOCK_DIR}/$${pkg_name}); \
	if [ \$${pid} ]; then \
		ps -p \$${pid} > /dev/null && status=\$$? || status=\$$?; \
		if [ \$${status} -eq 0 ]; then \
			${ECHO_CMD} '===> $${pkg_name} is already locked by another working process'; \
			exit ${_parv_ON_LOCK_EXIT_STATUS}; \
		else \
			${ECHO_CMD} '===> Stalled lock detected for $${pkg_name}'; \
			${ECHO_CMD} '     Deleting stalled lock'; \
		fi; \
	else \
		${ECHO_CMD} '===> $${pkg_name} is not locked'; \
	fi; \
	${RM} -rf ${LOCK_DIR}/$${pkg_name}
It is also worth mentioning that one just needs to invoke
make install
and the ${*_COOKIE} files will skip all already processed targets, if there are any.
Continuation of the DEFAULT sequence of targets
- It is sufficient to assume that some port is able to do whatever it needs with its dependency port.
No matter which target was evaluated in the dependency port's directory (extract, build, etc.), if some port needs to install its dependency, then this port is responsible for calling make install.
Scenario:
Port B finds out that it has a dependency port A.
Port A is currently being processed, no matter which target the make utility was called with (fetch, build).
Port B waits until the end of processing of port A.
As soon as port A is unlocked, port B checks its dependency port A once more.
If port A is installed (listed in /var/db/pkg), then this dependency is assumed to be checked.
If port A is not installed, port B calls ${DEPENDS_TARGET} in port A's directory.
There is no need to find out which target was called in the dependency port's directory, because even the install target is not a reason to conclude that this dependency will be installed successfully. In any case, it is necessary to check ${PKG_DBDIR} for the existence of this dependency port.
Thanks to ${*_COOKIE} variables, all already processed default steps will be skipped and dependency port A will be installed.
Redesign of Conflict checking
As mentioned in the corresponding section above, it is necessary to force a port to check conflicts not only with already installed ports (which are listed in ${PKG_DBDIR}), but also to check whether any of those conflicting ports are being installed right now. It is enough to examine the lock of such a port.
Assumption
- The first port being processed (fetch, build, package, ...) gets priority over others. Thus we have to assume that this port is more important to the user. We should let the user do anything he wants with the port, and not interrupt this process.
It is also assumed that a port listed in the corresponding ${CONFLICTS*} variable is considered a conflict no matter which target was called in that port's ${.CURDIR}. To put it in other words, the port notices a conflicting port if the corresponding port's directory is locked.
Further, if port B finds out that one of its conflicting ports, A, is being processed now, it is necessary to provide the user with talkative feedback, so that it is obvious what is going on. Example:
	${ECHO_MSG}; \
	${ECHO_MSG} "===> ${PKGNAME} conflicts with currently installing package(s): "; \
	for entry in $${conflicts_with}; do \
		${ECHO_MSG} "     $${entry}"; \
	done; \
	${ECHO_MSG}; \
	${ECHO_MSG} "     Please remove them first with pkg_delete(1)."; \
	exit 1; \
make for port B also exits with a non-zero status.
Although conflict checking uses information stored in ${PKG_DBDIR}, a proper sequence of conflict checking targets saves us from using any barriers or ${PKG_DBDIR} locking. This increases the efficiency of parallel ports builds.
Possible Risk
check-active-build-conflicts follows the behavior of the old bsd.port.mk conflict-related targets; thus it does not check conflicts recursively. There may be a situation when some "deep" dependency faces a conflict, which would kill the whole parallel port build.
Integration
Besides the new target (check-active-build-conflicts), bsd.port.mk includes other conflict checking targets (check-conflicts, check-build-conflicts, check-install-conflicts, identify-install-conflicts). It is appropriate to evaluate the check-active-build-conflicts target before the other conflict checking targets. Hence, even if some conflicting port finishes its installation during the evaluation of the check-active-build-conflicts target, this port will be mentioned in ${PKG_DBDIR} and will be caught by the old bsd.port.mk targets. Using such an approach, we avoid any additional locking.
Hence check-conflicts now looks like this:
check-conflicts: check-active-build-conflicts check-build-conflicts check-install-conflicts
_SANITY_SEQ:
_SANITY_SEQ= \ ... check-depends check-active-build-conflicts identify-install-conflicts check-deprecated ...
Parallel installation of port's dependencies
Parallel ports installation provides a sufficient basis for parallel dependency builds.
Nevertheless, several features still need to be redesigned to match an efficient parallel execution flow.
Problems
Blocking processing of a port's dependencies is inappropriate for parallel dependency processing. The ${deptype:L}-depends and lib-depends targets need to be redesigned in a non-blocking manner to match the parallel execution flow.
Tracking spawned dependency builds. When dependency builds are spawned as background sub-make processes, the parent make process does not track the exit codes of its child processes by itself.
Moreover, there may be several kinds of exit codes:
- exit 0;
- exit codes that signal execution errors ( <> 0 );
- exit codes that signal that processing of a port's dependency was stopped because this dependency had already been locked ( ${_parv_MAKE_LOCK_EXIT_STATUS} ).
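A minimal sh(1) sketch of telling these three kinds of exit codes apart with the built-in wait command; the value 75 is an arbitrary assumption standing in for ${_parv_MAKE_LOCK_EXIT_STATUS}:

```shell
# Sketch: classify exit codes of background "dependency builds".
# 75 is an assumed stand-in for ${_parv_MAKE_LOCK_EXIT_STATUS}.
_parv_MAKE_LOCK_EXIT_STATUS=75

( exit 0 )  & pid_ok=$!       # successful build
( exit 75 ) & pid_locked=$!   # dependency was already locked
( exit 1 )  & pid_err=$!      # real build error

# wait(1) returns the exit status of the awaited child.
wait "$pid_ok";     s_ok=$?
wait "$pid_locked"; s_locked=$?
wait "$pid_err";    s_err=$?

# Map a status to the action the depends loop would take.
action() {
    if [ "$1" -eq 0 ]; then echo installed
    elif [ "$1" -eq "$_parv_MAKE_LOCK_EXIT_STATUS" ]; then echo requeue
    else echo abort
    fi
}
echo "$(action "$s_ok") $(action "$s_locked") $(action "$s_err")"
# prints: installed requeue abort
```

A locked dependency is thus not an error: it is simply moved back to the queue for a later round-robin pass.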
Blocked execution due to user interaction. Both the make options command and ports that set INTERACTIVE block for user input. For a parallelized approach this currently hinders the execution flow of dependencies, since every dependency blocks while waiting for user input. Moreover, a process is unable to interact with the user if it is spawned as a background process.
Redesign of MAKE output. Parallel processing of a port's dependencies mixes the output streams of several child processes, because they are all directed to a single terminal. It is necessary to provide the user with only the most important information about the build process, so that the user does not get a sense of deadlock.
Approach to solving
Non-blocking processing of dependency builds
As mentioned above, we have to redesign the default dependency-installation behavior. It is worth mentioning that my approach does not change the execution flow of non-parallel builds.
I propose to replace the "for" loop in the XXX-depends targets with a "while" loop. This allows a round-robin implementation, so the port's build process does not block in most cases while building its dependencies: if some dependency is locked, we can move on to the next dependency.
Snippet:
${deptype:L}-depends:
	depends=`${ECHO_CMD} "${${deptype}_DEPENDS}"`; \
	depends=$$( echo "$${depends}" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$$//' ); \
	active_builds=""; \
	while [ $${#depends} -ne 0 ]; do \
		i=$${depends%% *}; \
		## ...
		## core bsd.port.mk code
		## some dependency checks
		## ...
		if [ $$notfound -eq 1 ]; then \
			${ECHO_MSG} "===> Verifying $$target for $$prog in $$dir"; \
			if [ ! -d "$$dir" ]; then \
				${ECHO_MSG} " => No directory for $$prog. Skipping.."; \
			else \
				${_INSTALL_DEPENDS} \
			fi; \
		elif [ $${notfound} -eq 0 ]; then \
			depends="$${depends%%$${i}*} $${depends##*$${i}}"; \
			depends=$$( echo "$${depends}" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$$//' ); \
		## round-robin step if some dependency is locked
		elif [ $${notfound} -eq ${_parv_ON_LOCK_EXIT_STATUS} ]; then \
			if [ $$( ${ECHO_CMD} $${depends} | wc -w ) -gt 1 ]; then \
				depends="$${depends#* } $${depends%% *}"; \
				depends=$$( echo "$${depends}" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$$//' ); \
			fi; \
		fi; \
	## ...
Track processing of spawned dependencies builds
- It is also necessary to track a list of spawned dependency builds.
Hence, while building a port's dependencies ( XXX-depends targets ), we keep two temporary lists:
- depends - a list of "path:dir[:target]" tuples of other ports this package depends on that have not yet been processed.
- active_builds - a list of "spawned_pid:path:dir[:target]" tuples of currently building dependencies.
if [ $${spawned} ]; then \
	active_builds="$${active_builds} $${spawned}:$${i}"; \
	depends="$${depends%%$${i}*} $${depends##*$${i}}"; \
	depends=$$( echo "$${depends}" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$$//' ); \
	spawned=""; \
	${_parv_PRINT_ACTIVE_BUILDS}; \
fi; \
Where spawned=$! .
Thanks to the sh(1) built-in wait command, we are able to track the exit codes of spawned dependency builds and act accordingly.
Snippet:
for build in $$( ${ECHO_CMD} "$${active_builds}" ); do \
	pid=$${build%%:*}; \
	dep=$${build#*:}; \
	ps -p $${pid} > /dev/null || { \
		wait $${pid} && status=$$? || status=$$?; \
		if [ $${status} -eq 0 ]; then \
			active_builds="$${active_builds%%$${build}*} $${active_builds##*$${build}}"; \
			active_builds=$$( echo "$${active_builds}" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$$//' ); \
			builds_num=$$(( $${builds_num} - 1 )); \
			${ECHO_CMD} "=====> $$(cd $${dir}; ${MAKE} -V PKGNAME) is installed"; \
		elif [ $${status} -eq ${_parv_MAKE_LOCK_EXIT_STATUS} ]; then \
			${ECHO_CMD} "===> $$(cd $${dir}; ${MAKE} -V PKGNAME) is locked. Unable to start build."; \
			active_builds="$${active_builds%%$${build}*} $${active_builds##*$${build}}"; \
			active_builds=$$( echo "$${active_builds}" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$$//' ); \
			builds_num=$$(( $${builds_num} - 1 )); \
			depends="$${depends} $${dep}"; \
			depends=$$( echo "$${depends}" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$$//' ); \
		else \
			${ECHO_CMD} "Errors occurred while building a dependency port $$(cd $${dir}; ${MAKE} -V PKGNAME)"; \
			${ECHO_CMD} "Check out its log:"; \
			${ECHO_CMD} "    ${_parv_PORTS_LOGS_DIR}/${_parv_PORT_LOG_FILE}"; \
			${ECHO_CMD} "Terminating..."; \
			exit 1; \
		fi; \
	}; \
done; \
OPTIONS and INTERACTIVE targets approach
The make options target and the handling of INTERACTIVE flags need to be redesigned to match the execution flow of a parallelized approach. After gathering all dependencies of a port and the port information, make options could be executed non-parallelized, before executing other make targets. This is also relevant for license checks.
We need to process all stages that require user intervention as early as possible, before any parallelization and in a non-parallelized manner. fetch-depends is the first target where parallelization occurs, hence user interaction must be handled before the fetch target. _SANITY_SEQ is the most suitable place for this.
Recursive OPTIONS processing
The config target is not a part of the default sequence of targets (check-sanity fetch checksum extract patch configure build). In particular, this target is responsible for the creation of ${OPTIONSFILE}; user intervention is necessary while populating this file, and all of that is concentrated in the config target. All default targets (with the exception of check-sanity and fetch) have the config-conditional target as a dependency:
.for target in extract patch configure build install package
.if !target(${target}) && defined(_OPTIONS_OK)
${target}: ${${target:U}_COOKIE}
.elif !target(${target})
${target}: config-conditional
	@cd ${.CURDIR} && ${MAKE} CONFIG_DONE_${UNIQUENAME:U}=1 ${${target:U}_COOKIE}
.elif target(${target}) && defined(IGNORE)
.endif
.endfor
But config-conditional itself does nothing but check the consistency of the ${OPTIONSFILE} file; it calls cd ${.CURDIR} && ${MAKE} config in case of problems with ${OPTIONSFILE}. This behavior is reasonable in case a user deletes the ${OPTIONSFILE} file.
The config target is evaluated for the first time before the extract target. As mentioned above, we need to process all stages that require user intervention as early as possible, before parallelization and in a non-parallelized manner. There is also a handy target for this:
config-recursive:
	@${ECHO_MSG} "===> Setting user-specified options for ${PKGNAME} and dependencies";
	@for dir in ${.CURDIR} $$(${ALL-DEPENDS-LIST}); do \
		(cd $$dir; ${MAKE} config-conditional); \
	done
But this target requires a bug fix:
- It is assumed that if a user calls
%make config-recursive
then the options of the current port and all its dependency ports will be processed, but:
- If this port (A) enables a dependency port (Z) via options, then $$(${ALL-DEPENDS-LIST}) will not include port (Z), hence the options of port (Z) will not be processed.
- If a dependency port (B) of port (A) enables another dependency port (X), then the options of port (X) will not be processed either.
Correct evaluation:
config-recursive: config-conditional
	@${ECHO_MSG} "===> Setting user-specified options for ${PKGNAME} and dependencies";
	@for dir in $$(${MAKE} run-depends-list build-depends-list | uniq); do \
		(cd $$dir; ${MAKE} config-recursive); \
	done
Pitfalls
- bsd.port.mk uses sub-make calls, hence config-recursive would potentially be executed several times per port installation.
Hence it is sufficient to follow the config-conditional target approach and trigger the evaluation of the config-recursive target using the ${CONFIG_DONE_${UNIQUENAME:U}} variable.
If the "parent" port checks options for the whole dependency tree, there is no need to evaluate the config-recursive target for dependencies; it is enough to guard its evaluation with the ${INSTALLS_DEPENDS} variable. While configuring options for a huge port dependency tree (blender, xorg, etc.), lots of "options loops" occur, which kills the recursive options-config approach.
Mail from Marcus Von Appen
- Is there some way to track the recursion behavior? It feels a bit, like it is running in circles - after around 80 minutes (or 50) in total, the configuration checks were still not finished for graphics/blender, which seems to be a bit too much.
Hence, if a port finds out that one of its dependencies has already been configured, there is no need to call config-recursive for this dependency. This reduces the evaluation time of the config-recursive target from 100+ minutes to a couple of minutes, mostly spent waiting for user input.
Snippet:
.if !defined(CONFIG_DONE_${UNIQUENAME:U}) && !defined(INSTALLS_DEPENDS)
locking-config-recursive: locking-config-message lock-port-dbdir config-recursive unlock-port-dbdir
.endif

config-recursive: config-conditional
	@if [ ! ${DEP_CHECK_CONFIG} ]; then \
		already_checked_file=/tmp/${_parv_CHECKED_CONFIG_F_PREFIX}.${.MAKE.PID}; \
		trap '${RM} -rf $${already_checked_file};' EXIT TERM INT; \
		${ECHO_CMD} ${.CURDIR} > $${already_checked_file}; \
	else \
		already_checked_file=${DEP_CHECK_CONFIG}; \
	fi; \
	for dir in $$(${MAKE} run-depends-list build-depends-list | uniq); do \
		if [ ! $$(grep $${dir}$$ $${already_checked_file}) ]; then \
			${ECHO_CMD} "     configure options for $${dir}"; \
			( cd $${dir}; \
			  ${MAKE} "DEP_CHECK_CONFIG=$${already_checked_file}" config-recursive ); \
			${ECHO_CMD} $${dir} >> $${already_checked_file}; \
		fi; \
	done
Integration
locking-config-recursive target:
_SANITY_SEQ= \
	... check-license check-config buildanyway-message \
	options-message locking-config-recursive
_PKG_DEP= check-sanity locking-config-recursive
_FETCH_DEP= pkg locking-config-recursive
## .for target in extract patch configure build install package
${target}: locking-config-recursive config-conditional
Recursive license checking
- As this stage may potentially require user interaction, it is also necessary to process license checks before any parallelization.
Pitfall
The bad news is that some ports store their license file in the distfile. Hence, the license for such a port can only be installed during the patch target.
${ECHO_CMD} "     The following ports will ask for license confirmation:"; \
for port in $${license_to_ask}; do \
	${ECHO_CMD} "       $${port}"; \
done; \
${ECHO_CMD} "     Unable to process in parallel way."; \
${ECHO_CMD} "     Call:"; \
${ECHO_CMD} "       make -D_parv_WANT_NON_PARALLEL_BUILD patch"; \
${ECHO_CMD} "     in the following directories:"; \
for dir in $${dirs_to_process}; do \
	${ECHO_CMD} "       $${dir}"; \
done; \
exit 1; \
Since this implementation does not modify any files or directories, and "dirty read" problems do not seem to be critical, there is no need to implement any additional locking.
Integration
check-license-depends target:
It is important to check licenses after configuring the port's OPTIONS, since some dependencies may change.
_SANITY_SEQ= \
	... check-vulnerable check-license check-config buildanyway-message \
	options-message locking-config-recursive check-license-depends
Redesign of MAKE output
- During parallel port builds and parallel dependency builds, several stages may take a long time to be processed.
Mostly these situations appear when a port needs to wait for some external event (directory locking, dependency spawning, checking currently building dependencies, etc.). These situations are effectively "infinite" loops.
On the one hand, without any feedback the user will get a sense of deadlock.
On the other hand, printing a feedback message on every step of the loop creates lots of noise on the console.
The golden mean is to use a feedback timeout, and this timeout is user-configurable.
This approach is used in various parts of the port build. Hence the user is provided with verbose feedback about what is going on, e.g.:
- which directory has to be locked now
- which dependency is being spawned
- which dependencies are building now
- why the make process is unable to spawn more dependencies

and so forth. And this feedback does not annoy the user.
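The feedback-timeout idea can be sketched as follows; FEEDBACK_TIMEOUT, the message text, and the five-iteration loop standing in for "wait until the lock is released" are all illustrative, not the actual _parv_* implementation:

```shell
# Sketch: inside a waiting loop, print a status line only when
# FEEDBACK_TIMEOUT seconds have elapsed since the last one.
FEEDBACK_TIMEOUT=2            # user-configurable, in seconds
last_report=$(date +%s)
messages=0
i=0
while [ $i -lt 5 ]; do        # stands in for an "infinite" waiting loop
    now=$(date +%s)
    if [ $(( now - last_report )) -ge "$FEEDBACK_TIMEOUT" ]; then
        echo "===> still waiting for lock on editors/vim"
        messages=$(( messages + 1 ))
        last_report=$now
    fi
    sleep 1
    i=$(( i + 1 ))
done
echo "reported $messages time(s) instead of 5"
```

With a 2-second timeout the five-second wait produces only a couple of status lines instead of one per iteration, which is exactly the noise reduction described above.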
To avoid a mixture of the output of dependencies directed to the same terminal, I propose to redirect each dependency port's output to a separate file. Thus, if a user wants to track this output, it is enough to use something like tail -F file_name . It is worth mentioning that if something goes wrong with a dependency build, it is important to inform the user which dependency failed to install, and my approach does this.
${ECHO_CMD} "Errors occurred while building a dependency port $$(cd $${dir}; ${MAKE} -V PKGNAME)"; \
${ECHO_CMD} "Check out its log:"; \
${ECHO_CMD} "    ${_parv_PORTS_LOGS_DIR}/${_parv_PORT_LOG_FILE}"; \
${ECHO_CMD} "Terminating..."; \
exit 1; \
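The per-dependency log redirection can be sketched like this; the log directory under `/tmp`, the log-file naming scheme, and the mock build body are assumptions for illustration:

```shell
# Sketch: spawn a "dependency build" in the background with its output
# redirected to a per-port log file; the user can follow it with tail -F.
LOGS_DIR="${TMPDIR:-/tmp}/parv_logs"
mkdir -p "$LOGS_DIR"
log="$LOGS_DIR/devel__gmake.log"      # one file per dependency port

# stands in for: ( cd $dir; make install )
( echo "===> Building devel/gmake"; echo "===> done" ) > "$log" 2>&1 &
pid=$!
wait "$pid"; status=$?

echo "build exited with $status, log at $log"
grep -c '===>' "$log"                 # prints 2
```

The terminal then carries only the short summary lines, while the full build transcript lives in the log file that the error message points the user to.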
Pitfall
- Port A has a dependency port B; port B has a dependency port C; ...; port Y has a dependency port Z. Port Z fails to build, and on the console the user sees that something went wrong with port B. The user checks the log file of port B, finds out that something went wrong with port C, checks the log file of port C, and so forth.
Finally, the exact reason for the failure is found in the log file of port Z.
This approach potentially leads to such a form of recursion.
Degree of parallelization
The degree of parallelization is controlled by the global variable ${_parv_PARALLEL_BUILDS_NUMBER}.
It is both the default and the maximum number of spawned parallel processes, and it is set to the number of CPUs on the user's local machine.
The user can change the value of this variable, but it cannot be greater than the number of CPUs; if a greater value is requested, the variable is left unchanged.
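A sketch of that clamping rule; the function name `clamp_builds` is hypothetical, and the real code would obtain NCPU from the system (e.g. `sysctl -n hw.ncpu` on FreeBSD) rather than take it as a parameter:

```shell
# clamp_builds REQUESTED NCPU -> effective number of parallel builds.
# An out-of-range request leaves the value at the default (the CPU count),
# mirroring the behavior described above.
clamp_builds() {
    requested=$1; ncpu=$2
    if [ "$requested" -gt "$ncpu" ] || [ "$requested" -lt 1 ]; then
        echo "$ncpu"          # request rejected: keep the default
    else
        echo "$requested"
    fi
}

clamp_builds 2 8     # within range: honored, prints 2
clamp_builds 16 8    # above the CPU count: prints the default 8
```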
non-default make targets and limitations
Problem
Most of our attention is devoted to parallel port installation; non-default targets are not covered.
While processing these targets, a port neither performs any locking nor checks for LOCK files. When using these targets the user must be extremely careful and consider the following limitations.
Limitations
deinstall-depends/deinstall/deinstall-all targets
A port considers one of its dependencies installed if it is recorded in /var/db/pkg. ===> Be sure not to delete a dependency port while another port is being installed. This limitation is also relevant for the evaluation of targets that list port information based on the port's dependencies and ${OPTIONS}.
clean/clean-depends/reinstall targets
- Consider not using these targets either on a port that is currently being processed or on one of its dependencies.
===> This will spoil the port's data.
The Code
https://socsvn.freebsd.org/socsvn/soc2012/scher/par_ports
pkgng utility
TODO
Deliverables
- Working parallel installation system for ports
- Working parallel dependency build for ports
- Parallel installation/deinstallation support for pkgng
Milestones
- May 21: Start of coding.
- June 10: Working parallel installation system for ports (INSTALL target).
- June 15: Working parallel installation system for ports (user configurable DEFAULT targets).
- June 25: First parallel dependency builds.
- July 1: Refactored dependency build. Non blocking builds. Round-robin implementation.
- July 4: Birthday Party.
- July 9-13: Mid-term Evaluations.
- July 15: Reimplementation of OPTIONS and INTERACTIVE targets behavior.
- July 30: Parallel installation/deinstallation support for pkgng.
- August 13: End of coding (soft).
- August 15: Reimplementation of make OUTPUT.
- August 20: End of coding (hard).
- August 20-25: Documentation.
- October 18-21: EuroBSDcon ?
- October 20-21: Mentor Summit at Google ?