Lessons Learnt from the April 2012 lang/lua disaster
This document is a loose collection of thoughts that still need to be put in order; it also needs more markup (for code).
Problem (Symptom)
lang/lua 5.1.4 and 5.1.5 went through several non-functional PORTREVISIONs because we hadn't gotten the CFLAGS handling right. Each time we thought we had fixed the port, we received new PRs complaining about missing features, broken builds, and other mishaps.
The underlying cause
/etc/make.conf gets read three times:
- when make starts, from /usr/share/mk/sys.mk, and before make starts looking for BSDmakefile or Makefile;
- when the do-build: target launches make again, i.e. after lang/lua/Makefile has made essential additions to variables such as CFLAGS; these additions get lost;
- inside the port, which adds MYCFLAGS and re-executes make.
If the user has set CFLAGS=anything in /etc/make.conf, every later read of that file reassigns CFLAGS and thereby overrides the changes that the port has made.
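The sequence above can be sketched with two fragments. This is only an illustration under assumptions, not the real lang/lua Makefile: the flag values are made up, -fPIC merely stands in for whatever the port needs to add, and the hand-off of CFLAGS through the environment is a simplified rendering of what the ports framework does in do-build.

```make
# /etc/make.conf (user setting; the value is only an example)
CFLAGS=		-O2 -pipe

# category/port/Makefile (hypothetical fragment, not the real lang/lua Makefile)
# The first read of make.conf has already happened via sys.mk, so at
# this point CFLAGS is "-O2 -pipe".
CFLAGS+=	-fPIC		# illustrative addition the build depends on

# do-build then runs make again in the work directory, roughly:
#   cd ${WRKSRC} && env CFLAGS="${CFLAGS}" make all
# The child make receives "-O2 -pipe -fPIC" from its environment, but its
# own sys.mk includes /etc/make.conf once more, and the plain CFLAGS=
# assignment there replaces the environment value: -fPIC is gone.
```

The key point is that an assignment read from a makefile (and make.conf is one) takes precedence over a value inherited from the environment, unless make is run with -e.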
Random thoughts to evolve into lessons learnt, and into contributions to the Porter's Handbook and possibly Mk/bsd.port.mk
Passing CFLAGS=... in MAKE_ARGS is often harmful to the average port, including lang/lua, and needs to be limited to cases where a port's own CFLAGS must be stomped on (which should hopefully be rare; this needs to be investigated).
MAKE_ARGS get passed invisibly to sub-makes by means of ${MAKEFLAGS} (see make(1)), where they override variable definitions from the environment or from inside the Makefile.
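To illustrate the precedence described above, consider a small stand-alone Makefile. The file name upstream.mk and the flag -DNEEDED_FEATURE are hypothetical; the file only stands in for a Makefile shipped by some upstream project.

```make
# upstream.mk -- hypothetical stand-in for a Makefile shipped by upstream
CFLAGS=		-O2 -Wall
CFLAGS+=	-DNEEDED_FEATURE	# made-up flag the sources rely on

all:
	@echo "CFLAGS as seen by the build: ${CFLAGS}"
	@${MAKE} -f ${.CURDIR}/upstream.mk sub

sub:
	@echo "CFLAGS as seen by the sub-make: ${CFLAGS}"
```

Comparing "make -f upstream.mk" with "make -f upstream.mk CFLAGS=-O2" should show the effect: in the second case the command-line value is expected to win in both the top-level make and the sub-make, because the assignment travels along in MAKEFLAGS, and -DNEEDED_FEATURE silently disappears.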
lang/lua has opted to set MAKE_ARGS= __MAKE_CONF=${NONEXISTENT}, so that the re-executed make no longer reads /etc/make.conf.
Note that this does not prevent the first reading of /etc/make.conf; that happens before the Makefile is read (see make(1)).
We should consider moving this setting into ports/Mk/bsd.port.mk. That would make sure that /etc/make.conf is read only once, before the port's category/port/Makefile. A single read is sufficient to populate variables and influence the port's build.
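As a sketch of why the __MAKE_CONF setting has this effect: sys.mk only includes the file named by __MAKE_CONF if that file exists. The sys.mk lines below are paraphrased from memory and may differ between FreeBSD versions, so check the installed copy before relying on them.

```make
# /usr/share/mk/sys.mk (paraphrased; check your installed copy)
__MAKE_CONF?=	/etc/make.conf
.if exists(${__MAKE_CONF})
.include "${__MAKE_CONF}"
.endif

# lang/lua/Makefile (the setting quoted above)
# MAKE_ARGS end up on the command line of the child make, so the child
# sees __MAKE_CONF pointing at a path that does not exist and skips
# the include.
MAKE_ARGS=	__MAKE_CONF=${NONEXISTENT}
```

Because the assignment in sys.mk uses ?=, a value given on the command line is left alone, which is what makes the override possible without touching sys.mk.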
The whole issue is sidestepped with gmake, i.e. in ports that set USE_GMAKE=yes: gmake doesn't read /etc/make.conf, and so doesn't fall prey to CFLAGS=... set there (rather than CFLAGS?=... or CFLAGS+=...). Open question: how many ports get "magically" fixed through USE_GMAKE=yes when the only reason is that /etc/make.conf overrides CFLAGS additions?
- It was proposed to fix our /usr/share/examples/etc/make.conf to no longer show CFLAGS=..., but I believe this is infeasible:
  - it has been established practice for years, including in all FreeBSD releases through April 2012;
  - we can't erase all copies of such examples in printed books and on third-party websites;
  - unless we fix /etc/make.conf being read more than once, settings made with += (such as CFLAGS+=...) might get duplicated in ports using BSD make, once more for each level of make recursion (todo: check this)
- mezz@ proposed that I patch CFLAGS into the port's src/Makefile in the post-patch stage (we already use REINPLACE_CMD there to kick out the hard-coded -O2, per the Porter's Handbook). That, however, requires "make clean patch" before a change to CFLAGS takes effect, which is a disadvantage, and it becomes unnecessary once we avoid losing the additions to CFLAGS made in lang/lua/Makefile
- Hard to convince Pav it's needed, but: #bsdports 2012-04-11 21:56:37Z Pav: I need to come up with a legit reason to allow re-reading make.conf deep inside builds
- #bsdports 2012-04-11 22:01:23Z: Pav: exp-run is totally ineffective here, because it can't identify interestingly innovative real world use cases
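The suspected duplication of += settings mentioned in the list above (the "todo: check this" item) could be checked with a small experiment. This is a hypothetical sketch, not part of any port: recurse.mk and -DMARKER are made-up names, and passing CFLAGS through env only mimics, in simplified form, how a port hands its variables to the child make.

```make
# recurse.mk -- hypothetical experiment, not part of any port.
# Assumes /etc/make.conf temporarily contains a line such as:
#   CFLAGS+=	-DMARKER
# Run with BSD make from the directory holding this file:
#   make -f recurse.mk outer

outer:
	@echo "outer: ${CFLAGS}"
	@env CFLAGS="${CFLAGS}" ${MAKE} -f ${.CURDIR}/recurse.mk inner

inner:
	@echo "inner: ${CFLAGS}"
```

If -DMARKER shows up once in the outer line and twice in the inner line, the suspicion is confirmed; if it shows up once in both, += in make.conf is safe at least for this simple case.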
Contributions
If you were involved in reporting and/or solving the bug, have observed related bugs, or are pav@, crees@, ohauer@, Niclas Zeising, or mezz@, feel free to add to this section, and sign your contributions like this: -- MatthiasAndree