FreeBSD.org - What went wrong; and a way out

Few would disagree that we, as a project, are in a spot of bother. I'm going to risk trying to document what I've seen happen with freebsd.org over the last 15+ years and explain how and why we got where we are.

I'm going to over-simplify at the expense of pure correctness for things that aren't really important to the direction I'm trying to take this.

In the beginning (roughly 1993), in the aftermath of "the patchkit", there was a group of "core developers". We had a close relationship with Walnut Creek CDROM. The division of tasks and responsibility was fairly distinct, but there were grey areas due to cross pollination.

In general though, the FreeBSD project's part of the deal was we got to make a damn fine unix system. WC-CDROM took care of taking what we produced and beating it into a product. The division was fairly simple.

We did the unix parts; they did installers, releases, release building and packaging, answered the phones, and supported end users.

WC failed for many reasons, including its company-wide business model failing to adapt to the coming of ubiquitous internet. It wasn't because there was no market for producing and supporting a freebsd product. They were doing slackware linux and a whole multitude of other things.

There were a few changes of hands, BSDI's implosion, WindRiver, etc.

And somewhere along the way we fooled ourselves into believing we could actually do it all ourselves: everything from producing a Unix OS where technical excellence mattered through to release engineering, packaging and end user support.

In hindsight, we, as a project, spread ourselves way too thin. We compromised our technical work in order to accommodate release cycle demands. We never really committed to improving our install infrastructure or our packaging, and our end-user support is a black hole. However, somehow, in spite of re@ barely getting enough support from the project, we have managed to put out releases that are "good enough" that there was never a clear vacuum for a 3rd party WC-CDROM replacement to make a business out of. Our release system and process are still essentially the same as they were when we picked up the pieces from WC-CDROM.

It's not re@'s fault - they work very hard and are responsive and so on. But it's perpetuating a mistake that we, as a project, made and have been paying for ever since.

The details of how we got here or why aren't really that important. This was meant as a little bit of context for the newer folk from somebody who was there and participated in making the mistakes we've made.

We've been skirting around the broken process for a while now and have developed some highly unusual aberrations as a result, at least from the perspective of open source volunteer projects.

Take the whole slush/freeze thing as an example. Once upon a time, cvs mechanics demanded it. Merging was so painful that we absolutely needed to pre-fix as much as possible before a branch. Over time, we realized that it also was a useful way to force people to help out with the release process.

But now we have a decent SCM system, so the only thing left of the process is "to force people to help out with releases". i.e.: take away the ability to work on moving freebsd forward, to create an intolerable environment on the theory that developers will work on release issues to get the freeze over and done with.

Wait, what?? That is the kicker here. We have to come up with systems to force people to care about releases? Think about that.

Then along came P4/projects repositories, git/hg and other dvcs's. Part of the attraction of those systems is that their users get to ignore the freezes and keep on doing their thing. Having a tier-1 officially supported shared git repo as a means of doing collaborative work means... even more people get to ignore freezes and skip helping with re@ work, just like the folks who work in perforce.freebsd.org already get to do.

Wait, what??? How is this a good thing again?

The end result is that we've compromised the very things that made us what we are.

We started off as a developer-oriented Unix project that aimed for technical excellence. An entirely separate entity took care of end user concerns.

Now we've become an organization that spends much of its time trying to subvert its own self-imposed processes, and we have let ourselves fall significantly behind in the technical excellence area.

We've become a "Jack of all trades, master of none" project. There is no future in that.

I doubt that there are any people who are truly happy with where we are now, and if we don't change something, our destiny is to fade away as an "also-ran".

The "What is FreeBSD, anyway?" question really needs to be answered.

A strawman

The big picture issues are so important that they need to be dealt with first. Which details to deal with later depends on what direction we go.

If I had a magic wand, or a time machine, and could ignore logistics and freely change a few key things, here's what I'd do and why.

TL;DR version first:

  1. I would split freebsd.org development and release engineering into two entities, repositories, etc.
  2. "head" would become "stable" and our development workflows would be changed to support that.
  3. The "head" src repository would NOT have long-lived branches. It would have WIP-feature branches only.
  4. re@ as it exists right now does a very good job of stabilizing trees. Its job would be changed so they'd be responsible for keeping "head" in a perpetually "stable" condition, and they'd be given a stick to wield to achieve that.
  5. core@ (for lack of another entity) would take an active part in tracking long term project goals and progress, flag waving, cat herding, encouragement etc to try and keep people's eyes on the ball - to get stuff done.

And the why's..

Splitting is important because it gives the two groups clear goals and responsibilities. There would be a significant overlap between them of course. It can't work without the other infrastructure, workflow and commitment changes described below.

The expectation that "head" is usably stable is absolutely crucial, as is the implementation of the support infrastructure to make that possible.

The product of the development half of the project has to be a reasonably "stable" head. I didn't say "release quality", but rather something that you could genuinely expect a developer to run on their desktop or laptop or server.

Having a perpetually stable "head" is crucial because that means that the release organization doesn't have to go through month-long freezes or stabilization efforts to get something releasable. The function of the release half of the org, or the sister org, or whatever, is to do timely releases when the timing is right and the incremental feature set is right. And in between releases, they get to work on improving the packaging, installation, integration with the ports folks, etc. Once betas are cut and bugs start to turn up, they're fixed in the release tree and pushed "upstream" to head. There are no more pre-release freezes to fix stuff. head keeps moving.

And for "head" to be stable, then there needs to be infrastructure to make it work. We *need* practical, collaborative feature development branches. Potentially destabilizing changes get thrashed out in a feature branch, like what happens in perforce.freebsd.org.

If you look at perforce.freebsd.org, you'll discover that probably 3/4 of the feature branches are abandoned or dead ends because they didn't work out. This is a good thing, working as intended. It is 100000 times better for a feature that doesn't pass muster to bit-rot in a branch rather than to bit-rot in head.

And this is where re-2.0 needs a stick. Potentially destabilizing patches get sorted out in feature branches and go in when they're ready. If something impairs the usability of "head" and can't be quickly and trivially fixed, then re-2.0 gets to promptly back it out. If you didn't get an re signoff before committing and your change breaks something in a non-trivial way, then there's a free pass for an insta-backout.
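The backout rule above can be read as a small decision procedure. The following is only a sketch of one possible reading of it, not an agreed policy: the `Commit` fields and the outcome strings are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    """A change that landed in "head". All fields are illustrative assumptions."""
    breaks_head: bool   # does it impair the usability of "head"?
    trivial_fix: bool   # can the breakage be fixed quickly and trivially?
    re_signoff: bool    # did re-2.0 sign off before the commit went in?


def backout_decision(c: Commit) -> str:
    """Return what re-2.0 does with a commit under this reading of the rule."""
    if not c.breaks_head:
        return "keep"            # nothing is broken, nothing to do
    if c.trivial_fix:
        return "fix in place"    # quick, trivial fixes stay in head
    if not c.re_signoff:
        return "insta-backout"   # no signoff + non-trivial breakage: free pass
    return "prompt backout"      # signed off, but still not quickly fixable
```

For example, `backout_decision(Commit(breaks_head=True, trivial_fix=False, re_signoff=False))` falls into the "insta-backout" case. The point of the sketch is that the outcome depends only on the state of "head", never on who committed the change.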

And Somebody(TM), presumably core or status-reports-2.0 gets to document actual useful work-in-progress, muster support for important projects, etc.

Here's the next important thing for Somebody(TM) to do: actively balance development agility vs the ability to abandon old code. We need somebody to give permission to break old stuff so long as the benefits outweigh the loss. That isn't something to burden re-2.0 with; that's more a core type function. It requires making a considered cost/benefit call and considering how it fits with published timelines so we don't catch downstream consumers off guard.

And the how's...

Agile feature branching is necessary to support a feature-branch driven workflow.

I don't see svn as a contender. (As an aside, recall that cvs->svn was a stop-loss conversion to stop the continual metadata loss from cvs. It was never the "final solution"). Agile isn't a word that people often associate with svn.

The two remaining realistic possibilities are either something like p4 (assuming their upcoming "stream" feature works like it appears), or a dvcs like git / hg.

As much as I'd like to see it be p4, I no longer have confidence in their ability or willingness to actually take care of basic quality of life issues. (We're still waiting for them to fix stuff that they broke 10 years ago). Their command line interface is very second rate and clearly not something that they care about. Not to mention it being closed source.

As for dvcs's, it seems fairly clear that the community at large is falling in behind git. Yes, there are a few high profile exceptions, but in general the convergence seems to be on git.

Git *does* have serious problems. It also wasn't designed with our freebsd.org-1.0 workflow in mind. However, the question is whether freebsd.org-2.0 and git can put us in a better position. I think they can.

Assuming that we tear up the rulebook for "the way things are done" in the freebsd.org universe, here's how it could look with git.

I've glossed over a lot of details, but it certainly can be done in a way that isn't inconvenient the way some of svn's quirks are.
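To make the feature-branch idea a little more concrete, here is a minimal sketch of the git commands such a workflow might involve, written as a small Python helper that only builds and prints the command lines (it does not run git). Everything specific in it is an assumption for illustration: the remote name `origin`, the mainline branch name `main` standing in for the stable "head", the `feature/` prefix, and the helper function names.

```python
import shlex

REMOTE = "origin"   # assumed remote name
MAINLINE = "main"   # assumed branch name standing in for the stable "head"


def start_feature(name):
    """Commands to start a shared WIP feature branch off the stable mainline."""
    branch = f"feature/{name}"
    return [
        ["git", "fetch", REMOTE],
        ["git", "checkout", "-b", branch, f"{REMOTE}/{MAINLINE}"],
        ["git", "push", "-u", REMOTE, branch],   # publish so others can collaborate
    ]


def land_feature(name):
    """Commands to merge a finished feature branch into the mainline."""
    branch = f"feature/{name}"
    return [
        ["git", "fetch", REMOTE],
        ["git", "checkout", MAINLINE],
        ["git", "merge", "--ff-only", f"{REMOTE}/{MAINLINE}"],  # bring local mainline up to date
        ["git", "merge", "--no-ff", branch],     # one merge commit per feature
        ["git", "push", REMOTE, MAINLINE],
    ]


def back_out(merge_commit):
    """Commands to back out a landed feature by reverting its merge commit."""
    return [
        ["git", "revert", "-m", "1", merge_commit],
        ["git", "push", REMOTE, MAINLINE],
    ]


def render(commands):
    """Render a list of argv lists as shell-quoted command lines."""
    return "\n".join(shlex.join(cmd) for cmd in commands)
```

Calling `print(render(start_feature("new-vm")))` prints the three commands for opening a hypothetical `feature/new-vm` branch. Landing with `--no-ff` keeps each feature as a single merge commit, so a backout is one `git revert -m 1` rather than a hunt through individual commits. An abandoned feature simply never gets landed and bit-rots in its branch instead of in the mainline.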

And what does this buy us, after all this turmoil?

At our heart, we always have been an OS project that produces a Unix for developers.

If you look at the film industry for a parallel, the Director makes the film with the goal of making the best film they can. Meanwhile the Producer's job is to try and get a Product their employer can sell. The push/pull between Director / Producer serves their industry well - each party knows clearly what their goal is.

Splitting freebsd into developer and product has similar benefits. It allows us to get back to the core thing that we've always been good at, and that is producing a damn fine unix system. We've never been good at producing a 'damn fine OS release'. Splitting the efforts lets people get back to focusing on their areas of strength.

Of course, some hypothetical entity might come along and realize there is a living to be made in Producing a Product, but I really think we've missed the boat on that one. PCBSD makes a fine product, and we'd probably be better off encouraging them rather than competing with them.

Changing our development workflow as described above (eg: the git example) goes a long way towards getting us back to a vibrant development process and a 'head' that most developers would actually *want* to run on their machines.

And maybe, just maybe, getting back to what we're good at will allow our downstream users to forgive some of our rougher edges in packaging.

What about....

TL;DR

This includes a clear responsibility split along the lines of the film industry's Director / Producer roles.

FreeBSD-ng-detail (last edited 2011-09-11 22:57:51 by PeterWemm)