Talk to be held at Devsummit 2010 in Ottawa.
git primer
- what
- how
- why
- where
- next
disclaimer: I haven't worked with Perforce in any way or form yet.
what is git?
- fast
- distributed
- work offline
- local branches (private)
- remote branches (multiple namespaces)
- random merging/fetch possible, but a common ancestor really helps
- efficient
- src svn repo: 4419 MB, git: 510 MB
- ports cvs repo: 1519 MB, git: 364 MB
- svn head checkout: 1195 MB, git clone: 1027 MB
- ports cvs checkout: 685 MB, git clone: 822 MB
how does it work
- directed acyclic graph of commit objects pointing to trees (really DAGs) of blobs
blob objects store file data
tree objects point to blobs via filenames and to other tree objects (subdirectories)
commit objects contain date, author, committer, message; they reference zero or more parent commits and exactly one tree object
tag objects point to another object (usually a commit) and can be PGP signed
all objects are identified by their SHA1 sums. objects are stored whole, not as deltas against earlier versions; packfiles use delta compression, though.
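A minimal Python sketch of the object naming described on this slide, assuming the loose-object layout (type, size, NUL byte, payload). The helper names (git_object_id, tree_payload, commit_payload) and the sample author are made up for illustration; real repositories additionally zlib-compress objects and sort tree entries with a directory-aware rule, both of which are left out here.

```python
import hashlib


def git_object_id(obj_type: str, payload: bytes) -> str:
    """SHA1 over '<type> <size>' + NUL + payload, the way git names objects."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def tree_payload(entries):
    """entries: (mode, name, hex id) tuples. Simplified: sorted by plain name."""
    out = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        # Each entry is '<mode> <name>' + NUL + the 20 raw bytes of the id.
        out += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    return out


def commit_payload(tree_id, parent_ids, author, committer, message):
    """A commit names one tree, any number of parents, author and committer."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return "\n".join(lines).encode()


blob_id = git_object_id("blob", b"hello world\n")
tree_id = git_object_id("tree", tree_payload([("100644", "hello.txt", blob_id)]))
who = "A Hacker <hacker@example.org> 1273000000 +0000"  # hypothetical identity
commit_id = git_object_id(
    "commit", commit_payload(tree_id, [], who, who, "initial import\n")
)
print(blob_id)    # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(tree_id)
print(commit_id)
```

Changing a single byte in the blob changes the blob id, hence the tree id, hence the commit id; that is why identical content is stored only once and why history cannot be altered without producing new ids.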
the big picture
local and remote branches
- default names are "origin" and "master", by convention
local: .git/refs/heads/<name-of-branch>
remote: .git/refs/remotes/<name-of-remote>/<name-of-branch>
tags: .git/refs/tags/<name-of-tag>
- these files simply contain SHA1 values pointing to commits
all the content is stored under .git/objects though
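To make the "refs are just files containing SHA1 values" point concrete, here is a small Python sketch that reads loose ref files. It builds a throwaway directory instead of touching a real repository; the function names and the fake SHA1 values are invented for the example. It deliberately ignores .git/packed-refs and symbolic refs such as HEAD, which a real tool would have to handle.

```python
import tempfile
from pathlib import Path


def resolve_ref(git_dir: Path, ref: str) -> str:
    """Return the SHA1 stored in a loose ref file such as refs/heads/master."""
    return (git_dir / ref).read_text().strip()


def list_refs(git_dir: Path):
    """Map every loose ref file under refs/ to the SHA1 it contains."""
    return {
        p.relative_to(git_dir).as_posix(): p.read_text().strip()
        for p in sorted((git_dir / "refs").rglob("*"))
        if p.is_file()
    }


# Fake layout: one local branch, one remote branch, one tag.
demo = Path(tempfile.mkdtemp()) / ".git"
fake = {
    "refs/heads/master": "1" * 40,
    "refs/remotes/origin/master": "2" * 40,
    "refs/tags/v1.0": "3" * 40,
}
for name, sha in fake.items():
    path = demo / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(sha + "\n")

print(resolve_ref(demo, "refs/heads/master"))
for name, sha in list_refs(demo).items():
    print(name, sha)
```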
index
- annoying at first
just use git commit . or git commit file1 file2 dir3
staging area between workspace <-> repository
- lets you craft a single commit bit by bit
real power lies in git add -i somefile
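A toy Python model of the workspace / index / repository split, to show why a commit records what was staged rather than what is on disk. The ToyRepo class is purely illustrative and works on whole files; the real git add -i can stage individual hunks.

```python
class ToyRepo:
    """Three areas: files being edited, staged files, committed snapshots."""

    def __init__(self):
        self.workspace = {}   # path -> content currently on disk
        self.index = {}       # path -> content staged for the next commit
        self.commits = []     # list of (message, snapshot of the index)

    def add(self, *paths):
        # Staging copies the current workspace content into the index.
        for path in paths:
            self.index[path] = self.workspace[path]

    def commit(self, message):
        # A commit snapshots the index, not the workspace.
        self.commits.append((message, dict(self.index)))

    def unstaged(self):
        # Paths whose workspace content differs from what is staged.
        return sorted(
            p for p, c in self.workspace.items() if self.index.get(p) != c
        )


repo = ToyRepo()
repo.workspace.update({
    "Makefile": "all:\n",
    "main.c": "int main(void) { return 0; }\n",
})
repo.add("Makefile")
repo.commit("add build skeleton")
print(repo.unstaged())   # ['main.c']
```

The half-finished main.c stays out of the first commit and can go into a later one, which is the "craft a single commit bit by bit" workflow.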
rewriting history
git commit --amend (trivial, top-most commit is replaced)
git rebase origin/master master (powerful, re-applies all local master commits on top of origin/master)
- (almost) never loses any state!
git reflog shows all previous tips/states
they stay around until their reflog entries expire and git gc prunes them
source: http://eagain.net/articles/git-for-computer-scientists/
rewriting history (cont)
git rebase -i master mybranch (interactive mode, reorder, reword, squash, rework)
- don't merge multiple feature branches into head, keep history readable!
- don't drop one big fat patch
- push a half dozen logically separate commits into the upstream repo
- rebasing is what allows you to craft this handful of commits locally
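Why rebasing "rewrites" history without destroying the old one can be shown with a toy Python model. It assumes a commit id that depends only on the parent id and the patch text (real ids also cover the tree, author and dates), and the patch descriptions are invented.

```python
import hashlib


def commit_id(parent, patch):
    """Toy id: depends on the parent id and the change, like a real commit."""
    return hashlib.sha1(f"{parent}\n{patch}".encode()).hexdigest()[:12]


def build(base, patches):
    """Stack patches on top of base; return the ids of the new commits."""
    ids, parent = [], base
    for patch in patches:
        parent = commit_id(parent, patch)
        ids.append(parent)
    return ids


root = commit_id(None, "initial import")
upstream = build(root, ["fix typo in README", "bump version"])
work = ["add feature X", "tests for feature X"]
mybranch = build(root, work)

reflog = [mybranch[-1]]              # the old tip is remembered
rebased = build(upstream[-1], work)  # same patches, new parent, new ids
reflog.append(rebased[-1])

print("old tip:", reflog[0])
print("new tip:", reflog[1])
```

The rebased commits are new objects; the old ones still exist and stay reachable through the reflog, so a botched rebase can be undone by pointing the branch back at the old tip.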
why? branches
svn branches under user/ are single user (or perceived as such)
svn branches under projects/ are for serious work
- local git branches are light and easy
- everybody can have a crack at a stupid/silly idea
- if it works, rework history into something that's understandable
PROFIT publish!
- if it fails, no embarrassment!
people want to keep ideas to themselves! Impossible with plain svn (don't know about svk).
why? distributed
- don't send around patches, send pull requests!
- group of people can polish a set of commits (history rewriting is good!)
- this includes non-committers of course (helps mentors, Google SoC)
pull from gsoc student, git rebase -i (squash, edit, fix conflicts), push to svn
- git distinguishes author vs. committer
- no patching conflicts, as you check out "remotes/some-guy/featureX"
why? not mercurial
- repository size is rather big
- "then use a limited history!"
- no, we need a common ancestor for easy commit/changeset sharing!
- we need the full history (head branch might suffice, though)
- git: shallow history + graft of old history if desired
- named branches don't scale, clones are more work
one workspace with fast branch switching is preferable (and you can still clone!)
- history rewriting (rebasing) is slow (last time I checked)
- named branches, local branches, mq: all work slightly differently
- git's index actually helps
note: I actually like mercurial!
the bad parts
no $Id$ support to speak of (the ident attribute only expands $Id$ to the blob's SHA1)
- no monotonically increasing "revision" number
- whole-tree approach (but submodules possible)
you cannot branch vendor/foo, the whole tree is branched always
- you cannot merge subdirs only, need to cherry-pick instead
svn mergeinfo --show-revs=eligible {head,stable/8}/games/fortune/datfiles
git log --left-right --cherry-pick master...stable/8 -- games/fortune/datfiles
- the "flatten vendor tree" commits would need to be reversed
possible using git-filter-branch, but not done yet
- due to svn:keywords:
- cannot push new files via git-svn
- cannot push changed files where svn:keywords is missing
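What the git log --left-right --cherry-pick master...stable/8 query on this slide computes can be sketched in Python. The model assumes each commit carries a patch id (a fingerprint of its diff); the commit and patch ids below are made up, and path limiting (-- games/fortune/datfiles) is not modelled.

```python
def left_right_cherry_pick(left, right):
    """left/right: lists of (commit id, patch id) reachable from one side only.

    Drops commits whose patch id also appears on the other side, then
    marks the rest with '<' (left only) or '>' (right only).
    """
    left_patches = {patch for _, patch in left}
    right_patches = {patch for _, patch in right}
    out = [("<", cid) for cid, patch in left if patch not in right_patches]
    out += [(">", cid) for cid, patch in right if patch not in left_patches]
    return out


master_only = [("m1", "p-fix-typo"), ("m2", "p-new-fortunes")]
stable_only = [("s1", "p-fix-typo"), ("s2", "p-errata")]
for mark, cid in left_right_cherry_pick(master_only, stable_only):
    print(mark, cid)
# < m2
# > s2
```

Commit m1 and s1 carry the same change, so both are hidden; what remains on the '<' side is roughly the list of candidates still to be cherry-picked to the stable branch.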
where to get
diy: git svn init -Thead svn://svn.freebsd.org/base && git svn fetch [-r REV]
git-svn cannot convert our vendor*/* branches
svn2git can do this, but has minor bugs when branching cvs2svn branches (help wanted, looks easy!)
- using fromcvs + svn2git and "grafting" the branches together might work
or manual fixups
- due to common ancestor constraint, we should put an official version somewhere[tm]
- cloneable versions at
git://gitorious.org/freebsd/freebsd.git (head since 2005, no authormap, 279 MB, git-svn)
git://acme.spoerlein.net/freebsd-head.git (head, 431 MB, git-svn)
git://acme.spoerlein.net/freebsd.svn.git (head, stable, releng, release, 478 MB, git-svn)
git://acme.spoerlein.net/freebsd.git (all 1920 tags/branches, 510 MB, svn2git)
git://acme.spoerlein.net/freebsd.ports.git (ports, no authormap, 375 MB, fromcvs)
- NetBSD and OpenBSD available too, updated once in a while
- resource in FreeBSD cluster wanted!
coming soon: rsync of the git-svn repo, suitable for use with git svn dcommit
now what?
- we really need a r/o git repo for contributors
- need a way to share git repos (see how Dragonfly folks do it)
- git.freebsd.org/freebsd.git
- git.freebsd.org/~uqs/myclone.git
- git.freebsd.org/~gsoc2010/studentA.git (auth and permissions using SSH keys)
- contributors may use github.com or gitorious.org
- developer does git remote add + git fetch; reviews feature branch
git checkout -b feature2 someone/feature2 && git rebase master && git checkout master && git merge feature2 && git svn dcommit (pushes a series of commits to SVN)
now what? (cont)
git haters are not left out!
git send-email or git format-patch make patches out of commits
gitters can incorporate those patchbombs using git am
- rework commits as usual; return patch-bomb
- do not review diffs using mail!
submitter has to re-type everything! twice the work
- just commit your changes, let them pull again
- don't put versioned diffs/patches somewhere
- a dormant git branch costs nothing, but has sensible commit messages and recorded branchpoint
- allow crazy stuff to happen!
- clang/llvm import using official repo ok; gcc45 import? (GPLv3)
ramblings
- p4 to git converters exist, grafting work onto "official" git repo is possible
- guinea pigs wanted!
- which parts of our svn repo would you like to "fix", given a time machine?
- tell me!
- who has sources of 1/2/3/4BSD? did they use an SCM back then?
- can we bring back 386BSD and FreeBSD 1.0 into the SCM?
- (graft-points can stitch different repos together)
additional goodies:
- gitweb is kick ass
git grep (damn you svn!)
git log -S<string>
git stash
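How git log -S<string> picks its commits can be sketched in a few lines of Python: it reports commits where the number of occurrences of the string changes. The history below is invented, and the sketch looks at a single file only.

```python
def pickaxe(history, needle):
    """history: list of (commit id, file content), oldest first.

    Returns the commits where the number of occurrences of needle
    differs from the previous version of the file.
    """
    hits, previous = [], 0
    for cid, content in history:
        count = content.count(needle)
        if count != previous:
            hits.append(cid)
        previous = count
    return hits


history = [
    ("c1", "int main(void) { return 0; }\n"),
    ("c2", "int main(void) { strlcpy(a, b, n); return 0; }\n"),
    ("c3", "int main(void) { strlcpy(a, b, n); return 1; }\n"),
    ("c4", "int main(void) { return 1; }\n"),
]
print(pickaxe(history, "strlcpy"))   # ['c2', 'c4']
```

Commit c3 touches the same line but leaves the count unchanged, so it is skipped; the result is the commit that introduced the call and the one that removed it.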
conclusion
- fast and efficient
- distributed
- private and public work share same techniques
- contributors more familiar with git than p4?
- contributors get first-class access to repo