Branching/version control [was Re: cards.dll]

Wed Mar 17 09:53:06 CST 2004

On Wed, 2004-03-17 at 14:31, Dimitrie O. Paun wrote:
> arch sure sounds interesting (except for the file naming conventions :)),

Yeah, they bug me too, but I don't think it's a big issue really.

> but before we can consider switching we *must* have infrastructure
> available that's comparable to the CVS one. And here I mean:
>   - cvsweb: for web browsing

ArchZoom: http://migo.sixbit.org/software/archzoom/

ViewArch: http://arch.bluegate.org/cgi-bin/viewarch.cgi

From playing around it seems like ViewArch is closer to what we have
now...

>   - cvsup: for fast synch

Arch already has this built in, effectively. Arch works in a
fundamentally different way to CVS - it's based on applying changesets
in order rather than keeping track of HEAD and working backwards.

i.e. running "tla get wine" or whatever actually downloads the first
checkin then all the patches and applies them in turn. Obviously that's
too slow for most projects so you can stow cached revisions (i.e. tree
snapshots) along the way, so it only downloads the latest cached
revision and works forward from there. That only occurs on initial
checkout of course.

From there "tla update" grabs each new patch and applies it in turn.
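
To make the replay idea concrete, here is a toy Python sketch. It is not
how tla is implemented; the function names and the dict-based "tree" are
made up purely for illustration. A changeset maps file paths to new
contents (None meaning "deleted"), and a cached revision is just a saved
copy of the tree after N changesets.

```python
# Toy model of changeset replay. All names are hypothetical; this only
# illustrates the idea of "apply patches in order, starting from a cache".

def apply_changeset(tree, changeset):
    """Return a new tree (dict of path -> contents) with one changeset applied."""
    new_tree = dict(tree)
    for path, contents in changeset.items():
        if contents is None:
            new_tree.pop(path, None)  # None marks a deleted file
        else:
            new_tree[path] = contents
    return new_tree


def checkout(changesets, cached=None):
    """Build the tree at the latest revision.

    changesets: list of changesets, index 0 being the first checkin.
    cached: optional dict mapping a revision number N to a snapshot of the
            tree after the first N changesets.
    Returns (tree, number_of_changesets_applied).
    """
    cached = cached or {}
    start = max(cached, default=0)       # newest cached revision, or 0
    tree = dict(cached.get(start, {}))   # start from the snapshot if any
    applied = 0
    for changeset in changesets[start:]:
        tree = apply_changeset(tree, changeset)
        applied += 1
    return tree, applied
```

With four changesets and a snapshot cached after the third, only one
changeset has to be fetched and applied instead of four, and the
resulting tree is the same either way.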

>   - patch: for the winehq-cvs messages

Arch works with changesets natively, so you can run the relevant command
with the patch ID to get this sort of output at any time.

> When all that is in place, we may consider looking into it, to see
> if the switch is worth the pain. Please remember that virtually
> everybody knows CVS, whereas almost no one knows arch, so for new
> (and existing developers) it's a big pain to switch. So we must
> have some pretty important reasons to go down this path, and 
> "arch is a cool concept" does not qualify :)

I think a compelling argument can be made for that. While everyone knows
CVS, the subset of commands we all use is tiny, mostly because only
Alexandre commits. In fact I'd guess 99% of CVS usage on Wine is:

* cvs update
* cvs diff
* cvs log/status

The equivalents in arch are:

* tla update OR tla replay (difference explained below)
* tla file-diffs

There is no equivalent for file-based log/status AFAIK as arch doesn't
track files individually like CVS does. The upside is that arch
understands things like file renames/moves/symlinks.

OK, so why should we use arch?
==============================

Wine is a project that operates similarly to the Linux kernel. There is a
benign dictator, who controls CVS. We all grab CVS, hack in our own
trees, separate the changes out into a patch and email it to
wine-patches. Normally, if we got it right, Alexandre will check it in
forming a logical changeset, and we then all run cvs update which
downloads everyone's changes.

This works OK but has a number of disadvantages:

* No branches. 
Some Wine work would be best done in parallel to the main tree. For
instance, the filesystem work, the WM rewrite etc. Wine's current modus
operandi makes this very hard, as effectively CVS HEAD must be at least
dogfoodable at all times. This also makes it hard to do R&D projects
like the shared memory wineserver while keeping the results of that R&D
usable.

* Hard to work on locally
Not every patch we write gets checked in; this is the reality of life
working on Wine. Sometimes because those patches are incomplete or
wrong, sometimes because they get forgotten or missed, sometimes because
Alexandre doesn't agree that the code belongs in the main Wine tree
(things like the system tray patch, delayed debug tracing patch etc
spring to mind).

Over time, a particular checkout of wine will accumulate debugging
cruft, random unchecked in patches and so on. I already have one tree
I've practically abandoned for new development because the differential
got so large (11,000+ lines). It gets hard to separate out individual
changes into atomic patches, especially when patches depend on each 
other.

Yes, you could say I should never have allowed the diff to get so large,
and believe me I wish it hadn't happened, but in practice people still
use and apply patches like systray/debug tracing and expect me (!) to
maintain them, so I need to keep them around in some form. There are
also patches in there that bounced pending extra work that I never got
around to and so on.

Nowadays I simply use separate checkouts of CVS to try and manage it
all.

* CVS is not changeset oriented.
It's hard to do regression checks because the closest CVS
has to this is date based rewinds. We try and slap changesets on top
using patch.py, which works well, but CVS is not naturally inclined
towards it. Arch lets you say "rewind to this changeset" and it'll just
do it. Binary searches become a lot simpler.

* Hard to get into "The Zone"
Sometimes, some people or teams of people will have a mad coding session
and produce a ton of patches. The Direct3D work by Jason, Raphael and
Christian last year is one good example. My WineCfg work was another.
Unfortunately our current model makes this a total pain in the ass
because the patches are against CVS not your previous work. I ended up
writing some scripts to help with this, by having two trees which I
applied patches to in sequence then generated a diff against. It was
annoying. This is especially true if AJ bottlenecks - for instance
during some of the D3D work he was on holiday.
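
On the regression-check point above: once history is an ordered list of
changesets, finding the one that broke something is a plain binary
search. Here is a rough Python sketch of that. The is_good callback is
hypothetical; in practice it would mean "rewind the tree to that
changeset, build, and run the app". It assumes revision 0 works, the
latest revision is broken, and the breakage persists once introduced.

```python
# Sketch of a binary search for the changeset that introduced a regression.
# is_good(n) is a hypothetical test of the tree after the first n changesets.

def first_bad_changeset(num_changesets, is_good):
    """Return the 1-based number of the first changeset that breaks things.

    Assumes revision 0 is good, revision num_changesets is bad, and every
    revision after the breaking changeset is also bad.
    """
    lo, hi = 0, num_changesets  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(mid):
            lo = mid
        else:
            hi = mid
    return hi
```

For 100 changesets this needs at most 7 build-and-test rounds, rather
than guessing at dates and hoping.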

Why does arch work better?
==========================

Arch is far from perfect, in particular it's rather quirky and has a
ridiculously complex command line interface. However, it has one feature
that is make or break: distributed branching/merging.

In arch you can have a tree in an archive somewhere up on the net, and
others can make a branch of it in their own archive and start checking
in changes. They can remerge periodically and arch will just deal with
problems like multiple remerges (in both directions). The owner of the
original tree can merge the branch into their own when the time is
right, or cherrypick changesets and just apply them.

In other words, I could branch WineHQ and make some changes, then you,
Dimi, could branch *my* tree and make some changes yourself, and so on.
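
A toy Python model of why repeated merges in both directions can be
painless: if every tree records which changesets it already contains, a
merge only has to bring over the ones that are missing, so merging twice
is harmless. This is my simplification, not arch's actual data model or
merge algorithm; the Tree class and the patch naming are invented for the
example, and real merges obviously also have to deal with conflicts.

```python
# Toy model of distributed branching: each tree keeps a log of the
# changeset IDs it contains. Names and ID format are hypothetical.

class Tree:
    def __init__(self, name):
        self.name = name
        self.log = []        # changeset IDs applied to this tree, in order
        self._commits = 0    # how many changesets were committed here

    def commit(self):
        """Record a new local changeset and return its ID."""
        self._commits += 1
        patch_id = f"{self.name}--patch-{self._commits}"
        self.log.append(patch_id)
        return patch_id

    def branch(self, name):
        """Create a new tree that starts with everything this tree has."""
        child = Tree(name)
        child.log = list(self.log)
        return child

    def merge_from(self, other):
        """Pull in changesets from other that this tree lacks.

        Returns the list of changeset IDs that were newly applied.
        """
        have = set(self.log)
        missing = [p for p in other.log if p not in have]
        self.log.extend(missing)
        return missing
```

So WineHQ can be branched by me, my tree branched again by Dimi, and
changesets can flow back up or down in any order without anything being
applied twice.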

So how might Wine development look in a post-arch world?
========================================================

Here's one vision. See what you think.

Alexandre of course remains the ultimate maintainer and dictator of
Wine. His tree is the canonical one which we all work from, and he does
releases of his tree a la Linus.

However, let's say I engage on a particular project (make WinFoo work).
Maybe I'm doing it for a customer, maybe I just want to make people in
#winehq happy. I can grab WineHQ, branch it, and start committing. Along
the way, I can submit each changeset back to wine-patches with a very
simple script: arch will generate a whole-tree changeset with one
command. When I next commit of course the diff will be reset to zero so
I don't end up with bits of other patches interfering.

Alternatively, if there are a lot of patches which depend on each other,
Alexandre can pull the tree directly and use arch's built-in merging
operations to synchronise the two. I can keep on working while this is
going on by the way - AJ can merge with my tree as many times as he
likes, and vice-versa.

Let's say only 90% of the patches I write to make this app work get
checked into WineHQ. Users who only care about this app are still happy
as they can just "tla get mike@navi.cx--wine--winfoo" and grab a version
of Wine that works for their app. The community is happy because the
bulk of the patches got back to the main tree anyway.

Let's say Jason, Raphael and Christian have another D3D codefest. The
best way to work on this is for one of them to branch WineHQ and start
work. The others branch this subtree *again* and begin work on their own
trees. The temporary D3D "maintainer" merges with the other guys' work,
and so nobody bottlenecks on AJ or has to stop work for a few days while
the patch queue clears so they can get clean diffs easily.

The end result of that work can be then trivially merged back into
WineHQ. Alexandre can review the entire patch at once to get the
zeitgeist of it, while still seeing the progression of the code if he
wishes. Meanwhile the huge code churn doesn't impact others working on
other parts of the codebase.

It'd also make it practical for Wine to split into subprojects for
really huge pieces of work. For instance, currently MikeM is hacking on
MSI alone. If he had the ability to set up a separate project for it, and
invite others to begin work with him, it might be easier to get people
involved. Once the work has matured the project can be terminated and we
go back to the central peer-reviewed model.

So, what do people think? There are tools to sync CVS and arch, I can
set one up over the holidays (starting for me on Saturday) so we can get
a flavour of it, if there is interest.

thanks -mike