Pull requests and history

I just made a mistake merging PR 262 [1]. There is nothing wrong with 
the tests that have been submitted; they have been reviewed and fixes 
made. However the history on the branch is confusing and the merge ended 
up pulling in different versions of a bunch of commits that are already 
on mainline. This isn't really trivial to undo so, since I think it is 
harmless in this case, I will leave it. However we should prevent this 
happening again, which means that test authors need to know what to do 
with their branches and mergers need to know to look out for the problem.

My best guess for what happened in this case is that, at some point, the 
author rebased master on top of their branch so that a history like

-C - A1 - A2
   \
    - B1 - B2

ended up like

-C - A1 - A2
   \
    - B1 - B2 -A1' -A2'

Then when you merge you get both A1 and A1' in the history (and 
similarly for A2).

So the first rule is to never rebase any other branch on top of your 
branch, unless you want the commits in the other branch to become part 
of your review. In particular it is never right to rebase master on top 
of your branch.

In general, the simplest rule to follow is "don't merge or rebasing on 
your branch *at all*, unless you *know* that there is a conflict between 
your work and master".

The above rule should be read to include "rebases" that only rewrite the 
branch history but don't actually change the base of the branch. There 
seems to be an unfortunate tendency amongst github users to use git 
commit --amend to continually rewrite their pull request in such a way 
as to only have a single commit on the branch at all times. This is bad 
because, as always with a branch that you have pushed, you don't know 
who else has pulled that branch and is working on it. In particular 
critic pulls all our PR branches and, although I might be able to fix it 
to deal with rebases automatically (there is currently a button that 
needs to be pushed manually), it makes the reviews ugly and hard to 
understand as all commits end up as "equivalent merge commits". It also 
makes the github review tools even more useless since you have to 
restart the review from scratch each time.

In the case where you know there is a conflict between your branch and 
master, there are two options for dealing with it. Option 1) is to merge 
master into your branch. This is straightforward (I think, I haven't 
actually ever done this, as far as I recall), but makes the history even 
more confused than it already is. Option 2) is to rebase your branch on 
master (note that this is the opposite way around to the problematic 
situation above). This is nice, because it makes the history better, but 
is a little more effort. In particular, it is very helpful for critic if 
you can push the branch as soon as you do the rebase so that it can tell 
the difference between the history change due to merge conflicts and the 
history change due to additional commits to the branch.

The one other situation where rebasing is acceptable is after your 
review is done, in order to clean up the branch history. Lots of commits 
of the form "fixup! remove trailing whitespace" don't add any value, but 
do add lots of noise. So doing an in-place rewrite to squash a branch 
down into a few logically separate commits is fine. Of course you need 
to ensure that no one merges your branch before you do this.

Can we ensure that the essential points here are in the documentation 
somewhere prominent? We should have advice both for test authors, and 
for reviews to look out for bad history before merging.

(a final aside: I think that the github UI encourages bad style in 
repository history. If we weren't constrained by that, I would strongly 
advocate using rebases, not merges, to integrate branches with master. 
This would have the significant advantage that the history of master 
would be linear and therefore easy to follow. In particular I would 
suggest a workflow of

(branch) ->- (commit) ->- (get review) ->- (squash) ->- (rebase on
           \--------------<-------------/                 master)

However since github has a big green "merge" button and no "interactive 
rebase onto master" button, that isn't easy to enforce. This makes me 
grumpy.)

[1] https://github.com/w3c/web-platform-tests/pull/262

Received on Wednesday, 21 August 2013 15:39:54 UTC