The Hackerlab at regexps.com

Idempotent Merging

up: arch

[Idempotent merging is not implemented in the current release.]

Let's suppose that we have a main development path, with several branches:

        main
        ----
        base-0
        patch-1
        patch-2-----------------------> branch-a
        patch-3                 |
        patch-4                 |
        patch-5                 |-----> branch-b
                                |
                                |
                                 -----> branch-c

What happens if each of the three branches checks in a revision that is an update against the main branch. In other words, each branch will have a delta that that summarizes patches 3..5 of the main branch.

If we try to replay from two or more of those branches, we'll wind up replaying several of those deltas that summarize 3..5 . Those patches will be redundant and will likely generate merge conflicts.

Update won't do much better in this case. The common ancestor of all three branches is their base-0 revision, which is the same as patch-2 on main .

Now suppose I start with the latest revision on branch-a , which includes patches 3..5 of the main branch, plus some changes specific to branch-a . And branch-b is similar -- it has 3..5 from main some changes specific to branch-b .

If I update my branch-A revision against branch-B , the A revision is compared to the common ancestor. In essence:

        upate_patch = delta (branch-a--latest, main--patch-2)

Note that the update patch contains all the changes needed for 3..5 from main . update will apply that patch to the latest revision of branch-B :

        update_a_from_b = update_patch [ branch-b--latest ]

but branch-b--latest already includes patches 3..5 from main. There's a good chance the merge will have conflicts.

When such messes occur, the reconcile command, introduced in the previous chapter, can help you out of them. But wouldn't it be better to avoid such problems in the first place?

The i-merge Command

A tool for tackling the problem directly is an idempotent merge:

        % larch i-merge [  --update [ARCHIVE/]REVISION
                        | --replay [ARCHIVE/]REVISION ]
                       [ARCHIVE/]SOURCE-REVISION
                       directory

That creates a project tree in DIRECTORY by using larch get to obtain the SOURCE-REVISION , then applying each of the specified update and replay commands, in the order specified.

If any merge conflicts occur, the command issues an error, and leaves the partially meged directory, along with an explanation of where it left off.

If the command succeeds, though, the project tree will be left in a special state which permits the use of the --idempotent flag to commit .

        % larch commit --idempontent

which, in fact, creates two new revisions. The first revision created is the intermediate directory, containing only SOURCE-REVISION plus the series of updates and replays you specified -- no other changes. The log message for this revision is automatically generated, and has the special header idempotent-merge: with the list of patches applied.

The second revision contains the log message you wrote, plus any subsequent changes you made.

When reply wants to apply a patch set, it checks to see if it is an idempotent patch set. If it is, and all of the patches included in the patch set are missing from the tree being patched, replay proceeds in the usual way: by applying the set of deltas in the patch set.

If some of the patches included in an idempotent merge have already been applied to the tree being patched, then `replay applies only those patches not already included.

One possible policy is that every branch should merge only from the main branch, and should always merge from the main branch using an idempotent update:

        % cd ~/wd

        % larch i-merge --update main branch-a branch-a-merged
        [...]

Each branch will then contain a number of idempotent patch sets, as in this example:

        branch-a                                            branch-b
        --------                                            --------
        base-0                                              base-0
        patch-1                                             patch-1
        patch-2..."idempotent merge w/main patches 2,3"     patch-2
        patch-3         "idempotent merge w/main patch 2"...patch-3
        patch-4                                             patch-4
        patch-5..."idempotent merge w/main patch 4"         patch-5
        patch-6         "idempotent merge w/main patch 3"...patch-6
        patch-7..."idempotent merge w/main patch 5"         patch-7
        patch-8                                             patch-8
                    "idempotent merge w/main patches 4,5"...patch-9
                                                            patch-10

What if we want to form a merge of these two branches?

idempotent Merges and the replay Command

We can start with a project tree for the latest revision of branch-a

        % larch get ~/wd/branch-a
        % cd ~/wd/branch-a

Branch-a does not already have a patch log for branch-b , though the two branches have a common ancestor, so add-sibling-log will solve that problem:

        % larch add-sibling-log branch-b
        [....]

Now we can find out what the merge needs to do:

        % larch whats-missing branch-b
        patch-1
        patch-2
        patch-3
        patch-4
        patch-5
        patch-6
        patch-7
        patch-8
        patch-9
        patch-10

If we use replay :

        % cd ~/wd
        % larch replay ~/wd/branch-a ~/wd/branch-a-merged branch-b

These patches will be applied:

        branch-b/patch-1
        branch-b/patch-2
        branch-b/patch-4
        branch-b/patch-5
        branch-b/patch-7
        branch-b/patch-8
        branch-b/patch-10

Patches 3, 6, 9 are skipped (though their log entries are added to the project tree) because branch-a already has all of the patches those idempotent patch sets include.

idempotent Patch Sets and the update Command

What will update do? It computes a patch between the project tree being updated and the common ancestor:

        update_patch = delta (branch-a--patch-8, branch-b--base-0)

then applies that to the update revision:

        branch-a-mege  = update_patch [ branch-b--patch-10 ]

The idempotent revisions don't help there. However....

idempotent Patch Sets and Partial Updates

The --partial flag to the update command takes advantage of idempotent patch sets.

As before, we assume these revisions:

        branch-a                                            branch-b
        --------                                            --------
        base-0                                              base-0
        patch-1                                             patch-1
        patch-2..."idempotent merge w/main patches 2,3"     patch-2
        patch-3         "idempotent merge w/main patch 2"...patch-3
        patch-4                                             patch-4
        patch-5..."idempotent merge w/main patch 4"         patch-5
        patch-6         "idempotent merge w/main patch 3"...patch-6
        patch-7..."idempotent merge w/main patch 5"         patch-7
        patch-8                                             patch-8
                    "idempotent merge w/main patches 4,5"...patch-9
                                                            patch-10

We have a tree for branch-a--patch-8 and it's ancestor on branch-b is base-0 . So:

        % cd ~/wd/branch-a--patch-8
        % larch whats-missing branch-b
        patch-1
        patch-2
        patch-3
        patch-4
        patch-5
        [...]
        patch-10

If we use:

        % larch update --partial ~/wd/branch-a--patch-8 \
                                ~/wd/branch-a-b-partial-update \
                                branch-b

then update uses not the latest revision of branch-b , but the revision just prior to the oldest idempotent patch set that branch-a does not yet have. In this case, patch-3 of branch-b is the oldest idempotent patch that branch-a is missing. So:

        % larch update --partial ~/wd/branch-a--patch-8 \
                                ~/wd/branch-a-b-partial-update \
                                branch-b

is eqivalent to:

        % larch update --partial ~/wd/branch-a--patch-8 \
                                ~/wd/branch-a-b-partial-update \
                                branch-b--patch-2

After which:

        % cd ~/wd/branch-a-b-partial-update
        % larch whats-missing branch-b
        patch-3
        patch-4
        patch-5
        [...]
        patch-10

When the earliest missing patch prior to a partial update is an idempotent patch set, and the tree already has all of the patches included in that patch set, update simply adds the log message for the idempotent patch to the tree being patched, and stops.

If the tree being patched has none of the patches included in that patch set, update updates against the idempotent revision in the normal way.

Finally, if the tree being patch has some but not all of the patches included in the idempotent patch set, udpate gives up with an error and suggests that you use replay to apply the idempotent patch set. (In a future release, update will do something more intellegent in this case).

The upshot of this is that you can merge branches A and B with a series of partial updates and replays:

        # update against patch-level 2
        # 
        % larch update --partial ~/wd/branch-a--patch-8
                                ~/wd/branch-a-b-partial-update
                                branch-b

        # replay patch-level 3
        #   -- we already have all the patches included in the
        #      idempotent branch-b patch, "patch-3" - so all this
        #      really does is install a log message.
        # 
        % larch replay ~/wd/branch-a-b-partial-update
                      ~/wd/branch-a-b-partial-update-2
                      branch-b--patch-3

        # update against patch-level 5
        # 
        % larch update --partial ~/wd/branch-a-b-partial-update-2
                                ~/wd/branch-a-b-partial-update-3
                                branch-b

        % larch replay ~/wd/branch-a-b-partial-update-3
                      ~/wd/branch-a-b-partial-update-4
                      branch-b--patch-5

        % larch update --partial ~/wd/branch-a-b-partial-update-5
                                ~/wd/branch-a-b-partial-update-6
                                branch-b

        % larch replay ~/wd/branch-a-b-partial-update-6
                      ~/wd/branch-a-b-partial-update-7
                      branch-b--patch-9

        % larch update --partial ~/wd/branch-a-b-partial-update-9
                                ~/wd/branch-a-b-merged
                                branch-b

arch: The arch Revision Control System
The Hackerlab at regexps.com