Fossil

Check-in [dc1a5cf7]

Overview
Comment: Updates to the branching document.
SHA1: dc1a5cf739b0b1e83c273d5f68805a5c86d935af
User & Date: drh 2009-01-23 22:05:37
Context
2009-01-23
22:20 Update the timeline so that it's use of "Leaf" conforms to the definition given in the documentation. (check-in: cb31e908, user: drh, tags: trunk)
22:05 Updates to the branching document. (check-in: dc1a5cf7, user: drh, tags: trunk)
21:24 First draft of the "branching" document. (check-in: 2e275c14, user: drh, tags: trunk)
Changes

Changes to www/branching.wiki.

Old version (lines 3-17, 25-48, 55-87, 94-115, 120-183, 217-231):

</h1>

In a simple and perfect world, the development of a project would proceed
linearly, as shown in figure 1.

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch01.gif"><br>
Figure 1
</td></tr></table></center>

Each circle represents a check-in.  For the sake of clarity, the check-ins
are given small consecutive numbers.  In a real system, of course, the
check-in numbers would be 40-character SHA1 hashes since it is not possible
to allocate collision-free sequential numbers is a distributed system.
................................................................................
and that 1 is a <i>parent</i> of 2.  
Check-in 3 is derived from check-in 2, making
3 a child of 2.  We say that 3 is a <i>descendant</i> of both 1 and 2 and that 1
and 2 are both <i>ancestors</i> of 3.  

We call the graph of check-ins a <i>tree</i>.  Check-in 1 is the <i>root</i>
since it has no ancestors.  Check-in 4 is a <i>leaf</i> of the tree since
it has no descendants.

Alas, reality often interferes with the simple linear development of a
project.  Suppose two programmers make independent modifications to check-in 2.
After both changes are checked in, we have a check-in graph that looks 
like figure 2:

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch02.gif"><br>
Figure 2
</td></tr></table></center>

The graph in figure 2 has two leaves: check-ins 3 and 4.  Check-in 2 has
two children, check-ins 3 and 4.  We call this stituation a <i>fork</i>.

Fossil tries to prevent forks.  Suppose the two programmers who were
................................................................................
together with his own changes.  After merging, Bob could then commit
check-in 4 as a child of check-in 3 and the result would be a linear graph
as shown in figure 1.  This is how CVS works.  This is also how fossil
works in "autosync" mode.

But it might be that Bob is off-network when he does his commit, so he
has no way of knowing that Alice has already committed her changes.
Or, it could be that Bob has turned of "autosync" mode in SQLite.  Or,
maybe Bob just doesn't want to merge in Alices changes before he has
saved his own, so he forces the commit to occur using the "--force" option
to the fossil <b>commit</b> command.  For whatever reason, two commits against
check-in 2 have occurred and now the tree has two leaves.

So which version of the project is the "latest" in the sense of having
the most features and the most bug fixes?  When there is more than
one leaf in the graph, you don't really know.  So we like to have
graphs with a single leaf.

To resolve this situation, Alice can use the fossil <b>merge</b> command
to me merge in Bob's changes in here local copy of check-in 3.  Then she
can commit the results as check-in 5.  This results in a tree as shown
in figure 3.

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch03.gif"><br>
Figure 3
</td></tr></table></center>

Check-in 5 is a direct child of check-in 3 because it was created by editing
check-in 3.  But check-in 5 also inherits the changes from check-in 4 by
virtual of the merge.  So we say that check-in 5 is a <i>merge child</i>
of check-in 4 and that it is a <i>direct child</i> of check-in 3.  
................................................................................
never occurred.  The resulting graph would have been linear, as shown
in figure 1.  Really the graph of figure 1 is a subset of figure 3.
Hold your hand over the check-in 4 circle of figure 3 and then figure
3 looks exactly like figure 1 (except that the leaf has a different check-in
number, but that is just a notational difference - the two check-ins have
exactly the same content).  In other words, figure 3 is really a superset
of figure 1.  The check-in 4 of figure 3 captures addition state which
is omitted from figure 1.  In check-in 4 of figure 3 is a copy
of Bob's local checkout before he merged in Alices changes.  That snapshot
of Bob's changes independent of Alice's changes is omitted from figure 1.
Some people say that the approach taken in figure 3 is better because it
preserves this extra intermediate state.  Others say that the approach
taken in figure 1 is better because it is much easier to visualize a
linear line of development and because the the merging happens automatically
instead of as a separate manual step.  We will not take sides in this
debate.  We will simply point out that fossil enables you to do it either way.

<h2>Forking Versus Branching</h2>

Forking and having more than one leaf in the check-in tree is usually
considered undesirable, and so forks are usually quickly resolved as 
shown in figure 3 above.
................................................................................
When multiple leaves are desirable, we call the phenomenon <i>branching</i>
instead of <i>forking</i>.
Figure 4 shows an example of a project where there are two branches, one
for development work and another for testing.

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch04.gif"><br>
Figure 4
</td></tr></table></center>

The hypothetical scenario of figure 4 is this:  The project starts and
progresses to a point where (at check-in 2) 
it is ready to enter testing for its first release.
In a real project, of course, there might be hundreds or thousands of
check-ins before a project reaches this point, but for simplicity of
presentation we will say that the project is ready after check-in 2.
The project then splits into two branches that are used by separate
teams.  The testing team, using the blue branch, finds and fixes a few
bugs.  This is shown by check-ins 6 and 9.  Meanwhile the development
team, working on the red branch, is busy adding features for the second
release.  Of course, the development team would like to take advantage of
the bug fixes implemented by the testing team.  So periodically, the
changes in the test branch are merged into the dev branch.  This is
shown by the dashed merge arrows between check-ins 6 and 7 and between
check-ins 9 and 10.

In both figures 2 and 4, check-in 2 has two children.  In figure 2,
we called this a "fork".  In diagram 4, we call it a "branch".  What is
the difference?  As far as the internal fossil data structure are
concerned, there is no difference.  The distinction is in the intent.
In figure 2, the fact that check-in 2 has multiple children is an
accident that stems from concurrent development.  In figure 4, giving
check-in 2 multiple children is a deliberate act.  So, to a good
approximating, we define forking to be by accident and branching to
be by intent.  Apart from that, they are the same.

<h2>Tags And Properties</h2>

Tags and properties are used in fossil to help express the intent, and
thus to distinguish between forks and branches.  Figure 5 shows the
same scenario as figure 4 but with tags and properties added:

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch05.gif"><br>
Figure 5
</td></tr></table></center>

A <i>tag</i> is a name that is attached to a check-in.  A
<i>property</i> is a name/value pair.  Internally, fossil implements
tags as properties with a NULL value.  So, tags and properties really
are much the same thing, and henceforth we will use the word "tag"
to mean either a tag or a property. 

A tag can be either a one-time tag or an propagating tag or a cancellation. 
A one-time tag only applies to the check-in to which it is attached.  An
propagating tag applies to the check-in to which it is attached and also
to all direct descendants of that check-in.  A <i>direct descendant</i>
is a descendant through direct children.  Tags propagation does not
cross merges.  Tag propagation also stops as soon
as it encounters another check-in with the same tag.  A cancellation tag
is attached to a single check-in in order to either override a one-time
tag that was placed on that same check-in, or to block tag propagation.
................................................................................
of timelines to be blue for check-in 4 and its descendants.

Figure 5 also shows two one-time tags on check-in 9.  (The diagram does
not make a graphical distinction between one-time and propagating tags.)
The <b>sym-release-1.0</b> tag means that check-in 9 can be referred to
using the more meaningful name "release-1.0".  The <b>closed</b> tag means
that check-in 9 is a "closed leaf".  A closed leaf is a leaf that intended
to never have any childred.

<h2>Review Of Terminology</h2>

Here is a list of definitions of key terms:


<blockquote><dl>

New version (lines 3-17, 25-49, 56-88, 95-116, 121-185, 219-233):

</h1>

In a simple and perfect world, the development of a project would proceed
linearly, as shown in figure 1.

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch01.gif" width=280 height=68><br>
Figure 1
</td></tr></table></center>

Each circle represents a check-in.  For the sake of clarity, the check-ins
are given small consecutive numbers.  In a real system, of course, the
check-in numbers would be 40-character SHA1 hashes since it is not possible
to allocate collision-free sequential numbers is a distributed system.
................................................................................
and that 1 is a <i>parent</i> of 2.  
Check-in 3 is derived from check-in 2, making
3 a child of 2.  We say that 3 is a <i>descendant</i> of both 1 and 2 and that 1
and 2 are both <i>ancestors</i> of 3.  

We call the graph of check-ins a <i>tree</i>.  Check-in 1 is the <i>root</i>
since it has no ancestors.  Check-in 4 is a <i>leaf</i> of the tree since
it has no descendants.  (We will give a more precise in the definition of
"leaf" later.)

Alas, reality often interferes with the simple linear development of a
project.  Suppose two programmers make independent modifications to check-in 2.
After both changes are checked in, we have a check-in graph that looks 
like figure 2:

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch02.gif" width=210 height=140><br>
Figure 2
</td></tr></table></center>

The graph in figure 2 has two leaves: check-ins 3 and 4.  Check-in 2 has
two children, check-ins 3 and 4.  We call this stituation a <i>fork</i>.

Fossil tries to prevent forks.  Suppose the two programmers who were
................................................................................
together with his own changes.  After merging, Bob could then commit
check-in 4 as a child of check-in 3 and the result would be a linear graph
as shown in figure 1.  This is how CVS works.  This is also how fossil
works in "autosync" mode.

But it might be that Bob is off-network when he does his commit, so he
has no way of knowing that Alice has already committed her changes.
Or, it could be that Bob has turned off "autosync" mode in SQLite.  Or,
maybe Bob just doesn't want to merge in Alices changes before he has
saved his own, so he forces the commit to occur using the "--force" option
to the fossil <b>commit</b> command.  For whatever reason, two commits against
check-in 2 have occurred and now the tree has two leaves.

So which version of the project is the "latest" in the sense of having
the most features and the most bug fixes?  When there is more than
one leaf in the graph, you don't really know.  So we like to have
graphs with a single leaf.

To resolve this situation, Alice can use the fossil <b>merge</b> command
to merge in Bob's changes in her local copy of check-in 3.  Then she
can commit the results as check-in 5.  This results in a tree as shown
in figure 3.

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch03.gif" width=282 height=152><br>
Figure 3
</td></tr></table></center>

Check-in 5 is a direct child of check-in 3 because it was created by editing
check-in 3.  But check-in 5 also inherits the changes from check-in 4 by
virtual of the merge.  So we say that check-in 5 is a <i>merge child</i>
of check-in 4 and that it is a <i>direct child</i> of check-in 3.  
................................................................................
never occurred.  The resulting graph would have been linear, as shown
in figure 1.  Really the graph of figure 1 is a subset of figure 3.
Hold your hand over the check-in 4 circle of figure 3 and then figure
3 looks exactly like figure 1 (except that the leaf has a different check-in
number, but that is just a notational difference - the two check-ins have
exactly the same content).  In other words, figure 3 is really a superset
of figure 1.  The check-in 4 of figure 3 captures addition state which
is omitted from figure 1.  Check-in 4 of figure 3 holds a copy
of Bob's local checkout before he merged in Alice's changes.  That snapshot
of Bob's changes independent of Alice's changes is omitted from figure 1.
Some people say that the approach taken in figure 3 is better because it
preserves this extra intermediate state.  Others say that the approach
taken in figure 1 is better because it is much easier to visualize a
linear line of development and because the the merging happens automatically
instead of as a separate manual step.  We will not take sides in that
debate.  We will simply point out that fossil enables you to do it either way.

<h2>Forking Versus Branching</h2>

Forking and having more than one leaf in the check-in tree is usually
considered undesirable, and so forks are usually quickly resolved as 
shown in figure 3 above.
................................................................................
When multiple leaves are desirable, we call the phenomenon <i>branching</i>
instead of <i>forking</i>.
Figure 4 shows an example of a project where there are two branches, one
for development work and another for testing.

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch04.gif" width=426 height=123><br>
Figure 4
</td></tr></table></center>

The hypothetical scenario of figure 4 is this:  The project starts and
progresses to a point where (at check-in 2) 
it is ready to enter testing for its first release.
In a real project, of course, there might be hundreds or thousands of
check-ins before a project reaches this point, but for simplicity of
presentation we will say that the project is ready after check-in 2.
The project then splits into two branches that are used by separate
teams.  The testing team, using the blue branch, finds and fixes a few
bugs.  This is shown by check-ins 6 and 9.  Meanwhile the development
team, working on the top uncolored branch, 
is busy adding features for the second
release.  Of course, the development team would like to take advantage of
the bug fixes implemented by the testing team.  So periodically, the
changes in the test branch are merged into the dev branch.  This is
shown by the dashed merge arrows between check-ins 6 and 7 and between
check-ins 9 and 10.

In both figures 2 and 4, check-in 2 has two children.  In figure 2,
we called this a "fork".  In diagram 4, we call it a "branch".  What is
the difference?  As far as the internal fossil data structure are
concerned, there is no difference.  The distinction is in the intent.
In figure 2, the fact that check-in 2 has multiple children is an
accident that stems from concurrent development.  In figure 4, giving
check-in 2 multiple children is a deliberate act.  So, to a good
approximation, we define forking to be by accident and branching to
be by intent.  Apart from that, they are the same.

<h2>Tags And Properties</h2>

Tags and properties are used in fossil to help express the intent, and
thus to distinguish between forks and branches.  Figure 5 shows the
same scenario as figure 4 but with tags and properties added:

<center><table border=1 cellpadding=10 hspace=10 vspace=10>
<tr><td align="center">
<img src="branch05.gif" width=485 height=177><br>
Figure 5
</td></tr></table></center>

A <i>tag</i> is a name that is attached to a check-in.  A
<i>property</i> is a name/value pair.  Internally, fossil implements
tags as properties with a NULL value.  So, tags and properties really
are much the same thing, and henceforth we will use the word "tag"
to mean either a tag or a property. 

A tag can be either a one-time tag or an propagating tag or a cancellation. 
A one-time tag only applies to the check-in to which it is attached.  A
propagating tag applies to the check-in to which it is attached and also
to all direct descendants of that check-in.  A <i>direct descendant</i>
is a descendant through direct children.  Tags propagation does not
cross merges.  Tag propagation also stops as soon
as it encounters another check-in with the same tag.  A cancellation tag
is attached to a single check-in in order to either override a one-time
tag that was placed on that same check-in, or to block tag propagation.
................................................................................
of timelines to be blue for check-in 4 and its descendants.

Figure 5 also shows two one-time tags on check-in 9.  (The diagram does
not make a graphical distinction between one-time and propagating tags.)
The <b>sym-release-1.0</b> tag means that check-in 9 can be referred to
using the more meaningful name "release-1.0".  The <b>closed</b> tag means
that check-in 9 is a "closed leaf".  A closed leaf is a leaf that intended
to never have any direct children.

<h2>Review Of Terminology</h2>

Here is a list of definitions of key terms:


<blockquote><dl>