Fossil

Check-in [d9bef53b]

Overview
Comment: Updates to to the fileformat.wiki document.
SHA1: d9bef53b1a27eeed5dd7b99d1f2b6426e6e9203c
User & Date: drh 2017-02-28 10:06:12
Context
2017-02-28
10:12 Update the change log for version 2.0. check-in: 89077b05 user: drh tags: fossil-2.0
10:06 Updates to to the fileformat.wiki document. check-in: d9bef53b user: drh tags: fossil-2.0
09:16 Change the version number to 2.0. check-in: 81a73593 user: drh tags: fossil-2.0
Changes

Changes to www/fileformat.wiki.

endure in useful form for decades or centuries.
A fossil repository is intended to be readable,
searchable, and extensible by people not yet born.

The global state of a fossil repository is an unordered
set of <i>artifacts</i>.
An artifact might be a source code file, the text of a wiki page,
part of a trouble ticket, or one of several special control artifacts
used to show the relationships between other artifacts within the
project.  Each artifact is normally represented on disk as a separate
file.  Artifacts can be text or binary.

In addition to the global state,
each fossil repository also contains local state.
The local state consists of web-page formatting
preferences, authorized users, ticket display and reporting formats,
and so forth.  The global state is shared in common among all
repositories for the same project, whereas the local state is often
................................................................................
different in separate repositories.
The local state is not versioned and is not synchronized
with the global state.
The local state is not composed of artifacts and is not intended to be enduring.
This document is concerned with global state only.  Local state is only
mentioned here in order to distinguish it from global state.




Each artifact in the repository is named by a hash of the artifact
content.
No prefixes or meta information is added to an artifact before
its hash is computed.


Each repository uses a single hash algorithm to compute artifact names.
The default algorithm is currently SHA3-256, though this might change
in future releases of Fossil.  Historical versions of Fossil used
SHA1.  The hash algorithm for a repository can be changed.  When a hash
algorithm change occurs, a set of aliases are set up (using the
two-argument version of the M-card on cluster artifacts) so that the
older hash values can be mapped into the new hash values for artifacts
that were added to the repository before the hash algorithm change.




Some artifacts have a particular format which gives them special
meaning to Fossil.  The special artifacts are called "structural
artifacts".  Fossil recognizes the following kinds of structural
artifacts:

<ul>
<li> [#manifest | Manifests] </li>
<li> [#cluster | Clusters] </li>
<li> [#ctrl | Control Artifacts] </li>
<li> [#wikichng | Wiki Pages] </li>
<li> [#tktchng | Ticket Changes] </li>
<li> [#attachment | Attachments] </li>
<li> [#event | TechNotes] </li>
</ul>

These seven structural artifact types are described in the following sections.













In the current implementation (as of 2017-02-27) the artifacts that
make up a fossil repository are stored as delta- and zlib-compressed
blobs in an <a href="http://www.sqlite.org/">SQLite</a> database.  This
is an implementation detail and might change in a future release.  For
the purpose of this article "file format" means the format of the artifacts,
not how the artifacts are stored on disk.  It is the artifact format that
is intended to be enduring.  The specifics of how artifacts are stored on
disk, though stable, are not intended to live as long as the
artifact format.

All of the artifacts can be extracted from a Fossil repository using
the "fossil deconstruct" command.

<a name="manifest"></a>
<h2>1.0 The Manifest</h2>

A manifest defines a check-in or version of the project
source tree.  The manifest contains a list of artifacts for
each file in the project and the corresponding filenames, as
well as information such as parent check-ins, the name of the
programmer who created the check-in, the date and time when
the check-in was created, and any check-in comments associated
with the check-in.

Any artifact in the repository that follows the syntactic rules
of a manifest is a manifest.  Note that a manifest can
be both a real manifest and also a content file, though this
is rare.

A manifest is a text file.  Newline characters
(ASCII 0x0a) separate the file into "cards".
Each card begins with a single
character "card type".  Zero or more arguments may follow
the card type.  All arguments are separated from each other
and from the card-type character by a single space
character.  There is no surplus white space between arguments
and no leading or trailing whitespace except for the newline
character that acts as the card separator.

All cards of the manifest occur in strict sorted lexicographical order.
No card may be duplicated.
The entire manifest may be PGP clear-signed, but otherwise it
may contain no additional text or data beyond what is described here.

Allowed cards in the manifest are as follows:

<blockquote>
<b>B</b> <i>baseline-manifest</i><br>
<b>C</b> <i>checkin-comment</i><br>
<b>D</b> <i>time-and-date-stamp</i><br>
<b>F</b> <i>filename</i> ?<i>hash</i>? ?<i>permissions</i>? ?<i>old-name</i>?<br>
................................................................................
the login of the user who created the manifest.  The login name
is encoded using the same character escapes as is used for the
check-in comment argument to the C-card.

A manifest must have a single Z-card as its last line.  The argument
to the Z-card is a 32-character lowercase hexadecimal MD5 hash
of all prior lines of the manifest up to and including the newline
character that immediately precedes the "Z".  The Z-card is
a sanity check to prove that the manifest is well-formed and
consistent.

A sample manifest from Fossil itself can be seen
[/artifact/28987096ac | here].

<a name="cluster"></a>
<h2>2.0 Clusters</h2>

A cluster is an artifact that declares the existence of other artifacts.
Clusters are used during repository synchronization to help
reduce network traffic.  As such, clusters are an optimization and
may be removed from a repository without loss or damage to the
underlying project code.

Clusters follow a syntax that is very similar to manifests.
A cluster is a line-oriented text file.  Newline characters
(ASCII 0x0a) separate the artifact into cards.  Each card begins with a single
character "card type".  Zero or more arguments may follow
the card type.  All arguments are separated from each other
and from the card-type character by a single space
character.  There is no surplus white space between arguments
and no leading or trailing whitespace except for the newline
character that acts as the card separator.
All cards of a cluster occur in strict sorted lexicographical order.
No card may be duplicated.
The cluster may not contain additional text or data beyond
what is described here.
Unlike manifests, clusters are never PGP signed.

Allowed cards in the cluster are as follows:

<blockquote>
<b>M</b> <i>artifact-id</i><br />
<b>Z</b> <i>checksum</i>
</blockquote>

................................................................................
lower-case hexadecimal representation of the MD5 checksum of all
prior cards in the cluster.  The Z-card is required.

An example cluster from Fossil can be seen
[/artifact/d03dbdd73a2a8 | here].

<a name="ctrl"></a>
<h2>3.0 Control Artifacts</h2>

Control artifacts are used to assign properties to other artifacts
within the repository.  The basic format of a control artifact is
the same as a manifest or cluster.  A control artifact is a text
file divided into cards by newline characters.  Each card has a
single-character card type followed by arguments.  Spaces separate
the card type and the arguments.  No surplus whitespace is allowed.
All cards must occur in strict lexicographical order.

Allowed cards in a control artifact are as follows:

<blockquote>
<b>D</b> <i>time-and-date-stamp</i><br />
<b>T</b> (<b>+</b>|<b>-</b>|<b>*</b>)<i>tag-name</i> <i>artifact-id</i> ?<i>value</i>?<br />
<b>U</b> <i>user-name</i><br />
<b>Z</b> <i>checksum</i><br />
................................................................................
The U card is the name of the user that created the control
artifact.  The Z card is the usual required artifact checksum.

An example control artifact can be seen [/info/9d302ccda8 | here].


<a name="wikichng"></a>
<h2>4.0 Wiki Pages</h2>

A wiki page is an artifact with a format similar to manifests,
clusters, and control artifacts.  The artifact is divided into
cards by newline characters.  The format of each card is as in
manifests, clusters, and control artifacts.  Wiki artifacts accept
the following card types:

<blockquote>
<b>D</b> <i>time-and-date-stamp</i><br />
<b>L</b> <i>wiki-title</i><br />
<b>N</b> <i>mimetype</i><br />
<b>P</b> <i>parent-artifact-id</i>+<br />
................................................................................
that terminates the W card.  The wiki text is always followed by one
extra newline.

An example wiki artifact can be seen
[/artifact?name=7b2f5fd0e0&txt=1 | here].

<a name="tktchng"></a>
<h2>5.0 Ticket Changes</h2>

A ticket-change artifact represents a change to a trouble ticket.
The following cards are allowed on a ticket change artifact:

<blockquote>
<b>D</b> <i>time-and-date-stamp</i><br />
<b>J</b> ?<b>+</b>?<i>name</i> ?<i>value</i>?<br />
................................................................................
The field name and value are both encoded using the character
escapes defined for the C card of a manifest.

An example ticket-change artifact can be seen
[/artifact/91f1ec6af053 | here].

<a name="attachment"></a>
<h2>6.0 Attachments</h2>

An attachment artifact associates some other artifact that is the
attachment (the source artifact) with a ticket or wiki page or
technical note to which
the attachment is connected (the target artifact).
The following cards are allowed on an attachment artifact:

................................................................................
If an attachment is added anonymously, then the U card may be omitted.

The Z card is the usual checksum over the rest of the attachment artifact.
The Z card is required.


<a name="event"></a>
<h2>7.0 Technical Notes</h2>

A technical note or "technote" artifact (formerly known as an "event" artifact)
associates a timeline comment and a page of text
(similar to a wiki page) with a point in time.  Technotes can be used
to record project milestones, release notes, blog entries, process
checkpoints, or news articles.
The following cards are allowed on a technote artifact:
................................................................................
technote.  The format of the W card is exactly the same as for a
[#wikichng | wiki artifact].

The Z card is the required checksum over the rest of the artifact.


<a name="summary"></a>
<h2>8.0 Card Summary</h2>

The following table summarizes the various kinds of cards that appear
on Fossil artifacts. A blank entry means that combination of card and
artifact is not legal. A number or range of numbers indicates the number
of times a card may (or must) appear in the corresponding artifact type.
For example, a value of 1 indicates a required unique card and 1+ indicates that one
or more such cards are required.
................................................................................
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
</tr>
</table>


<a name="addenda"></a>
<h2>9.0 Addenda</h2>

This section contains additional information which may be useful when
implementing algorithms described above.

<h3>R Card Hash Calculation</h3>

Given a manifest file named <tt>MF</tt>, the following Bash shell code
demonstrates how to compute the value of the R card in that manifest.
This example uses manifest [28987096ac]. Lines starting with <tt>#</tt> are
shell input and other lines are output. This demonstration assumes that the
file versions represented by the input manifest are checked out
under the current directory.







New version of www/fileformat.wiki:

endure in useful form for decades or centuries.
A fossil repository is intended to be readable,
searchable, and extensible by people not yet born.

The global state of a fossil repository is an unordered
set of <i>artifacts</i>.
An artifact might be a source code file, the text of a wiki page,
part of a trouble ticket, a description of a check-in including all
the files in that check-in with the check-in comment and so forth.
Artifacts are broadly grouped into two types: content artifacts and
structural artifacts.  Content artifacts are the raw project source-code
files that are checked into the repository.  Structural artifacts have
special formatting rules and are used to show the relationships between
other artifacts in the repository.  It is possible for an artifact to
be both a structural artifact and a content artifact, though this is
rare. Artifacts can be text or binary.

In addition to the global state,
each fossil repository also contains local state.
The local state consists of web-page formatting
preferences, authorized users, ticket display and reporting formats,
and so forth.  The global state is shared in common among all
repositories for the same project, whereas the local state is often
................................................................................
different in separate repositories.
The local state is not versioned and is not synchronized
with the global state.
The local state is not composed of artifacts and is not intended to be enduring.
This document is concerned with global state only.  Local state is only
mentioned here in order to distinguish it from global state.

<a name='#names'></a>
<h2>1.0 Artifact Names</h2>

Each artifact in the repository is named by a hash of its content.

No prefixes, suffixes, or other information is added to an artifact before
the hash is computed.  The artifact name is just the (lower-case
hexadecimal) hash of the raw artifact.
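
As a rough illustration of the naming rule above, the following Python sketch hashes the raw bytes of an artifact with nothing added before or after. The function name <tt>artifact_names</tt> and the choice of SHA3-256 as the SHA3 variant are assumptions made for this example; it is not Fossil's own code.

```python
import hashlib


def artifact_names(content: bytes) -> dict[str, str]:
    """Return candidate names for an artifact: lower-case hex hashes of the raw bytes."""
    # Nothing is prepended or appended to the content before hashing.
    return {
        "sha1": hashlib.sha1(content).hexdigest(),
        "sha3-256": hashlib.sha3_256(content).hexdigest(),
    }


if __name__ == "__main__":
    # The sample content is made up for illustration.
    for algorithm, name in artifact_names(b"hello, fossil\n").items():
        print(algorithm, name)
```

Because the same bytes can be hashed by either algorithm, the sketch returns one name per algorithm, which mirrors the idea that a single artifact can have multiple names.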




Fossil supports multiple hash algorithms including SHA1 and various
lengths of SHA3.  Because an artifact can be hashed using multiple algorithms,
a single artifact can have multiple names.  Usually, Fossil knows
each artifact by just a single name called the "display name".  But it is
possible for Fossil to know an artifact by multiple names from different
hashes.  In that case, Fossil uses the display name for output, but continues
to accept the alternative names as command-line arguments or as parameters to
webpage URLs.



When referring to artifacts using tty commands or webpage URLs, it is
sufficient to specify a unique prefix for the artifact name.  If the input
prefix is not unique, Fossil will show an error.  Within a structural
artifact, however, all references to other artifacts must be the complete
hash.
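
The prefix rule can be sketched as follows. This is a minimal Python illustration, assuming the full artifact names are available as a list of strings; the function name <tt>resolve_prefix</tt>, the exception types, and the second sample name are hypothetical and do not describe how Fossil itself performs the lookup.

```python
def resolve_prefix(prefix: str, known_names: list[str]) -> str:
    """Return the single full name that starts with `prefix`, or raise an error."""
    matches = [name for name in known_names if name.startswith(prefix)]
    if not matches:
        raise LookupError(f"no artifact name starts with {prefix!r}")
    if len(matches) > 1:
        # A non-unique prefix is an error rather than a guess.
        raise ValueError(f"prefix {prefix!r} is ambiguous ({len(matches)} matches)")
    return matches[0]


if __name__ == "__main__":
    names = [
        "d9bef53b1a27eeed5dd7b99d1f2b6426e6e9203c",  # the SHA1 of this check-in
        "d9be" + "0" * 36,                           # made-up name for illustration
    ]
    print(resolve_prefix("d9bef5", names))
```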

Prior to Fossil version 2.0, all names were formed from the SHA1 hash of
the artifact.  The key innovation in Fossil 2.0 was adding support for
alternative hash algorithms.

<a name="structural"></a>
<h2>2.0 Structural Artifacts</h2>

A structural artifact is an artifact that has a particular format and
that is used to define the relationships between other artifacts in the
repository.
Fossil recognizes the following kinds of structural
artifacts:

<ul>
<li> [#manifest | Manifests] </li>
<li> [#cluster | Clusters] </li>
<li> [#ctrl | Control Artifacts] </li>
<li> [#wikichng | Wiki Pages] </li>
<li> [#tktchng | Ticket Changes] </li>
<li> [#attachment | Attachments] </li>
<li> [#event | TechNotes] </li>
</ul>

These seven structural artifact types are described in subsections below.

Structural artifacts are ASCII text.  The artifact may be PGP clearsigned.
After removal of the PGP clearsign header and suffix (if any) a structural
artifact consists of one or more "cards" separated by a single newline
(ASCII: 0x0a) character. Each card begins with a single
character "card type".  Zero or more arguments may follow
the card type.  All arguments are separated from each other
and from the card-type character by a single space
character.  There is no surplus white space between arguments
and no leading or trailing whitespace except for the newline
character that acts as the card separator.  All cards must be in strict
lexicographical order.  There may not be any duplicate cards.
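
The card rules above can be checked with a short parser. The Python sketch below is only an illustration under stated assumptions: any PGP clearsign wrapper has already been removed, a single trailing newline after the last card is tolerated, "strict lexicographical order" is taken to mean that each whole card line compares greater than the previous one, and multi-line payloads (such as the text that follows a W card) are not handled. The name <tt>parse_cards</tt> is hypothetical.

```python
def parse_cards(text: str) -> list[tuple[str, list[str]]]:
    """Split a structural artifact into (card_type, arguments) pairs, validating layout."""
    if text.endswith("\n"):
        text = text[:-1]  # assumption: tolerate one newline after the final card
    cards = []
    previous = None
    for line in text.split("\n"):
        if not line:
            raise ValueError("empty card")
        # Strict ordering also rules out duplicate cards.
        if previous is not None and not previous < line:
            raise ValueError(f"card out of order or duplicated: {line!r}")
        previous = line
        card_type, rest = line[0], line[1:]
        if rest == "":
            args = []
        elif rest[0] != " ":
            raise ValueError(f"card type must be a single character: {line!r}")
        else:
            args = rest[1:].split(" ")
            if "" in args:
                raise ValueError(f"surplus whitespace in card: {line!r}")
        cards.append((card_type, args))
    return cards


if __name__ == "__main__":
    # The card contents here are invented for illustration.
    print(parse_cards("C hello\nD now\nU drh\n"))
```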

In the current implementation (as of 2017-02-27) the artifacts that
make up a fossil repository are stored as delta- and zlib-compressed
blobs in an <a href="http://www.sqlite.org/">SQLite</a> database.  This
is an implementation detail and might change in a future release.  For
the purpose of this article "file format" means the format of the artifacts,
not how the artifacts are stored on disk.  It is the artifact format that
is intended to be enduring.  The specifics of how artifacts are stored on
disk, though stable, are not intended to live as long as the
artifact format.




<a name="manifest"></a>
<h3>2.1 The Manifest</h3>

A manifest defines a check-in.
A manifest contains a list of artifacts for
each file in the project and the corresponding filenames, as
well as information such as parent check-ins, the username of the
programmer who created the check-in, the date and time when
the check-in was created, and any check-in comments associated
with the check-in.





















Allowed cards in the manifest are as follows:

<blockquote>
<b>B</b> <i>baseline-manifest</i><br>
<b>C</b> <i>checkin-comment</i><br>
<b>D</b> <i>time-and-date-stamp</i><br>
<b>F</b> <i>filename</i> ?<i>hash</i>? ?<i>permissions</i>? ?<i>old-name</i>?<br>
................................................................................
the login of the user who created the manifest.  The login name
is encoded using the same character escapes as is used for the
check-in comment argument to the C-card.

A manifest must have a single Z-card as its last line.  The argument
to the Z-card is a 32-character lowercase hexadecimal MD5 hash
of all prior lines of the manifest up to and including the newline
character that immediately precedes the "Z", excluding any PGP
clear-signing prefix.  The Z-card is
a sanity check to prove that the manifest is well-formed and
consistent.
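
The Z-card check described above can be sketched in Python as follows. This is a minimal illustration, assuming the artifact is held as bytes with any PGP clear-signing wrapper already removed and that the Z-card may be followed by a single newline; the function names and the sample cards are made up for the example.

```python
import hashlib


def z_card_value(body: bytes) -> str:
    """MD5 (lower-case hex) of everything up to and including the newline before 'Z'."""
    return hashlib.md5(body).hexdigest()


def verify_z_card(artifact: bytes) -> bool:
    """Check that the final Z-card matches the MD5 of all prior lines."""
    idx = artifact.rfind(b"\nZ ")
    if idx < 0:
        return False
    body = artifact[: idx + 1]                  # includes the newline preceding "Z"
    z_line = artifact[idx + 1 :].rstrip(b"\n")  # the Z-card itself
    return z_line == b"Z " + z_card_value(body).encode("ascii")


if __name__ == "__main__":
    body = b"C example\nD now\nU drh\n"
    artifact = body + b"Z " + z_card_value(body).encode("ascii") + b"\n"
    print(verify_z_card(artifact))
```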

A sample manifest from Fossil itself can be seen
[/artifact/28987096ac | here].

<a name="cluster"></a>
<h3>2.2 Clusters</h3>

A cluster is an artifact that declares the existence of other artifacts.
Clusters are used during repository synchronization to help
reduce network traffic.  As such, clusters are an optimization and
may be removed from a repository without loss or damage to the
underlying project code.
















Allowed cards in the cluster are as follows:

<blockquote>
<b>M</b> <i>artifact-id</i><br />
<b>Z</b> <i>checksum</i>
</blockquote>

................................................................................
lower-case hexadecimal representation of the MD5 checksum of all
prior cards in the cluster.  The Z-card is required.

An example cluster from Fossil can be seen
[/artifact/d03dbdd73a2a8 | here].

<a name="ctrl"></a>
<h3>2.3 Control Artifacts</h3>

Control artifacts are used to assign properties to other artifacts
within the repository.






Allowed cards in a control artifact are as follows:

<blockquote>
<b>D</b> <i>time-and-date-stamp</i><br />
<b>T</b> (<b>+</b>|<b>-</b>|<b>*</b>)<i>tag-name</i> <i>artifact-id</i> ?<i>value</i>?<br />
<b>U</b> <i>user-name</i><br />
<b>Z</b> <i>checksum</i><br />
................................................................................
The U card is the name of the user that created the control
artifact.  The Z card is the usual required artifact checksum.

An example control artifact can be seen [/info/9d302ccda8 | here].


<a name="wikichng"></a>
<h3>2.4 Wiki Pages</h3>

A wiki artifact defines a single version of a
single wiki page.

Wiki artifacts accept
the following card types:

<blockquote>
<b>D</b> <i>time-and-date-stamp</i><br />
<b>L</b> <i>wiki-title</i><br />
<b>N</b> <i>mimetype</i><br />
<b>P</b> <i>parent-artifact-id</i>+<br />
................................................................................
that terminates the W card.  The wiki text is always followed by one
extra newline.

An example wiki artifact can be seen
[/artifact?name=7b2f5fd0e0&txt=1 | here].

<a name="tktchng"></a>
<h3>2.5 Ticket Changes</h3>

A ticket-change artifact represents a change to a trouble ticket.
The following cards are allowed on a ticket change artifact:

<blockquote>
<b>D</b> <i>time-and-date-stamp</i><br />
<b>J</b> ?<b>+</b>?<i>name</i> ?<i>value</i>?<br />
................................................................................
The field name and value are both encoded using the character
escapes defined for the C card of a manifest.

An example ticket-change artifact can be seen
[/artifact/91f1ec6af053 | here].

<a name="attachment"></a>
<h3>2.6 Attachments</h3>

An attachment artifact associates some other artifact that is the
attachment (the source artifact) with a ticket or wiki page or
technical note to which
the attachment is connected (the target artifact).
The following cards are allowed on an attachment artifact:

................................................................................
If an attachment is added anonymously, then the U card may be omitted.

The Z card is the usual checksum over the rest of the attachment artifact.
The Z card is required.


<a name="event"></a>
<h3>2.7 Technical Notes</h3>

A technical note or "technote" artifact (formerly known as an "event" artifact)
associates a timeline comment and a page of text
(similar to a wiki page) with a point in time.  Technotes can be used
to record project milestones, release notes, blog entries, process
checkpoints, or news articles.
The following cards are allowed on a technote artifact:
................................................................................
technote.  The format of the W card is exactly the same as for a
[#wikichng | wiki artifact].

The Z card is the required checksum over the rest of the artifact.


<a name="summary"></a>
<h2>3.0 Card Summary</h2>

The following table summarizes the various kinds of cards that appear
on Fossil artifacts. A blank entry means that combination of card and
artifact is not legal. A number or range of numbers indicates the number
of times a card may (or must) appear in the corresponding artifact type.
For example, a value of 1 indicates a required unique card and 1+ indicates that one
or more such cards are required.
................................................................................
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
</tr>
</table>


<a name="addenda"></a>
<h2>4.0 Addenda</h2>

This section contains additional information which may be useful when
implementing algorithms described above.

<h3>4.1 R-Card Hash Calculation</h3>

Given a manifest file named <tt>MF</tt>, the following Bash shell code
demonstrates how to compute the value of the R card in that manifest.
This example uses manifest [28987096ac]. Lines starting with <tt>#</tt> are
shell input and other lines are output. This demonstration assumes that the
file versions represented by the input manifest are checked out
under the current directory.