Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Improvements to the document that describes the delta format. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA3-256: |
81e61d78fdfdd2ee471217054b6de89e |
User & Date: | drh 2019-03-02 20:33:24.544 |
Context
2019-03-04
| ||
13:58 | Improvements to Append Wiki privilege suggested by jakesfr. ... (check-in: 3790dbbd user: drh tags: trunk) | |
2019-03-02
| ||
20:33 | Improvements to the document that describes the delta format. ... (check-in: 81e61d78 user: drh tags: trunk) | |
18:18 | Enable make install without first calling make workflow by adjusting the install target prerequisites. This allows make install to be called on a fresh clone/checkout of Fossil because otherwise OBJDIR is missing and make install fails. ... (check-in: 904eb8a5 user: andybradford tags: trunk) | |
Changes
Changes to www/delta_format.wiki.
1 | <title>Fossil Delta Format</title> | < | | | | > > | > > > > | > > > > > > | > > | > > | > > | > > | > > > | > > > | > > > | > > | > > > > > > > > | | | | | | | | | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 | <title>Fossil Delta Format</title> <h1>1.0 Overview</h1> <p>Fossil achieves efficient storage and low-bandwidth synchronization through the use of delta-compression. Instead of storing or transmitting the complete content of an artifact, Fossil stores or transmits only the changes relative to a related artifact. </p> <p>This document describes the delta-encoding format used by Fossil. The intended audience is developers working on either <a href="index.wiki">Fossil</a> itself, or on tools compatible with Fossil. Understanding of this document is <em>not</em> required for ordinary users of Fossil. This document is an implementation detail.</p> <p>This document only describes the delta file format. A [./delta_encoder_algorithm.wiki|separate document] describes one possible algorithm for generating deltas in this format.</p> <h2>1.1 Sample Software And Analysis Tools</h2> <p>The core routines used to generate and apply deltas in this format are contained in the [../src/delta.c|delta.c] source file. Interface logic, including "test-*" commands to directly manipulate deltas are contained in the [../src/deltacmd.c|deltacmd.c] source file. SQL functions to create, apply, and analyze deltas are implemented by code in the [../src/deltafunc.c|deltafunc.c] source file. <p>The following command-line tools are available to create and apply deltas and to test the delta logic: * [/help?cmd=test-delta|fossil test-delta] → Run self-tests of the delta logic * [/help?cmd=test-delta-create|fossil test-delta-create X Y] → compute a delta that converts file X into file Y. Output that delta. * [/help?cmd=test-delta-apply|fossil test-delta-apply X D] → apply delta D to input file X and output the result. * [/help?cmd=test-delta-analyze|fossil test-delta-analyze X Y] → compute and delta that converts file X into file Y but instead of writing the delta to output, write performance information about the delta. <p>When running the [/help?cmd=sqlite3|fossil sql] command to get an interactive SQL session connected to the repository, the following additional SQL functions are provided: * <b>delta_create(</b><i>X</i><b>,</b><i>Y</i><b>)</b> → Compute a data that carries blob X into blob Y and return that delta as a blob. * <b>delta_apply(</b><i>X</i><b>,</b><i>D</i><b>)</b> → Apply delta D to input blob X return a new blob which is the result. * <b>delta_output_size(</b><i>D</i>)</b> → Return the size of the output that would result from applying delta D. * <b>delta_parse(</b><i>D</i>)</b> → This is a table-valued function that returns one row for the header, for the trailer, and for each segment in delta D. <a name="structure"></a><h1>2.0 Structure</h1> <img src="delta1.gif" align="left" hspace="10"> <p>A delta consists of three parts, a "header", a "trailer", and a "segment-list" between them.</p> <p>Both header and trailer provide information about the target helping the decoder, and the segment-list describes how the target can be constructed from the original.</p> <a name="header"></a><h2>2.1 Header</h2> <img src="delta6.gif" align="left" hspace="10"> <p>The header consists of a single number followed by a newline character (ASCII 0x0a). The number is the length of the target in bytes.</p> <p>This means that, given a delta, the decoder can compute the size of the target (and allocate any necessary memory based on that) by simply reading the first line of the delta and decoding the number found there. In other words, before it has to decode everything else.</p> <a name="trailer"></a><h2>2.2 Trailer</h2> <img src="delta5.gif" align="left" hspace="10"> <p>The trailer consists of a single number followed by a semicolon (ASCII 0x3b). This number is a checksum of the target and can be used by a decoder to verify that the delta applied correctly, reconstructing the target from the original.</p> <p>The checksum is computed by treating the target as a series of 32-bit integer numbers (MSB first), and summing these up, modulo 2^32-1. A target whose length is not a multiple of 4 is padded with 0-bytes (ASCII 0x00) at the end.</p> <p>By putting this information at the end of the delta a decoder has it available immediately after the target has been reconstructed fully.</p> <a name="slist"></a><h2>2.3 Segment-List</h2> <img src="delta2.gif" align="left" hspace="10"> <p>The segment-list of a delta describes how to create the target from the original by a combination of inserting literal byte-sequences and copying ranges of bytes from the original. This is where the compression takes place, by encoding the large common parts of original and target in small copy instructions.</p> <p>The target is constructed from beginning to end, with the data generated by each instruction appended after the data of all previous instructions, with no gaps.</p> <a name="insertlit"></a><h3>2.3.1 Insert Literal</h3> <p>A literal is specified by two elements, the size of the literal in bytes, and the bytes of the literal itself.</p> <img src="delta4.gif" align="left" hspace="10"> <p>The length is written first, followed by a colon character (ASCII 0x3a), followed by the bytes of the literal.</p> <a name="copyrange"></a><h3>2.3.2 Copy Range</h3> <p>A range to copy is specified by two numbers, the offset of the first byte in the original to copy, and the size of the range, in bytes. The size zero is special, its usage indicates that the range extends to the end of the original.</p> <img src="delta3.gif" align="left" hspace="10"> <p>The length is written first, followed by an "at" character (ASCII 0x40), then the offset, followed by a comma (ASCII 0x2c).</p> <a name="intcoding"></a><h1>3.0 Encoding of integers</h1> <p> The format currently handles only 32 bit integer numbers. They are written base-64 encoded, MSB first, and without leading "0"-characters, except if they are significant (i.e. 0 => "0"). </p> <p> The base-64 coding is described in <a href="http://www.ietf.org/rfc/rfc3548.txt">RFC 3548</a>. </p> <a name="examples"></a><h1>4.0 Examples</h1> <a name="examplesint"></a><h2>4.1 Integer encoding</h2> <table border=1> <tr> <th>Value</th> <th>Encoding</th> </tr> <tr> <td>0</td> <td>0</td> </tr> <tr> <td>6246</td> <td>1Xb</td> </tr> <tr> <td>-1101438770</td> <td>2zMM3E</td> </tr> </table> <a name="examplesdelta"></a><h2>4.2 Delta encoding</h2> <p>An example of a delta using the specified encoding is:</p> <table border=1><tr><td><pre> 1Xb 4E@0,2:thFN@4C,6:scenda1B@Jd,6:scenda5x@Kt,6:pieces79@Qt,F: Example: eskil~E@Y0,2zMM3E;</pre> </td></tr></table> |
︙ | ︙ | |||
204 205 206 207 208 209 210 | * Ticketing interface (expand this bullet) </pre></td></tr></table> | | | 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 | * Ticketing interface (expand this bullet) </pre></td></tr></table> <a name="notes"></a><h1>Notes</h1> <ul> <li>Pure text files generate a pure text delta. </li> <li>Binary files generate a delta that may contain some binary data. </li> <li>The delta encoding does not attempt to compress the content. It was considered to be much more sensible to do compression using a separate general-purpose compression library, like <a href="http://www.zlib.net">zlib</a>. </li> </ul> |