Fossil

Fossil Versus Git

1.0 Don't Stress!

The feature sets of Fossil and Git overlap in many ways. Both are distributed version control systems which store a tree of check-in objects to a local repository clone. In both systems, the local clone starts out as a full copy of the remote parent. New content gets added to the local clone and then later optionally pushed up to the remote, and changes to the remote can be pulled down to the local clone at will. Both systems offer diffing, patching, branching, merging, cherry-picking, bisecting, private branches, a stash, etc.

Fossil has inbound and outbound Git conversion features, so if you start out using one DVCS and later decide you like the other better, you can easily move your version-controlled file content.

In this document, we set all of that similarity and interoperability aside and focus on the important differences between the two, especially those that impact the user experience.

Keep in mind that you are reading this on a Fossil website, and though we try to be fair, the information here might be biased in favor of Fossil, if only because we spend most of our time using Fossil, not Git. Ask around for second opinions from people who have used both Fossil and Git.

2.0 Differences Between Fossil And Git

Differences between Fossil and Git are summarized by the following table, with further description in the text that follows.

| GIT | FOSSIL | More |
|-----|--------|------|
| File versioning only | VCS, tickets, wiki, docs, notes, forum, UI, RBAC | 2.1 ↓ |
| Sprawling and inefficient | Self-contained and efficient | 2.2 ↓ |
| One-off custom pile-of-files data store | The most popular database in the world | 2.3 ↓ |
| Runs natively on POSIX systems only | Native on common desktop & server platforms | 2.4 ↓ |
| Bazaar-style development | Cathedral-style development | 2.5.1 ↓ |
| Designed for Linux kernel development | Designed for SQLite development | 2.5.2 ↓ |
| Many contributors | Select contributors | 2.5.3 ↓ |
| Focus on individual branches | Focus on the entire tree of changes | 2.5.4 ↓ |
| One check-out per repository | Many check-outs per repository | 2.6 ↓ |
| Remembers what you should have done | Remembers what you actually did | 2.7 ↓ |
| Commit first | Test first | 2.8 ↓ |
| SHA-2 | SHA-3 | 2.9 ↓ |

2.1 Featureful

Git provides file versioning services only, whereas Fossil adds an integrated wiki, ticketing & bug tracking, embedded documentation, technical notes, and a web forum, all within a single nicely-designed skinnable web UI, protected by a fine-grained role-based access control system. These additional capabilities are available for Git as 3rd-party add-ons, but with Fossil they are integrated into the design. One way to describe Fossil is that it is "GitHub-in-a-box."

Fossil can do operations over all local repo clones and check-out directories with a single command. For example, Fossil lets you say fossil all sync on a laptop prior to taking it off the network hosting those repos. You can sync up to all of the private repos on your company network plus those public Internet-hosted repos you use. Whether going out for a working lunch or on a transoceanic airplane trip, one command gets you in sync. This works with several other Fossil sub-commands, such as fossil all changes to get a list of files that you forgot to commit prior to the end of your working day, across all repos.

Whenever Fossil is told to modify the local checkout in some destructive way (fossil rm, fossil update, fossil revert, etc.) Fossil remembers the prior state and is able to return the check-out directory to that state with a fossil undo command. You cannot undo a commit in Fossil (on purpose!) but as long as the change remains confined to the local check-out directory only, Fossil makes undo easier than in Git.

For developers who choose to self-host projects (rather than using a 3rd-party service such as GitHub) Fossil is much easier to set up, since the stand-alone Fossil executable together with a 2-line CGI script suffice to instantiate a full-featured developer website. To accomplish the same using Git requires locating, installing, configuring, integrating, and managing a wide assortment of separate tools. Standing up a developer website using Fossil can be done in minutes, whereas doing the same using Git requires hours or days.

Fossil is small, complete, and self-contained. If you clone Git's self-hosting repository, you get just Git's source code. If you clone Fossil's self-hosting repository, you get the entire Fossil website — source code, documentation, ticket history, and so forth.² That means you get a copy of this very article and all of its historical versions, plus the same for all of the other public content on this site.

2.2 Efficient

Git is actually a collection of many small tools, each doing one small part of the job, which can be recombined (by experts) to perform powerful operations. Git has a lot of complexity and many dependencies, so that most people end up installing it via some kind of package manager, simply because the creation of complicated binary packages is best delegated to people skilled in their creation. Normal Git users are not expected to build Git from source and install it themselves.

Fossil is a single self-contained stand-alone executable which by default depends only on common platform libraries. If your platform allows static linking — not all do these days! — you can even get it down to a single executable with no external dependencies at all. Most notably, we deliver the official Windows builds of Fossil this way: the Zip file contains only fossil.exe, a self-contained Fossil executable; it is not a setup.exe style installer, it is the whole enchilada.

A typical Fossil executable is about 5 MiB, not counting system libraries it shares in common with Git such as OpenSSL and zlib, which we can factor out of the discussion.

These properties allow Fossil to easily run inside a minimally configured chroot jail, from a Windows memory stick, off a Raspberry Pi with a tiny SD card, etc. To install Fossil, one merely puts the executable somewhere in the $PATH. Fossil is straightforward to build and install, so that many Fossil users do in fact build and install "trunk" versions to get new features between formal releases.

Contrast a basic installation of Git, which takes up about 15 MiB on Debian 10 across 230 files, not counting the contents of /usr/share/doc or /usr/share/locale. If you need to deploy to any platform where the facilities needed to fully use Git, such as the POSIX shell, Perl interpreter, and Tcl/Tk platform, cannot be counted as part of the base platform, the full footprint of a Git installation extends to more like 45 MiB and thousands of files. This complicates several common scenarios: Git for Windows, chrooted Git servers, Docker images...

Some say that Git more closely adheres to the Unix philosophy, summarized as "many small tools, loosely joined," but we have many examples of other successful Unix software that violates that principle to good effect, from Apache to Python to ZFS. We can infer from these examples that it is not an absolute principle of good software design. Sometimes "many features, tightly-coupled" works better. What actually matters is effectiveness and efficiency. We believe Fossil achieves this.

The above size comparisons aren't apples-to-apples anyway. We've compared the size of Fossil with all of its many built-in features to a fairly minimal Git installation. You must add a lot of third-party software to Git to give it a Fossil-equivalent feature set. Consider GitLab, a third-party extension to Git wrapping it in many features, making it roughly Fossil-equivalent, though much more resource hungry and hence more costly to run than the equivalent Fossil setup. GitLab's basic requirements are easy to accept when you're dedicating a local rack server or blade to it, since its minimum requirements are more or less a description of the smallest thing you could call a "server" these days, but when you go to host that in the cloud, you can expect to pay about 8× as much to comfortably host GitLab as for Fossil.³ This difference is largely due to basic technology choices: Ruby and PostgreSQL vs C and SQLite.

The Fossil project itself is hosted on a very small VPS, and we've received many reports on the Fossil forum about people successfully hosting Fossil service on bare-bones $5/month VPS hosts, spare Raspberry Pi boards, and other small hosts.

2.3 Durable

The baseline data structures for Fossil and Git are the same, modulo formatting details. Both systems manage a directed acyclic graph (DAG) of Merkle tree structured check-in objects. Check-ins are identified by a cryptographic hash of the check-in contents, and each check-in refers to its parent via its hash.
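
As a rough illustration of the shared idea, the sketch below hashes a toy check-in record that names its parents by their own hashes. The record layout and the `checkin_id` helper are hypothetical, chosen only for this example; neither Fossil nor Git formats its objects this way.

```python
import hashlib

def checkin_id(parent_ids, files, comment):
    """Hash a toy check-in record (hypothetical layout, not a real format).

    The ID covers the parent IDs, a hash of each file, and the comment,
    so it changes if any of them change.
    """
    lines = [f"P {p}" for p in sorted(parent_ids)]
    lines += [f"F {name} {hashlib.sha3_256(data).hexdigest()}"
              for name, data in sorted(files.items())]
    lines.append(f"C {comment}")
    return hashlib.sha3_256("\n".join(lines).encode("utf-8")).hexdigest()

# A two-check-in history: the child refers to its parent by hash.
root = checkin_id([], {"README": b"hello\n"}, "initial check-in")
child = checkin_id([root], {"README": b"hello, world\n"}, "expand greeting")

# Altering the root's content gives it a new ID, which in turn changes
# the ID of any check-in that names it as a parent.
tampered_root = checkin_id([], {"README": b"HELLO\n"}, "initial check-in")
tampered_child = checkin_id([tampered_root],
                            {"README": b"hello, world\n"}, "expand greeting")
```

Because each ID depends on the parent IDs, a check-in's hash effectively vouches for its entire ancestry, which is what makes the structure a Merkle DAG.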

The difference is that Git stores its objects as individual files in the .git folder or compressed into bespoke pack-files, whereas Fossil stores its objects in a SQLite database file using a hybrid NoSQL/relational data model of the check-in history. Git's data storage system is an ad-hoc pile-of-files key/value database, whereas Fossil uses a proven, heavily-tested, general-purpose, durable SQL database. This difference is more than an implementation detail. It has important practical consequences.

With Git, one can easily locate the ancestors of a particular check-in by following the pointers embedded in the check-in object, but it is difficult to go the other direction and locate the descendants of a check-in. It is so difficult, in fact, that neither native Git nor GitHub provides this capability short of groveling through the commit log. With Git, if you are looking at some historical check-in then you cannot ask "What came next?" or "What are the children of this check-in?"

Fossil, on the other hand, parses essential information about check-ins (parents, children, committers, comments, files changed, etc.) into a relational database that can easily be queried using concise SQL statements to find both ancestors and descendants of a check-in. This is the hybrid data model mentioned above: Fossil manages your check-in and other data in a NoSQL block chain structured data store, but that's backed by a set of relational lookup tables for quick indexing into that artifact store. (See "Thoughts On The Design Of The Fossil DVCS" for more details.)
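
To make the idea of relational lookup tables concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `parent_link` table and the single-letter check-in names are assumptions made up for this example; they are not Fossil's actual schema.

```python
import sqlite3

# Toy history:  A -> B -> C
#                    B -> D -> E
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE parent_link (parent TEXT, child TEXT)")  # hypothetical table
db.executemany("INSERT INTO parent_link VALUES (?, ?)",
               [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")])

def children(checkin):
    """Direct children: answers 'What came next?'"""
    rows = db.execute(
        "SELECT child FROM parent_link WHERE parent = ? ORDER BY child",
        (checkin,))
    return [r[0] for r in rows]

def descendants(checkin):
    """All descendants, found by walking parent->child links recursively."""
    rows = db.execute("""
        WITH RECURSIVE d(id) AS (
            SELECT child FROM parent_link WHERE parent = ?
            UNION
            SELECT pl.child FROM parent_link AS pl JOIN d ON pl.parent = d.id
        )
        SELECT id FROM d ORDER BY id""", (checkin,))
    return [r[0] for r in rows]

def ancestors(checkin):
    """All ancestors: the same walk in the opposite direction."""
    rows = db.execute("""
        WITH RECURSIVE a(id) AS (
            SELECT parent FROM parent_link WHERE child = ?
            UNION
            SELECT pl.parent FROM parent_link AS pl JOIN a ON pl.child = a.id
        )
        SELECT id FROM a ORDER BY id""", (checkin,))
    return [r[0] for r in rows]
```

With an indexed link table like this, walking forward in history is the same kind of query as walking backward; only the join direction changes.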

Leaf check-ins in Git that lack a "ref" become "detached," making them difficult to locate and subject to garbage collection. This detached head state problem has caused untold grief for a huge number of Git users. With Fossil, detached heads are simply impossible because we can always find our way back into the block chain using one or more of the relational indices it automatically manages for you.

This design difference shows up in several other places within each tool. It is why Fossil's timeline is generally more detailed yet more clear than those available in Git front-ends. (Contrast this Fossil timeline with its closest equivalent in GitHub.) It is why Fossil has a "next" special check-in name, whereas Git has no inverse of its cryptic @~ notation, meaning "the parent of HEAD," which Fossil simply calls "prev". It is why Fossil has so many built-in status reports to help maintain situational awareness, aid comprehension, and avoid errors.

These differences are due, in part, to Fossil's start a year later than Git: we were able to learn from its key design mistakes.

2.4 Portable

Fossil is largely written in ISO C, almost purely conforming to the original 1989 standard. We make very little use of C99, and we do not knowingly make any use of C11. Fossil does call POSIX and Windows APIs where necessary, but it's about as portable as you can ask given that ISO C doesn't define all of the facilities Fossil needs to do its thing, such as network sockets and file locking. There are certainly well-known platforms Fossil hasn't been ported to yet, but that's most likely due to lack of interest rather than inherent difficulties in doing the port. We believe the most stringent limit on its portability is that it assumes at least a 32-bit CPU and several megs of flat-addressed memory.⁴ Fossil isn't quite as portable as SQLite, but it's close.

Over half of the C code in Fossil is actually an embedded copy of the current version of SQLite. Much of what is Fossil-specific after you set SQLite itself aside is SQL code calling into SQLite. The number of lines of SQL code in Fossil isn't large by percentage, but since SQL is such an expressive, declarative language, it has an outsized contribution to Fossil's user-visible functionality.

Fossil isn't entirely C and SQL code. Its web UI uses JavaScript where necessary. The server-side UI scripting uses a custom minimal Tcl dialect called TH1, which is embedded into Fossil itself. Fossil's build system and test suite are largely based on Tcl.⁵ All of this is quite portable.

About half of Git's code is POSIX C, and about a third is POSIX shell code. This is largely why the so-called "Git for Windows" distributions (both first-party and third-party) are actually an MSYS POSIX portability environment bundled with all of the Git stuff, because it would be too painful to port Git natively to Windows. Git is a foreign citizen on Windows, speaking to it only through a translator.⁶

While Fossil does lean toward POSIX norms when given a choice — LF-only line endings are treated as first-class citizens over CR+LF, for example — the Windows build of Fossil is truly native.

The third-party extensions to Git tend to follow this same pattern. GitLab isn't portable to Windows at all, for example. For that matter, GitLab isn't even officially supported on macOS, the BSDs, or uncommon Linuxes! We have many users who regularly build and run Fossil on all of these systems.

2.5 Linux vs. SQLite

Fossil and Git promote different development styles because each one was specifically designed to support the creator's main software development project: Linus Torvalds designed Git to support development of the Linux kernel, and D. Richard Hipp designed Fossil to support the development of SQLite. Both projects must rank high on any objective list of "most important FOSS projects," yet these two projects are almost entirely unlike one another, so it is natural that the DVCSes created to support these projects also differ in many ways.

In the following sections, we will explain how four key differences between the Linux and SQLite software development projects dictated the design of each DVCS's low-friction usage path.

When deciding between these two DVCSes, you should ask yourself, "Is my project more like Linux or more like SQLite?"

2.5.1 Development Organization

Eric S. Raymond's seminal essay-turned-book "The Cathedral and the Bazaar" details the two major development organization styles found in FOSS projects. As it happens, Linux and SQLite fall on opposite sides of this dichotomy. Differing development organization styles dictate a different design and low-friction usage path in the tools created to support each project.

Git promotes the Linux kernel's bazaar development style, in which a loosely-associated mass of developers contribute their work through a hierarchy of lieutenants who manage and clean up these contributions for consideration by Linus Torvalds, who has the power to cherry-pick individual contributions into his version of the Linux kernel. Git allows an anonymous developer to rebase and push specific locally-named private branches, so that a Git repo clone often isn't really a clone at all: it may have an arbitrary number of differences relative to the repository it originally cloned from. Git encourages siloed development. Select work in a developer's local repository may remain private indefinitely.

All of this is exactly what one wants when doing bazaar-style development.

Fossil's normal mode of operation differs on every one of these points, with the specific designed-in goal of promoting SQLite's cathedral development model:

Where Git encourages siloed development, Fossil fights against it. Fossil places a lot of emphasis on synchronizing everyone's work and on reporting on the state of the project and the work of its developers, so that everyone — especially the project leader — can maintain a better mental picture of what is happening, leading to better situational awareness.

You can think about this difference in terms of feedback loop size, which we know from the mathematics of control theory to directly affect the speed at which any system can safely make changes. The larger the feedback loop, the slower the whole system must run in order to avoid loss of control. The same concept shows up in other contexts, such as in the OODA loop concept originally developed to explain the success of the US F-86 Sabre fighter aircraft over the on-paper superior MiG-15, then later applied in other contexts, such as business process management. Committing your changes to private branches in order to delay a public push to the parent repo increases the size of your collaborators' control loops, either causing them to slow their work in order to safely react to your work, or to overcorrect in response to each change.

Each DVCS can be used in the opposite style, but doing so works against their low-friction paths.

2.5.2 Scale

The Linux kernel has a far bigger developer community than that of SQLite: there are thousands and thousands of contributors to Linux, most of whom do not know each other's names. These thousands are responsible for producing roughly 89⨉ more code than is in SQLite. (10.7 MLOC vs. 0.12 MLOC according to SLOCCount.) The Linux kernel and its development process were already uncommonly large back in 2005 when Git was designed, specifically to support the consequences of having such a large set of developers working on such a large code base.

95% of the code in SQLite comes from just four programmers, and 64% of it is from the lead developer alone. The SQLite developers know each other well and interact daily. Fossil was designed for this development model.

We think you should ask yourself whether you have Linus Torvalds scale software configuration management problems or D. Richard Hipp scale problems when choosing your DVCS. An automotive air impact wrench running at 8000 RPM driving an M8 socket-cap bolt at 16 cm/s is not the best way to hang a picture on the living room wall.

2.5.3 Accepting Contributions

As of this writing, Git has received about 4.5⨉ as many commits as Fossil resulting in about 2.5⨉ as many lines of source code. The line count excludes tests and in-tree third-party dependencies. It does not exclude the default GUI for each, since it's integral for Fossil, so we count the size of gitk in this.

It is obvious that Git is bigger in part because of its first-mover advantage, which resulted in a larger user community, which results in more contributions. But is that the only reason? We believe there are other relevant differences that also play into this which fall out of the "Linux vs. SQLite" framing: licensing, community structure, and how we react to drive-by contributions. In brief, it's harder to get a new feature into Fossil than into Git.

A larger feature set is not necessarily a good thing. Git's command line interface is famously arcane. Masters of the arcane are able to do wizardly things, but only by studying their art deeply for years. This strikes us as a good thing only in cases where use of the tool itself is the primary point of that user's work.

Almost no one uses a DVCS for its own sake; very few people get paid specifically in order to drive a DVCS. We use DVCSes as a tool to support some other effort, so we do not necessarily want the DVCS with the most features. We want a DVCS with easily internalized behavior so we can thoroughly master it despite spending only a small fraction of our working time thinking about the DVCS. We want to pick the tool up, use it quickly, and then set it aside in order to get back to our actual job as quickly as possible.

Professional software developers in particular are prone to focusing on feature set sizes when choosing tools because this is sometimes a highly important consideration. They spend all day, every day, in their favorite text editors, and time they spend learning all of the arcana of their favorite programming languages is well-spent. Skills with these tools are direct productivity drivers, which in turn directly drives how much money a developer can make. (Or how much idle time they can afford to take, which amounts to the same thing.) But if you are a professional software developer, we want you to ask yourself a question: "How do I get paid more by mastering arcane features of my DVCS?" Unless you have a good answer to that, you probably do not want to be choosing a DVCS based on how many arcane features it has.

The argument is similar for other types of users: if you are a hobbyist, how much time do you want to spend mastering your DVCS instead of on the hobby supported by use of that DVCS?

There is some minimal set of features required to achieve the purposes that drive our selection of a DVCS, but there is a level beyond which more features only slow us down while we're learning the tool, since we must plow through documentation on features we're not likely to ever use. When the number of features grows to the point where people of normal motivation cannot spend the time to master them all, the tool becomes less productive to use.

The core developers of the Fossil project achieve a balance between feature set size and ease of use by carefully choosing which users to give commit bits to, then by being choosy about which of the contributed feature branches to merge down to trunk. We say "no" to a lot of feature proposals.

The end result is that Fossil more closely adheres to the principle of least astonishment than Git does.

2.5.4 Individual Branches vs. The Entire Change History

Both Fossil and Git store history as a directed acyclic graph (DAG) of changes, but Git tends to focus more on individual branches of the DAG, whereas Fossil puts more emphasis on the entire DAG.

For example, the default behavior in Git is to only synchronize a single branch, whereas with Fossil the only sync option is to sync the entire DAG. Git commands, GitHub, and GitLab tend to show only a single branch at a time, whereas Fossil usually shows all parallel branches at once. Git has commands like "rebase" that help keep all relevant changes on a single branch, whereas Fossil encourages a style of many concurrent branches constantly springing into existence, undergoing active development in parallel for a few days or weeks, then merging back into the main line and disappearing.

This difference in emphasis arises from the different purposes of the two systems. Git focuses on individual branches, because that is exactly what you want for a highly-distributed bazaar-style project such as Linux. Linus Torvalds does not want to see every check-in by every contributor to Linux, as such extreme visibility does not scale well. But Fossil was written for the cathedral-style SQLite project with just a handful of active committers. Seeing all changes on all branches all at once helps keep the whole team up-to-date with what everybody else is doing, resulting in a more tightly focused and cohesive implementation.

2.6 One vs. Many Check-outs per Repository

Because Git commingles the repository data with the initial checkout of that repository, the default mode of operation in Git is to stick to that single work/repo tree, even when that's a shortsighted way of working.

Fossil doesn't work that way. A Fossil repository is a SQLite database file which is normally stored outside the working checkout directory. You can open a Fossil repository any number of times into any number of working directories. A common usage pattern is to have one working directory per active working branch, so that switching branches is done with a cd command rather than by checking out the branches successively in a single working directory.

Fossil does allow you to switch branches within a working checkout directory, and this is also often done. It is simply that there is no inherent penalty to either choice in Fossil as there is in Git. The standard advice is to use a switch-in-place workflow in Fossil when the disturbance from switching branches is small, and to use multiple checkouts when you have long-lived working branches that are different enough that switching in place is disruptive.

You can use Git in the Fossil style, either by manually symlinking the .git directory from one working directory to another or by use of the git-worktree feature. Nevertheless, Git's default tie between working directory and repository means the standard method for working with a Git repo is to have one working directory only. Most Git tutorials teach this style, so it is how most people learn to use Git. Because relatively few people use Git with multiple working directories per repository, there are several known problems with that way of working, problems which don't happen in Fossil because of the clear separation between repository and working directory.

This distinction matters because switching branches inside a single working directory loses local context on each switch.

For instance, in any software project where the runnable program must be built from source files, you invalidate build objects on each switch, artificially increasing the time required to switch versions. Most obviously, this affects software written in statically-compiled programming languages such as C, Java, and Haskell, but it can even affect programs written in dynamic languages like JavaScript. A typical SPA build process involves several passes: Browserify to convert Node packages so they'll run in a web browser, SASS to CSS translation, transpilation of TypeScript to JavaScript, uglification, etc. Once all that processing work is done for a given input file in a given working directory, why re-do that work just to switch versions? If most of the files that differ between versions don't change very often, you can save substantial time by switching branches with cd rather than swapping versions in-place within a working checkout directory.

For another example, you might have an active long-running test grinding away in a working directory, then get a call from a customer requiring that you switch to a stable branch to answer questions in terms of the version that customer is running. You don't want to stop the test in order to switch your lone working directory to the stable branch.

Disk space is cheap. Having several working directories, each with its own local state, makes switching versions cheap and fast. Plus, cd is faster to type than git checkout or fossil update.

2.7 What you should have done vs. What you actually did

Git puts a lot of emphasis on maintaining a "clean" check-in history. Extraneous and experimental branches by individual developers often never make it into the main repository. And branches are often rebased before being pushed, to make it appear as if development had been linear. Git strives to record what the development of a project should have looked like had there been no mistakes.

Fossil, in contrast, puts more emphasis on recording exactly what happened, including all of the messy errors, dead-ends, experimental branches, and so forth. One might argue that this makes the history of a Fossil project "messy," but another point of view is that this makes the history "accurate." In actual practice, the superior reporting tools available in Fossil mean that the added "mess" is not a factor.

Like Git, Fossil has an amend command for modifying prior commits, but unlike in Git, this works not by replacing data in the repository, but by adding a correction record to the repository that affects how later Fossil operations present the corrected data. The old information is still there in the repository, it is just overridden from the amendment point forward. For extreme situations, Fossil adds the shunning mechanism, but it has strict limitations that prevent global history rewrites.
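
The difference between replacing data and adding a correction record can be pictured with a toy append-only log. The record shapes and the `effective_comment` helper below are hypothetical and serve only to illustrate the idea; they do not reflect Fossil's real artifact format.

```python
def effective_comment(log, checkin_id):
    """Replay the log in order; later amendments override earlier values."""
    comment = None
    for rec in log:
        if rec["type"] == "checkin" and rec["id"] == checkin_id:
            comment = rec["comment"]
        elif rec["type"] == "amend" and rec["target"] == checkin_id:
            comment = rec["comment"]
    return comment

# The original check-in, typo and all, is recorded once and never edited.
log = [{"type": "checkin", "id": "abc123", "comment": "Fix tpyo in parser"}]

# Amending appends a new record that points at the old one.
log.append({"type": "amend", "target": "abc123", "comment": "Fix typo in parser"})

shown = effective_comment(log, "abc123")   # what reports would display
original = log[0]["comment"]               # still present in the history
```

In this model, displays use the corrected value, yet the original record remains available to anyone who wants to see what was first committed.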

One commentator characterized Git as recording history according to the victors, whereas Fossil records history as it actually happened.

We go into more detail on this topic in a separate article, Rebase Considered Harmful.

2.8 Test Before Commit

One of the things that falls out of Git's default separation of commit from push is that there are several Git sub-commands that jump straight to the commit step before a change could possibly be tested. Fossil, by contrast, makes the equivalent change to the local working check-out only, requiring a separate check-in step to commit the change. This design difference falls naturally out of Fossil's default-enabled autosync feature.

The prime example in Git is rebasing: the change happens to the local repository immediately if successful, even though you haven't tested the change yet. It's possible to argue for such a design in a tool like Git which doesn't automatically push the change up to its parent, because you can still test the change before pushing local changes to the parent repo, but in the meantime you've made a durable change to your local Git repository. You must do something drastic like git reset --hard to revert that rebase if it causes a problem. If you push your rebased local repo up to the parent without testing first, you've now committed the error on a public branch, effectively a violation of the golden rule of rebasing.

Lesser examples are the Git merge, cherry-pick, and revert commands, all of which apply work from one branch onto another, and all of which do their work immediately without giving you an opportunity to test the change first locally unless you give the --no-commit option.

Fossil cannot sensibly work that way because of its default-enabled autosync feature. Instead of jumping straight to the commit step, Fossil applies the proposed merge to the local working directory only, requiring a separate check-in step before the change is committed to the repository. This gives you a chance to test the change first, either manually or by running your software's automatic tests. (Ideally, both!)

Another difference is that because Fossil requires an explicit commit for a merge, it makes you give an explicit commit message for each merge, whereas Git writes that commit message itself by default unless you give the optional --edit flag to override it.

We don't look at this difference as a workaround in Fossil for autosync, but instead as a test-first philosophical difference. When every commit is pushed to the parent repo by default, it encourages a working style in which every commit is tested first. We think this is an inherently good thing.

Incidentally, this is a good example of Git's messy command design. These three commands:

    $ git merge HASH 
    $ git cherry-pick HASH 
    $ git revert HASH

...are all the same command in Fossil:

    $ fossil merge HASH
    $ fossil merge --cherrypick HASH
    $ fossil merge --backout HASH

If you think about it, they're all the same function: apply work done on one branch to another. All that changes between these commands is how much work gets applied — just one check-in or a whole branch — and the merge direction. This is the sort of thing we mean when we point out that Fossil's command interface is simpler than Git's: there are fewer concepts to keep track of in your mental model of Fossil's internal operation.
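
One way to see why these are the same function is a toy model in which a version is a mapping from file name to content, and a single helper applies the changes between two versions onto a check-out. The `apply_changes` helper and the sample versions are assumptions for this sketch; it does whole-file replacement with no conflict detection, so it is far simpler than what either tool really does.

```python
def apply_changes(checkout, pivot, source):
    """Return a copy of checkout with the pivot -> source changes applied."""
    result = dict(checkout)
    for name in sorted(set(pivot) | set(source)):
        before, after = pivot.get(name), source.get(name)
        if before == after:
            continue                   # unchanged between pivot and source
        if after is None:
            result.pop(name, None)     # file was deleted
        else:
            result[name] = after       # file was added or edited
    return result

# Toy history: a branch forks from `base` and makes two check-ins, x1 then x2.
base = {"a": "1", "b": "1"}
x1 = {"a": "2", "b": "1"}
x2 = {"a": "2", "b": "2"}
trunk = {"a": "1", "b": "1", "c": "1"}   # trunk has meanwhile added file c

# Whole-branch merge: everything from the common ancestor to the branch tip.
merged = apply_changes(trunk, base, x2)

# Cherry-pick: only the change that x2 made relative to its parent.
picked = apply_changes(trunk, x1, x2)

# Backout: the same single change, applied in the opposite direction.
backed_out = apply_changes(merged, x2, x1)
```

The three calls differ only in which two versions bracket the change and in which order they are passed, matching the "how much work" and "merge direction" distinction drawn above.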

Fossil's implementation of the feature is also simpler to describe. The brief online help for fossil merge is currently 41 lines long, to which you want to add the 600 lines of the branching document. The equivalent documentation in Git is the aggregation of the man pages for the above three commands, which is over 1000 lines, much of it mutually redundant. (e.g. Git's --edit and --no-commit options get described three times, each time differently.) Fossil's documentation is not only more concise, it gives a nice split of brief online help and full online documentation.

2.9 Hash Algorithm: SHA-3 vs SHA-2 vs SHA-1

Fossil started out using 160-bit SHA-1 hashes to identify check-ins, just as in Git. That changed in early 2017 when news of the SHAttered attack broke, demonstrating that SHA-1 collisions were now practical to create. Two weeks later, the creator of Fossil delivered a new release allowing a clean migration to 256-bit SHA-3 with full backwards compatibility to old SHA-1 based repositories.

In October 2019, after the last of the major binary package repos offering Fossil upgraded to Fossil 2.x, we switched the default hash mode so that from Fossil 2.10 forward, the conversion to SHA-3 is fully automatic. This not only solves the SHAttered problem, it should prevent a recurrence of similar problems for the foreseeable future.

Meanwhile, the Git community took until August 2018 to publish their first plan for solving the same problem by moving to SHA-256, a variant of the older SHA-2 algorithm. As of this writing in February 2020, that plan hasn't been implemented, as far as this author is aware, but there is now a competing SHA-256 based plan which requires complete repository conversion from SHA-1 to SHA-256, breaking all public hashes in the repo. One way to characterize such a massive upheaval in Git terms is a whole-project rebase, which violates Git's own Golden Rule of Rebasing.

Regardless of the eventual implementation details, we fully expect Git to move off SHA-1 eventually and for the changes to take years more to percolate through the community.

Almost three years after Fossil solved this problem, the SHAmbles attack was published, further weakening the case for continuing to use SHA-1.

The practical impact of attacks like SHAttered and SHAmbles on the Git and Fossil blockchains isn't clear, but you want to have your repositories moved over to a stronger hash algorithm before someone figures out how to make use of the weaknesses in the old one. Fossil has had this covered for years now, so that the solution is now almost universally deployed.


Asides and Digressions

  1. Many things are lost in making a Git mirror of a Fossil repo due to limitations of Git relative to Fossil. GitHub adds some of these missing features to stock Git, but because they're not part of Git proper, exporting a Fossil repository to GitHub will still not include them; Fossil tickets do not become GitHub issues, for example.

  2. The fossil-scm.org web site is actually hosted in several parts, so that it is not strictly true that "everything" on it is in the self-hosting Fossil project repo. The web forum is hosted as a separate Fossil repo from the main Fossil self-hosting repo for administration reasons, and the Download page content isn't normally synchronized with a "fossil clone" command unless you add the "-u" option. (See "How the Download Page Works" for details.) There may also be some purely static elements of the web site served via D. Richard Hipp's own lightweight web server, althttpd, which is configured as a front end to Fossil running in CGI mode on these sites.

  3. That estimate is based on pricing at Digital Ocean in mid-2019: Fossil will run just fine on the smallest instance they offer, at US $5/month, but the closest match to GitLab's minimum requirements among Digital Ocean's offerings currently costs $40/month.

  4. This means you can give up waiting for Fossil to be ported to the PDP-11, but we remain hopeful that someone may eventually port it to z/OS.

  5. "Why is there all this Tcl in and around Fossil?" you may ask. It is because D. Richard Hipp is a long-time Tcl user and contributor. SQLite started out as an embedded database for Tcl specifically. ([Reference]) When he then created Fossil to manage the development of SQLite, it was natural for him to use Tcl-based tools for its scripting, build system, test system, etc. It came full circle in 2011 when the Tcl and Tk projects moved from CVS to Fossil.

  6. A minority of the pieces of the Git core software suite are written in other languages, primarily Perl, Python, and Tcl. (e.g. git-send-email, git-p4, and gitk, respectively.) Although these interpreters are quite portable, they aren't installed by default everywhere, and on some platforms you can't count on them at all. (Not just Windows, but also the BSDs and many other non-Linux platforms.) This expands the dependency footprint of Git considerably. It is why the current Git for Windows distribution is 44.7 MiB but the current fossil.exe zip file for Windows is 2.24 MiB. Fossil is much smaller despite using a roughly similar amount of high-level scripting code because its interpreters are compact and built into Fossil itself.

  7. Both Fossil and Git support patch(1) files, a common way to allow drive-by contributions, but it's a lossy contribution path for both systems. Unlike Git PRs and Fossil bundles, patch files collapse multiple check-ins together, they don't include check-in comments, and they cannot encode changes made above the individual file content layer: you lose branching decisions, tag changes, file renames, and more when using patch files.