Fossil

Fossil Versus Git

1.0 Don't Stress!

If you start out using one DVCS and later decide you like the other better, you can easily move your content.¹

Fossil and Git are very similar in many respects, but they also have important differences. See the table below for a high-level summary and the text that follows for more details.

Keep in mind that you are reading this on a Fossil website, and though we try to be fair, the information here might be biased in favor of Fossil. Ask around for second opinions from people who have used both Fossil and Git.

2.0 Differences Between Fossil And Git

Differences between Fossil and Git are summarized by the following table, with further description in the text that follows.

GIT                                      | FOSSIL
-----------------------------------------+---------------------------------------------------
File versioning only                     | VCS, tickets, wiki, docs, notes, forum, UI, RBAC
Sprawling, incoherent, and inefficient   | Self-contained and efficient
Ad-hoc pile-of-files key/value database  | Relational SQL database
Portable to POSIX systems only           | Runs just about anywhere
Bazaar-style development                 | Cathedral-style development
Designed for Linux kernel development    | Designed for SQLite development
Many contributors                        | Select contributors
Focus on individual branches             | Focus on the entire tree of changes
One check-out per repository             | Many check-outs per repository
Remembers what you should have done      | Remembers what you actually did
SHA-1, SHA-256                           | 256-bit SHA-3

2.1 Featureful

Git provides file versioning services only, whereas Fossil adds an integrated wiki, ticketing & bug tracking, embedded documentation, technical notes, and a web forum, all within a single nicely-designed skinnable web UI, protected by a fine-grained role-based access control system. These additional capabilities are available for Git as 3rd-party add-ons, but with Fossil they are integrated into the design. One way to describe Fossil is that it is "GitHub-in-a-box."

For developers who choose to self-host projects (rather than using a 3rd-party service such as GitHub) Fossil is much easier to set up, since the stand-alone Fossil executable together with a 2-line CGI script suffice to instantiate a full-featured developer website. To accomplish the same using Git requires locating, installing, configuring, integrating, and managing a wide assortment of separate tools. Standing up a developer website using Fossil can be done in minutes, whereas doing the same using Git requires hours or days.

Fossil is small, complete, and self-contained. If you clone Git's self-hosting repository, you get just Git's source code. If you clone Fossil's self-hosting repository, you get the entire Fossil website — source code, documentation, ticket history, and so forth.² That means you get a copy of this very article and all of its historical versions, plus the same for all of the other public content on this site.

2.2 Efficient

Git is actually a collection of many small tools, each doing one small part of the job, which can be recombined (by experts) to perform powerful operations. Git has a lot of complexity and many dependencies, so most people end up installing it via some kind of package manager, simply because these problems are best delegated to people skilled in the creation of binary software packages.

Fossil is a single self-contained stand-alone executable with hardly any dependencies. Fossil can be run inside a minimally configured chroot jail, from a Windows memory stick, off a Raspberry Pi with a tiny SD card, etc. To install Fossil, one merely puts the executable somewhere in the $PATH.

Some say that Git more closely adheres to the Unix philosophy, summarized as "many small tools, loosely joined," but we have many examples of other successful Unix software that violates that principle to good effect, from Apache to Python to ZFS. We can infer from these examples that it is not an absolute principle of good software design. Sometimes "many features, tightly-coupled" works better. What actually matters is effectiveness and efficiency. We believe Fossil achieves both.

Git fails on efficiency once you add to it all of the third-party software needed to give it a Fossil-equivalent feature set. Consider GitLab, a third-party extension to Git wrapping it in many features, making it roughly Fossil-equivalent, though much more resource hungry and hence more costly to run than the equivalent Fossil setup. GitLab's requirements are tolerable when you're dedicating a local rack server or blade to it, since that's about the smallest thing you could call a "server" these days, but when you go to host that in the cloud, you can expect to pay about 8⨉ as much to comfortably host GitLab as for Fossil. This difference is largely due to basic technology choices: Ruby and PostgreSQL vs C and SQLite.

The Fossil project itself is hosted on a very small VPS, and we've received many reports on the Fossil forum about people successfully hosting Fossil service on bare-bones $5/month VPS hosts, spare Raspberry Pi boards, and other small hosts.

2.3 Durable

The baseline data structures for Fossil and Git are the same, modulo formatting details. Both systems store check-ins as immutable objects referencing their immediate ancestors and named by a cryptographic hash of the check-in content.
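The shared idea can be sketched in a few lines of Python. This is only an illustration of hash-named, parent-referencing check-ins: the JSON encoding and the `make_checkin` helper are hypothetical, and neither Fossil's manifest format nor Git's commit object format looks like this.

```python
import hashlib
import json

def make_checkin(comment, files, parents):
    """Build a check-in record and name it by the hash of its content.

    Hypothetical encoding for illustration only; real Fossil and Git
    objects use their own formats.
    """
    # The record embeds the names (hashes) of its immediate ancestors.
    record = {"comment": comment, "files": files, "parents": sorted(parents)}
    content = json.dumps(record, sort_keys=True).encode("utf-8")
    name = hashlib.sha3_256(content).hexdigest()
    return name, content

root_id, root = make_checkin("initial import", {"README": "hello"}, [])
child_id, child = make_checkin("fix typo", {"README": "hello!"}, [root_id])

# Altering an ancestor's content changes its name, so it no longer matches
# the parent reference stored in the child.
forged_id, _ = make_checkin("initial import", {"README": "HELLO"}, [])
```

Because the child stores its parent's name, and that name is derived from the parent's content, a check-in cannot be changed after the fact without the change showing up in every descendant's references.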

The difference is that Git stores its objects as individual files in the .git folder or compressed into bespoke pack-files, whereas Fossil stores its objects in a relational (SQLite) database file. To put it another way, Git uses an ad-hoc pile-of-files key/value database whereas Fossil uses a proven, heavily-tested, general-purpose, durable SQL database. This difference is more than an implementation detail. It has important practical consequences.

With Git, one can easily locate the ancestors of a particular check-in by following the pointers embedded in the check-in object, but it is difficult to go the other direction and locate the descendants of a check-in. It is so difficult, in fact, that neither native Git nor GitHub provides this capability short of groveling through the commit log. With Git, if you are looking at some historical check-in then you cannot ask "What came next?" or "What are the children of this check-in?"

Fossil, on the other hand, parses essential information about check-ins (parents, children, committers, comments, files changed, etc.) into a relational database that can be easily queried using concise SQL statements to find both ancestors and descendants of a check-in.
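To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `checkin` and `link` tables are a simplified, hypothetical schema invented for this example, not Fossil's actual repository schema. Once parent/child edges are rows in a table, walking the graph in either direction is the same kind of query.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Hypothetical, simplified schema (not Fossil's real one).
    CREATE TABLE checkin(id TEXT PRIMARY KEY, comment TEXT);
    CREATE TABLE link(parent TEXT, child TEXT);
    INSERT INTO checkin VALUES
        ('a', 'initial'), ('b', 'feature'), ('c', 'bugfix'), ('d', 'merge');
    INSERT INTO link VALUES ('a', 'b'), ('a', 'c'), ('b', 'd'), ('c', 'd');
""")

def children(cid):
    """Immediate children: the answer to 'what came next?'"""
    rows = db.execute(
        "SELECT child FROM link WHERE parent = ? ORDER BY child", (cid,))
    return [r[0] for r in rows]

def descendants(cid):
    """All descendants, found with a recursive common table expression."""
    rows = db.execute("""
        WITH RECURSIVE d(id) AS (
            SELECT child FROM link WHERE parent = ?
            UNION
            SELECT link.child FROM link JOIN d ON link.parent = d.id
        )
        SELECT id FROM d ORDER BY id
    """, (cid,))
    return [r[0] for r in rows]

def ancestors(cid):
    """All ancestors: the same query with the edge direction reversed."""
    rows = db.execute("""
        WITH RECURSIVE a(id) AS (
            SELECT parent FROM link WHERE child = ?
            UNION
            SELECT link.parent FROM link JOIN a ON link.child = a.id
        )
        SELECT id FROM a ORDER BY id
    """, (cid,))
    return [r[0] for r in rows]

print(children("a"))     # ['b', 'c']
print(descendants("a"))  # ['b', 'c', 'd']
print(ancestors("d"))    # ['a', 'b', 'c']
```

The two recursive queries differ only in which column they start from, which is the symmetry that a store of objects holding parent pointers alone does not offer.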

Leaf check-ins in Git that lack a "ref" become "detached," making them difficult to locate and subject to garbage collection. This detached head state problem has caused untold grief for countless Git users. With Fossil, all check-ins are easily located via multiple possible paths, so that detached heads are simply not possible in Fossil.

This design difference shows up in several other places within each tool. It is why Fossil's timeline is generally more detailed yet more clear than those available in Git front-ends. (Contrast this Fossil timeline with its closest equivalent in GitHub.) It is why Git has no inverse of its cryptic @~ notation, meaning "the parent of HEAD," which Fossil simply calls "prev", whereas Fossil does have a "next" special check-in name. It is why Fossil has so many built-in status reports to help maintain situational awareness, aid comprehension, and avoid errors.

2.4 Portable

Fossil is largely written in ISO C, almost purely conforming to the original 1989 standard. We make very little use of C99, and even less of C11. Fossil does make use of POSIX and Windows APIs where necessary, but it's about as portable as you can ask given that ISO C doesn't define all of the facilities Fossil needs to do its thing. (Network sockets, file locking, etc.) There are certainly well-known platforms Fossil hasn't been ported to yet, but that's most likely due to lack of interest rather than inherent difficulties in doing the port. We believe the most stringent limit on its portability is that it assumes at least a 32-bit CPU and several megs of flat-addressed memory.³ Fossil isn't quite as portable as SQLite, but it's close.

About half of the code in Fossil is actually an embedded copy of the current version of SQLite. Much of what is Fossil-specific after you set SQLite itself aside is SQL code calling into SQLite. The number of lines of SQL code in Fossil isn't large by percentage, but since SQL is such an expressive, declarative language, it has an outsized contribution to Fossil's user-visible functionality.

Fossil also makes good use of JavaScript for its web UI, and there's a fair bit of use of the Tcl and TH1 scripting languages. These do not hamper Fossil's portability, since they are also all quite portable technologies themselves.

Git is largely portable only to POSIX platforms. About half its code is POSIX C, and about a third of it is POSIX shell code. There's also quite a lot of Perl, Tcl, and Python code in Git. Although these technologies are quite portable within the sphere of POSIX OSes, they're quite foreign to Windows, which is why the so-called "Git for Windows" distributions (both first-party and third-party) are actually an MSYS POSIX portability environment bundled with all of the Git stuff, because it would be too painful to port Git natively to Windows. Git is a foreign citizen on Windows, speaking to it only through a translator.

While Fossil does lean toward POSIX norms when given a choice — LF-only line endings are treated as first-class citizens over CR+LF, for example — the Windows build of Fossil is truly native.

The third-party extensions to Git tend to follow this same pattern. GitLab isn't portable to Windows at all, for example. For that matter, GitLab isn't even officially supported on macOS, the BSDs, or uncommon Linuxes!

2.5 Linux vs. SQLite

Fossil and Git promote different development styles because each one was specifically designed to support the creator's main software development project: Linus Torvalds designed Git to support development of the Linux kernel, and D. Richard Hipp designed Fossil to support the development of SQLite. Both projects must rank high on any objective list of "most important FOSS projects," yet these two projects are almost entirely unlike one another. So, too, are these two DVCSes.

In the following sections, we will explain how four key differences between Linux and SQLite dictated the design of each DVCS's low-friction usage path.

When deciding between these two DVCSes, you should ask yourself, "Is my project more like Linux or more like SQLite?"

2.5.1 Development Organization

Eric S. Raymond's seminal essay-turned-book "The Cathedral and the Bazaar" details the two major development organization styles found in FOSS projects. As it happens, Linux and SQLite fall on opposite sides of this dichotomy. Differing development organization styles dictate a different design and low-friction usage path in the tools created to support each project.

Git promotes the Linux kernel's bazaar development style, in which a loosely-associated mass of developers contribute their work through a hierarchy of lieutenants who manage and clean up these contributions for consideration by Linus Torvalds, who has the power to cherrypick individual contributions into his version of the Linux kernel. Git allows an anonymous developer to rebase and push specific locally-named private branches, so that a Git repo clone often isn't really a clone at all: it may have an arbitrary number of differences relative to the repository it originally cloned from. Git encourages siloed development. Select work in a developer's local repository may remain private indefinitely.

All of this is exactly what one wants when doing bazaar-style development.

Fossil's normal mode of operation differs on every one of these points, with the specific designed-in goal of promoting SQLite's cathedral development model:

Where Git encourages siloed development, Fossil fights against it. Fossil places a lot of emphasis on synchronizing everyone's work and on reporting on the state of the project and the work of its developers, so that everyone — especially the project leader — can maintain a better mental picture of what is happening, leading to better situational awareness.

Each DVCS can be used in the opposite style, but doing so works against their low-friction paths.

2.5.2 Scale

The Linux kernel has a far bigger developer community than that of SQLite: there are thousands and thousands of contributors to Linux, most of whom do not know each other's names. These thousands are responsible for producing roughly 89⨉ more code than is in SQLite. (10.7 MLOC vs. 0.12 MLOC according to SLOCCount.) The Linux kernel and its development process were already uncommonly large back in 2005 when Git was designed, specifically to support the consequences of having such a large set of developers working on such a large code base.

95% of the code in SQLite comes from just four programmers, and 64% of it is from the lead developer alone. The SQLite developers know each other well and interact daily. Fossil was designed for this development model. As well, we think the fact of Fossil's birth a year later than Git allowed it to learn from some of the key design mistakes in Git.

We think you should ask yourself whether you have Linus Torvalds scale software configuration management problems or D. Richard Hipp scale problems when choosing your DVCS. An automotive air impact wrench running at 8000 RPM driving an M8 socket-cap bolt at 16 cm/s is not the best way to hang a picture on the living room wall.

2.5.3 Accepting Contributions

As of this writing, Git has received about 4.5⨉ as many commits as Fossil resulting in about 2.5⨉ as many lines of source code. The line count excludes tests and in-tree third-party dependencies. It does not exclude the default GUI for each, since it's integral for Fossil, so we count the size of gitk in this.

It is obvious that Git is bigger in part because of its first-mover advantage, which resulted in a larger user community, which results in more contributions. But is that the only reason? We believe there are other relevant differences that also play into this which fall out of the "Linux vs. SQLite" framing: licensing, community structure, and how we react to drive-by contributions. In brief, it's harder to get a new feature into Fossil than into Git.

A larger feature set size is not necessarily a good thing. Git's command line interface is famously arcane. Masters of the arcane are able to do wizardly things, but only by studying their art deeply for years. This strikes us as a good thing only in cases where use of the tool itself is the primary point of that user's work.

Most DVCS users are not using a DVCS for its own sake, so we do not want the DVCS with the most features, we want the one with a more easily internalized behavior set, which we can pick up, use quickly, and then set aside in order to get back to our actual job as quickly as possible. There is some minimal set of features required to achieve that, but there is a level beyond which more features only slow us down while we're learning about the DVCS, as we must plow through documentation on features we're not likely to ever use. When the number of features grows to the point where people of normal motivation cannot spend the time to master them all, you make the tool less productive to use.

We achieve this balance between feature set size and ease of use by carefully choosing which users to give commit bits to, then by being choosy about which of the contributed feature branches to merge down to trunk.

The end result is that Fossil more closely adheres to the principle of least astonishment than Git does.

2.5.4 Individual Branches vs. The Entire Change History

Both Fossil and Git store history as a directed acyclic graph (DAG) of changes, but Git tends to focus more on individual branches of the DAG, whereas Fossil puts more emphasis on the entire DAG.

For example, the default "sync" behavior in Git is to only sync a single branch, whereas with Fossil the only sync option is to sync the entire DAG. Git commands, GitHub, and GitLab tend to show only a single branch at a time, whereas Fossil usually shows all parallel branches at once. Git has commands like "rebase" that help keep all relevant changes on a single branch, whereas Fossil encourages a style of many concurrent branches constantly springing into existence, undergoing active development in parallel for a few days or weeks, then merging back into the main line and disappearing.

This difference in emphasis arises from the different purposes of the two systems. Git focuses on individual branches, because that is exactly what you want for a highly-distributed bazaar-style project such as Linux. Linus Torvalds does not want to see every check-in by every contributor to Linux, as such extreme visibility does not scale well. But Fossil was written for the cathedral-style SQLite project with just a handful of active committers. Seeing all changes on all branches all at once helps keep the whole team up-to-date with what everybody else is doing, resulting in a more tightly focused and cohesive implementation.
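The difference between a single-branch view and a whole-DAG view can be illustrated with a toy history. Everything below is hypothetical: the check-in names, the branch labels, and the two helper functions were made up for this sketch and do not correspond to commands in either tool.

```python
# Hypothetical history: each check-in maps to (branch label, list of parents).
history = {
    "c1": ("trunk", []),
    "c2": ("trunk", ["c1"]),
    "c3": ("feature-x", ["c2"]),
    "c4": ("trunk", ["c2"]),
    "c5": ("feature-y", ["c4"]),
    "c6": ("trunk", ["c4", "c3"]),  # merge of feature-x into trunk
}

def single_branch_view(tip):
    """Walk first parents back from one tip: one line through the DAG."""
    line = []
    current = tip
    while current is not None:
        line.append(current)
        parents = history[current][1]
        current = parents[0] if parents else None
    return line

def leaves():
    """Whole-DAG question: which check-ins have no children at all?"""
    has_child = {p for _, parents in history.values() for p in parents}
    return sorted(c for c in history if c not in has_child)

print(single_branch_view("c6"))  # ['c6', 'c4', 'c2', 'c1']
print(leaves())                  # ['c5', 'c6']
```

The single-branch walk from the trunk tip never mentions the work on feature-y, while the whole-DAG view reports it as an open leaf that someone on the team is still working on.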

2.6 One vs. Many Check-outs per Repository

A "repository" in Git is a pile-of-files in the ".git" subdirectory of a single check-out. The check-out and the repository are located together in the filesystem.

With Fossil, a "repository" is a single SQLite database file that can be stored anywhere. There can be multiple active check-outs from the same repository, perhaps open on different branches or on different snapshots of the same branch. Long-running tests or builds can be running in one check-out while changes are being committed in another.

Git version 2.5 adds a feature to emulate Fossil's decoupling of the repository from the check-out tree, which it calls "git-worktree." This command sets up a series of links in the filesystem to allow a single repository to host multiple check-outs. However, the interface is sufficiently difficult to use that most people find it easier to create a separate clone for each check-out. There are also practical consequences of the way it's implemented that make worktrees not quite equivalent to the main Git repo + checkout tree.

With Fossil, the complete decoupling of repository and check-out tree means every working check-out tree is treated equally. It's common in Fossil to have a check-out tree for each major working branch so that you can switch branches with a "cd" command rather than replace the current working file set with a different file set by updating in place, as Git prefers.

2.7 What you should have done vs. What you actually did

Git puts a lot of emphasis on maintaining a "clean" check-in history. Extraneous and experimental branches by individual developers often never make it into the main repository. And branches are often rebased before being pushed, to make it appear as if development had been linear. Git strives to record what the development of a project should have looked like had there been no mistakes.

Fossil, in contrast, puts more emphasis on recording exactly what happened, including all of the messy errors, dead-ends, experimental branches, and so forth. One might argue that this makes the history of a Fossil project "messy." But another point of view is that this makes the history "accurate." In actual practice, the superior reporting tools available in Fossil mean that the added "mess" is not a factor.

Like Git, Fossil has an amend command for modifying prior commits, but unlike in Git, this works not by replacing data in the repository, but by adding a correction record to the repository that affects how later Fossil operations present the corrected data. The old information is still there in the repository, it is just overridden from the amendment point forward. For extreme situations, Fossil adds the shunning mechanism, but it has strict limitations that prevent global history rewrites.
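The "correction record" idea can be sketched as an append-only log. This is a conceptual model only: the record layout and the function names are hypothetical and are not Fossil's artifact format or its command set.

```python
records = []  # append-only: entries are added, never edited or removed

def commit(cid, comment):
    records.append({"type": "checkin", "id": cid, "comment": comment})

def amend(cid, comment):
    # An amendment is a new record that points at the check-in it corrects.
    records.append({"type": "amend", "target": cid, "comment": comment})

def displayed_comment(cid):
    """What reports show: the latest record concerning this check-in wins."""
    result = None
    for r in records:
        if r["type"] == "checkin" and r["id"] == cid:
            result = r["comment"]
        elif r["type"] == "amend" and r["target"] == cid:
            result = r["comment"]
    return result

def original_comment(cid):
    """What was actually written at commit time; it is still in the log."""
    for r in records:
        if r["type"] == "checkin" and r["id"] == cid:
            return r["comment"]
    return None

commit("c1", "Fix teh parser")
amend("c1", "Fix the parser")
print(displayed_comment("c1"))  # Fix the parser
print(original_comment("c1"))   # Fix teh parser
```

Both the mistake and its correction remain in the log; only the presentation changes from the amendment onward.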

One commentator characterized Git as recording history according to the victors, whereas Fossil records history as it actually happened.

2.8 Hash Algorithm: SHA-3 vs SHA-2 vs SHA-1

Fossil started out using 160-bit SHA-1 hashes to identify check-ins, just as in Git. That changed in early 2017 when news of the SHAttered attack broke, demonstrating that SHA-1 collisions were now practical to create. Two weeks later, the creator of Fossil delivered a new release allowing a clean migration to 256-bit SHA-3 with full backwards compatibility to old SHA-1 based repositories.
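For a sense of what the two kinds of names look like, the sketch below hashes the same content both ways with Python's hashlib. The `guess_algorithm` helper is a hypothetical illustration of how names of different lengths could coexist in one repository; it is an assumption for this example, not a description of Fossil's internals.

```python
import hashlib

content = b"example artifact content"

sha1_name = hashlib.sha1(content).hexdigest()      # 160 bits -> 40 hex digits
sha3_name = hashlib.sha3_256(content).hexdigest()  # 256 bits -> 64 hex digits

def guess_algorithm(name):
    """Guess the hash algorithm from the length of a hex name.

    Illustrative assumption only: a 40-digit name is treated as SHA-1
    and a 64-digit name as 256-bit SHA-3.
    """
    return {40: "SHA-1", 64: "SHA3-256"}.get(len(name), "unknown")

print(len(sha1_name), guess_algorithm(sha1_name))  # 40 SHA-1
print(len(sha3_name), guess_algorithm(sha3_name))  # 64 SHA3-256
```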

Here in mid-2019, that feature is now in every OS and package repository known to include Fossil, so the next release as of this writing (Fossil 2.10) will enforce SHA-3 hashes by default. This not only solves the SHAttered problem, it should prevent a recurrence for the foreseeable future. Only repositories created before the transition to Fossil 2 are still using SHA-1, and then only if the repository's maintainer chose not to switch them into SHA-3 mode some time over the past 2 years.

Meanwhile, the Git community took until August 2018 to announce their plan for solving the same problem by moving to SHA-256 (a variant of the older SHA-2 algorithm) and until February 2019 to release a version containing the change. It's looking like this will take years more to percolate through the community.

The practical impact of SHAttered on Merkle tree / block chain structured data stores like Git and Fossil isn't clear, but Fossil moved on the problem quickly and had a widely-deployed solution to it years ago.

3.0 Missing Features

Most of the capabilities found in Git are also available in Fossil and the other way around. For example, both systems have local check-outs, remote repositories, push/pull/sync, bisect capabilities, and a "stash." Both systems store project history as a directed acyclic graph (DAG) of immutable check-in objects.

There are many areas where one system has a feature that is simply missing in the other, however. We covered most of those above, but there are some others we haven't gotten to yet.

3.1 Features found in Fossil but missing from Git

Fossil keeps track of all repositories and check-outs and allows operations over all of them with a single command. For example, in Fossil it is possible to request a pull of all repositories on a laptop from their respective servers, prior to taking the laptop off network. Or it is possible to do "fossil all changes" to see if there are any uncommitted changes that were overlooked prior to the end of the workday.

Whenever Fossil is told to modify the local check-out in some destructive way (fossil rm, fossil update, fossil revert, etc.), Fossil remembers the prior state and is able to return the local check-out directory to its prior state with a simple "fossil undo" command. You cannot undo a commit, since writes to the actual repository (as opposed to the local check-out directory) are more or less permanent, on purpose, but as long as the change is simply staged locally, Fossil makes undo easier than in Git.
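The undo behavior described above amounts to snapshotting the check-out before each destructive operation. The following is a conceptual sketch under that assumption; the `Checkout` class and its methods are hypothetical and say nothing about how Fossil actually stores its undo information.

```python
class Checkout:
    """Toy model of a working check-out with a one-level undo."""

    def __init__(self, files):
        self.files = dict(files)
        self._undo = None  # snapshot taken before the last destructive op

    def remove(self, name):
        # Remember the prior state before changing anything.
        self._undo = dict(self.files)
        del self.files[name]

    def undo(self):
        if self._undo is None:
            raise RuntimeError("nothing to undo")
        self.files, self._undo = self._undo, None

co = Checkout({"a.txt": "one", "b.txt": "two"})
co.remove("a.txt")
co.undo()
print(sorted(co.files))  # ['a.txt', 'b.txt']
```

Note that the model only covers the check-out. Nothing here touches committed history, which matches the point that a commit cannot be undone this way.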

3.2 Features found in Git but missing from Fossil

Because of its emphasis on recording history exactly as it happened, rather than as we would have liked it to happen, Fossil deliberately does not provide a "rebase" command. One can rebase manually in Fossil, with sufficient perseverance, but it is not something that can be done with a single command.

The fossil push, fossil pull, and fossil sync commands do not provide the capability to push or pull individual branches. Pushing and pulling in Fossil is all or nothing. This is in keeping with Fossil's emphasis on maintaining a complete record and on sharing everything between all developers.

Asides and Digressions

  1. Many things are lost in making a Git mirror of a Fossil repo due to limitations of Git relative to Fossil. GitHub adds some of these missing features to stock Git, but because they're not part of Git proper, exporting a Fossil repository to GitHub will still not include them; Fossil tickets do not become GitHub issues, for example.

  2. The fossil-scm.org web site is actually hosted in several parts, so that it is not strictly true that "everything" on it is in the self-hosting Fossil project repo. The web forum is hosted as a separate Fossil repo from the main Fossil self-hosting repo for administration reasons, and the Download page content isn't normally sync'd with a "fossil clone" command unless you add the "-u" option. (See "How the Download Page Works" for details.) There may also be some purely static elements of the web site served via D. Richard Hipp's own lightweight web server, althttpd, which is configured as a front end to Fossil running in CGI mode on these sites.

  3. We have yet to hear from someone who has ported Fossil to z/OS, for example, though it should be quite possible.

  4. Both Fossil and Git support patch(1) files, a common way to allow drive-by contributions, but it's a lossy contribution path for both systems. Unlike Git PRs and Fossil bundles, patch files collapse multiple check-ins together, they don't include check-in comments, and they cannot encode changes made above the individual file content layer: you lose branching decisions, tag changes, file renames, and more when using patch files.