Using fossil for town government
(1) By Dave St.Germain (davestg) on 2024-03-15 19:49:47
I'm working on a repository to track changes to my town's bylaws -- not as a legal document, but for historical reference. The laws change once or twice per year, at most. So, the ongoing history is not very complicated to model.
I had thought I would start with the earliest records I could find (1890s or earlier), and build up the document(s), using a checkin for each change, with the date set to the date that the law was enacted. But the problem is that the town laws weren't very coherent until about 20 years ago. I can easily make a comprehensive repository of the last 20 years of changes. Anything earlier will involve sorting through paper records and scanning them in, individually.
My question is how to make a clean representation of the history if I'm not able to start as early as the 1800s. Can I create the recent history of changes, and then work backwards, in piecemeal fashion?
I tried testing this -- making changes to a document, with the date set before the document was created. It works, but it creates forks in the history, which appear confusing on the timeline (because "trunk" moves around in time).
What's the best way to model this history, allowing for adding items to the past?
Oh, and the laws are just markdown files per section, with an index. In addition, I'll have other files alongside that record results of votes or meeting notes.
Thanks!
P.S. I saw the prior conversation about VCS for laws but decided to start a new thread because my question is rather specific.
(2) By Richard Hipp (drh) on 2024-03-15 20:17:27 in reply to 1
It will be difficult to add content going backwards in time.
The underlying data structure is an acyclic directed graph. The initial check-in is the root and there are edges moving from the root to all subsequent check-ins. The edges are stored with the child nodes. Each check-in contains the hashes of all its parent nodes. The hashes of the parent check-ins become part of the hash of the new check-in. Since the name of a check-in is a hash of its content, and its content includes the names of all its parents, the names of the parents of each check-in are baked into the name of each child check-in.
Hence it is difficult to construct a parent check-in after the child already exists.
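To make that concrete, here is a minimal sketch of the naming scheme described above. It is not Fossil's actual artifact format: the `checkin_name` function, the "P" line layout, and the sample contents are all invented for illustration. It only shows why a name that covers the parents' names cannot survive having a new ancestor slipped in underneath it.

```python
import hashlib


def checkin_name(content: str, parent_names: list[str]) -> str:
    """Name a check-in by hashing its content together with its parents' names.

    Hypothetical layout, for illustration only: one "P <name>" line per
    parent, followed by the content itself.
    """
    h = hashlib.sha256()
    for parent in sorted(parent_names):
        h.update(b"P " + parent.encode() + b"\n")
    h.update(content.encode())
    return h.hexdigest()


# A root check-in (no parents) and a child that records the root's name.
root = checkin_name("bylaws, first recorded version", [])
child = checkin_name("bylaws, next amendment", [root])

# Back-filling: put an older check-in underneath what used to be the root.
older = checkin_name("bylaws, newly scanned older version", [])
new_root = checkin_name("bylaws, first recorded version", [older])
new_child = checkin_name("bylaws, next amendment", [new_root])

# Same content, but a different parent gives a different name...
print(root != new_root)
# ...and the change ripples down to every descendant.
print(child != new_child)
```

In other words, adding history below an existing check-in means every later check-in would need a new name, which is effectively a different repository.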
That said, there is a reparent command which you could, in theory, use to generate the graph starting at the leaves and working backwards toward the root. But that might be tricky. I recommend running some experiments first, to make sure the reparent command will work for you.
(3) By anonymous on 2024-03-15 20:36:38 in reply to 1
You have (effectively) a linear history that you don't yet have all of the old parts for.
Back-filling history in a single fossil repository is really not what it is made for.
But -- is there any reason why you must have one repository that will remain present?
I suggest: make fossil repositories to throw away.
That is -- make sure that you can script adding all of the files you know about so far, according to the date that you want them to have. Do that for the past 10 years into a new repository, and admire the UI and the change history and the diffs.
Then get the files ready for the previous 10 years as well, and re-do the script for the entire 20 years, in order, into a new fossil repository. Admire the UI, etc, and discard (or copy away) the original repository.
Whenever you have a big enough "new batch" of files to add, re-run the entire input sequence into a new repository.
Put a big scary warning telling users not to clone the repository in order to make lasting changes to it, because it will be replaced in the future.
After you have "all" of the history, you can bless it as "complete" and from then on, you can advertise it as "not going to be discarded again" and others can safely make changes to their copy, expecting that they will be able to re-sync as-needed.
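A rough sketch of what such a rebuild script might look like, assuming each known version of the bylaws is kept as a directory of files (the `Change` record and its `snapshot` field are hypothetical names, not anything Fossil defines). It relies on the `--date-override` option of `fossil init` and `fossil commit`; check `fossil help commit` on your version before trusting the exact flags.

```python
import shutil
import subprocess
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Change:
    date: str      # enactment date in ISO form, e.g. "2004-05-03T19:00:00"
    comment: str   # check-in comment
    snapshot: str  # hypothetical directory holding the files as of that date


def rebuild(repo, workdir, changes, run=subprocess.run):
    """Replay every change, oldest first, into a brand-new repository."""
    repo = Path(repo).resolve()
    workdir = Path(workdir).resolve()
    ordered = sorted(changes, key=lambda c: c.date)  # ISO dates sort as text
    workdir.mkdir(parents=True, exist_ok=True)

    # Date the automatic initial check-in at the first change, not at "now".
    run(["fossil", "init", str(repo), "--date-override", ordered[0].date],
        check=True)
    run(["fossil", "open", str(repo)], cwd=workdir, check=True)

    for change in ordered:
        # Overlay the snapshot on the checkout. Files that were deleted
        # between snapshots are not handled in this sketch.
        shutil.copytree(change.snapshot, workdir, dirs_exist_ok=True)
        run(["fossil", "addremove"], cwd=workdir, check=True)
        run(["fossil", "commit", "-m", change.comment,
             "--date-override", change.date], cwd=workdir, check=True)
```

Whenever a new batch of old records is ready, add the corresponding `Change` entries to the list, delete (or move aside) the previous repository and working directory, and call `rebuild` again with the full list.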
Cheers,
(4) By Warren Young (wyoung) on 2024-03-15 22:13:49 in reply to 3
That's essentially what they did with the Unix History Repo, where a group of code archaeologists reverse-engineered a commit history for Unix, as if Git were created on the PDP-7 and used continuously thenceforth.
The tools they used to create that are here, for what that's worth. It takes the tree of historical files and rebuilds the repo at the first link.
(6) By Dave St.Germain (davestg) on 2024-03-15 23:35:51 in reply to 4
That looks like a very cool project.
Even though git would be more accessible to people and tooling, one major reason I rejected it for my project is that support for dates prior to the Unix epoch is complicated or nonexistent.
Fossil has no problem with dates in the 1700s.
(7) By Warren Young (wyoung) on 2024-03-16 00:17:00 in reply to 6
Git is incidental here. You can do the same type of scripting with Fossil.
I rebuilt one of my biggest repos from tarball backups once, when the original Subversion hosting site shut down without warning. Like the Unix repo I pointed to, it only has release-level commits up to the point where I did the conversion, but that was good enough for my purposes: it shows a deep sampling of the project's historical record prior to that point.
That in turn gives us a broader lesson for this topic: make sure your repo gets widely cloned, to avoid a similar single point of failure.
(5) By Dave St.Germain (davestg) on 2024-03-15 23:32:14 in reply to 3
Thanks. That sounds like a pragmatic way to do it. And since I'm the only one working on the project, I don't have to worry about somebody cloning the repo before it's "done".
I think it could also be accomplished with a single "scratch" repo. For every batch of previous years' changes, I could create a named branch from the starting point (of trunk) and add commits in order, setting the correct dates. Then fossil timeline --oneline -n 0 would give the linear list of changes, which I could reconstruct in the clean repository.
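As a small sketch of the "reconstruct in order" step, assuming each batch has already been reduced to a list of (date, comment) pairs: the `linear_history` helper and the sample entries below are placeholders made up for illustration, not real town records and not parsed `fossil timeline` output.

```python
import heapq


def linear_history(*batches):
    """Merge per-batch lists of (date, comment) pairs into one chronological list.

    Dates are ISO "YYYY-MM-DD" strings, so plain text ordering is
    chronological ordering.
    """
    return list(heapq.merge(*(sorted(batch) for batch in batches)))


# Placeholder entries: one batch per scratch branch.
recent = [("2004-05-03", "placeholder change C"),
          ("2010-05-03", "placeholder change D")]
backfill = [("1995-05-01", "placeholder change B"),
            ("1890-03-10", "placeholder change A")]

for date, comment in linear_history(recent, backfill):
    print(date, comment)
```

The merged list is then the order in which the check-ins would be replayed into the clean repository.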