Fossil Forum

Shameless plug: libfossil for daily fossil work

(1) By Stephan Beal (stephan) on 2021-06-25 04:21:56 [source]

As some of you may recall, libfossil was recently revived from the dead, thanks to a gentle kick in the butt by Dan "The Poker" Shearer, and updated to work with fossil's version 2 hashes, its recent internal timeline changes, and whatnot:

For those who don't know this, and who may feel a bit adventurous, libfossil can be used for much of one's daily work with fossil. For example, it can check in changes, show a timeline or the current checkout status, perform an "update" from one version to another, and manage tags. One of the test apps does what amounts to a partial "rebuild" but, for reasons beyond my ken, it runs orders of magnitude more slowly than a rebuild using the core fossil app (i have a suspicion about the speed difference, but it's unproven).

Sidebar for those concerned about it corrupting a repository: fossil's own repository pre-commit integrity self-checks were literally the second feature added to the library (right behind the pseudo-recursive transaction support), long before it could ever actually write any artifacts to the database. Its artifact parser is stricter than fossil's own, and it was in fact responsible for detecting a handful of malformed artifacts in fossil's own repository and thereby uncovering old (now fixed) fossil bugs. libfossil has never introduced a malformed artifact into a fossil repository ("knock on wood" and all that) and it goes way out of its way to ensure that it is literally unable to do so.
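As an illustration of the "pseudo-recursive transaction" idea mentioned above, here is a minimal sketch. It assumes the term means emulating nested transactions with a depth counter, so that only the outermost level issues a real BEGIN and COMMIT/ROLLBACK. All names (`txn_ctx`, `txn_begin`, `txn_end`) are hypothetical and are not libfossil's actual API, and the SQL is only logged, not executed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout: this is NOT libfossil's real API. */
typedef struct {
  int depth;       /* current nesting level, 0 = no transaction open */
  int doRollback;  /* set when any nested level reports failure */
  char log[128];   /* SQL a real implementation would send to the db */
} txn_ctx;

/* Stand-in for running SQL: appends the statement to the log. */
static void txn_exec(txn_ctx *t, const char *sql){
  size_t used = strlen(t->log);
  snprintf(t->log + used, sizeof(t->log) - used, "%s;", sql);
}

/* Only the outermost begin starts a real transaction. */
void txn_begin(txn_ctx *t){
  if(0 == t->depth++) txn_exec(t, "BEGIN");
}

/* Ends one nesting level. A failure at any level poisons the whole
** transaction. Returns non-0 only when the outermost level rolls back. */
int txn_end(txn_ctx *t, int failed){
  int rc = 0;
  assert(t->depth > 0);
  if(failed) t->doRollback = 1;
  if(--t->depth > 0) return 0;  /* still nested: defer the decision */
  if(t->doRollback){
    txn_exec(t, "ROLLBACK");
    rc = 1;
  }else{
    txn_exec(t, "COMMIT");
  }
  t->doRollback = 0;
  return rc;
}
```

With this shape, any routine can wrap its work in `txn_begin()`/`txn_end()` without knowing whether a caller has already opened a transaction.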

The notably missing features for day-to-day work are the sync protocol and the "merge" command. (Its diff support also isn't nearly as slick as fossil's "diff -tk", so i stick with fossil for diffing.) It can merge changes (that's needed for doing an update) but the "merge" command functionality has not yet been ported over because, frankly, it's a bit of a beast and my hands haven't felt up to it.

For example:

# Sync using fossil app:
[pi@pi4b8:~/fossil/fossil]$ f pull
Pull from https://stephan@fossil-scm.org/home
Round-trips: 1   Artifacts sent: 0  received: 0
Pull done, sent: 405  received: 1525  ip: <snip>

# Update with libfossil:
[pi@pi4b8:~/fossil/fossil]$ f-update
********************************************************
WARNINGS:
  1) The update API is still in development.
  2) Use at your own risk.
  3) There is no UNDO support.
  4) See #2.
  5) See #2.
********************************************************

Version to update to: 255a28b37a22 (RID 52497)
[u ]    20942   25 Jun 2021  src/bisect.c
[u ]    48067   25 Jun 2021  src/blob.c
[u ]    76077   25 Jun 2021  src/file.c
[u ]    13406   25 Jun 2021  src/http_transport.c
[u ]    71270   25 Jun 2021  src/main.mk
[u ]    65372   25 Jun 2021  src/makemake.tcl
[+ ]    31014   25 Jun 2021  src/patch.c
[u ]    26279   25 Jun 2021  src/stash.c
[u ]    22454   25 Jun 2021  src/util.c
[u ]    31968   25 Jun 2021  win/Makefile.dmc
[u ]    82974   25 Jun 2021  win/Makefile.mingw
[u ]    71326   25 Jun 2021  win/Makefile.msc
[u ]    15543   25 Jun 2021  www/build.wiki
[u ]    17440   25 Jun 2021  www/caps/index.md
[u ]    91418   25 Jun 2021  www/changes.wiki
[u ]     7513   25 Jun 2021  www/mkindex.tcl
[+ ]     2855   25 Jun 2021  www/patchcmd.md
[u ]     8285   25 Jun 2021  www/permutedindex.html

Processed 996 file(s) from [current] [255a28b37a224130].
18 SCM'd file(s) written to disk.
978 file(s) left unchanged on disk.

repository-db:       /home/pi/fossil/fossil.fsl
checkout-root:       /home/pi/fossil/fossil/
checkout-version:    255a28b37a2241300b708b457f577bfcafe4bda7b3cd5a21318c3475f99ff672 2021-06-24 18:40:48 local (RID 52497)
parent:              6d2e48b4cd38e0283863b88c0b5b96091c642fea3d61d2a5b4f131f86ad476a8
user:                drh
tags:                trunk
comment:             Improvements to comments on the filename shell quoting logic and test logic.
No changes to code.

There are several other features which "would be nice" for daily work ("undo" and "stash" immediately come to mind) but which are not used (by me) frequently enough for me to have bothered porting them yet. That said: contributors willing to dive in and add and/or port features are welcomed with open arms! This isn't a one-man show! Some level of C know-how is required but one need not be a C wizard.

Aside from day-to-day work using the example apps, libfossil is first and foremost a library. All of the features demonstrated by these apps are primarily implemented using library APIs, i.e. they can be integrated into one's own software. The library also provides some features which fossil doesn't, such as the ability to programmatically create checkins without a checkout.

So... if you're feeling a bit adventurous and/or would like to contribute, please tap on one of the links above. (We have a forum over there but it's not currently open for self-registration and it suffers from considerable mail-sending delays from the hoster. Feel free to contact me if you'd like access and i'll get you set up.)

(2) By Stephan Beal (stephan) on 2021-06-25 04:35:38 in reply to 1 [link] [source]

And here's an example of doing a checkin using libfossil's tools:

[pi@pi4b8:~/fossil/libfossil/f-apps]$ f-ci -m "Removed an unnecessary #include."
QUEUED: f-apps/f-test-ciwoco.c
New version: 681c930f1d48b809229b80f937ec54531b292240 (10052)
Note that libfossil currently has no remote sync support, so to push your changes you will need to learn the Fossil sync protocol and speak it to the remote server of your choice over a telnet connection.
Or you can just use fossil(1)'s "push" feature.

[pi@pi4b8:~/fossil/libfossil/f-apps]$ f push
Push to https://stephan@fossil.wanderinghorse.net/r/libfossil
Round-trips: 1   Artifacts sent: 2  received: 0
Push done, sent: 1035  received: 2951  ip: 173.231.209.32

[pi@pi4b8:~/fossil/libfossil/f-apps]$ f-timeline 
checkin  [681c930f1d48] @ 2021-06-25 06:26:57 by [stephan] branch [trunk] *CURRENT*

	Removed an unnecessary #include.
...

Here are the apps which do that:

noting that most of the real work is done at the library level, but the applications still have a fair amount of stuff to sort out as well.

(3) By sean (jungleboogie) on 2021-06-25 15:38:09 in reply to 1 [link] [source]

Congrats on the progress you've made so far! (Most of which I don't know how I would make use of.)

(4) By Brian Tiffin (btiffin) on 2021-06-25 17:36:13 in reply to 1 [link] [source]

> One of the test apps does what amounts to a partial "rebuild" but, for reasons beyond my ken, it runs orders of magnitude more slowly than a rebuild using the core fossil app (i have a suspicion about the speed difference, but it's unproven).

Just to spread a little performance profiling wisdom, Stephan.

You can find most bottlenecks by casually looking at call stacks. If a routine is eating 90% of the time, then 9 times out of 10 the hotspot will be included in the call stack backtrace when you look.

This can be a very relaxing and utterly satisfying way of improving performance.

Take random, casual looks at the running process call stack and the time sucks begin to reveal themselves. This technique can be used to hone in on problems in the large and in the small. Usually. Data dependent quirks might complicate things a little bit, but the technique can help reveal those too.
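The arithmetic behind this can be sketched in a few lines of C. This is a rough model, assuming each look at the call stack is an independent random sample and that the routine in question is on the stack for a fraction `p` of the run time. The function name is made up for illustration.

```c
#include <assert.h>

/* Probability that a routine which is on the call stack for fraction p
** of the run time shows up in at least one of n independent random
** stack samples: 1 - (1 - p)^n. */
double p_seen_at_least_once(double p, int n){
  double miss = 1.0;  /* probability that every sample so far missed it */
  int i;
  assert(p >= 0.0 && p <= 1.0 && n >= 0);
  for(i = 0; i < n; ++i) miss *= (1.0 - p);
  return 1.0 - miss;
}
```

Under that model a 90% hotspot is caught by a single look 9 times out of 10, and even a routine taking only 30% of the time is caught at least once in ten looks with a probability of roughly 97%.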

Happy hunting.

(5) By Stephan Beal (stephan) on 2021-06-25 18:10:52 in reply to 4 [link] [source]

> Just to spread a little performance profiling wisdom, Stephan.

By all means!

> You can find most bottlenecks by casually looking at call stacks. If a routine is eating 90% of the time, then 9 times out of 10 the hotspot will be included in the call stack backtrace when you look.

In this particular case the underlying algorithms were ported over 1-to-1, so the intuitive expectation is that their performance would be similar. The major code difference, however, is:

Fossil:

some_op();
another_op();
yet_another_op();
...

libfossil:

int rc;
rc = some_op();
if(rc) { update context's error state; goto end; }
rc = another_op();
if(rc) { update context's error state; goto end; }
rc = yet_another_op();
if(rc) { update context's error state; goto end; }
...
end:
cleanup...;
return rc;

ad nauseam. Fossil's fail-fast approach to APIs reduces explicit error handling on the part of the developer to very nearly zero in most cases. The lib API, by virtue of being a lib API, requires error checking on very nearly every line of code :/. e.g. a huge amount of fossil's work is with memory buffers, and every single change to one of those might allocate, ergo might fail to allocate, ergo requires checking its result, ergo has at least as much error checking/handling code as business logic.
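To make the buffer case concrete, here is a minimal compilable sketch of the two styles side by side. The `buf` type and all function names are hypothetical stand-ins, not fossil's or libfossil's actual buffer APIs. The sketch only illustrates how the same three appends look when allocation failure is fatal versus when it must be reported to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, for illustration only. */
typedef struct { char *mem; size_t used, capacity; } buf;

/* Library style: returns 0 on success, non-0 if allocation fails. */
int buf_append_rc(buf *b, const char *src, size_t n){
  assert(b && src);
  if(b->used + n + 1 > b->capacity){
    size_t newCap = (b->used + n + 1) * 2;
    char *re = realloc(b->mem, newCap);
    if(!re) return -1;          /* the caller has to check for this */
    b->mem = re;
    b->capacity = newCap;
  }
  memcpy(b->mem + b->used, src, n);
  b->used += n;
  b->mem[b->used] = 0;          /* keep the content NUL-terminated */
  return 0;
}

/* App style: any failure is fatal, so callers never check anything. */
void buf_append_or_die(buf *b, const char *src, size_t n){
  if(buf_append_rc(b, src, n)){
    fputs("out of memory\n", stderr);
    exit(1);
  }
}

/* App-style caller: straight-line business logic. */
void greet_app(buf *b, const char *name){
  buf_append_or_die(b, "hello, ", 7);
  buf_append_or_die(b, name, strlen(name));
  buf_append_or_die(b, "!", 1);
}

/* Library-style caller: the same logic, with a check after each step. */
int greet_lib(buf *b, const char *name){
  const size_t start = b->used;
  int rc;
  rc = buf_append_rc(b, "hello, ", 7);
  if(rc) goto end;
  rc = buf_append_rc(b, name, strlen(name));
  if(rc) goto end;
  rc = buf_append_rc(b, "!", 1);
end:
  if(rc){                       /* on error, discard the partial output */
    b->used = start;
    if(b->mem) b->mem[start] = 0;
  }
  return rc;
}
```

Both callers produce the same string. The library-style one simply carries a branch after every operation, plus the cleanup path.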

My suspicion (without having spent much time profiling it, other than a few hours digging through callgrind output) is that the truly epic amount of error handling/branching is a significant factor. callgrind, however, can't really tell me whether the branching is as expensive as it seems. It can tell me CPU instruction counts, which can be compared to their counterparts in fossil, but that's a rabbit hole i have yet to go down. "First make it work, then make it fast." Aside from the so-called crosslinking (which is much of what rebuild does), the rest is/seems plenty fast (as fast as fossil, as far as we humans can perceive).

> This can be a very relaxing and utterly satisfying way of improving performance.

i don't know about relaxing, but "it takes all kinds!"

> Take random, casual looks at the running process call stack and the time sucks begin to reveal themselves. This technique can be used to hone in on problems in the large and in the small. Usually. Data dependent quirks might complicate things a little bit, but the technique can help reveal those too.

callgrind (a valgrind tool) and kcachegrind (a UI for the generated data) have been a big help in locating some of the hotspots, and they've been directly responsible for at least a 30-ish percent speedup in the slow parts so far.

(6) By Brian Tiffin (btiffin) on 2021-06-26 16:57:19 in reply to 5 [link] [source]

> i don't know about relaxing, but "it takes all kinds!"

Oops, wrong verb tense. s/relaxing/relaxed/. And agree, it takes more than one technique to win this game.

Plus an omission in the first reply. Glad to see you're feeling well enough to be coding, Stephan. RSI blows.

I'm not sure if any hints were dropped 8ish years ago, but there were experiments with integrating libfossil in GnuCOBOL applications when I first heard about it.

There are grand potentials in this library for programming in the large. Along with applications, I hope programming language designers catch wind of it. A programming environment with baked in self-hosted revision control and community management facilities would be a huge boon to the trade.

Have good, make well.

(7) By Stephan Beal (stephan) on 2021-06-26 17:33:44 in reply to 6 [link] [source]

> I'm not sure if any hints were dropped 8ish years ago, but there were experiments with integrating libfossil in GnuCOBOL applications when I first heard about it.

Not that i recall, but i have the memory of a goldfish or might have tuned out at "cobol" ;).

> There are grand potentials in this library for programming in the large. Along with applications, I hope programming language designers catch wind of it. A programming environment with baked in self-hosted revision control and community management facilities would be a huge boon to the trade.

Indeed, which is why the project needs more than one active developer ;). My RSI demonstrates the very real danger of The Bus Factor. Even now, after nearly 7 years, my hands can't handle typing full time for more than a couple of months before needing a break for a month or more :/.