URL for serving hash of latest check-in on a branch

(1) By Joel Dueck (joeld) on 2024-01-19 14:59:44

Does the Fossil web UI offer a URL which serves only the hash of the latest commit on a given branch?

I would like to use Fossil to host/serve packages for the Racket programming language. If not using git, you have to use “manual deployment”: you provide two URLs, one for a packagename.zip file containing the package contents, and a second packagename.zip.CHECKSUM URL whose response uniquely identifies the package’s release.

The .zip file looks like it will be easy to serve from Fossil, but I can't find anything in the docs for serving just the checksum.

(2.2) By Stephan Beal (stephan) on 2024-01-19 15:14:49 edited from 2.1 in reply to 1

> I can't find anything in the docs for serving just the checksum.

There isn't one. You can get that info from a local checkout easily, but there's currently no web interface which provides just that value:

$ fossil pull
$ fossil whatis -verbose BRANCHNAME | awk '/^artifact:/{print $2}'

Edit: or...

$ fossil pull
$ fossil info subDir | awk '/^hash:/{print $2}'

While we have never guaranteed that the output format of CLI commands is stable enough to rely on in scripts, it's unlikely that either of the above output formats will be suddenly changed.

(5) By Joel Dueck (joeld) on 2024-01-19 16:42:08 in reply to 2.2

OK, I guess I'll just have to do this with a script that updates some unversioned files then.

(6) By Warren Young (wyoung) on 2024-01-19 16:42:14 in reply to 2.2

> there's currently no web interface

Challenge…accepted!

  $ curl -s https://example.com/code/raw/trunk | fossil sha3sum - | cut -f1 -d' '
  eb4efc8ccd3eaaab23ce7d57fad3c665699415bcac021f66ba5bf52c94b664b0
  $ fossil info eb4efc8ccd3ea
  hash:         eb4efc8ccd3eaaab23ce7d57fad3c665699415bc 2024-01-18 15:20:39 UTC
  parent:       ab05475997fc58b967da852a47b90834aba4f149 2024-01-18 14:50:58 UTC
  tags:         trunk
  comment:      Document the --linenum and --invert command-line options to 'fossil diff'. (user: danield)

(7) By Stephan Beal (stephan) on 2024-01-19 16:46:42 in reply to 6

> Challenge…accepted!

Ooooh... that's a clever way to do it. Your prize, sir:

🏆

(8) By Joel Dueck (joeld) on 2024-01-19 16:53:00 in reply to 6

That's cool and all, but not useful for my purpose. I need to supply a URL to the package server that it can use to determine when my package has been updated.

Specifically, I need to supply a URL of the form [package URL].CHECKSUM where [package URL] is the URL for downloading a zip file of the package’s most recent release.

I had thought that whatever URL fossil provides for both these things, I could create web aliases to serve them in the way the package manager expects. But I also find now that although Fossil will serve /zip/trunk/download.zip, you cannot create a web alias for that URL: Fossil complains "/racket-package.zip" aliased to "/zip/trunk/download.zip" but "/zip/trunk/download.zip" does not exist.

(9) By Joel Dueck (joeld) on 2024-01-19 16:58:28 in reply to 8

> Fossil complains "/racket-package.zip" aliased to "/zip/trunk/download.zip" but "/zip/trunk/download.zip" does not exist.

Update: if I set the web alias destination to /zip?name=racket-package.zip it works fine. Needed to read the help text more closely (must be a / followed by a single path element).

(10.1) By Stephan Beal (stephan) on 2024-01-19 17:00:57 edited from 10.0 in reply to 8

> ... you cannot create a web alias for that URL: Fossil complains

IIRC, fossil aliases only support a single path level but do support URL args, so perhaps you can alias /racket-package.zip to /zip?name=trunk/download.zip? That works for me locally, ~~noting that the directory name in the zip file will be from the alias (racket-package in this case)~~ (edit: not true - user error on my part).

(12) By Warren Young (wyoung) on 2024-01-19 17:23:53 in reply to 8

> I need to supply a URL of the form [package URL].CHECKSUM

So write a short shell script under /ext and add an Alias of the required form under Admin, then. Problem solved.

Or, add a hook that updates a /uv entry of the appropriate name on each trunk commit.
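
A minimal sketch of the first suggestion, assuming the repository has Fossil's /ext server-extension mechanism enabled and that extensions run as CGI programs with FOSSIL_REPOSITORY set in their environment. The script name trunk-hash and both helper function names are hypothetical, and the awk pattern relies on the `fossil whatis` output format quoted earlier in the thread, which is not guaranteed to be stable.

```shell
#!/bin/sh
# Hypothetical CGI extension, e.g. saved as trunk-hash in the extension root.

# Print a minimal CGI response whose body is the string given as $1.
cgi_plain_text() {
  printf 'Content-Type: text/plain\r\n\r\n%s' "$1"
}

# Resolve a symbolic name (such as a branch) to a full artifact hash.
# Assumes `fossil whatis` prints a line starting with "artifact:".
resolve_hash() {
  fossil whatis -R "$FOSSIL_REPOSITORY" -verbose "$1" |
    awk '/^artifact:/ { print $2; exit }'
}

# Only answer when invoked through /ext, where FOSSIL_REPOSITORY is set.
if [ -n "${FOSSIL_REPOSITORY:-}" ]; then
  cgi_plain_text "$(resolve_hash trunk)"
fi
```

Saved as an executable file in the extension root, this would be reachable as /ext/trunk-hash. Whether an alias of the required form can point at it is not confirmed here.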

(15) By Joel Dueck (joeld) on 2024-01-19 20:38:49 in reply to 12

> So write a short shell script under /ext and add an Alias of the required form under Admin, then. Problem solved.

OK yeah that might work, if setting an alias to /ext/script doesn’t run afoul of the same limitation that tripped me up earlier (Both LHS and RHS can only have a single path element). I suspect that UV files are the only realistic option here. Thanks

(16) By Andy Bradford (andybradford) on 2024-01-20 17:19:46 in reply to 15

> I suspect that UV files are the only realistic option here.

They are indeed and here are more reasons why:

Even if Fossil did provide a URL for  the hash of the latest commit on a
given branch it  wouldn't meet the needs. The hash  of the latest commit
will never match  the hash of a  "package" that needs to  be provided to
the package  server. The  package server would  take the  hash provided,
hash the "package" which  is a zip file (or tarball)  and they would not
match and I assume the package would be rejected at that point.

Furthermore, there's no guarantee that the same commit being zipped with
/zip or  /tarball will have  the same  hash between the  different times
that  the  package server  might  check,  so  the package  server  might
incorrectly think the  package has been updated when in  fact it has not
(see where this has been discussed already in [1] and [2]).

Basically UV files  allow you to define "package" and  what the "package
hash" will be.

Andy

[1] https://www.fossil-scm.org/forum/forumpost/4903c3fcc1275dea
[2] https://www.fossil-scm.org/forum/forumpost/f82802d7404aff7f
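
A rough sketch of what "defining the package and its hash yourself" could look like with unversioned files. It assumes the `fossil zip`, `fossil sha1sum` and `fossil uv add --as` commands behave as documented; the function name publish_package and the file names are hypothetical, and the SHA1 of the archive is just one possible choice of checksum.

```shell
#!/bin/sh
# usage: publish_package REPO VERSION UVNAME
# Builds one archive of VERSION and stores it, together with a matching
# UVNAME.CHECKSUM file, as unversioned content in REPO.
publish_package() {
  repo=$1 version=$2 uvname=$3
  tmpdir=$(mktemp -d) || return 1
  zipfile=$tmpdir/$uvname
  sumfile=$zipfile.CHECKSUM

  # Build the archive once, so the stored checksum always matches it.
  fossil zip "$version" "$zipfile" -R "$repo" || { rm -rf "$tmpdir"; return 1; }

  # `fossil sha1sum` prints "HASH  FILENAME"; keep only the hash, no newline.
  fossil sha1sum "$zipfile" | cut -d' ' -f1 | tr -d '\n' > "$sumfile"

  fossil uv add "$zipfile" --as "$uvname" -R "$repo" &&
    fossil uv add "$sumfile" --as "$uvname.CHECKSUM" -R "$repo"
  status=$?
  rm -rf "$tmpdir"
  return $status
}
```

For example, `publish_package myrepo.fossil trunk pluto.zip` would make /uv/pluto.zip and /uv/pluto.zip.CHECKSUM available; if the repository is a clone, a `fossil uv sync` would presumably still be needed to push the files to the server.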

(3) By Daniel Dumitriu (danield) on 2024-01-19 15:15:44 in reply to 1

You don't need the check-in hash, you need the package's "checksum," that is, its SHA1 hash.

(4) By Joel Dueck (joeld) on 2024-01-19 16:41:12 in reply to 3

Can you clarify what you mean here? Are you basing “what I need” on the Racket documentation I linked above?

(11) By Daniel Dumitriu (danield) on 2024-01-19 17:01:33 in reply to 4

Obviously.

> Then, upload the archive and its checksum to your site:
>
> scp ‹package›.zip ‹package›.zip.CHECKSUM your-host:public_html

and further

> checksum — a string that identifies different releases of a package. A package can be updated when its checksum changes, whether or not its version changes. The checksum normally can be computed as the SHA1 (see openssl/sha1) of the package’s content.

(13) By Joel Dueck (joeld) on 2024-01-19 19:47:53 in reply to 11

I understand how you may have gotten this idea, but I think you have not read those paragraphs carefully. The docs define a checksum for this purpose as a string that uniquely identifies the package’s release, full stop. This string normally (not always) can be (not must be) computed as the SHA1 of the package’s content.

There is not a requirement that a particular hash algorithm be used, or that it specifically be used on the package contents. This is the straightforward interpretation of the documentation, and Racket’s core developers have already confirmed as much to me personally.

Additionally, when Racket packages are hosted on git, it is in fact the git commit hash that the package server uses as the checksum.

(14) By Daniel Dumitriu (danield) on 2024-01-19 19:50:09 in reply to 13

Fair enough.

(17) By Andy Bradford (andybradford) on 2024-01-20 17:47:28 in reply to 13

> The docs define a checksum for  this purpose as a string that uniquely
> identifies the package’s release, full stop.

I see I did not fully grasp  this in my last response. So the "checksum"
isn't a hash  of the package contents  at all, just a  unique string? If
so, are there limits on the length of the string?

Andy

(18.2) By Warren Young (wyoung) on 2024-01-20 20:51:02 edited from 18.1 in reply to 17

> …are there limits on the length of the string?

And if not, you can just use the URL in the curl command I gave above. The manifest is a unique string identifying that commit, albeit a long one with several LFs in the middle. SHA3 hashing it merely condenses that down to the same format Fossil uses to identify that manifest, thus the commit.

(19) By Andy Bradford (andybradford) on 2024-01-20 21:01:19 in reply to 18.2

> The manifest is a unique string identifying that commit, albeit a long
> one with several LFs in the middle.

Yes,  that's precisely  what I  was thinking.  If any  string that  only
changes  when "trunk"  changes is  as  good as  the next,  why not  just
/raw/trunk ?  It's a  fairly big  string, but  if it's  not going  to be
displayed anywhere, and  is only used for the purpose  of comparing when
"trunk" changes, it can work.

Andy
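
The comparison idea above could be sketched as a small filter that remembers the last manifest it saw and reports whether the new one differs. The function name changed_since_last and the state-file location are hypothetical; nothing here is specific to Fossil beyond the assumption that the fetched text changes only when trunk does.

```shell
#!/bin/sh
# usage: some-fetch-command | changed_since_last STATEFILE
# Returns 0 if stdin differs from the saved copy (and saves the new copy),
# 1 if it is identical, 2 if a temporary file could not be created.
changed_since_last() {
  state=$1
  new=$(mktemp) || return 2
  cat > "$new"
  if [ -f "$state" ] && cmp -s "$new" "$state"; then
    rm -f "$new"
    return 1
  fi
  mv "$new" "$state"
  return 0
}
```

With the placeholder URL used earlier in the thread, a client might run `curl -s https://example.com/code/raw/trunk | changed_since_last "$HOME/.trunk-manifest" && echo "trunk changed"`.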

(20) By Joel Dueck (joeld) on 2024-01-26 16:18:29 in reply to 19

Thanks, I'll look into that. But it once again brings me up against a question relating to the limitations of Fossil's URL aliases: the alias target can only have a single path element, which means I have to use the ?param=val format for passing parameters.

The docs for the raw page say I can use /raw/ARTIFACTID or /raw?ci=BRANCH&filename=NAME. I can't use /raw/trunk as the target of a URL alias, but I also can't use /raw?ci=trunk — it doesn’t work.

Is there a way to get the raw content of the last trunk checkin using normal URL parameters?

As a last resort I could fudge this in my Apache config but it would be nice to have it self-contained within the repo.

(21) By Stephan Beal (stephan) on 2024-01-26 16:40:29 in reply to 20

> The docs for the raw page say I can use /raw/ARTIFACTID or /raw?ci=BRANCH&filename=NAME. I can't use /raw/trunk as the target of a URL alias, but I also can't use /raw?ci=trunk — it doesn’t work.

This is a limitation of the URL path dispatcher, which uses only the first /pathComponent for the dispatching and converts /everything/after/that internally to ?name=everything/after/that. It could certainly be expanded to support multi-path dispatching, but (A) nobody's volunteered to do it and (B) there's a risk of breakage in hypothetical corner cases where /foo/bar/baz resolves differently under the new scheme.
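
The path-to-parameter conversion described above can be mimicked in a few lines of shell, which may help when working out an alias target by hand. This only illustrates the described behaviour and is not Fossil's dispatcher code; the function name to_alias_target is hypothetical.

```shell
#!/bin/sh
# usage: to_alias_target /page/rest/of/path
# Keeps the first path component as the page name and moves everything
# after it into a name= query parameter.
to_alias_target() {
  path=${1#/}                      # drop the leading slash
  case $path in
    */*) printf '/%s?name=%s\n' "${path%%/*}" "${path#*/}" ;;
    *)   printf '/%s\n' "$path" ;; # single component: nothing to convert
  esac
}
```

Under that reading, /raw/trunk corresponds to /raw?name=trunk and /zip/trunk/download.zip to /zip?name=trunk/download.zip.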

(24.1) By Joel Dueck (joeld) on 2024-01-26 17:51:45 edited from 24.0 in reply to 21

Right, I’m not asking about changing the dispatcher. I’m saying the /raw page apparently does not have a normal-URL-parameter equivalent of /raw/trunk. If not using /raw/ARTIFACTID, then /raw can only serve the contents of individual files. Might it not be simpler to add support for, say, /raw?artifact=trunk?

[EDIT: a closer reading of your comment led me to try /raw?name=trunk, which actually does work! So I can use that as the target of an alias. It would be great if the docs were updated to include this info.]

(25) By spindrift on 2024-01-26 18:24:33 in reply to 24.1

I absolutely agree with your wider point that the URL alias docs are not exactly... long-winded. Or even thorough. One might even accuse them of being terse and difficult to find...

(22) By Daniel Dumitriu (danield) on 2024-01-26 16:50:22 in reply to 20

This works:

https://fossil-scm.org/home/raw/trunk

What do you expect instead?

(23) By spindrift on 2024-01-26 17:42:21 in reply to 22

I think the problem here is that Fossil can't use that as a URL alias.

(26) By Joel Dueck (joeld) on 2025-03-21 17:16:07 in reply to 1

Update, one year later:

I just revisited this issue and tried deploying a Racket package from a public Fossil repo using these URL aliases:

/pluto.zip → /zip?name=pluto.zip
/pluto.zip.CHECKSUM → /raw?name=trunk

All the detail can be read at the test repo I made for this purpose. The upshot is that it doesn’t work (on Racket’s end), even though Racket devs initially indicated it should work.

It does still seem funny that Fossil doesn't offer a quick way to get only the hash of a named artifact, which would come in handy for “dumb” clients that just want to know if trunk has had any additional commits since last time it was checked. But presumably if I wrote a manifest parser for the Racket package manager, the manifest obtained from /raw?name=trunk could be used for this purpose (looking at the Z card).

(27.1) By Stephan Beal (stephan) on 2025-03-21 17:38:42 edited from 27.0 in reply to 26

> ... the manifest obtained from /raw?name=trunk could be used for this purpose (looking at the Z card).

The Z-card would work for the purpose of determining whether it's changed since the last checkin but would not be the hash of the checkin itself. The Z-card is an MD5 hash of all content of the manifest itself up to, and including, the newline before the start of the Z-card.

AFAIK there's currently no web-accessible way, short of web scraping, to convert a fossil name to its SHA hash (edit: except for Warren's approach shown up-thread). Even if there were, the name could refer to something different 0.1 seconds later.
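
The Z-card description above can be sketched as two small filters plus a check. The function names are hypothetical, `md5sum` is assumed to be the GNU tool (BSD systems ship `md5` instead), and the sketch assumes a plain manifest without a PGP clearsign wrapper.

```shell
#!/bin/sh
# Print every line that precedes the Z-card, i.e. the content it covers.
manifest_body() {
  sed '/^Z /,$d'
}

# Print the MD5 value recorded in the Z-card.
manifest_zcard() {
  awk '$1 == "Z" { print $2; exit }'
}

# usage: manifest_is_intact FILE
# Succeeds when the MD5 of the body matches the Z-card (assumes md5sum).
manifest_is_intact() {
  expected=$(manifest_zcard < "$1")
  actual=$(manifest_body < "$1" | md5sum | cut -d' ' -f1)
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A client that only wants a change marker could store the output of `manifest_zcard` and compare it on the next fetch, keeping in mind that it is not the check-in hash itself.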

(28) By Florian Balmer (florian.balmer) on 2025-03-21 18:05:01 in reply to 27.1

> ... the name could refer to something different 0.1 seconds later ...

Does that matter? For any system, the latest "item" may become outdated at any time, especially in a distributed SCM. And even with Warren's approach you're recommending.

This can happen all the time, even on our local machines when working with files. (That's why I prefer not checking whether a file exists in code or scripts, but just do the intended action, and be prepared for "not found" errors. Because the file may have been deleted in between the check whether it exists and the use, so the check is usually redundant. Probably fast, with today's file system caches, but still redundant.)

TOCTTOU, or so.

(29) By Joel Dueck (joeld) on 2025-03-21 19:07:30 in reply to 27.1

> AFAIK there's currently no web-accessible way, short of web scraping, to convert a fossil name to its SHA hash

For the simple purpose of determining whether or not the latest checkin on trunk matches the last one fetched, I suppose I could fire up a complete XML parser and get the hash from the <item> element in /timeline.rss?y=ci&tag=trunk&n=1. But it seems like overkill for this purpose.

Maybe a better idea: expose the functionality of the fossil timeline command (not the web page) to the web, e.g. /timeline.raw?b=trunk&n=1&fmt=full. /timeline.json might be even more useful, not sure if that represents too novel of a direction/precedent though.

For comparison, the Git HTTP protocols make it straightforward to fetch info about available references as structured data, e.g. info/refs?service=git-upload-pack (1)

(34) By Andy Bradford (andybradford) on 2025-03-22 00:07:21 in reply to 29

> I suppose I could fire up a  complete XML parser and get the hash from
> the <item> element

Or just do something like:

curl -s 'https://www.fossil-scm.org/home/timeline.rss?y=ci&tag=trunk&n=1' | sed -e '/pubDate/d' | md5

Record that MD5 hash somewhere then next time you fetch, compare to previous MD5 hash.

No need to parse XML (with the exception of using sed to chop out the pubDate).

Andy

(35) By Andy Bradford (andybradford) on 2025-03-22 00:34:05 in reply to 34

> Record that MD5 hash somewhere then next time you fetch

Or just get the guid from the RSS:

curl -s 'https://www.fossil-scm.org/home/timeline.rss?y=ci&tag=trunk&n=1' | grep guid | sed -ne 's,^.*info/\([a-f0-9][a-f0-9]*\)</guid>.*$,\1,p'

There's the vaunted hash.

Andy

(30) By Richard Hipp (drh) on 2025-03-21 20:04:09 in reply to 26

> It does still seem funny that Fossil doesn't offer a quick way to get only the hash of a named artifact,

Fossil has the "fossil whatis" command that you can invoke from the command line that gives you all kinds of information about an object. It would be relatively simple to use that command as a prototype to write a webpage that returned JSON with equivalent information. The reason that no such page exists is that nobody who wants it (I'm looking at you Joel) has taken the time to implement it, and among the people who have been adding features to Fossil over the past year, none of them care one scintilla about this.

If this is something that is important to you and you can make a reasonable case for why it is important, then send us patches and we will give them careful consideration.

(31) By Joel Dueck (joeld) on 2025-03-21 21:47:06 in reply to 30

> Fossil has the "fossil whatis" command that you can invoke from the command line that gives you all kinds of information about an object. It would be relatively simple to use that command as a prototype to write a webpage that returned JSON with equivalent information. …

Thanks and sure — I have seen several variations of this speech over the years of following the mailing list/forums ;) Maybe I should have refrained from editorializing as I did there. For my part, I take great care to get a clear expression of interest and preferred direction before investing my time. I don't want to spend time on learning a codebase and implementing an enhancement only to have it turned down.

If and when I have time, I may take a stab at it. At that point it will be good to have this thread to refer to.

I note that Fossil already has a whatis web page, so a JSON-serving endpoint would need to distinguish itself in some way. There could also be other JSON-serving functions added in the future. Is there some naming/parameter/other convention that should be followed here to keep things clean and clear?

(32.1) By Stephan Beal (stephan) on 2025-03-21 22:09:23 edited from 32.0 in reply to 31

> I note that Fossil already has a whatis web page, so a JSON-serving endpoint would need to distinguish itself in some way.

/whatis?json should do the trick.

Edit: with a caveat: any errors triggered by fossil code not very specifically written to return them in JSON form would emit them in HTML form.

> Is there some naming/parameter/other convention that should be followed here to keep things clean and clear?

There's not currently one and, AFAIK, we don't have any collision with a "json" argument to any commands, so that might be a useful convention for that purpose.

We have a dedicated /json API but (A) it's not enabled by default and (B) it was added long before sqlite could natively speak JSON. Ideally it would be reimplemented on top of sqlite-native JSON support, but we've yet to have any volunteers for that and it would be a relatively large undertaking.

(33) By Richard Hipp (drh) on 2025-03-21 23:55:41 in reply to 31

Hints For Adding A New /jwhatis API

I suggest using the /juvlist interface at src/unversioned.c:704-760 as a template, as it is short and it generates JSON.
Start by copying that code somewhere else (I suggest next to the "whatis" command in src/info.c, but it doesn't matter) and then change the "WEBPAGE: juvlist" to "WEBPAGE: jwhatis" ("J" for "Joel" not "JSON" ;-)). Also change the function name. After that it should compile and the new webpage will be available, though obviously it will still work the same as /juvlist since you copied exactly that code.
Slowly start making changes to the copy to make it work more like what you want, compiling and testing as you go, using the "whatis" implementation as your guide to how to get the data you need.
Follow the indentation, commenting, and variable naming conventions you find in the "whatis" and "juvlist" implementations that you are copying.
Remember that each HTTP request forks a separate process which exits right after the HTTP reply. So /jwhatis has the entire address space entirely to itself. You can use static variables and you don't need to stress over memory leaks, yielding to other threads, and similar. Way, way, easier than implementing a new API using a so-called "modern" framework.

(38) By Joel Dueck (joeld) on 2025-03-22 20:16:13 in reply to 33

Thank you, this is a great start. Much appreciated.

I dimly recall filling out a CLA years ago but am unsure if I sent it in or faxed it or whatever the method was at the time.

(40) By Richard Hipp (drh) on 2025-03-22 23:10:18 in reply to 38

All CLAs are stored in a file folder in the safe here in my office. I just checked, and I do not have one for you, at least not among those whose last name starts with "D". If I do have your CLA, then it was mis-filed.

(36.1) By Andy Bradford (andybradford) on 2025-03-22 15:16:19 edited from 36.0 in reply to 1

> Does the Fossil web  UI offer a URL which serves only  the hash of the
> latest commit on a given branch?

How about enhancing the /whatis page to this:

https://www.fossil-scm.org/home/info/12ecd0ab38a94d42

Works for me:

$ echo $(curl -s 'http://localhost:8080/whatis/12ecd0ab38a94d42?hash')
12ecd0ab38a94d4299a22681ff387c37b5bda4a9140a32ead805dd5d0ff91914

Should the output include a newline?

And of course what you were really looking for was something like:

$ echo $(curl -s 'http://localhost:8080/whatis/trunk?hash')
d62ca2b85f1eb7d98f58860736dfe52a2992f00828524bb91b850a7071f68e8c

Because /whatis will resolve symbolic names too.

Andy

(37) By Joel Dueck (joeld) on 2025-03-22 20:13:59 in reply to 36.1

This would work perfectly! I'd suggest no newline, but the client can always trim the string of course.

(39) By Andy Bradford (andybradford) on 2025-03-22 20:24:14 in reply to 37

> This would work perfectly!

I'll wait a couple  of days for others to comment and  then go ahead and
merge it as-is into trunk if there are no concerns.

Thanks,

Andy