Fossil Forum

Stable download URL?

(1) By Jack Hill (jackhill) on 2022-07-05 14:38:04 [source]

In the Fossil package for GNU Guix, the source tarball we've been using is not stable. That is, we record a checksum of the tarball when we package a release, but at a later date the checksum at the same URL is different. We're currently using URLs of the form: https://www.fossil-scm.org/home/tarball/f48180f2ff3169651a725396d4f7d667c99a92873b9c3df7eee2f144be7a0721/fossil-src-2.17.tar.gz

Is there a better URL for us to use that will be stable over time?

(2) By Stephan Beal (stephan) on 2022-07-05 15:14:02 in reply to 1 [link] [source]

> Is there a better URL for us to use that will be stable over time?

See 3fc9f49485125fce for a long discussion on that topic initiated by a package maintainer such as yourself.

(3) By Jack Hill (jackhill) on 2022-07-05 15:15:03 in reply to 2 [link] [source]

Thanks!

(4) By Jack Hill (jackhill) on 2022-07-05 15:34:11 in reply to 2 [link] [source]

Now, having read that thread, it seems to me that it is about finding the download URLs, which is an aid to packagers in keeping all the packages we care for up to date. However, my question is different: having found the URL, we store it, and a checksum of what is downloaded in our package definition so that folks can re-build that package at a later date. However, it seems that the contents of the URL change (despite having the hash as part of the URL?), which complicates re-building the package.

Is there a URL that we can use whose contents are guaranteed not to change? Does that make sense, or am I missing something?

Thanks!

(5) By Martin Gagnon (mgagnon) on 2022-07-05 15:50:26 in reply to 4 [link] [source]

Maybe this thread, then.

(6) By Stephan Beal (stephan) on 2022-07-05 16:00:39 in reply to 4 [link] [source]

> Is there a URL that we can use whose contents are guaranteed not to change?

The short answer is "no." Even if fossil itself guaranteed hash-compatible exports of data on each export (which it doesn't), fossil cannot guarantee (for example) that passing the same set of bytes through libz produces the same hash on every execution on every platform.

Similarly, fossil permits "amending" checkins after they're made, e.g. updating their timestamps. That, in turn, can affect the results of the tar/zip output.

The longer answer is unfortunately also "no." There is no 100% reliable way for fossil to make such guarantees. A single bit of difference in how it generates a tar/zip file will create a different hash while creating a semantically identical archive in which every file has the same hash as its counterpart in "that other archive," and we cannot lock fossil's tar/zip creation into a 100% code freeze simply to avoid that eventuality. Even if we could, any given update to zlib may well cause a different hash to be generated.
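To illustrate the point, here is a small Python sketch. It is not how fossil builds its archives; the function names and file names are made up for the example. It packs the same files twice, changing only the timestamp stored in the tar headers, and compares the hash of the whole archive with the hashes of the files inside it.

```python
import gzip
import hashlib
import io
import tarfile


def make_tarball(files, mtime):
    """Build an in-memory .tar.gz from {name: bytes}; mtime goes into each tar header."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(files[name]))
    # Fix the gzip header timestamp so only the tar metadata varies.
    return gzip.compress(raw.getvalue(), mtime=0)


def file_hashes(tarball):
    """Return {name: sha256 hex digest} for every regular file in the tarball."""
    with tarfile.open(fileobj=io.BytesIO(tarball), mode="r:gz") as tar:
        return {
            member.name: hashlib.sha256(tar.extractfile(member).read()).hexdigest()
            for member in tar.getmembers()
            if member.isfile()
        }


# Hypothetical file contents, for illustration only.
files = {"README": b"hello\n", "src/main.c": b"int main(void) { return 0; }\n"}
first = make_tarball(files, mtime=0)
second = make_tarball(files, mtime=1)

# The archives differ as byte streams, so their checksums differ...
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())  # False
# ...but every file inside hashes to the same value.
print(file_hashes(first) == file_hashes(second))  # True
```

A one-second change in a metadata field is enough to invalidate a checksum recorded against the archive, even though nothing a build would consume has changed.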

This has come up before and we collectively understand how that's an annoyance for package maintainers, but it's not a problem we can sensibly solve for folks who choose to pull files directly out of the SCM, as opposed to downloading prebuilt binaries or source bundles. Those do have stable hashes but have indeterminate and finite lifetimes: fossil-scm.org only retains a very limited number of previous pre-built binaries/source bundles.

(8) By Andy Bradford (andybradford) on 2022-07-06 02:45:19 in reply to 6 [link] [source]

> There is no 100% reliable way for fossil to make such guarantees.

Fossil the binary, code, whatever does not, but it certainly would be possible to export a zip from a given hash and store that as a static file in the unversioned area. Then package maintainers can obtain it from there rather than using the developers' versioning tags as a method to locate the appropriate source.

Andy

(9) By Stephan Beal (stephan) on 2022-07-06 03:20:14 in reply to 8 [link] [source]

> Fossil the binary, code, whatever does not, but it certainly would be possible to export a zip from a given hash and store that as a static file in the unversioned area

Richard does so, but he can't reasonably be expected to store an arbitrary number of old releases in there, especially when it's being done only to solve a problem which is completely external to this project. As he is our only unversioned-capable user, those files become part of his local clone and backups.

Although the hash of an archive generated on the fly is not stable, the contents of the archive are, with the exception of the generated files which get injected into it: manifest and manifest.*. (The former has been known to change in the past in order to ensure that it cannot be treated as a manifest if it's re-imported in a repo, but that is no longer the case. manifest.tags can change at any time.)

Unpacking the archive and comparing the hashes of the files with those from the manifest is the intended way for downstream folks to validate a fossil-exported archive, noting that the manifest itself is doubly-cryptographically sealed and can itself be validated by sha3 hashing it, the result of which is the version hash of the archive export.
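A minimal Python sketch of that check, under several assumptions that should be verified against fossil's file format documentation: the manifest is a baseline manifest (no B-card), each file is listed on a line of the form `F <name> <hash> ...`, file names contain no escaped characters, 40-hex-digit hashes are SHA1 and 64-hex-digit hashes are SHA3-256, and both agree with Python's hashlib. All function names here are hypothetical.

```python
import hashlib


def parse_f_cards(manifest_text):
    """Return {filename: expected_hash} from the F-cards of a manifest."""
    expected = {}
    for line in manifest_text.splitlines():
        fields = line.split(" ")
        if fields[0] == "F" and len(fields) >= 3:
            expected[fields[1]] = fields[2]
    return expected


def artifact_hash(data, hex_length):
    """Hash data with SHA1 (40 hex digits) or SHA3-256 (64 hex digits)."""
    if hex_length == 40:
        return hashlib.sha1(data).hexdigest()
    if hex_length == 64:
        return hashlib.sha3_256(data).hexdigest()
    raise ValueError(f"unrecognised hash length: {hex_length}")


def find_mismatches(manifest_text, files):
    """Return sorted names that are missing or whose content differs from the F-card."""
    bad = []
    for name, want in parse_f_cards(manifest_text).items():
        data = files.get(name)
        if data is None or artifact_hash(data, len(want)) != want:
            bad.append(name)
    return sorted(bad)


def manifest_matches_version(manifest_bytes, version_hash):
    """Check that hashing the manifest itself yields the check-in's version hash."""
    return artifact_hash(manifest_bytes, len(version_hash)) == version_hash.lower()
```

Here `files` maps each unpacked path to its bytes. A packager would first call `manifest_matches_version` with the hash taken from the download URL, then `find_mismatches` on the unpacked tree; an empty list means every listed file matches.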

(10) By Stephan Beal (stephan) on 2022-07-06 03:57:15 in reply to 9 [link] [source]

> Unpacking the archive and comparing the hashes of the files with those from the manifest is the intended way for downstream folks to validate a fossil-exported archive, noting that the manifest itself is doubly-cryptographically sealed and can itself be validated by sha3 hashing it, the result of which is the version hash of the archive export.

For completeness's sake: the very bottom section of this page demonstrates a very closely-related check which can also be used to validate the contents of an archive via the so-called R-card, noting that not all repositories add an R-card to manifests because it's expensive to calculate (but fossil's does).

(12) By Jack Hill (jackhill) on 2022-07-06 06:13:23 in reply to 10 [link] [source]

Thanks for the pointers on how to validate archives. I'll have to think about if and how I can make use of those features, but it definitely increased my knowledge about the exported archives which is always good. I certainly wasn't asking for static archives to be created if they didn't exist, but I know that some projects out there do create and keep them, so didn't want to be missing out if they did exist. I think that the Nix person's comment is on the right track, and similar to how we work with other SCMs when separate, stable archives aren't available (or when it's just more convenient — I don't view pulling from the SCM as a second tier option, and sometimes prefer it over separate archives).

In the meantime, we do have a separate fallback to generate bit-identical archives using metadata generated with Disarchive and file data from Software Heritage. As it is, we have a (quite) long-standing issue about this fallback not working in all scenarios, which meant I couldn't take advantage of it, and which is why I ended up here.

Thanks again for your time and explanations, they really do help.

(7) By anonymous on 2022-07-05 23:21:31 in reply to 4 [link] [source]

As a user of a sibling project (Nix/NixOS), I think you may find their Fossil fetcher useful.

As already noted, the tarball generated isn't guaranteed to remain consistent between generations (the same is true for other SCM providers, GitHub, GitLab, etc.), however, the hash between checkouts/versions should stay the same (assuming a fossil rebuild hasn't taken place in-between).

Therefore, your best bet for consistency/reproducibility is to clone the particular repository and try to open it (at hash X.Y.Z version).
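A rough Python sketch of that approach, assuming the `fossil` binary is on the PATH and that `fossil clone URL FILE` followed by `fossil open FILE VERSION` (run inside an empty directory) behaves as described in fossil's own help. The function names and directory layout are made up for the example; this is not the Nix fetcher.

```python
import subprocess
from pathlib import Path


def fossil_commands(url, repo_file, version):
    """Return the two fossil invocations: clone the repository, then open a version."""
    return [
        ["fossil", "clone", url, str(repo_file)],
        ["fossil", "open", str(repo_file), version],
    ]


def fetch_checkout(url, version, dest):
    """Clone url into dest/repo.fossil and check out version into dest/src."""
    dest = Path(dest).resolve()
    src = dest / "src"
    src.mkdir(parents=True, exist_ok=True)
    clone_cmd, open_cmd = fossil_commands(url, dest / "repo.fossil", version)
    subprocess.run(clone_cmd, check=True)
    # "fossil open" checks the files out into the current working directory.
    subprocess.run(open_cmd, check=True, cwd=src)
    return src
```

The packager would then record a hash of the checked-out tree rather than of a generated tarball, so differences in tar or zlib output no longer matter.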

(11) By Jack Hill (jackhill) on 2022-07-06 05:44:58 in reply to 7 [link] [source]

> As a user of a sibling project (Nix/NixOS), I think you may find their Fossil fetcher useful.

> As already noted, the tarball generated isn't guaranteed to remain consistent between generations (the same is true for other SCM providers, GitHub, GitLab, etc.), however, the hash between checkouts/versions should stay the same (assuming a fossil rebuild hasn't taken place in-between).

> Therefore, your best bet for consistency/reproducibility is to clone the particular repository and try to open it (at hash X.Y.Z version).

Cool! Yes, we have similar solutions for other SCMs, so doing the same for Fossil would be pretty cool (though I'm making no promises about implementing it ☺)!