Successfully cloned massive NetBSD src fossil repository with resume
(1) By Andy Bradford (andybradford) on 2023-12-17 05:54:06
I've been running and testing the code from the clone-resume branch for a couple of weeks now, trying to clone the massive NetBSD src fossil repository[1]. I didn't think it would take as long as it did, or I would probably have paid more attention to when I actually started the clone. However, I think I can get the picture from the /rcvfromlist page, which has 41 entries, the first of which is:

    1 2023-11-29 04:57:18 amb sha3 163.172.4.16

Given that initial entry, I've been trying to clone this repository for over 2 weeks. I believe that the 41 entries each represent an interrupted clone and the resume that followed. Most of the interruptions were self-inflicted as I tested the ability to resume; however, there was one unexpected 30 minute network outage that interrupted the clone.

The Rebuild phase of the clone started sometime on Monday the 11th of December and just finished sometime today, the 16th of December. For some reason, the output stopped reporting on progress when it got to 48.9% and just sat there for 3-4 days. It may have continued reporting sometime during the night when I wasn't paying attention. The delta compression phase took much less time than the rest of the rebuild but still took all day today to complete.

Here is the output from the last resumed clone operation followed by the rebuild and delta compression. There are 2 lines representing the percentage complete because I suspended the rebuild so I could do something else with the computer; it was taxing the I/O quite a bit.

    Clone done, wire bytes sent: 28323 received: 1214181826 remote: src.fossil.netbsd.org (163.172.4.16)
    Uncompressed payload sent: 10834 received: 1214152124
    Rebuilding repository meta-data...
    48.9% complete...
    100.0% complete...
    Extra delta compression... 757855 deltas save 11,729,081,754 bytes
    Vacuuming the database...
    project-id: 3109ce34a3d43fd786dcf0133c6a96a6de40c573
    server-id: fbbbc5a67d6d465dcef5d86f1d05fe52d2352a1e
    admin-user: amb (password is "BMhnetHYKr")

Here are some statistics:

    $ time fossil dbstat -R netbsd-src.fossil
    project-name: NetBSD src
    repository-size: 50,270,945,280 bytes
    artifact-count: 4,304,686 (stored as 421,870 full text and 3,882,816 deltas)
    artifact-sizes: 60,499 average, 29,093,223 max, 260,430,148,484 total
    compression-ratio: 5:1
    check-ins: 1,954,758
    files: 526,433 across all branches
    wiki-pages: 0 (0 changes)
    tickets: 1 (1 changes)
    events: 0
    tag-changes: 0
    latest-change: 2023-12-11 21:57:41 - about 5 days ago
    project-age: 11,479 days or approximately 31.43 years.
    project-id: 3109ce34a3d43fd786dcf0133c6a96a6de40c573
    schema-version: 2015-01-24
    fossil-version: 2023-12-08 15:30:14 [29e9e84a1e] [2.24] (clang-13.0.0 )
    sqlite-version: 2023-11-24 11:41:44 [ebead0e723] (3.44.2)
    database-stats: 6,136,590 pages, 8192 bytes/pg, 0 free pages, UTF-8, delete mode
    0m10.89s real 0m04.55s user 0m05.76s system

Here is my attempt to pull today after the rebuild was complete:

    $ fossil pull -R netbsd-src.fossil
    Pull from https://src.fossil.netbsd.org/
    Round-trips: 4 Artifacts sent: 0 received: 116
    Pull done, wire bytes sent: 6366 received: 49913 remote: 163.172.4.16

Andy

[1] https://www.fossil-scm.org/forum/forumpost/8cbc83e5d08a86b4
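As a sanity check on the dbstat figures above, the reported numbers can be cross-checked against each other. This is only a sketch: the variable names are made up for illustration, and the values are copied by hand from the output in this post.

```python
# Figures copied from the `fossil dbstat` output in this post.
repository_size = 50_270_945_280         # bytes on disk
artifact_count = 4_304_686
full_text = 421_870
deltas = 3_882_816
total_artifact_bytes = 260_430_148_484   # "total" under artifact-sizes
pages = 6_136_590
page_size = 8192

# Full-text artifacts plus deltas should account for every artifact.
stored = full_text + deltas

# Page count times page size should reproduce the repository size.
db_bytes = pages * page_size

# Average artifact size and the ratio of artifact bytes to bytes on disk.
average_size = total_artifact_bytes / artifact_count
compression = total_artifact_bytes / repository_size

print(f"stored artifacts: {stored:,}")
print(f"database bytes:   {db_bytes:,}")
print(f"average artifact: {average_size:,.0f} bytes")
print(f"compression:      {compression:.2f}:1")
```

The page arithmetic lands exactly on the reported repository size, and the ratio comes out a little above 5, which matches the rounded "5:1" that dbstat prints.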
(2) By Florian Balmer (florian.balmer) on 2023-12-17 11:39:13 in reply to 1
My first reaction:
https://i.giphy.com/1M9fmo1WAFVK0.webp
My second reaction:
Is the finished clone really usable? Is it possible to open a new check-out in reasonable time? Is it fast to detect changed files? Is browsing the repository through the web UI still a pleasant experience? Can you load CLI diff and timeline views without having to wait? If all answers are "yes", I'm definitely impressed!
(3) By Stephan Beal (stephan) on 2023-12-17 12:38:38 in reply to 2
@Andy: congratulations!!! That improvement is long overdue!
@Florian:
> Is the finished clone really usable?
Whether that particular repo was ever really "usable" with fossil is a matter of debate ;).
(For those who don't know: the netbsd repository is the single-largest known use of fossil.)
> Is it possible to open a new check-out in reasonable time? Is it fast to detect changed files? Is browsing the repository through the web UI still a pleasant experience?
Fossil's manifest format, which lists all files in each checkin, doesn't scale well to repos of that size, causing some pain in all of the cases you mention. The "delta manifest" format was, IIRC, added because of that repository, but that format still requires (for many (most?) purposes) loading both the small delta manifest and its full-length parent manifest, so the delta manifest offers a storage savings but does not improve runtimes for many cases (e.g. browsing /dir).
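To illustrate why a delta manifest does not help runtimes, here is a toy model. It is not Fossil's code or its on-disk format: the `Manifest` class, the `resolve` function, the file names, and the hashes are all hypothetical, and using `None` to mark a deleted file is a simplification. The point is only that the complete file list cannot be produced without first loading the full-length parent manifest.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Manifest:
    """Toy manifest: file name -> artifact hash (None marks a deletion)."""
    files: dict[str, Optional[str]]
    baseline: Optional[Manifest] = None   # set only on a delta manifest


def resolve(manifest: Manifest) -> dict[str, Optional[str]]:
    """Return the complete file list of a check-in."""
    if manifest.baseline is None:
        return dict(manifest.files)
    # A delta manifest only lists differences, so the whole parent
    # manifest has to be loaded before those differences are applied.
    merged = dict(manifest.baseline.files)
    for name, file_hash in manifest.files.items():
        if file_hash is None:
            merged.pop(name, None)
        else:
            merged[name] = file_hash
    return merged


# Hypothetical data: a three-file parent and a small delta on top of it.
full = Manifest({"Makefile": "a1", "bin/ls/ls.c": "b2", "bin/cat/cat.c": "c3"})
delta = Manifest(
    {"bin/ls/ls.c": "d4", "bin/cat/cat.c": None, "bin/cp/cp.c": "e5"},
    baseline=full,
)

print(resolve(delta))
```

The delta itself holds three entries, but `resolve` still copies every entry of the parent, so with hundreds of thousands of files the work is dominated by the parent, not by the delta.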
(8) By Vadim Goncharov (nuclight) on 2024-04-28 00:00:51 in reply to 3
> Fossil's manifest format, which lists all files in each checkin, doesn't scale well to repos of that size, causing some pain in all of the cases you mention. The "delta manifest" format was, IIRC, added because of that repository, but that format still requires (for many (most?) purposes) loading both the small delta manifest and its full-length parent manifest, so the delta manifest offers a storage savings but does not improve runtimes for many cases (e.g. browsing /dir).
Seems some new VCS is needed, FossilNG perhaps? :) Or how should this problem be fixed? Is it an inevitable architectural problem? Git is able to handle such repos, but I would not say git's format is fundamentally different from Fossil's...
(9.1) By Stephan Beal (stephan) on 2024-04-28 06:51:03 edited from 9.0 in reply to 8
> FossilNG perhaps?
See the like-named wiki page at src:/wiki?name=Fossil-NG
> Git is able to handle such repos, but I would not say git's format is fundamentally different from Fossil's...
i know literally nothing about how git stores the list of files/hashes associated with each checkin, so can't comment on it.
However, i can, with some degree of authority, comment on...
Fossil has an optimization to reduce the number of files listed in a manifest, drastically decreasing the size of manifests for repos like the pkgsrc one. They're called delta manifests and, IIRC, they were added specifically because of the NetBSD pkgsrc repo. In practice, however, they don't save much space and they cost more RAM. See:
src:/doc/trunk/www/delta-manifests.md
Delta manifests are a trade-off, in any case. The main fossil repo and its sibling, the sqlite repo, are both explicitly configured to not generate delta manifests because their use makes it far more difficult for downstream clients to validate the contents of their downloads (see the above article for why).
(4) By Warren Young (wyoung) on 2023-12-18 01:32:59 in reply to 2
But why?
Presumably because it's a fine torture-test of the new resumable cloning feature.
(5.1) By Andy Bradford (andybradford) on 2023-12-19 15:30:42 edited from 5.0 in reply to 2
> Is the finished clone really usable?

It seems to be. Certain operations take noticeably longer than others, so there is definitely some patience required. Also, one contributing factor is that I'm running this on OpenBSD, and its disk I/O isn't necessarily the fastest in some cases, so it's hard to say that all the slowness is due to Fossil and the size of the repository.

That being said, here are some numbers. "fossil open" has to produce over 200,000 files on my filesystem:

    $ time fossil open netbsd-src.fossil --workdir netbsd-src > netbsd-src.out
    17m58.50s real 1m31.82s user 2m24.87s system
    $ find netbsd-src -type f | wc -l
    208798
    $ time fossil status
    repository: /home/amb/Downloads/netbsd-src.fossil
    local-root: /home/amb/Downloads/netbsd-src/
    config-db: /home/amb/.fossil
    checkout: a83039c6283ce6e2276dd568192d7a50c0474b90 2023-12-17 00:19:11 UTC
    parent: f5a17444ddea101eabb78fbb2126955796c57718 2023-12-16 23:40:33 UTC
    tags: trunk
    comment: tests/make: add basic tests for the ':M' modifier (user: rillig)
    WARNING: multiple open leaf check-ins on trunk:
    (1) 2023-12-17 00:19:11 [a83039c628] (current)
    [39 more open leaves for trunk hidden]
    0m12.95s real 0m00.92s user 0m07.79s system

Diff with no files changed yet:

    $ time fossil diff
    0m07.97s real 0m00.69s user 0m06.85s system

Diff with a random subset of files changed:

    $ time fossil status
    repository: /home/amb/Downloads/netbsd-src.fossil
    local-root: /home/amb/Downloads/netbsd-src/
    config-db: /home/amb/.fossil
    checkout: a83039c6283ce6e2276dd568192d7a50c0474b90 2023-12-17 00:19:11 UTC
    parent: f5a17444ddea101eabb78fbb2126955796c57718 2023-12-16 23:40:33 UTC
    tags: trunk
    comment: tests/make: add basic tests for the ':M' modifier (user: rillig)
    EDITED crypto/external/bsd/libsaslc/dist/src/Makefile.bsd
    EDITED crypto/external/bsd/openssl.old/dist/test/pkits-test.pl
    EDITED crypto/external/bsd/openssl/lib/libcrypto/arch/powerpc/aes.inc
    EDITED external/apache2/llvm/dist/clang/include/clang/AST/DeclarationName.h
    EDITED external/bsd/nvi/dist/TODO
    EDITED external/cddl/dtracetoolkit/dist/Bin/bitesize.d
    EDITED external/gpl2/gettext/bin/msgmerge/Makefile
    EDITED external/gpl3/binutils/dist/gold/target-select.cc
    EDITED external/gpl3/gcc.old/usr.bin/gcc/arch/mipsn64el/gthr-default.h
    EDITED external/gpl3/gcc.old/usr.bin/gcc/arch/mipsn64el/insn-modes.h
    EDITED external/gpl3/gcc/dist/libgcc/config/nds32/isr-library/save_fpu_regs_01.inc
    EDITED external/gpl3/gdb.old/dist/gold/ChangeLog-2017
    EDITED external/gpl3/gdb.old/dist/ld/testsuite/ld-mmix/loc2.d
    EDITED external/gpl3/gdb/dist/gdb/testsuite/gdb.mi/mi-ns-stale-regcache.exp
    EDITED external/gpl3/gdb/dist/sim/testsuite/arm/xscale/mra.cgs
    EDITED external/mpl/bind/dist/bin/tests/system/checkconf/check-root-static-ds.conf
    EDITED external/mpl/bind/dist/bin/tests/system/dyndb/driver/zone.h
    EDITED lib/libarch/i386/i386_get_ldt.c
    EDITED lib/libisns/isns_fileio.h
    EDITED sys/external/gpl2/dts/dist/arch/arm/boot/dts/sama5d36.dtsi
    EDITED usr.bin/renice/renice.c
    WARNING: multiple open leaf check-ins on trunk:
    (1) 2023-12-17 00:19:11 [a83039c628] (current)
    [39 more open leaf on trunk messages suppressed]
    0m08.15s real 0m00.82s user 0m07.30s system
    $ time fossil diff | wc -l
    8417
    0m08.01s real 0m00.82s user 0m07.10s system

Opening the ui:

    $ fossil ui #opened page in browser
    This page was generated in about 0.001s by Fossil 2.24 [29e9e84a1e] 2023-12-08 15:30:14

Browsing commits from the timeline seems fine. Clicking on a random commit seems fine; it renders the diff quickly enough, in 0.021s.

Here are some other things I clicked on from the UI:

    /timeline?r=trunk&c=2023-12-17+00%3A19%3A11 57.271s
    /dir?ci=tip 0.981s
    /finfo?name=rescue/list.crypto&m&ci=tip 84.731s
    /reports 22.951s
    /fileage?name=b1629468d8b9146d 74.721s
    /hash-collisions 10.711s
    /vdiff?from=3da478793e72b5fb&to=7e31584767cf23bd [out of memory]
    /vdiff?from=6cf62525d8c7094e&to=a83039c6283ce6e2 1.531s

Time to commit a private set of changes:

    $ time fossil ci --private -m "Test private commit"
    New_Version: 6735ec7a8a729d4ffef6a2d5ff0816b987cf27505130a76ef26c67c88bb94e07
    6m50.40s real 1m32.38s user 1m34.31s system

So overall it seems to work. Almost 7 minutes to commit is a long time, but then, disabling the R-card would probably change that. I tend to prefer leaving the R-card enabled.

Andy
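One rough way to read the timings in this post is to compare CPU time (user + system) against wall-clock time. The sketch below does that for four of the commands; the helper names `to_seconds` and `cpu_share` are made up for illustration, and the values are copied by hand from the output above. A low CPU share is consistent with the process waiting on disk, but it does not prove it.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a time value such as '17m58.50s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cpu_share(real: str, user: str, system: str) -> float:
    """Fraction of wall-clock time spent on the CPU (user + system)."""
    return (to_seconds(user) + to_seconds(system)) / to_seconds(real)


# (real, user, system) triples copied from the timings in this post.
timings = {
    "fossil open": ("17m58.50s", "1m31.82s", "2m24.87s"),
    "fossil status": ("0m12.95s", "0m00.92s", "0m07.79s"),
    "fossil diff": ("0m07.97s", "0m00.69s", "0m06.85s"),
    "fossil ci --private": ("6m50.40s", "1m32.38s", "1m34.31s"),
}

for command, triple in timings.items():
    print(f"{command:22s} {cpu_share(*triple):6.1%} of wall time on CPU")
```

By this measure "fossil open" spends under a quarter of its wall time on the CPU, while the no-change "fossil diff" is almost entirely CPU time, most of it system time.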
(6) By Vadim Goncharov (nuclight) on 2024-02-17 19:40:26 in reply to 5.1
Huh, when we tried to convert the FreeBSD src repo from git to Fossil (you can find this on this forum), the conversion took many days, but times after all were around 40 seconds - meaning it's somewhat usable...
(7) By Andy Bradford (andybradford) on 2024-02-17 21:31:05 in reply to 6
> but times after all were around 40 seconds - meaning it's somewhat usable

It's definitely usable if one has the patience to clone it. I haven't pulled from the repository since December 18, so I decided to "fossil pull" today to get more of the repository. It took 8 minutes to complete the pull (which is largely a function of network latency and bandwidth, I imagine):

    $ fossil pull -R netbsd-src.fossil
    Pull from https://src.fossil.netbsd.org/
    Round-trips: 15 Artifacts sent: 0 received: 5718
    Pull done, wire bytes sent: 279596 received: 24921431 remote: 163.172.4.16

And then it took another minute or two before it output that there was a fork:

    ***** WARNING: a fork has occurred *****
    use "fossil leaves -multiple" for more details.

Navigating the timeline is fine as far as I can tell. I tried a big diff from the UI (clicking two nodes) and I must have picked some pretty nasty ones:

    Difference From 738b8b5e0c633827 To 3f7ad32ff303270b

It took Fossil a couple of minutes and about 1.5GB of RAM to generate a response. Firefox, on the other hand, took 9GB of RAM to render the page, but then suddenly Firefox stopped outputting anything... I saw this output from Fossil:

    ------------- 2024-02-17 21:23:54 UTC ------------
    panic: Timeout after 600 seconds during web-page reply - user 208,420,000 µs, sys 26,610,000 µs
    HTTP_HOST=localhost:8080
    HTTP_REFERER=http://localhost:8080/timeline
    HTTP_USER_AGENT=Mozilla/5.0 (X11; OpenBSD amd64; rv:121.0) Gecko/20100101 Firefox/121.0
    PATH_INFO=/vdiff
    QUERY_STRING=from=738b8b5e0c633827&to=3f7ad32ff303270b
    REMOTE_ADDR=127.0.0.1
    REQUEST_METHOD=GET
    REQUEST_URI=/vdiff?from=738b8b5e0c633827&to=3f7ad32ff303270b

So it looks like Fossil gave up waiting for Firefox to consume the data. I wonder if I can tune the timeout so that I can actually see the page complete. But it seems that Fossil did just fine otherwise.

Andy
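The pull and panic figures above can be put into more familiar units with a little arithmetic. This is a rough sketch only: the variable names are made up, "8 minutes" is an approximate wall-clock figure, and the remaining values are copied by hand from the output in this post.

```python
# Figures copied from the pull output and the panic message in this post.
pull_bytes_received = 24_921_431
pull_seconds = 8 * 60            # "8 minutes" is an approximation

timeout_seconds = 600            # from "Timeout after 600 seconds"
user_us = 208_420_000            # user CPU time in microseconds
sys_us = 26_610_000              # system CPU time in microseconds

# Average receive rate over the whole pull, including round-trip latency.
throughput = pull_bytes_received / pull_seconds

# CPU time Fossil used before the timeout, and the time not spent on CPU.
cpu_seconds = (user_us + sys_us) / 1_000_000
non_cpu_seconds = timeout_seconds - cpu_seconds

print(f"pull rate:    {throughput / 1000:.1f} kB/s")
print(f"CPU time:     {cpu_seconds:.2f} s")
print(f"not on CPU:   {non_cpu_seconds:.2f} s of {timeout_seconds} s")
```

Fossil used a bit under four minutes of CPU within the ten-minute window, so more than half of that time was spent off the CPU. That fits the reading that it was waiting for the browser to consume the reply, though the numbers alone cannot confirm it.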