Fossil User Forum

Using Fossil with SugarSync

(1) By EganSolo on 2018-10-09 23:00:59

Hi guys,

First post here on the forum. I've read the docs on Fossil and it seems to fit the bill quite well. I'm working on several free-pascal / Lazarus IDE projects and I use a cloud-based storage to backup and replicate the folders of these files on multiple computers.

So here's how I'm thinking I'd like to use Fossil:

1. Install Fossil on computer 1 and computer 2.

2. Assume I have the same account on both computers.

3. Create a local repository on computer 1.

3.1. Folder c:\fossil holds the repository.

3.2. Folder c:\project holds the files for my project.

4. Work on computer 1, checking files in and out of c:\project.

5. While I'm working, the content of these two folders is replicated into my cloud storage and from there down to computer 2, into exactly the same two folders.

6. Switch to computer 2 and continue working with Fossil.
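In fossil commands, I imagine the plan above would look roughly like this (a sketch only; the repository file name is made up, and Unix-style paths stand in for the Windows ones):

```shell
# Step 3: create the repository file in the "fossil" folder
fossil init /c/fossil/project.fossil

# Steps 3.2 and 4: open a working checkout in the project folder
cd /c/project
fossil open /c/fossil/project.fossil

# Step 4: the normal work cycle
fossil add .
fossil commit -m "work from computer 1"
```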

Are there any downsides to using Fossil with a replicated repo? To simplify the question, just imagine that I created a repo on computer 1, then zipped that repo (folder) and unzipped it on computer 2, preserving its structure. Will something break?

Thanks for any guidance you can provide here!

Egan

(2) By Stephan Beal (stephan) on 2018-10-09 23:12:05 in reply to 1

It's potentially, but not necessarily, risky to sync checked-out copies that way, because any fossil operation may update the so-called checkout db, and you don't want that synced to two copies: it's intended to be used in only a single place. What generally works well, in my limited experience, is to put a clone of the repository db in cloud storage and open (in non-synced storage) a separate checkout of that repo on each machine. Even so, syncing a db that way is not considered good practice, and it could (depending on the sync software) get corrupted. Likewise, never share a repo db you value over a network-shared drive.

(4) By sean (jungleboogie) on 2018-10-09 23:28:20 in reply to 2

Yes, listen to Stephan.

never share a repo db you value over a network-shared drive.

(3) By sean (jungleboogie) on 2018-10-09 23:20:00 in reply to 1

To simplify the question, just imagine that I created a repo on computer 1, then zipped that repo (folder) and unzipped it on computer 2, preserving its structure. Will something break?

The general rule when doing this has been to run (fossil close) prior to copying repo.fossil to a thumb drive or zipping it.
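In command form, that rule might look like this (the archive name is made up):

```shell
cd /c/project
fossil close     # detach the checkout; the repo file is now quiescent
zip repo-backup.zip /c/fossil/project.fossil

# ...move the zip to computer 2, unzip it, then re-open there:
fossil open /c/fossil/project.fossil
```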

Many of these 'Can I do this?' questions come down to trying what you want on a throwaway repo. I haven't heard of SugarSync, so I don't know what it does for you.


(5) By EganSolo on 2018-10-09 23:33:30 in reply to 1

Thanks for the fast responses!

SugarSync is a cloud storage solution much like Box. The connection to the cloud is encrypted and the storage is also encrypted.

Here's the thing I'm not understanding: if, as I understand it, a fossil repo is actually a SQLite db, then where is the harm if that db is synced between computers? Is the fossil process constantly monitoring and working on a repo, or does it become active only when I issue a command?

If it's the latter, then why would it matter if the SQLite db has changed in between invocations?

(6) By Stephan Beal (stephan) on 2018-10-09 23:41:30 in reply to 5

SQLite creates temporary files while working on db files, and you definitely don't want those to get synced, nor do you want your sync software to mis-sync a SQLite file while it's being written to. It "should" work out okay, and very probably will, but why risk corruption at all when the possibility is easily avoided?

(7) By Stephan Beal (stephan) on 2018-10-09 23:46:18 in reply to 6

In short, you'd be relying on multiple levels of unrelated voodoo to work flawlessly together despite none of them knowing anything about the others. They very probably would, but... better safe than sorry. IMO.

(8) By EganSolo on 2018-10-10 00:04:19 in reply to 7

Got it,

I'll have to scratch that option and rely on an https-accessible repo. Any recommendations on where to host a fossil repo on the web?

(9) By Stephan Beal (stephan) on 2018-10-10 00:21:45 in reply to 8

Any cheap hoster which supports CGI will do (mine is $5/month or so), but you don't necessarily need to do that...

While i strongly recommend against syncing checkouts, i have experimented with syncing repos via Dropbox. i never had a mis-sync, but it can't strictly be ruled out, because Voodoo. Simply (fossil open) the repo, in non-synced storage, on both machines, and then check in and update as you normally would.
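concretely, that layout might look like this on each machine (paths are made up):

```shell
# the repo clone lives in cloud-synced storage;
# the checkout lives in a local, non-synced directory
mkdir -p ~/work/project
cd ~/work/project
fossil open ~/Dropbox/repos/project.fossil

# then just work normally from the local checkout
fossil update
fossil commit -m "..."
```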

(10) By Richard Hipp (drh) on 2018-10-10 01:38:05 in reply to 8

A 5 USD/month Linode (https://www.linode.com/) or a DigitalOcean Droplet (https://www.digitalocean.com/) works well for this. Some setup is required. I use both companies (for diversity) and am happy with both, but I like Linode the best. The main Fossil website runs on a Linode, but www3.fossil-scm.org runs on a Droplet.

I also set up a Scaleway ARM machine (3 euro/month) some time ago and ran a Fossil server on it for a month or two. It worked fine as well.

(11) By sean (jungleboogie) on 2018-10-10 02:01:42 in reply to 8

As the two experts have said, just about any place will do. With Linode and DigitalOcean, you get root access to run the server how you see fit.

Meanwhile, take a look at this article: https://www.fossil-scm.org/index.html/doc/trunk/www/server.wiki
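For reference, the CGI setup that article describes boils down to a two-line script in your host's cgi-bin (paths here are examples, not prescriptions):

```
#!/usr/bin/fossil
repository: /home/user/repos/project.fossil
```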

(12) By anonymous on 2018-10-10 04:46:16 in reply to 1

Are you planning to sync two-way, or is PC1 a "master" while PC2 is a "slave"?

In case it's two-way, it might work OK provided:

  1. Your PC accounts are really mirrored (I guess it's Windows, so some fossil files are in AppData). So it's not just about where your /fossils and /projects folders are, but also where the /Users/yourusername home is.

  2. SugarSync can run predictably, kind of triggered on demand. This way you would have to force the sync of the relevant folders so that Fossil is not synced while some db operation is still in progress (in fact, the db would likely be locked and might not sync properly).

That said, Fossil is not designed for such use. You totally give up arguably the best part of Fossil -- its sync ability.

Meanwhile, you could host a private Fossil repo:

  • From PC1 -- it seems your upstream connection is fast enough for SugarSync, so it should be OK for Fossil traffic too. Set up SSH and you've got a secure tunnel between your PCs.
  • On ChiselApp.com or similar.

Finally, as already suggested, join the world of "host-lords" and get to manage your own host somewhere.

If you do insist on SugarSync'ing your repos, then perhaps alternative SCMs (ahem, Git?) may serve such a need better, because such repos are more "static" in that sense, being file-based vs. db-based. You may still get caught mid-operation with a Git repo; it's hard to say how consistently that would mess things up, due to locking. That's why sync-on-demand is needed.

(13) By jvdh (veedeehjay) on 2018-10-10 09:07:26 in reply to 12

most everything has been said, but I would second the last post that "one can do this syncing, if care is taken". actually, I have been doing it with two machines (desktop + laptop) and some 50+ repos (or rather, with my complete home directory ...) for many years.

what _is_ important is to do the sync "on demand" and to be able to ensure that the repos are not touched during the sync. in my case, I run a bidirectional file synchronizer (not just `rsync'...) that detects conflicting changes, once in the morning and once in the evening, and modify the repos in between only(!) on _one_ of the two machines.

in case one forgets this (it happens), the synchronizer detects and reports (potentially conflicting) changes on both ends, and one has to manually enforce a "direction" of the sync in order to avoid that both repos/checkouts mix. that would obviously cause serious conflicts: consider a checkin of file1 on machine A and a checkin of file2 on machine B somewhat later. if you then simply sync and decide to resolve the conflicting change of the repos by choosing, say, B, that one will overwrite A; the checkin of file1 on A will simply be gone from the repo, while file1 itself will show up as modified in both checkouts, which at least messes up the history of the project...

if one is aware of this pitfall, syncing/copying repos in this way can work very well. (and it is of course not a problem specific to fossil but concerns all applications storing content in some database...)
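for illustration only: unison is one such bidirectional synchronizer, and an on-demand run between the two machines could look like this (host name and paths are invented):

```shell
# compares both ends and reports conflicts instead of silently
# overwriting one side; run manually, morning and evening
unison ~/repos ssh://laptop//home/user/repos
```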

(14) By EganSolo on 2018-10-10 16:20:54 in reply to 13

Guys,

Thank you so very much for all the great comments, direction, and help! Your quick and detailed replies truly encourage me to use Fossil. I tried Git before, but what drew me to Fossil is the ease of use, the integrated ticketing and wiki, and the simplicity of the UI.

To close this thread: I thought I would use SugarSync with Fossil IF the two worked together naturally and without any concerns. Seeing the warnings and the contraindications that Fossil was not meant to work in this fashion, I'll steer clear of that option and set up a public repo instead. I would rather do that and have peace of mind than live in a constant "what-if" world.

By the way, SugarSync cannot be configured to sync on demand. As soon as a file has been touched, it's synced and versioned in the cloud. This, of course, has great benefits, especially when I need to switch between two machines separated by a stringent firewall, but it also has its drawbacks.

Public repo it is then.

Thanks for all the suggestions, you guys are a wealth of information!

Truly happy to join the community,

Egan

(15) By sean (jungleboogie) on 2018-10-10 16:33:13 in reply to 14

Public repo it is then.

It doesn't necessarily need to be public. You can require a login to view files, timeline, forum, wiki, etc.

(16) By Warren Young (wyoung) on 2018-10-10 16:37:58 in reply to 14

Public repo it is then.

The repo doesn't have to be public. You have two easy alternatives:

  1. In Admin → Security-Audit, click the "Take it private" link to reset your repo's permissions so that only logged-in users can access content in the repo. I have one set up that way, and you can't see much other than the skin. There's the Help menu, the Login link, and that's about the end of it until you log in.

  2. Sync over SSH instead of HTTP. Then people can't even see that your repository exists without a valid SSH login on your Internet-facing server.

    Since it's generally easier to get permission to tunnel SSH to some off-site server than to expose a new HTTP server to the Internet, you might be able to set this up without any overt cost. Perhaps a family member has a Raspberry Pi on his home LAN and is willing to port-forward SSH to it on a random Internet-facing port on his router, for example. Add one of the many dynamic DNS services, and you have reliable access to the remote Fossil clone.
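A sketch of option 2 (user, host, and paths are placeholders):

```shell
# Clone over SSH; nothing is exposed via HTTP at all.
fossil clone ssh://user@example.com//home/user/repos/project.fossil project.fossil

# With autosync on, later commits sync over the same SSH URL.
mkdir work && cd work
fossil open ../project.fossil
```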

(17) By anonymous on 2018-10-11 13:20:31 in reply to 16

Normally, I prefer to use SSH, but there are other options that make it so that "people can't even see that your repository exists".

Specifically:

3. Use HTTP_AUTH
4. Use HTTPS
5. Or use them both

As long as the document root of your HTTPS server doesn't include links to Fossil repositories, then "people can't see" that the repository exists. Short of MITM, no one should be able to see the URLs used in HTTPS requests.

Of course, simply using HTTP_AUTH is sufficient to prevent access and discovery of content, but it won't stop passive analysis.
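As an illustration, the HTTP_AUTH option can be an ordinary basic-auth rule in front of the Fossil CGI, so the request is rejected before Fossil is ever reached. An Apache fragment might look like this (all names invented):

```
<Location "/cgi-bin/project">
    AuthType Basic
    AuthName "Private repository"
    AuthUserFile /home/user/.htpasswd
    Require valid-user
</Location>
```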

(18) By Stephan Beal (stephan) on 2018-10-23 07:07:18 in reply to 1

Returning to this topic for a moment because it's currently relevant to my work on relaunching my aging and ugly web site, and i've given it some consideration over the past couple of days...

Management summary: if .fslckout (fossil's checkout database) is sync'd between multiple systems, each of which has its own clone of the associated repository, Bad Things can happen when the blob.rid fields get out of sync. Never, ever, ever share a single checkout database across multiple clones of a given repository db. Ever. Bad Things.

Details...

Consider this setup: we have a checkout in cloud-synced storage (be it Dropbox, GDrive, SugarSync, or whatever), syncing across two or more computers. On each computer we also have a clone (not cloud-synced) which we have (fossil open)'d in that cloud-synced folder. In this setup, all of those clones are sharing the same checkout database (the file named .fslckout, found in the root directory of all checkouts) but have their own repository clone.

All blobs in fossil have a public ID: their hash. They also have an internal ID, the blob.rid field (colloquially known as RID), which is a simple incremental integer which is _only consistent within a given clone of a repository_. There is no guarantee whatsoever that the RID for a given blob is the same across multiple clones of a given repo. The .fslckout database uses RIDs to associate its state with that of the underlying repo clone. If a single copy of .fslckout is synchronized over cloud storage, the RIDs it refers to may or may not match all clones of that repository. If a .fslckout refers to IDs which exist in a given repo clone but are not _semantic_ matches, then fossil would behave unpredictably, cross-referencing blobs incorrectly. As Dr. Venkman so eloquently put it: "Cats and dogs living together! Mass hysteria!"

That said...

i _think_ there's a relatively safe way to do this (which i'm currently testing with gdrive) if the cloud-sync software supports filtering out (not syncing) of certain files.

The new setup (for this syncing purpose) is the same as above except that we tell the cloud-sync software NOT to sync .fslckout. That is, each participating computer has its own local copy of .fslckout and its own clone (not in cloud storage) of the associated repository. We also need to tell the sync software to ignore sqlite-generated files named *-journal and *-wal (see: https://www.sqlite.org/tempfiles.html).

If syncing of .fslckout goes haywire, no permanent harm is done because .fslckout holds only transient state. Worst case, we do (fossil close) and then (fossil open /path/to/repo --keep) to recreate .fslckout. Depending on the type of source tree, it may also be a good idea to disable syncing of other files (*.o, emacs temp/lock files, etc.).

Now, when we work on any given computer, all of our changes are sync'd to the cloud as we make them. We can freely switch between PCs, so long as the sync is working (if it's disabled/paused, extra coordination/recovery might be necessary). Running (fossil status) from any of those machines "should" Do The Right Thing, in that it will compare its local .fslckout against its local clone of the repository, with no semantic mismatch between RIDs. Running a checkin from one machine will only update that machine's copy of the checkout/repository, though. Switching to another computer at that point, we'll have the updated files but our local repository clone will say that the files are modified compared to its state. That is solved by simply running (fossil update), which will (assuming auto-sync is on) update that machine-local repo clone from the remote copy, and (fossil status) will once again say "no changes".
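in command terms, switching machines then looks like this (assuming auto-sync is on):

```shell
# on the machine we just switched to: cloud sync delivered the
# changed files, but the local repo clone hasn't seen the check-in
fossil update     # pulls the new check-in from the remote clone
fossil status     # should now report no changes again
```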

Yes, it is possible to keep the repo file itself in cloud storage and work directly from that copy. That would eliminate the "extraneous" (fossil update) calls when switching machines, but would risk corrupting the repository database (as Dr. Spengler so eloquently put it: "that would be bad"). How high that risk is depends on many factors, but it's always going to be "significantly higher than zero". Corrupting the repo db is a sure-fire way to lose any and all contents in that repo.

So far, syncing my checkout _minus_ the checkout db is working okay for me, and it lets me freely switch between my desktop (upstairs) and laptop (downstairs). That's not to say this is bullet-proof, or even necessarily a good idea, but it seems to be working.