Missing date modification checks in xfer?
(1) By Warren Young (wyoung) on 2019-08-27 03:48:53 [link] [source]
While working on the new user caps doc, I noticed that there's a check for Setup capability on several timestamp editing paths:
fossil ci --date-override
fossil amend --date-override
- the undocumented API
/ci_edit?chngtime
- the timestamp override field on new tickets
The thing is, all of these things can be done to your local repo, where you are the Setup user by default. I did not see any corresponding checks in xfer.c
to reject artifacts that have date overrides or obviously-bogus timestamps. Did I miss them? If not, shouldn't they be present, else what's the point of doing the checks on local repos or over web APIs?
(2) By Richard Hipp (drh) on 2019-08-27 10:03:49 in reply to 1 [link] [source]
Those checks help to discourage users from abusing the system by making use of those options. I do not see any harm in leaving the checks in place.
I don't think there is a way to reject artifacts in xfer that have bad timestamps. All you can do is try to make sure you do not let untrustworthy people push, and if you do have issues, shun the invalid artifacts.
(3) By Warren Young (wyoung) on 2019-08-27 15:46:51 in reply to 2 [link] [source]
I don't know about "harm," but I think the existing checks are basically useless. You are almost always "Setup" to your local repo clone, and that's the one doing the checks.
For the most part, Fossil does user cap checks only on web interfaces, since those may be facing out to non-local (and thus non-Setup) users. Therefore, the checks belong on /xfer
, not on fossil ci --date-override
and such.
As for rejecting bad artifacts, it'd have to parse them and pull the timestamps out of the manifest and then make a decision about it. Then you're up against a loose definition of "plausible," since a sync could push 2-week-old artifacts because someone's been offsite and is just now sending them. But a sync probably shouldn't be allowed to send 2-week-hence artifacts or 2-year-old artifacts.
The latter would solve the problem where someone starts out as an anonymous cloner of a repo, makes checkins to the local repo thinking they're effectively private, since they won't sync. Then maybe years later, they're granted a login on the remote repo, so they change their remote-url
to include their new user name, and on the next sync they dump a bunch of ancient check-ins into the parent repo, which were intended to be private.
This is also the same basic problem as the user impersonation problem, where I argued that pushes should check that the user name in the manifest for new material should be checked against the sync user's name.
(4) By Florian Balmer (florian.balmer) on 2019-08-27 15:54:53 in reply to 3 [link] [source]
As for rejecting bad artifacts, it'd have to parse them and pull the timestamps out of the manifest and then make a decision about it.
The bad artifact could be followed by a control artifact to correct the timestamp, so this analysis would have to be quite "holistic" ...
(6) By Warren Young (wyoung) on 2019-08-27 16:01:58 in reply to 4 [link] [source]
Hold the questionable artifacts in abeyance until the end of the sync. If no later record comes in to make the artifacts left in that set plausible, auto-shun it.
(7.1) By Florian Balmer (florian.balmer) on 2019-08-27 16:28:57 edited from 7.0 in reply to 6 [link] [source]
Yes, but filtering individual contents (vs. fully aborting the sync) during transfer still sounds very complex. There's many follow-up decisions: Should a control artifact referring to a blocked manifest also be blocked? And what if that control artifact refers to multiple manifests? Should children of blocked manifests be blocked? Should files appearing only in blocked manifests also be blocked?
The recently added check-in lock to prevent accidental forks, for example, works on a much lower level, and is independent of the transferred contents.
(8) By Andy Bradford (andybradford) on 2019-08-28 14:31:58 in reply to 6 [source]
Execpt that the way the sync protocol works today would require some back-pedaling or server side-tracking of individual round-trips. Sync works by breaking up the artifacts into enough chunks of data that will fit in a pre-defined limit per round-trip. Each round-trip is committed (in the SQL sense) to the Fossil repository independently of any other round-trip chunks. Not only that, in HTTP/HTTPS sync protocols, the entire set of round-trips are all run on independent Fossil "server" instances---only SSH keeps 1 connection open for the entire sync, all other transports open multiple connections per sync (where sync is defined as a set of round-trips made by the client to push/pull content). The sync protocol is perfectly capable of handling this (phantoms is the mechanism that Fossil uses recognize that it is missing content and how it knows what to request of the sync clients). So Fossil is eventually consistent as long as users continue to push/pull. Also, I'm not sure that shunning should be done automatically in this fashion. Thanks, Andy
(9) By Florian Balmer (florian.balmer) on 2019-08-29 12:39:30 in reply to 6 [link] [source]
There's also odd (theoretical) corner cases: an "outdated" manifest removed from the server by accidental shunning (then un-shunning) or purging wouldn't be restored on the server on subsequent pushes from clients.
(5) By Warren Young (wyoung) on 2019-08-27 15:59:28 in reply to 3 [link] [source]
Thinking more about this, this should be a mode enabled only on sites like those for public FOSS projects, where there is some reason to be skeptical about who is pushing what.
When it happens, the mechanism behind it can be an auto-shun: the "bad" artifacts' IDs are dropped into the shun table and that client can then no longer send those to the parent repo. The sync can continue, with the parent accepting only those artifacts believed to be good.
If the parent repo rejects an artifact incorrectly, the Setup user on the parent repo can modify the shun table via Fossil UI then have the client try again, either under a more privileged login class (e.g. skip it for Admin and up?) or with the sanity-checking feature temporarily turned off on the parent. Since the sync protocol will prevent re-transfer of the same artifacts, the feature can be turned back on afterward.