Missing date modification checks in xfer?

(1) By Warren Young (wyoung) on 2019-08-27 03:48:53 [link] [source]

While working on the new user caps doc, I noticed that there's a check for Setup capability on several timestamp editing paths:

fossil ci --date-override
fossil amend --date-override
the undocumented API /ci_edit?chngtime
the timestamp override field on new tickets

The thing is, all of these things can be done to your local repo, where you are the Setup user by default. I did not see any corresponding checks in xfer.c to reject artifacts that have date overrides or obviously-bogus timestamps. Did I miss them? If not, shouldn't they be present, else what's the point of doing the checks on local repos or over web APIs?

(2) By Richard Hipp (drh) on 2019-08-27 10:03:49 in reply to 1 [link] [source]

Those checks help to discourage users from abusing the system by making use of those options. I do not see any harm in leaving the checks in place.

I don't think there is a way to reject artifacts in xfer that have bad timestamps. All you can do is try to make sure you do not let untrustworthy people push, and if you do have issues, shun the invalid artifacts.

(3) By Warren Young (wyoung) on 2019-08-27 15:46:51 in reply to 2 [link] [source]

I don't know about "harm," but I think the existing checks are basically useless. You are almost always "Setup" to your local repo clone, and that's the one doing the checks.

For the most part, Fossil does user cap checks only on web interfaces, since those may be facing out to non-local (and thus non-Setup) users. Therefore, the checks belong on /xfer, not on fossil ci --date-override and such.

As for rejecting bad artifacts, it'd have to parse them and pull the timestamps out of the manifest and then make a decision about it. Then you're up against a loose definition of "plausible," since a sync could push 2-week-old artifacts because someone's been offsite and is just now sending them. But a sync probably shouldn't be allowed to send 2-week-hence artifacts or 2-year-old artifacts.

The latter would solve the problem where someone starts out as an anonymous cloner of a repo, makes checkins to the local repo thinking they're effectively private, since they won't sync. Then maybe years later, they're granted a login on the remote repo, so they change their remote-url to include their new user name, and on the next sync they dump a bunch of ancient check-ins into the parent repo, which were intended to be private.

This is also the same basic problem as the user impersonation problem, where I argued that pushes should check that the user name in the manifest for new material should be checked against the sync user's name.

(4) By Florian Balmer (florian.balmer) on 2019-08-27 15:54:53 in reply to 3 [link] [source]

As for rejecting bad artifacts, it'd have to parse them and pull the timestamps out of the manifest and then make a decision about it.

The bad artifact could be followed by a control artifact to correct the timestamp, so this analysis would have to be quite "holistic" ...

(6) By Warren Young (wyoung) on 2019-08-27 16:01:58 in reply to 4 [link] [source]

Hold the questionable artifacts in abeyance until the end of the sync. If no later record comes in to make the artifacts left in that set plausible, auto-shun it.

(7.1) By Florian Balmer (florian.balmer) on 2019-08-27 16:28:57 edited from 7.0 in reply to 6 [link] [source]

Yes, but filtering individual contents (vs. fully aborting the sync) during transfer still sounds very complex. There's many follow-up decisions: Should a control artifact referring to a blocked manifest also be blocked? And what if that control artifact refers to multiple manifests? Should children of blocked manifests be blocked? Should files appearing only in blocked manifests also be blocked?

The recently added check-in lock to prevent accidental forks, for example, works on a much lower level, and is independent of the transferred contents.

(8) By Andy Bradford (andybradford) on 2019-08-28 14:31:58 in reply to 6 [source]

Execpt that  the way the  sync protocol  works today would  require some
back-pedaling or  server side-tracking  of individual  round-trips. Sync
works by breaking up the artifacts  into enough chunks of data that will
fit in a pre-defined limit  per round-trip. Each round-trip is committed
(in the SQL  sense) to the Fossil repository independently  of any other
round-trip  chunks. Not  only that,  in HTTP/HTTPS  sync protocols,  the
entire set  of round-trips  are all run  on independent  Fossil "server"
instances---only SSH  keeps 1 connection  open for the entire  sync, all
other  transports open  multiple  connections per  sync  (where sync  is
defined  as  a set  of  round-trips  made  by  the client  to  push/pull
content).

The sync protocol is perfectly capable of handling this (phantoms is the
mechanism that Fossil uses recognize that  it is missing content and how
it knows what  to request of the sync clients).  So Fossil is eventually
consistent as long as users continue to push/pull.

Also, I'm  not sure that shunning  should be done automatically  in this
fashion.

Thanks,

Andy

(9) By Florian Balmer (florian.balmer) on 2019-08-29 12:39:30 in reply to 6 [link] [source]

There's also odd (theoretical) corner cases: an "outdated" manifest removed from the server by accidental shunning (then un-shunning) or purging wouldn't be restored on the server on subsequent pushes from clients.

(5) By Warren Young (wyoung) on 2019-08-27 15:59:28 in reply to 3 [link] [source]

Thinking more about this, this should be a mode enabled only on sites like those for public FOSS projects, where there is some reason to be skeptical about who is pushing what.

When it happens, the mechanism behind it can be an auto-shun: the "bad" artifacts' IDs are dropped into the shun table and that client can then no longer send those to the parent repo. The sync can continue, with the parent accepting only those artifacts believed to be good.

If the parent repo rejects an artifact incorrectly, the Setup user on the parent repo can modify the shun table via Fossil UI then have the client try again, either under a more privileged login class (e.g. skip it for Admin and up?) or with the sanity-checking feature temporarily turned off on the parent. Since the sync protocol will prevent re-transfer of the same artifacts, the feature can be turned back on afterward.