Feature request: "leaf" version identifier

(1) By andygoth on 2021-04-22 20:50:11 [link] [source]

I suggest adding a new version identifier "leaf" to join the pantheon of special check-in names alongside "current", "prev", "next", and "tip". "leaf" identifies the leaf version of the currently checked-out branch.

This could be useful in an update script I've written for my enterprise that (among other things) uses cat to read the contents of the "latest" version of a certain file. But what does "latest" mean? It ought to mean the leaf of whichever branch is selected.

Is this something we really want? The same result can be had by typing $(f branch current), though this becomes harder to do in Windows.

Sorry, I'm not in a position to code this myself right now, though maybe this weekend. I'm just posting so the idea gets captured and we can discuss it.

(2) By Stephan Beal (stephan) on 2021-04-22 23:27:34 in reply to 1 [link] [source]

This could be useful in an update script I've written for my enterprise that (among other things) uses cat to read the contents of the "latest" version of a certain file. But what does "latest" mean? It ought to mean the leaf of whichever branch is selected.

i don't think that's possible, pedantically speaking. A branch name resolves to the latest checkin with that branch name. Thus, so long as you have no forks and do not re-use a given branch name, the branch's name will resolve to what you are asking for. However, as soon as you have a fork, i.e. multiple leaves, all bets are off: how is the algorithm supposed to be able to know which leaf to choose?

Suppose you have this checkout:

https://fossil-scm.org/home/timeline?c=73ebf81b9331ee89

And then ask it for the leaf of that branch.

Which one is it going to choose? (Noting that f2348f27 is probably supposed to be closed by now.)
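The resolution rule described in this post can be modeled on a toy history. This is only a sketch under assumptions: the `COMMITS` dictionary, its `branch` and `mtime` fields, and the function name `resolve_branch_name` are hypothetical and are not Fossil's internals. It picks the newest check-in carrying a given branch name, so with a fork the answer depends purely on timestamps.

```python
def resolve_branch_name(commits, name):
    """Return the id of the newest check-in whose branch is `name`."""
    tagged = [cid for cid, info in commits.items() if info["branch"] == name]
    if not tagged:
        raise KeyError(f"no check-in on branch {name!r}")
    return max(tagged, key=lambda cid: commits[cid]["mtime"])


# Hypothetical history: "p" is followed by two leaves on the same branch (a fork).
COMMITS = {
    "p":  {"parents": [],    "branch": "feature", "mtime": 1},
    "f1": {"parents": ["p"], "branch": "feature", "mtime": 2},
    "f2": {"parents": ["p"], "branch": "feature", "mtime": 3},
}

print(resolve_branch_name(COMMITS, "feature"))  # prints f2, the newer leaf
```

Nothing in this rule looks at which check-in is currently checked out, which is why it cannot distinguish between the two leaves of a fork.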

(3.2) By andygoth on 2021-04-23 00:52:22 edited from 3.1 in reply to 2 [link] [source]

Like I said, "leaf" does the same thing as $(f branch current), so it would grab the newer commit. That is definitely an edge case though. We often refer to "the" leaf of a branch even though it's possible for there to be multiple, but we also try hard to discourage forks and make them easy to merge or rename into separate branches.

Reading again and thinking some more, I see I missed what you said about reusing branch names. Yeah, that is a difference from $(f branch current). So, in that case the "leaf" algorithm would have to take the newest commit that (1) has the same branch name and (2) is "current" or has it as a direct or indirect predecessor.

(Editing once more.) (2) above would also handle the fork situation, but not in the way I intended. Maybe it's better that way though, to not jump to the "other" fork just because it's newer.
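The two conditions above can be tried out on a toy history. This is a minimal sketch, assuming a hypothetical `COMMITS` dictionary with `parents`, `branch`, and `mtime` fields; none of these names come from Fossil itself. It collects every check-in reachable from the current one, keeps those with the same branch name, and returns the newest.

```python
from collections import deque


def newest_descendant_on_branch(commits, current):
    """Newest check-in that is `current` or descends from it and shares its branch name."""
    # Build a parent -> children index once.
    children = {cid: [] for cid in commits}
    for cid, info in commits.items():
        for parent in info["parents"]:
            children[parent].append(cid)

    # Breadth-first walk over everything reachable from `current`.
    reachable = {current}
    queue = deque([current])
    while queue:
        for child in children[queue.popleft()]:
            if child not in reachable:
                reachable.add(child)
                queue.append(child)

    branch = commits[current]["branch"]
    same_branch = [cid for cid in reachable if commits[cid]["branch"] == branch]
    return max(same_branch, key=lambda cid: commits[cid]["mtime"])


# Hypothetical history with a fork (b1, b2) and a reused branch name (x).
COMMITS = {
    "t":  {"parents": [],    "branch": "trunk", "mtime": 0},
    "a":  {"parents": ["t"], "branch": "dev",   "mtime": 1},
    "b1": {"parents": ["a"], "branch": "dev",   "mtime": 2},  # older fork
    "b2": {"parents": ["a"], "branch": "dev",   "mtime": 3},  # newer fork
    "x":  {"parents": ["t"], "branch": "dev",   "mtime": 4},  # reused branch name
}

print(newest_descendant_on_branch(COMMITS, "b1"))  # prints b1: stays on its own fork
print(newest_descendant_on_branch(COMMITS, "a"))   # prints b2: newest descendant of a
```

In this model the reused name `x` is never selected from `a`, `b1`, or `b2`, because it does not descend from any of them.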

(4) By andygoth on 2021-04-23 01:04:30 in reply to 3.2 [link] [source]

Let me reframe the discussion. "leaf" refers to the default version you'd update to if you typed "fossil update" with no further arguments. That's really what I'm looking for, a way to get information about that leaf without having to first update to it. Like I said, my use case involves cat.

What happens when there's a fork and you try to update? I believe the update command warns you, but it still picks something on its own. Right? I guess I can check...

Okay, it follows descendants until it hits a leaf or there are multiple descendants. In the latter case it aborts. So, use of the name "leaf" would be an error when "current" precedes a fork, but if "current" comes after a fork then it just goes to the fork's leaf.
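The walk just described can be sketched on a toy history. The `COMMITS` dictionary, the `AmbiguousLeaf` exception, and the function name are hypothetical, and the sketch assumes that only children on the same branch count as descendants for this purpose; it models the observed behavior rather than reproducing Fossil's code.

```python
class AmbiguousLeaf(Exception):
    """Raised when the walk reaches a check-in with several same-branch children."""


def follow_to_leaf(commits, current):
    """Follow single same-branch children from `current` until a leaf is reached."""
    branch = commits[current]["branch"]
    cid = current
    while True:
        kids = [
            child for child, info in commits.items()
            if cid in info["parents"] and info["branch"] == branch
        ]
        if not kids:
            return cid  # no further descendants: this is the leaf
        if len(kids) > 1:
            raise AmbiguousLeaf(f"{cid} has several descendants: {sorted(kids)}")
        cid = kids[0]


# Hypothetical history: "b" precedes a fork into c1 and c2; c2 has one more child.
COMMITS = {
    "a":  {"parents": [],     "branch": "dev"},
    "b":  {"parents": ["a"],  "branch": "dev"},
    "c1": {"parents": ["b"],  "branch": "dev"},
    "c2": {"parents": ["b"],  "branch": "dev"},
    "d":  {"parents": ["c2"], "branch": "dev"},
}

print(follow_to_leaf(COMMITS, "c2"))  # prints d: already past the fork
```

Starting from `a` or `b` in this history raises `AmbiguousLeaf`, which corresponds to the abort case.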

Interestingly, I noticed that "next" does not abort when there are multiple descendants. Rather, it seems to pick the newest among them.

(5) By Stephan Beal (stephan) on 2021-04-23 01:52:45 in reply to 4 [link] [source]

Let me reframe the discussion. "leaf" refers to the default version you'd update to if you typed "fossil update" with no further arguments.

That's functionally identical to running update branch-name.

You can try that by doing, from the main fossil tree:

$ fossil co 73ebf81b9331ee89 # this is one step before a fork
$ fossil update # picks the newest of the forks
$ fossil undo
$ fossil update brlist-timeline # same as update w/o args

Are you looking for a way to get around having to know the branch name at all? If so, perhaps we could introduce an alias which resolves to the current checkout's branch name. Until/unless that's done, you can get the current branch name with something like:

$ fossil whatis current | awk '/tags:/{print $2}'
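For scripts that would rather parse the text than rely on awk field positions, here is a small sketch. It assumes the `tags:` line is a comma-separated list, and the sample text is a hypothetical stand-in for real `fossil whatis` output. One thing the awk one-liner glosses over: when a check-in carries several tags, `$2` is the first tag with a trailing comma attached.

```python
def parse_whatis_tags(whatis_output):
    """Return the list of tags from the first 'tags:' line, or [] if there is none."""
    for line in whatis_output.splitlines():
        if line.startswith("tags:"):
            value = line[len("tags:"):]
            return [tag.strip() for tag in value.split(",") if tag.strip()]
    return []


# Hypothetical sample; real output contains more lines than shown here.
SAMPLE = """\
artifact:   (hash elided)
tags:       trunk, wcontent-subsets
"""

print(parse_whatis_tags(SAMPLE))  # prints ['trunk', 'wcontent-subsets']
```

The function returns every tag, so the caller still has to decide which one is the branch name.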

(6) By Stephan Beal (stephan) on 2021-04-23 01:55:17 in reply to 3.2 [link] [source]

Like I said, "leaf" does the same thing as $(f branch current), so it would grab the newer commit.

i wasn't even aware that branch current was a thing. Again what learned. So ignore what i said about awk, as that's a less efficient way of doing that same thing.

(7) By andygoth on 2021-04-23 02:13:41 in reply to 6 [link] [source]

And I didn't know about whatis. I've been parsing the output of info!

(8) By andygoth on 2021-04-23 02:17:02 in reply to 5 [link] [source]

No, as we've discovered, it's not functionally identical in the face of forks and reused branch names.

For reused branch names, "fossil update branch-name" jumps to the newest instance of the branch name, whereas "fossil update" finds the leaf of the currently-selected instance of the branch name.

For forks, "fossil update branch-name" jumps to the newest leaf, regardless of which fork was currently selected. "fossil update" instead takes pains to remain on the current fork, and it aborts if it encounters any new forks while looking for the leaf.

(9) By george on 2021-04-23 13:53:09 in reply to 7 [source]

As an alternative to 'fossil branch leaf' the functionality you want could be called 'fossil whatis --leafof current'.

BTW, something strange is going on in 'fossil whatis current':

  $ f checkout 797134331481
  ...
  $ f whatis current | grep tags:
  tags:       trunk, wcontent-subsets

Why is 'trunk' there? Bug or feature?

(10) By Stephan Beal (stephan) on 2021-04-23 15:04:28 in reply to 9 [link] [source]

Why is 'trunk' there? Bug or feature?

That definitely seems like a bug, but the problem (if it is one) is in the tagxref data, not in the whatis display:

$ f sql
...
SELECT substr(tagname,5) FROM tag JOIN tagxref ON 
  tag.tagid=tagxref.tagid WHERE tagxref.rid=
  (select rid from blob where
    uuid='797134331481cb9c0e89f72511e4355b05ec03ef37cccc3f2d6cca11b248795c')
  AND tagname GLOB 'sym-*';
'trunk'
'wcontent-subsets'

SELECT tagname,value FROM tag JOIN tagxref ON
  tag.tagid=tagxref.tagid WHERE tagxref.rid=
  (select rid from blob where
    uuid='797134331481cb9c0e89f72511e4355b05ec03ef37cccc3f2d6cca11b248795c')
;
'branch','wcontent-subsets'
'sym-trunk',NULL
'sym-wcontent-subsets',NULL

My initial guess was that the propagation of the trunk tag is not being cancelled at the branch point, but that's not the case. The manifest of that checkin cancels that tag:

T -sym-trunk *
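To make the card above easier to read, here is a small parsing sketch. It assumes the T-card layout `T (+|-|*)tag-name artifact-id ?value?`, with `+` adding a tag to one artifact, `*` adding a propagating tag, and `-` cancelling a tag. That is a reading of Fossil's file format and should be checked against its documentation; the function and key names are hypothetical.

```python
ACTIONS = {"+": "add", "*": "propagate", "-": "cancel"}


def parse_t_card(line):
    """Split a manifest T-card into action, tag name, target and optional value."""
    fields = line.strip().split(" ")
    if len(fields) not in (3, 4) or fields[0] != "T":
        raise ValueError(f"not a T-card: {line!r}")
    marker, tag = fields[1][:1], fields[1][1:]
    if marker not in ACTIONS or not tag:
        raise ValueError(f"bad tag field in T-card: {line!r}")
    return {
        "action": ACTIONS[marker],
        "tag": tag,
        "target": fields[2],  # "*" is taken to mean the manifest's own check-in
        "value": fields[3] if len(fields) == 4 else None,
    }


card = parse_t_card("T -sym-trunk *")
print(card["action"], card["tag"])  # prints: cancel sym-trunk
```

Under that reading, the card says the check-in cancels `sym-trunk` on itself, so a remaining `sym-trunk` row in tagxref would have to come from somewhere after the manifest is parsed.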

The next suspect is that the crosslinking of tags during/after the checkin process is not removing that particular tagxref. My left hand is killing me today, so i'm not going to dig into it right now, but will do so as soon as my hand allows for it.

(11) By anonymous on 2021-04-23 16:25:58 in reply to 8 [link] [source]

...as we've discovered, it's not functionally identical in the face of forks and reused branch names.

Is it possible to make Fossil behavior defined consistently across such cases?

"fossil update branch-name" jumps to the newest instance of the branch name.

In my opinion, this is the expected behavior from a regular user's perspective. If a branch has been renamed, then the new name is the current state of "truth". As I understand it, Fossil renames a branch all the way back to its root.

"fossil update current" finds the leaf of the currently-selected instance of the branch name.

I assume the repository is in the sync'ed state. If the sync'ed state already defines the branch name as a new reference, then this would invalidate the current checkout's branch name. So the update should rely on the current "truth" and detect the mismatch. At the very least, there should be a warning; the resulting state should point to the now-current branch name. That way the user stays on the same line of descent of commits, not of the branch name.

"fossil update branch-name" jumps to the newest leaf, regardless of which fork was currently selected.

This one depends on the current state. The same principle applies: maintain the currently selected line of descent. On the other hand, if the user is on a different branch and tries to update to a multi-leaf branch, then at least there needs to be a warning about the ambiguity, requiring an explicit hash for the leaf or a leaf number (branchname#1, for example).

"fossil update current" instead takes pains to remain on the current fork, and it aborts if it encounters any new forks while looking for the leaf.

Here it seems that update reasonably tries to maintain the current line of descent. Again, if sync brought in an upstream fork, it's reasonable to show an ambiguity warning and request the explicit reference as above.

Just to reiterate the question, would it be possible to define a consistent behavior to harmonize these cases?

(12) By Stephan Beal (stephan) on 2021-04-23 17:41:40 in reply to 11 [link] [source]

Just to reiterate the question, would it be possible to define a consistent behavior to harmonize these cases?

Sure it's possible, but...

Is it possible without breaking existing usage patterns/expectations? That's likely unknowable until we break it and hear from someone who's experiencing grief caused by the change.
Is it otherwise worth the effort and potential breakage? The cases in which this hair-splitting makes a functional difference are few, far between, and quickly resolved. (Not to belittle hair-splitting - it's practically a hobby of mine.)

(13) By andygoth on 2021-04-23 18:25:59 in reply to 11 [link] [source]

"fossil update current" doesn't do anything, except sync if autosync is turned on. "current" refers to the current checkout, not to the leaf. What I'm shooting for is "fossil update" to be 100% an alias for "fossil update leaf".

(14) By andygoth on 2021-04-23 18:32:44 in reply to 9 [link] [source]

Rather than --leafof, I'm tempted to have "leaf:" be a prefix like "root:" and "merge-in:". Thus, the "leaf" I proposed above would instead be spelled "leaf:current". (The special tag "leaf" could itself be an alias for "leaf:current".) This would make leaf finding usable in more contexts without an intervening call to whatis.
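As a rough illustration of the spelling proposed here, the sketch below splits a name into an optional prefix and a base name, and treats a bare "leaf" as "leaf:current". The function name and the `KNOWN_PREFIXES` tuple are hypothetical, and "leaf" is the proposed prefix, not an existing one; this is not how Fossil's name resolver is written.

```python
KNOWN_PREFIXES = ("root", "merge-in", "leaf")  # "leaf" is the proposed addition


def split_name(name):
    """Return (prefix, base); prefix is None when the name has no known prefix."""
    if name == "leaf":
        name = "leaf:current"  # proposed alias
    prefix, sep, base = name.partition(":")
    if sep and prefix in KNOWN_PREFIXES:
        return prefix, base
    return None, name


print(split_name("leaf"))        # prints ('leaf', 'current')
print(split_name("root:trunk"))  # prints ('root', 'trunk')
print(split_name("trunk"))       # prints (None, 'trunk')
```

Checking the prefix against a fixed list keeps names that merely contain a colon, such as timestamps, from being misread as prefixed names.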

(15) By anonymous on 2021-04-23 19:17:06 in reply to 13 [link] [source]

"fossil update current" doesn't do anything

Sorry for the confusion, that should be "fossil update". I meant "current" in the sense of the current state, not the check-in that the name current refers to.

(16) By andygoth on 2021-04-23 21:17:20 in reply to 15 [link] [source]

Interestingly, I am using "fossil update current" in one place in order to exploit the autosync side effect, since this is driven by a script that always calls "fossil update" and will pass an argument if given. I don't have a separate script that calls "fossil sync" instead, since "fossil update current" does the same thing.

I sometimes need this manual sync in combination with taking the Fossil server down, since the Fossil binary itself is in the repository and can't overwrite itself while it's also running as a server from which it's expected to sync. Thus, the process is to manually sync, stop the server, update (ignoring the sync error), then start the server again.
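The four steps above could be scripted roughly as follows. This is a sketch, not the actual script: `stop_cmd` and `start_cmd` are placeholders for however the server is really stopped and started, and `run` defaults to `subprocess.run` so a fake can be substituted when experimenting.

```python
import subprocess


def update_with_server_down(stop_cmd, start_cmd, run=subprocess.run):
    """Sync, stop the server, update (ignoring the sync error), restart the server."""
    # 1. Manual sync while the server is still up; "current" leaves the checkout in place.
    run(["fossil", "update", "current"], check=True)
    # 2. Stop the server so the fossil binary can be overwritten.
    run(stop_cmd, check=True)
    try:
        # 3. The autosync attempt is expected to fail now, so the exit status is not checked.
        run(["fossil", "update"], check=False)
    finally:
        # 4. Bring the server back even if the update step raised.
        run(start_cmd, check=True)
```

A call would look like `update_with_server_down(["my-stop-script"], ["my-start-script"])`, where both script names are hypothetical.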

(17) By Stephan Beal (stephan) on 2021-04-24 00:04:45 in reply to 10 [link] [source]