Fossil discussion on Lobsters

(1) By Richard Hipp (drh) on 2019-08-23 16:10:15 [link] [source]

Just FYI: Some discussion of Fossil vs Git on Lobsters.

(2.3) By Steve Schow (Dewdman42) on 2019-08-23 17:26:49 edited from 2.2 in reply to 1 [link] [source]

The discussion keeps rearing its head.. :-)

I have been using fossil for a couple years now and I still think the sheer simplicity of setting it up is worth its weight in gold for many of us. It's just unbelievably easy to set up and administer, and for small projects like mine...MORE than adequate in terms of scalability.

Recently I have been investigating the possibility of moving from fossil to Git. Why? Not because of the SCM itself. Fossil is so much more intuitive to use, while Git still scares me a bit. But mainly it's because of all the integration with other tools. Git is everywhere now. Like it or not, Git has become the de facto standard. Microsoft is getting behind it too now so I can easily see Git being the de facto standard for decades in the future. That means even more integration with Git will continue to be everywhere and Git skills would not be a bad thing to have.

It keeps coming back, however, to the fact that fossil is just so easy to set up. A single DB file and it's all in there. That is a thing of beauty. Easy to back up. The fact that tickets and wiki are built into it, in one executable, is also simply wonderful. I have been investigating lately some alternatives like gitea, redmine, phabricator, gitlab and a few others. Some of them are very nice, but setting them up, making sure they are backed up properly, and all the rest...not trivial. Fossil is truly awesome as long as you don't need anything it doesn't do. So far I have just told myself to use fossil, live without the things it can't do...and frankly..that is probably good enough.

But nonetheless, due to the fact that Git is becoming the de facto standard, that git tools are becoming quite nice, that there are so many great git based cloud services, etc., I am considering moving to a git repo workflow. Here are some of the reasons:

1. Redmine, phabricator and other project management solutions integrate with Git and they provide certain things that fossil does not. For example the ability to have multiple repos sharing one project management DB, with projects that span across scm repos, etc. The ability to have a single consolidated place to see my tickets across multiple repos, project planning capabilities beyond simple tickets, etc.
2. Rebase. Rebase can be abused, I realize. But I am one of those people that would prefer to be able to commit away with reckless abandon as I work on something and then squash commits down to one or two meaningful commits at certain points of time. That is not the fossil way and never will be from a philosophical standpoint. When I use fossil I find that I tend to avoid committing things too much in order to avoid getting too many trivial commits. But I would rather commit often, regardless of how trivial, while working on a feature branch...but then what gets pushed back to a central repo will not have so many trivial commits, just one or a couple meaningful commits.
3. Easier to go back and forth between a cloud host such as GitHub. Let's face it, GitHub is the de facto standard for sharing open source projects and likely will be for decades in the future. Simple as that. BitBucket is pretty darn good too and growing like crazy.
4. Lots of GUI tools available for Git that are quite good. I'm really impressed lately with something called "Fork". Also endless integration with just about all IDEs.
5. Even most web hosting services support git repos. It just is what it is, everyone is using Git.

All that being said, I'm still torn on it, because fossil is just so so so simple to set up and use, my repos are not huge, and I have no scalability concerns whatsoever with fossil. I don't have to scale to many many users, etc. Fossil also keeps me out of trouble by not allowing rebase or other scary merges that could screw up the repo, as has allegedly happened to many git users trying to get tricky with it. For my use case, fossil is really ideal in so many ways, but the above points, especially the first two, are why I am thinking about transitioning to more of a git-all-the-time workflow.

Still undecided though, and as I said, so far I'm finding the ticket and wiki aspects for git based systems to be a bit much to deal with. Gitea is not bad: it stores all the tickets and wiki in a sqlite db, but keeps the git repo in the filesystem. It is pretty darn simple though... but its ticket system is just ok, I like fossil's ticket report capabilities a lot more. Main advantage though is that you have one common ticket DB that can span across multiple SCM repos...which fossil can't do...fossil keeps each repo independent, which granted is part of its elegant simplicity, but it's also a limiting factor for me.

(93) By anonymous on 2019-08-26 23:36:53 in reply to 2.3 [link] [source]

[1] Redmine, phabricator and other project management solutions

As was discussed in other posts, Fossil could be enhanced to provide some of those features. Tickets and wiki pages can certainly be centralized, the weak link being an easy way to link to commits (and other artifacts) in the project repos, and vice-versa. User IDs can be shared via a login-group, though permissions are on a per-repo basis. As for project planning, I assume you mean like Gantt Charts? While the document file (and maybe a SVG or PDF export) could be stored in Fossil, I don't see Fossil including its own Gantt Chart engine.

[2] Rebase. ... I am one of those people that would prefer to be able to commit away with reckless abandon as I work on something and then squash commits down to one or two meaningful commits at certain points of time.

The only thing preventing you from committing "with reckless abandon" is your own inhibitions.

If you really feel the need to hide that, work on private branches and publish to a public branch.

Also, if I am remembering correctly, from a git repo, you can tell git which branch to push. So, do your work in Fossil, use fossil git export to update your local git repo, then git push <repo> <refspec> to push your "clean" history to your public git repo.

[3] Easier to go back and forth between a cloud host such as GitHub.

See above for pushing out. Bringing contributions in is harder. If you use github, it doesn't have a way to disable pull requests. I don't know about gitlab or others.

(I wonder how many pull requests DRH is getting from his github mirror of the sqlite repo.)

But Sourceforge does host Fossil repos: fossilrepos.sourceforge.net (I just tested it; it was working as of when I posted.)

[4] Lots of GUI tools ... integration with just about all IDE's

Not sure what to say about GUI tools. TortoiseFossil would be nice to have. I think someone tried to do that.

As best I can tell, the main obstacle to IDE integration is insistence on a "proper plug-in". Apparently, a plug-in that uses fork/exec or spawn to run Fossil is "improper". Of course, the lack of a common standard for even VCS plug-ins is also a huge obstacle.

[5] Even most web hosting services support git repos

Not sure what to say, other than noting that Sourceforge supports Fossil (see above).

(95) By Warren Young (wyoung) on 2019-08-27 02:47:19 in reply to 93 [link] [source]

I wonder how many pull requests DRH is getting from his github mirror of the sqlite repo.

Unless the stats have somehow been reset recently, none.

That trips my confirmation bias: as a contributor to several FOSS projects not using the current hotness in development tools, I've repeatedly seen posts to the project's user forum like "I'd contribute if I could only send GitHub PRs!"

I realize SQLite wouldn't accept PRs even if they were sent, due to contribution policies, not even in patch form, but you'd think if lack of an easy public PR mechanism were the real reason people aren't contributing, there would be some PRs hanging out in there, unable to be accepted or rejected.

What this tells us is that we probably haven't missed much in our collective public Fossil repos by not having a push/pull request feature.

(98) By anonymous on 2019-08-27 10:03:49 in reply to 95 [link] [source]

I've repeatedly seen posts to the project's user forum like "I'd contribute if I could only send GitHub PRs!"

I didn't mean to imply that pull requests are a requirement, rather that I think it's likely others would assume any project on github accepts pull requests.

Personally, I'd rather not deal with git, but projects I value do use git. Fossil is far better. And I say that after having used git for 2 years before encountering Fossil.

(3) By Marcelo Huerta (richieadler) on 2019-08-23 17:28:15 in reply to 1 [link] [source]

They are saying this:

They claim Sqlite is a superior storage method, yet it is widely known for getting corrupted (probably the reason they run integrity checks all the time), lacks the ability of multiple entities accessing it at the same time, and almost all its column types are silently converted to strings columns with no type checks.

This strikes me as untrue in at least two notions: 1) is corruption, really, a frequent complaint about SQLite? 2) are not the values stored depending on type, and not "silently converted to string columns with no type check"?
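On the second question, a quick check is possible with Python's standard `sqlite3` module. This is only a minimal sketch: the table and column names (`samples`, `typed`, `value`, `n`) are made up for illustration, and it shows SQLite's behaviour in general rather than anything specific to how Fossil uses it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A column declared without a type has no affinity, so each value keeps
# the storage class it was inserted with.
conn.execute("CREATE TABLE samples (value)")
conn.executemany(
    "INSERT INTO samples (value) VALUES (?)",
    [(42,), (3.5,), ("text",), (b"\x00\x01",), (None,)],
)
storage_classes = [
    row[0]
    for row in conn.execute("SELECT typeof(value) FROM samples ORDER BY rowid")
]
print(storage_classes)

# A column declared INTEGER converts numeric-looking text to an integer,
# but stores other text as text instead of rejecting it.
conn.execute("CREATE TABLE typed (n INTEGER)")
conn.executemany("INSERT INTO typed (n) VALUES (?)", [("123",), ("abc",)])
typed_classes = [
    row[0] for row in conn.execute("SELECT typeof(n) FROM typed ORDER BY rowid")
]
print(typed_classes)

conn.close()
```

So values are stored according to their type, and the conversion that does happen on a typed column goes toward the declared type, not toward strings. What is true is that a mismatched value is accepted rather than refused.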

(4) By Steve Schow (Dewdman42) on 2019-08-23 17:39:02 in reply to 3 [link] [source]

In my opinion, SQLite is being used in a LOT of places and I do not hear anyone complaining about corruption.

"Widely known"? I haven't seen that but maybe I've missed it?

I'm sure it can theoretically happen and must happen in some rare cases. I'm sure it can happen with any SQL DB. But that is what backups are for. Another thing of beauty in fossil is that you have a very easy distributed system where all branches are always sync'd everywhere (typically), so if you have your repo cloned and sync'd to several different places...for the most part it will be a complete backup right there. All repos always have everything that everyone is working on. And the tickets and wikis too!

(6.1) By Marcelo Huerta (richieadler) on 2019-08-23 17:55:36 edited from 6.0 in reply to 4 [link] [source]

"Widely known"? I haven't seen that but maybe I've missed it?

My thoughts exactly. I cannot post there but at some point someone is going to have to challenge that...

Another thing of beauty in fossil is that you have a very easy distributed system where all branches are always sync'd everywhere (typically), so if you have your repo cloned and sync'd to several different places.

And I'm sure the gitters hate that. They have only the parts of the repo that are relevant to the branch they're on.

In any case, what they hate the most is that they cannot rebase. They do all kinds of manipulation in their staging areas (which we don't have in Fossil) when they could commit as often as they wanted in a local private sub-branch and then commit to the feature branch in one go. Nobody else has to see your mess... (That, if the "hidden" tag isn't enough for you.)

What I like the most about Fossil, besides the all-in-one functionality, is that branches are simply labels which you can edit after the fact. Lovely.

(8) By Richard Hipp (drh) on 2019-08-23 17:57:15 in reply to 3 [link] [source]

Sqlite is ... widely known for getting corrupted....

That's just misinformation.

I probably hear more SQLite corruption cases than most people, since I'm the one people complain to. There are over 1 trillion SQLite database files in active use and yet we rarely hear of problems. And when a problem does arise, it is usually traceable to some outside cause.

(30) By mattwell on 2019-08-23 19:58:38 in reply to 8 [link] [source]

If the context is a git multi-file datastore vs. Fossil's single-file store then it is clear that the single-file store is far more vulnerable to system and user created problems. However, only a very small subset of users actively pushing multi-user and/or multi-machine systems hard will ever be affected.

As a wish list item it would be great if fossil had a feature for storing large data blobs as external files. Large sqlite3 databases alone are not the problem; it is when a large commit is made against a large database in a busy system. Fossil appears to have places where it does not wait long for a lock and things break. Because fossil must store the blob internal to the db, the locks are held for a long time on some commits and anything going wrong in that time will wreak havoc. A ^C or worse "kill -9 ..." of the fossil process and you'll have a locked db. On Linux the lock will stick to the file until you recreate the file - it will even follow a "mv ...". I can see where inexperienced users might conclude that sqlite3 is brittle.

Making the db lock timeout a configurable item and giving the user feedback when waiting for the lock would also be helpful enhancements for those of us trying to use fossil in megacorp environments :) . Make sure you get all the dbs. We have occasionally had problems with all the dbs, home dir .fossil, .fslckout and the *.fossil files.
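For anyone who wants to see what a lock timeout looks like at the SQLite level, here is a minimal sketch using Python's standard `sqlite3` module. It is not Fossil's code: the file name `demo.db`, the table `t`, and the 0.2 second wait are arbitrary choices for illustration. It only shows the general behaviour of one writer holding the lock while a second writer waits and then gives up.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN and COMMIT statements below are issued explicitly.
writer = sqlite3.connect(db_path, isolation_level=None)
writer.execute("CREATE TABLE t (x)")
writer.execute("BEGIN IMMEDIATE")  # take the write lock and hold it
writer.execute("INSERT INTO t VALUES (1)")

# A second connection that is only willing to wait 0.2 seconds for the lock.
impatient = sqlite3.connect(db_path, timeout=0.2, isolation_level=None)
try:
    impatient.execute("BEGIN IMMEDIATE")
    gave_up = False
except sqlite3.OperationalError as exc:
    gave_up = True
    print("second writer gave up:", exc)

writer.execute("COMMIT")  # release the lock

# With the lock released, the same connection can now write normally.
impatient.execute("BEGIN IMMEDIATE")
impatient.execute("INSERT INTO t VALUES (2)")
impatient.execute("COMMIT")
row_count = impatient.execute("SELECT count(*) FROM t").fetchone()[0]
print("rows after both writers:", row_count)

writer.close()
impatient.close()
```

A longer `timeout` simply makes the second writer wait longer before failing, which is roughly the knob being asked for above.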

(5) By sean (jungleboogie) on 2019-08-23 17:50:18 in reply to 1 [link] [source]

Well it's not surprising that there are many folks who have strong feelings about git. As Warren pointed out in another thread recently, git will likely be the dominant source control choice for years and years to come. github is popular, powerful and priced right for people who want to advertise their work.

This is encouraging, though.

Honestly? I don’t actually care which SCM system I use, as long as it talks to GitHub, because that’s where all my code and all my employer’s code lives.

What is surprising is the strong feeling people have about rebase. I had no idea it was so widespread and a requirement for many developers. I don't know what's wrong with recording mistakes made in code later showing it was cleaned up. Anyone have any insight on that?

I'm pleased drh's and Warren's hard work is being read in this article, but I'm surprised and disappointed by the commentary surrounding it.

(7) By Marcelo Huerta (richieadler) on 2019-08-23 17:54:33 in reply to 5 [link] [source]

What is surprising is the strong feeling people have about rebase. I had no idea it was so widespread and a requirement for many developers.

I don't think this is intrinsically so. I'd venture that they have been spoiled with the relative ease Git provides for rewriting history.

I don't know what's wrong with recording mistakes made in code later showing it was cleaned up. Anyone have any insight on that?

Too much concern about appearances? "To look like a good programmer one must never commit silly mistakes, or at the very least never acknowledge them"? I don't get it either.

I'm pleased drh's and Warren's hard work is being read in this article, but I'm surprised and disappointed by the commentary surrounding it.

100% agreed.

(9.3) By Steve Schow (Dewdman42) on 2019-08-23 18:14:57 edited from 9.2 in reply to 5 [link] [source]

The whole downside of Git is that you can theoretically re-write history with reckless abandon, thus creating as many problems as you solve. Right?

But since you ask why, I want to commit often. I do not want to come back a year later and look through a bunch of commits that have no individual meaning. In fact I would prefer to have exactly one commit per ticket!

I think that is why people want rebase more than anything else. When doing code reviews and such, it simplifies things a lot if the final commits that people will review, including myself or others...both now and/or later in the future will have more meaningful atomicity.

Fossil basically takes a stand that re-writing history is verboten, and with good rationale: it's perfectly possible for history re-writing to be abused. But it's also possible to not abuse it, and the fact that fossil doesn't really allow commit squashing...is a problem for a lot of people. I have been living without it because fossil is just so easy to use, but truthfully...if I switch to git, I will look forward to being able to consolidate my commits, no question about that.

And yes a lot of people use that as a rationale for disliking fossil, though I feel that on its own that is an overblown reason. I have been using fossil for quite a while now without being able to squash my commits and it hasn't really slowed me down. But I do tend to commit less often than I probably should be.

I think also in the fossil world, it makes a lot less sense to allow rebase because of the way repos are sync'd up. You really don't have that much of a local repo like you can have with git. With Git you can be isolated in your local repo, and commit, branch, merge or whatever you want to do, without consequence to the parent repo until you push it back. So creating a lot of commits and then rewriting those commits later...is not a problem and should not be. It's no different than manipulating your text files as you work on them, but you're manipulating the way those source files are versioned in your local view. Only when you're ready to push back to a parent repo do you care what the commits look like, and the ability to squash them becomes extremely useful. That is git.

Fossil tends to be more oriented towards syncing often and keeping all the repos in sync, even with the feature branches being worked on... so now your commits are kind of already final the minute you commit them. Rebasing would be problematic in fossil because of that.

(13) By Warren Young (wyoung) on 2019-08-23 18:34:33 in reply to 9.3 [link] [source]

I would prefer to have exactly one commit per ticket!

What if you had one commit per ticket on the trunk? Would that suffice?

If so, then you work on each ticket on a branch, then merge it to trunk when it's complete.

That's Fossil's answer to "rebase." The merge commit provides "squash all work to a single commit."

It doesn't have to be a separate branch per ticket. It could be one branch per user, or one branch per task type, or whatever else makes sense to you. Fossil is perfectly happy to merge multiple times from the same branch back into its parent.

(18) By Steve Schow (Dewdman42) on 2019-08-23 19:00:28 in reply to 13 [link] [source]

merging feature branches back to a single commit on mainline during merge is the next best thing and definitely how I prefer to work for exactly that reason, even though I am just a one man shop that doesn't really need feature branches otherwise, but it does give me the ability to target one commit as the one representing a certain ticket, even if there were numerous smaller commits inside the branch.

Still...I prefer to not have many little commits of meaningless atomicity. So with fossil I tend to not commit anything unless I reach a certain checkpoint in my coding where my instincts tell me it's a good time to commit. But what if my instincts are wrong? And sometimes they are wrong. So I tend to not commit as often as I should while working because I want my final commits to have more meaning. This commit embodies some set of changes that make sense together. The next commit is another set of meaningful changes. Honestly I like having exactly one commit per ticket, but it's certainly possible that I might end up with 2 or 3 meaningful commits that associate with one ticket...for some reason..maybe a followup commit..

What often happens is that you commit something and then realize you needed something else, and pretty soon you have a few annoying commits when it all should have been in one easy to grok commit and nobody should have had to even think about looking at several commits together to grok the overall change.

In team code review I prefer to have one commit for the ticket, others might review it, then a second commit for the changes after peer review. 6 months later if someone wants to come back and look at it, they know there are a couple of commits for the ticket and the commits each have meaning more than "forgot to add a comment"

(102) By anonymous on 2019-08-27 19:49:17 in reply to 18 [link] [source]

In team code review I prefer to have one commit for the ticket

fossil diff --from TIPOFTRUNK --to TIPOFBRANCH

will show you the changes that will be applied to trunk in a merge. Basically, a merge preview. Doesn't matter how many commits are in the branch. Doesn't matter if you squash all the commits down to 1. The result will be the same.
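The point that the review diff depends only on the two endpoints can be sketched in a few lines of Python with the standard `difflib` module. This is a toy model, not how Fossil computes diffs: each "commit" is just a full snapshot of one file, and the file contents and the `net_diff` helper are invented for illustration.

```python
import difflib


def net_diff(old, new):
    """Unified diff between two snapshots, ignoring anything in between."""
    return list(difflib.unified_diff(old, new, "trunk", "branch", lineterm=""))


trunk_tip = ["def area(w, h):", "    return w * h"]

# Three small commits on the branch, each a full snapshot of the file.
many_small = [
    trunk_tip + ["", "def perimeter(w, h):", "    return w + h"],
    trunk_tip + ["", "def perimeter(w, h):", "    return 2 * w + h"],
    trunk_tip + ["", "def perimeter(w, h):", "    return 2 * (w + h)"],
]

# The same work squashed into a single commit: only the final snapshot remains.
squashed = [many_small[-1]]

diff_many = net_diff(trunk_tip, many_small[-1])
diff_squashed = net_diff(trunk_tip, squashed[-1])

print("\n".join(diff_many))
print("commits on branch:", len(many_small), "vs", len(squashed))
print("same review diff:", diff_many == diff_squashed)
```

The number of commits differs, but the diff a reviewer sees from the tip of trunk to the tip of the branch is identical either way.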

99.999% (or more) of the time, the only thing the team members reviewing the changes are going to look at is this diff.

A tool my team used to use, reviewboard, presented exactly this diff to the reviewers. We stopped using reviewboard shortly after we started using Fossil as it was easy enough to do in Fossil and having all tickets in Fossil is better than what we used to do (issue tickets in bugzilla and review tickets in reviewboard).

Interestingly, one thing we now do, "review check sheets" (mandated by management), turned out to be easier to implement around Fossil than would have been with reviewboard.

Sorry I can't share the script, but what it does:

1. Use fossil ticket show to extract the review details.
2. Use fossil diff -side-by-side to get the actual differences.
3. Format the diff and other information into a spreadsheet: one row per diff line and columns for Ok/NotOk and Review Comments.

Unfortunately, our IT division doesn't allow applications (other than Outlook) they didn't develop to send email.

(103) By sean (jungleboogie) on 2019-08-27 19:51:49 in reply to 102 [link] [source]

Unfortunately, our IT division doesn't allow applications (other than Outlook) they didn't develop to send email.

So no one ever reviews Fossil diffs in a web browser?

(107) By anonymous on 2019-08-27 21:04:34 in reply to 103 [link] [source]

So no one ever reviews Fossil diffs in a web browser?

Actually, we do, but at some point we have to have a formal review. That's what the review check sheets are for.

(Has to do with demonstrating compliance with several process related standards - ISO-9000 and several others.)

(104) By Marcelo Huerta (richieadler) on 2019-08-27 19:59:29 in reply to 18 [link] [source]

Still...I prefer to not have many little commits of meaningless atomicity. So with fossil I tend to not commit anything unless I reach a certain checkpoint in my coding where my instincts tell me it's a good time to commit. But what if my instincts are wrong? And sometimes they are wrong.

You can commit to a private sub-branch of your feature branch, as frequently as you want. When you reach a point you are willing to share with others, you merge the private sub-branch into your public feature branch; it will appear as only one commit in the public branch.

6 months later if someone wants to come back and look at it, they know there are a couple of commits for the ticket and the commits each have meaning more than "forgot to add a comment"

Fossil allows you to show only checkins in the timeline. An edit to fix a comment or to change a tag is not a checkin, so it will not appear as an additional checkin entry. I think the default is to show only the checkins.

(10) By Richard Hipp (drh) on 2019-08-23 18:13:39 in reply to 5 [link] [source]

I think I understand why many people like rebase.

The way Git works is that the user tends to only see one branch at a time, not the whole tree as you do in Fossil. If you are only seeing one branch at a time, then you don't want excess clutter on that branch. Rebase can then come in handy in helping to keep things neat and tidy.

With Fossil, you put your crazy experimental stuff on a transient branch. "Rebasing" in Fossil means merging from your experimental branch onto trunk. So, for example, I recently "rebased" Warren's server documentation changes to trunk. (here). In a sense, Fossil does have rebase - we just call it "merge". The main difference is that rebase discards the history of the branch you merged from.

Fossil also lets you move check-ins from one branch to another. If you are happily committing away and then decide that your previous 7 commits are not as awesome as you originally supposed, you can just move that sequence of commits to a branch after-the-fact, and park them there permanently, or merge them back to trunk after cleanup, as you see fit.

I think that if I was forced to use a system that made it difficult to see more than one branch at a time, and that prevented you from moving check-ins to a new branch after-the-fact, I'd probably want a rebase capability too.

Thankfully, I'm not forced to use such a system :-)

(11) By Steve Schow (Dewdman42) on 2019-08-23 18:20:57 in reply to 10 [link] [source]

in the git world there is a so called "golden rule" that you're not supposed to rebase anything that has already been pushed back to a repo shared by others. Basically that is the thing fossil avoids altogether by not even having rebase.

Rebase and merge are very similar, but rebase can be abused much more easily if done in the wrong direction, and if it's done to something that was already pushed, it can seriously mess other people up...so that's the thing.

With git, the overall working model is to work more in an isolated cloned repo, making as many commits as you want without pushing back to anyone. In that kind of world, being able to manipulate all of your commits and rewrite history as you wish is useful, some may consider mandatory. Thus the arguments for git rebase.

I think with fossil, that is just not really part of the paradigm. If fossil ever did have rebase I think it would be better served by being a bit more limited than the rebase in git, providing maybe only a limited ability to squash commits...and not rebase in the wrong direction and/or break the so called "golden rule".

(12) By Richard Hipp (drh) on 2019-08-23 18:33:31 in reply to 11 [link] [source]

Thanks for the analysis.

This gives me an idea. Perhaps we should consider a "fossil rebase" command that has the restriction of only working for check-ins that have not been pushed to another repo. What would the syntax of such a command be?

Note that the repository database schema has an UNSENT table that conveniently records the recordID or "rid" of every unpushed artifact.
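To make the idea concrete, here is a rough sketch of what an "only unpushed check-ins" guard might look like, written in Python against a mock in-memory database. Only the UNSENT table and its rid column come from the post above; the simplified `blob` table, the sample hashes, and the helper names `is_unpushed` and `can_rebase` are assumptions for illustration, not Fossil's actual schema or API.

```python
import sqlite3

# Mock stand-in for a repository database (simplified, assumed layout).
repo = sqlite3.connect(":memory:")
repo.executescript(
    """
    CREATE TABLE blob (rid INTEGER PRIMARY KEY, uuid TEXT UNIQUE);
    CREATE TABLE unsent (rid INTEGER PRIMARY KEY);
    INSERT INTO blob (rid, uuid) VALUES (1, 'aaa111'), (2, 'bbb222'), (3, 'ccc333');
    -- Artifacts 2 and 3 have not been pushed anywhere yet.
    INSERT INTO unsent (rid) VALUES (2), (3);
    """
)


def is_unpushed(conn, uuid):
    """True if the artifact with this hash is still listed in unsent."""
    row = conn.execute(
        "SELECT 1 FROM unsent JOIN blob USING (rid) WHERE blob.uuid = ?",
        (uuid,),
    ).fetchone()
    return row is not None


def can_rebase(conn, uuids):
    """Hypothetical guard: allow the operation only if nothing was pushed."""
    return all(is_unpushed(conn, uuid) for uuid in uuids)


print(can_rebase(repo, ["bbb222", "ccc333"]))  # both still local
print(can_rebase(repo, ["aaa111", "bbb222"]))  # aaa111 was already pushed
```

A real command would also have to decide what happens when autosync pushes a check-in right after the commit, which is the concern raised in the replies.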

(15) By Chris (crustyoz) on 2019-08-23 18:51:47 in reply to 12 [link] [source]

If I understand correctly, that would mean turning OFF automatic sync prior to commits. The fossil rebase command would be responsible for turning ON sync just for its operation.

I use automatic sync during commits as a form of default backup to another server, in part because I sometimes work on two local machines, and in part because I don't trust my main computer to start up in the morning (it's getting cranky in its old age - like me). In the latter case, working without sync would deny me access to yesterday's changes.

(16) By Steve Schow (Dewdman42) on 2019-08-23 18:55:35 in reply to 15 [link] [source]

yep. and I view Fossil's automatic sync as one of the great features about it. I actually don't want to turn it off. But this does make it so that any attempt to add some kind of rebase in fossil should be done in a more limited fashion to avoid the kinds of problems the git users can potentially get into when they use rebase in a shared situation.

It only matters if you rebase a branch that was already pushed back AND someone else is also working off that feature branch. If you're the only one working on that feature branch, then it doesn't matter if it's pushed back to the parent repo... it's all still good.

But if you push back a rebased branch and someone else is working on the same branch in non-rebased form, then ugly and hard to untwist merge conflicts will likely arrive to the other person who doesn't know WTF you did and will need your help to unravel it.

(19.1) By Richard Hipp (drh) on 2019-08-23 19:05:34 edited from 19.0 in reply to 16 [link] [source]

What if "fossil rebase" worked like "git rebase" in that it would operate on any commits in the repository, except that instead of deleting the commits you rebased from, it merely marks them as "hidden"? Then other people who are working off of changes you rebase can continue to do so without harm. It is just that the rebased check-ins no longer appear in the timeline.

(I'm a little skeptical of this idea myself. I just want to put it out there for discussion.)

(Edit: I'm also skeptical of the "hidden" flag, fwiw. If it were only up to me, I wouldn't have a "hidden" flag. :-))

(21) By Steve Schow (Dewdman42) on 2019-08-23 19:11:38 in reply to 19.1 [link] [source]

Personally, I don't like the idea of hiding commits at all. If they are there, leave them showing. They are there and need to be shown.

It's not only squashing, but sometimes people might do say 4 commits in a row, but then they realize commits 1 and 3 are related to each other and 2 and 4 are related to each other and they'd rather have two commits with the related changes in each one. This is because they were coding away and committing often and they didn't know where it would end up as they executed each commit, but now in retrospect they wish they had those two commits for posterity or for peer review.

So with git you can interactively rebase and reorganize the commits, merge the commits together, into one or several commits as you wish. This is very handy...so long as nobody else is working on the same feature branch using any of these referenced commits already...in which case...it should not be allowed!

Also remember that rebase is not only about squashing commits, though a lot of people I think use it for that; it's also an alternative to merge....that brings the parent's changes up to the feature branch in a bit cleaner way...the end result is when you merge the feature branch back to the trunk it will seem more linear.

Pro-Git book explains merge vs rebase very well...

(31) By Stéphane Aulery (LkpPo) on 2019-08-23 21:55:41 in reply to 21 [link] [source]

Also remember that rebase is not only about squashing commits, though a lot of people I think use it for that; it's also an alternative to merge....that brings the parent's changes up to the feature branch in a bit cleaner way...the end result is when you merge the feature branch back to the trunk it will seem more linear.

I had colleagues who swear by rebase. Like others, they spread the idea that working with merge is impossible, too noisy, prone to errors, and pollutes the history. There is a fashion effect around rebase. It's THE thing to do.

They came to this idea by using git intuitively without RTFM, destroying repositories or losing commits. After a few experiences like that, they went through training and things started to work. Instead of concluding that they should have RTFM and used the features on purpose, they concluded that merge is evil and the last thing to do.

It's probably not everyone's way, but I suspect laziness is compounded by the difficulty of taking in a large amount of data as the history grows.

(32.1) By Steve Schow (Dewdman42) on 2019-08-23 22:39:12 edited from 32.0 in reply to 31 [link] [source]

There are plenty of people that realize that neither merge nor rebase is inherently evil or inherently better. There is a time and place for both things, and for some people it's just a preference. The main problem with git is that it doesn't guide you towards making the right choice about when to use something like rebase vs when to use merge. And that is why people get into trouble with it.

At my last big development job, which was admittedly 10 years ago, we were using ClearCase, where rebase was part of our workflow and I just got used to it for at least 5 years.

I personally do like using Rebase at the end of a feature just before I merge it back to mainline. I would rebase from mainline into my feature branch, resolve any conflicts, build and test again and then merge it back to mainline. With Fossil I can't rebase, so I merge into feature branch before merging back, which is ALMOST but not quite the same.

When you merge in, merge out, you end up with a string of commits in your branch followed by an odd merge commit where the mainline head merges into your branch, which is always fun to sort through...but it's kind of in reverse order. On the other hand, when you rebase into the feature branch, first off your branch kind of starts over by splitting off from a later point at the head of mainline...eliminating many merge conflicts...and then your own changes are added to the end of that...which is very linear and sensible, but might involve some merge conflicts...then after building and testing again...you have a very easy merge back to mainline, and the resulting series of commits on mainline is very sensible, with the same mainline commits followed by one commit from your feature.

There is nothing inherently problematic with rebasing as a concept... But with git, it's all too easy to use it when you probably shouldn't be.

(34) By Stéphane Aulery (LkpPo) on 2019-08-23 22:54:11 in reply to 32.1 [link] [source]

There is nothing inherently problematic with rebasing as a concept... But with git, it's all too easy to use it when you probably shouldn't be.

I have no problem with either rebase or merge, but I note that people would prefer there to be only one right way, and behave as if that were the case.

(62) By anonymous on 2019-08-26 16:37:18 in reply to 32.1 [link] [source]

...eliminating many merge conflicts...and then your own changes are added to the end of that...

Conflicts mean you made changes that overlap someone else's changes. If you then re-apply your changes after the other person's changes, how can you be sure you aren't corrupting their changes? [1]

Sure, you might have made your changes the same even if you really did your changes after the other person's, but can you really be sure?

At least if you really make your changes after the other's, you've seen the code you are changing. With rebase, you are blindly remaking your changes to code you haven't seen.

Resolving conflicts gives you the chance to see what others have done and think about adapting your changes. If rebasing is "eliminating many merge conflicts", it is also eliminating clues to potential problems.

[1] If you didn't make changes to foo.c, then merging the mainline back into your branch should not result in conflicts with foo.c - if you are seeing conflicts, then there are other problems that need to be investigated.

(63) By Richard Hipp (drh) on 2019-08-26 16:56:03 in reply to 32.1 [link] [source]

when you rebase into the feature branch, ... your branch kind of starts over ...eliminating many merge conflicts....

I don't think you get any fewer merge conflicts when you rebase than when you merge. Do you have a counter-example?

Perhaps you are arguing that you can go ahead and deal with the merge conflicts at the point where you rebase, so that you don't have to deal with them again in the future. But that is equally true if you merge trunk into the feature branch - you deal with the merge conflicts then and there so that they do not bother you down the road.

Note also that having a conflict-free rebase or merge does not mean that the resulting code will actually compile and work. There might be incompatible changes on trunk and the feature branch, that while far enough separated from each other that the rebase/merge algorithm works without conflict, nevertheless prevent the code from running and/or compiling.
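A minimal sketch of that last point, using Git because it is scriptable in a few lines (the file names, branch names, and the `greet`/`salute` functions are made up for illustration). The feature branch renames a function, trunk adds a new caller of the old name, and the merge reports no conflict even though the merged tree contains a script that no longer runs:

```shell
#!/bin/sh
# Sketch: a conflict-free merge that still produces broken code.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk      # name the first branch "trunk"
git config user.name demo
git config user.email demo@example.invalid

# Base: lib.sh defines greet(), main.sh calls it.
printf 'greet() { echo hello; }\n' > lib.sh
printf '. ./lib.sh\ngreet\n' > main.sh
git add lib.sh main.sh
git commit -q -m "base"

# Feature branch: rename greet() to salute() everywhere it is used.
git checkout -q -b feature
printf 'salute() { echo hello; }\n' > lib.sh
printf '. ./lib.sh\nsalute\n' > main.sh
git commit -q -am "rename greet to salute"

# Trunk, meanwhile: add a brand-new caller of the old name.
git checkout -q trunk
printf '. ./lib.sh\ngreet\n' > extra.sh
git add extra.sh
git commit -q -m "add extra.sh, which calls greet"

# The merge touches disjoint files, so there is no textual conflict.
git merge -q --no-edit feature
merge_status=$?                             # reached only if the merge succeeded

# main.sh works, but extra.sh still calls the function that was renamed.
main_out=$(sh main.sh)
if sh extra.sh >/dev/null 2>&1; then extra_status=ok; else extra_status=broken; fi
echo "merge exit status: $merge_status, main.sh: $main_out, extra.sh: $extra_status"
```

A rebase of feature onto trunk would replay the same rename on top of the new caller and end in the same broken state, which is why building and testing after either operation still matters.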

(64) By anonymous on 2019-08-26 17:30:13 in reply to 31 [link] [source]

the idea that working with merge is impossible, too noisy, prone to errors and pollute the history

"pollute the history" is a matter of perspective. Most often, once the end result of a branch is merged into trunk, I don't need to see that branch anymore. Sometimes, though, there will be a problem that is more easily resolved when the branch history is available.

Perhaps if Fossil's timeline (both command line and web views) did not show closed branches, this would mitigate complaints about "polluted history".

As for the other merge complaints...

The git documentation describes rebase as diffing the first branch commit against the common ancestor and applying that to the tip of the destination branch (usually "master", what Fossil calls "trunk"), then committing the result. Then the diff to the next branch commit is applied to the new tip and committed. And so on until all the branch commits have been processed.

If you, when rebasing, choose to squash the branch down to a single commit, you are effectively applying the difference from the common ancestor to the branch tip to the master's tip.

Now, what is a merge if not taking a diff between the common ancestor and the branch tip and applying it to the master's tip (or tip of another destination branch)?

Of course, rebase does not record commits from the source branch as co-parents of the new commits. Merging does record that information.

Maybe the incremental, commit-by-commit approach used by rebase makes merging easier.

Still, rebasing is merging.
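The comparison above can be checked with a small experiment. This is only a sketch, assuming Git is installed; the branch names `merged` and `rebased` and the file names are invented for the demo. It builds the same diverged history twice, integrates it once with merge and once with rebase, and compares the resulting file trees and parent counts:

```shell
#!/bin/sh
# Sketch: merge and rebase reach the same content but record different parents.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk
git config user.name demo
git config user.email demo@example.invalid

echo base > trunk.txt
git add trunk.txt
git commit -q -m "base"

# Two commits on a feature branch.
git checkout -q -b feature
echo one > feature.txt
git add feature.txt
git commit -q -m "feature 1"
echo two >> feature.txt
git commit -q -am "feature 2"

# Trunk moves on in the meantime.
git checkout -q trunk
echo more >> trunk.txt
git commit -q -am "trunk moves on"

# Variant 1: merge feature into a copy of trunk.
git checkout -q -b merged trunk
git merge -q --no-edit feature

# Variant 2: replay the feature commits on top of trunk.
git checkout -q -b rebased feature
git rebase -q trunk

# Same files either way...
merged_tree=$(git rev-parse 'merged^{tree}')
rebased_tree=$(git rev-parse 'rebased^{tree}')

# ...but only the merge commit records the feature branch as a second parent.
merged_parents=$(( $(git rev-list --parents -n 1 merged | wc -w) - 1 ))
rebased_parents=$(( $(git rev-list --parents -n 1 rebased | wc -w) - 1 ))
echo "trees equal: $([ "$merged_tree" = "$rebased_tree" ] && echo yes || echo no)"
echo "parents: merged=$merged_parents rebased=$rebased_parents"
```

The trees match here because the two branches touch different files; with overlapping edits both variants would stop for conflict resolution instead.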

(66) By Richard Hipp (drh) on 2019-08-26 17:46:41 in reply to 64 [link] [source]

rebasing is merging

Indeed it is. Any hypothetical "fossil rebase" command would reuse the merging logic. A rebase really is just a merge together with an auto-commit where the merge deliberately avoids recording the merge parent in the new check-in. In other words, a rebase is a merge (or a sequence of merges) with less history recorded, so as not to clutter the timeline.

So, we could implement rebase as a special tag attached to a check-in that says "ignore branch parents when rendering history". Or we could simply omit the merge parent from the check-in, though I would rather record the information and simply mark it as not intended to be used. Since rebase would be implemented as a special tag, a merge could be converted into a rebase, or a rebase could be converted into a merge, after the fact.

The most problematic aspect of rebase for me is that it creates new check-ins without giving the developer an opportunity to test those check-ins. What happens if there is a merge conflict? Do the merge conflicts become a part of the permanent record of the project?

Perhaps we could work around that problem by saying that all automatic check-ins from a rebase are initially marked "private" and must be manually published by the developer in a separate step.

(67) By Florian Balmer (florian.balmer) on 2019-08-26 17:57:24 in reply to 66 [link] [source]

Perhaps we could work around that problem by saying that all automatic check-ins from a rebase are initially marked "private" and must be manually published by the developer in a separate step.

Maybe this is more a question of "project policy" -- i.e. people could also easily do broken/untested (regular) commits with Fossil repositories, and people who don't care could simply set up a shell alias like "fossil rebase; fossil command-to-publish"?

(68) By Steve Schow (Dewdman42) on 2019-08-26 18:18:29 in reply to 66 [link] [source]

My vote would be to not overcomplicate it. There are several reasons why a full rebase as conceived in other products may never be part of Fossil. One major reason is the auto-sync functionality, which I personally see as a major benefit of Fossil. It is a very big departure from git's model, where everyone works alone in their own repo, so they can rebase to simplify what they are looking at and to get a newer base for their own branch, so that the branch is more up to date and in the same order as what the mainline has been doing since they originally branched off. Yes, they have to resolve the merge conflicts in their branch rather than waiting until they merge back to mainline. But ultimately that gives them a cleaner and more chronological view of the commits in their branch, as if they had started the branch from a later point in time to begin with. There might be times when it's not appropriate to do that, but other times when it is.

But either way, they are in their own isolated repo where they should be free to squash their commits, remove their commits or do anything they want to their commits UNTIL it's time to push back to the repo. Then things need to become more settled.

My golden rule would be, never allow history to be changed that is already in the shared repo. But what someone is working on in their remote repo....change the history all day long and night with reckless abandon however they want.

But therein lies the rub because fossil auto sync, a great feature, makes all commits go back to the shared repo so that everyone sees what everyone else is working on, etc.. and if anyone else happens to be working on the same branch, then rebasing it would be a problem.

The idea that a rebase might break a build in a committed way is not a problem in git because people are working on their disconnected repos and they can just squash and repair, fix the break, make a new commit, squash it back down if they want; it doesn't matter. They can edit their local history as they go just as easily as changing lines in a source file. It's just what they are working on, and it's nobody's business in what order they move semicolons around. It becomes everyone's business once it is pushed back to a shared repo, though.

(70) By Warren Young (wyetr) on 2019-08-26 18:38:19 in reply to 68 [link] [source]

I don't see the value.

If it is true as claimed above that the original commits that go into a Git rebase aren't actually abandoned in the local repo, thus not subject to Git garbage collection, that means the simplification people talk about when using rebase only comes about by choosing not to push some branches to the remote; Git lets you push only the rebased branch.

That doesn't work with Fossil, since it always pushes all commits during sync.

Assuming that won't be changing, the only possible benefit I can see from a Fossil rebase feature is if the Fossil UI /timeline simply doesn't display the original checkins that go into a rebase, thus pretending that they arrived on the destination branch in a single checkin.

How is that different from:

$ fossil up target-branch
$ fossil merge source-branch
$ fossil ci
$ fossil amend --hide base-of-source-branch
$ fossil push

Is it just that this proposed fossil rebase would bundle all of that into a single command?

Incidentally, you can't use the root: syntax in the amend command above for two reasons:

It names the parent checkin of the branch's starting checkin, not that starting checkin itself. If target-branch is trunk and source-branch is as-named, then root:source-branch names a checkin on trunk, with the consequence that if you said --hide root:source-branch above, you'd hide everything on trunk past the branch point! I think we need another special name like initial:branch-name to handle cases like this.
A rebase should start from the first check-in past the last merge-up operation from the destination branch onto the source branch, unless a specific checkin ID is given instead of a branch name.

(71) By Steve Schow (Dewdman42) on 2019-08-26 18:45:28 in reply to 70 [link] [source]

Where it matters the most is in code review prior to merging to the mainline. The branch's chronological series of commits will be more sensible to whoever is reviewing the feature before allowing it to be merged into trunk. If you intend to hide or destroy the original feature branch eventually anyway, then it won't matter a year later when someone will just see the merge-in commit on the trunk that has all the consolidated and squashed changes as one commit there. That commit will look exactly the same regardless of whether the feature branch was developed, reviewed and approved using merge or rebase in the process of getting there.

(77) By Warren Young (wyetr) on 2019-08-26 19:30:15 in reply to 70 [link] [source]

How is that different from...

From later posts in this thread, I can add to my breakdown above to better match my understanding of the Git fans' conception of a proper rebase feature:

$ fossil up source-branch
$ fossil merge target-branch
$ fossil ci
$ fossil up target-branch
$ fossil merge source-branch
$ fossil ci
$ fossil amend --hide base-of-source-branch
$ fossil push

I've added the first three commands to remove changes made on target-branch since the branch point from the "logical set of changes," as Git people put it.

I don't mean to imply that this proposed fossil rebase feature would actually do the first merge command or the ci following it. I'm just showing my understanding of the best way we currently have to get the same effect in Fossil.

Is it correct?

(105) By anonymous on 2019-08-27 20:57:31 in reply to 70 [link] [source]

I think we need another special name like initial:branch-name to handle cases like this.

Alternately, I suggest root+n:branchname to specify any branch commit relative to its root. Though root+1:branchname is probably the most useful.

Also, another special name for "most recent common ancestor". By that, I mean, if you branch off of trunk (or some other branch) and have merged in updates from the parent branch, the trunk/whatever commit that merge was from would be the most recent common ancestor. Maybe mrca:branchname

Then mrca+n:branchname to refer to branch commits relative to that.

For symmetry, maybe also have current-n and current+n

(106) By Warren Young (wyoung) on 2019-08-27 21:01:08 in reply to 105 [link] [source]

Your extensions could be useful, but the basic syntax is already set.

(108) By anonymous on 2019-08-27 21:12:02 in reply to 106 [link] [source]

Even if it's just:

root:
root+1:
mrca:
mrca+1:

that would be a useful, intuitive naming scheme.

Though, seems to me that I'm just proposing an extension to that basic syntax.

(92) By anonymous on 2019-08-26 21:35:13 in reply to 68 [link] [source]

not a problem in git because people are working on their disconnected repos

In Fossil, branches can be made private, achieving the same effect.[1]

However, proper use of development branches can also achieve the same effect:

if anyone else happens to be working on the same branch

The process where I work requires that if 2 or more people are working on the same topic branch, then each should be using their own sub-topic branch.

Merges from a sub branch to its parent branch are treated the same as a merge to trunk. The changes must be reviewed and approved, first. (Yes, this means some code gets reviewed twice, but the 2nd review is in a greater context.)

Any conflicts a team member might encounter would be from attempting to integrate approved code, not because someone else was working on their branch.

[1] Something I have never tried: Can I set up a post-commit hook to simulate auto-push? Push "public" branches upstream, but also push both private and public branches to a "hot spare" repository. If this can be done, that would (should?) give the benefits of both auto-sync and private branches.

(69) By anonymous on 2019-08-26 18:34:33 in reply to 66 [link] [source]

Any hypothetical "fossil rebase" command would

If I were going to make a fossil rebase command, I would include 2 differences:

It would always squash down to a single commit.
It would leave the result in the working copy, requiring the user to do the actual commit. (Presumably after testing.)

I see this as more consistent with Fossil's philosophy. It also means that auto-sync can be left on. (If the user doesn't want the feature branch pushed, they can make it private.)

Fossil rebase would, however, perform the merges commit-by-commit[1] into the working copy, just not do any commits.

I suppose a possible feature enhancement would be for Fossil to trigger a build after each step-wise merge was completed, waiting for the build to complete. If the build failed, it could stop the rebase as if there were any merge conflicts.

Since Fossil would be waiting for the build, it could do a "standard" fork/exec or spawn then wait for the child process to complete.

[1] I am making the assumption that doing the merge in smaller chunks will make it easier to resolve any conflicts that do arise.

(72) By Warren Young (wyetr) on 2019-08-26 18:46:47 in reply to 69 [link] [source]

a possible feature enhancement would be for Fossil to trigger a build after each step-wise merge was completed

Git has a related feature that would be much more useful to have in Fossil than any of this rebase stuff: git bisect run.

It takes a command as arguments and runs that command at each bisect step, automatically assigning good/bad verdicts based on the return status of that command. Thus a command like this:

  $ git bisect run make test

will usually get you to the first bad check-in in a single command. (After git bisect start and marking one known-good and one known-bad check-in, of course.)

You can get more complicated effects by calling a short one-off shell script:

  #!/bin/sh
  make -j11 my-failing-program && ./my-failing-program | grep -q success-case

That can fail at least three different ways, all of which result in a "bad" verdict, allowing the bisect to try the next one without bothering to diagnose which type of failure occurred.
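For anyone who has not used it, here is a self-contained sketch of a whole `git bisect run` session, assuming Git is installed. The eight-commit history, the `status.txt` marker file, and the use of `grep` as the test command are all stand-ins invented for the demo; in a real project the command would be something like `make test`:

```shell
#!/bin/sh
# Sketch: let "git bisect run" find the first bad commit automatically.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.invalid

# Build 8 commits; the "bug" appears in commit 6 and stays.
i=1
while [ "$i" -le 8 ]; do
  if [ "$i" -ge 6 ]; then echo fail > status.txt; else echo pass > status.txt; fi
  echo "$i" > counter.txt
  git add status.txt counter.txt
  git commit -q -m "commit $i"
  i=$((i + 1))
done

first=$(git rev-list --max-parents=0 HEAD)   # the root commit, known good

# Start a session: the current tip is bad, the root commit is good.
git bisect start HEAD "$first" >/dev/null 2>&1

# Exit status 0 from the command means "good"; 1 (from grep here) means "bad".
git bisect run grep -q pass status.txt >/dev/null 2>&1

# When the run finishes, refs/bisect/bad names the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $culprit"
```

Git treats exit status 125 from the command as "cannot test this commit, skip it", which is handy when some revisions do not build at all.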

(73) By Andreas Kupries (aku) on 2019-08-26 18:50:28 in reply to 66 [link] [source]

What happens if there is a merge conflict?

Rebase stops. The user is asked to fix the conflict manually. The user then has to restart rebase using git rebase --continue. Or may choose to abort using git rebase --abort.

Rebase is not a way for the user to get around having to fix merge conflicts by themselves.
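A small sketch of that stop-and-continue cycle, assuming Git is installed; the single file `f.txt` and the hand-written resolution are invented for the demo:

```shell
#!/bin/sh
# Sketch: a rebase that stops on a conflict and is resumed by hand.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk
git config user.name demo
git config user.email demo@example.invalid

echo base > f.txt
git add f.txt
git commit -q -m "base"

# Feature and trunk both change the same line.
git checkout -q -b feature
echo feature > f.txt
git commit -q -am "feature edit"
git checkout -q trunk
echo trunk > f.txt
git commit -q -am "trunk edit"

# The rebase stops with a non-zero exit status when it hits the conflict.
git checkout -q feature
if git rebase trunk >/dev/null 2>&1; then stopped=no; else stopped=yes; fi

# The user resolves the conflict manually and stages the result...
echo "trunk+feature" > f.txt
git add f.txt

# ...then resumes. ("git rebase --abort" here would instead restore the branch.)
GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

result=$(cat f.txt)
parent_subject=$(git log -1 --format=%s HEAD~1)
echo "stopped: $stopped, f.txt: $result, new parent: $parent_subject"
```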

(82) By Richard Hipp (drh) on 2019-08-26 20:01:24 in reply to 73 [link] [source]

Very interesting. Thank you for the information.

Is there also a way (in Git rebase) to do additional checks on each of the rebased check-ins - above and beyond the simple (and unreliable) check that there were no merge conflicts - before committing each new check-in to the blockchain?
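One existing mechanism that seems to fit this question is Git's `--exec` option to rebase, which runs a shell command after each replayed commit and stops the rebase if the command fails. The sketch below assumes Git is installed; the `CHECK_LOG` variable and the trivial `echo` check are placeholders for a real build or test command such as `make test`:

```shell
#!/bin/sh
# Sketch: run a check after every commit that a rebase replays.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk
git config user.name demo
git config user.email demo@example.invalid

CHECK_LOG=$(mktemp)          # hypothetical log, kept outside the work tree
export CHECK_LOG

echo base > trunk.txt
git add trunk.txt
git commit -q -m "base"

# Two feature commits...
git checkout -q -b feature
echo one > feature.txt
git add feature.txt
git commit -q -m "feature 1"
echo two >> feature.txt
git commit -q -am "feature 2"

# ...while trunk moves on.
git checkout -q trunk
echo more >> trunk.txt
git commit -q -am "trunk moves on"

# Replay feature onto trunk, running the check after each replayed commit.
# A non-zero exit from the command would stop the rebase at that commit.
git checkout -q feature
git rebase -q --exec 'echo checked >> "$CHECK_LOG"' trunk >/dev/null 2>&1

checks_run=$(( $(wc -l < "$CHECK_LOG") ))
echo "checks run: $checks_run"
```

The check runs once per replayed commit, so each new check-in gets tested before the next one is created, although everything still happens in the developer's local repository.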

(83) By Steve Schow (Dewdman42) on 2019-08-26 20:06:44 in reply to 82 [link] [source]

I guess one option might be to provide a fully-squashed rebase that doesn't commit it. Do the rebase, squash the commits, make the changes to the working dir without committing yet (like we do for merge), then the user can choose to commit after they test and fix.

(86) By Steve Schow (Dewdman42) on 2019-08-26 20:11:42 in reply to 83 [link] [source]

but actually the more I think about it, I wouldn't like that as it would discourage rebasing often. You'd lose all your feature branch commits every time you rebase, no bueno...

(33.1) By Stéphane Aulery (LkpPo) on 2019-08-23 23:01:32 edited from 33.0 in reply to 21 [link] [source]

It's not only squashing, but sometimes people might do say 4 commits in a row, but then they realize commits 1 and 3 are related to each other and 2 and 4 are related to each other and they'd rather have two commits with the related changes in each one. This is because they were coding away and committing often and they didn't know where it would end up as they executed commit, but now in retrospect they wish they had those two commits for future posterity or for peer review.

Yes, it's common. There is also the two-cent advice that you must commit small and often. Why? Do you know my project? There is also a kind of habit, and a fear of software that crashes: you have to save / commit so as not to lose work in case of problems.

To avoid falling into these problems I usually make a manual copy of the code in a separate folder that serves as my draft. I code, I code. Then WinMerge or Meld is my interface between the repository code and my manual copy. I use this tool to merge the code one way or the other with the granularity that suits me.

In this way I can import the part of the new code that I want into my local branch and set aside the resolution of a conflict, for example. The second beneficial effect is that I am obliged to read the new code to make this merge, so I am always aware, in broad outline, of the changes made.

I can also merge the code into the SCM in logical units.

Somehow I decoupled the working area from the SCM, because that is not its role. Its role is to keep an immutable history of the project and to share it. That history should be recorded at key moments and promote good communication.

(36.1) By Florian Balmer (florian.balmer) on 2019-08-24 12:14:38 edited from 36.0 in reply to 33.1 [link] [source]

I love Fossil, mostly due to its simplicity, and also due to the one repository → one SQLite database file architecture, which looks like an easy and transparent way to backup repositories. (However, some more attention is required to make sure the backups are "consistent" SQLite database files without any outstanding transactions.)

I understand that "Fossil is designed to keep all historical content forever". But the principle of "forgetting" is also very important regarding proper functioning of human brains, and human lives in general.

That's why I believe that sometimes, a "clean, re-staged, 'artificial' timeline with thematically grouped and 'condensed' commits", omitting any "background noise" from hidden commits with experiments, mistakes and intermittent merge points, may be the preferred way to store a project. So I wouldn't use a "rebase" command to merely hide branches or individual commits, without also being able to rearrange and group them.

Somehow I decoupled the working area of the SCM ...

I'm doing the same, sometimes to the extent that I maintain "superordinate repositories", to which single check-ins or series are transferred from "staging repositories" in a "rebasing fashion", with:

fossil diff --from ... --to ... -R path/to/staging-repo.fossil | patch -p0
WinMerge
A set of scripts to transfer selected check-ins (as completely new commits, i.e. allowing changes to target branches, tags, and comments).

I then keep backups of the "staging repositories" for some time, to be able to recover "deleted work" -- but I've never needed to do that, so far.

(38) By Warren Young (wyoung) on 2019-08-24 13:14:23 in reply to 36.1 [link] [source]

But the principle of "forgetting" is also very important regarding proper functioning of human brains, and human lives in general.

Yes, but that doesn't tell us that Fossil should have the same sort of flawed recall that humans do.

Fossil's closest external interface to the way human brains work is the timeline. The defaults on that page roughly approximate the way we think about things. Everything we're working on today is on a single browser screen, for immediate recall. Everything a few days back might require a bit of scrolling, analogous to "let me think a bit." And anything farther back than that requires a good think and maybe a search to pull up, and even then, we might fail to find what we want.

None of this tells us that rebasing is a good idea.

I don't have any of the receipts for the hamburgers I bought in 1993. I couldn't tell you how many I bought, and I wouldn't lay any odds on my ability to even give a complete list of all the places I bought them.

But that doesn't tell me I don't ever want to see the source code to a program I was writing that year, or explore the way it evolved over that year.

A few times in my life, something sufficiently momentous has happened that I've lost all source code from before that time. But if you ask me if I want that code back, the answer is "Yes!"

Other times, I've lost the checkin-by-checkin history for a project, and all I have is a number of release versions. Is that good enough? Often, yes. But I'm not going to do such things on purpose. If I could wave a magic wand and have the checkin-by-checkin history back, I'd wave it.

(50) By Florian Balmer (florian.balmer) on 2019-08-25 12:00:32 in reply to 38 [link] [source]

None of this tells us that rebasing is a good idea.

Yes, I agree. But with permanent "cleanup" of the history ("rebasing"), the efforts of decision making about the relevant points have to be made only once -- and they have to be made again and again when reviewing a "cluttered" history, which seems kind of "wasteful". This probably doesn't matter in case the efforts are small, or rarely necessary -- but I think there's cases where a clean history with less ballast can simplify the management of a project.

The genius dominates the chaos -- log me out ;-)

Fossil's closest external interface to the way human brains work is the timeline.

Off topic! From my amateur reading about the human brain and memory, I think this is different:

It's very difficult for human brains to sort memories chronologically, unless there's concrete reference points (a remembered date, a known time-relation to other memories). So my understanding is that timelines (or, calendars) were created to support human brains, not to duplicate their functionality. During Stone Age, only the skills were important, and not the chronology of acquisition. This is quite different for some birds and rodents hiding nuts for winter, for example, as chronology matters if they want to find nuts, and not young sprouting trees.
Memories include almost no details, but only a few very rough cornerstones. When recalling them, human brains invent a complete and detailed scenery, similar to a movie. It's fantasy for the most part, but it's so "realistic" that we "believe" they are true -- but in reality, they can be quite far from reality. So our brains are far from the accuracy of, for example, the details of the Fossil timeline.

(52) By Eric Junkermann (ericj) on 2019-08-25 14:30:48 in reply to 50 [link] [source]

None of this tells us that rebasing is a good idea.

Yes, I agree. But with permanent "cleanup" of the history ("rebasing"), the efforts of decision making about the relevant points have to be made only once -- and they have to be made again and again when reviewing a "cluttered" history, which seems kind of "wasteful". This probably doesn't matter in case the efforts are small, or rarely necessary -- but I think there's cases where a clean history with less ballast can simplify the management of a project.

I don't believe this. If there is a "cluttered" history then you don't have a branch naming convention, or a branch usage policy. If you have these you can ignore any clutter and just look at the obviously important branches -- you can tell which they are from their names. You don't need to care about the (possibly tangled) clutter except for what it contributed to an important branch at the time of merging in.

Of course if bisecting for a problem lands you on that merge checkin, then you might benefit from looking at the clutter, which you can't do if it has been rebased away. Or when the newly-joined whizz-kid says "Why didn't you do that bit this way?", you can say "We did, 2 years ago, but it didn't work out. There it is." Then he or she can either find a way to make it work and prove that it's worth trying at this point in time, ... or not, in which case they have learnt something. Of course if it's been rebased out of existence, they'll just feel resentful.

(53) By Florian Balmer (florian.balmer) on 2019-08-25 15:06:38 in reply to 52 [link] [source]

I fully agree, here, it's mostly a question of how the feature is used.

By "clutter", I mean things like refining comments, or fixing simple coding errors and documentation spelling mistakes, and similar, i.e. condensing 3 or 4 check-ins with small fixes that form a "logical unit" into one single check-in, as already mentioned in other posts (3-4 check-ins → 1 check-in == 75% reduction of clutter). It's nice to commit often, but I'd love the ability to do such micro-condensation continually during regular work.

Another possible benefit: as soon as upstream branches are merged down to feature branches to keep them up-to-date, it's difficult to view a diff of just the feature branch. For example, Fossil's "openssl-1.1" branch: it's almost impossible to review the final/cumulated changes of just this branch at once, without any "foreign" changes from trunk slipped in. Such things would be possible with rebasing of the branch.

(54) By Eric Junkermann (ericj) on 2019-08-25 16:38:51 in reply to 53 [link] [source]

By "clutter", I mean things like refining comments, or fixing simple coding errors and documentation spelling mistakes, and similar, i.e. condensing 3 or 4 check-ins with small fixes that form a "logical unit" into one single check-in, as already mentioned in other posts (3-4 check-ins → 1 check-in == 75% reduction of clutter). It's nice to commit often, but I'd love the ability to do such micro-condensation continually during regular work.

These should not be commits. Everybody seems to have been seduced into treating the source control system as the primary backup of their work. In my opinion

don't check in untested (coding errors)
don't check in unchecked (spelling mistakes)
use a backup tool for backups (at frequent (FSVO) intervals)

That is, I don't believe in "early and often". On the other hand, consolidating a few checkins on a single branch seems mostly harmless.

Another possible benefit: as soon as upstream branches are merged down to feature branches to keep them up-to-date, it's difficult to view a diff of just the feature branch. For example, Fossil's "openssl-1.1" branch: it's almost impossible to review the final/cumulated changes of just this branch at once, without any "foreign" changes from trunk slipped in. Such things would be possible with rebasing of the branch.

I don't see what rebase should be done to fix that one, but then I don't know much about the details of rebase.

Actually I don't think that a full merge of trunk or a release branch or something into a feature branch is a good idea. Only things that are needed for or will assist in the feature development, or that will create merge conflicts with the feature code, should be cherry-picked.

You might gather that I believe in process and procedure and policies, which are, of course, the project's choice, but rebase is too powerful and dangerous to not be restricted in some way.

(55) By Richard Hipp (drh) on 2019-08-25 19:16:49 in reply to 53 [link] [source]

it's difficult to view a diff of just the feature branch. For example, Fossil's "openssl-1.1" branch: it's almost impossible to review the final/cumulated changes of just this branch at once, without any "foreign" changes from trunk slipped in.

I beg to differ. The isolated changes of openssl-1.1 can be seen here

How to easily get to that link:

View the openssl-1.1 branch timeline: https://fossil-scm.org/fossil/timeline?r=openssl-1.1
Find the last place where trunk was merged into openssl-1.1. In this case it is check-in d0de24fe8679e354
Click on the graph node for that check-in so that it contains a red dot in the center.
Locate the leaf node of the openssl-1.1 branch. In this case 91a0d5a55f3865a8
Click on the graph node for the branch leaf, and you will immediately jump to the diff page linked above that only shows the changes associated with the openssl-1.1 branch, and without any extraneous changes from trunk that were merged in along the way.
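The same comparison can also be written down directly. The sketch below only builds the link from the two check-in hashes given above; it assumes the diff page reached by clicking two graph nodes is Fossil's `/vdiff` page with `from` and `to` parameters, which may not be exactly the link referred to above. The `fossil diff --from ... --to ...` form in the comment is the command-line counterpart and needs an open checkout of the repository:

```shell
#!/bin/sh
# Sketch: build a "whole branch" diff link from the two check-ins named above.
base_url="https://fossil-scm.org/fossil"
merge_point="d0de24fe8679e354"    # last merge of trunk into openssl-1.1
branch_leaf="91a0d5a55f3865a8"    # leaf of the openssl-1.1 branch

# Assumption: /vdiff accepts "from" and "to" check-in names.
diff_url="$base_url/vdiff?from=$merge_point&to=$branch_leaf"
echo "$diff_url"

# From inside a checkout of the repository, the equivalent would be:
#   fossil diff --from "$merge_point" --to "$branch_leaf"
```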

Question:

Though the technique described above is obvious to me, perhaps it is less obvious to those who have not spent the past 12 years writing their own version control system. Would it help if I automated the above by providing some kind of "overall branch diff" links someplace? If so, where could I put such links. To me, the obvious place to put such a link would be in a pop-up menu that appears when you right-click (Ctrl-click on a mac) on the check-in in a timeline. Can you even do right-click pop-up menus in a web browser? Is that considered bad style? Is there a better place to put an "overall branch diff" link?

(56) By kak (kkugler) on 2019-08-25 19:28:50 in reply to 55 [link] [source]

Though the technique described above is obvious to me, perhaps it is less obvious to those who have not spent the past 12 years writing their own version control system.

It was obvious to me only because someone had shown me that very useful trick. I'm sure it is documented somewhere, but I'd wager it is not a well known feature of Fossil.

(57) By anonymous on 2019-08-25 21:18:28 in reply to 55 [link] [source]

If so, where could I put such links.

Currently, in the timeline, clicking on a tag gives 3 related commits.

One suggestion would be to have "Diff branch" and "Show branch history" links on that view.

A right-click menu in a web page would involve either adding to or overriding the browser's context menu. I don't remember seeing this "in the wild", so maybe it's not possible. Even if it is, I expect few people would think to try it.

(60) By Stephan Beal (stephan) on 2019-08-26 06:44:44 in reply to 55 [link] [source]

A problem with context/right-click menus is getting them to work on mobile platforms. Maybe that's a negligible audience for Fossil, but 90% or more of my online time is via a tablet (including writing this post).

(61) By stevel on 2019-08-26 06:47:36 in reply to 60 [link] [source]

I usually add a long press binding for context/right-click menus on touch devices.

(65) By Florian Balmer (florian.balmer) on 2019-08-26 17:39:22 in reply to 55 [link] [source]

Find the last place where trunk was merged ... Click on the graph node for that check-in ...

A nice trick! I should be familiar with this technique, and I don't know why I didn't remember it. Maybe I was scared by this branch with so many junctions ...

The branch was merge-closed to trunk, and later reopened from trunk. Maybe my idea was more that I could "abuse" the rebase feature to move both sub-branches to "current", and then view the total diff of the two parts together, as a whole. Then "fossil undo/revert" the rebase -- the same way I can do a merge, check the diffs, and get rid of it with "fossil undo/revert" without committing anything, if I'm not happy with it.

Yet, such strange ideas might even scare git people ... ;-)

(37) By Warren Young (wyoung) on 2019-08-24 12:57:49 in reply to 33.1 [link] [source]

it is necessary to commit often and small. Why? Do you know my project? There is also a kind of habit and fear for software that crash: you have to save / commit to not lose work in case of problems.

Committing early and often is not solely about off-machine backup. Personally, that itch only starts getting to me as the end of the working day comes up. I hate leaving work uncommitted until the next day, sitting there on my machine only.

But within a working day, committing early and often is a practice that supports clear thinking. If I can't make, oh, 6-12 commits during one of those satisfying heads-down coding-only days, it means I'm probably writing code that tries to do too much in a single step. If it takes me all day (or multiple days!) to work my way up to a single commit, I'm almost certainly not committing just one function, or just one class, or whatever. I'm probably committing a hairball without proper layering and separation of concerns.

In other words, if I've been working on a single thing for hours and still haven't found a good commit point, it's a sign that the design is flawed or my thinking about the problem is unclear. It tells me I need to back off and look at what I've got and try to break it down into individual pieces I can commit separately.

Check-ins are bursty. There might be 3 in close succession followed by the next a few hours later, followed by another burst half an hour after that.

Committing early and often also aids cherrypicking between branches. A given feature might require 3 check-ins on the current branch, the first of which fixes a bug discovered while trying to get the new feature to work. That bug fix should be backported to the stable branch, rather than wait for an end user to find the same problem and have someone fix it all over again on the stable branch. Checking the bug fix in separately makes a cherrypick onto the stable branch a quick operation. If instead I commit the whole feature in a single check-in, I have to do a manual cherrypick instead of letting Fossil do it for me.

Small commits also make bisects easier. Each step has less code to inspect when you eventually find the check-in that caused the problem that led you to start bisecting.
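
The bisect point can be illustrated with a toy model. This is only a sketch under stated assumptions: a history is a list of lines-changed counts per check-in, the first entry is a known-good baseline, and the bug lands at a made-up cumulative line number (`bug_line`). None of this mirrors how Fossil implements bisect.

```python
from itertools import accumulate


def bisect_first_bad(is_bad, count):
    """Binary search for the first bad index; index 0 is good, the last is bad."""
    lo, hi = 0, count - 1
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi, tests


def lines_to_inspect(sizes, bug_line):
    """Return (lines in the offending check-in, number of test runs needed)."""
    cumulative = list(accumulate(sizes))
    first_bad, tests = bisect_first_bad(lambda i: cumulative[i] >= bug_line, len(sizes))
    return sizes[first_bad], tests


coarse = [0, 400, 400, 400, 400]   # four big check-ins after the baseline
fine = [0] + [50] * 32             # the same 1600 lines in 32 small check-ins

print(lines_to_inspect(coarse, bug_line=1000))
print(lines_to_inspect(fine, bug_line=1000))
```

In this model the finer history costs a few more test runs, but the check-in that bisect lands on is an eighth of the size.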

(22) By Steve Schow (Dewdman42) on 2019-08-23 19:15:27 in reply to 19.1 [link] [source]

well actually now that I think about it...about the hiding thing.... maybe not such a terrible idea...guess what...that is what git does too when you rebase...

the original commits are there, but they end up not linked to any branches anywhere so you don't see them on the timeline. You could find them by their hash, since they are still in there, but no branches are using them anymore

(24) By mattwell on 2019-08-23 19:18:43 in reply to 19.1 [link] [source]

What if "fossil rebase" worked like "git rebase" in that it would operate on any commits in the repository, except that instead of deleting the commits you rebased from, it merely marks them as "hidden"? Then other people who are working off of changes you rebase can continue to do so without harm. It is just that the rebased check-ins no longer appear in the timeline.

This model would be useful except I suggest leaving it to the user to manually hide the rebased commits.

Let's take "rebase" to mean something like "rationalize a change history (and potentially merge in another branch) for readability". This is a dang useful idea when there is collaboration going on. Fossil can easily have its cake and eat it too. The goal does not require hiding the original data.

(26.1) By Steve Schow (Dewdman42) on 2019-08-23 19:24:40 edited from 26.0 in reply to 24 [link] [source]

If I understand correctly, in git, when you rebase, say, a feature branch from the trunk, what rebase actually does is start from the parent node of your feature branch and mainline...where you branched from...then it reapplies NEW commits into your branch that are copied from the mainline, followed by all of your new feature commits (as merges)...so that basically you end up with your feature branch looking like you branched off from a later point in time and added your changes...thus preserving a more linear set of changes.

It's very useful for merging the mainline into a feature branch before finalizing the feature branch to merge back to mainline. There are probably use cases for rebasing the other direction, but I personally found them confusing to read about.

In doing that, the original commits you had made on your feature branch become orphan commits. They are still in there, but not attached to any branch, so basically hidden.

While doing the above, it's possible to squash your commits.

(28) By Steve Schow (Dewdman42) on 2019-08-23 19:30:47 in reply to 26.1 [link] [source]

But you can see from the above, if you rebase, not just to squash commits but to literally rewrite the history and the branch start point...then push it back and anyone else is working off that feature branch...then merge conflicts would be scary...

thus the golden rule....

(39.3) By Steve Schow (Dewdman42) on 2019-08-25 02:45:24 edited from 39.2 in reply to 19.1 [link] [source]

(Edit: I'm also skeptical of the "hidden" flag, fwiw. If it were only up to me, I wouldn't have a "hidden" flag. :-))

this thread got a little scattered, but something I pointed out in another part of it is that git doesn't actually hide any commits during a rebase squash; what it does is orphan existing commits and create new ones to replace them. The previous commits are still in the DB with a hash, but essentially get unlinked from all branches and no longer impact the HEAD.

so rebase kind of does this


Mainline and Feature Branch before Rebase

---------C1-----C2-----C3------------
          \
           \
            +------C4-----C5----


After Rebase into feature branch

---------C1-----C2------C3----------------------C6----
           \             \                      /
            \             \                    /
             +-C4--C5      +------C4'-----C5'-+

So basically the original commits are hidden, they are there, but they aren't on any named timelines anymore.

(Actually in the diagram above, it's debatable whether C1 is even linked to C4 in any way. Git doesn't easily link downwards anyway, only upwards. But I think in theory it would be possible to checkout C5 and keep working from there even though it's not on any named branches including trunk. Just a guess).

New commits are created that are re-based from a new place on the mainline timeline. Completely separate from the squashing, this results in a more linear timeline. The feature branch merges in the changes for C4' and C5'....and usually C5' would then be MERGED back to mainline without much trouble.

What git also does is allow you during rebase to change the way C4 and C5 are re-created as new commits. They can be combined (squashed), rearranged, grouped into meaningful commits, as you wish. You can also delete commits during this interactive rebase if you desire, for better or worse.
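
The "new commits replace old ones" idea can be sketched with a toy content-addressed store. This is a simplification, not Git's real object format: here a commit ID is just the SHA-1 of the parent ID plus a change label, and the function names (`commit`, `history`, `rebase`) are made up for the example. It is enough to show why replaying C4 and C5 on a new base yields different IDs while the originals stay in the store.

```python
import hashlib

store = {}  # commit id -> (parent id, change label)


def commit(parent, change):
    """Create a commit whose ID depends on its parent and its change."""
    cid = hashlib.sha1(f"{parent}|{change}".encode()).hexdigest()[:8]
    store[cid] = (parent, change)
    return cid


def history(tip):
    """Walk parent links from a tip back to the root, oldest first."""
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = store[tip][0]
    return chain[::-1]


def rebase(branch_tip, fork_point, new_base):
    """Replay the changes after fork_point on top of new_base."""
    changes = []
    cur = branch_tip
    while cur != fork_point:
        parent, change = store[cur]
        changes.append(change)
        cur = parent
    tip = new_base
    for change in reversed(changes):
        tip = commit(tip, change)  # same change, new parent, so a new ID
    return tip


c1 = commit(None, "C1")
c2 = commit(c1, "C2")
c3 = commit(c2, "C3")
c4 = commit(c1, "C4")   # feature branch forks from C1
c5 = commit(c4, "C5")

new_tip = rebase(c5, fork_point=c1, new_base=c3)  # produces C4' and C5'
print(history(new_tip))
print(c4 in store, c4 in history(new_tip))
```

The old C4 and C5 are still in `store`, but nothing reachable from the rebased tip points at them, which matches the "orphaned" description above.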

I am not sure how any of the above could possibly be worked into fossil, maybe none of it. Maybe some of it.

Just wanted to clarify what is going on with git rebasing. Rebasing is different than merging in that it results in somewhat cleaner and more linear commits than might happen if you only merge things, but ultimately the end goal of isolating a branch, working on a feature and merging the work back to mainline can be done either way. When handled properly, rebase can provide a cleaner history to look at later, or for peer review. However, it can also be handled improperly; one advantage of fossil is that it doesn't provide any rebase, so people are less likely to get into trouble by misusing it.

(40) By Warren Young (wyoung) on 2019-08-24 21:14:04 in reply to 39.0 [link] [source]

Isn't that a distinction without a difference after git gc?

(41.1) By Steve Schow (Dewdman42) on 2019-08-24 21:23:25 edited from 41.0 in reply to 40 [link] [source]

I do not think any repo cleanup would do what rebase does. In the graphs I showed above, C4' and C5' are actually DIFFERENT commits than C4 and C5. They may or may not have the same comments, but the actual diffs are different. They are different deltas than C4 and C5 because they are coming from a different "base".

If you interactively squash or re-group the commits, then the results in particular will be very different than if you had merged in and out. If you squashed to only one commit, there would not be any C5', there would only be C4' which would have all the changes from both C4 and C5, but originating from the new base.

Some would argue that these C4' and C5' changes are perhaps more sensible to someone looking at it for peer review or a year later. Generally people are going to rebase from the mainline into the feature branch right before they are going to merge their branch back to mainline. This results in their branch having the same timeline of changes that the mainline will have and a much more simple merge back to mainline.

(42) By Warren Young (wyoung) on 2019-08-24 21:25:23 in reply to 41.1 [link] [source]

I think you're missing my point. During a garbage collection pass, aren't those abandoned commits going to be removed, so that your point about nothing actually being "hidden" is a pointless quibble? If I'm right, the old material isn't just hidden, it's gone.

(43) By Steve Schow (Dewdman42) on 2019-08-24 21:33:25 in reply to 42 [link] [source]

oic. No I don't think so from what I have read, but I could be wrong I'm not a git guru. They are hashed commits and they are there.

The man page for git gc says objects that were staged with add and then unstaged are cleaned up, and some things like that, but based on what I read, I got the feeling that the commits are always there even if they end up orphaned. But who knows, maybe it does...
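
The question of whether orphaned commits survive a cleanup comes down to reachability. As far as I know, git gc keeps anything reachable from a branch, a tag, or a reflog entry, and eventually prunes the rest once the reflog entries expire. A toy mark-and-sweep sketch of that idea (the data structures and function names are made up and are not Git's):

```python
def reachable(parents, roots):
    """Return every commit reachable from the given roots via parent links."""
    seen = set()
    stack = list(roots)
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        stack.extend(parents[cid])
    return seen


def sweep(parents, refs, reflog=()):
    """Keep only commits reachable from branch refs or reflog entries."""
    keep = reachable(parents, list(refs.values()) + list(reflog))
    return {cid: ps for cid, ps in parents.items() if cid in keep}


# The rebase diagram above: C4/C5 are the originals, C4'/C5' the replacements.
parents = {
    "C1": [], "C2": ["C1"], "C3": ["C2"],
    "C4": ["C1"], "C5": ["C4"],
    "C4'": ["C3"], "C5'": ["C4'"],
}
refs = {"trunk": "C3", "feature": "C5'"}

with_reflog = sweep(parents, refs, reflog=["C5"])  # reflog still remembers C5
without_reflog = sweep(parents, refs)              # reflog entry has expired
print(sorted(with_reflog))
print(sorted(without_reflog))
```

In this model C4 and C5 survive only as long as something still points at C5, so "hidden" and "gone" differ mainly by how long that last pointer lasts.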

(44) By Steve Schow (Dewdman42) on 2019-08-24 21:35:27 in reply to 42 [link] [source]

and actually the more I think about it, I should have a branch line showing from C1 to C4... It's like a dead end branch. I think in theory you could checkout C5 and work from there as a new branch, but don't quote me on that.

(45.1) By Richard Hipp (drh) on 2019-08-24 23:02:02 edited from 45.0 in reply to 39.2 [link] [source]

Questions:

Do I understand correctly that rebase creates new check-ins that have potentially never before existed as files on disk, and hence check-ins which are guaranteed to be untested?
What does rebase do if there is a merge conflict?
What are the timestamps of C4' and C5'? Are they the same as the timestamps on C4 and C5? And if so, does that not mean that C4' has a parent check-in that is younger than itself? (We call such situations "time-warps" in Fossil.) Or do C4' and C5' get new timestamps so that all check-ins on a line of development continue to occur in chronological order? If C4' and C5' do get new timestamps, how are those timestamps computed?
Is there any indication anywhere in the C4' and C5' check-ins that those check-ins were created via rebase? Is there a reference from C4' back to C4?

(46.2) By Steve Schow (Dewdman42) on 2019-08-25 02:56:28 edited from 46.1 in reply to 45.1 [link] [source]

I'm the wrong person to ask about some of those details, unfortunately. I am only echoing what I have read in a couple git books and am by no means a git expert...so to get to the bottom of those questions you'd have to consult with others..

But I don't think new files would exist on disk.. when you do a rebase, it's just changing the "base", or the starting point..and somehow it works out all the diffs so that the new commits end up with the same thing on disk as you had before...to compile and work with normally. The difference is only what the actual diffs look like.. The end result will be the same as a merge. The differences to get there will be different.

And yes..when you merge or rebase from mainline into a feature branch, there are potentially going to be merge conflicts. Either way there can be some. The difference is that with merge, you are starting from an older "base" and theoretically then there could be more merge conflicts and perhaps less with rebase, but there is still the potential for merge conflicts that have to be resolved either way. There is no rule that says one way would have more or fewer merge conflicts than the other, that will always depend on the changes really. Here is the rebase example again for comparison:

---------C1-----C2------C3----------------------C6----
           \             \                      /
            \             \                    /
             +-C4--C5      +------C4'-----C5'-+

No I don't think there are any references from the new commits back to the old ones. The old ones are orphaned really. The new ones take their place. I have no idea about timestamps and I am expressing this a certain way to make a point about what a rebase is conceptually..the actual implementation details in git you'll have to ask someone else.

But here is a MERGE example for comparison

------C1---C2--------C3---------C7-------
        \              \        /
         \              \      /
          +---C4----C5---C6---+

Note the difference. No commits are replaced by new commits. But we have more commits with merge activity in them. The merge at C6 is a little more complicated than the rebase example really because you have changes on mainline and changes on the feature branch and they have to be resolved with possible conflicts. In the rebase example, you simply move the base forward and then add the new changes in the feature branch to the end of that...which still involves some possible conflicts but is perhaps easier, and certainly you end up with a more linear and chronological sequence of changes compared to the Merge approach.

In both cases, you would expect the feature branch developer to compile and test their feature branch before finally merging back to C7 in the MERGE example...or if the REBASE had a C6 merge back it would be the same thing....developer tests the work before merging it back again.

In either case you are merging the two branches and dealing with the conflicts. The difference is in what the history looks like to get there.

(47.1) By Richard Hipp (drh) on 2019-08-25 02:54:30 edited from 47.0 in reply to 46.1 [link] [source]

But I don't think new files would exist on disk.. when you do a rebase, it's just changing the "base", or the starting point..and somehow it works out all the diffs so that the new commits end up with the same thing on disk as you had before...to compile and work with normally. The difference is only what the actual diffs look like.. The end result will be the same as a merge. The differences to get there will be different.

And the result of a rebase might not work. It might contain conflicts. Or, it might rebase cleanly, but still not function correctly because the changes on one branch are incompatible with changes on the other. Either way, you end up with an untested check-in that does not work.

When you merge (at least in Fossil) it does not produce a new check-in. The merged code is only in your local check-out. So you have an opportunity to test it before doing a commit and adding it to the blockchain. A disciplined developer can ensure that no untested check-ins ever enter the blockchain.

That does not seem possible with rebase. (At least, I don't see how it is possible.) With rebase, it seems you are always adding at least one new untested check-in to the blockchain. Am I wrong?

Note that it is not necessary to merge trunk into the feature branch before merging the feature branch into the trunk. You could do this:

------C1---C2--------C3---------C7-------  (trunk)
        \                       /
         \                     /
          +---C4----C5--------+  (feature-branch)

When your check-out is sitting on C3 and you type "fossil merge C5", the merge occurs in your local check-out only. So you get the chance to fix merge conflicts and test the code before it enters the blockchain. If the result of merge does not work out, you can always "fossil revert" and start over. C7 only enters trunk after it has been tested by the developer and found to work correctly.

I don't understand how the graph in this post is any more complex or difficult to understand than any of the rebase graphs shown earlier. Can somebody please explain?

(48.2) By Steve Schow (Dewdman42) on 2019-08-25 03:22:54 edited from 48.1 in reply to 47.1 [link] [source]

well you should check with real git experts before assuming some of what you're saying, as I said, I'm not the right person to ask. I have not worked with git extensively, I have chosen fossil for simplicity instead of git..in the past I used clearcase where rebasing was a fundamental part of our large dev team workflow. So to how exactly git constructs those commits or how it orphans, etc.. check with git gurus, not me...

But it's an interesting point you make that with rebase you may end up with new commits that have not been tested yet....as opposed to merge that merges into the working dir where you can test it before you commit it as a merge.

I realize it's not necessary to merge out before merging in...but that would not happen for me ever..I treat the mainline as sacred and where I have worked in the past that would be the case with severe implications for breaking the mainline nightly build. I realize in smaller environments, people can be looser about things like that.. either way it all has to be merged...and it's still a question of how the history will look. Even in the example you gave, you have to merge two branches at C7 on trunk.

Actually though with both git and fossil...the feature developer can always just merge back to trunk WITHOUT doing a push...and test the mainline branch that way before finally pushing it..which is really the final merge back to the shared repo that counts. So yea.

I am not meaning to join the fight here of rebase vs merge, which I consider an absurd argument. There is a time and place for both operations... It's just that git allows it and people sometimes get into trouble for using it the wrong way at the wrong time. Fossil keeps things more simple...which is both limiting and protective at the same time. That actually has worked just fine for me personally, but I do wish I could squash my commits when merging back to trunk.

(49.2) By Steve Schow (Dewdman42) on 2019-08-25 03:15:25 edited from 49.1 in reply to 47.1 [link] [source]

I guess with git rebase... when you get those new commits that are merged...and you test and find a problem you just fix it and make a new commit and you can squash all the commits anyway if you want, so in terms of git...it's a moot point that there are commits which are broken. In the Git philosophy..you should commit often; having a functioning product is not a requirement to justify a commit.

(51) By anonymous on 2019-08-25 13:22:05 in reply to 45.1 [link] [source]

Do I understand correctly that rebase creates new check-ins that have potentially never before existed as files on disk, and hence check-ins which are guaranteed to be untested?

Yes. rebase takes the diffs introduced by given commits and applies them in their order on top of other commits (the leaf of the target branch). The result of that becomes the leaf of the source branch, and the old commits are discarded. (They may be present in the content-addressable storage and found by git reflog et cetera, but they are not part of the official view of the branch anymore.)

What does rebase do if there is a merge conflict?

Stops and asks for the user to fix the conflict. The rebase process may then resume (git rebase --continue; the fixes become part of the C4' commit) or not (git rebase --abort; in that case, all changes are rolled back and the blockchain remains intact).

What are the timestamps of C4' and C5'? Are they the same as the timestamps on C4 and C5?

I think they are taken from the time when rebase is being made. Might also be subject to GIT_COMMITTER_DATE environment variable.

Is there any indication anywhere in the C4' and C5' check-ins that those check-ins were created via rebase? Is there a reference from C4' back to C4?

No. C4 is then ignored (since it is missing from the new state of the branch).

(17) By Richard Hipp (drh) on 2019-08-23 19:00:06 in reply to 15 [link] [source]

That's one reason why I personally would never use the hypothetical "fossil rebase" command - I like to commit just as a backup. My commits to SQLite and Fossil are backed up to multiple servers in different cities thousands of miles apart, within minutes. It is also a convenient means of transferring work from one machine to another.

But if I understand Dewdman42's argument, the same situation already applies with Git, since you are not supposed to rebase things you have pushed to other servers. Correct me if I have misunderstood.

(20.1) By Steve Schow (Dewdman42) on 2019-08-23 19:18:01 edited from 20.0 in reply to 17 [link] [source]

see this:

https://blog.axosoft.com/golden-rule-of-rebasing-in-git/

It's also found in the Pro-Git book and many other places..

More specifically, don't rebase a branch if you have already pushed AND IF that feature branch is also being shared and worked on by someone else.

Basically rebase will cause the other person to be faced with a crazy merge conflict and probably they will come screaming to you asking for help.

(14) By Warren Young (wyoung) on 2019-08-23 18:37:57 in reply to 10 [link] [source]

Fossil does have rebase - we just call it "merge".

Maybe we should just move that point from the fossil-v-git doc to a "Myths about Fossil" doc or similar, where we cover such non-issues.

I'm thinking we should also get rid of the multiple check-outs per repo point. Fossil's method is far cleaner than Git's, but if people are going to take our point on this as false because technically Git can do multiple check-outs per repo, then I'd rather not argue the point at all.

(23) By Steve Schow (Dewdman42) on 2019-08-23 19:18:32 in reply to 14 [link] [source]

merge and rebase are similar but different though.

(59) By anonymous on 2019-08-26 05:25:37 in reply to 23 [link] [source]

Reading https://git-scm.com/book/en/v1/Git-Branching-Rebasing, I would describe it as doing:

1. Merging without recording the "merge parent"
2. Automatic sequential merging, creating a new commit on the target branch for each commit on the source branch.

There was no mention of commit squashing, though I can see how it could be included in the process.

Obviously, #2 includes automatically doing commits. I don't like this, but, at least in Fossil, those commits can be moved to a new branch and marked private.

Rebase could be done as a wrapper script. If I were going to do it, I'd have the script create a private branch off of the target branch and check in the new commits to that branch. After testing, do a pull. If the target branch hasn't been updated, cancel the branch and private tags on the new private branch to make the new commits part of the target branch, then push.

(29) By Steve Schow (Dewdman42) on 2019-08-23 19:35:58 in reply to 14 [link] [source]

The main difference I see is that in fossil you can have multiple working dirs where you have opened a repo and are working in a separate dir tree with those separately opened checkouts.

In git you have only one working dir tree, which is where the actual repo is, but you can use the stash command to rebuild the checkout you want to work on.

So the subtle difference is that in fossil you can work on two different checkouts at the same time without having to stash anything.

But either system can technically work on one checkout for a while, or switch to the other checkout, etc.

It's a minor difference really, but on the other hand I see an advantage to being able to have a completely separate working dir tree for each checkout in case there are other non-versioned files in there that need to be kept separated too.

(35) By Stéphane Aulery (LkpPo) on 2019-08-23 23:03:06 in reply to 29 [link] [source]

So the subtle difference is that in fossil you can work on two different checkouts at the same time without having to stash anything.

It's an invaluable feature.

(58) By Steve Schow (Dewdman42) on 2019-08-26 00:29:07 in reply to 35 [link] [source]

By the way, newer versions of git do include a working tree feature which I guess can achieve the same thing... a separate working dir on a different checkout.

(25) By Warren Young (wyoung) on 2019-08-23 19:20:02 in reply to 5 [link] [source]

What is surprising is the strong feeling people have about rebase.

I think we're only hearing from a few who think we've injured their sacred ox. How many people read that with silent agreement, and how many more don't much care about the point at all? And even among those we've swayed with this article, how many would we expect to write in with responses like, "I've seen the light!"

People change philosophies slowly, but come to anger over perceived attacks on their current philosophy quickly.

...which is why I'm not going to respond on the other site: that thread started and died over the course of hours, ending about a day ago. It's all over now, there.

Anyone have any insight on that?

Sure, Gerald Weinberg, 48 years ago, in The Psychology of Computer Programming.

Chapter 4, section "Error and Ego": "When the computer revealed a bug in his program, the programmer would have to reason something like this: 'This program is defective. This program is part of me, an extension of myself, even carrying my name. I am defective.'" Now along comes Fossil and publishes those defects around the globe a few seconds after you press Enter? The unmitigated, unspeakable horror!

In the next section, Egoless Programming, Weinberg presents this anecdote: "As Marilyn worked and worked over the code — as she found one error after another — [Bill] became more and more amused, rather than more and more defensive as he might have done had he been trained as so many of our programmers are. Finally, he emerged from their conference announcing to the world the startling fact that Marilyn had been able to find seventeen bugs in only thirteen statements. He insisted on showing everyone who would listen how this had been possible."

None of this is to say that we should strive for error and thrive in it. But neither should we fool ourselves that we are perfect and never perpetrate bugs. I broke Fossil trunk just a few days ago, and I'm still annoyed about it. But I'll learn, and I'll do better. In fact, I'll do better because I fell on my face in public. I might not have found the bug if I'd kept my changes private until I'd tested them on multiple platforms, rather than publish what I had and have drh find the problem.

Punched card programming was still a thing when this book was first published, but since we haven't invented the Mark II Human since then, the book is still quite relevant to today's software development world.

(27) By Warren Young (wyoung) on 2019-08-23 19:25:11 in reply to 25 [link] [source]

I might not have found the bug...

I meant "I might not have learned the lesson as well..."

Yet another error perpetrated in public. How will my ego survive?

(88) By Eric Junkermann (ericj) on 2019-08-26 20:24:48 in reply to 25 [link] [source]

I do not have a sacred ox, I have my professional opinion.

(74) By Roy Keene (rkeene) on 2019-08-26 18:50:47 in reply to 1 [link] [source]

I have not been able to read through this entire chain, but having managed large projects using Git/GitHub.com as well as using Fossil for large projects I have a lot of notions of the subtlety that may not be obvious at first. Some of this is probably already mentioned in this thread.

Rebase versus Merge Commits: The value of rebasing versus merge commits is that when you are reviewing a Pull Request (PR), you look at a cumulative diff of that branch. If you use merge commits you'll have those merged changes appear as part of your Logical Set of Changes (LSOC), which is semantically incorrect. Rebase is used instead so that your PR only has commits related to your LSOC. Another strategy would be to provide a diff mode that shows what would change if you merged this branch into trunk (fossil does not provide this), but this only works if the branch is currently mergeable. Fossil has no solution for rebasing.
A lot of talk about implementing rebasing ignores private branches, which could be used to provide a rebase-able branch without too much effect on the rest of the system.
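
The concern about merged changes showing up in a branch's cumulative diff can be illustrated with a toy model. This is only a sketch: a snapshot is a dict of file name to version number, the file names are invented, and real tools pick the comparison base in their own ways. What it shows is that the same branch tip yields a different "cumulative diff" depending on which base it is compared against.

```python
def cumulative_diff(base, tip):
    """Return the files whose content differs between two snapshots."""
    names = base.keys() | tip.keys()
    return sorted(name for name in names if base.get(name) != tip.get(name))


fork_point = {"core.c": 1, "ui.c": 1}   # where the feature branch started
trunk_now = {"core.c": 2, "ui.c": 1}    # trunk later fixed a bug in core.c
branch_tip = {"core.c": 2, "ui.c": 2}   # feature edits ui.c and has merged trunk

# Compared against the fork point, the trunk bug fix looks like branch work.
print(cumulative_diff(fork_point, branch_tip))

# Compared against the trunk check-in that was merged in, only ui.c remains.
print(cumulative_diff(trunk_now, branch_tip))
```

A rebased branch would have the same tip content in this model, with `trunk_now` as its fork point, so its cumulative diff is the second one by construction.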

(75) By Warren Young (wyoung) on 2019-08-26 19:14:04 in reply to 74 [link] [source]

If you use merge commits you'll have those merged changes appear as part of your Logical Set of Changes (LSOC), which is semantically incorrect.

As someone who only speaks pidgin Git, that reads like a non sequitur. It assumes a level of knowledge about Git thinking and the way this LSOC term of art is used that not all of us participating in this thread may have.

After a bit of web searching, I think I can restate your point thus: If you have multiple people working on feature branches and they just merge their changes into the parent branch (e.g. trunk) then branches that span that point (i.e. fork earlier and merge later) will include any intervening merges as part of their total diff.

If that is a correct restatement of your point, then I think it's only true if you don't fossil merge trunk or similar onto the feature branch before merging down to the parent branch.

So, can we say that part of the value of rebase is that it avoids the need for that explicit "merge up from parent branch before merging down into it" step?

a diff mode that shows what would change if you merged this branch into trunk (fossil does not provide this)

I assume you're proposing something more than fossil diff --from trunk?

Could we state it in prose as "diff from trunk, but ignore intervening merges and rebases; act as if they're merged into the current branch first"?

(79) By Roy Keene (rkeene) on 2019-08-26 19:35:08 in reply to 75 [link] [source]

If at any point while working on a branch you do "fossil merge <parent>", the diff from the point where you forked the branch now includes changes that are unrelated to the work you are doing on that branch, and that would not, at that point, show up in a diff against the parent branch (modulo cherry-picking).

Let's take an example to make this more clear.

 [TIME ] [Commit] [Branch] Description
 [00:00] [000100] [trunk ] Started working on project
 [00:01] [000200] [trunk ] More work
 [00:02] [000300] [feat1 ] Fork from trunk
 [00:03] [000310] [feat1 ] Add some stuff
 [00:04] [000400] [trunk ] Fix some bug
 [00:05] [000500] [trunk ] Fix some more bugs
 [00:06] [000320] [feat1 ] Add more of this feature
 [00:07] [000600] [trunk ] Even more bug fixes
 [00:08] [000330] [feat1 ] Finish the feature up

If you NEVER merge "trunk" into "feat1" until you are ready to merge "feat1" into "trunk" there is no problem. However, if commit "000400" fixes a bug that you need fixed in "feat1" to continue its development, you must either rebase "feat1" or merge "trunk" into it.

Let's compare those two strategies:

Strategy merge "trunk" into "feat1"

 [TIME ] [Commit] [Branch] Description
 [00:00] [000100] [trunk ] Started working on project
 [00:01] [000200] [trunk ] More work
 [00:02] [000300] [feat1 ] Fork from trunk
 [00:03] [000310] [feat1 ] Add some stuff
 [00:04] [000400] [trunk ] Fix some bug
 [00:04] [000311] [feat1 ] *** Merge "trunk" into "feat1"
 [00:05] [000500] [trunk ] Fix some more bugs
 [00:06] [000320] [feat1 ] Add more of this feature
 [00:07] [000600] [trunk ] Even more bug fixes
 [00:08] [000330] [feat1 ] Finish the feature up

Strategy Rebase:

 [TIME ] [Commit] [Branch] Description
 [00:00] [000100] [trunk ] Started working on project
 [00:01] [000200] [trunk ] More work
 [00:04] [000400] [trunk ] Fix some bug
 [00:04] [000401] [feat1 ] Fork from trunk
 [00:04] [000411] [feat1 ] Add some stuff
 [00:05] [000500] [trunk ] Fix some more bugs
 [00:06] [000421] [feat1 ] Add more of this feature
 [00:07] [000600] [trunk ] Even more bug fixes
 [00:08] [000431] [feat1 ] Finish the feature up

In the case where we Merge in Trunk we can no longer get a diff of the work done on that branch exclusively. If we diff against the fork point we get the commit [000400] as [000311] in our diff, even though that change is already present on the parent. If we diff against the state of trunk at merge time there are a whole bunch of commits to trunk that are not present on "feat1" branch that show up.

In the case where we Rebase, the branch contains only the commits related to that changeset, so the diff from the fork point gives you just the set of related changes.

It's important to note that you cannot "simulate" the output you want by excluding commits from the diff, since the results may be intertwined. This is what I think you are suggesting. You cannot ignore merge commits when calculating the diff since without them the code may not present a set of diffs that can be merged.

You can compute what the diff would look like to merge into trunk (this is NOT fossil diff --from trunk, since that would again include changes made to trunk that are not present in the branch in question) but ONLY IF the branch is currently mergeable, whereas rebasing always works (because a human must resolve merge conflicts before the rebase atomically comes into being).

(80.1) By Richard Hipp (drh) on 2019-08-26 20:05:36 edited from 80.0 in reply to 79 [link] [source]

we can no longer get a diff of the work done on that branch exclusively

I don't think that is correct. If you diff from 000400 to 000330, you will see only the changes that have been made to the feature branch, and none of the changes that were applied to trunk.

In other words, a diff from 000400 to 000330 on your merge example will be identical to a diff from 000400 to 000431 on the rebase example.

You can easily verify that the previous sentence is true by observing that check-in 000400 is the same in both examples and that 000330 and 000431 are byte-for-byte identical as well, assuming that the rebase functioned correctly. And the diff algorithm cares not how the change came about - it only looks at the end products. The end results are the same, and hence the diff will be the same.
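A toy Python sketch of that argument, assuming each check-in can be modeled as the set of changes it contains and that nothing conflicts. The labels come from the example tables above; the function names are made up for illustration, and this is not how Fossil stores or diffs check-ins.

```python
# Toy model: the content of a check-in is the set of changes it contains.
TRUNK_AT_000400 = {"started project", "more work", "fix some bug"}
FEATURE_WORK = ["add some stuff", "add more of this feature", "finish the feature up"]


def merge_strategy():
    """Tip of feat1 (000330) when trunk is merged into the branch at 000311."""
    state = {"started project", "more work"}  # 000300: fork from trunk at 000200
    state |= {FEATURE_WORK[0]}                # 000310
    state |= TRUNK_AT_000400                  # 000311: merge trunk into feat1
    state |= set(FEATURE_WORK[1:])            # 000320 and 000330
    return state


def rebase_strategy():
    """Tip of feat1 (000431) when the branch is rebased onto 000400."""
    state = set(TRUNK_AT_000400)              # 000401: fork point moved to 000400
    state |= set(FEATURE_WORK)                # 000411, 000421, 000431
    return state


def diff(a, b):
    """Return (added, removed) going from state a to state b."""
    return (b - a, a - b)


# Both tips hold the same content, so the diff from 000400 is the same.
print(diff(TRUNK_AT_000400, merge_strategy()) == diff(TRUNK_AT_000400, rebase_strategy()))
```

In this model the diff from 000400 contains all three feature changes, including the one made in 000310 before the merge, and none of the trunk changes.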

(81) By Roy Keene (rkeene) on 2019-08-26 19:59:19 in reply to 80.0 [link] [source]

But it would miss the change made in

[00:03] [000310] [feat1 ] Add some stuff

So you would have to diff from 000300 to 000310 and add that to the diff from 000400 to 000330, which may not produce anything useful.

(84) By Richard Hipp (drh) on 2019-08-26 20:10:07 in reply to 81 [link] [source]

But it would miss the [000310] change...

No it wouldn't. That change is still present in the feature branch and absent in the trunk, so it would appear in the diff.

Remember, diff only looks at the end result. Do not be led astray by thinking that diff is somehow influenced by the sequence of changes that led up to the end result. Diff only looks at the end result.

For two files A and B, diff(A,B) is always the same, regardless of what sequence of steps were followed in converting A into B. Diff() only looks at A and B. It does not care about any common ancestors or edit paths.

(85) By ckennedy on 2019-08-26 20:10:58 in reply to 81 [link] [source]

So here is an example of what rkeene is talking about, taken from the recent server-docs branch. Server-docs diff. Even though we only worked on the server docs under /www on that branch, because a merge from trunk was done (twice) the diff now shows all the changes to trunk (including some changes to the C code) as well. I can see this being an issue in clarity when doing a strict diff review.

But with Fossil we can also see the entire timeline including where merges came from. I'm not sure how it is best to handle this situation. I can see it both ways depending on your workflow.

Thanks.

(87) By Richard Hipp (drh) on 2019-08-26 20:22:12 in reply to 85 [source]

You did the wrong diff.

You diffed from the original branch point to the tip of the branch: fbc3b2f to c57e179
You should have gone from the last merge point to the tip of the branch: 9bdf650f to c57e179

Increasingly, I am made aware that I need to somehow provide a mechanism that simplifies viewing a diff of all changes on a branch, without any of the merge-ins. Such a diff is not hard to compute from the point of view of the Fossil implementation. (No new code needs to be written here - Fossil already does everything it needs to do.) The hard part is figuring out how to arrange the interface so that people will find and use the feature. Suggestions on that front are appreciated.
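As a rough illustration of what "diff from the last merge point" means, here is a Python sketch that picks the diff baseline for a branch tip. It uses the check-in labels from the merge example earlier in the thread. The `CHECKINS` table and `diff_baseline` function are hypothetical, the sketch assumes every merge parent comes from the parent branch, and it is not Fossil's actual implementation.

```python
# check-in -> (branch, primary parent, merge parents)
CHECKINS = {
    "000100": ("trunk", None, []),
    "000200": ("trunk", "000100", []),
    "000300": ("feat1", "000200", []),
    "000310": ("feat1", "000300", []),
    "000400": ("trunk", "000200", []),
    "000311": ("feat1", "000310", ["000400"]),  # merge "trunk" into "feat1"
    "000500": ("trunk", "000400", []),
    "000320": ("feat1", "000311", []),
    "000600": ("trunk", "000500", []),
    "000330": ("feat1", "000320", []),
}


def diff_baseline(tip):
    """Walk back from tip along primary parents, staying on tip's branch.

    Return the check-in most recently merged into the branch, or, if the
    branch has no merge-ins, the check-in the branch was forked from.
    Return None if the walk reaches the root without leaving the branch.
    """
    branch = CHECKINS[tip][0]
    current = tip
    while current is not None:
        current_branch, parent, merge_parents = CHECKINS[current]
        if current_branch != branch:
            return current           # stepped off the branch: the fork point
        if merge_parents:
            return merge_parents[0]  # the check-in that was merged in
        current = parent
    return None


print(diff_baseline("000330"))
```

Note that the baseline is the trunk check-in that was merged in (000400), not the merge check-in on the branch itself (000311).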

(89) By ckennedy on 2019-08-26 20:27:44 in reply to 87 [link] [source]

Good to know. I only read about the ability to do the diffs this way earlier today. I had thought that I would need to include the start of the branch to get all the changes that had occurred, but I see your method works much better.

So part of this is just better clarity in the docs.

Thanks.

(90) By Steve Schow (Dewdman42) on 2019-08-26 20:33:03 in reply to 87 [link] [source]

While we're on that topic....

Yes, a one-click way to see a diff of a planned merge before merging, for review and approval would be super useful for me. Ideally, virtually rebased so the common ancestor is as late as possible.

Another diff that I find cumbersome to get is when I have attached various commits to a ticket and I want to see all the file diffs related to the ticket in one click. We can see a list of the commits, and the web viewer comes up with a subset of the timeline showing just those commits on the ticket, but even from that screen, if you diff between the first and last commit, your diff doesn't include the changes in the first commit. And what is the common ancestor in that case? If the ticket was all managed on one feature branch, and ideally it would be, then that is simple enough, but I haven't found a simple way to call up the file diffs for the ticket, because that subset timeline doesn't include the commit just before the first commit in the ticket, which is what the diff would need as its base to show the delta up to the last commit in the ticket. Technically it also gets a little squirrelly if those commits are not sequential: if there were other changes in other commits that aren't in the ticket, how do I make sure I'm just looking at the changes that were attached to the ticket?

(91.1) By Warren Young (wyetr) on 2019-08-26 20:47:44 edited from 91.0 in reply to 87 [link] [source]

It is indeed not obvious how to get the correct diff. If you simply tell someone to diff against the last merge point, a light begins to glow as to why that would help, but then how do you define "last merge point"? Why is this not what you meant? How does a person know that you actually meant for them to start from that checkin's parent instead?

As to how to fix it with UI, how about you add a "Diff from Parent Branch" button to /timeline?r=branch-name so that the steps are:

Click on branch name in /timeline
Click on "Diff from Parent Branch"

Instead of "Parent" you could substitute in the parent branch's name.

(94) By Richard Hipp (drh) on 2019-08-27 02:20:57 in reply to 87 [link] [source]

I need to somehow provide a mechanism that simplifies viewing a diff of all changes on a branch, without any of the merge-ins.

I've been working on this on the vdiff-improvements branch.

You will notice that the vdiff-improvements branch has a merge-in, so it serves as its own testcase.

So far all that I have is this page: https://fossil-scm.org/fossil/vdiff?branch=vdiff-improvements

I don't have hyperlinks to that page yet, but you can clearly see how to get there: Just use the /vdiff page with a branch=NAME query parameter. The NAME can be a branch name, or any check-in identifier. If NAME is a check-in, it shows all the changes in the branch that includes NAME up to and including NAME but excluding all the check-ins that follow.

I'm not real happy with this, but it is a start.

You Can Help!

I have to go off and work on other things for a while. Please beat on the prototype and try to break it. Report bugs here.

Also please suggest ways to improve the new branch-diff page and suggest places where it would be useful to have hyperlinks to that page. I do not mind if you commit your suggested improvements directly to the vdiff-improvements branch, if you have appropriate check-in privileges.

(96) By anonymous on 2019-08-27 09:30:49 in reply to 94 [link] [source]

The prototype doesn't seem to skip cherrypick merges (yet). Consider the following example (run on Windows):

fossil init fruit_basket.fossil
mkdir fruit_basket
cd fruit_basket
fossil open ../fruit_basket.fossil
fossil set crnl-glob "basket1.txt basket2.txt"
echo apple >> basket1.txt
fossil add basket1.txt
fossil ci -m "Added apple"
echo banana >> basket2.txt
fossil add basket2.txt
fossil ci -m "Added banana"
echo orange >> basket1.txt
fossil ci --branch orange -m "Added orange"
fossil update trunk
echo cherry >> basket2.txt
fossil ci --tag cherry -m "Added cherry"
echo peach >> basket1.txt
fossil ci -m "Added peach"
fossil update orange
fossil merge --cherrypick cherry
fossil ci -m "Cherrypick cherry"

The output of http://localhost:8080/vpatch?from=merge-in:orange&to=orange looks like the following:

Index: basket1.txt
==================================================================
--- basket1.txt
+++ basket1.txt
@@ -1,1 +1,2 @@
 apple 
+orange 

Index: basket2.txt
==================================================================
--- basket2.txt
+++ basket2.txt
@@ -1,1 +1,2 @@
 banana 
+cherry

I would have expected the change to basket2.txt not to be included in the diff since it was originally committed on trunk.

(97) By Richard Hipp (drh) on 2019-08-27 09:55:11 in reply to 96 [link] [source]

Cherrypicks are a whole different cup of tea. I do not have any plans to "fix" the branch diff algorithm to "skip" cherrypick merges.

Thanks for testing the code!

(99) By anonymous on 2019-08-27 12:15:06 in reply to 97 [link] [source]

I also noticed that if forks exist on the branch, the fork which was created last will be preferred by the algorithm (just like fossil merge BRANCHNAME will). Maybe the vdiff?branch=NAME page could be disabled / issue a warning for branches which contain cherrypicks or forks.

(100) By Richard Hipp (drh) on 2019-08-27 12:23:00 in reply to 99 [link] [source]

Do you have an example we can look at?

(101) By anonymous on 2019-08-27 12:55:53 in reply to 100 [link] [source]

By "forks" I meant multiple open leaves on the same branch, just to clarify. Of course that scenario is only possible if the person looking at the branch diff is somehow unaware of the unmerged fork(s). I don't know how likely that is, just thought I'd mention it. Here is an example to reproduce:

fossil init fruit_basket.fossil
mkdir fruit_basket
cd fruit_basket
fossil open ../fruit_basket.fossil
fossil set crnl-glob "basket1.txt basket2.txt"
echo apple >> basket1.txt
fossil add basket1.txt
fossil ci -m "Added apple"
echo banana >> basket2.txt
fossil add basket2.txt
fossil ci -m "Added banana"
echo orange >> basket1.txt
fossil ci --branch orange -m "Added orange"
echo plum >> basket1.txt
fossil ci -m "Added plum"
fossil update previous
echo grape >> basket2.txt
fossil ci --allow-fork -m "Added grape"

If I now look at the vdiff?branch=orange, I see only the addition of grape, but not plum. The "Added plum" commit is missing from the short context timeline as well.
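A small Python sketch of the "multiple open leaves" situation, which a branch-diff page could use to decide whether to show a warning. The check-in names mirror the commit messages in the reproduction above. The data layout and the `open_leaves` helper are hypothetical, and closed leaves are ignored for simplicity.

```python
# check-in -> (branch, primary parent), named after the commit messages above.
# Note that the first check-in on branch "orange" is itself named "orange".
FRUIT = {
    "apple": ("trunk", None),
    "banana": ("trunk", "apple"),
    "orange": ("orange", "banana"),
    "plum": ("orange", "orange"),
    "grape": ("orange", "orange"),  # committed with --allow-fork
}


def open_leaves(checkins, branch):
    """Return the check-ins on `branch` that have no child on that branch."""
    has_child = {
        parent
        for (child_branch, parent) in checkins.values()
        if child_branch == branch and parent is not None
    }
    return sorted(
        checkin
        for checkin, (checkin_branch, _) in checkins.items()
        if checkin_branch == branch and checkin not in has_child
    )


leaves = open_leaves(FRUIT, "orange")
if len(leaves) > 1:
    print("warning: branch has more than one leaf:", ", ".join(leaves))
```

With more than one leaf, any diff "to the tip of the branch" has to pick one of them, which is why plum goes missing in the example.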

(109) By anonymous on 2019-08-28 08:30:16 in reply to 101 [link] [source]

The vdiff should probably be disabled for trunk as well. On Fossil's self-hosting repository, it generates about 24 megabytes of HTML content. On my previous fork example, it fails with the following error:

Database error: UNIQUE constraint failed: ok.rid: {INSERT INTO ok VALUES(5);}

(76.2) By Steve Schow (Dewdman42) on 2019-08-26 19:30:12 edited from 76.1 in reply to 74 [link] [source]

I have not been able to read through this entire chain, but having managed large projects using Git/GitHub.com as well as using Fossil for large projects I have a lot of notions of the subtlety that may not be obvious at first. Some of this is probably already mentioned in this thread.

Rebase versus Merge Commits: The value of rebasing versus merge commits is that when you are reviewing a Pull Request (PR), you look at a cumulative diff of that branch. If you use merge commits you'll have those merged changes appear as part of your Logical Set of Changes (LSOC), which is semantically incorrect. Rebase is used instead so that your PR only has commits related to your LSOC. Another strategy would be to provide a diff mode that shows what would change if you merged this branch into trunk (fossil does not provide this), but this only works if the branch is currently mergeable. Fossil has no solution for rebasing.

This is absolutely correct since most 3-way merge tools go back to the common ancestor. Rebase moves that point forward in time, making it as if you started your branch from a later point and helping to isolate your changes. This makes a pre-merge code review much easier and more sensible.

A lot of talk about implementing rebasing ignores private branches, which could be used to provide a rebase-able branch without too much effect on the rest of the system.

If fossil rebase were possible with private branches, that would be something I would use, with squashing and all the rest. It's still not quite what Git provides. In Git your branch is private until it's pushed; then it's not private anymore. And as long as nobody else is working on it at the same time as you, you can rebase and squash without bothering anyone. But that is one way people get into trouble with Git: rebasing something that others are using, and Git doesn't stop that loaded gun from firing. Git does at least block the second person from being able to merge, but then the first person, who rebased and squashed, might get a phone call to help them resolve the situation. Thus the Git golden rule people are supposed to follow. I think if they could provide a way to avoid the situation through the git executable, they would have done so by now. So it's a case where the increased workflow flexibility also increases the danger of this situation happening through wrongful use.

But the more I think about it, I just do not see a good way to allow rebasing and especially commit squashing on non-private branches with fossil.. because of auto sync.

(78) By Steve Schow (Dewdman42) on 2019-08-26 19:33:52 in reply to 74 [link] [source]

And Warren, you gave me an idea worth suggesting: if there was a code review feature built in that does a kind of virtual rebase without actually doing the rebase, that could be useful to see the diff from the newer base.

I think squashing commits in Fossil is probably out because of auto-sync. The fact that people can merge to the trunk and have the commits all consolidated there after the fact is probably good enough.

But having an ability for a third party reviewer to see a sensical, squashed, rebased view of what is about to be merged to trunk or requested to be merged (a pull request, so to speak) would, I think, be an invaluable feature to add to Fossil. It could offer a rebase option that moves the common ancestor to a later check-in, then dynamically figures out everything a rebase would figure out, and essentially provides a 3-way merge view of the proposed changes from that later, rebased point in time as the common ancestor.

(110) By anonymous on 2019-08-29 12:37:18 in reply to 78 [link] [source]

but having an ability for a third party reviewer to see a sensical, squashed, rebased view of what is about to be merged to trunk or requested to be merged

I will use the terms:

branch - a branch to be merged in to its parent
parent - the parent of the above branch, possibly trunk

If the tip of branch is up to date with its parent, whether by rebasing from or merging in from parent, then:

diff( tip(parent), tip(branch) )

will be a preview of the changes to be applied to parent.

If parent has commits newer than the nearest common ancestor, then the diff will show those changes as being "un-applied". However, Fossil will (I think) consider those conflicts when the merge to parent is done.
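A minimal Python sketch of that "un-applied" effect, using the standard difflib module on a single made-up file. The file contents and the `changed_lines` helper are assumptions for illustration only.

```python
import difflib


def changed_lines(a, b):
    """Return only the added ("+ ") and removed ("- ") lines going from a to b."""
    return [line for line in difflib.ndiff(a, b) if line.startswith(("+ ", "- "))]


# The common ancestor of both tips is ["apple", "banana"].
parent_tip = ["apple", "banana", "peach"]              # parent added "peach"
stale_branch = ["orange", "apple", "banana"]           # branch added "orange"
updated_branch = ["orange", "apple", "banana", "peach"]  # after merging parent in

# Branch not up to date: the parent's newer change shows up as a removal.
print(changed_lines(parent_tip, stale_branch))

# Branch up to date: the diff previews exactly what the merge will add.
print(changed_lines(parent_tip, updated_branch))
```

The first diff lists "peach" as removed even though nobody on the branch deleted it, which is why the branch should be brought up to date before the tip-to-tip diff is used for review.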

I think that any "rebasing" should be done first, then create the review invitation containing a vdiff URL to display the diff for review.

For right now, you can get the URL for the vdiff from the timeline by clicking the tip of parent, then the tip of branch, then highlight and copy the URL of the resulting page for pasting into the review invitation.

A possible interim enhancement to Fossil would be a command line command to generate this URL for use by a script to generate and send the review invitation.

Another possible interim enhancement would be for the vdiff page to have a mailto link with a body field to facilitate creating and sending a review invitation. The body field would contain the generated vdiff URL.

The ideal, of course, would be for Fossil to automatically determine the parent of the branch to be reviewed, generate the vdiff URL, fill in a template invitation, allow the user to edit the invitation, then send it.

(111) By anonymous on 2019-08-30 01:51:08 in reply to 78 [link] [source]

if there was a code review feature built in

With JavaScript, you can do it, now.[1] It's ugly, but it works. Not sure how fragile it is.

I can't share the scripts, but I can describe what they do.

On the /info page for a commit, you can get the branch name by looking for the element with id="br-name" then use /ci_tags/root:BR_NAME to get the name of the parent branch. /whatis/PR_NAME will get the hash of the parent's current tip. Element with id="hash-ci" will give you the hash of the commit you are looking at.

Having the hashes of the 2 tips, generate the vdiff URL: BASE/vdiff?from=PR_TIP&to=BR_TIP
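The scripts described here are JavaScript and are not shown, but the URL step can be sketched in a few lines of Python. The `vdiff_url` helper, the base URL, and the hash values below are placeholders. Only the /vdiff path and its from and to query parameters come from the description above.

```python
from urllib.parse import urlencode


def vdiff_url(base, parent_tip, branch_tip):
    """Build BASE/vdiff?from=PR_TIP&to=BR_TIP for two check-in hashes."""
    query = urlencode({"from": parent_tip, "to": branch_tip})
    return f"{base.rstrip('/')}/vdiff?{query}"


# Placeholder base URL and hashes, for illustration only.
print(vdiff_url("https://example.org/repo/", "abc123", "def456"))
```

Using urlencode rather than plain string concatenation keeps the URL valid if a symbolic name containing special characters is passed instead of a hash.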

Then you can generate form data and submit it to BASE/tktnew to create a new review ticket.

When reviewers look at the ticket, they can click on the vdiff link to see the diff listings.

Of course, you should have the branch tip updated from the parent branch. Whether by "rebase" or by "merge in" doesn't matter because the diff between the branch tip commit and the parent tip commit will still show the changes to be applied to the parent's tip.

Can the reviewers look "behind" the diff to see the other commits? Yes. But it is unlikely they will bother to do so.

[1] And, right now, you can also see what the reviewers will see before you write any JavaScript: In the timeline page, click on the tip of the parent branch, then the tip of the branch to be merged. That will give you the vdiff page.