Fossil Forum

Rebase considered harmful

(1) By Richard Hipp (drh) on 2019-09-04 19:13:46 [link] [source]

I have a new draft document available:

Rebase Considered Harmful

This draft contains pointed criticism of Git and rebase. If you think I am being too harsh, or if you think I am not being harsh enough, or if you have ideas for corrections, additions, or clarifications, then please provide your feedback.

Thanks.

(2) By Stephan Beal (stephan) on 2019-09-04 19:26:52 in reply to 1 [link] [source]

Excellent! One minor typo which i won't correct myself because i'm reading/posting from a phone:

... programmers should linking their code with their sense of self, as they makes it more difficult for them to find and respond to bugs, and hence makes them less productive.

Down with rebase!

(3) By ckennedy on 2019-09-04 19:38:16 in reply to 1 [link] [source]

Perhaps mark off the section discussing diffs that begins with:

Another argument, often cited, is that rebasing a feature branch allows one to see just the changes in the feature branch without the concurrent changes

Give it a proper header. Then perhaps explain in a bit more detail why diff(C6,C7) is the correct diff to perform. It was somewhat eye opening to me when you explained it in the forum earlier. It took me a few minutes of jumping between the various diffs and looking at it before the light bulb went off for me. While making people figure out the reasons on their own can be good, so can leading them to the same conclusion gently, or perhaps with a solid example taken from Fossil itself.

(4.1) By Chris (crustyoz) on 2019-09-04 19:40:31 edited from 4.0 in reply to 1 [link] [source]

Comments and errata:

  1. Editing documents without paragraph numbers is more difficult than editing with them. In Markdown, using " 1. " as the paragraph prefix provides automatic number sequencing and makes it easy to remove later.

  2. Section titled "Rebase encourages siloed development", third paragraph starting with "Weinberg", the phrase "That is to say, programmers should linking their code with their sense of self,.." should include "avoid" between "should" and "linking".

  3. Regrettably, this is preaching to the converted while justifying a design decision. The git/github stalwarts who want it are likely to be those who are easily embarrassed by their mistakes.

  4. Consider also, a large contributor population, each making their own noisy branches, and how that noise might be overwhelming for the approvers.

  5. I support these arguments 100%.

(6) By Richard Hipp (drh) on 2019-09-04 19:57:17 in reply to 4.1 [link] [source]

In Markdown, using " 1. " as the paragraph prefix provides automatic number sequencing and makes it easy to remove later.

I don't think Fossil Markdown does this, does it? Or am I missing something? Is this a feature we should add? Can you provide a concrete example of how it is supposed to work (perhaps as a follow-up to this post)?

(7) By ckennedy on 2019-09-04 20:02:38 in reply to 6 [link] [source]

I think he just meant add "1. " verbatim in front of every paragraph. Markdown will then auto-increment. Below, every paragraph starts with "1. ".

  1. This is the first paragraph.

  2. This is the second, but it starts with 1.

  3. This is the third.

  4. Markdown should change all the "1. "'s to the correct number.

(16) By Martijn Coppoolse (vor0nwe) on 2019-09-05 10:27:10 in reply to 7 [link] [source]

Personally, I prefer the Markdown to be readable as-is; and so I tend to use the normal sequence of numbers. If editing leads to a mistake in the numbering sequence, most Markdown interpreters will restore the sequence during rendering. This includes Fossil’s:

  1. First paragraph, numbered 1.
  2. Second paragraph, numbered 2.
  3. Third paragraph, numbered 4 in Markdown, but 3 in the output.
  4. Fourth paragraph, numbered 5.
  5. Fifth paragraph, numbered 5.

This was generated using the following Markdown:

  1. First paragraph, numbered 1.
  2. Second paragraph, numbered 2.
  4. Third paragraph, numbered **4** in Markdown, but 3 in the output.
  5. Fourth paragraph, numbered **5**.
  5. Fifth paragraph, numbered 5.

So it won't make a difference to the rendered HTML; but always writing 1. will make the unrendered Markdown much less readable, IMHO.
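The renumbering described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Fossil's actual Markdown renderer: the `renumber` helper is hypothetical, and it assumes that numbering always restarts at 1 and that each item fits on one line.

```python
import re

# Matches an ordered-list item such as "  4. Some text".
ITEM = re.compile(r"^\s*\d+\.\s+(.*)$")

def renumber(lines):
    """Renumber list items 1, 2, 3, ... regardless of the numbers typed."""
    out = []
    count = 0
    for line in lines:
        m = ITEM.match(line)
        if m is None:
            out.append(line)  # not a list item: pass through unchanged
            continue
        count += 1
        out.append(f"{count}. {m.group(1)}")
    return out

source = [
    "1. First paragraph",
    "2. Second paragraph",
    "4. Third paragraph",
    "5. Fourth paragraph",
    "5. Fifth paragraph",
]
rendered = renumber(source)
```

Under these assumptions, the same function produces identical output whether the source uses the real sequence, a broken sequence, or "1." on every line, which is why the choice only affects how readable the unrendered text is.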

(27) By Chris (crustyoz) on 2019-09-06 14:34:11 in reply to 4.1 [link] [source]

Another small typo:

Section 7.0 Cherry-pick merges work better than rebase

"Perhaps there some cases where a rebase-like transformation is actually helpful."

should be:

"Perhaps there are some cases where a rebase-like transformation is actually helpful."

(5) By ckennedy on 2019-09-04 19:44:50 in reply to 1 [link] [source]

Another awkward phrasing in the Weinberg paragraph:

with their sense of self, as they makes it more difficult for them to find and respond to bugs

(8) By Eric Junkermann (ericj) on 2019-09-04 20:47:35 in reply to 1 [link] [source]

Very good indeed!

(9) By Warren Young (wyoung) on 2019-09-05 00:17:48 in reply to 1 [link] [source]

Would check-ins replacing the ASCII art diagrams with yEd diagrams be welcome, as I recently did for the merging and branching doc?

(10) By Richard Hipp (drh) on 2019-09-05 01:24:34 in reply to 9 [link] [source]

What I would really like is for you to code up that PIC interpreter that I mentioned the other day. :-)

The yEd diagrams look great. My biggest concern with them is whether we will be able to edit them in 5 or 10 years. But you'll generate GIFs or PNGs, right? So if editing is needed down the road and yEd is no longer available, we could just redo all the diagrams using whatever tool is available then.

So, yes. Go ahead with the yEd diagrams.

Meanwhile, I think a home-brew PIC interpreter installed on both Wiki and Markdown would be really, really cool! Who needs a fun project?

(12) By Warren Young (wyoung) on 2019-09-05 01:45:08 in reply to 10 [link] [source]

I chose yEd because it's free-to-use, it stores its output in clean XML files, and it generates clean SVG output for Fossil's web UI. So, if yEd disappears or they decide to start charging for the interactive tool instead of using it as a free come-on for their money-makers, we could at least in principle edit the SVG directly or use an SVG-aware illustration package like Inkscape or Adobe Illustrator to make any needed future edits.

It also helped that its default diagrams were easily configured to match the GIFs previously used in the branching doc.

If someone has a truly libre diagramming tool of yEd's calibre, I'll certainly evaluate it as a replacement.

Meanwhile, we have a free tool that cooperates nicely with Fossil's delta-compressing nature.

(13) By Warren Young (wyoung) on 2019-09-05 01:54:00 in reply to 12 [link] [source]

I should point out that I do already use yEd with Inkscape: for small diagrams, there's a bug you can work around by passing the yEd SVG output through Inkscape.

This implicitly proves the portability of the output, since you've got multiple browsers plus Inkscape all interpreting the diagram contents the same way.

(14) By Andy Bradford (andybradford) on 2019-09-05 04:33:02 in reply to 12 [link] [source]

> Meanwhile, we  have a free  tool that cooperates nicely  with Fossil's
> delta-compressing nature.

And if  it is  ever decided that  the yEd files  are no  longer working,
Fossil already has Richard's original ASCII art committed. :-)

Thanks,

Andy

(15) By anonymous on 2019-09-05 09:59:36 in reply to 10 [link] [source]

Meanwhile, I think a home-brew PIC interpreter installed on both Wiki and Markdown would be really, really cool! Who needs a fun project?

To clarify, what would you like the output format of such an interpreter to be? Personally, my vote is for SVG, since it is both old enough to be supported by the majority of web browsers and flexible enough to look good on devices with wildly different screen sizes and resolutions. Are you aware of any good resources about pic, besides Making Pictures With GNU PIC by ESR?

(17) By Richard Hipp (drh) on 2019-09-05 10:35:21 in reply to 15 [link] [source]

I was thinking SVG output too. If you can come up with a file "pic.c", or perhaps better a Lemon source file "pic.y", that takes an input string of PIC language and returns a string of SVG in space obtained from malloc() or fossil_malloc(), that would be ideal. I can integrate this into the wiki and markdown formatters such that any text between ".PS" and ".PE" gets rendered as SVG.
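As a rough illustration of the ".PS" / ".PE" integration idea, here is a Python sketch of the scanning step only. It is not Fossil code (Fossil's formatters are written in C). The function name `split_pic_blocks` is hypothetical, and the PIC-to-SVG translation itself is left out entirely.

```python
def split_pic_blocks(text):
    """Split text into ("text", str) and ("pic", str) segments.

    A PIC segment is everything between a line starting with ".PS" and
    the next line starting with ".PE"; the delimiter lines are dropped.
    """
    segments = []
    buf = []
    in_pic = False
    for line in text.splitlines():
        if not in_pic and line.startswith(".PS"):
            if buf:
                segments.append(("text", "\n".join(buf)))
            buf = []
            in_pic = True
        elif in_pic and line.startswith(".PE"):
            segments.append(("pic", "\n".join(buf)))
            buf = []
            in_pic = False
        else:
            buf.append(line)
    if buf:
        # Assumption: an unterminated .PS block falls back to plain text.
        segments.append(("text", "\n".join(buf)))
    return segments

doc = 'Intro\n.PS\nbox "PIC"\narrow\n.PE\nOutro'
segments = split_pic_blocks(doc)
```

A formatter built this way would pass each "pic" segment to the translator and emit the returned SVG string in its place, leaving "text" segments to the normal wiki or Markdown path.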

ESR's documentation seems like a good resource. I first encountered PIC (decades ago, literally) from the original Bell Labs Tech Report by Brian Kernighan.

(18) By Richard Hipp (drh) on 2019-09-05 13:08:00 in reply to 17 [source]

Perhaps a better approach would be to put PIC code in ~~~ markup and use PIC as the language name. Like this:

     ~~~ PIC
       ellipse "document"
       arrow
       box "PIC"
       arrow
       box "TBL/EQN" "(optional)" dashed
       arrow
       box "TROFF"
       arrow
       ellipse "typesetter"
     ~~~

This renders as:

     <pre><code class='language-PIC'>
        ... PIC code here ...
     </code></pre>

Then we enhance Fossil slightly so that whenever it sees a <code> with a class of "language-PIC" it also inserts at the bottom of the page:

     <script src='/builtin/pic.js'></script>

Then implement a client-side PIC-to-SVG translator in the javascript file "pic.js".

Perhaps make this even more general by adding options under the /Setup/Wiki menu that allow the admin to select arbitrary javascript source files to include in any rendering that contains <code> of a certain class. That would allow a site to use Mermaid diagrams or whatever other graphics rendering package the admin desires.
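The class-to-script mapping proposed above might look something like the following Python sketch. It is purely illustrative: the `SCRIPTS_BY_CLASS` table stands in for the hypothetical admin setting, `add_renderer_scripts` is an invented name, and only the "/builtin/pic.js" path comes from the post itself.

```python
import re

# Hypothetical admin setting: <code> class -> script to include.
SCRIPTS_BY_CLASS = {
    "language-PIC": "/builtin/pic.js",
}

# Matches the opening tag the Markdown renderer emits for fenced blocks.
CODE_CLASS = re.compile(r"<code class='([^']+)'>")

def add_renderer_scripts(html, scripts_by_class=SCRIPTS_BY_CLASS):
    """Append one <script> tag per renderer needed by the page's <code> blocks."""
    needed = []
    for cls in CODE_CLASS.findall(html):
        src = scripts_by_class.get(cls)
        if src is not None and src not in needed:
            needed.append(src)  # include each script at most once
    tags = "".join(f"<script src='{src}'></script>\n" for src in needed)
    return html + tags

page = (
    "<pre><code class='language-PIC'>box</code></pre>\n"
    "<pre><code class='language-PIC'>arrow</code></pre>\n"
)
result = add_renderer_scripts(page)
```

Pages without a matching class are returned unchanged, so the extra script is only paid for where a diagram actually appears.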

(20) By Stephan Beal (stephan) on 2019-09-05 14:30:25 in reply to 18 [link] [source]

One problem i foresee here is that we'll be outputting to HTML or SVG using a client-specified skin (i.e. font sizes). One of PIC's limitations, according to the ESR docs, is that a box does not grow to fit its text because PIC does not have access to the font size, and therefore cannot do much better than simply use a fixed box size and require the user to grow it as needed via the height and width options. That's all fine and good when one knows exactly which font size(s) will be used in the output, but we won't know that. It's quite possible that a given PIC box will look fine in Skin X and the text will overflow its box in Skin Y (which uses larger font sizes).

Maybe that's a non-issue, but it seems worth pointing out.

(21) By anonymous on 2019-09-05 17:18:13 in reply to 20 [link] [source]

SVG images aren't designed to reflow their contents according to the varying size of the content (like some HTML pages manage to do), so we might as well fix the font size together with the rest of the picture dimensions, like it's currently done in the SVGs from yEd. This way the picture will scale rigidly, preserving the look and the relative positions at all zoom levels. Perhaps we should add a non-standard variable to PIC to set the font size.

There are articles about "responsive SVG", but they all boil down to whether the scaled SVG image will have its aspect ratio preserved and whether the scaled image should be centered; the images still scale as a whole and not according to the size of their inner elements.

(22) By anonymous on 2019-09-05 17:21:59 in reply to 20 [link] [source]

SVG elements can be scaled via CSS. But this situation would require someone with much more experience with CSS than I have.

On the other hand, a client side, JavaScript implementation might have access to the font sizes.

(24) By anonymous on 2019-09-05 17:45:19 in reply to 22 [link] [source]

(this is stephan - a login quirk keeps leading me through the login page but does not actually log me in.)

JavaScript "can" dynamically resize fonts, but it's subject to timing restrictions related to visibility of DOM elements. i set out to do that for a hobby project a couple months ago and nearly went insane fighting with its limitations. In essence, the code has to keep increasing the font size until it determines that it's too big (via the various size-related DOM element properties) then back up with slightly smaller sizes until it finds a perfect fit. It's extremely inefficient, but possible. i don't recommend bothering, though.
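The grow-then-back-off search described above can be sketched as follows. Python is used here for illustration only. The `measure` callback is a stand-in for reading the rendered width from the DOM, and the step sizes are arbitrary assumptions.

```python
def fit_font_size(measure, box_width, start=1.0, coarse=4.0, fine=0.5):
    """Find a font size whose measured text width fits within box_width.

    measure(size) must return the text width at that font size and is
    assumed to grow as size grows.
    """
    size = start
    # Phase 1: grow in coarse steps until the text no longer fits.
    while measure(size) <= box_width:
        size += coarse
    # Phase 2: back off in fine steps until it fits again.
    while size > fine and measure(size) > box_width:
        size -= fine
    return size

def toy_measure(size):
    # Toy model: the text is three times as wide as the font size.
    return 3 * size

best = fit_font_size(toy_measure, box_width=30)
```

Each call to `measure` corresponds to a layout pass in a real browser, which is where the inefficiency mentioned above comes from.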

(25) By Stephan Beal (stephan) on 2019-09-05 17:53:09 in reply to 24 [link] [source]

(this is stephan - a login quirk keeps leading me through the login page but does not actually log me in.)

It seems that that's caused by a Firefox for Android bug and affects all sites i use, not just Fossil. Force-closing the browser resolves it. In any case, it is apparently not related to the intermittent "post mis-attribution" bug.

(23) By anonymous on 2019-09-05 17:39:30 in reply to 18 [link] [source]

Perhaps a better approach would be to put PIC code in ~~~ markup and use PIC as the language name.

I like this idea and a quick search shows there's some precedent for this approach.

(26) By anonymous on 2019-09-05 17:53:27 in reply to 18 [link] [source]

Thank you for the link to the lab report! Appendix A is especially useful thanks to its grammar description.

While I understand the usefulness of client-side renderers, I would appreciate it if diagrams were rendered server-side.

That being said, for server-side PIC rendering we have two options. We can take dpic apart, remove all renderers except SVG, replace the printf calls with formatted output to an allocated buffer (there might be a lot of global variables to take care of, though), and try to embed it in Fossil (including the Bison-generated parser and the scary-looking parser.w produced by p2c). Or we can try to write a new implementation with library calls in mind (which is probably a harder task). Which option would be a better choice?

(19.1) By Andreas Kupries (aku) on 2019-09-05 14:20:41 edited from 19.0 in reply to 17 [link] [source]

There is also the earlier-mentioned dpic and its documentation.

To me, the biggest issue with a PIC implementation is the macro system, because it looks to require quite a bit of shenanigans in the lexer to get all the text substitutions right. As an aside, does anybody know if there is a test suite for PIC, and where to find it?

On the security side the sh command likely has to be forbidden. Similarly copy has to be changed to not use the filesystem. Its path argument should refer to artifacts in the repository instead.

(11.1) By Andy Bradford (andybradford) on 2019-09-05 01:42:22 edited from 11.0 in reply to 1 [link] [source]

I think  the document  is excellent reading  material, and  it certainly
clarifies the philosophical and practical differences between Fossil and
Git. I don't  think the document is too harsh  (though the conclusion is
strong medicine).

Specifically, I think this is insightful:

    Surely a better  approach is to record the  complete ancestry of
    every check-in but  then fix the tool to show  a "clean" history
    in those instances  where a simplified display  is desirable and
    edifying, but retaining  the option to show  the real, complete,
    messy  history for  cases  where detail  and  accuracy are  more
    important.


Since the document  is about rebase, perhaps a small  mention of "squash
commits"  would also  be in  order,  though really  perhaps that's  just
another name for using rebase to "clean up history".

I have to  wonder how much of  the reason why people use  rebase so much
with  git can  be  attributed to  GitHub and  Pull  Requests. As  you've
already pointed  out many times, getting  a complete picture of  what is
being  merged in  is easily  available by  simply analyzing  the correct
diffs---and a diff between the most  recent ancestor and a branch pretty
much shows you the "squashed" view of the timeline, does it not?

Thanks,

Andy
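To make the "diff against the most recent common ancestor" idea concrete, here is a small Python sketch that finds that ancestor in a check-in graph. The graph, the check-in names, and both helper functions are hypothetical, and the breadth-first search is a simplification rather than a description of how Fossil or Git actually select a merge base.

```python
from collections import deque

def ancestors(parents, node):
    """Return node plus every check-in reachable from it via parent links."""
    seen = {node}
    queue = deque([node])
    while queue:
        for p in parents.get(queue.popleft(), []):
            if p not in seen:
                seen.add(p)
                queue.append(p)
    return seen

def nearest_common_ancestor(parents, a, b):
    """Search back from b; return the first check-in that is also an ancestor of a."""
    reachable_from_a = ancestors(parents, a)
    seen = {b}
    queue = deque([b])
    while queue:
        node = queue.popleft()
        if node in reachable_from_a:
            return node
        for p in parents.get(node, []):
            if p not in seen:
                seen.add(p)
                queue.append(p)
    return None

# Hypothetical history: trunk is T1 -> T2 -> T3; a feature branch forks at T2.
parents = {
    "T2": ["T1"],
    "T3": ["T2"],
    "F1": ["T2"],
    "F2": ["F1"],
}
base = nearest_common_ancestor(parents, "T3", "F2")
```

In this toy history, a diff from `base` to "F2" contains exactly the feature-branch changes, which is the same view a squashed commit would present, without discarding F1.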

(28) By Steve Schow (Dewdman42) on 2019-09-06 15:35:24 in reply to 11.1 [link] [source]

I think it would be interesting if this article also presented the case that history cleanup could be done in fossil by using private branches, and how that would be done. I do think that is a big reason why most people use rebase.

(29) By anonymous on 2019-09-06 17:30:41 in reply to 28 [link] [source]

history cleanup could be done in fossil by using private branches

While that is true, and I think that private branches do have a place in fossil, they also introduce one of the problems that rebase introduces: intentional forgetting of information.

Merging a private branch to trunk gives the same result as squashing onto trunk in Git does (for example, with git merge --squash or an interactive rebase that squashes every commit). The resulting commit does not refer to the branch it came from.

Private branches really should only be used for experiments. If the experiment produces something worth merging in, then why not publish the branch? No matter how ugly the chain of commits, publishing gives others deeper insight into the new feature.

While I like the idea another poster suggested, "weak references", I think a new card type in manifests would be a better way to implement it.

As for why have "weak references", I think that DRH's article already makes a case.

(46) By Warren Young (wyetr) on 2019-09-06 19:44:34 in reply to 29 [link] [source]

Private branches really should only be used for experiments.

Through hints in the various discussions on this, I've come to believe that private branches in Fossil were specifically created to allow for proprietary extensions to SQLite that are never pushed to the public SQLite repos. Only renegotiation of the contract those features were developed under would give cause to push those branches to the public repos.

"Private" encompasses a wide range. There's "private" as in playing your cards close to the vest, and there's "private" as in not letting anyone even know whether you're playing whist, Settlers of Catan, or 5D hyper-dimensional Tetris.

(30.2) By Steve Schow (Dewdman42) on 2019-09-06 17:57:14 edited from 30.1 in reply to 28 [link] [source]

Intentional forgetting of information is only a problem if a requirement is to have strict auditing of all coding activities. I think this gets back to fossil being the non-bazaar style of software production, which is totally fine, but it does limit fossil's usefulness for some, if people are unable to consolidate their commits as they can with git.

Fossil devs take a strong philosophical stand that no coding should be done by anyone without an overlord watching every step to see what everyone is up to. That is useful in certain situations, but there are many situations where that is undesirable and discourages active involvement and discourages frequent commits by various contributors.

Fossil might be more useful and interesting to more people if they could see how to function in more of a bazaar style of development. Private branches do provide that possibility somewhat. With Git you can basically do an interactive rebase and squash your commits down to whatever you want: down to one commit, or perhaps down to 2 or 3 commits, or perhaps only combine a couple of nonsensical commits into single meaningful commits as makes sense. There is no harm in doing this unless you have a philosophy that developers are stupid and incapable of making those kinds of decisions before presenting their work to peers for review and merge consideration.

In fossil, on a private branch, you could commit away with reckless abandon and end up with a long series of confusing commits on your branch, then at the very least you could do one merge to a public branch, and that one resulting merge commit would be like doing a squash-to-one rebase in git. Then the overlords could see the work, do peer review, etc., on the public branch. That is a perfectly legitimate way to work and I see no harm in it. I think perhaps some git users would be more interested in fossil if they could see these kinds of possibilities in fossil, rather than being told they are blasphemous to even consider reorganizing their commits on their private repo before anyone else sees what they are doing. I think it's related to this article because I think a lot of git users use rebase specifically for this purpose, not for some advanced theory about why they think rebasing two branches together is better or worse than merging.

The only thing I don't like about that is that one of fossil's strengths is the distributed backup of all commits. That doesn't happen in the git world, and doesn't happen on fossil private branches either.

(31.1) By Steve Schow (Dewdman42) on 2019-09-06 18:06:38 edited from 31.0 in reply to 30.2 [link] [source]

And actually, the thing I like about using the private-branch approach to squashing commits is that a developer cannot really accidentally screw up the work they did in the last few days. It's still all there in the private branch, with all the confusing series of commits; it's all right there, they haven't lost it. And they aren't changing history per se, but they are adjusting what anyone else will see by keeping it private until it's squashed into a public branch.

And I think the vast majority of git rebasers out there, just want to do that...

(36) By anonymous on 2019-09-06 19:11:22 in reply to 31.1 [link] [source]

its all right there, they haven't lost it

Sort of. The branch is still there, but information about the merge is lost. (Maybe you mention, in the commit comment, you merged in from some branch, but do you really put the commit ID of the merge parent in the comment?)

but it does limit fossil's usefulness for some, if people are unable to consolidate their commits as they can with git

You can consolidate your commits with Fossil. The difference is, with Fossil, after you have squashed it into a public branch, others still have the option to "look behind the curtain". With git, there's nothing behind the curtain.

(44) By Steve Schow (Dewdman42) on 2019-09-06 19:38:23 in reply to 36 [link] [source]

Sort of. The branch is still there, but information about the merge is lost. (Maybe you mention, in the commit comment, you merged in from some branch, but do you really put the commit ID of the merge parent in the comment?)

I don't know the right answer; this is where I'd like to see fossil provide built-in commands to handle things properly. No, I don't think the commit going to the public branch should refer back to the private branch; others can't see into the private branch. When you say the merge information is lost, what do you mean? Just referring to the parent private branch, right? Well, again, this could be resolved in some way by enhancements to fossil so that merging from a private to a public branch basically makes it look like the change was just a commit on the public branch, as if someone made the changes there. There is no need or desire at that point to know anything about the private branch.

You can consolidate your commits with Fossil.

Well, this is what I think RH's paper ought to touch on a little bit. Fossil can do it too! Mention that in this case rebase can be avoided by using a different and superior approach. I, for one, do not even know what the exact procedure would be to do that, btw. And I find it too much of a PITA to work on private branches and remember to do all that extra branching, etc., so for my own stuff I just stick to default and typical fossil behavior, and live with confusing commits for all of posterity. But if fossil made it easier, I would definitely use private branches and clean up my commits before making them official with a push.

The difference is, with Fossil, after you have squashed it into a public branch, others still have the option to "look behind the curtain".

How are they able to look behind the curtain at private branches? I may not be understanding something.

With git, there's nothing behind the curtain.

Well, actually there is. The commits are still there! They just aren't attached to the branch chain in the right way to find them easily. And once the remote repo is pushed or pulled, I would assume those commits would go to the other repos also. Hidden as they may be, they are there and a smart git user could probably find them. Whereas with fossil private branches, they are truly private to the repo. Right?

(55) By anonymous on 2019-09-07 10:44:53 in reply to 44 [link] [source]

after you have squashed it into a public branch

Sorry, I meant "target branch".

And, I was referring to the usual case of merging from a public branch to another public branch.

Obviously, until a way to record merges of private branches into public branches is developed, even publishing a private branch will still leave it effectively orphaned.

Of course, if the private branch is published before it is merged to a public branch, then the merge parentage is fully recorded.

The commits are still there

But there's nothing but the memory of the user that rebased to link the merge commit back to its merge parent.

Unfortunately, merging private branches has this same problem, for now.

I would assume those commits would go to the other repos also

Only if the user pushing specifies those branches be pushed. (FYI, you can push private branches on Fossil.)

(32) By Stephan Beal (stephan) on 2019-09-06 18:11:55 in reply to 30.2 [link] [source]

... but it does limit fossil's usefulness for some, if people are unable to consolidate their commits as they can with git.

Just out of curiosity: isn't git the first SCM in history which (intentionally) enables changing/losing history?

(33) By Steve Schow (Dewdman42) on 2019-09-06 18:33:54 in reply to 32 [link] [source]

I have no idea what the answer to that is. But in the past SCMs were lock-oriented and much more constrictive than they are now, in more ways than one.

But anyway, just because nobody did it before doesn't necessarily mean that it's not appropriate to allow it in certain situations. Again, there is no reason for the overlords to see every single semicolon change made to source code and committed in a series of haphazard commits. Git is more like an editing tool, not an auditing tool. The developer edits what they plan to present for consideration, which will happen to have delta change information in what is finally presented.

Take SCM out of it for a minute. Say you are editing files in your text editor, and you edit a bit here and a bit more there, hopefully saving often. Your boss doesn't need to see every single keystroke you make. Your boss doesn't need to see every single SAVE you make to the file either. At some point, if you make a commit to some kind of SCM, then your boss and other devs get to see what you felt was enough of a change to be worthy of a commit. But all too often people make a commit, then they think, ah darn, I need to fix a couple more things and do another commit. Their boss doesn't really need to see that "oh darn" moment. Nobody does except you.

Perhaps traditional SCMs kind of force a situation where devs either commit infrequently in order to preserve that sense of privacy, or the official and final series of commits will be in some cases hectic, busy and hard to follow for anyone other than the original developer who actually made that series of commits. Git makes committing just a step above saving. It's just an editing step, nothing more. When they are ready to present their work for review, they can edit the files all they wish, edit the commits all they wish and present it as a pull request. If they want a single commit for the whole pull request, they can do that, or if they want a series of meaningful commits that retrospectively just make more sense and will make more sense a year from now when someone is looking back at it too, then git lets you do that. It's just an editor, and it allows you to edit the way your source code changes will be stored in an SCM. Without an ability to do that, I will say that most people will avoid committing anything until they reach certain checkpoints that are really solid and worthy of being a long-term commit for all to see.

I don't really understand why some of you consider that to be a problem. It's just a tool that gives developers a bit more flexibility to hack on their code with whatever reckless abandon they want and, for their own purposes, they can commit often, which can be useful for them too by the way, in a private way, to see the ugly ups and downs of what they were doing to get where they needed to get. So committing deltas to some kind of SCM is useful for them, but at some point they want to do more of a well-thought-out REAL commit: the commit that others will see, the commit that they will want to look at a year from now without having to wade through a long series of haphazard commits they actually made in order to figure out what the heck was going on in their mind when they did it.

I see this capability as being an improvement and added flexibility.

Anyway, my only point is that if you're going to write a document that labels "rebase" as an anti-pattern that should be avoided at all costs, then this particular aspect of why people are rebasing a lot needs to be discussed also. It's a perfectly reasonable desire to work privately, commit often, and have a way to bring consolidated commits back to others for review and for future posterity. It makes SCM part of the editing process rather than being an auditing log.

(34) By Stephan Beal (stephan) on 2019-09-06 18:50:42 in reply to 33 [link] [source]

I don't really understand why some of you consider that to be a problem.

i started my query with "just out of curiosity", not to start a flame against this aspect of git. i agree entirely that "historical momentum" is not necessarily a great reason to keep a status quo. That said...

Git makes committing just a step above saving.

It also introduces the ~~possibility~~ inevitability of data loss (hmmm - apparently fossil's markdown doesn't do strikethrough). i've literally lost more data via git than any other single piece of software (not counting /bin/rm). "Once burned, twice shy" certainly applies to me.

It makes SCM part of the editing process rather than being an auditing log.

Keeping in mind that using an editor without "undo" support is often painful.

The ability to mutate history is a different approach, no less valid than fossil's, but it's demonstrably more fallible in the sense that it can and does lose data (sometimes data the user did not intend to lose, and sometimes other people's data).

"With great power" and all that.

(35) By Steve Schow (Dewdman42) on 2019-09-06 19:02:26 in reply to 34 [link] [source]

Oh yeah, I'm sure we have all lost data using our editor when we did or didn't save something how we should have. This is always a possibility with any editing tool, including git.

And I agree, that is one reason I like the private-branch approach in fossil for achieving the same effect that most people are achieving in git by interactive rebase (i.e., changing the history). Changing history in git means losing history. It's the same as saving your changes to a file without any versioning in place. It's gone forever. Maybe that's desirable for some, but I think most people would probably prefer the best of both worlds, which is to keep a private history and never lose it, but be able to reconstruct what is going to merge back to public repos for others to see or for becoming the "official" series of change sets.

In git they do that with interactive-rebase, admittedly in a destructive way...though...as I pointed out on that other thread, their original commits are actually still there! They just aren't chained into the current branches anymore. They could still be found if you did something stupid during interactive rebase. But admittedly it would be complicated and a PITA.

I haven't used git long enough to lose any work. I might be whistling a different tune after that happens, if it happens.

Another thing about both git and fossil is that, because they are distributed, it's no skin off anyone's nose to just copy a repo and try something out and still have the original just in case. Just like you would if you were editing a file and wanted to make sure you could get back to the prior version, without having to commit anything into an SCM to actually have that short term versioning. This is a valuable tool to have, the ability for short term versioning, for whatever purpose.

(39) By Steve Schow (Dewdman42) on 2019-09-06 19:22:47 in reply to 34 [link] [source]

I guess my best of both worlds answer and suggestion would be something like this...

  1. Fossil would prevent public shared branches from being confused by rebases; for merging branches, use merges.
  2. Fossil would provide easy and automatic diff and merge resolution based on the ideas RH put forth in this paper about which nodes need to be diffed to get the cleanest view of a feature branch for code review. In RH's example, it was necessary to merge from mainline to feature branch first in order to see this isolated diff; it might be preferable to have the ability to VIRTUALLY do that somehow for code review purposes. I think if Fossil had an actual code review feature that guided people through the right steps to do all of this, it would eliminate all concerns.
  3. Fossil would prevent public shared branches from having history rewritten, ever.
  4. Fossil would keep all private branch commits around as a short term history, it lasts until the repo is destroyed. Can't rewrite them.
  5. But... Fossil would provide an EASY way for developers to work privately and rewrite the history that would be merged back to public branches. Right now it can do it as a singular squash commit into a public branch, but even better would be the ability to reconstruct whatever set of commits desired to be merged and pushed into public shared repos. And all of it could be put into built in commands to make it simple and easy without having to understand all the branch mechanics, so that it could be done always the same way easily and consistently.

Do all that and I would see very little justification for rebase. The real reason people are using rebase would have a solution in fossil too, actually a superior solution, since the private branch would still be keeping a full private short term history even while they are attempting to squash their commits for review and merge into mainline.

There are a few other examples I see given out there, rare situations for when an administrator might want to do some kind of rebase for some arcane situation, and I think most of those could be ignored for the time being or dealt with specifically if the need came up.

Another thing people do with rebase, for example, is they merge a branch into mainline, or rebase I should say, and in some cases if there are no conflicts it can do a so-called fast-forward, which basically just copies the commits from the feature branch onto the tip of mainline directly, and the feature branch can be completely deleted as if it never existed. Makes it look like mainline had the series of commits to begin with. I can see how some people might want to do that, and rebase makes it possible, but I think that's also something people can live without. Branches are cheap, so it is better to keep a feature branch around; it's only a question of how many commits should show up on the final public version of that feature branch and how meaningful those commits should be. Opinions differ.

(43) By anonymous on 2019-09-06 19:34:56 in reply to 39 [link] [source]

if there are no conflicts it can do a so-called fast-forward, which basically just copies the commits from the feature branch onto the tip of mainline directly ... Makes it look like mainline had the series of commits to begin with. I can see how some people might want to do that, and rebase makes it possible

Don't need rebase to do that. Rebase makes it easier to do that, but you can do it without rebase.
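
As a rough illustration of the point above, here is a minimal Python sketch of a fast-forward on a toy commit graph. The commit names and the `is_ancestor` and `fast_forward` helpers are hypothetical and are not Git or Fossil APIs. It assumes the usual Git meaning of the term: when the mainline tip is already an ancestor of the feature tip, only the branch name moves and no commit is rewritten, which is why rebase is not required.

```python
# Toy commit graph: each commit maps to the list of its parents.
# All names here are hypothetical and exist only for illustration.
parents = {
    "C1": [],
    "C2": ["C1"],
    "F1": ["C2"],  # the feature branch starts at the mainline tip C2
    "F2": ["F1"],
}
branches = {"mainline": "C2", "feature": "F2"}


def is_ancestor(ancestor, commit):
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents[current])
    return False


def fast_forward(target, source):
    """Move `target` to the tip of `source`; no commit is created or changed."""
    if not is_ancestor(branches[target], branches[source]):
        raise ValueError("branches have diverged; a real merge is needed")
    branches[target] = branches[source]


commits_before = set(parents)
fast_forward("mainline", "feature")
print(branches["mainline"])            # mainline now names the feature tip
print(set(parents) == commits_before)  # the set of commits is unchanged
```

If mainline had gained commits of its own in the meantime, the ancestor check would fail, and that is the case where people reach for either a merge or a rebase.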

(45) By Steve Schow (Dewdman42) on 2019-09-06 19:43:16 in reply to 43 [link] [source]

great! Point that out in the article too.

(49) By ckennedy on 2019-09-06 19:59:11 in reply to 39 [link] [source]

In RH's example, it was necessary to merge from mainline to feature branch first in order to see this isolated diff; it might be preferable to have the ability to VIRTUALLY do that somehow for code review purposes. I think if Fossil had an actual code review feature that guided people through the right steps to do all of this, it would eliminate all concerns.

You can diff any checkin to any checkin in Fossil. It's working on this website right now. Go to the timeline, click on any of the open circles on the timeline. Now click on any other timeline circle. You will get a diff between those two checkins, even if they are on different branches. You can also use fossil diff with the --from VERSION and --to VERSION parameters.

Additionally look at the --checkin VERSION abilities of fossil diff. Fossil has some very strong features for code review right now. You just have to know they are there, and how to use them.

While some sort of guided code review would be awesome, that is best left as an extension, as your version of code review might be different from my version.

Thanks.

(37) By Marcelo Huerta (richieadler) on 2019-09-06 19:17:27 in reply to 33 [link] [source]

It's a perfectly reasonable desire to work privately, commit often, and have a way to bring consolidated commits back to others for review and for future posterity.

If you work in a project, especially under an employer and with proprietary code, there is no such thing as "your" code.

It makes SCM part of the editing process rather than being an auditing log.

And that's a problem. Many external audits consider that the SCM provides a way to see what code was active at a certain date. The ability to rewrite history could help mask a crime. I'd prefer an SCM with "fossilized" commits and tools to mark several intermediate commits as hidden if they are not serious changes; you can diff between the visible nodes and see the changes in consolidated form, but you can also show the intermediate steps if you need to know exactly what was active in each branch at a specific point in time.

I'm a mere observer and not affiliated with Fossil other than as a happy user, but I would advise you to re-read what you wrote, and especially to consider the tone in which it is written. Many git users come to Fossil apparently with friendly and favorable intent, but after some exchanges with the local community, which is very happy with a Fossil without rebase and with the current state of affairs, while happily receiving the changes that really help it, the git users turn into git evangelists, foaming at the mouth and accusing the Fossil community of being backwards and ludicrous in not embracing the wonderful freedoms that rebase provides. It has happened repeatedly, to the point that it starts to be cringeworthy.

The Fossil community should be free to publish their philosophical position about rebasing without having to endure the attacks of happy rebasers who are still free to do whatever their favorite SCM allows them.

(40.1) By Steve Schow (Dewdman42) on 2019-09-06 19:25:12 edited from 40.0 in reply to 37 [link] [source]

As I said already, there are use cases for full auditing of every change, but I would argue the vast majority of people do not need that. And hopefully you are not saying that Fossil should be constrained only to use cases where an employer owns every line of code and the right to overlord every single time you save anything?

I mostly work alone. And even I want to consolidate my commits sometimes.

(48) By Marcelo Huerta (richieadler) on 2019-09-06 19:58:22 in reply to 40.1 [link] [source]

there are use cases for full auditing of every change, but I would argue the vast majority of people do not need that.

And I would argue that the vast majority of rebases serve to protect the author's ego.

(42) By Steve Schow (Dewdman42) on 2019-09-06 19:29:27 in reply to 37 [link] [source]

I don't think there is anything wrong with anything I have written or my tone. Just throwing out ideas. Perhaps you should not presume to know what I know or don't know or what my attitude is. Just responding. RH did ask for feedback.

(47) By Marcelo Huerta (richieadler) on 2019-09-06 19:56:39 in reply to 42 [link] [source]

I don't think there is anything wrong with anything I have written or my tone.

That's part of the problem.

Perhaps you should not presume to know what I know or don't know or what my attitude is.

And perhaps Git advocates should stop attempting to turn Fossil into another Git clone. Git exists. That should be enough.

Just responding. RH did ask for feedback.

And that exempts you from criticism somehow?

(54) By sean (jungleboogie) on 2019-09-07 03:52:38 in reply to 37 [link] [source]

In Dewdman42's defense, I don't think he has a bad tone. He's been very thoughtful to explain why some find rebase necessary, or at least, a positive thing for their development needs. Perhaps git-ers don't fully understand the consequences of rebase, and just maybe they'll read drh's paper and have a better understanding of Fossil's ideals of why rebase doesn't make sense in a version control system.

As we saw in the lobste.rs post a couple weeks ago, many people seem to enjoy rebase and I think Dewdman42 is just explaining that point of view, even if it's unfounded or an improper way to manage source control to those who oppose it. Now I will agree that some of the comments on the lobste.rs site were harsh with their comments about Fossil. Because the Fossil vs. git page may be controversial, it can draw out some emotions for those who are favorably using rebase and even git itself.

I appreciate all that Dewdman42 has written to explain his thoughts and reasoning on rebasing.

(59) By anonymous on 2020-01-28 00:08:05 in reply to 37 [link] [source]

Fell down a rabbit hole and saw:

Many external audits consider that the SCM provides a way to see what code was active at a certain date.

This is another reason Fossil should have some kind of "soft reference" for merge-parents that are on a private branch.

Again, I know that the comment on the commit of the merge could include the merge parent, but that's not reliable. The committer could make mistakes: wrong parent, typo in the ID or branch/tag name or other mistake.

(52) By sean (jungleboogie) on 2019-09-06 21:18:55 in reply to 33 [link] [source]

This is probably a tale of "don't commit a giant pull request at once", but I've also run into an issue that may be the result of a rebase. I have no idea, since I wouldn't know what to look for on git/github.

Caddy is a webserver written in go and it's cross platform. However, someone checked in a change that causes Caddy to die on http requests - so it's kind of useless as a webserver now.

See https://github.com/caddyserver/caddy/issues/2694

So I found the exact commit with the issue: https://github.com/caddyserver/caddy/commit/c32a0f5f712f0dee7b473946fcee65e826501e56 and its pull request: https://github.com/caddyserver/caddy/pull/2551

So I want to try checking specific commits to see if things will begin to work as expected. So trying a random commit as an example, git tells me this:

$ git checkout 65af5479217b1dab60698501024aa8da0c62685b
fatal: reference is not a tree: 65af5479217b1dab60698501024aa8da0c62685b

Again, in this specific case I don't know if the author rebased or deleted his branch, but the sha hash, I guess, is somehow outside the knowledge of this github repo?

In this case, I'd prefer all saves be available to me so I can test one-by-one, or have the ability to skip ahead some. Maybe I do and I am just doing something wrong.
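
One possible explanation, offered as an assumption rather than a diagnosis of the Caddy repository: a clone normally transfers only the commits reachable from the branches and tags it fetches. Commits that were squashed or rebased away before landing may survive only on the hosting site, so the local repository has never seen them and `git checkout` cannot resolve the hash. The Python sketch below models that with a made-up commit graph; the commit names and the `reachable` helper are hypothetical.

```python
# Hypothetical server-side state: commit -> parents, plus the named refs.
parents = {
    "A": [],
    "B": ["A"],
    "S": ["B"],    # squashed commit that landed on master
    "P1": ["B"],   # original pull-request commits, no longer on any branch
    "P2": ["P1"],
}
refs = {"master": "S"}  # no branch or tag points at P2 any more


def reachable(refs, parents):
    """Collect every commit reachable from at least one ref."""
    seen, stack = set(), list(refs.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


cloned = reachable(refs, parents)   # what a fresh clone would contain
missing = set(parents) - cloned     # commits the clone cannot check out
print(sorted(cloned), sorted(missing))
```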

(53) By anonymous on 2019-09-06 22:12:04 in reply to 52 [link] [source]

Rebase and squash is either “I have been working on this and following my personal review this is my minimal functional summary of the necessary change set” or “I really don’t want to share my embarrassing intermediate efforts so here is the pristine final version that looks as though I wrote it perfectly first time”.

(38) By Eric Junkermann (ericj) on 2019-09-06 19:22:22 in reply to 30.2 [link] [source]

Intentional forgetting of information is only a problem

Those who forget history are condemned to repeat it!

Fossil devs take a strong philosophical stand that no coding should be done by anyone without an overlord watching every step to see what everyone is up to.

If you think that you have probably misunderstood loads of other things as well.

(41) By anonymous on 2019-09-06 19:26:48 in reply to 30.2 [link] [source]

Fossil devs take a strong philosophical stand that no coding should be done by anyone without an overlord watching every step to see what everyone is up to.

Where do you get that from?

While it is true that, most of the time, no one is going to look at anything not on trunk, sometimes, having access to that "ugly branch" makes fixing a problem easier.

Even in git, if you don't squash commits, rebasing still leaves the "ugly series" commits visible. It just moves them to the master's tip.

In fossil, on a private branch, you could commit away with reckless abandon and end up with a long series of confusing commits on your branch

And what's wrong with publishing that branch? Doing so really does benefit the project in the long run.

and that one resulting merge commit would be like doing a squash-to-one rebase in git

You get the same squash-to-one commit even if you also publish the branch.

(50) By Warren Young (wyetr) on 2019-09-06 20:14:32 in reply to 30.2 [link] [source]

no coding should be done by anyone without an overlord watching every step to see what everyone is up to.

I think you've misunderstood this sentence in the development organization section of the Fossil vs. Git document: "Fossil places a lot of emphasis on synchronizing everyone's work and on reporting on the state of the project and the work of its developers, so that everyone — especially the project leader — can maintain a better mental picture of what is happening, leading to better situational awareness."

Nowhere in that document do we use the word "overlord," because it would be inaccurate and thus unhelpful in drawing the distinctions we wish to make.

The only reason we talk about this feature of Fossil being of especial help to the project leader is because it is that person's job to watch over the state of the project. If you have no such role in your project, you have no leader. Even the stone soup fable, which is a model for much of FOSS, had a necessary leadership role baked into it.

But even if you are a member of a truly leadership-less project, you still want to know what your other collaborators are up to. That means keeping up on what they're checking in on their working branches, rather than be surprised when they suddenly dump weeks or months of work into the project in a single lump.

The Linux kernel is one of the most bazaar-like projects of all. It is, in fact, the primary project ESR had in mind when he wrote The Cathedral and the Bazaar. Yet, in the Linux kernel project, one of the characteristics most likely to get a patch rejected is for it to do too much in a single jump. It took years for single-big-lump contributions like XFS to become mainstream in Linux for this reason.

We don't even have to talk about this in terms of software. It is mathematically provable that any feedback system can change state faster and proceed faster, with safety, if it has a short feedback loop. This branch of mathematics is called control theory, and it explains why a wide variety of systems work as they do. A huge variety of physical systems from steam engines to analog electronics to chemical manufacturing processes to environmental systems can be explained by control theory. And a misunderstanding of control theory is a common reason why such systems fail!

Watching your co-contributors' work in Fossil and reacting to it is a type of feedback control, and it keeps a project on-track in the same way that negative feedback keeps an op-amp circuit from going into oscillation.
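
The feedback-delay argument can be sketched with a toy simulation. This is an assumed, deliberately simple model (a proportional correction with gain 0.5 acting on a reading that is `delay` steps old), not a model of any real project; the `simulate` function and its parameters are made up for illustration.

```python
def simulate(delay, gain=0.5, target=1.0, steps=200):
    """Steer a value toward `target` using a reading that is `delay` steps old.

    Returns the worst absolute error seen during the final 20 steps.
    """
    xs = [0.0]
    for t in range(steps):
        observed = xs[max(0, t - delay)]  # stale view of the current state
        xs.append(xs[-1] + gain * (target - observed))
    return max(abs(target - x) for x in xs[-20:])


prompt_feedback = simulate(delay=0)  # corrections use current information
stale_feedback = simulate(delay=4)   # same gain, but older information
print(prompt_feedback, stale_feedback)
```

With no delay the error decays toward zero. With the same gain acting on a reading four steps old, the corrections overshoot and the error grows instead of settling, which is the overcorrection effect described above.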

If you want to consider mathematics an "overlord," fine. I welcome our eternal and immutable overlord...because to deny mathematics is just plain silly.

(51) By Warren Young (wyetr) on 2019-09-06 20:43:56 in reply to 50 [link] [source]

The same mathematical foundation underlies the OODA loop, which was originally formulated to explain the demonstrated success of the F-86 Sabre over the supposedly superior MiG-15 in the Korean War. This concept is now widely employed in business, often with accompanying martial imagery, but you can discard all of the aggressive war-speak of it; it's just one application of the idea, not core to it.

In the context of Fossil, when you work on private branches and squash your work to single commits, you're increasing the size of your OODA loop, so your project moves forward more slowly, because everyone else is forced to overcorrect to each other's actions.

This also explains why the "overlord" characterization is wrong. If you've got one person in the project who spends all his time nit-picking contributions, that person is ineffective. If that person is your project leader, that means your project leadership is ineffective. An effective leader sets direction and applies minimal correction to keep the project moving in the desired direction. An ineffective leader sends so many corrective signals down the other members' OODA loops that they're slowed down, making the project's workforce ineffective.

(56) By Andy Bradford (andybradford) on 2019-09-07 15:12:34 in reply to 28 [link] [source]

I haven't used private branches in quite a few years and the more I use Fossil the more I think it's a feature that should be lightly used, possibly even contrary to Fossil's original concept. Isolation can also be achieved by simply having a clone from which pushes are not done. I suppose the one thing a private branch saves is the extra clone, but is the complexity worth it?

Also, with the use of a private branch, bisect will only ever be able to find the one massive merge commit that brings in the whole kit and caboodle, whereas with public branches, once the merge commit is found, it is possible to bisect the branch. Sure, the guy with the private branch can bisect further, but only he.
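
The bisect point can be made concrete with a small Python sketch. Nothing here is Fossil's actual bisect implementation; the histories, the `first_bad_index` and `introduced` helpers, and the pretend bug are all hypothetical. Each check-in is modeled as the set of changes present at that point.

```python
def first_bad_index(history, is_bad):
    """Binary-search an ordered history for the first bad check-in.

    Assumes history[0] is good and history[-1] is bad.
    """
    good, bad = 0, len(history) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(history[mid]):
            bad = mid
        else:
            good = mid
    return bad


def introduced(history, is_bad):
    """Return the changes that arrived with the first bad check-in."""
    i = first_bad_index(history, is_bad)
    return history[i] - history[i - 1]


def has_bug(checkin):
    """Pretend that change "b" is the one that introduced the bug."""
    return "b" in checkin


# Full branch history versus the same work visible only as one merge.
full = [{"base"}, {"base", "a"}, {"base", "a", "b"}, {"base", "a", "b", "c"}]
squashed = [full[0], full[-1]]

print(introduced(full, has_bug))      # narrows the search to one change
print(introduced(squashed, has_bug))  # can only point at the whole lump
```

With the full history, the search isolates the single change that broke things. With only the merge visible, the best it can do is name the entire set of changes that arrived together.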

Or do I misunderstand how private branches could be used to "cleanup" history?

Thanks,

Andy

(57) By sean (jungleboogie) on 2019-09-07 17:04:00 in reply to 56 [link] [source]

I think in this thread people said private branches could be made public. Would that not include the whole history of it?

(58) By anonymous on 2019-09-07 19:42:31 in reply to 57 [link] [source]

Private branches can be published.

BUT, any merge commits involving a formerly private branch lack references to the commit(s) where the merge content originated.

Fossil needs a way to record private merge parents. Even if a private branch never gets published, the owner of the private branch could still make use of the merge parent reference.