Fossil Forum

Can two or more repos be combined?

(1) By anonymous on 2020-12-05 00:10:09

Repo A and Repo B need to be combined into a monorepo while preserving history of both repos.

The combined repo should end up with repo A under directory A, and repo B under directory B.

Can this be done?

(2) By Warren Young (wyoung) on 2020-12-05 00:32:44 in reply to 1

If you didn't have the restriction that all files be renamed while being combined, then it'd be easy. Renaming invalidates every single commit artifact, which means you lose all internal referential integrity.

If you can live with just the trunk view of each repo, then scripting a linear checkout of each repo's trunk commits wouldn't be too difficult. Study the options to fossil ci, particularly --date-override and --user-override.
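As a rough illustration of that trunk-replay idea, here is a minimal shell sketch, not a tested migration tool. It assumes you have already produced a list of trunk check-ins, oldest first, as HASH|DATE|USER|COMMENT lines. That input format, the FOSSIL override variable, and the function name replay_into_subdir are all made up for this example, and comments are assumed to fit on one line. For each entry it copies the managed files of that version under a subdirectory of the combined checkout, then commits with the original date and user. The --allow-older flag is included because replaying a second repo after the first will likely produce check-ins dated earlier than their parents.

```shell
#!/bin/sh
# Sketch: replay check-ins from one checkout into a subdirectory of another.
# FOSSIL can be overridden, e.g. to point at a specific fossil binary.
FOSSIL=${FOSSIL:-fossil}

# Usage: replay_into_subdir SRC_CHECKOUT DEST_CHECKOUT SUBDIR < commit-list
# Each input line is HASH|DATE|USER|COMMENT (hypothetical format), oldest first.
replay_into_subdir() {
    src=$1 dest=$2 sub=$3
    while IFS='|' read -r hash date user comment; do
        # Move the source checkout to the version being replayed.
        (cd "$src" && $FOSSIL update "$hash") </dev/null || return 1

        # Start from an empty subdirectory so deleted files disappear too.
        rm -rf "${dest:?}/${sub:?}"
        mkdir -p "$dest/$sub"

        # Copy only the files Fossil manages in the source checkout.
        (cd "$src" && $FOSSIL ls) </dev/null | while IFS= read -r f; do
            mkdir -p "$dest/$sub/$(dirname "$f")"
            cp -p "$src/$f" "$dest/$sub/$f"
        done

        # Record the copy as a check-in carrying the original date and user.
        (cd "$dest" && $FOSSIL addremove &&
            $FOSSIL commit --allow-older \
                --date-override "$date" --user-override "$user" \
                -m "$comment") </dev/null || return 1
    done
}
```

Run it once per source repo against the same combined checkout, for example with x as the source and A as the subdirectory, then again with y and B. Interleaving the two histories by timestamp would need the commit lists merged and sorted first, which this sketch does not attempt.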

If you need all branches and such, you could in principle do the same thing, only with recursive descent of the branches, but that gets tricky.

(3) By Dan Shearer (danshearer) on 2020-12-05 18:03:49 in reply to 2

Warren Young (wyoung) wrote on 2020-12-05 00:32:44:

If you didn't have the restriction that all files be renamed while being combined, then it'd be easy.

I gather the rename you refer to is putting both repos one level down, importing into A and B respectively. While this seems a logical thing to do with a tree import operation, that isn't what the DAG thinks. Fair enough, because Fossil doesn't have an "import tree with full history" command.

You do seem to suggest that some version of this can easily exist though. What would you recommend to come as close as possible to importing Fossil repo A into Fossil repo B? Or combining the two into a monorepo as the original poster asked?

Thanks

Dan Shearer

(4) By Warren Young (wyoung) on 2020-12-05 21:57:10 in reply to 3

I gather the rename you refer to is putting both repos one level down

Yes: that changes all F cards in all manifests, which changes all commit hashes, which means all P cards become invalid, all references to specific commits in other commit messages, ticket comments, forum posts, etc. become invalid...

The whole thing falls apart. Total protonic reversal, dogs and cats living together, mass hysteria.

importing Fossil repo A into Fossil repo B?

There are at least two ways. Given x.fossil and y.fossil in a directory with subdirectories x and y containing checkouts of each:

Method 1: Pull by Project Code

$ cd x
$ fossil sql "select value from config where name='project-code'"
'2f17b2e264673df474bc606a73727cab7e029263'
$ cd ../y
$ fossil pull --project-code 2f17b2e264673df474bc606a73727cab7e029263 ../x.fossil 

Method 2: Reconstruct

$ mkdir tmp
$ cd tmp
$ fossil deconstruct -R ../x.fossil .
$ fossil deconstruct -R ../y.fossil .
$ fossil reconstruct ../z.fossil .

Beware, this loses non-artifact info. (User table, skin config, etc.) You can fix that with fossil config export/import commands.

On the plus side, it creates a new merged repo with a new project code, so it doesn't harm repos x or y, now or even potentially in the future, short of a repeat of this sort of hackery.
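For the config export/import fix mentioned above, a minimal sketch might look like the following. The function name copy_config and the FOSSIL override variable are invented for illustration, and the sketch assumes that importing (rather than merging) is acceptable because z.fossil is brand new. Check fossil help configuration for the areas your version supports before relying on it.

```shell
#!/bin/sh
# Sketch: carry non-artifact settings from one repo file to another.
FOSSIL=${FOSSIL:-fossil}

# Usage: copy_config FROM_REPO TO_REPO [AREA]
# AREA defaults to "all"; narrower areas such as "skin" or "user" also exist.
copy_config() {
    from=$1 to=$2 area=${3:-all}
    dump=$(mktemp) || return 1

    # Export the chosen area to a scratch file, then load it into the target.
    $FOSSIL configuration export "$area" "$dump" -R "$from" &&
        $FOSSIL configuration import "$dump" -R "$to"
    status=$?

    rm -f "$dump"
    return $status
}
```

For example, copy_config ../x.fossil ../z.fossil would pull x's settings into the merged repo. Since only one source's settings can win for any given item, you would pick whichever of x or y has the configuration you want to keep.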


Both methods result in two unmergeable DAGs, both with their own "trunk", which results in messages like:

$ fossil stat
...
WARNING: multiple open leaf check-ins on trunk:
  (1) 2020-12-05 21:39:09 [df0e663732] (current)
  (2) 2020-12-05 21:38:48 [c5d8fde68d]
$ fossil merge
Merging fork [c5d8fde68d] at 2020-12-05 21:38:48 by tangent: "foo"
cannot find a common ancestor between the current checkout and 

Giving hashes to fossil merge won't help, because it's right: there is no common ancestor. You'd do better to run fossil amend --branch on one of them to suppress the complaint than to try to fully merge them.
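To make the amend suggestion concrete, here is a small sketch that picks the non-current leaf hashes out of the warning text. It assumes the warning has exactly the layout shown in the fossil stat output above, which may differ between Fossil versions. The helper name other_leaves and the branch name y-trunk are made up for the example.

```shell
#!/bin/sh
# Sketch: read fossil stat output on stdin and print the hash of every
# listed leaf that is not marked "(current)".
other_leaves() {
    sed -n -e '/(current)/d' -e 's/^ *([0-9]*) .*\[\([0-9a-f]*\)].*$/\1/p'
}

# Hypothetical usage; "y-trunk" is a made-up branch name:
#   fossil stat | other_leaves | while read -r hash; do
#       fossil amend "$hash" --branch y-trunk </dev/null
#   done
```

With only two DAGs, this is equivalent to running fossil amend c5d8fde68d --branch y-trunk by hand; the loop only earns its keep if several disconnected trunks are involved.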

(5) By Warren Young (wyoung) on 2020-12-05 22:21:59 in reply to 4

I should add, both options suck.

Because you have two unmergeable DAGs, you don't get a unified view of both trees on checkout. You bounce between them, depending on which DAG you select.

This also means things like bisect don't work between files in the two DAGs.

Reconstructing the repo using a script to create a single DAG per my initial post in this thread is a far better plan.

(6.1) By Warren Young (wyoung) on 2020-12-06 22:40:22 edited from 6.0 in reply to 5

two unmergeable DAGs

It turns out, the DAGs are mergeable if you're willing to get your hands dirty. fossil merge will refuse to do it for you, even with --force, but you can get there by stopping just before the reconstruct step in Method 2, then:

  1. Combine the manifests for the tip commits in each DAG:

    • Write a new C card comment: be sure to escape spaces and such!
    • Create a new plausible D card
    • Sort the F cards lexicographically
    • Write two new P cards to refer to the DAG tips you're merging
    • Drop the R cards, unless you're willing to recompute a new one (§4.1)
    • Pick one of the U cards, or write a new one
  2. Run fossil test-parse-manifest on the result. This will not only check that you've written the manifest properly, it will calculate the Z card for you. Append it.

  3. Rename the manifest: mv mf.txt $(sha3sum -a 256 mf.txt | cut -f1 -d' ')

    Note that it doesn't have to go into a subdirectory named after the first 2 hex digits, as fossil deconstruct writes it. fossil reconstruct will find manifests at the top level just fine.

  4. Proceed with reconstruction. The result should show a new "tip" commit merging the two DAG tips.
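Step 1 of the list above can be sketched in shell roughly as follows. This is an illustration under several assumptions, not the exact procedure used for the test merge: both tip manifests are baseline manifests (no B card), no file name appears in both trees, the comment and user name fit on one line, and the hashes passed in are full-length. It also puts both parents on a single P card with the primary parent first, which is my reading of the file format document. The function names are made up. The R card is dropped and the Z card is left for step 2.

```shell
#!/bin/sh
# Sketch: assemble the body of a merge manifest from two tip manifests.

# Fossil-encode a single-line string: backslashes first, then spaces.
fossil_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/ /\\s/g'
}

# Usage: build_merge_manifest TIP1_FILE TIP2_FILE TIP1_HASH TIP2_HASH COMMENT USER
# Prints cards in the order C, D, F, P, U.
build_merge_manifest() {
    printf 'C %s\n' "$(fossil_escape "$5")"
    # D card: current UTC time with a zeroed millisecond field.
    printf 'D %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%S').000"
    # F cards from both tips, sorted by file name in byte order.
    cat "$1" "$2" | grep '^F ' | LC_ALL=C sort -t ' ' -k 2,2
    # Assumption: one P card, primary parent first, merge parent second.
    printf 'P %s %s\n' "$3" "$4"
    printf 'U %s\n' "$(fossil_escape "$6")"
}
```

Redirect the output to mf.txt, then carry on with step 2 to validate it and obtain the Z card. If the two trees share any file name, the sorted F cards will contain a duplicate and the manifest should fail that check.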

I propose a new flag — fossil merge --no-common-ancestor — which does this on the user's behalf. That is, it says "Yes, Fossil, I know the two commits you're proposing to merge have no common ancestor. Just jam them together for me, thanks."

My manual DAG tip merge process above will not get you a merged history prior to the merge step. To do that, you'd have to rewrite the whole tree's P cards so you could inject a new parent for both original DAGs' initial empty commits. Then presumably you'd want to generate new commits that refer to the F cards in both DAGs based on timestamp merging, at which point we're back to total protonic reversal.

(7) By Dan Shearer (danshearer) on 2020-12-06 23:40:33 in reply to 6.1

Warren Young (wyoung) wrote on 2020-12-06 22:40:22:

Combine the manifests for the tip commits in each DAG:

  • Write a new C card comment: be sure to escape spaces and such!
  • Create a new plausible D card
  • Sort the F cards lexicographically
  • Write two new P cards to refer to the DAG tips you're merging
  • Drop the R cards, unless you're willing to recompute a new one
  • Pick one of the U cards, or write a new one

I need to digest this some more but it looks like a feasible partial solution.

But one thing is clear already: no other SCM has had the advantage of discussing major DAG handling changes with the knowledge we have now in 2020 from many years of git/mercurial and contemporary graph theory. Fossil has a chance to get this right by being in some ways late to the SCM party, yet with a mature codebase so well designed that it can still take fundamental change.

My manual DAG tip merge process above will not get you a merged history prior to the merge step.

Isn't that a feature though?

Dan Shearer

(8) By Warren Young (wyoung) on 2020-12-06 23:57:37 in reply to 7

I need to digest this some more

Read the file format doc (previously linked) and everything I wrote should become clear.

I should point out that the above isn't a guess; it's a report of a successful merge of two test repos.

Isn't that a feature though?

It's Fossil doing what it's programmed to do, but I can see someone coming back after reading the above with a complaint that they don't get a historical merge of all prior commits as well.

For that, you need a commit-by-commit merge, which will rewrite all commit hashes.

(9) By jshoyer on 2020-12-08 15:01:55 in reply to 6.1

I would welcome a --no-common-ancestor option for fossil merge. I used to use fossil open --empty reasonably often, until the schema changes associated with fixing the ‘Ryerson bug’ prevented merging together the resulting disconnected DAGs. Maybe I'll try the manual method presented above.