Fossil Forum

Git importing questions

(1) By Julian Heinken (Schneckers) on 2023-09-10 15:50:33 [link] [source]

I would like to migrate some of my repositories from Git to Fossil. However, it seems that neither git-submodules nor Git-LFS is supported, right? (I could work around LFS, but I use submodules quite a lot.)

Also: The suggested git fast-export --all | fossil import --git new-repo.fossil doesn't work with PowerShell on Windows. It does, however, work fine with the Windows Git-Bash, for example. That wasn't really obvious to me, and I think it would be helpful to be clear about this in the docs.
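For readers who cannot switch shells, one possible workaround is to connect the two programs with an operating-system pipe from a small script, so that no shell sits between them. The following is a minimal Python sketch of that idea, not something taken from the Fossil docs; the function name connect_directly is made up for illustration, and it assumes both git and fossil are on the PATH.

```python
import subprocess


def connect_directly(producer_argv, consumer_argv):
    """Run `producer | consumer` over an OS-level pipe, with no shell in between.

    Returns the pair (producer exit code, consumer exit code).
    """
    producer = subprocess.Popen(producer_argv, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_argv, stdin=producer.stdout)
    # Drop our own handle on the pipe so the producer notices
    # if the consumer exits early.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc


# Hypothetical usage for the conversion discussed in this thread:
#   connect_directly(["git", "fast-export", "--all"],
#                    ["fossil", "import", "--git", "new-repo.fossil"])
```

Because the bytes never pass through the shell's pipeline, they should reach the consumer unmodified; a nonzero value in either position of the returned pair indicates that the corresponding program failed.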

(2) By Marcelo Huerta (richieadler) on 2023-09-10 18:14:30 in reply to 1 [link] [source]

I think difficulties and the need to use Git for Windows are more or less generally implied in the relevant part of the "Fossil vs. Git" document.

(4) By Warren Young (wyoung) on 2023-09-11 07:21:25 in reply to 2 [link] [source]

I suppose you mean this part? I don't see how that explains the PowerShell problem.

The only problem I'm aware of in PowerShell is the lack of a "redirect in from" operator, <. Piping between a stdio type source and sink should work fine.

Not being a Windows user except under duress, I can offer only a nearly uninformed guess: that this is due to someone playing games with temporary files under the obsolete perception that "Windows doesn't have pipes." Fossil itself does this in certain cases.

If there's a good reason why this should not work and cannot be expected to work in the future, I'll be happy to explain the problem and the workarounds in the docs.

(3.2) By Warren Young (wyoung) on 2023-09-11 07:15:31 edited from 3.1 in reply to 1 [source]

git-submodules…aren't supported, right?

Right.

The reason I didn't mention this lack in our Git to Fossil translation guide is that there is no philosophical reason for it, thus no "better way" for me to suggest.[1] Submodules are simply missing until someone decides to add the feature.

Will you be the contributor who provides this?

Git-LFS…I could workaround [the lack]…

This one's less clear-cut. There is a philosophical problem involved, but I have no "better way" to suggest, having never run into the problem solved by Git-LFS or its various competitors myself. I am therefore curious what your workaround would be. That might spark a new section in the translation guide.

I can confirm that out of the box, Fossil is indeed a poor way to store lots of huge files. Part of it is due to the SQLite blob size limit, and part of it is inherent to DVCS operation, where every user gets a copy of every historical version of every tracked file. Add atop that the binary file delta problem, and you do end up with a philosophical problem. DVCSes fall apart when you try to treat them as a general-purpose distributed filesystem.

The trick is, what to suggest instead? A purpose-engineered alternative like Syncthing? I'm so far out of this problem area that I don't even know the shape and size of the solution space.

Help?


  1. Contrast rebase or the staging area/index, where Fossil has such a superior alternative that, in our opinion, it completely obviates the lack. The problem in these cases isn't the lack of a feature, it's the perceived need to have the bogus affordance in the first place, motivating the guide's authors (me, primarily) to attempt the reader's reeducation. I don't see that submodules fall into this category.

(5) By Florian Balmer (florian.balmer) on 2023-09-11 07:42:11 in reply to 1 [link] [source]

(6.1) By Warren Young (wyoung) on 2023-09-11 08:19:21 edited from 6.0 in reply to 5 [link] [source]

Ouch!

This being an inherent design issue with PowerShell that Fossil cannot fix, I've expanded the relevant doc, adding the new "Converting Repositories on Windows" section.

Does this work, or are improvements needed?

(7) By Florian Balmer (florian.balmer) on 2023-09-11 18:59:36 in reply to 6.1 [link] [source]

For me, it works!

I was about to mention that since PowerShell is available for other platforms, this could affect other systems as well -- but it looks like at least PowerShell on Ubuntu doesn't have this problem.

(8.1) By Warren Young (wyoung) on 2023-09-11 19:31:45 edited from 8.0 in reply to 7 [link] [source]

While that does mean the "on Windows" qualifier on my section name covers us here, doesn't that undermine my theory of the core problem? It cannot be that PowerShell inherently pulls everything into RAM, decides how to process it, then sends it out, if the Ubuntu port doesn't exhibit the same symptom.

While slagging on Windows has a certain amusement value, I prefer to be correct when I indulge in this pleasure. 😛

(10) By Florian Balmer (florian.balmer) on 2023-09-11 20:59:08 in reply to 8.1 [link] [source]

Yes, this seems somewhat strange, indeed.

But the only case where output is delayed until all input is read is PowerShell on Windows piping to an external program.

This does not seem to be the case on Ubuntu, where the external program starts processing immediately (I verified this by piping through sh -c cat -n to make sure more and similar are not aliases to internal PowerShell functionality).

(11) By Warren Young (wyoung) on 2023-09-11 21:40:53 in reply to 10 [link] [source]

I wish I knew the why of this, but regardless, I've decided to dial back the strength of the new prose, to speak only of facts actually in evidence.

(12) By Daniel Dumitriu (danield) on 2023-09-11 21:58:16 in reply to 10 [link] [source]

This should also work on Windows sometime soon. Of course, you will need to install the latest PS 7 then...

(13) By Warren Young (wyoung) on 2023-09-11 22:27:31 in reply to 12 [link] [source]

So…it's not related to data volume at all, but to data encoding? It's possible to make the conversion choke even with an all-but-empty repository?

(14) By Florian Balmer (florian.balmer) on 2023-09-12 19:05:38 in reply to 13 [link] [source]

So…it's not related to data volume at all, but to data encoding?

Yes, to my great surprise! Here are two tests with https://github.com/drhsqlite/fossil-mirror.git and PowerShell 5.1.22000.282 on Windows 11.

A: Export all from Git → Fossil: full pipe buffering causes graceful OOM.

PS C:\...> git fast-export --all | fossil import --git test1.fossil
fatal: Out of memory, malloc failed (tried to allocate 4203555 bytes)
Exception of type 'System.OutOfMemoryException' was thrown.

B: Export only part from Git → Fossil: data encoding problem.

PS C:\...> git fast-export master~1..master | fossil import --git test2.fossil
]ad fast-import line: [blob

So my earlier assumption that PowerShell's full pipe buffering was hitting the limit proved wrong. The strange error message "]ad fast-import line: [blob", quite similar to the earlier example "]ad fast-import line: [JSON", tricked me.

(15) By Warren Young (wyoung) on 2023-09-12 21:08:54 in reply to 14 [link] [source]

I dunno; it looks like there are two failure cases here.

Thanks for testing and reporting. I would have never bothered to fire up a Windows VM and try it myself, but I am glad to know the answer.

(16) By Florian Balmer (florian.balmer) on 2023-09-13 06:09:13 in reply to 15 [link] [source]

I dunno; it looks like there are two failure cases here.

Yes, there are, but so far I think only case B was reported on this forum.

Anyway, as soon as PowerShell connects pipelined processes directly, both problems should be solved.

(17) By Florian Balmer (florian.balmer) on 2023-09-13 07:21:29 in reply to 16 [link] [source]

Now I see where the strange error message comes from: PowerShell injects a CR before each LF. The line then gets trimmed so that it ends in the CR alone, and printing that CR instructs the terminal to move the cursor back to the start of the current line.
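The effect can be reproduced without PowerShell at all. Here is a small Python sketch that emulates how a terminal draws a line containing a carriage return. The message format "bad fast-import line: [...]" is an assumption inferred from the garbled output quoted earlier in this thread, and both function names are made up for illustration.

```python
def render_on_terminal(text):
    """Emulate how a terminal draws one line that may contain CR characters."""
    cells = []  # characters currently visible on the line
    col = 0     # cursor column
    for ch in text:
        if ch == "\r":
            col = 0          # carriage return: back to the start of the line
        elif col < len(cells):
            cells[col] = ch  # overwrite what is already there
            col += 1
        else:
            cells.append(ch)
            col += 1
    return "".join(cells)


def error_message(line):
    # Assumed message format, inferred from the output shown in this thread.
    return "bad fast-import line: [" + line + "]"


# The stream line "blob\r\n" with only the LF trimmed leaves "blob\r".
print(render_on_terminal(error_message("blob\r")))
# prints: ]ad fast-import line: [blob
```

The closing bracket is written after the cursor has jumped back to column zero, so it lands on top of the leading "b" of "bad".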

(18) By Konstantin Khomutov (kostix) on 2023-09-13 07:22:17 in reply to 5 [link] [source]

It's interesting to note that PS issue #559 about this problem was created in 2016, and the fix was implemented at the beginning of 2017, so it looks like PowerShell versions ≥ 6.0 should work more sensibly.

(9) By anonymous on 2023-09-11 19:13:17 in reply to 1 [link] [source]

... I use submodules quite a lot

Not to discourage your choice of migrating from Git, but I wonder whether you have considered simply transitioning to Fossil instead of doing a full-history migration.

Basically, select a list of recent releases and port them by checking each one out from Git and re-committing it into a Fossil repo using some custom scripts. It may be possible to maintain some sort of cross-reference to the Git commit ID via Fossil tags.

This approach should also take care of submodules naturally, since each version's submodule content would be integrated into the main repo release.

In case more granularity is needed, the Git repo(s) are still there for continuity.
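As a rough illustration of what such a custom script might look like, here is a Python sketch that only builds the list of commands for each release without running anything. It rests on several assumptions that are not from this thread: the Fossil checkout is opened in the same directory as the Git working tree, Fossil's default skipping of dot-files keeps the .git metadata out of the check-in, and the git-<commit id> tag naming is invented for the example.

```python
def release_port_plan(releases):
    """Build the command list for porting selected Git releases into Fossil.

    `releases` is a list of (git_ref, git_commit_id) pairs, oldest first.
    Nothing is executed here; the caller decides how to run each command.
    """
    plan = []
    for ref, commit_id in releases:
        # Put the release's files, including submodule content, on disk.
        plan.append(["git", "checkout", "--force", ref])
        plan.append(["git", "submodule", "update", "--init", "--recursive"])
        # Let Fossil pick up added and deleted files, then check in
        # with a tag that points back at the Git commit (hypothetical scheme).
        plan.append(["fossil", "addremove"])
        plan.append([
            "fossil", "commit",
            "-m", "Import " + ref + " from Git",
            "--tag", "git-" + commit_id,
        ])
    return plan
```

Each inner list can be passed to subprocess.run(cmd, check=True) in order. Directories left behind by submodules that were removed between releases would still need separate cleanup, which this sketch does not attempt.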

Just an idea.