Converting a Git repository and mirroring it to GitHub does not work as advertised on Windows 10
(1) By Cthulhux on 2019-08-23 21:01:34
I'd like to transfer some (well: all) of my Git repositories to Fossil, including those which I maintain on Windows 10. Naturally, to lower the barrier etc. etc., I'd still like to have a GitHub mirror.
I am prepared:
> fossil version
This is fossil version 2.9 [0fd79a3e09] 2019-07-13 13:05:19 UTC
> git version
git version 2.22.0.windows.1
Now Windows's PowerShell (seemingly) does not support the recommended way too well - maybe it's a Unicode misunderstanding?
> cd REPOSITORY
> git fast-export --all | fossil import --git new-repo.fossil
]ad fast-import line: [JSON
Uhh.
Trying on cmd:
Rebuilding repository meta-data...
100.0% complete...
Vacuuming... ok
project-id: 867a8d350eca36a6ca8f585261545e726069a43b
server-id: d10314f8c47ea14c157c4bd52d75527beef8d54c
admin-user: (redacted)
Hmm, OK. We'll stay with cmd then...
> fossil git export ../repo-git --autopush https://(redacted)@github.com/(redacted).git
WARNING: The repository database has been replaced by a clone.
Bisect history and undo have been lost.
git init 'C:/REPOS/repo-git'
fatal: cannot mkdir 'C:/REPOS/repo-git': Invalid argument
cannot initialize the git repository using: "git init 'C:/REPOS/repo-git'"
O...kay?
> cd ..
> git init
(Worked. Trying again!)
> cd ..\REPOSITORY
> fossil git export ../repo-git --autopush https://(redacted)@github.com/(redacted).git
WARNING: The repository database has been replaced by a clone.
Bisect history and undo have been lost.
git fast-import --export-marks=.mirror_state/marks.txt --quiet --done
fatal: Unsupported command: blob
fast-import: dumping crash report to .git/fast_import_crash_6188
1 check-ins added to the C:/REPOS/repo-git
git update-ref "refs/heads/master"
usage: git update-ref [<options>] -d <refname> [<old-val>]
or: git update-ref [<options>] <refname> <new-val> [<old-val>]
or: git update-ref [<options>] --stdin [-z]
-m <reason> reason of the update
-d delete the reference
--no-deref update <refname> not the one it points to
-z stdin has NUL-terminated arguments
--stdin read updates from stdin
--create-reflog create a reflog
git push --mirror https://(redacted)@github.com/(redacted).git
To https://(redacted)@github.com/(redacted).git
! [remote rejected] master (refusing to delete the current branch: refs/heads/master)
I would assume that the git push command error in the end is entirely my fault - the GitHub project already exists and I forgot to purge it before starting the conversion process. So this is not a big deal, I guess...
Closing?
> fossil close
WARNING: The repository database has been replaced by a clone.
Bisect history and undo have been lost.
I could repeatedly reproduce this warning with freshly imported Git repositories on FreeBSD with Fossil 2.8 and on WSL/Debian with Fossil 1.37, so it seems that imported repositories are not that well supported just yet. Or how can I get rid of that warning? It also appears in the web UI.
Either way, following the suggested "walkthrough" leads to way too many errors, at least on Windows:
- fatal: Unsupported command: blob - looks like a Git error? What do I need to change?
- The usage: git update-ref thing: Does Fossil call it with a wrong syntax?
- Why does git init fail?
- What's the problem with the "ad fast-import line" in PowerShell? What does it do differently and why can't Fossil handle it?
Looking forward to any nudge to the right direction! :-)
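One way to narrow down the PowerShell question is to save the export stream to a file and look at its first bytes. The sketch below is only a diagnostic aid, not anything Fossil or Git ships: the function name `sniff_fast_export` and the 4096-byte window are made up for illustration. It rests on the assumption that Fossil's message has the form `bad fast-import line: [...]`, in which case a stray carriage return inside the brackets would make a terminal print the closing `]` over the leading `b`, which would match the `]ad fast-import line: [JSON` output above. That reading is a guess.

```python
from typing import List


def sniff_fast_export(data: bytes) -> List[str]:
    """Report encoding oddities in the first bytes of a saved export stream.

    Heuristic only: blob contents may legitimately contain CR or NUL bytes,
    so a finding is a hint to look closer, not proof of a re-encoded stream.
    """
    head = data[:4096]  # hypothetical window size, chosen for illustration
    findings = []
    if head.startswith((b"\xff\xfe", b"\xfe\xff")):
        findings.append("UTF-16 byte order mark")
    elif head.startswith(b"\xef\xbb\xbf"):
        findings.append("UTF-8 byte order mark")
    if b"\x00" in head:
        findings.append("NUL bytes near the start (possibly UTF-16 text)")
    if b"\r\n" in head:
        findings.append("CRLF line endings near the start")
    return findings


# A stream with plain LF line endings yields no findings.
clean = sniff_fast_export(b"blob\nmark :1\ndata 3\nabc\n")

# The same stream with CRLF line endings is flagged.
crlf = sniff_fast_export(b"blob\r\nmark :1\r\ndata 3\r\nabc\r\n")
```

To check a real dump, read it in binary mode (`open(path, "rb")`) and pass the bytes to the function.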
(2) By Warren Young (wyoung) on 2019-08-24 12:11:49 in reply to 1
maybe it's a Unicode misunderstanding?
If it's a conflict between UTF-8 and UTF-16, one way to straighten it out would be to get everyone onto UTF-8, either by running everything under Cygwin or everything under WSL.
In the Cygwin case, I mean that you should use the Fossil and Git packages that come from Cygwin. Don't mix in either your native Windows binary of Fossil or the Git-for-Windows quasi-native environment. Get them entirely out of the PATH, if you can.
With WSL, I don't think it's actually possible to mix things up this way, but personally, I'd still go to Cygwin first due to its maturity.
WARNING: The repository database has been replaced by a clone.
Known bug. It's certainly harmless in your case, since the Fossil repo is fresh. There is no bisect or undo history to lose! Just ignore it.
fatal: Unsupported command: blob
Now that one's interesting. "blob" isn't a sub-command of either git or fossil. My wild guess is that the command parser is out of sync somehow, due to crossing the boundary between native Windows + cmd.exe in the Fossil case and the POSIX-like MSYS portability environment that Git must run under, which uses some kind of Bourne shell script.
They're both calling system(), which uses the "system" shell, but they don't agree on the same definition of "system"!
Thus my suggestion to use Cygwin or WSL.
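The disagreement over what "the shell" is can be sidestepped when a program starts its child with an argument list instead of a command string. The transcript in post 1 shows `git init 'C:/REPOS/repo-git'` with single quotes: a POSIX shell strips those, whereas cmd.exe does not treat single quotes as quoting, which may or may not be what caused the "Invalid argument" failure. The sketch below only illustrates the general technique in Python and says nothing about how Fossil itself launches Git. `run_with_argv` and `ECHO_ARG` are hypothetical names, and a small Python child stands in for git.

```python
import subprocess
import sys
from typing import List


def run_with_argv(argv: List[str]) -> str:
    """Run a command from an argument list, with no shell in between.

    Because no shell parses a command string, there are no quoting rules
    for the parent and child to disagree about.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Stand-in child process that writes back the first argument it received.
ECHO_ARG = [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])"]

# The path from the transcript, and a variant containing a space.
received_plain = run_with_argv(ECHO_ARG + ["C:/REPOS/repo-git"])
received_space = run_with_argv(ECHO_ARG + ["C:/REPOS/repo git"])
```

With a real Git installation the equivalent call would be `run_with_argv(["git", "init", "C:/REPOS/repo-git"])`.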
(3) By Cthulhux on 2019-08-24 12:20:05 in reply to 2
WSL would be an option, but AFAICS "Debian for Windows" is really outdated - it will probably take ages before Fossil 2.9 (which I need for the Github export) reaches the repositories. Fossil 2.x does not even seem to exist yet (or have I looked in the wrong place?). Does Cygwin have updated packages? That would be an option indeed, although it's probably not the best of all options.
Thank you for the information about the known warning. So I'll leave that there for now. :-)
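Since the GitHub export is said above to need Fossil 2.9, a quick check of the installed version can save a failed attempt on an older package. This is a minimal sketch, assuming the `fossil version` output looks like the line quoted in post 1. The helper names and the `(2, 9)` default are illustrative, with the minimum taken from this post rather than from Fossil's documentation.

```python
import re
from typing import Tuple


def parse_fossil_version(text: str) -> Tuple[int, ...]:
    """Extract the numeric version from `fossil version` output as a tuple."""
    match = re.search(r"fossil version (\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no Fossil version found in: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def supports_git_export(text: str, minimum: Tuple[int, ...] = (2, 9)) -> bool:
    """Return True if the reported version is at least `minimum`.

    Tuples compare element by element, so (2, 10) sorts after (2, 9),
    which a plain string comparison would get wrong.
    """
    return parse_fossil_version(text) >= minimum


# The version line quoted in post 1.
line = "This is fossil version 2.9 [0fd79a3e09] 2019-07-13 13:05:19 UTC"
version = parse_fossil_version(line)
new_enough = supports_git_export(line)
```

In practice the text would come from running `fossil version` and capturing its standard output.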
(6) By Warren Young (wyoung) on 2019-08-24 13:39:10 in reply to 3
"Debian for Windows" is really outdated
The problem isn't with WSL, it's with Debian. They have a purposeful policy of freezing application versions in place a month or two prior to a major release, which only happens every 2-3 years. That version then stays the same until the next major version.
When the prior major Debian release did this to us, it froze on Fossil 1.34, the release just prior to our major 2.0/2.1 upgrade, which created a potential repository incompatibility. Any Fossil repos that did not go out of their way to avoid falling into that incompatibility couldn't be opened on Debian 9 for a span of about 2 years as a result.
Debian 10 was just released a few months ago, with the feature freeze being a few months before that, so it captured Fossil 2.8, not 2.9, and it will stay that way for the next 2-3 years! Nothing Microsoft is likely to do will fix this.
I offer you several alternatives:
- Don't use Debian. If you demand the latest versions all the time, it is not the OS distro for you. There are other Linux flavors available for WSL, one of which might include 2.9. If it's a "rolling release" type of OS, it should then get 2.10 shortly after it's released, which should be soon.
- Build from source. It's not difficult, especially on Linux. Any difficulties you do have are likely to be one-offs, so that you never have to push through them again on that same machine, and doing it on the next machine will be easier.
- Try the pre-compiled Linux binary. I put this last because Linux's library ABI isn't very stable across distros, so that binary is most reliable on the distro and version it was built on. On others, it's a crapshoot.
Does Cygwin have updated packages?
Yes, for packages with an active maintainer, which includes Fossil.
Unlike Debian, Cygwin is more of a "rolling release" type of software distribution.
(7) By Cthulhux on 2019-08-24 14:31:42 in reply to 6
I see. I'll "update" to Cygwin tomorrow when I'm back home, thank you. :) Linux is not really my cup of tea, I use the WSL exclusively for plan9port, honestly.
(4) By Florian Balmer (florian.balmer) on 2019-08-24 12:43:00 in reply to 1
Now Windows's PowerShell (seemingly) does not support the recommended way too well - maybe it's a Unicode misunderstanding?
The PowerShell pipes should be able to handle "binary" data, i.e. as long as fossil.exe and git.exe use the same encoding, things should work.
Yet it seems that PowerShell is buffering the entire output of git fast-export before invoking fossil import, while CMD.EXE is running them in parallel with smaller buffer chunks. So the ]ad fast-import line: [JSON line may be the symptom of an out-of-memory condition of PowerShell with a very large repository?
(5) By Cthulhux on 2019-08-24 13:01:05 in reply to 4
I get the same error trying to export into a file and importing that afterwards. The repository has five files or so, nothing big... A buffering problem could be there though!
(8) By Florian Balmer (florian.balmer) on 2019-08-24 18:40:38 in reply to 5
In PowerShell, the expression cmd1 | cmd2 seems to be just an alias of a more complex construct, treating the data stream between cmd1 and cmd2 as an object that could be filtered, or otherwise modified, thus requiring full user-mode buffering.
Regarding CMD.EXE, my guess is that the write-end of a pipe is set as stdout for the first program, and the read-end of that same pipe is set as stdin for the second program. So the OS/kernel takes care of buffering, probably with (small) buffer sizes in the order of magnitude of (several times) the OS page size. By blocking writes to "full" pipes, and blocking reads from "drained" pipes, this also causes some synchronization/memory balance between the two processes -- while in PowerShell, the second process will only run after the first has terminated.
git fast-export has to write every single file artifact contained in the repository (along with some framing/control code) in full, uncompressed format to stdout, so the amount of data may already become quite large even with "moderately-sized" repositories, and full buffering becomes expensive (see /stat → Uncompressed Artifact Size).
The strange ]ad fast-import line: [JSON line may also originate from git.exe or from fossil.exe due to memory allocation failure confusing internal bookkeeping (yet, unlikely), after PowerShell's buffering has exhausted free memory.
(9) By Florian Balmer (florian.balmer) on 2019-08-24 19:23:53 in reply to 5
I get the same error trying to export into a file and importing that afterwards. The repository has five files or so, nothing big...
Unless PowerShell would do the same full-buffering with files? But you said a small repository. So, dropping the pipe-buffer-is-exhausting-memory theory, for now ;-)
(10) By ckennedy on 2019-09-02 18:54:07 in reply to 9
Actually you are correct Florian. PowerShell does full buffering of pipelines and files by default, unless you have structured the commands with proper attributes to let PowerShell know it can handle importing a data stream. Since Fossil is NOT a PowerShell cmdlet, PowerShell will always fully buffer the file stream. Redirection might work here rather than piping. I can't remember the full specification at this time, and I don't have any Git repositories to try this out on.
Thanks.
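The redirection idea from the last post can be tried without going through a shell at all, by writing the export to a file as raw bytes and then handing that file to the importer as its stdin. The following is a minimal sketch under that assumption. It has not been run against Git or Fossil, `export_then_import` and the `FAKE_*` commands are hypothetical, and Python child processes again stand in for the real tools.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List


def export_then_import(export_cmd: List[str], import_cmd: List[str],
                       dump_path: Path) -> int:
    """Save export_cmd's stdout to dump_path, then feed it to import_cmd.

    The file is opened in binary mode both times, so the stream is stored
    and replayed byte for byte. Returns the size of the dump in bytes.
    """
    with open(dump_path, "wb") as dump:
        subprocess.run(export_cmd, stdout=dump, check=True)
    with open(dump_path, "rb") as dump:
        subprocess.run(import_cmd, stdin=dump, check=True)
    return dump_path.stat().st_size


# Stand-ins: the "exporter" writes a few bytes, the "importer" exits with
# status 0 only if it reads exactly those bytes back.
FAKE_EXPORT = [sys.executable, "-c",
               "import sys; sys.stdout.buffer.write(b'blob\\nmark :1\\n')"]
FAKE_IMPORT = [sys.executable, "-c",
               "import sys; sys.exit(0 if sys.stdin.buffer.read() == "
               "b'blob\\nmark :1\\n' else 1)"]

with tempfile.TemporaryDirectory() as tmp:
    dump_size = export_then_import(FAKE_EXPORT, FAKE_IMPORT,
                                   Path(tmp) / "export.dump")
```

With the real tools, `export_cmd` would be `["git", "fast-export", "--all"]` and `import_cmd` would be `["fossil", "import", "--git", "new-repo.fossil"]`, as in post 1.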