Fossil Forum

"uv cat" and "uv export" generate different output

(1) By jvdh (veedeehjay) on 2023-03-12 20:18:05 [source]

I do have an unversioned pdf file in my repo:

fossil uv export myfile.pdf export.pdf

correctly extracts the file from the repo but

fossil uv cat myfile.pdf > cat.pdf

does not (in my case the resulting file is about 75k compared to the correct file size of about 55k; it also no longer displays correctly in a pdf viewer: empty pages are displayed instead).
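A quick way to confirm whether the two outputs really differ is a byte-for-byte comparison. The following is only a minimal sketch: same_bytes is a hypothetical helper name (not part of fossil), and export.pdf and cat.pdf are the files produced by the two commands above.

```shell
# same_bytes is a hypothetical helper, not a fossil command.
# It returns success only if both files are byte-for-byte identical.
same_bytes() {
  cmp -s "$1" "$2"
}

# Usage with the files from the two commands above:
#   same_bytes export.pdf cat.pdf && echo identical || echo different
```

Running wc -c export.pdf cat.pdf in addition shows the two sizes side by side.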

question: is this a bug or to be expected ("uv cat" only supposed to work for text files, e.g.) ??

(2) By anonymous on 2023-03-12 20:50:05 in reply to 1 [link] [source]

What operating system are you using?

(4) By jvdh (veedeehjay) on 2023-03-12 21:07:08 in reply to 2 [link] [source]

MacOS 12

fossil version 2.22 [4e688dc0f9] 2023-03-11 23:49:52 UTC

(3) By Stephan Beal (stephan) on 2023-03-12 21:06:58 in reply to 1 [link] [source]

question: is this a bug or to be expected ("uv cat" only supposed to work for text files, e.g.) ??

My guess is that you're on Windows. The code in question indeed assumes UTF8 encoding of output for anything which goes to the console on Windows. Yes, it's arguably a bug, but it's also as designed. It doesn't do so for non-Windows platforms. Whether it should continue to make that UTF8 assumption on Windows, i'm not qualified to say. Perhaps Florian or Daniel, or one of our other heavy Windows users, can suggest whether this needs to be changed.

(@Florian, Daniel, etc: the offending code is near the top of blob.c:blob_write_to_file())

(5.1) By jvdh (veedeehjay) on 2023-03-12 21:08:41 edited from 5.0 in reply to 3 [link] [source]

no, this is MacOS and the locale is

LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

(6) By jvdh (veedeehjay) on 2023-03-12 21:18:50 in reply to 5.1 [link] [source]

another observation in this context:

uv export

currently takes no prisoners: it overwrites any existing file without warning (that's one reason why I currently prefer uv cat and redirection).

I would much prefer if it would not do that by default (a --force flag should cause that in my view or, alternatively, there should be a setting making export behave like mv -i): overwrite confirmation would seem in order.
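Until something like that exists, the behaviour can be approximated outside of fossil. The following is only a sketch of the idea: uv_export_safe and its -f option are hypothetical names invented for illustration, and the only fossil call used is the fossil uv export FILE OUTPUT form shown at the top of this thread.

```shell
# uv_export_safe is a hypothetical wrapper, not a fossil command.
# It refuses to overwrite an existing destination unless -f is given.
uv_export_safe() {
  force=0
  if [ "$1" = "-f" ]; then
    force=1
    shift
  fi
  if [ "$#" -ne 2 ]; then
    echo "usage: uv_export_safe [-f] UVFILE OUTPUT" >&2
    return 2
  fi
  # Stop before calling fossil if the destination already exists.
  if [ "$force" -eq 0 ] && [ -e "$2" ]; then
    echo "uv_export_safe: '$2' exists; pass -f to overwrite" >&2
    return 1
  fi
  fossil uv export "$1" "$2"
}
```

With this in place, uv_export_safe myfile.pdf export.pdf fails if export.pdf already exists, while uv_export_safe -f myfile.pdf export.pdf behaves like the plain command.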

(9) By jvdh (veedeehjay) on 2023-03-13 09:38:13 in reply to 6 [link] [source]

regarding the above issue (which is separate from the original, spurious, issue of this thread):

would such a change to uv export behaviour be desirable? this would somewhat reduce the "destructive potential" of this special sub-command.

compared to uv sync and uv cat, it seems by far the more "dangerous" command since it will overwrite an arbitrary file without warning. uv cat needs redirection, so the shell will complain or ask for confirmation (provided the noclobber option is set...), and uv sync at least only overwrites a file with the same name, which is very probably what the user wants since it is the unversioned file(s) he is actually targeting.

but uv export is something else...

(10) By Stephan Beal (stephan) on 2023-03-13 09:43:23 in reply to 9 [link] [source]

but uv export is something else...

Has it ever really happened that something got overwritten which shouldn't have, or is that a hypothetical problem? Adding a flag to force overwrite, where none was required before, risks breaking existing scripted usage so is less appealing unless the problem is a real one.

(11) By jvdh (veedeehjay) on 2023-03-13 10:06:29 in reply to 10 [link] [source]

well, it ultimately boils down to whether you like to have

alias rm='rm -i'
alias mv='mv -i'
set -C

lines in your .shrc or you find that such measures are only suitable for the weak of heart :).
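For readers who have not used set -C (noclobber), here is a small sketch of what it does to plain redirection, which is the protection uv cat > file gets for free. The function name noclobber_demo and the file name out.txt are made up for the example; the body runs in a subshell so the option does not leak into the calling shell.

```shell
# noclobber_demo is a hypothetical name; the body runs in a subshell
# (note the parentheses), so set -C stays local to the demo.
noclobber_demo() (
  dir=$(mktemp -d) || exit 1
  printf 'original\n' > "$dir/out.txt"

  set -C   # same as: set -o noclobber
  # With noclobber, '>' fails if the target already exists.
  if ! ( printf 'new\n' > "$dir/out.txt" ) 2>/dev/null; then
    echo "refused to overwrite"
  fi
  # '>|' explicitly overrides noclobber.
  printf 'forced\n' >| "$dir/out.txt"

  cat "$dir/out.txt"
  rm -rf "$dir"
)
```

Calling noclobber_demo prints "refused to overwrite" followed by "forced".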

how often inadvertent overwrites happen is of course a question of how careful you are and how often/seldom you mistype a file name... but "philosophically speaking": I would argue that in the context of a revision control system, even (or rather especially) for unversioned files, the system should err on the conservative/data-loss-risk minimisation side.

I understand the "might break" existing scripts issue but it would break those scripts in a harmless, easily fixable way.

if that is out of the question, I would think just another setting controlling the behaviour of uv export might be in order (and its default could then be the present behaviour but the user at least could change it -- just like with the shell and those aliases and noclobber settings).

and since I would presume that a massive majority of shell users does have those aliases in place I also presume those same users would like uv export to also ask them whether they really want to do that :).

(7) By Florian Balmer (florian.balmer) on 2023-03-12 22:55:44 in reply to 3 [link] [source]

In a quick test, I get correct results for binary files with any of:

  • fossil cat > file
  • fossil artifact > file
  • fossil uv cat > file
  • fossil uv export

For console output, Fossil assumes UTF-8 and converts to UTF-16 to be fed to WriteConsoleW(). But in the four cases above it seems to correctly detect that output is not a console, and the file data is written to output without conversion.
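The console-versus-redirect distinction can also be observed from a script. The sketch below is only an analogy using the POSIX test -t check; it is not Fossil's own detection code, and describe_stdout is a hypothetical name.

```shell
# describe_stdout is a hypothetical helper: it reports whether
# file descriptor 1 (standard output) is attached to a terminal.
describe_stdout() {
  if [ -t 1 ]; then
    echo "console"
  else
    echo "not a console"
  fi
}
```

Run directly in a terminal it reports "console"; with describe_stdout | cat or describe_stdout > file it reports "not a console".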

Side note: Detecting whether standard input is a console [1] is somewhat quirky on Windows, because some shells connect to Fossil by pipes instead of real console handles, and obviously there's still one problem case [2]. But with the new pseudo console handles (since Windows 10), this problem may go away, hopefully, as they look like real console handles to the children.

  1. https://www.mail-archive.com/fossil-users@lists.fossil-scm.org/msg25237.html
  2. http://fossil-scm.org/forum/forumpost/922f13dd81

(8) By jvdh (veedeehjay) on 2023-03-13 09:05:34 in reply to 7 [link] [source]

you are right and I was wrong and should not have opened this thread (my apologies to everybody!).

as I have now realized, the behaviour I described (uv cat producing faulty output) occurs in a Tcl/Expect wrapper I am using as my 100% drop-in replacement of fossil's CLI :(. fossil proper behaves correctly (I have no idea right now why Expect mangles the uv cat output, but it is probably the binary vs text issue; I will have to look into that. everything else, notably fossil cat of source files, is handled just fine).

apologies again for the false alarm.