(Remote) code execution in fossil git export

(1.1) By Midar3 on 2020-06-08 01:45:43 edited from 1.0 [link] [source]

fossil git export does not properly shell-escape inputs:

zTagCmd = mprintf("git tag -f \"%s\" %s", zTagname, zObj);

Furthermore, gitmirror_sanitize_name() is incomplete and doesn't not strip dangerous characters.

This means that the following results in (remote) code execution in fossil git export (remote in the case where someone provides an automatic mirroring service):

fossil tag add '";touch${IFS}code_exec;"' tip

After fossil git export ., there's now a file called code_exec. An attacker could have put an entire reverse shell here.

As last time, I would have prefered to report this privately, but I still contact info for that.

A CVE should be obtained for this bug, so that distributions will be made aware of the need to apply patches for security or update if there will be a new release for this.

(2) By Stephan Beal (stephan) on 2020-06-08 01:48:44 in reply to 1.1 [link] [source]

As last time, I would have prefered to report this privately, but I still contact info for that.

Security-relevant topics can be emailed directly to project lead Richard Hipp via drh@sqlite.org.

(3) By Richard Hipp (drh) on 2020-06-08 12:49:01 in reply to 1.1 [link] [source]

Y'all, please verify that check-in c9a592dde7fe493f is a complete and sufficient fix for this problem. Once it is verified, I will issue a 2.11.1 version with the patch.

(4.1) Originally by Midar3 with edits by Richard Hipp (drh) on 2020-06-08 19:40:38 from 4.0 in reply to 3 [link] [source]

It's a little bit hard to review this way. I suggest to switch this to a switch / case - modern compilers will create a LUT anyway (or a partial LUT, which even saves on binary size as well) if it's faster on the target system. And in any case it should be const, so that it can end up in section .rodata instead of section .data (and also allow more optimizations, since the compiler knows it can't change).

It seems (if I made no mistake) that the following characters are allowed:

!#%&()+,-./0123456789<=>ABCDEFGHIJKLMNOPQRSTUVWXYZ]_abcdefghijklmnopqrstuvwxyz{|}

Is it really sensible to allow more than [0-9A-Za-z\-_]+ (regex)? Depending on the shell used, I would not be surprised if there's ways to execute code.

Also, I think it's probably worth reviewing all other uses of fossil_system(). I recommend to deprecate fossil_system() entirely, as system() is bad by design: The escaping is up to the user, and it's unknown to the developer what exaclty needs escaping, as that depends on the user's system. I suggest to migrate to posix_spawn() on UNIX and CreateProcessW() on Windows - these take an array of arguments, and hence don't have any escaping problems, and avoid the extra round trip via the shell.

(5) By Midar3 on 2020-06-08 19:45:56 in reply to 4.0 [link] [source]

I have not tried this, but having a quick look at http_transport.c, I'm a little worried: The repository name seems to be copied into the command to be executed. It might therefore be possible to create a repository that, when cloned, executes code. The code in particular that worries me:

zCmd = mprintf("\"%s\" http --in \"%s\" --out \"%s\" --ipaddr 127.0.0.1"
               " \"%s\" --localauth",
   g.nameOfExe, transport.zOutFile, transport.zInFile, pUrlData->name
);

I traced where zInFile and zOutFile come from, and apparently, they contain the repo name:

  transport.zOutFile = mprintf("%s-%llu-out.http",
                                   g.zRepositoryName, iRandId);
  transport.zInFile = mprintf("%s-%llu-in.http",
                                   g.zRepositoryName, iRandId);

I'm not sure how much control attacker has about zRepositoryName - if an attacker can control it, I suspect they can execute code. Similarly in winhttp.c.

Maybe it's safe. But it's hard to prove. It just shows how this code with system() is hard to review for being safe, so I think moving away from fossil_system() could close a lot of potential holes at once.

(6) By anonymous on 2020-06-08 19:46:29 in reply to 4.0 [link] [source]

I suggest to migrate to posix_spawn() on UNIX and CreateProcessW() on Windows - these take an array of arguments, and hence don't have any escaping problems, and avoid the extra round trip via the shell.

Unfortunately, the command line arguments on Windows are always a string, probably for backwards compatibility reasons from CP/M days, so the caller of CreateProcess has to quote them for the runtime of the child process to decode the arguments back into char ** argv. One of the ways to perform that transformation is described in Daniel Colascione's blog.

(7) By Midar3 on 2020-06-08 20:04:18 in reply to 6 [source]

Ah, yeah, you're right. It's been a few years since I wrote code for this :). But it's still possible to create something like posix_spawn() with it, so that only this one instance needs to be reviewed thoroughly instead of every place where it's used.

(8) By Richard Hipp (drh) on 2020-06-09 00:49:47 in reply to 1.1 [link] [source]

The original problem has been fixed, and new 2.10.1 and 2.11.1 releases made. All binaries on the download page have this problem fixed.

Multiple passes through the code show no additional problems. But for additional security, the following changes have been made:

All calls to fossil_system() also invoke fossil_assert_safe_command_string() on the command string. If the command-string seems unsafe, the process aborts with a fatal error and never invokes system(). This restricts the generality of fossil_system(), but that's ok. Please audit this procedure for correctness and completeness.
The new "%$" substitution letter has been added to mprintf(). This works like "%s" except that it does appropriate shell escaping for filenames. The blob_append_escaped_arg() subroutine is used to do the escaping. Many of the places that build command-strings now use "%$" rather than their own home-grown escaping. Please audit this procedure for correctness and completeness.

(9) By Richard Hipp (drh) on 2020-06-09 17:54:32 in reply to 8 [link] [source]

For testing the fossil_assert_safe_command_string() safety-net, there is a new test-command on trunk:

fossil test-fossil-system

Run the command above and at each prompt, type strings to be run by the shell (/bin/sh on unix or CMD.EXE on windows). If Fossil thinks the command is unsafe, it will show you the unsafe part and refuse to run the command.

Please experiment with this. Use all your cleverness to try to devise commands that will do nefarious things. Report your findings.

What "safe" means in this context

A "safe" command is one that only runs the one command specified at the beginning of the command line. In other words, it is not possible to trick the shell into running a second command.

This is a safety net

To be clear: This mechanism is a safety net. It is a fallback to catch problems that are missed earlier in the system. We should not rely on the fossil_assert_safe_command_string() to prevent mischief. The mischief prevention should occur at a higher level. This mechanism only kicks in if the higher-level mischief prevention malfunctions. This is intended as defense-in-depth.

(12.2) By Phil Maker (pjm) on 2020-06-10 01:21:20 edited from 12.1 in reply to 9 [link] [source]

G'day,

The fossil_assert_safe_command() looks fine, but there are two very minor questions, redirection is ok? and should the length of z be checked. In particular:

1. Redirection is allowed according to the requirements but I can't help thinking it is a possible attack. This is taken care of further up the stack but if I can inject echo ...>t I can do bad things. Attacks might redirects into /proc, etc.

2. Should we check the length of z so that an overflow attack isn't possible. This boils down to the old does size_t == int which I think is an assumption in the code. You also have the signed vs unsigned issue to worry about.

For example since strlen(z) == MAX_INT then i+1 would wrap around and then we could tick it up 0 and then this say the command is safe. So for a 64 bit size_t unsigned and a 32 bit signed int and if we can generate a huge which should be limited by the maximum path length, and limits in exec*(2).

void fossil_assert_safe_command_string(const char *z){
  int inQuote = 0;
  int i, c;
  int unsafe = 0;

Happy to a bit of work on this or demonstrate a real attack which I haven't done yet. My guess is these are non-issues, in the modern world, but perhaps I'm wrong. I've looked for assumptions for sqlite and I'm guessing its all covered there.

Reference:

https://www.lysator.liu.se/c/ten-commandments.html

(10) By Midar3 on 2020-06-09 18:29:37 in reply to 8 [link] [source]

Any reason to not switch to a fossil_spawn() method instead? It also avoids the extra round-trip to the shell.

E.g. fish shell doesn't need a $ for subcommands, () is enough. That would still not be filtered.

Trying to escape for the shell is just more complex and has more unknowns - why prefer that over a solution that avoids the shell entirely?

(11) By Richard Hipp (drh) on 2020-06-09 18:52:14 in reply to 10 [link] [source]

Because fossil_spawn() is vaporware. The existing design compiles and runs and works and is portable across all known platforms and is (now) secure as far as anybody can tell. And I have other more pressing problems to solve at the moment.

You are welcomed to contribute code to implement fossil_spawn() if it is important to you.

Note that for many current uses of fossil_system(), a naive fossil_spawn() alternative will not work. For example, the editor used for editing check-in comments is obtained from the VISUAL environment variable. But the value of VISUAL need not be a simple command. It can be a command plus arguments, as long as it it allows the file appended to be edited. So to make VISUAL work as an editor, you either have to:

Learn to escape filenames properly (which we already do)
Parse the VISUAL variable up into a command-name plus arguments so that they can be fed into fossil_spawn().

Note also that some uses of fossil_system() deliberately run commands in the background. For example, when you run "fossil ui" and it starts up your web-browser for you automatically, that fossil_system() call includes the '&' character at the end of the command-line so that it runs in the background. Other uses of fossil_system() run in the foreground, such as when running the VISUAL editor. If you provide a fossil_spawn() implementation, be sure that it supports both modes of operation.

(13) By Midar3 on 2020-06-13 13:00:05 in reply to 11 [link] [source]

Yes, the places where you actually want evaluation by the shell of course make no sense to replace. I was talking about those where you don't want evaluation by the shell: Having fossil_spawn() seems easier to get right than having everything escaped properly for every single shell out there.