Fossil Forum

a changes filter for executable?

Changes filter for flipped executable bits -- Curing EXECUTABLE woes

(1.1) By Larry Brasfield (larrybr) on 2022-01-16 02:25:24 edited from 1.0

(No changes except to title, to aid searches by those suffering the same woe,
and to reference post #10 where a mitigation may be found.)

I recently had cause to wrestle with an old problem where, due to using Linux fossil for sources kept in an NTFS filesystem (via WSL), the "executable" bit(s) get messed up. I have a pretty good solution for this lined up, but it's missing one important, if not vital, piece. I have looked for this piece in the present Fossil code and fossil help, and found something close but not quite close enough.

What's missing is a way to determine which files in a checkout have "x" bits that differ from what they should be for the stored/committed version. The changes subcommand is close to this, but suffers one deficiency. For files whose content has not been altered, it will blat an 'EXECUTABLE' or 'UNEXEC' tag depending upon how the "x" bit(s) have been flipped. But for files with altered content, the tag is 'EDITED' and nothing more. Sadly, it is the edited files which most commonly need some sort of intervention to avoid gratuitous and unwelcome "x" bit changes at the pending commit.

Hence, I would like to make a slight feature change to Fossil. This would be to add a flag or filter whereby just the "x" bit changes can be output by the changes subcommand. Or, if other filters are specified along with the new "--executable" flag, the "x" bit changes would remain visible rather than being clobbered by other tidbits such as 'EDITED', 'MUNGED_BY_MERGE', etc.

(2.1) By Warren Young (wyoung) on 2022-01-13 18:34:55 edited from 2.0 in reply to 1.0

The best solution I can come up with is an "exec-glob", which tells Fossil which files are expected to have +x flags, so it can warn you when they change unexpectedly.

While we're waiting, though:

Linux fossil for sources kept in an NTFS filesystem (via WSL)

That sounds like WSL 1. WSL 2 uses regular Linux filesystem images, so it won't exhibit this problem.

EDIT: …unless, that is, you put your checkout directories under /mnt/c, in which case you're back to NTFS and you deserve what you get. 😛 Separate checkouts for the Linux and Windows sides will help here.


WSL 1 is a Windows subsystem, using its internal APIs to mimic the Linux/POSIX API. WSL 2 is a lightweight virtual machine that runs alongside Windows and tries to behave seamlessly, but it's real Linux underneath, so it fixes dozens of known problems with WSL 1, not just this one.

(3) By Martin Gagnon (mgagnon) on 2022-01-13 18:42:23 in reply to 2.0

Linux fossil for sources kept in an NTFS filesystem (via WSL)

That sounds like WSL 1. WSL 2 uses regular Linux filesystem images, so it won't exhibit this problem.

It can happen on WSL 2 as well, if you work from a directory under /mnt/c/.

Having been there, I decided to use a small MSYS2 environment with the native Windows fossil executable instead.

(4) By Stephan Beal (stephan) on 2022-01-13 19:24:16 in reply to 1.0

For files whose content has not been altered, it will blat an 'EXECUTABLE' or 'UNEXEC' tag depending upon how the "x" bit(s) have been flipped. But for files with altered content, the tag is 'EDITED' and nothing more.

The reason it sets "edited" in those cases is that once it sees that the file has been modified, it stops looking for further changes. A change in size, for example, automatically warrants "edited." Permissions changes, being relatively rare, are among the last things checked. The vfile table can only record one type of "is-changed" flag, so the is-modified check trumps the permissions check. This happens, IIRC, in vfile_check_signature() in vfile.c: grep that function for "chnged" (note the missing "a") and you'll see what's happening there.

It can't be made to report more than one type of change without changing the semantics of vfile.chnged and modifying the is-it-changed checks to check for all possible types of changes instead of stopping at the first change it sees.
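
To make the masking effect concrete, here is a toy model in bash. It is not Fossil's code: classify_change and its three 0/1 arguments are made up for illustration, and it only mimics the "first change found wins" behavior described above, assuming EXECUTABLE means the x bit was gained and UNEXEC means it was lost.

```bash
# Toy model only, NOT Fossil's implementation. classify_change is a
# hypothetical name. Arguments are 0/1 flags:
#   $1 content changed?  $2 committed x bit  $3 x bit currently on disk
classify_change() {
  local content_changed=$1 repo_x=$2 have_x=$3
  if (( content_changed )); then
    echo EDITED        # first hit wins; the x-bit flip is never examined
  elif (( repo_x == 0 && have_x == 1 )); then
    echo EXECUTABLE
  elif (( repo_x == 1 && have_x == 0 )); then
    echo UNEXEC
  else
    echo UNCHANGED
  fi
}
```

Calling classify_change 1 1 0 prints EDITED even though the x bit was also dropped, which is exactly the case the original post complains about.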

(5) By John Rouillard (rouilj) on 2022-01-14 04:29:30 in reply to 1.0

Would a pre-commit script that validates executable-ness against an external list of executables help?

At the very least it can stop checkin of files that were (but are not currently) executable.

Is there a way to list the executable status of the base revision? That way fossil could keep the info rather than an external file.

Since the pre-commit script runs locally, you can pass in environment variables to control it:

DONT_CHECK_EXE_STATUS=1 fossil ci -m "my latest changes"

could disable the check.
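
A minimal sketch of such a check might look like the following. The function name check_exe_status and the list-file format (one checkout-relative path per line) are assumptions made for illustration, not Fossil features; how the check gets wired into the commit is left aside.

```bash
# Sketch of a pre-commit style check. check_exe_status and the list-file
# format (one checkout-relative path per line) are hypothetical.
#   $1 = checkout root, $2 = file listing paths expected to be executable
check_exe_status() {
  local root=$1 list=$2 path bad=0
  # Escape hatch: DONT_CHECK_EXE_STATUS=1 skips the check entirely.
  if [ "${DONT_CHECK_EXE_STATUS:-0}" = "1" ]; then
    return 0
  fi
  while IFS= read -r path || [ -n "$path" ]; do
    [ -n "$path" ] || continue            # skip blank lines
    if [ ! -x "$root/$path" ]; then
      echo "not executable: $path" >&2
      bad=1
    fi
  done < "$list"
  return "$bad"
}
```

The function returns non-zero if any listed file is not executable for the current user, so a wrapper can refuse to go on to the commit.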

(6) By Stephan Beal (stephan) on 2022-01-14 04:41:48 in reply to 5

Is there a way to list the executable status of the base revision?

You can get what fossil thinks is the current state via:

$ fossil sql "select isexe, pathname from vfile where isexe<>0"
1,'autosetup/autosetup'
1,'autosetup/autosetup-config.guess'
1,'autosetup/autosetup-config.sub'
1,'autosetup/autosetup-find-tclsh'
1,'configure'
1,'docker/build-static-s2sh.bash'
1,'s2/createSignatureTypeList.sh'
1,'s2/r-tester.sh'
1,'s2/reportFromVg.s2'
1,'s2/s2sh.sh'
1,'tools/ucd2casefold'
1,'vgLastFewVersions.sh'

but that state is based on what's in the filesystem. However, we can fish it out of the checkout's manifest...

$ fossil artifact current | awk '($1=="F" && $4=="x"){print $4 " " $2}'
x autosetup/autosetup
x autosetup/autosetup-config.guess
x autosetup/autosetup-config.sub
x autosetup/autosetup-find-tclsh
x configure
x docker/build-static-s2sh.bash
x s2/createSignatureTypeList.sh
x s2/r-tester.sh
x s2/reportFromVg.s2
x s2/s2sh.sh
x tools/ucd2casefold
x vgLastFewVersions.sh

With the obligatory caveat that filenames with spaces will be in "fossilized" format, with spaces replaced by \s. So long as the checkout doesn't have spaces in its filenames, the above could be scripted in something which compares the local checkout's is-exec state (that's what's reported by "changes") and the checked-out version's immutable is-exec state.
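
One way that comparison could be scripted is sketched below. manifest_xbit_diff is a hypothetical helper name; it reads manifest text on stdin so it can be tried without a repository, and it undoes only the \s escape mentioned above (other escapes are not handled). Note that it uses the shell's -x test, which asks whether the current user may execute the file, as an approximation of the x bit.

```bash
# Sketch: manifest_xbit_diff is a hypothetical helper, not part of Fossil.
# Reads manifest text on stdin; $1 is the checkout root. For each tracked
# file whose on-disk x state differs from the manifest's, prints
#   <path><TAB><x state on disk><TAB><x state in manifest>
manifest_xbit_diff() {
  local root=$1 card name hash perm rest repo have
  while read -r card name hash perm rest; do
    [ "$card" = "F" ] || continue          # only F cards name files
    name=${name//\\s/ }                    # undo only the \s escape
    [ -f "$root/$name" ] || continue       # missing files are out of scope
    if [ "$perm" = "x" ]; then repo=x; else repo=-; fi
    if [ -x "$root/$name" ]; then have=x; else have=-; fi
    if [ "$have" != "$repo" ]; then
      printf '%s\t%s\t%s\n' "$name" "$have" "$repo"
    fi
  done
}
# Intended use, from the checkout root (requires Fossil):
#   fossil artifact current | manifest_xbit_diff "$(pwd)"
```

Empty output means the x bits on disk agree with the checked-out version for every tracked file that exists.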

(7) By Larry Brasfield (larrybr) on 2022-01-14 08:16:15 in reply to 6

I was very glad and grateful to see this, particularly the 2nd report. (I can get the file states easily enough from either the Windows side or the Linux side, although your method is somewhat simpler.)

However, after building a tool to report discrepancies between x bits in the current artifact and in files within the checkout, I ran into (what I suspect is) a small problem. When I chmod a file in the checkout to alter its x bit, then run my tool, it reports Nothing. (What???) Upon investigation, I find that the output (after suitable format tweaking) is identical for both methods. In particular, output from your 1st method ("based on what's in the filesystem") does not reflect my chmod done for testing purposes.

I suspect and hope that this issue is merely a side effect of doing nothing to cause fossil to update its notion of what is actually in the checkout. I'm guessing I have to change a file size or timestamp or something to get fossil to actually look at the mode bits again. Is there some less hokey way to get fossil to update its vfile table?

(8) By Stephan Beal (stephan) on 2022-01-14 08:32:49 in reply to 7

However, after building a tool to report discrepancies between x bits in the current artifact and in files within the checkout, I ran into (what I suspect is) a small problem. When I chmod a file in the checkout to alter its x bit, then run my tool, it reports Nothing. (What???) Upon investigation, I find that the output (after suitable format tweaking) is identical for both methods. In particular, output from your 1st method ("based on what's in the filesystem") does not reflect my chmod done for testing purposes.

The vfile entry is only updated when fossil scans for changes. Simply run "fossil status" after your chmod and vfile will be updated.

(9) By Larry Brasfield (larrybr) on 2022-01-15 00:27:52 in reply to 8

My study of the code shows (me) that the vfile table represents the state of checkout files if they are unmodified in any way. This can be seen in the vfile_check_signature() function, where this appears:

    origPerm = db_column_int(&q, 8);
    currentPerm = file_perm(zName, RepoFILE);

Here origPerm is being taken from a query on vfile joined with something else. The file_perm() function ultimately gets stat() done on the subject physical file.

This is why my tool uniformly reports no discrepancies right now; it is comparing the same data which has merely taken different pathways up to the comparison.

Curiously, and only tangentially related, the code which gets actual x bits is gnarly and difficult enough to decode that I question somebody's sanity (either mine or its author's). The file_perm() function reads, in part:

    if( !getStat(zFilename, eFType) ){
      if( S_ISREG(fx.fileStat.st_mode) && ((S_IXUSR)&fx.fileStat.st_mode)!=0 )
        return PERM_EXE;
      else if( db_allow_symlinks() && S_ISLNK(fx.fileStat.st_mode) )
        return PERM_LNK;
    }

This appears, at first glance, to give executable-marked files special status with respect to whether db_allow_symlinks() is true. But since getStat() bottoms out in fossil_stat(), which checks db_allow_symlinks() before doing any stat() or lstat() calls, that special status is illusory. However puzzling, correct or slightly awry this code is, I will not be touching it!

Fortunately, there are traditional ways to get actual file x bits without going through fossil to get them. Unless you know of some cached result of fossil's permission checking, I now happily settle for getting the repo-file x bits using your 2nd (and 1st ;-) method and using "stat --printf=%A" with cut to get actual file x bits.
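
For the record, the stat-plus-cut approach can be sketched as a small function. This assumes GNU coreutils stat (the --printf option and %A format); user_xbit is a hypothetical name.

```bash
# Sketch assuming GNU coreutils stat; user_xbit is a hypothetical name.
# %A prints the mode in symbolic form, e.g. "-rwxr-xr-x"; the 4th
# character is the owner's execute position.
user_xbit() {
  stat --printf=%A -- "$1" | cut -c4
}
```

For ordinary files this prints "x" or "-". Files with the setuid bit show "s" or "S" in that position, which this sketch does not treat specially.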

(10.2) By Larry Brasfield (larrybr) on 2022-01-16 02:19:46 edited from 10.1 in reply to 1.0

(Edited once more to be a clean, robust and complete solution, thanks to some generously shared tips.)

The following is a bash script which reports or fixes discrepancies between "x" bits on actual files of a Fossil checkout and what those "x" bits are when those files are in their pristine, just-opened checkout state. Run it without any argument to see invocation options.

This addresses the problem reported here earlier where "fossil status" or "fossil changes" emit a flurry of "EXECUTABLE" tags when a checkout is kept in an NTFS repo and dealt with from both Windows apps and Linux apps. And it addresses the worse problem where stray "x" bit changes might be checked-in.


#!/usr/bin/bash

pgm=$(basename "$0")

function help {
  echo "Usage: $pgm [option]"
  cat - <<HELP
  Report or fix x-bit discrepancies between files in a Fossil checkout and
  their pristine state. This is useful when: a Fossil checkout is kept in
  NTFS; created or otherwise used within WSL; its files are manipulated by
  Windows tools (which know nothing of the extended attributes WSL stores
  into the NTFS MFT to record/use *Nix permissions); and then any Fossil
  status or changes subcommands are to be run from WSL without a flurry
  of EXECUTABLE noise, or a Fossil commit is to be avoided when it would
  record inadvertently modified execute permission bits in project history.
 
  Empty output and success exit indicates absence of x-bit discrepancies,
  where "empty" means an empty table, a "true" how-to-fix string, or no
  counts emitted when the --tsv, --fixes or --summary options are given.
 
  With invocation option --tsv, TSV showing the x-bit discrepancies is
  emitted in tabular form, with these 3 columns: file xHave xRepo
 
  With invocation option --fixes, the output is in a form which, when
  executed by a *Nix shell, eliminates the discrepancies using chmod.
  From the shell, "eval \$($pgm --fixes)" will flip any discrepant
  x-bits back to their pristine state. With adequate user permissions
  that eval should always succeed absent strange, parallel operations.

  With invocation option --unflip, the above-suggested eval is done.
  Its success return always means that no discrepant files remain.

  With invocation option --summary, discrepancy counts are reported to
  stderr if any are non-zero and success is returned only if they are
  both zero. This is useful as a check prior to running "fossil commit".

  Any option can be abbreviated to -?, where '?' is its first letter.
HELP
}

# Take help pleas and emit help to stdout, without error.
if [ "$1" == "-h" ] || [ "$1" == "--help" ] || [ "$1" == "" ] ; then
  help
  exit 0
fi

declare -i fixes=0 tsv=0 summary=0

case "$1" in
  --fixes|-f)
    fixes=1
  ;;
  --summary|-s)
    summary=1
  ;;
  --tsv|-t)
    tsv=1
  ;;
  --unflip|-u)
    eval "$( "$0" --fixes )"
    exit $?
  ;;
  *)
    echo Unknown option "$1". Enter \"$pgm --help\" for help. >&2
    exit 1
  ;;
esac

# Define some functions to create TSV tables.
# A '-' or 'x' is emitted for 1st column, reflecting user-executable mode.
# The project-root-relative pathname is emitted for the 2nd column.

function HeaderX {
  echo -e "ExBit\tFile"
}

# Blat TSV for regular files within a Fossil checkout, from its root directory.
# This will include controlled files and files not under version control.
function HaveX {
  HeaderX
  find . -type f -printf '%M\t%P\n' | cut --bytes=4,11-
}

# Blat TSV for file objects that are tracked in current tip of a Fossil repo.
function RepoX {
  HeaderX
  "$fossil" artifact current \
      | awk '($1=="F"){print (($4=="x")?"x":"-") "\t" $2}'
}

# Setup some temp files.
XbitTsv=/tmp/$$_
fHave=${XbitTsv}Have.tsv
fRepo=${XbitTsv}Repo.tsv
db=${XbitTsv}XBits.sdb

# Find something to use as fossil, else fail.
fossil=""
if [ -n "$(which fossil)" ] ; then
  fossil=$(which fossil)
elif [ -x /usr/bin/fossil ] ; then
  fossil=/usr/bin/fossil
elif [ -n "$FOSSIL" ] ; then
  fossil="$FOSSIL"
else
  echo $pgm cannot work without a way to run Fossil. >&2
  exit 2
fi

# Get to the checkout root if possible, else fail.
projroot=$("$fossil" info | awk '(/^local-root:/){print $2;}')
if [ -z "$projroot" ] ; then
  echo Not within a Fossil checkout. $pgm can do nothing. >&2
  exit 1
fi
# Stay in the main shell (no subshell) so that the cd and the trap persist.
cd "$projroot" || { echo "Cannot cd to $projroot." >&2 ; exit 1 ; }
trap 'rm -f "$fHave" "$fRepo" "$db"' EXIT

# Generate the input TSV for comparison.
HaveX > $fHave
RepoX > $fRepo

# Load SQLite transient DB with x-bit values for repo and actual files.
sqlite3 -batch -cmd '.mode tabs' -cmd ".import $fHave Have" \
        -cmd ".import $fRepo Repo" $db << "LOAD"
 create view XBitFlips
 as select h."File" as file, h."ExBit" as xHave, r."ExBit" as xRepo
 from Have h, Repo r where h."File"=r."File" and h."ExBit"<>r."ExBit" ;
.quit
LOAD

if (( $fixes==1 )) ; then
 # Output chmod invocations that would cure the discrepancies.
 set_chmods=$(
 sqlite3 -batch -cmd '.header off' $db << "SFIX"
 select group_concat(file,'|') from XBitFlips where xHave='-';
.quit
SFIX
 )
 clr_chmods=$(
 sqlite3 -batch -cmd '.header off' $db << "CFIX"
 select group_concat(file,'|') from XBitFlips where xHave='x';
.quit
CFIX
 )
 if (( ${#set_chmods}+${#clr_chmods} > 0 )) ; then
  declare -i nfa=200
  # Number of file arguments will be limited to avoid certain limits.
  let nfa=$(getconf ARG_MAX)/$(getconf _POSIX_PATH_MAX .)/3
  printf 'pushd %q >/dev/null ;\n' "$projroot"
  if (( ${#set_chmods}>0 )); then
    printf 'echo -n %q| xargs -n %d -d "|" chmod u+x ;' "$set_chmods" $nfa
  fi
  if (( ${#clr_chmods}>0 )); then
    printf 'echo -n %q| xargs -n %d -d "|" chmod u-x ;' "$clr_chmods" $nfa
  fi
  echo 'popd >/dev/null ;'
 else
   echo true
 fi
fi

if (( $tsv )) ; then
 # Output is TSV that may be .import'ed into SQLite for further processing.
  echo -e "file\txHave\txRepo"
  sqlite3 -batch -cmd '.header off' -cmd '.mode tabs' $db << "EOHR"
  select * from XBitFlips;
.quit
EOHR
fi

if (( $summary )) ; then
  declare -i badSetX badClrX badAny
  qBadSetX="select count(*) from XBitFlips where xHave='x'"
  qBadClrX="select count(*) from XBitFlips where xHave='-'"
  badSetX=$(echo -e "$qBadSetX;\n.quit" | sqlite3 -batch $db)
  badClrX=$(echo -e "$qBadClrX;\n.quit" | sqlite3 -batch $db)
  let badAny=$badSetX+$badClrX
  if (( $badAny > 0 )) ; then
    echo "X-bits flipped from Repo state: $badSetX set and $badClrX clear" >&2
    exit 1
  else
    exit 0
  fi
fi
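
As the help text suggests, the --summary option can gate a commit. A minimal sketch of that, where commit_if_clean is a hypothetical wrapper and xbits.sh is only a placeholder name for the script above:

```bash
# Sketch: commit_if_clean is a hypothetical wrapper. $1 names the checker
# (the script above, under whatever name it was saved); the remaining
# arguments are the command to run when no discrepancies are reported.
commit_if_clean() {
  local checker=$1
  shift
  if "$checker" --summary; then
    "$@"
  else
    echo "x-bit discrepancies found; commit skipped" >&2
    return 1
  fi
}
# Intended use, with xbits.sh as a placeholder name:
#   commit_if_clean ./xbits.sh fossil commit -m "my latest changes"
```

Passing the checker as an argument keeps the sketch independent of where the script is installed.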

(11) By Martin Gagnon (mgagnon) on 2022-04-01 16:27:48 in reply to 1.1

Sorry to resuscitate this thread, but this is something I would really like to be handled by fossil.

As a proof of concept, I made a very simple (hardcoded) temporary fix to ignore the executable bit (and symlinks, since they are handled together), exactly as a Windows-compiled fossil handles it. It seems to work.

I would like to have this as an option, so it can land on trunk eventually.
(Since it's a problem that affects any Windows/WSL user who runs the WSL fossil binary directly on an NTFS drive (example: on /mnt/c/somedir/...), I think I would not be the only one benefiting from this.)

Questions about the implementation of this option:

  • How to enable/disable this new option ?

    • using fossil set ?
      • (caveat: per repo or global, would make sense per checkout)
    • command line argument on clone and saving the option on the checkout.
      • (may need a new command to flip the option after the clone is done, in case the switch was forgotten during clone)
      • we have no per-checkout settings now, so it would be a first.
    • other ideas ?
  • What name should we use for this option ?

    • ignore-unix-file-meta (default off, no effect on windows compiled binary)
      • (a bit verbose, but generic and cover symlink case)
    • ignore-exec-flag
      • (would need to ignore only exec flag to be consistent with the name)
    • other ideas ?

(12) By Martin Gagnon (mgagnon) on 2022-04-01 20:23:26 in reply to 11

I just saw that Git has a config setting called "fileMode" for this.

From git-config(1):

core.fileMode
    Tells Git if the executable bit of files in the working tree
    is to be honored.

    Some filesystems lose the executable bit when a file that is
    marked as executable is checked out, or checks out a
    non-executable file with executable bit on. git-clone(1)
    or git-init(1) probe the filesystem to see if it handles the
    executable bit correctly and this variable is automatically
    set as necessary.

    A repository, however, may be on a filesystem that handles
    the filemode correctly, and this variable is set to true when
    created, but later may be made accessible from another
    environment that loses the filemode (e.g. exporting ext4
    via CIFS mount, visiting a Cygwin created repository with Git
    for Windows or Eclipse). In such a case it may be necessary
    to set this variable to false. See git-update-index(1).

    The default is true (when core.filemode is not specified
    in the config file).

( reference: https://stackoverflow.com/questions/1580596/how-do-i-make-git-ignore-file-mode-chmod-changes )
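
For comparison, this is roughly how that Git setting is switched off for one repository. ignore_filemode is a hypothetical wrapper name; the git config call inside it is the real setting quoted above.

```bash
# Sketch: ignore_filemode is a hypothetical wrapper around a real Git
# setting. $1 is the path to an existing Git working tree.
ignore_filemode() {
  git -C "$1" config core.fileMode false
}
# Intended use:
#   ignore_filemode /mnt/c/somedir/project
```

The setting is stored in that repository's own config file, so it applies per repository rather than globally.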