Fossil Forum

New option: "fossil changes --scp REMOTE"

(1) By Richard Hipp (drh) on 2021-02-05 21:36:40 [source]

There is a new option to the "fossil changes" command on trunk:

 fossil changes --scp REMOTE

Where REMOTE is an scp-style remote directory name. This command invokes the "scp" command as many times as needed to copy all changed files to the REMOTE repository. It takes care to put the files in the right directories. REMOTE should specify the root of the remote repository.

Use Case

I'll be using this for SQLite development. When I'm working on a change, I will often want to move the modified files over to a nearby headless test machine and start some long-running CPU-intensive tests while I continue editing on my desktop. This operation used to require lots of manual "scp" commands. The --scp to "fossil changes" simplifies the task.

fossil changes --scp ryzen01:sqlite/sqlite

I'd also like to use this to copy the changed files to a nearby Win10 machine so that I can also compile with MSVC and check for warnings or problems that MSVC finds that GCC/Clang does not. Unfortunately, I don't have scpd running on my Win10 box. Any suggestions on how to accomplish that would be appreciated.

You can help

  1. Is this the right way to do this - an option to "fossil changes"? Is there a better way - a "cleaner" way? Does it need a new command "fossil scp"?

  2. Should it use "rsync" instead of "scp"?

  3. Is this command option just too crazy and confusing? Should I take it back out?

(2) By Dan Shearer (danshearer) on 2021-02-05 21:45:34 in reply to 1 [link] [source]

no comment on the feature, because I'm going to have my evening now :-)

But this bit is easy:

I don't have scpd running on my Win10 box. Any suggestions on how to accomplish that would be appreciated.

  1. Official Microsoft Linux subsystem. It installs Ubuntu 20.04 (server, I think) with everything you'd expect.

  2. (ha!) but if you have Git for compatibility testing or whatever, then a whole Linux shell comes with it. It's called GitBash. And it definitely has scp. Git is the opposite of no-dependencies so they have to ship Bash and Perl and who-knows-what else. So at least there's a decent commandline prompt available as soon as you install windows Git.

  3. Cygwin used to have this, it probably still does, idk

Dan Shearer

(7) By anonymous on 2021-02-06 01:15:05 in reply to 2 [link] [source]

I understand the need, however the implementation is very much 'un-Unix' like. This should be achievable with already existing 'small' tools for the specific jobs.

My first inclination would be to use Fossil to only generate the list of the changed files which then could be used by scp command (whatever it is, say PuTTY's pscp or other) to copy it over.

Also, this kind of goes against Fossil's ideology to preserve the state either local or remote. Copying the files over alters the remote state. This is acceptable for a copy-like tool, yet for a VCS-kind of tool is too relaxed.

Hope you could consider alternatives.

(3) By Alan Bram (flyboy) on 2021-02-05 22:05:46 in reply to 1 [link] [source]

It feels quite weird to me for this to be on the "changes" command: the idea of "changes" seems to be (nothing more than) "please tell me some information."

I feel there might be a better place to wedge this in (if not a whole new command); how about putting it into "stash"? (I was also tempted by "publish", but I suppose that's for already committed work.)

(4) By John Rouillard (rouilj) on 2021-02-05 22:26:51 in reply to 1 [link] [source]

This seems wrong to be an option of changes. It is not where I would expect this functionality to exist. (Then again I don't expect my VCS to deploy except via pull/update.)

Maybe fossil deploy with --scp, (default??) --rsync (more efficient for large files with few changes), --ftp etc.

Also if you have rsync available, I think this will work as well:

fossil changes | awk '{print $2}' | rsync --files-from - . remote:/root/of/dir

assuming you are in the root of the fossil working directory.

If you are using scp a bash shell function like:

deploy() {
  fossil changes | while read -r category path
  do
    scp -p "$path" "$1/$path"
  done
}

Run it like:

deploy remote:/root/of/dir

Note that, unlike the rsync example, scp will not create intermediate directories if they are missing. Adding an ssh -n ${1%%:*} mkdir -p "${1#*:}/$(dirname "$path")" before the scp should, I think, handle that issue (the -n keeps ssh from reading the file list that the loop is consuming).
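
The directory-creation idea can be folded into the function itself. What follows is only a minimal sketch, assuming it is run from the root of the checkout, that "fossil changes" prints a status word followed by a path, and that remote paths contain no single quotes. The variable names are illustrative.

```shell
# Hypothetical variant of deploy(): copy each changed file to an
# scp-style destination ($1 = host:dir), creating parent directories first.
deploy() (
  dest=$1
  host=${dest%%:*}    # text before the first colon
  root=${dest#*:}     # text after the first colon
  fossil changes | while read -r category path; do
    [ -f "$path" ] || continue    # skip entries for deleted or missing files
    dir=$(dirname "$path")
    # -n stops ssh from consuming the list the loop is reading on stdin.
    ssh -n "$host" "mkdir -p '$root/$dir'"
    scp -p "$path" "$dest/$path" </dev/null
  done
)
```

It is run the same way, for example: deploy remote:/root/of/dir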

Regarding an sshd and scp (or rsync) on windows, I have used cygwin for years for exactly this purpose.

(5) By Scott Robison (sdr) on 2021-02-05 22:30:43 in reply to 1 [link] [source]

I think a separate command is a better idea. In this way it can be used to either scp all tracked files in the workdir to a remote location, or only changed files. Maybe something like:

fossil scp --all user:path/to/dest
fossil scp --changes user:path/to/dest

I like rsync as a solution to copy an entire work directory, including untracked files, and perhaps that could be an option to fossil scp as well, since it will handle the functionality of determining what is changed and must be transferred:

fossil scp --rsync user:path/to/dest

The only thing I don't like about that is mixing scp & rsync which are otherwise unrelated. Maybe a more generic name than scp could be used:

fossil xfer --scp ...
fossil xfer --rsync ...

The example named commands & options are just for consideration. Nor do they all need to be implemented right now.

For the purposes of doing something quickly to do what you need, I'd pick another command that defaults to --scp & --changes (they need not be real options). Then later it can be extended to add --rsync when and if it makes sense, or to do --all to copy all tracked files, etc.

(6) By Larry Brasfield (LarryBrasfield) on 2021-02-05 22:44:49 in reply to 1 [link] [source]

Should it use "rsync" instead of "scp"?

If there are well-functioning rsync implementations at both ends, the user should invoke it. Separation of concerns ... However, since Fossil already reports what has changed, an option to emit that information in a format rsync can use directly, telling it which files to sync, would be very handy (and still modular). It would also spare rsync the useless scanning it does when invoked in the most convenient way.

The rsync tool is very wire-friendly, while scp blindly copies.

(8) By Warren Young (wyoung) on 2021-02-06 02:30:54 in reply to 1 [link] [source]

This command invokes the "scp" command as many times as needed to copy all changed files to the REMOTE repository.

scp and rsync will both take a list of files, and at least with rsync, you can arrange the copy destination to handle subdirectories.

This operation used to require lots of manual "scp" commands

That, or branches so you can commit and pull on the other machine, which can be annoying when all you're trying to figure out is "Does this work properly on all platforms?"

I expect I'm preachin' to the choir, but I do so as a counter to the criticism. Yes, this is a good feature. We can bikeshed the thing, but whether the bike shed needs to exist is not really in question, IMHO.

But it needs to be orange. 😛

Is there a better way - a "cleaner" way?

It's kinda sorta a commit. Instead of going into the repo, it goes to the other checkout.

It's also kind of a merge, or at least, I hope so: what happens when you scp multiple times? What if there are uncommitted changes on the other side?

Just a thought.

Should it use "rsync" instead of "scp"?

Both, if possible. rsync isn't available everywhere, but rsync is smarter, so it should also be an option. The two don't share compatible command formats beyond a trivial level, so I'm suggesting two different options, not some sort of autodetection magic.

Should I take it back out?

No. Refine, maybe, but it's not anti-Fossil.

You want a philosophical argument, this is test-before-commit writ across the LAN.

(9) By Stephan Beal (stephan) on 2021-02-06 03:46:27 in reply to 8 [link] [source]

whether the bike shed needs to exist is not really in question, IMHO.

i'm gonna disagree, but it seems the decision has already been made. Obviously it fills a need Richard has, but i agree with the commenters who propose that fossil changes instead offers a flag to emit input for rsync, as opposed to actually making the rsync call.

Both, if possible. rsync isn't available everywhere, but rsync is smarter, so it should also be an option.

Another alternative is tar, which, like scp, will copy entire contents but, like rsync, also creates directories. Abstractly:

fossil changes --tar - | ssh remote 'tar xf - -C /the/dir'

or:

fossil changes --tar foo.tar
scp foo.tar remote:/the/dir
ssh remote 'cd /the/dir; tar xf foo.tar'

Barring the use of tar, though, rsync seems like a better option than scp. It creates directories, is more wire-efficient, and can use different shells. (Not that wire efficiency is an issue for Richard's use case of a LAN-local target machine, but still.)
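
The tar idea may already be reachable with existing tools, without a new --tar flag. A rough sketch, assuming "fossil changes --no-classify" prints one bare path per line (as its use elsewhere in this thread suggests) and that the local tar accepts "-T -" to read member names from stdin, which GNU tar and bsdtar both do. The name changes_tar is made up for illustration.

```shell
# Hypothetical helper: write a tar stream of the changed files to stdout.
# Run from the root of the checkout.
changes_tar() {
  fossil changes --no-classify |
    while read -r path; do
      # Drop entries for files that no longer exist locally.
      [ -f "$path" ] && printf '%s\n' "$path"
    done |
    tar -cf - -T -
}

# Usage (host and directory are placeholders):
#   changes_tar | ssh remote 'tar -xf - -C /the/dir'
```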

(12) By jamsek on 2021-02-06 10:16:20 in reply to 8 [link] [source]

I agree with Warren insofar as I thought it would make a good addition.
I invariably do fossil diff > curr.patch then scp it and ssh over
then apply patch -s -p0 < curr.patch to the remote checkout. This
addition would've made this process a little simpler and also dealt with
instances when there are new files or directories.
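
For comparison, the patch route can be collapsed into one pipeline. This is a sketch only: send_patch is an invented name, the host and checkout path are placeholders, and the same limitation around new files and directories applies.

```shell
# Hypothetical helper: apply local uncommitted edits to a remote checkout.
#   $1 = ssh host, $2 = path of the checkout on that host
send_patch() {
  # patch reads the diff from stdin, so no temporary file is needed.
  fossil diff | ssh "$1" "cd '$2' && patch -s -p0"
}

# Usage:
#   send_patch testbox /home/me/project
```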

But I see it's been backed out now so no matter. Though I also agree
with sdr@ that it might've been better as a new command with both scp
and rsync options.

(13) By Richard Hipp (drh) on 2021-02-06 13:11:47 in reply to 12 [link] [source]

There is now a TCL script at "tools/co-rsync.tcl" that you can copy into your ~/bin to do the same operation. This script is better in a number of ways:

  1. It uses rsync instead of scp, so it is more efficient over the wire.

  2. As before, only managed files that have been modified get synced, so no time is wasted syncing compiled binaries and whatnot. This is as before but is a point that was seemingly overlooked by critics of the original command.

  3. The script does a single rsync to transfer all files, rather than a separate scp for each file. So if you have to type in a password (which I have to do when xfering to windows, as I have not been able to get authorized_keys working on windows) you will only have to do so once.

The name of the script is an abbreviation for "checkout-rsync". I'm open to changing this to a better name if anybody has a suggestion.

(14.1) By Larry Brasfield (LarryBrasfield) on 2021-02-07 21:13:41 edited from 14.0 in reply to 13 [link] [source]

(Amended: As suggested to be more appropriate here, this more general topic (making fossil output amenable to non-human processing) is being discussed in a dedicated thread. So, better not to reply here.)

That's a nice way to solve the problem, IMHO. Your script brings up an issue I have meant to raise here for a while.

Early in the development of Subversion, one design objective mentioned and well justified was what they now call "Parsable output" here. There was also a commitment to keeping the output stable so that tools could be written to use it and not be fragile as Subversion evolved. (No link. Sorry. I well remember this because shifting output to tools I have written has been an annoyance, so I was favorably impressed when I saw that intention years ago.)

I have not seen a similar commitment in the Fossil docs. (This is a "statement of ignorance", not a strong claim about Fossil.) Now, seeing the co-rsync.tcl script, which uses --no-classify and --no-merge options to the changes command, I wonder if there is a way to tell the Fossil commands that produce output potentially useful to automatons that a bare (unadorned, not summarized) and consistently formatted output is to be emitted. If there is, I would like to know about it, because I've not stumbled across it in my Fossil studies. If there is not, I submit that there should be. The option might even cause tabs to be used as separators for output which inherently has multiple columns.

Such an option would apply across the whole command set. It would be relatively easy to wrap adornment in something which honored the flag.

It's likely not something that gets done in an hour or two, but the intention could be expressed soon, later to become a promise when fully realized.

(15) By anonymous on 2021-02-06 22:18:20 in reply to 13 [link] [source]

I'm open to changing this to a better name if anybody has a suggestion.

Just tested this new script for the exact same purpose -- lazily updating the "test" dir on one of my VMs. It works great and fast!

Two suggestions:

  1. Perhaps naming it something other than checkout, as it in fact just does a selective copy, not really a checkout. Maybe "copy-current.tcl" or more precisely "rsync-current.tcl" ?

  2. In my case, I wanted to actually "refresh" the whole set of source files on the remote VM, not just changes. That is, copy over all managed files from the current work dir to the remote, kinda like a checkout, but without a remote repo. Super lazy, I know... So, I had to amend this script to feed rsync from the output of 'fossil ls'.

Maybe, it would make sense to have this script operate in two modes "changes-only" and "all". Not sure which one to default to. For example:

./rsync-current.tcl --changes user@host:dir/

or in another way to rsync all managed files

./rsync-current.tcl --all user@host:dir/
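
The two modes could look roughly like this as a shell function. It is a sketch, not the actual script: rsync_current is an invented name, it assumes it is run from the checkout root, and it assumes "fossil changes --no-classify" and "fossil ls" each print one path per line.

```shell
# Hypothetical wrapper around rsync:
#   rsync_current --changes user@host:dir/   (only modified managed files)
#   rsync_current --all     user@host:dir/   (every managed file)
rsync_current() (
  mode=$1 dest=$2
  case $mode in
    --changes|--all) ;;
    *) echo "usage: rsync_current --changes|--all DEST" >&2; exit 2 ;;
  esac
  [ -n "$dest" ] || { echo "rsync_current: missing DEST" >&2; exit 2; }
  if [ "$mode" = --all ]; then fossil ls; else fossil changes --no-classify; fi |
    while read -r path; do
      # Pass along only paths that still exist in the working directory.
      [ -e "$path" ] && printf '%s\n' "$path"
    done |
    # --files-from=- reads the path list from stdin; paths are relative to ".".
    rsync -a --files-from=- . "$dest"
)
```

Requiring an explicit mode sidesteps the question of which default to pick.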

Thank you for finding even more ways to improve Fossil!

(10) By anonymous on 2021-02-06 06:07:43 in reply to 1 [link] [source]

I'll be using this for SQLite development. When I'm working on a change, I will often want to move the modified files over to a nearby headless test machine and start some long-running CPU-intensive tests while I continue editing on my desktop. This operation used to require lots of manual "scp" commands. The --scp to "fossil changes" simplifies the task.

Based on your intended use, did you consider trying sshfs?

Basically, this would mount your target as a local filesystem. So you'd copy files locally as usual, while the transfers to/from target would be transparently handled via SSH.

(11) By anonymous on 2021-02-06 07:17:54 in reply to 10 [link] [source]

With SSHFS set up, your command-line (bash) usage may be something as simple as:

cp $(fossil changes --no-classify) /local-mount-dir/remote/dir

As for the SSHFS setup (assuming your platform has sshfs installed):

sshfs -o idmap=user remoteuser@remotehost:remotedir /local-mount-dir

#to unmount
fusermount -u /local-mount-dir

There're ways to make this setup permanent.
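
One caveat with the plain cp form: it drops every file directly into the target directory and splits names on whitespace. A sketch that keeps each file's subdirectory, assuming "fossil changes --no-classify" prints one path per line and that it is run from the checkout root (copy_changes is a made-up name):

```shell
# Hypothetical helper: copy changed files into a mounted directory ($1),
# recreating each file's subdirectory under it.
copy_changes() (
  target=$1
  fossil changes --no-classify | while read -r path; do
    [ -f "$path" ] || continue    # skip deleted or missing files
    mkdir -p "$target/$(dirname "$path")"
    cp -p "$path" "$target/$path"
  done
)

# Usage:
#   copy_changes /local-mount-dir
```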

(16) By js (Midar3) on 2021-02-07 13:48:41 in reply to 1 [link] [source]

Unfortunately, the scp command has known security issues that result in code execution when talking to a malicious server. Could this be switched to sftp instead, which does not have these problems?

(17) By Stephan Beal (stephan) on 2021-02-07 13:51:50 in reply to 16 [link] [source]

Could this be switched to sftp instead, which does not have these problems?

It's a moot issue now - the change was rolled back and replaced with a script which uses rsync.

In any case, since it's only intended to be used for copying to one's own development systems where one already has code checked out (and therefore presumably has an ssh session open), the chances of encountering a malicious server approach zero.