Опасная зона

Saturday, February 26, 2011

Rsync magic: rsync vs. scp

Remote computers are commonly accessed over encrypted ssh connections. In an earlier post on my (and Roy's) willworkforscience blog, I mentioned the blessings of sshfs, which lets you mount a remote filesystem if the remote computer has an ssh daemon running.

Here I want to honor a very old program called "rsync", which can be used for fast copying of files to and from remote computers. Rsync has been extended with plenty of new functionality, and I generally prefer rsync to scp when I have to copy more than one file. Let me give a few examples:

Example 1: speed

In my last post I presented a little script which generates an arbitrary number of files with random contents. For this test I generated 1000 files of 1 kB each. Copying them from my home PC kepler to the remote computer lululu (via a 10/2 Mbit ADSL line) is significantly slower with scp than with rsync:

bassler@kepler:~/test$ time scp * lululu:test/.
bassler@lululu's password:

real   1m19.619s
user   0m0.188s
sys   0m0.204s

bassler@kepler:~/test$ time rsync * lululu:test/.
bassler@lululu's password: 

real    0m14.780s
user    0m0.048s
sys    0m0.032s


That is 80 seconds for scp and 15 seconds for rsync. (Yes, the target directory was wiped empty before I issued the rsync command.) The per-file overhead of scp becomes significant when many files are copied. If I copy a single 1 MB file instead of 1000 files of 1 kB, there is no noticeable speed difference.
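If you want to repeat the comparison, the test files can be recreated with a few lines of shell. This is only a sketch and not the script from my earlier post: the temporary directory, the file_N.dat names and the use of /dev/urandom are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of the test setup: 1000 files with 1024 bytes of random data each.
# Directory and file names are hypothetical, not taken from the original script.
testdir=$(mktemp -d)

i=1
while [ "$i" -le 1000 ]; do
    head -c 1024 /dev/urandom > "$testdir/file_$i.dat"
    i=$((i + 1))
done

# Count the files and check the size of one of them
count=$(ls "$testdir" | wc -l | tr -d ' ')
size=$(wc -c < "$testdir/file_1.dat" | tr -d ' ')
echo "created $count files of $size bytes in $testdir"

# To time the two tools against a remote host (not run here):
#   time scp "$testdir"/* lululu:test/.
#   time rsync "$testdir"/* lululu:test/.
```

The two commented commands at the end are the ones timed above; remember to empty the target directory between the runs.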

Example 2: transferring very large files

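Rsync has a nice resume option for transferring large files, in case the transfer is interrupted for some reason. Basically, your options should look like this:
rsync -rvv --inplace --append --progress /disk1/Movies/* lululu:Movies/.
Note that --append assumes the part of the file already on the receiver is identical to the start of the file on the sender. As far as I know, resuming is not possible with scp in a straightforward way.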
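The resume behaviour can be tried out locally, without a remote host. The sketch below fakes an interrupted transfer by placing a truncated copy of the file in the destination and then lets rsync append the missing tail. The temporary directories, the file name big.bin and the sizes are assumptions chosen for the demonstration.

```shell
#!/bin/sh
# Local simulation of a resumed transfer. All paths are throwaway temp dirs
# (assumptions for illustration); no remote host is involved.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed, skipping"; exit 0; }

src=$(mktemp -d)
dst=$(mktemp -d)

head -c 1048576 /dev/urandom > "$src/big.bin"     # 1 MiB stand-in for a large file
head -c 300000 "$src/big.bin" > "$dst/big.bin"    # pretend the copy died after 300000 bytes

# --append keeps the bytes already on the receiver and sends only the rest
rsync -r --inplace --append "$src/" "$dst/"

# Compare source and destination byte by byte
if cmp -s "$src/big.bin" "$dst/big.bin"; then
    resumed=yes
else
    resumed=no
fi
echo "resumed copy identical to source: $resumed"
```

If you do not trust the partial file on the receiver, recent rsync versions also offer --append-verify, which checksums the existing data as part of the transfer.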


Example 3: tunnel your way past firewalls and gateways

This is probably the coolest feature of rsync! If I am at home and want to update the files on my office PC baslup, I face the problem that the PC is hidden behind the university firewall/gateway lululu. So with scp I would have to copy the files to lululu first, and from there copy them again to my office PC baslup.
Now, if rsync is installed on both ends of the chain (the gateways in between only need ssh), I can issue a very simple command which will establish a link directly to my office PC:
rsync -va -e "ssh lululu ssh" test/* baslup:test/

You can add several PCs to the chain. Here fufufu is added:
rsync -va -e "ssh lululu ssh fufufu ssh" test/* baslup:test/
The only annoying thing is that you have to enter your password multiple times (unless you set up ssh keys for passwordless logins).
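The pattern in the -e argument is regular: start with "ssh" and add "HOST ssh" for every gateway on the way. A small sketch of that rule follows; build_hop_chain is a hypothetical helper name of mine, not something that ships with rsync or ssh, and the script only prints the resulting command instead of running it.

```shell
#!/bin/sh
# build_hop_chain is a hypothetical helper (not part of rsync or ssh):
# it turns a list of gateway hosts into the string passed to rsync -e.
build_hop_chain() {
    chain="ssh"
    for hop in "$@"; do
        chain="$chain $hop ssh"
    done
    printf '%s\n' "$chain"
}

remote_shell=$(build_hop_chain lululu fufufu)

# Only print the command here; remove the echo and the extra quoting to run it.
echo rsync -va -e "\"$remote_shell\"" 'test/*' baslup:test/
```

With no gateways the helper returns plain "ssh", which is what rsync uses anyway.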

OK, finally, a list of the most important options for rsync:

 -v verbose
 -r recurse into dirs (you might just as well use -a instead)
 -a archive: recurse into dirs, preserves symlinks, permissions, timestamps, group and owner.
 -u update: don't overwrite newer files on receiver
 -n dry run, copies no files, but shows what would be done
 --del delete files/dirs on the receiver which do not exist on the sender
 -z enable compression. Yes, if you have files with lots of air in them, you can achieve dramatic transfer speedups. Works well for uncompressed scientific data, bitmap files, large text files etc., but is much slower when transferring mp3, jpg, gz, zip, pdf and similar compressed files.
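Since --del can destroy data, it is worth combining it with -n first. The following sketch shows the difference between a dry run and the real thing on two local temporary directories; the directory contents (new.txt, stale.txt) are made up for the demonstration.

```shell
#!/bin/sh
# Dry run versus real run with --del, on throwaway local directories
# (hypothetical contents, chosen only for illustration).
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed, skipping"; exit 0; }

src=$(mktemp -d)
dst=$(mktemp -d)
echo "fresh" > "$src/new.txt"      # exists only on the sender
echo "old" > "$dst/stale.txt"      # exists only on the receiver

# -n: only report what would happen (copy new.txt, delete stale.txt)
rsync -avn --del "$src/" "$dst/"
after_dry_run=$(ls "$dst")

# Same command without -n: now the changes are really made
rsync -av --del "$src/" "$dst/"
after_real_run=$(ls "$dst")

echo "after dry run:  $after_dry_run"
echo "after real run: $after_real_run"
```

The trailing slash on the source directory matters: it tells rsync to copy the contents of the directory rather than the directory itself.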


Thank you, developers of rsync! This has saved me a lot of hassle.
