rsync - faster, flexible replacement for rcp
rsync [OPTION]... SRC [SRC]... [USER@]HOST:DEST
rsync [OPTION]... [USER@]HOST:SRC DEST
rsync [OPTION]... SRC [SRC]... DEST
rsync [OPTION]... [USER@]HOST::SRC [DEST]
rsync [OPTION]... SRC [SRC]... [USER@]HOST::DEST
rsync [OPTION]... rsync://[USER@]HOST[:PORT]/SRC [DEST]
rsync [OPTION]... SRC [SRC]... rsync://[USER@]HOST[:PORT]/DEST
rsync is a program that behaves in much the same way that rcp does, but has many more options and uses the rsync remote-update protocol to greatly speed up file transfers when the destination file is being updated.
The rsync remote-update protocol allows rsync to transfer just the differences between two sets of files across the network connection, using an efficient checksum-search algorithm described in the technical report that accompanies this package.
Some of the additional features of rsync are:
There are eight different ways of using rsync. They are:
Note that in all cases (other than listing) at least one of the source and destination paths must be local.
See the file README for installation instructions.
Once installed, you can use rsync to any machine that you can access via a remote shell (as well as some that you can access using the rsync daemon-mode protocol). For remote transfers, a modern rsync uses ssh for its communications, but it may have been configured to use a different remote shell by default, such as rsh or remsh.
You can also specify any remote shell you like, either by using the -e command line option, or by setting the RSYNC_RSH environment variable.
One common substitute is to use ssh, which offers a high degree of security.
Note that rsync must be installed on both the source and destination machines.
You use rsync in the same way you use rcp. You must specify a source and a destination, one of which may be remote.
Perhaps the best way to explain the syntax is with some examples:
rsync -t *.c foo:src/
This would transfer all files matching the pattern *.c from the current directory to the directory src on the machine foo. If any of the files already exist on the remote system then the rsync remote-update protocol is used to update the file by sending only the differences. See the tech report for details.
rsync -avz foo:src/bar /data/tmp
This would recursively transfer all files from the directory src/bar on the machine foo into the /data/tmp/bar directory on the local machine. The files are transferred in archive mode, which ensures that symbolic links, devices, attributes, permissions, ownerships, etc. are preserved in the transfer. Additionally, compression will be used to reduce the size of data portions of the transfer.
rsync -avz foo:src/bar/ /data/tmp
A trailing slash on the source changes this behavior to avoid creating an additional directory level at the destination. You can think of a trailing / on a source as meaning "copy the contents of this directory" as opposed to "copy the directory by name", but in both cases the attributes of the containing directory are transferred to the containing directory on the destination. In other words, each of the following commands copies the files in the same way, including their setting of the attributes of /dest/foo:
rsync -av /src/foo /dest
rsync -av /src/foo/ /dest/foo
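The trailing-slash rule can be modelled in a few lines. The following is only a sketch of the path arithmetic described above, not of rsync's own code; the function name destination_for and the file name a.txt are made up for illustration.

```python
import posixpath


def destination_for(src: str, dest: str, rel_file: str) -> str:
    """Return where rel_file (a path inside the source directory) lands under dest.

    src is the source argument as typed, with or without a trailing slash.
    """
    if src.endswith("/"):
        # Trailing slash: "copy the contents of this directory",
        # so no extra directory level is created at the destination.
        return posixpath.join(dest, rel_file)
    # No trailing slash: "copy the directory by name",
    # so the last component of src is recreated under dest.
    return posixpath.join(dest, posixpath.basename(src), rel_file)


# Both command forms from the text place the file in the same location.
print(destination_for("/src/foo", "/dest", "a.txt"))
print(destination_for("/src/foo/", "/dest/foo", "a.txt"))
```

Both calls yield the same path, which mirrors the statement that the two commands copy the files in the same way.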
You can also use rsync in local-only mode, where both the source and destination don't have a `:' in the name. In this case it behaves like an improved copy command.
rsync somehost.mydomain.com::
This would list all the anonymous rsync modules available on the host somehost.mydomain.com. (See the following section for more details.)
The syntax for requesting multiple files from a remote host involves using quoted spaces in the SRC. Some examples:
rsync host::'modname/dir1/file1 modname/dir2/file2' /dest
This would copy file1 and file2 into /dest from an rsync daemon. Each additional arg must include the same modname/ prefix as the first one, and must be preceded by a single space. All other spaces are assumed to be a part of the filenames.
rsync -av host:'dir1/file1 dir2/file2' /dest
This would copy file1 and file2 into /dest using a remote shell. This word-splitting is done by the remote shell, so if it doesn't work it means that the remote shell isn't configured to split its args based on whitespace (a very rare setting, but not unknown). If you need to transfer a filename that contains whitespace, you'll need to either escape the whitespace in a way that the remote shell will understand, or use wildcards in place of the spaces. Two examples of this are:
rsync -av host:'file\ name\ with\ spaces' /dest
rsync -av host:file?name?with?spaces /dest
This latter example assumes that your shell passes through unmatched wildcards. If it complains about "no match", put the name in quotes.
It is also possible to use rsync without a remote shell as the transport. In this case you will connect to a remote rsync server running on TCP port 873.
You may establish the connection via a web proxy by setting the environment variable RSYNC_PROXY to a hostname:port pair pointing to your web proxy. Note that your web proxy's configuration must support proxy connections to port 873.
Using rsync in this way is the same as using it with a remote shell except that:
Some paths on the remote server may require authentication. If so then you will receive a password prompt when you connect. You can avoid the password prompt by setting the environment variable RSYNC_PASSWORD to the password you want to use or using the --password-file option. This may be useful when scripting rsync.
WARNING: On some systems environment variables are visible to all users. On those systems using --password-file is recommended.
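When scripting rsync, a password file can be created with restrictive permissions before the command line is assembled. The sketch below is one possible approach, not a prescribed one; the helper names, the file prefix and the example module path are assumptions, and nothing is executed.

```python
import os
import stat
import tempfile


def write_password_file(password: str, directory: str) -> str:
    """Create a password file readable and writable only by its owner; return its path."""
    fd, path = tempfile.mkstemp(dir=directory, prefix="rsync-pass-")
    with os.fdopen(fd, "w") as handle:
        handle.write(password + "\n")
    # mkstemp already creates the file with mode 0600; set it explicitly anyway.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path


def daemon_command(password_file: str, source: str, dest: str) -> list:
    """Build (but do not run) an rsync argument list that uses --password-file."""
    return ["rsync", "-av", f"--password-file={password_file}", source, dest]
```

A script would pass the returned list to its process-launching routine, for example with a source such as user@host::module/path (a hypothetical name).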
It is sometimes useful to be able to set up file transfers using rsync server capabilities on the remote machine, while still using ssh or rsh for transport. This is especially useful when you want to connect to a remote machine via ssh (for encryption or to get through a firewall), but you still want to have access to the rsync server features (see RUNNING AN RSYNC SERVER OVER A REMOTE SHELL PROGRAM, below).
From the user's perspective, using rsync in this way is the same as using it to connect to an rsync server, except that you must explicitly set the remote shell program on the command line with --rsh=COMMAND. (Setting RSYNC_RSH in the environment will not turn on this functionality.)
In order to distinguish between the remote-shell user and the rsync server user, you can use `-l user' on your remote-shell command:
rsync -av --rsh="ssh -l ssh-user" rsync-user@host::module[/path] local-path
The ssh-user will be used at the ssh level; the rsync-user will be used to check against the rsyncd.conf on the remote host.
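In a script, the same command can be built as an argument list, which avoids shell quoting of the --rsh value because the whole remote-shell command is already a single argument. This is a sketch only; the function name and the sample user, host and module names are placeholders.

```python
import shlex


def rsync_over_ssh_daemon(ssh_user, rsync_user, host, module_path, local_path):
    """Build (but do not run) an rsync command that reaches a server over ssh.

    ssh_user is used at the ssh level; rsync_user is the name the rsync
    server checks against its rsyncd.conf.
    """
    return [
        "rsync",
        "-av",
        f"--rsh=ssh -l {ssh_user}",  # one argument, embedded space included
        f"{rsync_user}@{host}::{module_path}",
        local_path,
    ]


command = rsync_over_ssh_daemon("ssh-user", "rsync-user", "host", "module/path", "local-path")
# shlex.join shows how the list would have to be quoted on a shell command line.
print(shlex.join(command))
```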
An rsync server is configured using a configuration file. Please see the rsyncd.conf(5) man page for more information. By default the configuration file is called /etc/rsyncd.conf, unless rsync is running over a remote shell program and is not running as root; in that case, the default name is rsyncd.conf in the current directory on the remote computer (typically $HOME).
See the rsyncd.conf(5) man page for full information on the rsync server configuration file.
Several configuration options will not be available unless the remote user is root (e.g. chroot, setuid/setgid, etc.). There is no need to configure inetd or the services map to include the rsync server port if you run an rsync server only via a remote shell program.
To run an rsync server out of a single-use ssh key, see this section in the rsyncd.conf(5) man page.
Here are some examples of how I use rsync.
To back up my wife's home directory, which consists of large MS Word files and mail folders, I use a cron job that runs
rsync -Cavz . arvidsjaur:backup
each night over a PPP connection to a duplicate directory on my machine "arvidsjaur".
To synchronize my samba source trees I use the following Makefile targets:
get:
	rsync -avuzb --exclude '*~' samba:samba/ .
put:
	rsync -Cavuzb . samba:samba/
sync: get put
This allows me to sync with a CVS directory at the other end of the connection. I then do cvs operations on the remote machine, which saves a lot of time as the remote cvs protocol isn't very efficient.
I mirror a directory between my old and new ftp sites with the command
rsync -az -e ssh --delete ~ftp/pub/samba/ nimbus:"~ftp/pub/tridge/samba"
This is launched from cron every few hours.
Here is a short summary of the options available in rsync. Please refer to the detailed description below for a complete description.
rsync uses the GNU long options package. Many of the command line options have two variants, one short and one long. These are shown below, separated by commas. Some options only have a long variant. The `=' for options that take a parameter is optional; whitespace can be used instead.
Note however that -a does not preserve hardlinks, because finding multiply-linked files is expensive. You must separately specify -H.
rsync foo/bar/foo.c remote:/tmp/
then this would create a file called foo.c in /tmp/ on the remote machine. If instead you used
rsync -R foo/bar/foo.c remote:/tmp/
then a file called /tmp/foo/bar/foo.c would be created on the remote machine -- the full path name is preserved.
In the current implementation, a difference of file format is always considered to be important enough for an update, no matter what date is on the objects. In other words, if the source has a directory or a symlink where the destination has a file, the transfer would occur regardless of the timestamps. This might change in the future (feel free to comment on this on the mailing list if you have an opinion).
This option is useful for transfer of large files with block-based changes or appended data, and also on systems that are disk bound, not network bound.
The option implies --partial (since an interrupted transfer does not delete the file), but conflicts with --partial-dir, --compare-dest, and --link-dest (a future rsync version will hopefully update the protocol to remove these restrictions).
WARNING: The file's data will be in an inconsistent state during the transfer (and possibly afterward if the transfer gets interrupted), so you should not use this option to update files that are in use. Also note that rsync will be unable to update a file inplace that is not writable by the receiving user.
Note that rsync can only detect hard links if both parts of the link are in the list of files being sent.
This option can be quite slow, so only use it if you need it.
Without this option, each new file gets its permissions set based on the source file's permissions and the umask at the receiving end, while all other files (including updated files) retain their existing permissions (which is the same behavior as other file-copy utilities, such as cp).
NOTE: Don't use this option when the destination is a Solaris tmpfs filesystem. It doesn't seem to handle seeks over null regions correctly and ends up corrupting the files.
This option has no effect if directory recursion is not selected.
This option can be dangerous if used incorrectly! It is a very good idea to run first using the dry run option (-n) to see what files would be deleted to make sure important files aren't listed.
If the sending side detects any I/O errors then the deletion of any files at the destination will be automatically disabled. This is to prevent temporary filesystem failures (such as NFS errors) on the sending side causing a massive deletion of files on the destination. You can override this with the --ignore-errors option.
If this option is used with [user@]host::module/path, then the remote shell COMMAND will be used to run an rsync server on the remote host, and all data will be transmitted through that remote shell connection, rather than through a direct socket connection to a running rsync server on the remote host. See the section CONNECTING TO AN RSYNC SERVER OVER A REMOTE SHELL PROGRAM above.
Command-line arguments are permitted in COMMAND provided that COMMAND is presented to rsync as a single argument. For example:
(Note that ssh users can alternately customize site-specific connect options in their .ssh/config file.)
You can also choose the remote shell program using the RSYNC_RSH environment variable, which accepts the same range of values as -e.
See also the --blocking-io option which is affected by this option.
The exclude list is initialized to:
RCS SCCS CVS CVS.adm RCSLOG cvslog.* tags TAGS .make.state .nse_depinfo *~ #* .#* ,* _$* *$ *.old *.bak *.BAK *.orig *.rej .del-* *.a *.olb *.o *.obj *.so *.exe *.Z *.elc *.ln core .svn/
then files listed in a $HOME/.cvsignore are added to the list and any files listed in the CVSIGNORE environment variable (all cvsignore names are delimited by whitespace).
Finally, any file is ignored if it is in the same directory as a .cvsignore file and matches one of the patterns listed therein. See the cvs(1) manual for more information.
You may use as many --exclude options on the command line as you like to build up the list of files to exclude.
See the EXCLUDE PATTERNS section for detailed information on this option.
The file names that are read from the FILE are all relative to the source dir -- any leading slashes are removed and no ".." references are allowed to go higher than the source dir. For example, take this command:
rsync -a --files-from=/tmp/foo /usr remote:/backup
If /tmp/foo contains the string "bin" (or even "/bin"), the /usr/bin directory will be created as /backup/bin on the remote host (but the contents of the /usr/bin dir would not be sent unless you specified -r or the names were explicitly listed in /tmp/foo). Also keep in mind that the effect of the (enabled by default) --relative option is to duplicate only the path info that is read from the file -- it does not force the duplication of the source-spec path (/usr in this case).
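The treatment of names read from the file can be sketched as follows. This models only what the text states (leading slashes dropped, no ".." escaping the source dir, path info duplicated under the destination). Raising an error for an escaping name and collapsing inner ".." components are assumptions of the sketch, and the function names are invented.

```python
import posixpath


def normalize_entry(entry: str) -> str:
    """Make a --files-from entry relative to the source dir."""
    stripped = entry.lstrip("/")  # leading slashes are removed
    normalized = posixpath.normpath(stripped) if stripped else "."
    # Assumption: a name that climbs above the source dir is rejected outright.
    if normalized == ".." or normalized.startswith("../"):
        raise ValueError(f"entry escapes the source dir: {entry!r}")
    return normalized


def remote_target(entry: str, dest: str) -> str:
    """Where the entry ends up under dest; the source-spec path is not duplicated."""
    return posixpath.join(dest, normalize_entry(entry))


# "bin" and "/bin" name the same directory relative to the source dir /usr.
print(remote_target("bin", "/backup"))
print(remote_target("/bin", "/backup"))
```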
In addition, the --files-from file can be read from the remote host instead of the local host if you specify a "host:" in front of the file (the host must match one end of the transfer). As a short-cut, you can specify just a prefix of ":" to mean "use the remote end of the transfer". For example:
rsync -a --files-from=:/path/file-list src:/ /tmp/copy
This would copy all the files specified in the /path/file-list file that was located on the remote src host.
rsync -av --link-dest=$PWD/prior_dir host:src_dir/ new_dir/
Like --compare-dest, if DIR is a relative path, it is relative to the destination directory. Note that rsync versions prior to 2.6.1 had a bug that could prevent --link-dest from working properly for a non-root user when -o was specified (or implied by -a). If the receiving rsync is not new enough, you can work around this bug by avoiding the -o option.
Note that this option typically achieves better compression ratios than can be achieved by using a compressing remote shell, or a compressing transport, as it takes advantage of the implicit information sent for matching data blocks.
By default rsync will use the username and groupname to determine what ownership to give files. The special uid 0 and the special group 0 are never mapped via user/group names even if the --numeric-ids option is not specified.
If a user or group has no name on the source system or it has no match on the destination system, then the numeric ID from the source system is used instead. See also the comments on the use chroot setting in the rsyncd.conf manpage for information on how the chroot setting affects rsync's ability to look up the names of the users and groups and what you can do about it.
If standard input is a socket then rsync will assume that it is being run via inetd, otherwise it will detach from the current terminal and become a background daemon. The daemon will read the config file (rsyncd.conf) on each connect made by a client and respond to requests accordingly. See the rsyncd.conf(5) man page for more details.
Rsync will create the dir if it is missing (just the last dir, not the whole path). This makes it easy to use a relative path (such as "--partial-dir=.rsync-partial") to have rsync create the partial-directory in the destination file's directory (rsync will also try to remove the DIR if a partial file was found to exist at the start of the transfer and the DIR was specified as a relative path).
If the partial-dir value is not an absolute path, rsync will also add an --exclude of this value at the end of all your existing excludes. This will prevent partial-dir files from being transferred and also prevent the untimely deletion of partial-dir items on the receiving side. An example: the above --partial-dir option would add an "--exclude=.rsync-partial/" rule at the end of any other include/exclude rules. Note that if you are supplying your own include/exclude rules, you may need to manually insert a rule for this directory exclusion somewhere higher up in the list so that it has a high enough priority to be effective (e.g., if your rules specify a trailing --exclude='*' rule, the auto-added rule will be ineffective).
IMPORTANT: the --partial-dir should not be writable by other users or it is a security risk. E.g. AVOID "/tmp".
You can also set the partial-dir value via the RSYNC_PARTIAL_DIR environment variable. Setting this in the environment does not force --partial to be enabled, but rather it affects where partial files go when --partial (or -P) is used. For instance, instead of specifying --partial-dir=.rsync-tmp along with --progress, you could set RSYNC_PARTIAL_DIR=.rsync-tmp in your environment and then just use the -P option to turn on the use of the .rsync-tmp dir for partial transfers. The only time the --partial option does not look for this environment value is when --inplace was also specified (since --inplace conflicts with --partial-dir).
When the file is transferring, the data looks like this:
782448 63% 110.64kB/s 0:00:04
This tells you the current file size, the percentage of the transfer that is complete, the current calculated file-completion rate (including both data over the wire and data being matched locally), and the estimated time remaining in this transfer.
After a file is complete, the data looks like this:
1238099 100% 146.38kB/s 0:00:08 (5, 57.1% of 396)
This tells you the final file size, that it's 100% complete, the final transfer rate for the file, the amount of elapsed time it took to transfer the file, and the addition of a total-transfer summary in parentheses. These additional numbers tell you how many files have been updated, and what percent of the total number of files has been scanned.
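A script that watches this output can pick the fields apart with a regular expression. The pattern below is a sketch written against the two sample lines shown above only; other rsync versions may format progress differently, and the field names are chosen here for illustration.

```python
import re

# Matches "SIZE PERCENT% RATE TIME" with an optional "(N, P% of TOTAL)" summary.
PROGRESS = re.compile(
    r"^\s*(?P<size>\d+)\s+(?P<percent>\d+)%\s+(?P<rate>\S+)\s+"
    r"(?P<time>\d+:\d{2}:\d{2})"
    r"(?:\s+\((?P<updated>\d+),\s*(?P<scanned>[\d.]+)% of (?P<total>\d+)\))?\s*$"
)


def parse_progress(line: str):
    """Return the fields of a progress line as a dict, or None if it does not match."""
    match = PROGRESS.match(line)
    if match is None:
        return None
    result = {
        "size": int(match["size"]),        # current or final file size
        "percent": int(match["percent"]),  # how much of this file is complete
        "rate": match["rate"],             # e.g. "110.64kB/s"
        "time": match["time"],             # remaining, or elapsed once complete
    }
    if match["updated"] is not None:
        # The total-transfer summary only appears once a file is complete.
        result["files_updated"] = int(match["updated"])
        result["percent_scanned"] = float(match["scanned"])
        result["total_files"] = int(match["total"])
    return result


print(parse_progress("782448 63% 110.64kB/s 0:00:04"))
print(parse_progress("1238099 100% 146.38kB/s 0:00:08 (5, 57.1% of 396)"))
```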
The exclude and include patterns specified to rsync allow for flexible selection of which files to transfer and which files to skip.
Rsync builds an ordered list of include/exclude options as specified on the command line. Rsync checks each file and directory name against each exclude/include pattern in turn. The first matching pattern is acted on. If it is an exclude pattern, then that file is skipped. If it is an include pattern then that filename is not skipped. If no matching include/exclude pattern is found then the filename is not skipped.
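The first-match rule can be illustrated with a short sketch. It uses Python's fnmatch as a stand-in for rsync's pattern matching, which is an approximation (rsync's own pattern rules are richer), and the function name and rule representation are invented for the example.

```python
from fnmatch import fnmatchcase


def is_skipped(name: str, rules) -> bool:
    """Apply an ordered list of (action, pattern) rules; the first match decides.

    action is "+" for an include pattern and "-" for an exclude pattern.
    """
    for action, pattern in rules:
        if fnmatchcase(name, pattern):
            return action == "-"
    # No matching pattern: the filename is not skipped.
    return False


rules = [("+", "*.c"), ("-", "*")]
print(is_skipped("main.c", rules))     # decided by the include rule
print(is_skipped("notes.txt", rules))  # falls through to the exclude rule
```

Reversing the two rules would skip main.c as well, since the exclude pattern would then match first.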
The filenames matched against the exclude/include patterns are relative to "the root of the transfer". If you think of the transfer as a subtree of names that are being sent from sender to receiver, the root is where the tree starts to be duplicated in the destination directory. This root governs where patterns that start with a / match (see below).
Because the matching is relative to the transfer-root, changing the trailing slash on a source path or changing your use of the --relative option affects the path you need to use in your matching (in addition to changing how much of the file tree is duplicated on the destination system). The following examples demonstrate this.
Let's say that we want to match two source files, one with an absolute path of "/home/me/foo/bar", and one with a path of "/home/you/bar/baz". Here is how the various command choices differ for a 2-source transfer:
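Example cmd: rsync -a /home/me /home/you /dest
+/- pattern: /me/foo/bar
+/- pattern: /you/bar/baz
Target file: /dest/me/foo/bar
Target file: /dest/you/bar/baz
Example cmd: rsync -a /home/me/ /home/you/ /dest
+/- pattern: /foo/bar (note missing "me")
+/- pattern: /bar/baz (note missing "you")
Target file: /dest/foo/bar
Target file: /dest/bar/baz
Example cmd: rsync -a --relative /home/me/ /home/you /dest
+/- pattern: /home/me/foo/bar (note full path)
+/- pattern: /home/you/bar/baz (ditto)
Target file: /dest/home/me/foo/bar
Target file: /dest/home/you/bar/baz
Example cmd: cd /home; rsync -a --relative me/foo you/ /dest
+/- pattern: /me/foo/bar (starts at specified path)
+/- pattern: /you/bar/baz (ditto)
Target file: /dest/me/foo/bar
Target file: /dest/you/bar/baz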
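The way the transfer root shifts with the trailing slash and with --relative can be worked out mechanically. The sketch below derives the name to use in a +/- pattern, and the resulting target file, for the example commands above; it models the rules as described in this section and is not taken from rsync itself. The function names are hypothetical.

```python
import posixpath


def transfer_name(src_arg: str, file_path: str, relative: bool = False) -> str:
    """Name of file_path as seen from the root of the transfer, with a leading /.

    src_arg is the source argument as typed; file_path is a file below it,
    written the same way (absolute, or relative to the current directory).
    """
    if relative:
        # --relative keeps the whole path that was given on the command line.
        return "/" + file_path.lstrip("/")
    if src_arg.endswith("/"):
        root = src_arg  # contents only: the source dir itself is the root
    else:
        root = posixpath.dirname(src_arg)  # by name: the parent dir is the root
    return "/" + posixpath.relpath(file_path, root)


def target_file(dest: str, name: str) -> str:
    """The destination path is the destination dir plus the transfer name."""
    return posixpath.join(dest, name.lstrip("/"))


for src, path, rel in [
    ("/home/me", "/home/me/foo/bar", False),
    ("/home/me/", "/home/me/foo/bar", False),
    ("/home/me/", "/home/me/foo/bar", True),
    ("me/foo", "me/foo/bar", True),
]:
    name = transfer_name(src, path, rel)
    print(name, "->", target_file("/dest", name))
```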
The easiest way to see what name you should include/exclude is to just look at the output when using --verbose and put a / in front of the name (use the --dry-run option if you're not yet ready to copy any files).
Note that, when using the --recursive (-r) option (which is implied by -a), every subcomponent of every path is visited from the top down, so include/exclude patterns get applied recursively to each subcomponent. The exclude patterns actually short-circuit the directory traversal stage when rsync finds the files to send. If a pattern excludes a particular parent directory, it can render a deeper include pattern ineffectual because rsync did not descend through that excluded section of the hierarchy.
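The short-circuit effect can be reproduced with a small model of the traversal. Each leading directory of a path is tested first, and an excluded directory hides everything beneath it. The sketch uses Python's fnmatch as an approximation of rsync's matching (its * also matches /), appends a trailing slash to directory names as an assumption of the model, and uses made-up rule sets and function names.

```python
from fnmatch import fnmatchcase


def excluded(name: str, rules) -> bool:
    """First matching (action, pattern) rule decides; "-" means exclude."""
    for action, pattern in rules:
        if fnmatchcase(name, pattern):
            return action == "-"
    return False


def is_transferred(path: str, rules) -> bool:
    """Visit every subcomponent of path from the top down.

    Directories are tested with a trailing slash; if any parent directory
    is excluded, the traversal never reaches the file below it.
    """
    parts = path.strip("/").split("/")
    for depth in range(1, len(parts) + 1):
        candidate = "/" + "/".join(parts[:depth])
        if depth < len(parts):
            candidate += "/"  # an intermediate component is a directory
        if excluded(candidate, rules):
            return False
    return True


blocked = [("+", "/some/path/file"), ("-", "*")]
opened = [("+", "*/"), ("+", "/some/path/file"), ("-", "*")]
print(is_transferred("/some/path/file", blocked))  # parent "/some/" is excluded first
print(is_transferred("/some/path/file", opened))   # directories are included first
```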
Note also that the --include and --exclude options take one pattern each. To add multiple patterns use the --include-from and --exclude-from options or multiple --include and --exclude options.
The patterns can take several forms. The rules are:
The +/- rules are most useful in a list that was read from a file, allowing you to have a single exclude list that contains both include and exclude options in the proper order.
Remember that the matching occurs at every step in the traversal of the directory hierarchy, so you must be sure that all the parent directories of the files you want to include are not excluded. This is particularly important when using a trailing `*' rule. For instance, this won't work:
This fails because the parent directory some is excluded by the `*' rule, so rsync never visits any of the files in the some or some/path directories. One solution is to ask for all directories in the hierarchy to be included by using a single rule: --include='*/' (put it somewhere before the --exclude='*' rule). Another solution is to add specific include rules for all the parent dirs that need to be visited. For instance, this set of rules works fine:
Here are some examples of exclude/include matching:
Note: Batch mode should be considered experimental in this version of rsync. The interface and behavior have now stabilized, though, so feel free to try this out.
Batch mode can be used to apply the same set of updates to many identical systems. Suppose one has a tree which is replicated on a number of hosts. Now suppose some changes have been made to this source tree and those changes need to be propagated to the other hosts. In order to do this using batch mode, rsync is run with the write-batch option to apply the changes made to the source tree to one of the destination trees. The write-batch option causes the rsync client to store in a batch file all the information needed to repeat this operation against other, identical destination trees.
To apply the recorded changes to another destination tree, run rsync with the read-batch option, specifying the name of the same batch file, and the destination tree. Rsync updates the destination tree using the information stored in the batch file.
For convenience, one additional file is created when the write-batch option is used. This file's name is created by appending .sh to the batch filename. The .sh file contains a command-line suitable for updating a destination tree using that batch file. It can be executed using a Bourne(-like) shell, optionally passing in an alternate destination tree pathname which is then used instead of the original path. This is useful when the destination tree path differs from the original destination tree path.
Generating the batch file once saves having to perform the file status, checksum, and data block generation more than once when updating multiple destination trees. Multicast transport protocols can be used to transfer the batch update files in parallel to many hosts at once, instead of sending the same data to every host individually.
Examples:
$ rsync --write-batch=foo -a host:/source/dir/ /adest/dir/
$ scp foo* remote:
$ ssh remote ./foo.sh /bdest/dir/
$ rsync --write-batch=foo -a /source/dir/ /adest/dir/
$ ssh remote rsync --read-batch=- -a /bdest/dir/ <foo
In these examples, rsync is used to update /adest/dir/ from /source/dir/ and the information to repeat this operation is stored in "foo" and "foo.sh". The host "remote" is then updated with the batched data going into the directory /bdest/dir. The differences between the two examples reveal some of the flexibility you have in how you deal with batches:
Caveats:
The read-batch option expects the destination tree that it is updating to be identical to the destination tree that was used to create the batch update fileset. When a difference between the destination trees is encountered the update might be discarded with no error (if the file appears to be up-to-date already) or the file-update may be attempted and then, if the file fails to verify, the update discarded with an error. This means that it should be safe to re-run a read-batch operation if the command got interrupted. If you wish to force the batched-update to always be attempted regardless of the file's size and date, use the -I option (when reading the batch). If an error occurs, the destination tree will probably be in a partially updated state. In that case, rsync can be used in its regular (non-batch) mode of operation to fix up the destination tree.
The rsync version used on all destinations must be at least as new as the one used to generate the batch file. Rsync will die with an error if the protocol version in the batch file is too new for the batch-reading rsync to handle.
The --dry-run (-n) option does not work in batch mode and yields a runtime error.
When reading a batch file, rsync will force the value of certain options to match the data in the batch file if you didn't set them to the same as the batch-writing command. Other options can (and should) be changed. For instance --write-batch changes to --read-batch, --files-from is dropped, and the --include/--exclude options are not needed unless --delete is specified without --delete-excluded.
The code that creates the BATCH.sh file transforms any include/exclude options into a single list that is appended as a here document to the shell script file. An advanced user can use this to modify the exclude list if a change in what gets deleted by --delete is desired. A normal user can ignore this detail and just use the shell script as an easy way to run the appropriate --read-batch command for the batched data.
The original batch mode in rsync was based on "rsync+", but the latest version uses a new implementation.
Three basic behaviors are possible when rsync encounters a symbolic link in the source directory.
By default, symbolic links are not transferred at all. A message "skipping non-regular file" is emitted for any symlinks that exist.
If --links is specified, then symlinks are recreated with the same target on the destination. Note that --archive implies --links.
If --copy-links is specified, then symlinks are collapsed by copying their referent, rather than the symlink.
rsync also distinguishes "safe" and "unsafe" symbolic links. An example where this might be used is a web site mirror that wishes to ensure the rsync module they copy does not include symbolic links to /etc/passwd in the public section of the site. Using --copy-unsafe-links will cause any links to be copied as the file they point to on the destination. Using --safe-links will cause unsafe links to be omitted altogether.
Symbolic links are considered unsafe if they are absolute symlinks (start with /), empty, or if they contain enough .. components to ascend from the directory being copied.
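The three conditions can be expressed as a small classifier. This is a sketch of the definition as worded above, not rsync's own test, which may treat some edge cases differently. The function name and the link_depth parameter (how many directories below the top of the copied tree the link itself sits) are assumptions of the sketch.

```python
def is_unsafe_symlink(target: str, link_depth: int) -> bool:
    """Classify a symlink target as unsafe according to the rule in the text.

    link_depth is the number of directories between the top of the tree
    being copied and the directory that holds the symlink (0 for the top).
    """
    if target == "" or target.startswith("/"):
        return True  # empty or absolute targets are unsafe
    depth = link_depth
    for part in target.split("/"):
        if part == "..":
            depth -= 1
            if depth < 0:
                return True  # climbed out of the directory being copied
        elif part not in ("", "."):
            depth += 1
    return False


print(is_unsafe_symlink("/etc/passwd", 2))        # absolute
print(is_unsafe_symlink("../sibling/file", 1))    # stays inside the tree
print(is_unsafe_symlink("../../etc/passwd", 1))   # ascends past the top
```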
rsync occasionally produces error messages that may seem a little cryptic. The one that seems to cause the most confusion is "protocol version mismatch - is your shell clean?".
This message is usually caused by your startup scripts or remote shell facility producing unwanted garbage on the stream that rsync is using for its transport. The way to diagnose this problem is to run your remote shell like this:
ssh remotehost /bin/true > out.dat
then look at out.dat. If everything is working correctly then out.dat should be a zero length file. If you are getting the above error from rsync then you will probably find that out.dat contains some text or data. Look at the contents and try to work out what is producing it. The most common cause is incorrectly configured shell startup scripts (such as .cshrc or .profile) that contain output statements for noninteractive logins.
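The check on out.dat is easy to automate once the file has been captured with the command above. A minimal sketch, assuming the capture already exists; the function names and the preview length are arbitrary choices.

```python
import os


def shell_is_clean(capture_path: str) -> bool:
    """True if the captured remote-shell output is a zero length file."""
    return os.path.getsize(capture_path) == 0


def preview(capture_path: str, limit: int = 200) -> bytes:
    """Return the first bytes of unexpected output, to help find what printed them."""
    with open(capture_path, "rb") as handle:
        return handle.read(limit)
```

A caller would report preview("out.dat") whenever shell_is_clean("out.dat") is false, then look through the startup scripts for the statement that produced it.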
If you are having trouble debugging include and exclude patterns, then try specifying the -vv option. At this level of verbosity rsync will show why each individual file is included or excluded.
CVSIGNORE
The CVSIGNORE environment variable supplements any ignore patterns
in .cvsignore files. See the --cvs-exclude option for more
details.
USER or LOGNAME
The USER or LOGNAME environment variables are used to determine
the default username sent to an rsync server. If neither is
set, the username defaults to "nobody".
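The fallback can be sketched as a lookup. The text does not say which variable takes precedence, so checking USER before LOGNAME and treating an empty value as unset are both assumptions of this example, as is the function name.

```python
import os


def default_rsync_user(environ=None) -> str:
    """Pick the default username sent to an rsync server.

    Assumption: USER is consulted before LOGNAME, and empty values count as unset.
    """
    env = os.environ if environ is None else environ
    return env.get("USER") or env.get("LOGNAME") or "nobody"


print(default_rsync_user({"LOGNAME": "alice"}))
print(default_rsync_user({}))
```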
/etc/rsyncd.conf or rsyncd.conf
times are transferred as unix time_t values
When transferring to FAT filesystems rsync may re-sync unmodified files. See the comments on the --modify-window option.
file permissions, devices, etc. are transferred as native numerical values
see also the comments on the --delete option
Please report bugs! See the website at http://rsync.samba.org/
rsync is distributed under the GNU public license. See the file COPYING for details.
A WEB site is available at http://rsync.samba.org/. The site includes an FAQ-O-Matic which may cover questions unanswered by this manual page.
The primary ftp site for rsync is ftp://rsync.samba.org/pub/rsync.
We would be delighted to hear from you if you like this program.
This program uses the excellent zlib compression library written by Jean-loup Gailly and Mark Adler.
Thanks to Richard Brent, Brendan Mackay, Bill Waite, Stephen Rothwell and David Bell for helpful suggestions, patches and testing of rsync. I've probably missed some people, my apologies if I have.
Especial thanks also to: David Dykstra, Jos Backus, Sebastian Krahmer, Martin Pool, Wayne Davison, J.W. Schultz.
rsync was originally written by Andrew Tridgell and Paul Mackerras. Many people have later contributed to it.
Mailing lists for support and development are available at http://lists.samba.org