Simple Rsync Tutorial
Rsync is great software. It’s mature (read: mostly bug-free), extremely performant, and extremely well documented. The problem is of course that it’s too well documented: I can’t figure out what options I want to just transfer some files from a remote to my local. This tutorial is my attempt to make sense of the option list and justify which options should and should not be included in my usage. Hopefully this is helpful to other people. I know it will be helpful to me.
I’m going to cover widely-used options first, then as I continue to use rsync, I will put miscellaneous options in where necessary.
Archive with -a, --archive
Just means -rlptgoD (no -H,-A,-X). You want recursion (-r) and want to preserve almost everything (but not -H, which means not preserving hard links). If you specify --files-from, then -r (recursive) is not implied.
I will explain these options quickly here rather than making you (me) scroll to them.
-l, --links
copies symlinks (as symlinks)
-p, --perms
copies permissions.
-t, --times
copies modification times.
-g, --group
copies group (of a file).
-o, --owner
copies owner (of a file).
-D
is the same as --devices --specials, which just means to preserve device files and special files (not sure what this implies).
--archive notably does not include:
-H, --hard-links
copies hard links.
-A, --acls
copies ACLs (which implies -p, --perms). I don’t know what ACLs are.
-X, --xattrs
copies extended attributes. Not sure what these are either.
-i, --itemize-changes
This option is really powerful. It is the same as --out-format='%i %n%L' but I really like the information it provides. From the man pages:
The "%i" escape has a cryptic output that is 11 letters long. The general format is like the string YXcstpoguax, where Y is replaced by the type of update being done, X is replaced by the file-type, and the other letters represent attributes that may be output if they are being modified.
I agree that the output is cryptic, but armed with the right information, it is extremely useful. I am copying from the man pages here (and reformatting for markdown):
The update types that replace the Y are as follows:
- A < means that a file is being transferred to the remote host (sent).
- A > means that a file is being transferred to the local host (received).
- A c means that a local change/creation is occurring for the item (such as the creation of a directory or the changing of a symlink, etc.).
- A h means that the item is a hard link to another item (requires --hard-links).
- A . means that the item is not being updated (though it might have attributes that are being modified).
- A * means that the rest of the itemized-output area contains a message (e.g. "deleting").
The file-types that replace the X are:
- f for a file
- d for a directory
- L for a symlink
- D for a device
- S for a special file (e.g. named sockets and fifos).
The other letters in the string above are the actual letters that will be output if the associated attribute for the item is being updated or a "." for no change. Three exceptions to this are:
- A newly created item replaces each letter with a "+",
- An identical item replaces the dots with spaces, and
- An unknown attribute replaces each letter with a "?" (this can happen when talking to an older rsync).
The attribute that is associated with each letter is as follows:
- A c means either that a regular file has a different checksum (requires --checksum) or that a symlink, device, or special file has a changed value. Note that if you are sending files to an rsync prior to 3.0.1, this change flag will be present only for checksum-differing regular files.
- A s means the size of a regular file is different and will be updated by the file transfer.
- A t means the modification time is different and is being updated to the sender’s value (requires --times). An alternate value of T means that the modification time will be set to the transfer time, which happens when a file/symlink/device is updated without --times and when a symlink is changed and the receiver can’t set its time. (Note: when using an rsync 3.0.0 client, you might see the s flag combined with t instead of the proper T flag for this time-setting failure.)
- A p means the permissions are different and are being updated to the sender’s value (requires --perms).
- An o means the owner is different and is being updated to the sender’s value (requires --owner and super-user privileges).
- A g means the group is different and is being updated to the sender’s value (requires --group and the authority to set the group).
- The u slot is reserved for future use.
- The a means that the ACL information changed.
- The x means that the extended attribute information changed.
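The tables above are mechanical enough to script. This is a rough sketch of a decoder in plain shell. The function name explain_itemized and its output wording are my own invention, not part of rsync, and it only covers the cases listed above.

```sh
# explain_itemized: hypothetical helper, not part of rsync.
# Takes one 11-character %i string and prints what each position means.
explain_itemized() {
    item=$1

    # Position 1 (Y): the update type.
    case $(printf '%s' "$item" | cut -c1) in
        '<') echo "update: sent to remote host" ;;
        '>') echo "update: received by local host" ;;
        c)   echo "update: local change/creation" ;;
        h)   echo "update: hard link" ;;
        .)   echo "update: none (attributes may change)" ;;
        '*') echo "update: message follows"; return ;;
    esac

    # Position 2 (X): the file type.
    case $(printf '%s' "$item" | cut -c2) in
        f) echo "type: file" ;;
        d) echo "type: directory" ;;
        L) echo "type: symlink" ;;
        D) echo "type: device" ;;
        S) echo "type: special file" ;;
    esac

    # Positions 3-11: a newly created item shows nine plus signs.
    case $(printf '%s' "$item" | cut -c3-11) in
        +++++++++) echo "new item"; return ;;
    esac

    # Otherwise walk the attribute slots in the order c s t p o g u a x.
    pos=3
    for name in checksum size mtime perms owner group reserved acl xattr; do
        flag=$(printf '%s' "$item" | cut -c"$pos")
        case $flag in
            .|' '|'') ;;                          # no change
            '?') echo "unknown: $name" ;;
            *)   echo "changed: $name ($flag)" ;;
        esac
        pos=$((pos + 1))
    done
}

explain_itemized ">f.st......"
```

For that sample string the helper reports a received file whose size and mtime changed. Piping real rsync -i output through it would need the first field split off each line first.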
I found these options extremely useful for figuring out what rsync was doing. For example, I realized I didn’t need the times to be transferred, so I turned off --archive and instead used --recursive.
-z, --compress
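This compresses file data during the transfer. You almost always want this when copying over a network, though it helps little for files that are already compressed (images, archives).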
-v, --verbose
Copied from the man pages:
This option increases the amount of information you are given during the transfer. By default, rsync works silently. A single -v will give you information about what files are being transferred and a brief summary at the end. Two -v options will give you information on what files are being skipped and slightly more information at the end. More than two -v options should only be used if you are debugging rsync.
Basically, use --verbose to see at least some information.
Combinations I’ve Used
# Dry run.
rsync \
--recursive \
--compress \
--verbose \
--itemize-changes \
--human-readable \
--dry-run \
pitzer:~/projects/relics .
# Actual transfer.
rsync \
--recursive \
--compress \
--verbose \
--itemize-changes \
--human-readable \
pitzer:~/projects/relics .
I wanted to copy all experiments (relics) from a compute cluster (pitzer) to my local (.). I recursively traversed the relics/ directory, compressed all files, and itemized the changes (using the above information to interpret the output). First I used --dry-run to see what would happen. Doing this made me realize I just wanted --recursive rather than --archive.
Closing
This is probably going to be a living document as I use rsync more.
I also think rsync could use a nicer CLI experience, similar to the way git has been slowly getting a makeover. Maybe splitting rsync into subcommands could help? But it’s all one concept. I’m not sure, but the current barrage of options is quite intimidating.
Sam Stevens, 2024