Behold, My Stuff


Simple Rsync Tutorial

Rsync is great software. It’s mature (read: mostly bug-free), extremely performant, and extremely well documented. The problem is of course that it’s too well documented: I can’t figure out which options I need just to transfer some files from a remote machine to my local one. This tutorial is my attempt to make sense of the option list and justify which options should and should not be included in my usage. Hopefully this is helpful to other people. I know it will be helpful to me.

I’m going to cover widely-used options first, then as I continue to use rsync, I will put miscellaneous options in where necessary.

Archive with -a, --archive

This is shorthand for -rlptgoD; it does not include -H, -A, or -X. You want recursion (-r) and you want to preserve almost everything (but not hard links, which need -H). The one exception: if you specify --files-from, then -r (recursive) is not implied.

I will explain these options quickly here rather than making you (me) scroll to them.

-l, --links copies symlinks (as symlinks)

-p, --perms copies permissions.

-t, --times copies modification times.

-g, --group copies group (of a file).

-o, --owner copies owner (of a file).

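-D is the same as --devices --specials, which preserves device files (character and block devices) and special files (such as named sockets and FIFOs). Rsync only recreates device files when the receiving side runs as the super-user, so for ordinary file transfers this option rarely matters.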
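To see what -a adds over plain -r, here is a small local sketch that checks whether an old modification time survives the copy. The scratch directory, the file names (notes.txt, ref), and the helper still_old are all made up for illustration, and the script skips the rsync calls if rsync is not installed.

```shell
#!/bin/sh
# Compare -r with -a on a throwaway local tree (all paths are scratch names).
work=$(mktemp -d)
mkdir -p "$work/src"
echo "hello" > "$work/src/notes.txt"

# Give the source file an old modification time: 1 Jan 2000, 00:00.
touch -t 200001010000 "$work/src/notes.txt"
# Reference file dated one day later, used to ask "is the copy still old?"
touch -t 200001020000 "$work/ref"

# Hypothetical helper: prints "yes" if the file is not newer than the reference.
still_old() {
  if [ -n "$(find "$1" ! -newer "$work/ref")" ]; then echo yes; else echo no; fi
}

kept_by_r=unknown
kept_by_a=unknown
if command -v rsync >/dev/null 2>&1; then
  rsync -r "$work/src/" "$work/dst_r/"   # recursion only
  rsync -a "$work/src/" "$work/dst_a/"   # recursion plus -lptgoD
  kept_by_r=$(still_old "$work/dst_r/notes.txt")
  kept_by_a=$(still_old "$work/dst_a/notes.txt")
fi

echo "old mtime kept by -r: $kept_by_r"
echo "old mtime kept by -a: $kept_by_a"
rm -rf "$work"
```

With -r alone the copy should get the time of the transfer, while -a (which includes -t) should keep the year-2000 timestamp.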

--archive notably does not include

-H, --hard-links copies hard links.

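-A, --acls copies ACLs (which implies -p, --perms). ACLs are access control lists: finer-grained permissions than the usual owner/group/other bits.

-X, --xattrs copies extended attributes, which are extra name/value metadata that some filesystems attach to files.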
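Since -H is the excluded option I am most likely to trip over, here is a minimal local sketch of what it changes. The scratch directory and the file names a and b are invented for the example, and the rsync calls are skipped if rsync is not installed.

```shell
#!/bin/sh
# Show that -a alone does not carry hard links over, while -aH does.
work=$(mktemp -d)
mkdir -p "$work/src"
echo "data" > "$work/src/a"
ln "$work/src/a" "$work/src/b"   # a and b are now two names for one file

links_a=unknown
links_aH=unknown
if command -v rsync >/dev/null 2>&1; then
  rsync -a  "$work/src/" "$work/dst_a/"
  rsync -aH "$work/src/" "$work/dst_aH/"
  # The second column of `ls -l` is the hard-link count of the file.
  links_a=$(ls -l "$work/dst_a/a" | awk '{print $2}')
  links_aH=$(ls -l "$work/dst_aH/a" | awk '{print $2}')
fi

echo "link count with -a:  $links_a"
echo "link count with -aH: $links_aH"
rm -rf "$work"
```

The first count should be 1 (two independent copies) and the second should be 2 (one file with two names).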

-i, --itemize-changes

This option is really powerful. It is shorthand for --out-format='%i %n%L', and I really like the information it provides. From the man pages:

The "%i" escape has a cryptic output that is 11 letters long. The general format is like the string YXcstpoguax, where Y is replaced by the type of update being done, X is replaced by the file-type, and the other letters represent attributes that may be output if they are being modified.

I agree that the output is cryptic, but armed with the right information, it is extremely useful. I am copying from the man pages here (and reformatting for markdown):

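The update types that replace the Y are as follows:

  - A < means that a file is being transferred to the remote host (sent).
  - A > means that a file is being transferred to the local host (received).
  - A c means that a local change/creation is occurring for the item (such as the creation of a directory or the changing of a symlink).
  - An h means that the item is a hard link to another item (requires --hard-links).
  - A . means that the item is not being updated (though it might have attributes that are being modified).
  - A * means that the rest of the itemized-output area contains a message (e.g. "deleting").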

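The file-types that replace the X are:

  - f for a file,
  - d for a directory,
  - L for a symlink,
  - D for a device, and
  - S for a special file (e.g. named sockets and fifos).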

The other letters in the string above are the actual letters that will be output if the associated attribute for the item is being updated or a "." for no change. Three exceptions to this are:

  1. A newly created item replaces each letter with a "+",
  2. An identical item replaces the dots with spaces, and
  3. An unknown attribute replaces each letter with a "?" (this can happen when talking to an older rsync).

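The attribute that is associated with each letter is as follows:

  - c means either that a regular file has a different checksum (requires --checksum) or that a symlink, device, or special file has a changed value.
  - s means the size of a regular file is different and will be updated by the file transfer.
  - t means the modification time is different and is being updated to the sender’s value (requires --times). A T instead means the modification time will be set to the transfer time.
  - p means the permissions are different and are being updated to the sender’s value (requires --perms).
  - o means the owner is different and is being updated to the sender’s value (requires --owner and super-user privileges).
  - g means the group is different and is being updated to the sender’s value (requires --group and the authority to set the group).
  - u is reserved for future use in older rsync releases; newer releases use this slot for access-time and creation-time changes.
  - a means that the ACL information changed.
  - x means that the extended attribute information changed.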

I found this output extremely useful for figuring out what rsync was doing. For example, I realized I didn’t need the times to be transferred, so I turned off --archive and instead used --recursive.
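To make the itemized strings easier to read, here is a rough sketch of a decoder for the first two characters, with the update-type and file-type meanings taken from the rsync man page. The function name describe_itemized is my own invention, not part of rsync, and it only lists the remaining attribute letters rather than explaining each one.

```shell
#!/bin/sh
# Decode an --itemize-changes string such as ">f.st......".
# describe_itemized is a hypothetical helper, not an rsync feature.
describe_itemized() {
  code=$1
  update=$(printf '%s\n' "$code" | cut -c1)   # Y: type of update
  ftype=$(printf '%s\n' "$code" | cut -c2)    # X: file type
  attrs=$(printf '%s\n' "$code" | cut -c3-)   # remaining attribute slots

  case $update in
    '<') u="sent" ;;
    '>') u="received" ;;
    c)   u="local change or creation" ;;
    h)   u="hard link" ;;
    .)   u="not updated" ;;
    '*') u="message" ;;
    *)   u="unknown update type" ;;
  esac

  case $ftype in
    f) t="file" ;;
    d) t="directory" ;;
    L) t="symlink" ;;
    D) t="device" ;;
    S) t="special file" ;;
    *) t="unknown file type" ;;
  esac

  # Drop the "no change" dots and the "identical" spaces; keep what is left.
  changed=$(printf '%s\n' "$attrs" | tr -d '. ')
  case $changed in
    '')  changes="no attribute changes" ;;
    *+*) changes="newly created" ;;
    *)   changes="changed: $changed" ;;
  esac

  printf '%s: %s (%s)\n' "$u" "$t" "$changes"
}

describe_itemized '>f.st......'   # prints: received: file (changed: st)
describe_itemized 'cd+++++++++'   # prints: local change or creation: directory (newly created)
```

The letters after "changed:" can then be looked up by hand: in the first example, s and t mean the size and modification time differ.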

-z, --compress

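This compresses file data during the transfer. You usually want this when the network is the bottleneck. On a fast local network, or when the files are already compressed (images, archives), it can cost more CPU time than it saves.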

-v, --verbose

Copied from the man pages:

This option increases the amount of information you are given during the transfer. By default, rsync works silently. A single -v will give you information about what files are being transferred and a brief summary at the end. Two -v options will give you information on what files are being skipped and slightly more information at the end. More than two -v options should only be used if you are debugging rsync.

Basically, use --verbose to see at least some information.

Combinations I’ve Used

# Dry run.
rsync \
  --recursive \
  --compress \
  --verbose \
  --itemize-changes \
  --human-readable \
  --dry-run \
  pitzer:~/projects/relics .

# Actual transfer.
rsync \
  --recursive \
  --compress \
  --verbose \
  --itemize-changes \
  --human-readable \
  pitzer:~/projects/relics .

I wanted to copy all experiments (relics) from a compute cluster (pitzer) to my local machine (.). I recursively traversed the relics/ directory, compressed the data in transit, and itemized the changes (using the above information to interpret the output).

First I used --dry-run to see what would happen. Doing this made me realize I just wanted --recursive rather than --archive.
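The dry-run-then-transfer routine can be wrapped in a small function so that both runs are guaranteed to use the same options. This is only a sketch: the function name sync_with_preview, the RSYNC override variable, and the confirmation prompt are my own additions, not anything rsync provides.

```shell
#!/bin/sh
# Run rsync as a dry run first, then ask before doing the real transfer.
# Usage: sync_with_preview SOURCE DEST
# Set RSYNC to use a different rsync binary (hypothetical convenience variable).
sync_with_preview() {
  src=$1
  dest=$2
  # Options shared by the dry run and the real transfer.
  set -- --recursive --compress --verbose --itemize-changes --human-readable

  echo "== Dry run =="
  "${RSYNC:-rsync}" "$@" --dry-run "$src" "$dest" || return 1

  printf 'Proceed with the real transfer? [y/N] '
  read -r answer
  case $answer in
    y|Y) "${RSYNC:-rsync}" "$@" "$src" "$dest" ;;
    *)   echo "Aborted." ;;
  esac
}

# Example call, matching the commands above (left commented out on purpose):
# sync_with_preview 'pitzer:~/projects/relics' .
```

Keeping the option list in one place means the dry run cannot drift from the real command, which is the whole point of previewing.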

Closing

This is probably going to be a living document as I use rsync more.

I also think rsync could use a nicer CLI experience, similar to the way git has been slowly getting a makeover. Maybe splitting rsync into subcommands could help? But it’s all one concept. I’m not sure, but the current barrage of options is quite intimidating.



Sam Stevens, 2024