
Sysadmin tools: Using rsync to manage backup, restore, and file synchronization

Rsync is a command-line tool for copying files and directories between local and remote systems that should be in every Linux sysadmin's toolbox.

As a sysadmin, I spend most of my energy on two things (other than making sure there is coffee): Worrying about having backups and figuring out the simplest, best way to do things. One of my favorite tools for solving both problems is called rsync.

Rsync was created by one of the same people who invented Samba, Andrew Tridgell. It is such a useful and flexible tool that it's included with nearly every Linux distribution and has been ported to other operating systems. Most simply, rsync is a tool for copying files. However, it's much more powerful than that.

  • It keeps two sets of files up to date and synchronized.
  • It runs as a command and can be scripted.
  • It can compress the data stream (with -z), and the stream is encrypted when it runs over SSH.
  • It uses multiple types of remote access clients (SSH and RSH, for example).

So it's no surprise that it's a favorite of many systems administrators.

The basics

Like the mv and cp commands, in its most basic form rsync just needs a source and a destination:

[root@milo enable]# rsync ./foo/testfoo ./bar/

[root@milo enable]# ls -ilR
.:
total 8
5079202 drwxrwxr-x 2 skipworthy skipworthy 4096 Jun 11 15:15 bar
5079201 drwxrwxr-x 2 skipworthy skipworthy 4096 Jun 11 15:08 foo

./bar:
total 8
5001398 -rw-rw-r-- 1 skipworthy skipworthy 8 Jun 11 15:08 testbar
4982446 -rw-rw-r-- 1 root       root       8 Jun 11 15:15 testfoo

./foo:
total 4
5001268 -rw-rw-r-- 1 skipworthy skipworthy 8 Jun 11 15:08 testfoo

We copied testfoo to the bar directory. No big deal, really.

Now, let's add a file to ./foo and re-sync:

[root@milo enable]# touch ./foo/bat.txt

[root@milo enable]# rsync ./foo/* ./bar/
[root@milo enable]# ls -ilR
.:
total 8
5079202 drwxrwxr-x 2 skipworthy skipworthy 4096 Jun 11 15:45 bar
5079201 drwxrwxr-x 2 skipworthy skipworthy 4096 Jun 11 15:25 foo

./bar:
total 8
4992599 -rw-r--r-- 1 root       root       0 Jun 11 15:45 bat.txt
5001398 -rw-rw-r-- 1 skipworthy skipworthy 8 Jun 11 15:08 testbar
4992604 -rw-rw-r-- 1 root       root       8 Jun 11 15:45 testfoo

./foo:
total 4
5002591 -rw-r--r-- 1 root       root       0 Jun 11 15:25 bat.txt
5001268 -rw-rw-r-- 1 skipworthy skipworthy 8 Jun 11 15:08 testfoo

At this point, we want to note a couple of things. First, when we re-ran rsync, it re-copied testfoo and updated the mtime (the timestamp that ls shows). Also, each time it copies a file, it gives the file a new inode number, because rsync writes a fresh copy and then renames it into place. Therefore, as far as the filesystem is concerned, it's a totally different file (because it is: all the information was copied each time). Finally, note that when we rsync the file without any switches, the copy is owned by the user who executed the command (root, in this case).
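The re-copy behavior is easy to reproduce in a scratch directory. The following is a minimal sketch, assuming rsync is installed and run as a regular user; the temporary directory layout and the inode_of helper are illustrative names, not part of the session shown above.

```shell
#!/bin/sh
# Scratch area so nothing real is touched (hypothetical layout).
work=$(mktemp -d)
mkdir "$work/foo" "$work/bar"
echo 'testfoo' > "$work/foo/testfoo"
# Give the source an old mtime so it never matches a fresh copy by accident.
touch -t 202001010000 "$work/foo/testfoo"

# Helper (assumed name): print the inode number of one file.
inode_of() { ls -i "$1" | awk '{print $1}'; }

# Plain rsync: timestamps are not preserved, so every run re-copies the file.
rsync "$work"/foo/* "$work/bar/"
first=$(inode_of "$work/bar/testfoo")
rsync "$work"/foo/* "$work/bar/"
second=$(inode_of "$work/bar/testfoo")

# With -a the mtime is carried over, so the next run can skip the file.
rsync -a "$work"/foo/* "$work/bar/"
third=$(inode_of "$work/bar/testfoo")
rsync -a "$work"/foo/* "$work/bar/"
fourth=$(inode_of "$work/bar/testfoo")

echo "plain rsync: $first -> $second"
echo "rsync -a:    $third -> $fourth"
rm -rf "$work"
```

On a typical Linux filesystem the first pair of inode numbers should differ, while the second pair should be identical because the last run found nothing to transfer.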

All this is important if we want to make backups. This default behavior is the same as the cp command. We can also use the cp command to copy directories recursively, as well as preserve attributes and ownership. The big difference is in how each tool decides what to copy: cp copies everything it is given (or, with -u, compares only the mtime value), while rsync skips files whose size and mtime already match on the destination (provided the timestamps were preserved with -t or -a) and can compare checksums of the source and destination files when asked to with -c. Rsync's additional functionality is useful for preserving the backup's integrity (we'll get into integrity later in this series).

So let's update just one of these files and see what rsync does:

[root@milo enable]# echo 'this is new text'>>./foo/testfoo
 
[root@milo enable]# ls -al ./foo
 
-rw-rw-r-- 1 skipworthy skipworthy   25 Jun 11 16:13 testfoo

[root@milo enable]# rsync -aruv ./foo/* ./bar/
sending incremental file list
testfoo

sent 194 bytes  received 35 bytes  458.00 bytes/sec
total size is 25  speedup is 0.11

[root@milo enable]# ls -ilR
.:
total 8
5079202 drwxrwxr-x 2 skipworthy skipworthy 4096 Jun 11 16:16 bar
5079201 drwxrwxr-x 2 skipworthy skipworthy 4096 Jun 11 15:56 foo

./bar:
total 8
4992599 -rw-r--r-- 1 root       root        0 Jun 11 15:45 bat.txt
4998080 -rw-r--r-- 1 root       root        0 Jun 11 15:56 footoo.txt
5001398 -rw-rw-r-- 1 skipworthy skipworthy  8 Jun 11 15:08 testbar
4983541 -rw-rw-r-- 1 skipworthy skipworthy 25 Jun 11 16:13 testfoo

./foo:
total 4
5002591 -rw-r--r-- 1 root       root        0 Jun 11 15:25 bat.txt
4997949 -rw-rw-r-- 1 skipworthy skipworthy  0 Jun 11 15:56 footoo.txt
5001268 -rw-rw-r-- 1 skipworthy skipworthy 25 Jun 11 16:13 testfoo

Note that this time we used some switches:

  • -a Archive mode; shorthand for -rlptgoD, which recurses and preserves symlinks, permissions, mtime, owner, group, and device and special files.
  • -r Recursive mode, drills down into any directories and syncs those (redundant here because -a already implies it, but I always specify it anyway).
  • -u Update mode, skips any file whose copy on the destination has a newer mtime than the source.
  • -v Verbose mode, tells you what it's doing (it's always nice to be able to monitor what's happening. Another useful trick is to redirect this output to a file and check it later).

Restore file with rsync

So let's pretend it's a few weeks later. The CFO calls and says something's wrong—many files are missing from his /foo directory.

./foo:
total 8
5002591 -rw-r--r-- 1 root       root        0 Jun 11 15:25 bat.txt
4997949 -rw-rw-r-- 1 skipworthy skipworthy 33 Jul 24 15:32 footoo.txt
5001268 -rw-rw-r-- 1 skipworthy skipworthy 25 Jun 11 16:13 testfoo

We take a look at the backups, and see the missing files:

./bar:
total 12
4992599 -rw-r--r-- 1 root       root        0 Jun 11 15:45 bat.txt
4994298 -rw-rw-r-- 1 skipworthy skipworthy 33 Jul 24 15:32 footoo.txt
4994359 -rw-r--r-- 1 root       root        0 Jul 24 15:31 laterfiles1.txt
4994367 -rw-r--r-- 1 root       root        0 Jul 24 15:31 laterfiles2.txt
4994374 -rw-r--r-- 1 root       root        0 Jul 24 15:31 laterfiles3.txt
4994413 -rw-r--r-- 1 root       root        0 Jul 24 15:31 laterfiles4.txt
5001398 -rw-rw-r-- 1 skipworthy skipworthy  8 Jun 11 15:08 testbar
4983541 -rw-rw-r-- 1 skipworthy skipworthy 25 Jun 11 16:13 testfoo

A quick rsync restore:

[root@milo enable]# rsync -aruv ./bar/* ./foo
sending incremental file list
bat.txt
laterfiles1.txt
laterfiles2.txt
laterfiles3.txt
laterfiles4.txt
testbar

And:

./foo:
total 12
4994387 -rw-r--r-- 1 root       root        0 Jun 11 15:45 bat.txt
4997949 -rw-rw-r-- 1 skipworthy skipworthy 33 Jul 24 15:32 footoo.txt
4994562 -rw-r--r-- 1 root       root        0 Jul 24 15:31 laterfiles1.txt
4994564 -rw-r--r-- 1 root       root        0 Jul 24 15:31 laterfiles2.txt
4994565 -rw-r--r-- 1 root       root        0 Jul 24 15:31 laterfiles3.txt
4994567 -rw-r--r-- 1 root       root        0 Jul 24 15:31 laterfiles4.txt
4994579 -rw-rw-r-- 1 skipworthy skipworthy  8 Jun 11 15:08 testbar
5001268 -rw-rw-r-- 1 skipworthy skipworthy 25 Jun 11 16:13 testfoo

The missing files are restored or updated from the more recent backups, but the existing files—which did not change—are left alone. Also, note that the ownership of footoo.txt was preserved.
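Before running a restore against live data, it can be worth previewing it. The sketch below adds the -n (dry run) switch to the same command; the directory contents are a simplified stand-in for the example above, created in a temporary directory, and the variable names are illustrative.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir "$work/foo" "$work/bar"

# The backup (bar) holds three files, all with an old mtime.
for name in footoo.txt laterfiles1.txt laterfiles2.txt; do
    echo "backup of $name" > "$work/bar/$name"
    touch -t 202001010000 "$work/bar/$name"
done

# The live directory (foo) lost two files and has a newer footoo.txt.
echo 'edited after the backup' > "$work/foo/footoo.txt"

# -n reports what would be transferred without changing anything.
preview=$(rsync -aunv "$work"/bar/* "$work/foo/")
before=$(ls "$work/foo")

# The real restore: missing files come back, the newer file is left alone.
rsync -au "$work"/bar/* "$work/foo/"
after=$(ls "$work/foo")
survivor=$(cat "$work/foo/footoo.txt")

echo "$preview"
echo "footoo.txt still contains: $survivor"
rm -rf "$work"
```

If the preview lists a file you did not expect to be overwritten, you can adjust the switches before anything on disk changes.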

Wrap up

I encourage you to take a look (as always) at the man page for rsync, and try out this useful command.

Here are a few more switches to consider:

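  • -r (recursive)
  • -b (backups - keep a renamed copy of any file that gets replaced)
  • -R (relative - recreate the source path on the destination)
  • -u (update - skip files that are newer on the destination)
  • -P (progress - also keeps partially transferred files)
  • -z (compress)
  • -c (checksum - compare file contents instead of size and mtime)
  • -p (preserve permissions)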

In the next article in this series, we'll go a little further and look at remote rsync and some of the other more advanced features of this command.


Glen Newell

Glen Newell has been solving problems with technology for 20 years. As a Systems Engineer and administrator, he’s built and managed servers for Web Services, Healthcare, Finance, Education, and a wide variety of enterprise applications.
