Implementing a remote Linux backup plan with Dar and Rsync

In this article I’m going to explain a backup plan I recently implemented for my company.

The backup plan covers a single Linux client machine that sends backup data to a backup server, but it can be applied directly to any number of client machines. After a fairly extensive search on the net I decided to implement it using dar (Disk ARchive) to make the backups and rsync to ship the data to the remote backup storage server.

I chose dar as the backup software because I found it simple enough to deploy and at the same time rich in features. I chose rsync to transfer the data because of its smart file transfer: rsync uses the "rsync algorithm", which provides a very fast method for bringing remote files into sync. It does this by sending just the differences in the files across the link, without requiring that both sets of files are present at one end of the link beforehand.

The backup strategy

dar is executed:

  1. every night, to make a differential backup;
  2. on the first day of the month, to make a full backup.

After a successful backup, rsync is invoked to transfer the data to the backup storage server.

Sysadmin notifications: backups are scheduled with cron, so I rely on cron’s built-in feature of mailing the output of the commands it runs.
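The schedule described above could be driven by a small wrapper that cron runs every night and that picks the backup type from the day of the month. This is only a sketch: the helper name backup_type_for_day, the script path and the mail address in the crontab comment are assumptions made for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sketch only: decide which kind of backup to run on a given day of the month.
# backup_type_for_day is a hypothetical helper name.
backup_type_for_day() {
    # $1 = day of month as printed by `date +%d` (01..31)
    case "$1" in
        01|1) echo "full" ;;   # first day of the month: full backup
        *)    echo "diff" ;;   # every other night: differential backup
    esac
}

# Example: print the backup type for today.
today_type=$(backup_type_for_day "$(date +%d)")
echo "Tonight's backup type: $today_type"

# Hypothetical crontab entry (path and address are placeholders).
# cron mails whatever the job prints to the MAILTO address:
#   MAILTO=sysadmin@example.com
#   15 1 * * * /usr/local/sbin/nightly_backup.sh
```

Keeping the decision in one function means the same cron entry can run every night, with no separate monthly job.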

 

Requirements

Client machine:

Server machine:

  • Linux (or Windows using Cygwin)
  • rsync installed and running as a server

Client machine(s) configuration

Dar installation

dar download page

I installed dar by building it from source, but packages are available for Red Hat, SUSE, Gentoo and Ubuntu. If you install from source, check the output of the configure script for the line "Libbz2 compression (bzip2) : YES". If it says NO, you need to install the bzip2-devel package and run configure again to enable bzip2 compression.

To install from source:

wget <dar_source_package_url>
tar zxvf dar-<release_version>.tar.gz
cd dar-<release_version>
./configure [--enable-mode=64]
make
make install-strip
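One way to catch a missing bzip2 library before running make is to save the configure output and search it for the line quoted above. The sketch below assumes the output was saved to a file (configure.log is just a name chosen here), and has_bzip2_support is a hypothetical helper, not something shipped with dar.

```shell
#!/bin/sh
# Sketch only: report whether a saved ./configure log shows bzip2 support.
# has_bzip2_support is a hypothetical helper; the log file name is up to you.
has_bzip2_support() {
    # $1 = path to a file holding the configure output
    grep -Eq 'Libbz2 compression \(bzip2\)[[:space:]]*:[[:space:]]*YES' "$1"
}

# Typical use, from inside the dar source directory:
#   ./configure 2>&1 | tee configure.log
#   has_bzip2_support configure.log || echo "install bzip2-devel, then re-run configure"
```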

Rsync installation

rsync download page

Rsync should already be installed on your Linux system; if not, you can install it with yum or apt-get (for example, yum install rsync).

Dar configuration

Dar has many configuration options. In this article I’m going to explain only the ones I’ve used (which should be the most common ones); check the complete documentation and the good mini-howto for further details and explanations.

The main dar command line that i’ve used (running as root) is:

dar -m 256 -y -s 600M -D -R / -c `date -I`_data \
	-Z "*.gz" -Z "*.zip" .... \
	-X "<file_exclusion_pattern_1>" \
	-X ... \
	-g <include_dir_1> \
	-g ... \
	-P <exclude_dir_1> \
	-P ... \
	[-A previous_backup]

As you can see, the command line is split across several lines for easier reading and editing; this is possible because each line ends with a ‘\’ character, which must be the very last character on the line. This way you can add or remove file exclusions, include/exclude paths, etc. by simply adding or removing lines.
Here is what each switch means:

  • -m 256

    Files smaller than 256 bytes are not compressed (by default, files of 100 bytes or less are not compressed).

  • -y [level]

    This option activates bzip2 archive compression, which is turned off by default. You can also specify a numeric compression level, from 0 (no compression) to 9 (best compression, slowest processing). Bzip2 uses 6 by default, which is the best speed/compression trade-off for most files. I don’t specify a compression level; 6 is fine for me.

  • -s 600M

    Here comes DAR’s slice feature. The specified size of 600 megabytes is the maximum size of each file DAR creates. If your backup is bigger, you end up with several backup files, each with a slice number before the file extension, so you can save each file to a different unit of your backup media (floppies, Zip disks, CD-ROMs, etc.).

  • -D

    Stores directories excluded by the -P option, or absent from the command-line path list, as empty directories. This is helpful when you restore a backup from scratch, because you don’t have to recreate all the excluded directories by hand.

  • -R /

    Specifies the root directory for saving or restoring files. By default this points to the current working directory. We are doing a system backup here, so it will be the root directory.

  • -c `date -I`_data

    This mandatory switch tells dar to create a backup archive. `date -I` prints the date in YYYY-MM-DD format, so the creation date is embedded in the archive name.

  • -Z file_pattern

    Using normal file name globbing you can specify patterns of files you want to store in your archive without compression. This only makes sense if you use the -y switch. Compressing already compressed files only yields bigger files and wasted CPU time.

  • -X mask

    The mask is a string with wildcards (like * and ?) which is applied to the names of files that are not directories. If a file matches the mask, it is excluded from the operation. By default (no -X on the command line), no file is excluded. -X may appear several times on the command line; in that case a file is excluded if it matches at least one -X mask.

  • -g path

    Files or directories to include. -g may appear several times on the command line. By default all files under the -R directory are considered. If one or more -g options are given, only those paths are selected (provided they do not match any -P option). All paths given this way must be relative to the -R directory. This is equivalent to giving <path> on the command line without any option. Warning: -g does not accept wildcards; they would not be interpreted.

  • -P relative_path

    With this switch you tell DAR which paths you don’t want to store in your backup archive. Note that the paths you specify must be relative to the path specified by the -R switch.

  • -A base_name

    Specifies the archive to use as a reference (mandatory with -C). By default (which is only possible with the -c option) no reference archive is used and all files are saved. This is the switch that makes a backup differential.
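When the include and exclude lists grow, it can be easier to keep them in variables and assemble the dar arguments from them. The following is a minimal sketch under stated assumptions: build_dar_args, INCLUDE_DIRS and EXCLUDE_DIRS are hypothetical names, the example paths are placeholders, and paths are assumed to contain no whitespace. It only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: assemble the dar argument list from simple lists, then print it.
# build_dar_args, INCLUDE_DIRS and EXCLUDE_DIRS are hypothetical names.
# Assumes paths without whitespace.
INCLUDE_DIRS="home etc"          # paths relative to -R, no wildcards (-g)
EXCLUDE_DIRS="home/tmp"          # paths relative to -R (-P)

build_dar_args() {
    # $1 = archive base name, $2 = optional reference archive for -A
    args="-m 256 -y -s 600M -D -R / -c $1"
    for d in $INCLUDE_DIRS; do args="$args -g $d"; done
    for d in $EXCLUDE_DIRS; do args="$args -P $d"; done
    if [ -n "${2:-}" ]; then args="$args -A $2"; fi
    echo "$args"
}

# Print the command instead of running it, so it can be reviewed first.
# date +%Y-%m-%d gives the same YYYY-MM-DD string as date -I.
echo "dar $(build_dar_args "$(date +%Y-%m-%d)_data")"
```

Passing a second argument to build_dar_args appends the -A switch, which is the only difference between the full and the differential command line.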

Basically, the command lines for a full backup and a differential backup differ only in the final -A switch.

For example, to make a full backup, you can use:

/usr/local/bin/dar -m 256 -y -s 600M -D -R / -c backup_data \
	-g dir1/ \
	-g dir2/ 

This produces one or more backup_data.xxx.dar archives (where xxx is the slice number); usually you will get only backup_data.1.dar.

To make a differential backup instead, you can use:

/usr/local/bin/dar -m 256 -y -s 600M -D -R / -c backup_diff \
	-g dir1/ \
	-g dir2/ \
	-A backup_data

This makes a differential backup using the backup_data archive as reference (note that the .xxx.dar suffix is omitted), producing one or more backup_diff.xxx.dar archives.
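With date-stamped archive names, the reference for -A changes every month, so a script has to look it up. A possible sketch follows, assuming the YYYY-MM-DD_data naming used in this article and that each differential backup refers to the most recent full backup; latest_full_basename is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: find the base name of the most recent full backup in a directory.
# latest_full_basename is a hypothetical helper. It relies on the
# YYYY-MM-DD_data naming used in this article, so names sort by date.
latest_full_basename() {
    # $1 = directory holding the archives
    newest=""
    for f in "$1"/*_data.1.dar; do
        [ -e "$f" ] || continue      # no match: the glob stays literal
        newest=$f                    # globs expand in sorted order, last one wins
    done
    [ -n "$newest" ] || return 1     # no full backup found
    # dar -A expects the base name, without the .<slice>.dar suffix
    echo "${newest%.1.dar}"
}
```

The result can then be passed to dar, for example -A "$(latest_full_basename /backups)", where /backups is only an example directory.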

After running regular backups, your backups folder will be populated with the dar archive files:

# ls -al
total 709884
drwxr-xr-x  2 root root      4096 Sep  2 01:15 .
drwxr-xr-x 29 root root      4096 Aug 28 17:48 ..
-rw-r--r--  1 root root 344959943 Aug 29 18:25 2008-08-29_data.1.dar  # <-- full backup
-rw-r--r--  1 root root   3344515 Aug 29 18:59 2008-08-29_diff.1.dar  # <-- differential backup  
-rw-r--r--  1 root root   3608510 Aug 30 01:15 2008-08-30_diff.1.dar  # <-- differential backup
-rw-r--r--  1 root root  14285626 Aug 31 15:20 2008-08-31_diff.1.dar  # <-- differential backup
-rw-r--r--  1 root root 345299771 Sep  1 01:23 2008-09-01_data.1.dar  # <-- full backup
-rw-r--r--  1 root root  14666260 Sep  2 01:15 2008-09-02_diff.1.dar  # <-- differential backup

Rsync configuration

Once you have created your backups, it is recommended to copy them to a remote backup server, and this is where rsync comes into play. On the client machine, the configuration is limited to the single command that does the job, and the command line is quite simple:

rsync --verbose  --progress --stats --compress \
    --recursive --times --perms --links --delete \
    --password-file <password_file> \
    <backups_directory> <user>@<remote_host>::<module_name> 

Here is what each switch means:

  • --verbose

    turn on verbose output

  • --progress

    show progress during transfer

  • --stats

    give some file-transfer stats

  • --compress

    With this option, rsync compresses the file data as it is sent to the destination machine, which reduces the amount of data being transmitted.

  • --recursive

    recurse into directories

  • --times

    preserve modification times

  • --perms

    preserve permissions

  • --links

    copy symlinks as symlinks

  • --delete

    delete extraneous files from dest dirs

  • --password-file <password_file>

    This option lets you provide the password for accessing an rsync daemon in a file. The file must not be world-readable and should contain just the password on a single line. For example:

    # su
    # echo MyPassword > secretFile
    # chmod 600 secretFile

  • <backups_directory>

    Specifies the directory that contains the files to be transferred, for example /backups/*

  • <user>@<remote_host>::<module_name>

    Specifies the user name, the remote host and the rsync server module name to use, for example john@www.mybackupserver.com::backups_storage
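Since the password file must not be world-readable, it can be created with restrictive permissions from the start rather than tightened afterwards. The sketch below is one way to do that; write_password_file is a hypothetical helper name, and the file name and password are whatever you pass in.

```shell
#!/bin/sh
# Sketch only: create an rsync password file that only its owner can read.
# write_password_file is a hypothetical helper name.
write_password_file() {
    # $1 = file to create, $2 = password
    (
        umask 077                     # new files get mode 600 at most
        printf '%s\n' "$2" > "$1"     # just the password, on a single line
    )
    chmod 600 "$1"                    # also tighten a file that already existed
}

# Example (placeholders):
#   write_password_file /root/secretFile MyPassword
```

The umask is set inside a subshell, so it does not affect files created later by the calling script.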

Here is a sample output from rsync:

building file list ... 
0 files...
6 files to consider
2008-09-02_diff.1.dar
       32768   0%    0.00kB/s    0:00:00
      849356   5%  791.12kB/s    0:00:17
     1493682  10%  707.68kB/s    0:00:18
     2116274  14%  670.40kB/s    0:00:18
     2804402  18%  666.67kB/s    0:00:17
     3492530  23%  627.42kB/s    0:00:17
    14776414 100%    2.42MB/s    0:00:05 (xfer#1, to-check=0/6)

Number of files: 6
Number of files transferred: 1
Total file size: 726274779 bytes
Total transferred file size: 14776414 bytes
Literal data: 3771070 bytes
Matched data: 11005344 bytes
File list size: 156
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 3731735
Total bytes received: 23000

sent 3731735 bytes  received 23000 bytes  326498.70 bytes/sec
total size is 726274779  speedup is 193.43
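The speedup figure on the last line can be reproduced from the other numbers in the report: it is the total size divided by the bytes that actually crossed the link (sent plus received). A quick check with awk, using the values from the sample output above (rsync_speedup is just an illustrative helper name):

```shell
#!/bin/sh
# Recompute rsync's "speedup" from the sample report above:
# total size / (bytes sent + bytes received).
rsync_speedup() {
    # $1 = total size, $2 = bytes sent, $3 = bytes received
    awk -v total="$1" -v sent="$2" -v recv="$3" \
        'BEGIN { printf "%.2f\n", total / (sent + recv) }'
}

rsync_speedup 726274779 3731735 23000   # prints 193.43
```

In this run only one new archive had to be shipped, and even for that file most of the data was matched on the server side, which is why so few bytes were sent compared with the total size.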