Performance
The procedure described above assumes that an existing backup is checked
for identical files before a file is backed up again. This applies to
files in the previous backup as well as to files already written to the
new one. Of course it does not make much sense to compare every file to
be backed up directly with the previous backup. Instead, the md5 sums of
the previous backup are compared with the md5 sum of the file to be
backed up, using a hash table.
Computing the md5 sum is fast, but with a large amount of data it is
still not fast enough. For this reason storeBackup first checks whether
the file has been altered since the last backup, i.e. whether path +
file name, ctime, mtime and size are all unchanged. If they are, the md5
sum from the last backup is adopted and only a hard link is set. If this
initial check shows a difference, the md5 sum is computed and storeBackup
checks whether another file with the same md5 sum already exists. (The
comparison across a number of backup series uses an expanded but
similarly efficient process.) With this approach only a few md5 sums
have to be calculated for a backup; a simplified sketch of this decision
logic is shown after the following list. If you want to tune
storeBackup, especially if you save via NFS, there are two things you
can do:
- tune NFS (see configuring nfs)
- use the lateLinks option of storeBackup, and possibly delete your
old backups independently of the backup process.
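The decision logic described above can be pictured roughly as follows.
This is a simplified Python sketch, not storeBackup's actual Perl code;
the data structures prev_by_path and prev_by_md5 are assumptions standing
in for storeBackup's internal file lists and its md5 hash table:

  import hashlib
  import os

  def plan_file(src_path, rel_path, prev_by_path, prev_by_md5):
      """Decide how to handle one file of the new backup.

      prev_by_path: rel_path -> (ctime, mtime, size, md5, backup_path)
                    taken from the previous backup
      prev_by_md5:  md5 -> backup_path, a hash table over all files
                    known in the backup(s)
      """
      st = os.stat(src_path)
      prev = prev_by_path.get(rel_path)

      # Cheap check first: same path + ctime + mtime + size means the file
      # is unchanged, so the old md5 sum is adopted and a hard link suffices.
      if prev is not None and (st.st_ctime, st.st_mtime, st.st_size) == prev[:3]:
          return ("hardlink", prev[4])

      # Otherwise compute the md5 sum ...
      md5 = hashlib.md5()
      with open(src_path, "rb") as f:
          for chunk in iter(lambda: f.read(1 << 20), b""):
              md5.update(chunk)
      digest = md5.hexdigest()

      # ... and look it up in the hash table: content that already exists
      # anywhere in the backup only costs a hard link.
      if digest in prev_by_md5:
          return ("hardlink", prev_by_md5[digest])
      return ("copy_or_compress", digest)

The point is that the expensive md5 computation is skipped entirely for
files whose metadata is unchanged since the last backup.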
Using storeBackup with lateLinks is like using an asynchronous
client / server application or, more precisely, like running multiple
batch jobs on (normally) multiple machines:
- Check the source directory to find out what has changed and has to
be compressed, and save the changed data to the backup directory on
the backup server. The directories and hard links still missing in the
backup are recorded in a protocol file.
- Take this information and turn the result into a ``normal'', fully
linked backup (a simplified sketch of this step follows the list).
- Delete old backups according to the rules configured for deletion.
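To illustrate the second step, here is a minimal Python sketch of
replaying such a protocol file. The line format used here
(mkdir <dir> / link <existing> <new>) is made up for illustration only;
the real protocol files written with lateLinks look different and are
processed by storeBackupUpdateBackup.pl:

  import os

  def replay_link_protocol(protocol_path):
      """Turn a lateLinks backup into a ``normal'', fully linked backup
      by replaying the recorded operations locally on the backup server."""
      with open(protocol_path) as proto:
          for line in proto:
              op, *args = line.split()
              if op == "mkdir":
                  # directory that was skipped during the fast backup run
                  os.makedirs(args[0], exist_ok=True)
              elif op == "link":
                  existing, new = args
                  os.makedirs(os.path.dirname(new), exist_ok=True)
                  # hard link set locally, so no NFS round trips are needed
                  os.link(existing, new)

Because this replay runs locally on the backup server, the latency of the
NFS connection no longer matters for it.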
The following performance measurements only show the direct backup time
(without calling storeBackupUpdateBackup.pl). They were done with a beta
version of storeBackup 2.0.
Some background information on the following numbers: the backup was
run on an Athlon X2, 2.3 GHz, 4 GB RAM. The NFS server was an
Athlon XP, 1.84 GHz, 1.5 GB RAM. The network ran at 100 MBit/s, and
storeBackup was used with standard parameters. The measured times are
given in hours:minutes:seconds or minutes:seconds. The size of sourceDir
was 12 GB; the size of the backup created with storeBackup was 9.2 GB.
The backup comprised 4769 directories and 38499 files. storeBackup.pl
linked 5038 files internally, which means these were duplicates. The
source of the data were my files and the ``Desktop'' of my Windows XP
laptop, i.e. ``real'' data.
The first table shows the time for copying the data to the nfs server
with standard programs. The nfs server is mounted with option
async, which is a performance optimization
and not the standard configuration.
command   | duration   | size of backup
----------+------------+---------------
cp -a     | 28:46      | 12 GB
tar jcf   | 01:58:20   | 9.4 GB
tar cf    | 21:06      | 12 GB
Everything is as expected: tar with compression is much slower than the
others, and cp is slower than tar because it has to create lots of
files. There is one surprising number: the backup file written by
tar jcf is 9.4 GB, while the resulting size of the backup with
storeBackup.pl is only 9.2 GB. The reason lies in the 5038 internally
linked files - the duplicates are stored only once with storeBackup.
The effect of comparing file contents does not show up again in this
benchmark, but it makes a big difference in performance and especially
in the disk space used. If only the time stamp of a file has changed,
traditional backup software will store that file again in an incremental
backup, while storeBackup will only create a hard link.
Now let's run storeBackup.pl on the same contents. The nfs server is
still mounted with option async. There are no changes in the source
directory between the first, second and third backups.
storeBackup  | 1.19, Standard | 2.0, Standard | 2.0, lateLinks | mount with async
-------------+----------------+---------------+----------------+------------------------------
1. backup    | 49:51   100%   | 49:20    99%  | 31:14    63%   |
2. backup    | 02:45   100%   | 02:25    88%  | 00:42    25%   | file system read cache empty
3. backup    | 01:51   100%   | 01:54   100%  | 00:26    23%   | file system read cache filled
We can see the following:
- The first run of storeBackup.pl is faster than tar jcf (tar with
compression). It is easy to understand why: storeBackup.pl uses both
cores of the machine, while the compression with tar uses only one. But
if you look a little deeper at the numbers, you see that storeBackup.pl
needs less than half the time (42%) of tar with compression, even
though it additionally calculates all md5 sums and has to bear the
overhead of creating thousands of files (compare cp with tar cf above).
The reduction of the copying time by more than 50% comes from two
effects: storeBackup.pl does not compress all files (depending on
their suffix, e.g., .bz2 files are not compressed again), and it
recognizes files with identical content and just sets a hard link
(which is also the reason for 9.2 instead of 9.4 GB).
- The second backup was done with a freshly mounted source directory,
so its read cache was not filled. You can see some improvement between
version 1.19 and 2.0 because of better parallelization of reading the
data in storeBackup itself. You see no difference in the third run
between version 1.19 and 2.0, because reading the source directory
entries is now served from the file system cache, which means the
limiting factor is the speed of the nfs server - and that is the same
in both runs.
- With option lateLinks, you can see an improvement by a factor of 4.
The measured time depends massively on the time needed for reading the
source directory (plus reading the information from the previous
backup, which is always the same). The short calculation after this
list spells out these figures.
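The percentages and factors quoted above follow directly from the
tables; a short calculation with the measured times makes this explicit:

  def seconds(t):
      """Convert 'h:mm:ss' or 'mm:ss' into seconds."""
      total = 0
      for part in t.split(":"):
          total = total * 60 + int(part)
      return total

  # first storeBackup run vs. tar jcf: about 42%
  print(seconds("49:51") / seconds("1:58:20"))   # ~0.42

  # second and third backup, 1.19 standard vs. 2.0 lateLinks: about a factor of 4
  print(seconds("02:45") / seconds("00:42"))     # ~3.9
  print(seconds("01:51") / seconds("00:26"))     # ~4.3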
Now let's do the same with an nfs mount without ``tricks'' like
configuring async:
command   | duration   | size of backup
----------+------------+---------------
cp -a     | 37:51      | 12 GB
tar jcf   | 02:02:01   | 9.4 GB
tar cf    | 25:05      | 12 GB
storeBackup  | 1.19, Standard | 2.0, Standard | 2.0, lateLinks | mount with sync
-------------+----------------+---------------+----------------+------------------------------
1. backup    | 53:35   100%   | 49:20   100%  | 38:53    63%   |
2. backup    | 05:36   100%   | 05:24    96%  | 00:43    13%   | file system read cache empty
3. backup    | 05:10   100%   | 04:54    95%  | 00:27     9%   | file system read cache filled
We can see the following:
- Everything is more or less slower because of the higher latency of
synchronous communication with the nfs server. If only one file is
written (as with tar), the difference to the async backups is small;
if many files are written, it is much bigger.
- The difference between sync and async is very small when lateLinks
is used. The reason is simple: only a few files are written over nfs,
so the latency has only a small impact on the overall time for the
backup (a rough model follows after this list). As a result, the backup
with lateLinks and a very fast source directory (served from the cache)
is now about 10 times faster.
- Because latency hardly matters for the backup itself when lateLinks
is used, I also mounted this file server over a vpn across the
Internet. This means very high latency and a bandwidth of about
20 KByte/s from the nfs server and 50 KByte/s to the nfs server (as
seen on a network monitoring tool). With the same boundary conditions
as before (mounted with async, source directory file system in the
cache, no changes) I got a speedup with lateLinks, compared with a
non-lateLinks backup, by a factor of 70.
So if your changed or new files are not too big compared with the
available bandwidth, you can also use storeBackup (with lateLinks)
to make a backup over a vpn on high latency lines. Naturally you
should not choose option lateCompress in such a case. Another
advantage of lateLinks in such cases is that parallelization works
much better, because reading unchanged data in the source directory
requires almost no activity on the NFS mount.
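The latency argument can be made concrete with a rough model: without
lateLinks (almost) every directory entry of the backup causes at least
one synchronous operation on the NFS mount, so latency dominates; with
lateLinks only the changed files do. In the sketch below only the file
and directory counts come from the benchmark data set above; the
round-trip time and the number of changed files are assumed example
values:

  files, dirs = 38499, 4769   # counts of the benchmark data set above
  rtt = 0.2                   # assumed round-trip time on a vpn link, in seconds
  changed = 50                # assumed number of changed files in an incremental run

  # roughly one remote operation per entry without lateLinks ...
  without_latelinks = (files + dirs) * rtt   # ~8650 s, i.e. hours, from latency alone
  # ... but only the changed files cause remote operations with lateLinks
  with_latelinks = changed * rtt             # ~10 s

  print(without_latelinks / 3600, with_latelinks)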
Conclusion: If you mount with nfs, you can make it really fast using
option lateLinks. See section 7.6 for how to configure
it.
Using ``blocked files'' also improves performance a lot, because only a
small percentage of an image file has to be copied or compressed. See
the description of using blocked files for the influence of this option
on performance and on the space needed.