Allion – Ingrasys Europe
SmartSync Backup – Efficient NAS-to-NAS backup
1. Abstract
A common approach to backing up data stored on a NAS server is to run backup software on a Windows or UNIX
system, which reads the data from the NAS server over the network and writes it to tape. The backup time depends
on network traffic, NAS performance and tape drive speed. Restoring data is also somewhat complicated when you
have to deal with many tape cassettes and a mix of incremental backups, full backups, cross-tape backups, etc.
The SmartSync backup software runs on the NAStorage servers themselves. It improves backup speed by transferring
only the modified data blocks instead of whole files. At the same time, it always builds full backups and keeps
them online, which makes data restoration easy.
2. The design concepts of SmartSync backup
Currently, most NAS server backups rely on a backup server running backup software. This solution is
common, yet imperfect in some respects. The SmartSync backup software improves NAS server backups by
incorporating remote synchronization techniques, and it complements existing backup solutions.
The issues the SmartSync backup software addresses are:
• Long backup time
Backup time is affected by many factors: how fast the source data can be read from the NAS server, how busy the
network is, how many megabytes can be written to tape per minute, etc. IT people always try to improve
backup speed and shorten the backup window to minimize the impact on business operations. They have few
choices and keep investing in high-end hardware, high-speed networks and fast tape drives.
What else can be done besides spending heavily on high-end hardware and software and upgrading the network
infrastructure?
The SmartSync backup software improves backup speed at a fundamental level, through its transfer algorithm.
Traditional backup software copies whole files to tape when making backups, even if only a few bytes have
changed. In most cases this wastes resources, since files usually do not undergo drastic changes.
Different versions of a file may have 70% in common even if the file is modified very often. If we make daily
backups of the file, we transfer the whole file every day, including the same 70% of it. Why not transfer only the
30% that differs?
The idea of the SmartSync backup software is to transfer only the modifications made since the last
backup. Backup speed improves because only parts of files need to be transferred. In addition, backups place
less load on the network, so the impact on business operations is reduced.
• Difficulties of restoring data
Typically, when data is lost or corrupted, IT people have to restore the last full backup and then some incremental
backups in order to re-establish the data. The process is complicated. First, you have to locate the data, finding
out which tapes hold the files. Second, you have to retrieve those tapes, which are stored off-line or even off-site.
Third, you have to perform data restoration twice or more: once for the full backup, then once for each incremental
backup.
The SmartSync backup software keeps all backups online. Even better, it always makes and keeps full backups. IT
people only have to perform data restoration once: pick one backup version and restore it. All tasks can be
done online and are straightforward.
• Security concerns
Since backup data is sent to the backup server over the network, there are security risks. The SmartSync
backup software encrypts the data with strong encryption to prevent eavesdropping. Anyone who intercepts the data
packets will not be able to read them.
Fast backups. Full backups. On-line backups.
These are the design concepts of the SmartSync backup software. The purpose is to shrink the backup window and
facilitate data restoration. The technology that allows this is our SmartSync algorithm, which uses differential block
transfers and thereby reduces the demand on network bandwidth and shortens the backup time. It is now
feasible to make full backups all the time. MIS people do not have to deal with the complexity of picking
tapes, finding files and restoring multiple backups. Just choose a backup and restore it. Data restoration is much
simpler.
3. How does the SmartSync backup work?
The SmartSync backup software performs NAS-to-NAS backup. Two or more NAStorage systems are required: one
as the SmartSync backup server, the other(s) as backup client(s). The SmartSync backup server makes and keeps
backups of the clients’ data.
(Note: All NAStorage models can be used as SmartSync backup clients, while the NAStorage 8200 can be used as a
SmartSync backup client AND server.)
Files are composed of data blocks. To back up data, the clients send data blocks to the SmartSync backup server.
The following steps show the internal operations of the SmartSync backup software and why it is efficient.
1. Originally, there is a backup of the client data, “Client backup 1”, on the SmartSync backup server.
2. Some modifications are made to the client data.
3. The SmartSync backup task starts. It transfers the modified blocks to the SmartSync backup server.
4. The SmartSync backup server reconstructs the data and creates a new backup, “Client backup 2”, based on the
received blocks and “Client backup 1”.
You can see that a new backup is created by transferring only a few data blocks.
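The steps above can be sketched in a few lines of Python. This is only an illustration of differential block transfer, not the SmartSync algorithm itself, whose internals this paper does not describe. The 4096-byte block size, the SHA-256 digests and all function names are assumptions, and blocks are compared at fixed offsets only.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the paper does not state one


def block_digests(data, block_size=BLOCK_SIZE):
    """Digest of every block in the previous backup (kept by the server)."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def make_delta(new_data, old_digests, block_size=BLOCK_SIZE):
    """Client side: collect only the blocks that differ from the last backup."""
    changed = {}
    for index, start in enumerate(range(0, len(new_data), block_size)):
        block = new_data[start:start + block_size]
        is_new = index >= len(old_digests)
        if is_new or hashlib.sha256(block).digest() != old_digests[index]:
            changed[index] = block
    return changed, len(new_data)


def apply_delta(old_data, changed, new_length, block_size=BLOCK_SIZE):
    """Server side: rebuild the new backup from the old one plus received blocks."""
    out = bytearray()
    for index, start in enumerate(range(0, new_length, block_size)):
        if index in changed:
            out += changed[index]
        else:
            out += old_data[start:start + block_size]
    return bytes(out)


# "Client backup 1" on the server: four blocks of data.
backup1 = bytes(range(256)) * 64
# The client modifies a single byte, which falls inside the second block.
client_data = bytearray(backup1)
client_data[5000] ^= 0xFF
changed, new_length = make_delta(bytes(client_data), block_digests(backup1))
backup2 = apply_delta(backup1, changed, new_length)
print(sorted(changed), backup2 == bytes(client_data))
```

In this run only one of the four blocks crosses the network, yet the server ends up with a complete second backup.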
4. Saving disk space
• Keeping only one copy of the files
Since the SmartSync backup software always makes full backups and keeps them online, some might
object that it takes up too much disk space. Many files typically stay unmodified from backup to
backup, and keeping a copy of the same file in every backup version would waste disk space.
The SmartSync backup software resolves this issue by using hard links. It keeps only one copy of each file
on the hard disks. At the file-system level, different backup versions use hard links to point to the same file.
Please see the following figure.
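The hard-link idea can be tried out with Python's standard library. The sketch below is not taken from the SmartSync software. The directory and file names are hypothetical, and it assumes a file system that supports hard links. It creates two backup version folders that share one unchanged file.

```python
import os
import tempfile


def demo_hard_link_backups():
    """Two backup versions share one stored copy of an unchanged file."""
    with tempfile.TemporaryDirectory() as root:
        v1 = os.path.join(root, "backup1")  # hypothetical version folders
        v2 = os.path.join(root, "backup2")
        os.mkdir(v1)
        os.mkdir(v2)

        original = os.path.join(v1, "report.txt")
        with open(original, "wb") as f:
            f.write(b"unchanged contents")

        # The file did not change, so backup2 gets a hard link, not a copy.
        linked = os.path.join(v2, "report.txt")
        os.link(original, linked)

        same_file = os.path.samefile(original, linked)
        link_count = os.stat(original).st_nlink

        # Deleting the older version removes one name only; the data stays
        # on disk as long as another backup version still links to it.
        os.remove(original)
        with open(linked, "rb") as f:
            still_readable = f.read() == b"unchanged contents"

        return same_file, link_count, still_readable


print(demo_hard_link_backups())
```

Because every version folder holds a complete directory tree, each one can be browsed and restored as a full backup even though unchanged files are stored once.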
• Freeing up disk space automatically
The SmartSync backup software keeps all backups on the backup server. Without proper management, the backup
server might unexpectedly run out of space. In addition to letting you delete obsolete backups manually, the
SmartSync backup software implements a backup deletion mechanism based on a pre-defined policy.
This backup version control is inspired by the GFS (Grandfather-Father-Son) tape rotation scheme commonly used for
tape backups. We call it the “advanced GFS media rotation scheme”. Instead of re-using tapes as in tape rotation, it
frees hard disk space for future use. When a new backup version is created, the software checks for obsolete
backups and deletes them automatically according to the rules.
The rules are as follows (X, Y and Z are user-defined numbers):
(1) For the last X days, it keeps one backup version per day. If there are two or more backup versions in one day,
only the newest backup remains; the others are deleted.
(2) For the Y weeks prior to the X days, it keeps one backup per week – the newest backup of each week.
(3) For the Z months prior to the Y weeks, it keeps one backup per month – the newest backup of each month.
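The three rules can be sketched as a selection function. This is an illustration under stated assumptions, not the product's code: the function name is hypothetical, "week" is taken to mean an ISO calendar week, "month" a calendar month, and the monthly window is approximated as 30 days per month. The paper does not define these boundaries.

```python
from datetime import datetime, timedelta


def select_backups_to_keep(backups, now, x_days, y_weeks, z_months):
    """Return the backup timestamps that survive rules (1) to (3)."""
    today = now.date()
    daily_start = today - timedelta(days=x_days - 1)        # rule (1) window
    weekly_start = daily_start - timedelta(weeks=y_weeks)   # rule (2) window
    monthly_start = weekly_start - timedelta(days=30 * z_months)  # rule (3)

    newest = {}  # bucket key -> newest timestamp seen in that bucket
    for ts in backups:
        day = ts.date()
        if day < monthly_start:
            continue  # older than every window: obsolete
        if day >= daily_start:
            key = ("day", day)
        elif day >= weekly_start:
            iso = day.isocalendar()
            key = ("week", iso[0], iso[1])
        else:
            key = ("month", day.year, day.month)
        if key not in newest or ts > newest[key]:
            newest[key] = ts
    return set(newest.values())


now = datetime(2003, 10, 31, 12, 0)
backups = [
    datetime(2003, 10, 31, 1, 0),  # same day as the next one, older
    datetime(2003, 10, 31, 9, 0),
    datetime(2003, 10, 30, 9, 0),
    datetime(2003, 10, 1, 9, 0),   # same week as the next one, older
    datetime(2003, 10, 2, 9, 0),
    datetime(2001, 1, 1, 9, 0),    # outside all windows
]
keep = select_backups_to_keep(backups, now, x_days=10, y_weeks=7, z_months=5)
obsolete = sorted(set(backups) - keep)
print(obsolete)
```

Everything not in `keep` would be deleted when a new backup version is created.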
If X=10, Y=7, Z=5, the backup versions which remain on the backup server will be:
5. SmartSync backup vs. tape backup
• Unit costs of backup media
Tape backups used to claim a low cost per GB. However, hard disk prices are becoming more attractive. The
prices below were listed on http://shopper.cnet.com in October 2003.

Media                                       Price       Cost/GB
SuperDLT tape, 110GB raw capacity           ~ US$100    ~ US$0.9
LTO tape, 100GB raw capacity                ~ US$50     ~ US$0.5
Maxtor DiamondMax Plus 9, 200GB, 7200rpm    ~ US$200    ~ US$1

The unit cost of HDDs is almost the same as that of SuperDLT tapes. Considering maintenance and management
costs, tapes are not always the lowest-cost option.
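The cost-per-GB figures follow directly from the listed prices and raw capacities. The short calculation below reproduces them. The dictionary and function names are illustrative only, and the prices are the approximate October 2003 values quoted above.

```python
# (approximate price in US$, raw capacity in GB) from the price list above
MEDIA = {
    "SuperDLT tape": (100, 110),
    "LTO tape": (50, 100),
    "Maxtor DiamondMax Plus 9": (200, 200),
}


def cost_per_gb(price_usd, capacity_gb):
    """Unit cost of a backup medium in US$ per GB of raw capacity."""
    return price_usd / capacity_gb


for name, (price, capacity) in MEDIA.items():
    print(f"{name}: ~US${cost_per_gb(price, capacity):.2f}/GB")
```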
• Backup time
As described above, the SmartSync backup software sends only the modified blocks of files during backups. This
saves backup time because files usually change only partially.
For the first backup, the SmartSync backup software has to send all data to create a baseline backup. In a
real-world test, SmartSync transferred data at 14MB/sec., about 840MB/min. For reference, a Quantum SuperDLT tape
drive transfers data at 11MB/sec., or 660MB/min.
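To put the two transfer rates in perspective, the sketch below converts them to MB per minute and estimates how long a baseline backup would take. The 100GB data size is a hypothetical example, not a figure from the test, and 1GB is assumed to be 1000MB.

```python
def mb_per_minute(mb_per_second):
    """Convert a transfer rate from MB/sec. to MB/min."""
    return mb_per_second * 60


def baseline_minutes(data_gb, mb_per_second, mb_per_gb=1000):
    """Minutes needed to send a full baseline backup at a constant rate."""
    return data_gb * mb_per_gb / mb_per_second / 60


for label, rate in (("SmartSync", 14), ("SuperDLT drive", 11)):
    print(f"{label}: {mb_per_minute(rate)} MB/min, "
          f"100GB baseline in about {baseline_minutes(100, rate):.0f} minutes")
```

These estimates apply to the first backup only; later backups send just the modified blocks.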
• On-line, off-line and off-site data backups
The SmartSync backup software keeps all backups on-line, while tapes are always off-line. You cannot open files on
tape directly; you must restore them before you can read them.
• Concurrent backups of multiple clients
Tape media allow sequential access only, so it is not possible to write multiple backups to one tape at the same time.
The SmartSync backup software uses hard disks as backup media, which allow concurrent access, so multiple SmartSync
backup tasks can run at the same time. In fact, SmartSync backup servers allow up to 8 concurrent backups
onto one sync point.
6. SmartSync backup benefits
The SmartSync backup differs from traditional tape backups in many ways. It does not have to replace tape backups;
it complements them. In cases where you prefer online backup data, SmartSync backup is your best choice. The
benefits are:
• Full backups that take less time than incremental backups
• Fast backup with differential block transfers
• All backups are online, making it easy to restore
• All communication and data transfers are encrypted and secure
• Backup version control
How to use the SmartSync feature
SmartSync is a backup option built into the NAStorage. Its main use is remote data synchronization. It is used when
there are multiple NAStorage units at both local and remote sites. A synchronization connection for data streaming is
created between the matching volumes or folders on the two servers, enabling the data on both sides to be synchronized.
It allows remote backup of large amounts of data stored on the NAStorage server, which helps keep the data safe.
The NAStorage solution for remote data replication is “SmartSync”. As with other NAS storage systems on the
market, the current synchronization mode is mirror, meaning that SmartSync copies the selected target files/folders
from the client-side server to the remote-side server.
Sketch map => Mirror mode
Client source (set SmartSync task) --- Internet or Intranet ---> Server destination (set SmartSync point)

Operating flow chart
Start
On the server site:
1. Create a “SmartSync” point.
2. Select a volume or folder as the “SmartSync” point (destination).
3. Select the “Allowed Login Group”.
4. Determine the “SmartSync” mode (mirror only).
On the client site:
1. Create a “SmartSync” task.
2. Specify the IP address used to connect to the “SmartSync” server.
3. Set a login user and password.
4. Assign a task name.
5. Select a volume or folder as the source directory for “Mirror”.
6. Set a schedule for the task.
View Summary during or after Sync
End
Settings and procedures
1-1. Setting up the SmartSync functionality for the first time.
Before setting up SmartSync for the first time, one has to create a System Folder in any one of the volumes. This
folder will be used to record a summary of every task that is performed by SmartSync.
1-2. In the Service Maintenance screen, choose the volume where the System Folder will reside and press "Apply"
for the new setting to take effect, as shown in the following screenshot.
2-1. Creating a SmartSync Point.
The Synchronization Point, which is also called the Destination Point, is configured on the "synchronized" side of the
NAStorage Server environment. Over TCP/IP, you can easily reach the SmartSync interface in the backup menu
on the server side. (Of course, you need the rights to log in to the administrative interface of the server.) See
the figure below:
2-2. Continuing from the previous screenshot, press the "Add" button to add a new SmartSync Point.
On this page, we set the path, sync point name, comment, allowed group, and mode of the SmartSync Point.
2-3. As shown in the figure below, click the small folder icon next to Path. A child window will pop up, allowing you
to choose and confirm the SmartSync Point. The SmartSync Point can be either the whole volume or a
folder in it.
Click the icon to select the Sync Point
2-4. Press the "OK" button on the pop-up window and continue to set up other parameters, including Sync Point
Name, Comment, Group Allowed and Mode.
2-5. Group Allowed:
The groups listed here are those already set up on the server, including the default Admins and Everyone
groups, as shown below.
2-6. Synchronization Mode:
Currently, NAStorage only provides Mirror Mode.
2-7. Press the "Apply" button and the SmartSync Point setup on the server side is complete.
The graph below shows the configured SmartSync Point and its related information.
3-1. Creating a SmartSync Task:
The task is created on the NAStorage on the synchronizing side, which is also called the "source". As seen in the
figure, we can use the client and its TCP/IP connection to easily locate the NAStorage that will allow us to perform
SmartSync backups.
3-2. Adding a Task:
When you press the "Add Task" button, the system asks you to specify the IP address of the NAStorage on the
synchronized side that will do the backup. As shown in the screenshot below, enter the IP address of the synchronized
server in the input box and press the "Next" button.
3-3. Using TCP/IP, the application finds the NAStorage on the destination side as well as the Sync Point Name that
was set up earlier.
As shown in the screenshot below, after selecting the SmartSync Point that will perform the synchronization, you
need to enter the user's login name and password. The login user must have login permission on the
synchronized server.
Input Login User Name:
A. Local user: username
B. Domain user: Domain\username
3-4. Press "Next" to go to "Set Option" in "Tasks".
Please provide a "Task Name" for the current task. As shown below, the Task Name is "R-sync".
3-5. Setting up the local folder for SmartSync:
As seen below, click on the folder icon to the right of Local folder for SmartSync. A child window will pop up, allowing
you to select and confirm the path to the local folder for SmartSync. Choose either the whole volume or its
sub-folders.
Click this icon to select a Source directory
3-6. After clicking "OK" in the child window, continue with the other setup fields, which include the SmartSync
Schedule and its related options.
The screenshot below displays the schedule setup:
A. Immediate – Perform the synchronization immediately after the setup is complete.
B. According to the schedule – Select Time, Daily, Weekly, or Day of Month to set up a recurring task schedule.
As seen in the example below, the system will automatically perform SmartSync tasks every day at 09:00 (AM).
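The daily schedule in this example amounts to finding the next 09:00 after the current time. The sketch below shows that calculation. The function name is hypothetical and the code is not part of the NAStorage software; it only illustrates what a daily recurring schedule means.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now, run_at=time(9, 0)):
    """Return the next moment a daily task scheduled at `run_at` should start."""
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        # Today's slot has already passed, so the task runs tomorrow.
        candidate += timedelta(days=1)
    return candidate


print(next_daily_run(datetime(2003, 10, 1, 8, 30)))
print(next_daily_run(datetime(2003, 10, 1, 9, 30)))
```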
3-7. SmartSync options setup:
To transfer SmartSync's synchronization data over the Internet more efficiently, the following options can
be set.
A. (Compress the data stream during data transmission)
B. (Contain Security information)
C. (Bandwidth control)
D. (Include/exclude file pattern)
PS: Online help -- SmartSync Include/Exclude File Pattern
By default, SmartSync will synchronize all files and folders under the source path. To exclude certain types of files,
specify an include/exclude file pattern. The include/exclude file pattern is best explained by example.
Suppose that we want to exclude all Word files except those beginning with xyz.
The include/exclude file pattern will be:
+xyz*;-*.doc;
Each pattern starts with a minus (-) or plus (+) sign and ends with a semicolon (;). A plus sign includes certain file
types. A minus sign excludes certain file types. In this example, *.doc files are excluded from synchronization, but
Word files beginning with xyz are specifically kept. (* is a wildcard, which matches any characters.)
Please note that include file patterns must always be placed before exclude file patterns; otherwise the include file
patterns will not take effect.
E. (Perform quick synchronization)
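The include/exclude pattern from option D can be modelled with a small matcher. This is a sketch, not SmartSync's matcher. It assumes that the first pattern matching a file name decides the outcome, which would explain why include patterns must come first, and it assumes case-sensitive matching. The function names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_patterns(spec):
    """Split a spec such as '+xyz*;-*.doc;' into (is_include, glob) rules."""
    rules = []
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue  # the trailing semicolon yields an empty item
        sign, glob = item[0], item[1:]
        if sign not in "+-":
            raise ValueError(f"pattern must start with + or -: {item!r}")
        rules.append((sign == "+", glob))
    return rules


def is_synchronized(filename, rules):
    """First matching rule wins; files matching no rule are synchronized."""
    for include, glob in rules:
        if fnmatchcase(filename, glob):
            return include
    return True


rules = parse_patterns("+xyz*;-*.doc;")
for name in ("xyz-report.doc", "notes.doc", "photo.jpg"):
    print(name, is_synchronized(name, rules))
```

Under the same assumption, reversing the order to `-*.doc;+xyz*;` would exclude xyz-report.doc as well, because the exclude pattern matches first.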
3-8. Press the "Finish" button to complete the setup!
The screenshot below shows a complete SmartSync task.
4-1. Checking the task progress and its summary.
When the task is in progress, the Status column automatically displays "Syncing" in blue. Clicking
on it shows the current task progress and other detailed information in a child window, as seen in the
screenshot below:
View the Status dialog box during SmartSync.
You can also go to the summary interface to check information about the SmartSync task that is currently under
way.
4-2. Viewing the Summary after SmartSync
Each time a task completes, the system automatically records it in the Task Summary Logs, as seen
in the figure below:
Event logs and messages
Some conditions that may arise when performing SmartSync tasks:
1. Memory – at least 256MB
When running SmartSync, the system needs a large amount of memory to store the names of the files being synchronized
on both sides, so that it can compare the names, sizes, creation (modification) dates, and checksums of the files.
Therefore, to run SmartSync tasks as efficiently as possible, the system strictly checks the amount of memory
installed on the machine.
2. Not enough free memory for the SmartSync task.
Because SmartSync is a memory-consuming operation, memory utilization is critical when launching a task.
When the NAStorage detects that free memory is low, the program terminates the synchronization task. To avoid
this situation, we suggest checking the following configuration and timing points before launching a SmartSync task.
A. Add RAM. 512MB is suggested; 256MB is the minimum.
B. Schedule the SmartSync task outside peak hours to avoid competing for memory with routine network services.
C. Set a proper SmartSync source path. If the SmartSync source includes millions of files/directories, building the
check list will occupy most of the memory. In that case, we suggest splitting the source path into multiple
sub-directories, each assigned to a different task.
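Suggestion C can be planned ahead if you know roughly how many files each sub-directory holds. The sketch below groups sub-directories into tasks so that no task exceeds a chosen entry count. The function name, the sample directory names and the one-million limit are all hypothetical; the document gives no exact threshold.

```python
def plan_tasks(subdir_entry_counts, max_entries_per_task):
    """Greedily group sub-directories so each task stays under the limit.

    A single sub-directory larger than the limit still gets its own task.
    """
    tasks, current, current_total = [], [], 0
    for name, count in sorted(subdir_entry_counts.items()):
        if current and current_total + count > max_entries_per_task:
            tasks.append(current)
            current, current_total = [], 0
        current.append(name)
        current_total += count
    if current:
        tasks.append(current)
    return tasks


# Hypothetical entry counts for three sub-directories of one source volume.
counts = {"accounting": 600_000, "engineering": 500_000, "sales": 300_000}
print(plan_tasks(counts, max_entries_per_task=1_000_000))
```

Each resulting group would then be configured as its own SmartSync task, ideally scheduled at different times.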
- Server log:
- Client log:
3. Space too low
Please make sure there is enough hard disk space for the chosen sync point on the destination.
See the Event Log in the graph below.
4. Login failed. This may be caused by:
A. The Allowed group is not set up correctly.
B. The password is not set up correctly.
5. There is more than one task running on the same sync point.
A SmartSync Point does not allow another sync connection while it is performing a
synchronization task.