Well, you know! In traditional storage environments one of the most important problems is the ability to replicate data from one site to another or, sometimes, within the same site. Replicas are a foundation for disaster recovery and business continuity. Every big enterprise, financial institution, hospital, etc., needs to replicate data as fast as possible to other sites to guarantee a good service level, save money and, sometimes, lives!
There are a lot of replication methods; the most famous and widely adopted in high-end environments is synchronous replication: every block written is instantly copied to a second array, and you get the commit for the local write only after the remote copy completes. This approach has pros and cons.
The pros are linked to the quality and consistency of data: you are sure of “what is where” at every moment, and you can restart the second site immediately when necessary! The cons are costs: you need fast storage at both the primary and secondary sites, availability of fiber optic cables, bandwidth and low latency (distance is a problem, and telcos are another), and so on!
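The write path described above can be sketched in a few lines of Python. This is a toy model, not any vendor's implementation: the two dicts stand in for the primary and secondary arrays, and the point is simply that the host's acknowledgement is held until the remote copy exists, so the sites can never diverge.

```python
class SyncReplicatedVolume:
    """Toy model of synchronous replication (illustrative names only)."""

    def __init__(self, local, remote):
        self.local = local    # dict standing in for the primary array
        self.remote = remote  # dict standing in for the secondary array

    def write(self, block_id, data):
        self.local[block_id] = data
        # The commit to the host is held until the remote write returns;
        # any latency on the inter-site link is paid on every single write.
        self.remote[block_id] = data
        return "ack"  # both copies are identical at this point


local, remote = {}, {}
vol = SyncReplicatedVolume(local, remote)
vol.write(0, b"payload")
assert local == remote  # zero data loss: the sites never diverge
```

The cost of that guarantee is exactly the one mentioned above: every write waits for the round trip, so distance and link quality directly limit performance.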
If your company doesn’t need a real-time replica, or, in other words, you can lose the latest transactions because you’re not a bank, an emergency surgery department, or a nuclear plant, then you can go for cheaper methods to replicate data: asynchronous replicas and snapshot delivery.
An asynchronous replica is an evolution of what I described above… you get the OK from your local storage system before the data is actually written to the second storage: there will be a small difference between the two systems because of the delay in writes. The array controller has to do more work, because it needs to manage a larger cache and a journaling system to guarantee the quality of the writes but, on the other hand, you need fewer communication resources (= spend less money).
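The journaling idea can be sketched the same way. Again a hypothetical model, not a real controller: writes are acknowledged as soon as they hit the primary, land in a journal, and a background task drains the journal to the remote site in order. The moment between the ack and the drain is exactly the window of potential data loss.

```python
from collections import deque


class AsyncReplicatedVolume:
    """Toy model of asynchronous replication (illustrative names only)."""

    def __init__(self):
        self.local = {}
        self.remote = {}
        self.journal = deque()  # the controller's write journal

    def write(self, block_id, data):
        self.local[block_id] = data
        self.journal.append((block_id, data))  # queued, not yet shipped
        return "ack"  # the host gets the commit before the remote copy exists

    def drain(self):
        # Background task: ship journaled writes, in order, to the remote.
        while self.journal:
            block_id, data = self.journal.popleft()
            self.remote[block_id] = data


vol = AsyncReplicatedVolume()
vol.write(0, b"payload")
# At this instant the sites differ: that gap is the potential data loss.
assert vol.local != vol.remote
vol.drain()
assert vol.local == vol.remote
```

Keeping the writes ordered in the journal is what lets the secondary stay crash-consistent even though it lags behind.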
In recent years, with the adoption of smart snapshots by some vendors, a third way has emerged: delivering snapshots at timed intervals! Snapshots are “point-in-time” copies of volumes stored in the system itself; they are space-efficient and easy to use. Each vendor has its own snapshot implementation, and some of these implementations are tightly integrated with the replication software.
You can take many snapshots a day, in some cases every 15 minutes or less, and then send these snapshots via FC or TCP/IP to a secondary storage system. If your company can afford to lose 15 minutes (sometimes less) of transactions in a disaster recovery scenario, you will likely be interested in this kind of replica.
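A vendor-neutral sketch of the snapshot-delivery cycle might look like this (all names are illustrative): at each interval, take a point-in-time copy of the volume, compute the blocks that changed since the previous snapshot, and send only that delta over the link.

```python
class SnapshotShipper:
    """Toy model of timed snapshot delivery (illustrative, vendor-neutral)."""

    def __init__(self):
        self.last_snapshot = {}

    def ship(self, volume, send):
        snapshot = dict(volume)  # point-in-time copy, taken locally
        # Delta = blocks that are new or changed since the last snapshot.
        delta = {b: d for b, d in snapshot.items()
                 if self.last_snapshot.get(b) != d}
        send(delta)  # only the delta crosses the (possibly slow) link
        self.last_snapshot = snapshot


received = {}
shipper = SnapshotShipper()
vol = {0: b"a", 1: b"b"}
shipper.ship(vol, received.update)  # first run ships everything
vol[1] = b"c"                       # one block changes before the next interval
sent = {}
shipper.ship(vol, sent.update)
assert sent == {1: b"c"}  # only the changed block was shipped
```

Because only the deltas travel, and they travel after the snapshot is safely taken, this scheme tolerates slow, high-latency links far better than block-by-block replication.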
Delivering snapshots has many pluses and few minuses compared to traditional replicas.
The biggest disadvantage of snapshots is the delay: you cannot achieve a real-time or near-real-time copy… so it’s not suitable for every application!
For the rest of the tale you have only advantages:
- Replicas are taken after the snapshot: no performance impact on the production environment;
- Replicas of volumes/LUNs are moved over TCP/IP, without hiring costly dedicated lines for FCP;
- Replicas can be done over high-latency and low-bandwidth lines (sometimes very far from the primary site);
- Replicas can be deduplicated! (some vendors offer deduplication capabilities on the controllers, or you can use appliances like Riverbed);
- Replicas have a history: it’s possible to restore old replicas if needed;
- You can configure N:1 replicas (i.e., lots of remote sites and one central site);
- It’s possible to use snapshots for DR testing, without impact on production;
- … and more.
We use snapshot delivery a lot, and in some cases it serves as a first-line online backup!