InMage Systems’ Scout software provides real-time continuous data protection (CDP) at the block level, journaling every write so that data can be recovered to any point in time. So why is that important? Because a question I have been asked more often of late is why companies should use the host-offloaded, real-time CDP software that companies like InMage Systems provide rather than near-CDP (snapshots taken periodically) or even asynchronous replication technology.
Asynchronous replication, near-CDP and real-time CDP are similar in that all three technologies copy the write I/Os on a source system and then send the copied writes to a target system. They are also similar in that they first store a copy of each write I/O in a local disk cache on the source system, then send the copied writes to the target system as network bandwidth permits. It is at this juncture that the functionality of the three iterations of asynchronous replication begins to diverge.
Asynchronous replication is the earliest and simplest of the three modes of replication, but its flaw is that it provides only one recovery point: the current copy of data on the target system. While this replicated copy may provide a recoverable version of data for file systems with saved files, recovering open files or usable instances of data for production applications like Microsoft Exchange or SQL Server is not guaranteed. Replicated data for these applications needs to be in a consistent state. So unless the last write copied from the source to the target is the one that leaves the open file or the application’s database in a consistent state, companies will not be able to recover the application despite having a full copy of the data on the target.
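The single-recovery-point problem can be illustrated with a toy model. This is a minimal sketch, not any vendor’s implementation: the class `AsyncReplicator`, its `drain` method and the dictionary-based “volumes” are all hypothetical names invented for illustration.

```python
from collections import deque


class AsyncReplicator:
    """Toy model of asynchronous replication (hypothetical, for illustration).

    Volumes are modeled as dicts mapping block number -> data.
    """

    def __init__(self):
        self.source = {}
        self.cache = deque()  # stands in for the local disk cache of copied writes
        self.target = {}      # the one and only recovery point

    def write(self, block, data):
        # Apply the write on the source and queue a copy for replication.
        self.source[block] = data
        self.cache.append((block, data))

    def drain(self, max_writes):
        """Send up to max_writes cached writes, as bandwidth permits."""
        sent = 0
        while self.cache and sent < max_writes:
            block, data = self.cache.popleft()
            self.target[block] = data
            sent += 1
        return sent


rep = AsyncReplicator()
rep.write(0, "txn-begin")
rep.write(1, "txn-data")
rep.write(2, "txn-commit")
rep.drain(max_writes=2)  # the link only had room for two writes
```

If the source fails at this moment, the target holds the first two writes but not the commit, so the only available recovery point is a copy that the application may not be able to use.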
This type of problem led to asynchronous replication being combined with multiple snapshots taken in conjunction with application pauses; in essence, the creation of near-CDP. By integrating asynchronous replication with the application, companies could now quiesce applications, flush all writes out of memory to create a consistent database image, and then take a snapshot of the data. The term “near-CDP” came into use because companies could create tens or even hundreds of snapshots over days or weeks to create multiple recovery points. However, the problem with near-CDP was that the most recent recovery point could only be as current as the last snapshot, sometimes minutes or even hours old, which is still insufficient for most corporate applications.
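The recovery gap in near-CDP can be sketched in a few lines. This is an illustrative model only: `SnapshotStore`, the integer timestamps (seconds) and the dictionary-based volume are assumptions made for the example, and the quiesce-and-flush step is assumed to have happened before each snapshot.

```python
import bisect


class SnapshotStore:
    """Toy model of periodic snapshots (hypothetical, for illustration)."""

    def __init__(self):
        self.times = []   # snapshot timestamps, in ascending order
        self.images = []  # consistent volume images, one per timestamp

    def take_snapshot(self, timestamp, volume):
        # Assumes the application was quiesced and flushed beforehand.
        self.times.append(timestamp)
        self.images.append(dict(volume))

    def recover(self, wanted_time):
        """Return (snapshot_time, image) for the newest snapshot <= wanted_time."""
        i = bisect.bisect_right(self.times, wanted_time)
        if i == 0:
            raise LookupError("no snapshot at or before the requested time")
        return self.times[i - 1], self.images[i - 1]


store = SnapshotStore()
volume = {0: "v1"}
store.take_snapshot(60, volume)
volume[0] = "v2"
store.take_snapshot(120, volume)
volume[0] = "v3"  # written at t=150, after the last snapshot

snap_time, image = store.recover(179)
```

A recovery requested for t=179 lands on the t=120 snapshot, so the write made at t=150 is lost. Taking snapshots more often narrows the gap but never closes it.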
Real-time CDP represents possibly the third and final evolution of asynchronous replication. Rather than creating just one recovery point as asynchronous replication does, or multiple recovery points that are minutes or hours old as near-CDP does, real-time CDP supports the creation of application recovery points back to any previous point in time. Unlike the previous versions of asynchronous replication, it captures and journals every write I/O on the target system to provide a theoretically infinite number of recovery points.
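Journaling every write is what makes any-point-in-time recovery possible, and the idea can be sketched as a replay of the journal up to the requested moment. This is a simplified illustration rather than how Scout is implemented: `WriteJournal`, the integer timestamps and the dictionary-based volume image are hypothetical.

```python
class WriteJournal:
    """Toy model of a CDP write journal (hypothetical, for illustration)."""

    def __init__(self):
        # Entries are (timestamp, block, data), appended in time order.
        self.entries = []

    def record(self, timestamp, block, data):
        self.entries.append((timestamp, block, data))

    def recover(self, point_in_time):
        """Rebuild the volume image as it stood at point_in_time."""
        image = {}
        for timestamp, block, data in self.entries:
            if timestamp > point_in_time:
                break  # later writes are not part of this recovery point
            image[block] = data
        return image


journal = WriteJournal()
journal.record(10, 0, "v1")
journal.record(20, 1, "a")
journal.record(30, 0, "v2")

image_at_25 = journal.recover(25)
image_at_30 = journal.recover(30)
```

Because every write is retained with its timestamp, any moment between the first and last journal entry is a valid recovery target. The trade-off in this model is that the journal grows with every write, so a real product would need to bound how far back it retains entries.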
All technologies evolve and mature over time and, in the same way, asynchronous replication has also matured. It has evolved from an early state of simply replicating data from a source to a target, to providing multiple recovery points in its near-CDP form, to what it is today: real-time CDP that provides a nearly endless number of recovery points to meet the specific recovery requirements of today’s world. What will be interesting to watch going forward is how real-time CDP in general, and InMage Systems’ Scout in particular, will evolve to provide higher levels of data protection for an even wider range of enterprise applications and operating systems.