Welcome to this week’s DCIG video blog. Again, we’re the analysts here at DCIG. I’ve got Todd Dorsey, Ken Clipperton, and myself, Jerome Wendt, the president and founder of DCIG.
[Jerome] We’re spending a few minutes sharing some thoughts that are on our minds from the different research areas we are working on.
One of the things that’s been on our minds this week, that we wanted to talk about and that we’ve really been hearing a lot more about, is resilience.
This has been a topic for years. However, it seems like it’s taken on greater importance especially with ransomware becoming so prevalent. Companies are much more concerned about the resilience of their infrastructure, how quickly they can recover, and how quickly resilience is being facilitated.
Introducing resilience into your infrastructure may even facilitate the spread of ransomware. I’ve seen that with some of the people I have talked to. They have a resilient infrastructure, and ransomware has, at times, gotten into areas where maybe it should not have. They have created failover scenarios which introduced ransomware into different areas.
So, Ken, I know you’ve been talking a lot about this and thinking about it. Why don’t you share a few thoughts that are on your mind, and maybe Todd and I will jump in as appropriate?
[Ken] Yes, well, the reason this came up for me is that I’m working on a series of reports that really focus on the transition from storage management to effective data management. A lot of organizations are moving through that, and one of the drivers is business resilience. How do we keep our business up and running and sustainable in the face of cyber-attacks? That is one of the key drivers.
But there are other drivers as well, such as the drive to derive more value from the data that organizations have been keeping. Some of the folks I’ve talked to talk about activating that data. They do not just want to keep it around “in case” but actually derive value from it.
Another driver of why organizations are moving to data management is the scale of the data as they get beyond that petabyte boundary into multiple petabytes. Manual processes they may have been able to use in the past just do not work for them anymore.
So, they need tools that can automate and maybe act as policy-based drivers for their infrastructure. They may move away from storage management to data management in order to achieve business objectives, including business resilience: keeping the business open in the face of cyber-attacks.
One of the things that jumped out at me as I’ve been doing the research was an interview with a gentleman who manages the provision of services at a security-focused provider where they do incident response. They had helped over 100 customers recover from ransomware and cyber-attacks.
What he talked about, and really the kind of thing they kept coming back to, was the benefit of snapshot technology in primary storage. It acted as a key enabler for bringing systems back online in hours or days instead of weeks.
But even where they have been engaged, they bring in a team of professionals that are just focused on this recovery process, typically to get critical core services online again. They target getting those up within 48 hours or up to a few days. Then for other high-priority workloads, they’re typically talking about a month, and that’s, again, with a dedicated team.
This is not do-it-yourself. This is bringing in the experts, unless of course, you’ve been their client before. In that case they have hardened your infrastructure and helped you be ready to recover.
But really, the point coming out of that conversation was a couple of things.
One, for me, every business really needs to develop and implement a strategy for recovering from a successful cyber-attack within the time frames necessary to meet business continuity objectives. Preferably, I mean, before disaster strikes.
There are these firms out there that can help you after the fact and get you up and going. I’m sure they come at a good price. But even then, you’re talking days for just your core services to get up and running: Active Directory, your virtualized environment, your backup infrastructure, so you can begin to recover things.
I thought it was interesting learning from those experts, who have worked with over 100 clients, that this is the typical thing. A few days to get the core going, a month to get other important stuff going, and that’s before you even talk about the rest of the infrastructure.
Certainly, we saw that in the Omaha area, where a major medical facility got hit a little over a year ago. I know they were not able to schedule new appointments for weeks. Systems were still down multiple weeks later and into months later.
Then more recently there’s another medical facility whose parent organization got hit, and they were doing scheduling on paper. It was two or three weeks later and they were still doing scheduling on paper. It had significant impacts on people who were receiving treatments.
I’m thinking about how to make this threat of cyber-attack into a win for the business and a couple things came to mind.
One, this certainly has C-level executive and even board-level awareness at most enterprises. I mean, this is a real threat. Jerome, you’ve been saying for a couple of years now it’s not “If” there will be a successful attack, it’s “When,” right?
So, I mean, you’ve just got to have a plan that will work for recovering in time. It’s necessary. So, there’s visibility, but it’s also an opportunity to take a look at the big picture. It’s, okay, what are the other priorities for the business that moving to a data management approach, with policy-based workflow automation and capabilities, essentially lets you take advantage of?
Look at projects that you know are in the pipeline that could facilitate accomplishing those things. Maybe it’s a cost reduction opportunity. Maybe it’s a revenue-enhancing opportunity. You’re accomplishing an enterprise project, but taking advantage of that low-hanging fruit to be more strategic, accomplish multiple objectives with it, and then move toward data management.
[Jerome] Hey Todd, I have a question for you. I know you’re working on some HCI stuff right now. We were talking earlier about how resilience is just kind of built into HCI. That’s probably the biggest driver that you can potentially have: a highly resilient local infrastructure that you can potentially even extend to other sites.
But we started talking about ransomware, and I know I’m putting you on the spot here a little bit, but does anything come to mind when you think about HCI? What are these guys doing? What are some of the providers doing to detect and stop ransomware in their solutions? Are they doing anything? Just in light of what Ken was talking about, does anything come to mind with those solutions?
[Todd] Nothing comes to mind as part of those solutions specifically around ransomware, other than some of the typical features that we see in software-defined storage solutions and HCI software solutions that are there.
I do think, though, that there’s a related, or maybe a deeper, question. It kind of comes back to a webinar, Ken, that you and I did on how people get started on transforming their infrastructure, which business resilience is a part of.
I think one of the questions that enterprises have to ask is, “What sort of HCI software solution do we want?” That is dependent upon a lot of things, like their skill sets, their budgets, and whether they want integrated solutions. But the business resilience features, and how hard or how custom they want those to be, kind of play into the solutions that different vendors offer.
Some vendors offer very strong out-of-the-box solutions with built-in, cookie-cutter approaches to restoring workloads and data sets. Others are more custom developed and require skill sets to fine-tune. So, I don’t know if that answers your question, but there are a lot of dependencies there.
Anyway, there are deeper questions that enterprises need to ask about how the business resiliency features they need fit into the way they run their enterprises, so they are able to recover and keep their business running.
[Jerome] Yeah, I think one of HCI’s most appealing attributes is that it really does try to make sure that you continue running uninterrupted. If one of the devices, or maybe even more than one device, goes offline, you can stay operational. Or even if an entire site goes offline, your applications may continue running elsewhere.
As Ken was saying about snapshots, that has really become a key enabler for recovering more quickly. It’s not quite like HCI, where you can sort of transparently recover, but at least you have data you could bring back online.
I know I’m working more on the data protection and backup side. So it’s sort of like, okay, you guys are dealing more with production systems, and I’m dealing with, well, if production systems fail, what can backups provide? I think organizations really need to think about that, especially if you’re deduplicating your data (I’ve been doing quite a bit of research on that lately).
OK, well, I’ve got my data, and I get these deduplication ratios. But then you switch over to instant recoveries and start talking about doing these quick recoveries from the deduplicated data. Boy, I tell you one thing, they may use the term instant restore. But from what I’ve seen it’s anything but instant restore, except for maybe a couple of products.
Just to restore the data for a few applications, you may be talking certainly 15 minutes, if not hours, to do a restore. When you start talking about doing restores at scale for multiple applications, just to recover the data and get it off of your backup onto storage, I mean, you could be talking days to restore.
I don’t want to prolong this conversation. Resiliency has captivated the attention of enterprises right now. Companies are examining it more critically; before, it used to be sort of a check-the-box item. We can do backups. We have a backup. We’ve done a snapshot.
Now what matters is how quickly you can access that backup or that snapshot and bring it back online. Or, in the case of HCI, when an appliance goes offline, or a node goes offline, or entire sites go offline, what does that look like and how do you recover? That is really taking on much more significance in this age of ransomware.
[Todd] I would say high availability is really a common feature within all the HCI solutions, with two-node solutions at a minimum. Some offer multi-node solutions for recovering if a site goes offline. But I do think, to add a thought to your opening, Ken, with regards to data management, one of the challenges that needs to go into data management, given the huge volume of data that enterprises are managing, is thinking ahead of time: “What is my critical data?” “What are my critical workflows?” Not all data is critical to running the business. But you need to identify which data is critical, and then ask, do we have an architecture that supports fast recovery?
[Ken] In the article we published in early December on our site, I talk about how achieving data resilience at scale requires a new approach, and how leaders need to change their focus from successful backups to successful recoveries. That’s where the applications are live again, so that we can meet those recovery time requirements and keep the business running.
[Jerome] Well, Ken, with those thoughts, thank you both for contributing to today’s conversation on resiliency. For anyone who’s watching, thank you for joining us. If you want to learn more about DCIG, please go to our website at https://www.dcig.com and, if you haven’t already done so, subscribe to our newsletter so you can keep up to date on what’s going on with us.
Thank you for joining us, and please subscribe to our DCIG YouTube channel so you can keep abreast of all the blogs that we’re putting out here on a regular basis. Thank you.