“This is what it sounds like when Doves Cloud Service Providers cry”
Some time ago, I wrote a piece arguing that not all storage media are equal and that it’s important to understand the realities of both hard disk and solid state drive technology. The key takeaway was that storage buyers need to strike a sensible and healthy balance of cost, growth and risk in any solution implementation.
This matters for many businesses, but it is especially critical for cloud or managed service providers who provide a platform for multiple organizations. In these cases, an outage or performance issue can have devastating results not only for a provider’s reputation but also for its bottom line.
Well now reality has hit home for customers of Dimension Data in Australia (see http://forums.theregister.co.uk/forum/1/2014/07/04/dimension_data_in_cloud_outage/ ). It’s hard to know exactly what went on but they’ve been quite open in admitting that they suffered an outage on their EMC storage implementation. The result was no service to customers for more than 24 hours – OUCH!
Sadly this is all too common with the storage architectures being implemented in “cloud” data centers, but it really doesn’t have to be this way. If we look at the two most common causes of storage failures in enterprise data centers (be it a partial failure through a performance issue or a complete outage), they’re very simple:
- Human error (e.g. knocked cable, wrong controller rebooted, wrong drive pulled)
- Drive failure (either a RAID rebuild or multiple failures causing an outage)
Both of these scenarios are entirely avoidable through the realization of true zero-touch storage. The storage industry has done a fantastic job of conditioning storage buyers and administrators into believing that hard disk failure and subsequent replacement is entirely acceptable and poses no risk when this couldn’t be further from the truth. I myself have worked (many, many years ago) as a storage field engineer and I’ve seen, heard, and I have to confess been involved in horror stories involving either human error or multiple drive failures resulting in outage and / or data loss.
However this doesn’t need to be the case. I’m starting to see a trend of all flash array vendors arguing that this can easily be solved by moving to an all SSD / flash architecture but the same issue can easily happen again. Don’t believe the hype about “there’s no moving parts so there’s nothing to fail” – that’s simply marketing hype and the truth is – ALL drives have the ability to fail, spinning or non-spinning. The only way to avoid drive failure is to have the ability to repair drives in-situ with no impact on the workload. X-IO has spent a great deal of time and money on solving this issue (with its Intelligent Storage Element architecture) and cloud service providers are perfectly positioned to take advantage of this.
The ideal storage for cloud providers is something that is:
- Truly zero-touch. Many cloud and managed service providers have remote or even third-party data centers – wouldn’t it be nice if you never had to go near a storage array?
- Consistent. Storage should give consistent performance and reliability regardless of its utilization. It should give the same performance at 1% capacity utilization as it does at 99%.
- Scalable. You shouldn’t have to buy a 500-disk monster array up front to get predictable performance, nor should you have to suffer when you add an extra shelf of disks.
- Commercially viable. At the end of the day, it’s key that this predictability and reliability doesn’t come at a cost that breaks the business model of the service provider.
Hey guess what, X-IO’s ISE delivers all the above and has the case studies here to prove that it’s not just marketing hype but reality!