
The Issue Behind Data Evaporation

The recent loss of pictures and music suffered by millions of Myspace users has again highlighted the fundamental problems that can arise, on both the supplier and the customer side, when cloud technology, in this case cloud storage, is misunderstood by the user and mismanaged by the supplier.

There are many articles, news items and blogs out there on the top ten things to do or not do when using cloud storage, and a thousand pieces of advice to boot. The one thing that struck me when sifting through all of this information was that much of it ignores the fundamentals anyone should consider before depositing valuable files and data (in fact, anything of value to you) in a remote, inaccessible, only partially controllable resource.

Some of the weird advice I came across said something like "choose a supplier that stores data for companies in your sector, as they know what they are doing", citing scalability as a potential issue. Follow that reasoning through, however, and the same types of companies probably share the same peaks and troughs, the same data types and usage patterns, which could actually work against you if they are all grouped together.

More weird advice was to watch out for bandwidth problems at data centres. Given this advice appeared in 2017, I can't really understand the concern from a data centre capability point of view: the vast majority of data centres have had multiple carriers and very large transit links to the internet for a long time. This advice seems to be focused on selling WAN accelerators and traffic-shaping devices, which is all well and good if you can get Amazon, Google or Microsoft to cooperate with your plan.

Other advice details how to ensure that your cloud storage supplier guarantees 99.99% network uptime, which shows a complete misunderstanding of how cloud providers of all shapes and sizes provision their connectivity and yours, on two important counts.

Firstly, if an upstream network provider will only guarantee 99.99% availability of their network, then anyone using that service, including the cloud storage provider, cannot offer any better service to a downstream client. Any service built on these upstream transits has to start from the premise that 99.99% is the best uptime it can offer.
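A quick back-of-the-envelope calculation makes the point. This is a purely illustrative sketch (plain Python, my own figures): when several services all have to be up for you to reach your data, their availabilities multiply, so the chain can only ever be worse than its weakest link.

```python
# Illustrative sketch: end-to-end availability of services in series.
# Each dependency that must be up multiplies into the total, so the
# combined figure is always at or below the worst single link.

def end_to_end_availability(availabilities):
    """Combined availability of services that must all be up (in series)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

# An upstream transit at 99.99% plus the provider's own platform at 99.99%:
combined = end_to_end_availability([0.9999, 0.9999])
print(f"{combined:.4%}")  # already below the 99.99% upstream figure
```

Add your own office link at, say, 99.9% to the list and the end-to-end number drops well below anything the storage provider's brochure promises.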

Secondly, most networks at the data centre end are built with N+1 resilience in mind and are therefore very stable. But what about the other end? Is the building at your end of the connection, typically reached over a point-to-point internet link, resilient as well? In my experience it is not. So the data may be available in the cloud, but not accessible from your office, because you have no connectivity.

Where Things Go Wrong

The main consideration when planning to use cloud storage, and one that is not widely discussed, is how you use your data and the categorisation of what data is needed at what point. Critical data needed to run your business must always be available, whereas historical data may well have a different access profile. Let's boil these down to three headings: risk, accessibility and cost.

In terms of risk, the primary consideration must be: what would I do if I couldn't get access to the data, or couldn't get it back at all? This is the question now facing Myspace's customers, whose "old" data was inadvertently deleted.

The accessibility question: what steps do I have to take to ensure I can access this data on my terms, and if I need it all the time, is putting it in the cloud the right option at all?

Considering cost: what is it going to cost to store and retrieve my data? That may include applications to get the data in and out, encryption tools to secure it in transit, egress charges (charges to retrieve your own data) and the cost of the circuit or internet transit that connects you to the cloud store. You may already have that connectivity, but it will be chewed up by the additional input and output flows of seeding and maintaining your cloud data store.
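To make the cost point concrete, here is a hypothetical estimate. The per-GB rates below are assumptions for illustration only, not any provider's actual pricing; the point is that retrieval (egress) is charged separately from storage and can dominate the bill.

```python
# Hypothetical monthly cloud-storage cost sketch.
# Both rates are assumed figures for illustration, not real pricing.

STORAGE_PER_GB = 0.023  # $/GB per month held in the store (assumed)
EGRESS_PER_GB = 0.09    # $/GB charged to retrieve your own data (assumed)

def monthly_cost(stored_gb, retrieved_gb):
    """Storage charge plus egress charge for one month."""
    return stored_gb * STORAGE_PER_GB + retrieved_gb * EGRESS_PER_GB

# 1 TB held, 200 GB pulled back out in a month:
print(f"${monthly_cost(1000, 200):.2f}")
```

Even at these modest assumed rates, pulling back a fifth of the data costs nearly as much as storing all of it, before counting encryption tooling or the connectivity itself.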

What about risk? I for one have all my important files and pictures backed up onto a device that I can connect to, and disconnect from, the outside world as I need it. Of course, I also use the cloud storage that my broadband provider or Amazon Prime offers me, as together they allow a multi-tiered approach to securing the things I consider valuable and worth protecting. Using both also lets me plan how I want or need to use this data, depending on the requirement at the time: essentially, storage tiering.

So why wouldn't I do the same for my business data? The principle is identical, and the notion of tiering how you store data is well understood by cloud and storage companies; they all have solutions that can be provisioned this way. Yet many companies treat the cloud proposition as an opportunity to dump data into a convenient, ever-expanding bucket, a bit like storing things in the garage or attic because we haven't worked out whether we need them or not.
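The tiering idea can be sketched in a few lines. This is my own illustrative example, not any provider's product: data is assigned a tier by how recently it was touched, with the thresholds chosen purely for demonstration.

```python
# Illustrative sketch of tiering data by recency of access, rather than
# dumping everything into one ever-expanding bucket. Tier names and the
# 30-day / 365-day thresholds are assumptions for demonstration.

from datetime import date, timedelta

def pick_tier(last_accessed, today=None):
    """Assign a storage tier based on how long since the data was used."""
    today = today or date.today()
    age = today - last_accessed
    if age <= timedelta(days=30):
        return "hot"      # runs the business: must be instantly available
    if age <= timedelta(days=365):
        return "warm"     # occasional use: cheaper storage, slower retrieval
    return "archive"      # historical: cheapest tier, retrieval may take hours

print(pick_tier(date(2019, 1, 1), today=date(2019, 1, 15)))  # hot
```

The choice of tier then drives where the data lives and how much protection you pay for, which is exactly the planning step the bucket-dumpers skip.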

The companies that bemoaned Amazon when its storage platform had a major outage a couple of years back, blacking out many businesses on the Eastern seaboard of the US, had dismissed on price the very options, such as geo-redundancy for their data, that would have prevented it.

Those people now facing the loss of treasured music or pictures likewise didn't consider the impact of losing those files, more than likely because of the price of extra protection rather than any limitation of the technology.

Accessibility is usually high on the list of important features when evaluating cloud storage, but it is generally focused on questions like "how quickly can I get things in and out?", not the more important "what happens if I can't get things in and out?", which is normally a bigger operational issue and one you can't fix on a smartphone. It also flags another question that is seldom asked: what stops other people getting at my data, particularly in transit? The worrying fact is that, despite all the warnings, vast amounts of data, corporate and personal, are still sent in the clear every day, and are therefore being stolen.

And what of the cost? Are we really fixated on sub-dollar-per-GB storage, or are we more concerned with securing data at a cost-effective price? That means building in a plan which demonstrates that if one side of the storage arrangement goes south and your cloud storage is inaccessible, you still control the options for reacting to the outage, or, as in the case in point, to the deletion of your data.

Whatever your opinion of the merits of cloud storage, it should never be taken for granted that these companies will secure your data at all costs and with cast iron guarantees.  It should also be considered that where some form of physical control of the devices or products used to store your valued files is possible, this hybrid version of storage tiering is always a good option, and if you’re not sure, seek expert advice before you the loose control that many assume they have until it’s too late and your valuable data as evaporated for ever.