We’ve all heard the stories of how our data is growing exponentially and, let’s face it, our storage spend is probably backing that up. That’s certainly what the CFO will tell you!
But how often do we stop and really think about why it’s growing and how to control it?
I had some of my traditional thinking about this challenged in an interesting way a couple of weeks back by an old friend who has just taken on a new role with Actifio (www.actifio.com). As people from solution providers do, he was sharing some information on what they do and the value they provide, and then he put up the following information:
It certainly struck a chord with me. The numbers were based on some IDC figures, and the gist of the graph is that a staggering 80% of the data in many organisations’ storage architectures is in fact copies of the production data sets. Put another way, for every 1TB of original production data there is roughly 4TB of copies sitting alongside it.
According to IDC figures, around 80% of data in production storage is copies of the production data set
As you can see from the graph above, lots of that data is there for all the right reasons: dev and test, backups, DR. So it’s not that the capacity is wasted or shouldn’t be there; it’s not all Johnny in accounts and his holiday snaps!
Well, if all the data has a place and is valid, what do we do about controlling it?
Firstly, there are definitely a number of technology solutions out there that can help. For example, I’ve worked with NetApp storage for around 9 years now, and their message on storage efficiency has always been incredibly strong, with some of the industry’s leading efficiency technologies around snapshots, de-duplication, compression and thin provisioning. Many other vendors now bring these technologies to market; some do it well, some not so much, but the option is there.
What else can we do to control the growth of data in our organisations? I did a little research and came up with 5 tips that you can follow, plus one emerging trend that may change the way you look at managing data in your business:
- Classify and understand your data – know where it is, who has access to it, and whether anyone actually accesses it at all.
- Store it in the right place – we hear lots about automated tiering, but maybe more importantly, make sure you understand which storage tier your data should sit in and place it there at the outset.
- Look at an archiving policy – if you’re applying pressure to your production storage, look at what is filling it and ask whether it really needs to be there. If no one has accessed data for 5 years, does it need to sit on your production storage?
- Manage data retirement – how much data in your organisation no longer has an owner? Look at how a strong governance solution can identify this data and help you to remove or archive it.
- Storage efficiency – earlier I mentioned NetApp and their storage efficiency technology. If your storage solution can dedupe and compress, make sure you use those features wherever you can.
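To make the archiving tip a bit more concrete, here is a minimal Python sketch of how you might go looking for files that nobody has touched in 5 years. It is only an illustration, not a replacement for a proper classification or governance tool. The function name `find_stale_files` and its parameters are my own invention, and it assumes the file system actually records last-access times, which is not always the case (some volumes are mounted with access-time updates reduced or switched off).

```python
import os
import time
from pathlib import Path

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def find_stale_files(root, years=5, now=None):
    """Return (path, size_in_bytes) for files under root not accessed in `years` years.

    Hypothetical helper for illustration only. It relies on the last-access
    time (st_atime) reported by the file system, which may be unreliable on
    volumes where access-time updates are reduced or disabled.
    """
    now = time.time() if now is None else now
    cutoff = now - years * SECONDS_PER_YEAR
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                # Skip files we cannot read (permissions, broken links, etc.)
                continue
            if info.st_atime < cutoff:
                stale.append((str(path), info.st_size))
    return stale
```

Summing the sizes in the returned list gives a rough idea of how much capacity could be a candidate for archiving, which is a useful number to take into the conversation with the data owners.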
Back to the start of this article and my meeting with the chaps at Actifio: where do they sit in all this? Well, those tips are all great if the data we are talking about is no longer needed or can be shifted out of the production environment, but what if the data is still key and critical? If you think about the graph I showed, most of that data is key to the business. It’s part of DR and backup, it operates in QA and dev environments, so it is needed within that production environment.
How do we deal with that then? That’s where this emerging trend of copy data virtualisation can help.
Copy data virtualisation is an emerging trend for managing storage growth
What’s copy data virtualisation? It’s the ability of a solution from companies like Actifio or Catalogic (www.catalogicsoftware.com) to take a copy of production data and store it outside of the production environment. Unlike archiving or traditional backups, though, the data is housed in such a way that it can be manipulated and presented back to the business instantly for a range of uses. That makes it not only a really efficient model for backup and recovery, but also great for presenting test and dev environments, feeding a data analytics solution, or extracting data and moving it to the cloud. All in all, it provides a hugely efficient and flexible way of handling the challenge of so many copies of our data sitting in production storage systems and, as we all know, efficiency and flexibility are part of the future for business IT.
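If it helps to picture the idea, here is a toy Python model of it. To be clear, this is not how Actifio or Catalogic actually implement their products; it is a sketch under my own assumption that one shared copy is kept and each consumer gets a lightweight virtual copy that only stores the blocks it changes. The class names `GoldenCopy` and `VirtualCopy` are made up for the illustration.

```python
class GoldenCopy:
    """A single stored copy of production data, held as a sequence of blocks.

    Hypothetical class for illustration; real products work very differently
    under the covers.
    """

    def __init__(self, blocks):
        self._blocks = tuple(blocks)  # never modified after creation

    def read(self, index):
        return self._blocks[index]

    def __len__(self):
        return len(self._blocks)


class VirtualCopy:
    """A writable view over a GoldenCopy that stores only the changed blocks."""

    def __init__(self, golden):
        self._golden = golden
        self._changes = {}  # block index -> new data for this view only

    def read(self, index):
        # Serve this view's own change if there is one, else the shared block.
        if index in self._changes:
            return self._changes[index]
        return self._golden.read(index)

    def write(self, index, data):
        if not 0 <= index < len(self._golden):
            raise IndexError("block index out of range")
        self._changes[index] = data

    def extra_blocks_stored(self):
        return len(self._changes)


# One shared copy, two independent views for dev and test.
golden = GoldenCopy([b"a", b"b", b"c", b"d"])
dev = VirtualCopy(golden)
test = VirtualCopy(golden)
dev.write(1, b"X")
```

In this toy model the dev view sees its own change, the test view still sees the original block, and the only extra capacity consumed is the single block that dev overwrote, rather than a second full copy of the data.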
Copy data virtualisation certainly addresses the data growth challenge in a new and interesting way, but don’t rule out the more traditional approaches we listed as well. Data growth is only going to continue to be a massive challenge for all of us charged with delivering business IT services, regardless of the size of the organisation. Don’t fear though, there is plenty of tech out there to help: some great traditional approaches which are still hugely valid, and also some clever emerging solutions that can change the way we manipulate and handle our data in the future.