A quick look back at my blogging and social media back catalogue will show that I’m a bit of a fan of the concept of “Data Fabric”. Yep, guilty. One thing I’ve noticed while I’m on my data fabric rounds, explaining why a fabric strategy is important, is a question that often comes up:
“It all sounds great, this fabric idea, but where do I start?”
It’s a great question, which has inspired me to produce a series of posts explaining the practicality of how you can build your own data fabric.
First though, some background including the answer to the critical question “Why on earth go on about this fabric thing in the first place?”
Why a Fabric?
In this most transformative of times in technology, the need for flexibility in our technical architectures has never been greater. The march toward “cloud” models of technical deployment continues apace, be that private, public or hybrid clouds. One part of our infrastructure presents a bigger challenge than most: our data. And that’s a problem!
Why a problem? Ultimately, the reason we build any infrastructure is so that we can present data, protect data, make data available, manipulate data and analyse data. Compute, cloud, mobility: it’s all about getting value from our data and delivering it to our data consumers.
The issue is that data has weight and volume. This makes it hard to move around, potentially expensive (look at how much the public cloud providers charge you to get it into and out of their platforms) and of course slow (you cannot beat the laws of physics, to throw in a Star Trek misquote!). These problems don’t help in a world where we want complete flexibility, where we want to be able to drop our data into a development environment, and where we want to have our data moved into and out of appropriate repositories for backup, recovery or DR, without the commercials or the physics defeating us.
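To put a rough number on that “weight”, here is a minimal back-of-the-envelope sketch in Python. The function name, the 80% link efficiency and the example figures are all my own illustrative assumptions, not measurements from any provider or product.

```python
def transfer_days(data_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to move data_tb terabytes over a link_mbps link.

    efficiency is an assumed fraction of the line rate that is actually usable
    (protocol overhead, contention and so on); 0.8 is a guess, not a measurement.
    """
    total_bits = data_tb * 1e12 * 8                    # decimal terabytes to bits
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / usable_bits_per_second / 86400  # seconds to days


if __name__ == "__main__":
    # Hypothetical example: 50 TB of data over a 100 Mbit/s link
    print(f"{transfer_days(50, 100):.1f} days")
```

Even with a perfect link (efficiency of 1.0) the example data set takes well over a month to move, which is why simply “copying it across” is rarely an option.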
All of these challenges are things we have to take into account, and they are why a fabric strategy is important.
What is a fabric?
Data fabric is a strategy rather than a technology, but that doesn’t alter just how critical it is. All of the reasons we want a fabric are outlined above, and a fabric strategy is the answer to those challenges: it provides us complete mobility of our data between many data repositories with the minimum number of tools, and is absolutely key to a successful data strategy for today, and certainly for the future.
Think about how a data fabric could change the way we deal with public cloud storage. One of the questions I always get about cloud is “Am I locked in?” (or “Do they have me over a barrel?”), and the reality is yes, because getting your data in and out is hard. But what if you could break that barrier so you had complete flexibility of choice? One month you have your data in Azure; the next, AWS is commercially a better fit, so you quickly flip your resources across and save yourself significant costs. That not only allows us to exploit many of the capabilities available to us, but also opens up whole new ways to operate our business.
It is this kind of flexibility that makes a data fabric strategy a critical part of our future infrastructure plans. Be they on-premises, public cloud, private cloud or a mixture of them all, our data strategy has to ensure our data is available wherever we need it to be, whenever we want it to be there.
The idea of building a fabric makes sense; of course we want and need that ability to move our data between different storage repositories.
This raises the question: whose technology is capable of building such a fabric?
There are technologies that deliver bits of a solution: migration tools that move VMs into public clouds, storage gateways, and backup and DR as a service solutions that allow us to replicate our data into clouds. These technologies are great and can indeed be part of an overall strategy, but on their own they are solution silos, and there is the potential for an awful lot of stitching to be done to create a data fabric.
It will probably come as no surprise to those who’ve looked at my stuff on data fabric before that the main strategic partner for me in this space is NetApp. The NetApp fabric strategy is extremely compelling, built on a backbone of Data ONTAP but including so much more: cloud and virtual versions of ONTAP, AltaVault, StorageGRID, NetApp Private Storage (NPS) for public cloud and of course the upcoming addition of SolidFire.
All provide NetApp with a wide range of storage solutions, but importantly the fabric strategy builds into this the ability to move data between each of these platforms. Many of these tools are already in place; moving between ONTAP and its physical, virtual and cloud solutions is as easy as you’d expect. The capability to move between object stores, AltaVault, 3rd party storage and E-Series, all with a simple set of tools, is either already with us or will be in the not too distant future.
This, in my opinion, delivers the most complete strategy of any of the data storage players.
So if Data Fabric sounds like something you want to deliver into your business then read on, as we look at how you can start that journey.
Starting the fabric journey
What part of our current infrastructure is a good place to start? A NetApp fabric world presents us with multiple starting points, and over this occasional series we’ll look at each of these potential Data Fabric entry points:
- Production Storage
- Test and Dev
- Public Cloud
- IT Continuity
Today, though, we’ll start with a bit of “low hanging fruit”, as the sales folk like to say, by looking at backup and archive.
Backup and archive is often a good place to start with any new technology. It’s relatively non-disruptive and relatively low risk, as we can keep existing strategies in place until we are absolutely sure our new solution is what we need.
With that in mind, then, how do NetApp help us move into data fabric through our backups and archive?
If we think about what we want, which is our data in the most appropriate place, then public cloud is a great fit for many of our backup and archive needs: hugely scalable and relatively cheap. There are lots of cloud backup products out there, from the simple to the complex. However, the key to data fabric is ensuring this is flexible.
Step up NetApp AltaVault. AltaVault is a cloud integrated backup appliance. It presents itself to your existing backup solution as a backup target (so there is not necessarily any need to change that), while at the other end it talks to an object store, be that yours or, more likely, a cloud based service (such as Azure, Amazon, SoftLayer etc.). The AltaVault appliance then works as a gateway between your on-premises solution and your business appropriate object store, deduplicating, compressing and encrypting data before sending it off to your storage repository. For performance, it also caches a large segment of that data for local recovery of the most recent data sets, as well as optimising performance as the backup/archive job is written to it.
That’s all great and is a really nice way of opening up the advantages of private and public cloud platforms to our data backup and archive. But how is this part of a fabric? How does this give me flexibility?
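To make the “deduplicate and compress before sending” idea a little more concrete, here is a toy Python sketch of how a gateway of this kind could work in principle. To be clear, this is not how AltaVault is implemented: the function names, the fixed 4 KB chunk size and the plain dictionary standing in for an object store are all my own assumptions, and the encryption step is left out entirely to keep the example to the standard library.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # illustrative fixed chunk size, not a real appliance setting


def backup(data: bytes, store: dict) -> list:
    """Send data to the store in chunks, uploading each unique chunk only once.

    store is a dict standing in for an object store (chunk hash -> compressed chunk).
    Returns the "recipe": the ordered list of chunk hashes needed to rebuild data.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()
        if key not in store:                  # deduplicate: skip chunks already stored
            store[key] = zlib.compress(chunk)  # compress before "sending"
        recipe.append(key)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from a recipe and the store."""
    return b"".join(zlib.decompress(store[key]) for key in recipe)


if __name__ == "__main__":
    object_store = {}
    payload = b"A" * 8192 + b"B" * 4096       # two identical "A" chunks, one "B" chunk
    recipe = backup(payload, object_store)
    print(len(recipe), "chunks referenced,", len(object_store), "chunks stored")
    assert restore(recipe, object_store) == payload
```

The point to notice is that restore only needs the recipe and access to the store, not the machine that wrote the backup in the first place.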
Where AltaVault really opens up data fabric is with the availability of public cloud based variants of the on-premises appliances.
How does this help?
Let’s say that we have decided the best place for our backup and archive data is an Amazon S3 store. We deploy our on-prem AltaVault, which takes our backup data and sends it off, securely and efficiently, to the cloud.
Roll back to the beginning: why do we need fabric?
Because we want to be able to have access to our data in the best place possible.
Let’s say we have a disaster and lose the site that houses our AltaVault appliance. Fear not: we go off to AWS Marketplace and fire up a cloud version of AltaVault. We can point this cloud appliance at our AWS based cloud storage and, hey presto, there are all of our backups. Even better, if we want that data back and don’t have access to our original data store, we can restore it into the cloud, maybe even into a version of Cloud ONTAP, and there it is, available to us in the best and most convenient place we could need it.
Remember what we said at the start: the idea of a fabric is to ensure that our data is where we want it when we want it. Hopefully you can see here how AltaVault takes one part of our data infrastructure and starts to weave that straight into a future data fabric. There is no disruption on site and no changing of any of our fundamental infrastructure; we are just taking our existing backup approach, taking advantage of today’s technology paradigms and gaining a whole new and flexible way of protecting our data.
That’s what you can do today, right now, but it doesn’t stop there. Why not check out what NetApp have planned for data fabric? Have a look at this demo presented by NetApp founder Dave Hitz at the recent NetApp Insight conference in Berlin (running time about 13 minutes).
There you go then, step one on how you can start your move to a data fabric. Yes, this is very much about NetApp and their fabric, as I believe their vision is by far and away the most complete in the market. But hey, even if you don’t want to use NetApp in its entirety, or even in part, hopefully this has opened up some of the practical considerations of a data fabric and given you some ideas to consider as you plan the next part of your data strategy.
Any questions, feel free to contact me on Twitter, LinkedIn or the blog comments. I’d love to talk more with you about Data Fabric.
Below are links to a bunch of other things you may want to read, some by me, some from others.