Building a modern data platform – what have we learned?

As I reach the end of this series, it raises the question “what have we learned?”. If you’ve read through it all, you’ve learned you are patient and I’ve learned that writing a series of posts actually takes quite a bit of time. But I digress!

Let’s start at the beginning – what is a modern data platform?

I’ve used the term throughout, but what does it mean? In the introductory post I stated “In today’s modern world however, storing our data is no longer enough, we need to consider much more”, and that’s true, as organisations now want their data to provide competitive edge and insights. We also need to ensure we are “developing an appropriate data strategy and building a data platform that is fit for today’s business needs”. In essence, those two areas neatly define a modern data platform: storing data is no longer enough, and our platform needs to fit today’s rapidly changing demands, integrate with new technologies and give us the scale and flexibility we need to turn our data into an asset, all while maintaining privacy, security, governance and control.

It’s not storage

While storage plays an important part in any data strategy (our data has to live somewhere), it’s important to realise that when we talk about a data platform, it’s not about storage. The right storage partner plays a crucial part, but the choice isn’t driven by media types, IOPS or the colour of the bezel; it’s about a wider strategy and ensuring our technology choice provides the scale, flexibility and security a modern platform demands.

Break down walls

We have also learned that data cannot be stored in silos, be that an on-prem storage repository or its modern equivalent, the “cloud silo”. Placing our data somewhere without considering how we move it, so we can do what we need to with it quickly and easily, is not designing a modern data platform.

Data Insight is crucial

Where our data is held, and on what, while important, pales when compared to the importance of insight into how our data is used. Our modern data platform must provide visibility into the who’s, where’s, when’s, what’s and why’s of data usage: who’s accessing it, where it is, when, if ever, they are accessing it, what they are accessing and why. Knowing this is critical for a modern data platform. It allows us to build retention, security and compliance policies, to start to build effective data leak protections, and to be more efficient with our storage, controlling the costs and challenges that come with our ever-increasing reliance on data.

Without this insight you don’t have a modern data platform.

Data is everywhere

We have also learned that our data is everywhere. It no longer resides within the protected walls of our data centres; it lives on a range of devices both inside and outside those walls. That’s not just the data we have, it’s also the increasing range of devices creating data for us, and our platform needs to be able to ingest, process and control all of it. Protecting data at the very edges of our network to the same degree that we protect, secure and govern that which sits inside our data centres is crucial.

Cloud, cloud and more cloud

Just a few years ago the prognosis for the data industry was that cloud was going to swallow it all and those who looked to use “traditional” thinking around data would be swept away by the cloud driven tide.

Now, while cloud is unlikely to wipe out all data life as we know it, it should certainly play a part in your data strategy. It has many of the attributes that make it an ideal repository; its flexibility, scale and even its commercial models make it an attractive proposition.

But it has limits. Ensuring our data platform can integrate cloud where appropriate, while maintaining all of the enterprise control we need, is a core part of a modern platform; you can’t design a modern platform without considering cloud.

It’s a platform

The reason I used the word platform is because that is what it is: it’s not one component, it is built up of multiple components. As I’ve shown here, it’s storage, data management, governance and control, be it in the datacentre, on the edges of your network or utilising the cloud.

The days of our data just being about one element are gone; we need a strategy that looks at how we use data in its entirety.

Building a modern data platform

The point of this series has been to provide some practical examples of the tools and technologies I’ve used building modern data platforms. Not every platform uses all of these technologies all of the time and it doesn’t have to be these specific ones to build your platform. What is more important is the concept of a data platform and hopefully this series has introduced you to some areas you may not have considered previously and will help you design a platform to get the very best from your data assets.

If you have any questions, please leave a comment on the site, or contact me on Twitter @techstringy or LinkedIn.

If you’ve missed any of the series head back to the introduction where you’ll find links to all of the parts of the series.

Thanks for reading.


Assessing the risk in public cloud – Darron Gibbard – Ep72

As the desire to integrate public cloud into our organisations’ IT continues to grow, the need to maintain control and security of our key assets is a challenge, but one that we need to overcome if we are going to use cloud as a fundamental part of our future IT infrastructure.

The importance of security and reducing our vulnerabilities is not, of course, unique to public cloud; it’s a key part of any organisation’s IT and data strategy. However, the move to public cloud does introduce some different challenges, with many of our services and data now sitting well outside the protective walls of our datacentre. This means that if our risks and vulnerabilities go unidentified and unmanaged, we are open to the potential of major and wide-reaching security breaches.

This week’s Tech Interviews is the second in our series looking at what organisations need to consider as they make the move to public cloud. In this episode we focus on risk: how to assess it, how to gain visibility into our systems regardless of location, and how to mitigate the risks that our modern infrastructure may come across.

To help discuss the topic of risk management in the cloud, I’m joined by Darron Gibbard. Darron is the Managing Director for EMEA North and Chief Technology Security Officer for Qualys; with 25 years’ experience in the enterprise security, risk and compliance industry, he is well placed to discuss the challenges of public cloud.

In this episode we look at the vulnerabilities that a move to cloud can create as our data and services are no longer the preserve of the data centre. We discuss whether the cloud is as high a risk as we may be led to believe and why a lack of visibility to risk and threats is more of a problem than any inherent risk in a cloud platform.

Darron shares some insight into building a risk-based approach to using cloud, how to assess risk, and why understanding the impact of a vulnerability is just as, if not more, useful than working out the likelihood of a cloud-based “event”.

We wrap up with a discussion around Qualys’s 5 principles of security and their approach to transparent orchestration, ensuring that all this additional information we can gather can be used effectively.

The challenges presented around vulnerability and risk management when we move to public cloud shouldn’t be ignored, but it was refreshing to hear Darron present a balanced view, discussing how the cloud, when managed correctly, is no riskier than any enterprise environment.

Qualys are an interesting company with a great portfolio of tools, including a number that are free to use and can help companies of all sizes reduce their risk exposure both on-prem and in the cloud. To find out more about Qualys you can visit www.qualys.com.

You can also contact Darron by email dgibbard@qualys.com or connect with him on LinkedIn.

Thanks for listening.

For the first show in this series then check out – Optimising the public cloud – Andrew Hillier – Ep71

Fear of the delete button – Microsoft and compliance – Stefanie Jacobs – Ep69

Compliance of data continues to trouble many business execs; whether IT focused or not, it is high on the agenda for most organisations. Anyone who has listened to this show in the past will know that while technology only plays a small part in building an organisation’s compliance programme, it can play a significant part in their ability to execute it.

A few weeks ago I wrote an article as part of the “Building a modern data platform” series. That article, Building a modern data platform “prevention”, focussed on how Microsoft Office365 could aid an organisation in preventing the loss of data, either accidental or malicious. It explains how Microsoft have some excellent, if not well known, tools inside Office365, including a number of predefined templates which, when enabled, allow us to deploy a range of governance and control capabilities quickly and easily, immediately improving an organisation’s ability to execute its compliance plans and reduce the risk of data leaks.

This got me thinking: what else do Microsoft have in their portfolio that people don’t know about? What is their approach to business compliance, and can that help organisations deliver their compliance plans more effectively?

This episode of the podcast explores exactly that topic. This is a show I’ve wanted to do for a while, and I have finally found the right person to help explore Microsoft’s approach and what tools are quickly and easily available to help us deliver robust compliance.

This week’s guest is Stefanie Jacobs, a Technology Solutions Professional at Microsoft with 18 years’ experience in compliance. Stefanie, who has the fantastic Twitter handle @GDPRQueen, shares with great enthusiasm the importance of compliance, Microsoft’s approach and how their technology is enabling organisations to make compliance a key part of their business strategy.

In this episode we explore all the compliance areas you’d ever want, including the dreaded “fear of the delete button”. Stefanie shares Microsoft’s view of compliance and how it took them a while to realise that security and compliance are different things.

We talk about people, the importance of education and shared responsibility. We also look at the triangle of compliance: people, process and technology. Stefanie explains the importance of terminology and understanding exactly what we mean when we discuss compliance.

We also discuss Microsoft’s 4 steps to developing a compliance strategy, before we delve into some of the technology they have available to help underpin your compliance strategy, especially the security and compliance section of Office365.

We wrap up with a chat on what a regulator looks for when you have had a data breach and also what Joan Collins has to do with compliance!

Finally, Stefanie provides some guidance on the first steps you can take as you develop your compliance strategy.

Stefanie is a great guest, with a real enthusiasm for compliance and how Microsoft can help you deliver your strategy.

To find out more about how Microsoft can help with compliance you can visit both their Service Trust and GDPR Assessment portals.

You can contact Stefanie via email Stefanie.jacobs@microsoft.com as well as follow her on Twitter @GDPRQueen.

Thanks for listening

If you enjoyed the show, why not subscribe? You’ll find Techstringy Tech Interviews in all good homes of podcasts.

While you are here, why not check out a challenge I’m undertaking with Mrs Techstringy to raise money for the Marie Curie charity here in the UK, you can find the details here.

Building a modern data platform – Prevention (Office365)

In this series so far, we have looked at getting our initial foundations right and ensuring we have insight and control of our data, along with the components I use to help achieve this. This time, however, we are looking at something many organisations already use, which has a wide range of capabilities that can help to manage and control data but which are often underutilised.

For ever-increasing numbers of us, Office365 has become the primary data and communications repository. However, I often find organisations are unaware of many powerful capabilities within their subscription which can greatly reduce the risk of a data breach.

Tucked away within Office365 is the Security and Compliance section (protection.office.com), the gateway to several powerful features that should be part of your modern data strategy.

In this article we are going to focus on two such features, “Data Loss Prevention” and “Data Governance”, both of which offer powerful capabilities that can be deployed quickly across your organisation and can significantly mitigate the risk of a data breach.

Data Loss Prevention (DLP)

DLP is an important weapon in our data management arsenal. DLP policies are designed to ensure sensitive information does not leave our organisation in ways that it shouldn’t, and Office365 makes it straightforward to get started.

We can quickly create policies to apply across our organisation that help identify the types of data we hold. Several predefined options already exist, including ones that identify financial data, personally identifiable information (PII), social security numbers, health records, passport numbers etc., with templates for a number of countries and regions across the world.
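Under the hood, sensitive-information types like these typically combine pattern matching with a validation check, so a random 16-digit number isn’t flagged as a credit card. The sketch below is a hypothetical Python illustration of that idea; the regex and checksum logic are my own simplification, not Microsoft’s actual detection rules.

```python
import re

def luhn_valid(number: str) -> bool:
    """Checksum widely used to validate credit card numbers."""
    digits = [int(d) for d in number]
    # Double every second digit from the right, subtracting 9 if it exceeds 9
    for i in range(len(digits) - 2, -1, -2):
        digits[i] *= 2
        if digits[i] > 9:
            digits[i] -= 9
    return sum(digits) % 10 == 0

# Candidate card numbers: 13-16 digits, optionally separated by spaces or dashes
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def find_card_numbers(text: str) -> list[str]:
    """Return candidate card numbers that also pass the Luhn check."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if 13 <= len(digits) <= 16 and luhn_valid(digits):
            hits.append(digits)
    return hits
```

The two-stage shape (cheap pattern match, then a stricter validation) is what keeps false positives manageable in real DLP engines.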

Once the policies that identify our data types are created, we can apply rules to that data on how it can be used; we can apply several rules and, depending on requirements, make them increasingly stringent.

The importance of DLP rules should not be underestimated. While it’s important we understand who has access to and uses our data, too many times we feel this is enough and don’t take the next crucial step of controlling the use and movement of that data.

We shouldn’t forget that those with the right access to the right data, may accidentally or maliciously do the wrong thing with it!

Data Governance

Governance should be a cornerstone of a modern data platform. It is what defines the way we use, manage, secure, classify and retain our data, and it can impact the cost of our data storage, its security and our ability to deliver compliance to our organisations.

Office365 provides two key governance capabilities.

Labels

Labels allow us to apply classifications to our data so we can start to understand what is important and what isn’t. We can highlight what is for public consumption and what is private, sensitive, commercial in confidence, or any other classification you have within your organisation.

Classification is a crucial part of delivering a successful data compliance capability, giving us granular control over exactly how we handle data of all types.

Labels can be applied automatically based on the contents of the data we have stored, applied by users as they create content, or applied in conjunction with the DLP rules we discussed earlier.

For example, a DLP policy can identify a document containing credit card details and automatically apply a rule that labels it as sensitive information.
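As a mental model, that auto-labelling flow behaves like a small rules engine evaluated in priority order. Here is a hypothetical Python sketch; the label names and patterns are illustrative inventions of mine, not Office365’s built-in classifiers.

```python
import re

# Illustrative label rules, checked in priority order: first match wins
LABEL_RULES = [
    # 13-16 digit sequences that look like card numbers
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "Sensitive - Financial"),
    # A crude national-insurance-number-like shape: two letters, six digits, one letter
    (re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"), "Sensitive - PII"),
    (re.compile(r"commercial in confidence", re.IGNORECASE), "Confidential"),
]

DEFAULT_LABEL = "Public"

def classify(document_text: str) -> str:
    """Return the first matching label, falling back to a public default."""
    for pattern, label in LABEL_RULES:
        if pattern.search(document_text):
            return label
    return DEFAULT_LABEL
```

Because the label is derived from content rather than user judgement alone, the same document gets the same classification wherever it is created, which is exactly what makes downstream retention and DLP rules dependable.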

Retention

Once we have classified our data into what is important and what isn’t, we can then, with retention policies, define what we keep and for how long.

These policies allow us to effectively manage and govern our information and subsequently reduce the risk of litigation or security breach, by either retaining data for a period defined by a regulatory requirement or, importantly, permanently deleting old content that we’re no longer required to keep.

The policies can be assigned automatically based on classifications or can be applied manually by a user as they generate new data.

For example, a user creates a new document containing financial data which must be retained for 7 years; that user can classify the data accordingly, ensuring that both our DLP and retention rules are applied as needed.
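Once classification exists, the retention decision itself is simple: look up the period for the label and compare it against the document’s age. A minimal Python sketch of that decision, where the labels and periods are hypothetical examples rather than built-in Office365 policies:

```python
from datetime import date, timedelta

# Illustrative retention periods per label (labels and durations are hypothetical)
RETENTION = {
    "Financial": timedelta(days=365 * 7),        # e.g. a 7-year regulatory requirement
    "Sensitive - PII": timedelta(days=365 * 2),
    "Public": timedelta(days=365),
}

def due_for_disposition(label: str, created: date, today: date) -> bool:
    """True when a document has outlived its retention period and should
    move to the disposition review queue rather than be deleted outright."""
    period = RETENTION.get(label)
    if period is None:
        return False  # unknown label: keep until a policy covers it
    return today - created > period
```

Note the conservative defaults: unlabelled data is kept, and expiry feeds a review queue (the “disposition” stage discussed below) rather than triggering immediate deletion.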

Management

Alongside these capabilities Office365 provides us with two management tools, disposition and supervision.

Disposition is our holding pen for data to be deleted so we can review any deletions before actioning.

Supervision is a powerful capability allowing us to capture employee communications for examination by internal or external reviewers.

These tools are important in allowing us to show we have auditable processes and control within our platform and are taking the steps necessary to protect our data assets as we should.

Summary

The ability to govern and control our data wherever we hold it is a critical part of a modern data platform. If you use Office365 and are not using these capabilities then you are missing out.

The importance of governance is only going to continue to grow as ever more stringent data privacy and security regulations develop. Governance can allow us to greatly reduce many of the risks associated with data breach, and services such as Office365 have taken things that have traditionally been difficult to achieve and made them a whole lot easier.

If you are building a modern data platform then compliance and governance should be at the heart of your strategy.

This is part 4 in a series of posts on building a modern data platform, the previous parts of the series can be found below.

Introduction

The Storage

Availability

Control

Keeping your data incognito – Harry Keen – Ep 45

Sharing our data is an important part of our day to day activities, be that for analysis, collaboration or system development, we need to be able to share data sets.

However, this need to share has to be balanced with our needs to maintain the security of our data assets.

I saw a great example of this recently with a company who were convinced they were suffering a data breach, with data leaking to their competitors. They investigated all the areas you’d expect: data going out via email, being uploaded to sites it shouldn’t, or being copied to external devices and leaving the company. None of this investigation identified any areas of leakage.

They then discovered that they had a team of developers who, in order to carry out their dev and test work, were given copies of the full production database; they were not only given all of the organisation’s sensitive data, but had full and unencumbered administrative access to it.

Now, I’m not saying the developers were at the centre of the leak, however you can see the dilemma: for the business to function and develop, the software teams needed access to real data that represented actual working sets, but to provide that, the business was exposing itself to a real data security threat.

How do we address that problem and allow our data to be useful for analysis, collaboration and development, while keeping it secure and the information it contains safe and private?

One answer is data anonymization and that is the subject of this week’s show, as I’m joined by Harry Keen, CEO and founder of anon.ai an innovative new company looking to address many of the challenges that come with data anonymization.

In our wide-ranging discussion, we explore the part anonymization plays in compliance and protection, and why the difficulty of current techniques means we often anonymize data poorly, or don’t bother at all.

We explore why anonymization is so difficult and how solutions that automate and simplify the process will make this important addition to our data security toolkit more accessible to us all.
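To give a flavour of what such automation does, here is a minimal Python sketch of one common technique, pseudonymisation, applied to the dev/test scenario described earlier. The field names and salting scheme are illustrative only, not anon.ai’s product or any specific tool.

```python
import hashlib

# Illustrative salt; in practice keep it secret and rotate it per environment,
# otherwise tokens could be brute-forced back to the original identifiers
SALT = b"dev-environment-salt"

def pseudonymise(value: str) -> str:
    """Replace an identifier with a stable but hard-to-reverse token,
    so joins between tables still line up in dev/test copies."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()[:12]

def anonymise_record(record: dict) -> dict:
    """Mask direct identifiers; keep non-identifying fields for realism."""
    return {
        "customer_id": pseudonymise(record["customer_id"]),
        "name": "REDACTED",
        "email": pseudonymise(record["email"]) + "@example.invalid",
        "order_total": record["order_total"],  # non-identifying, kept as-is
    }
```

The key property is that the same input always yields the same token, so the anonymised database still behaves like the production one for development and testing, without exposing the real identities behind it.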

Anonymization plays an important part in allowing us to maintain the value of our data as a usable and flexible asset while maintaining its privacy and our compliance with ever-tightening regulation.

Harry provides some great insights into the challenge and some of the ways to address it.

To find out more on this topic, check out the following resources;

The UK Anonymization Network (UKAN)

The UK Information Commissioner (ICO)

And of course you can find out more about anon.ai here

You can follow Harry on twitter @harry_keen18 and anon.ai @anon_dot_ai

You can contact anon.ai via info@anon.ai

Hopefully, that’s given you some background into the challenges of data anonymization and how you can start to address them, allowing you to continue to extract value from your data while maintaining its privacy.

Next week I’m joined by Ian Moore as we take a Blockchain 101. To make sure you catch that episode, why not subscribe to the show? You can find us in all the usual podcast homes.

Until next time, thanks for listening.