How to Implement a Data Self-Service Program

Data self-service is a hot topic these days. By giving different people across an organization (marketers, technologists, business leaders and others) access to available data and analytics tools, you empower business users and decision makers to do their jobs effectively. But data self-service is still a work in progress at most organizations.

According to a recent survey by IDG Research, more than half of the respondents’ lines of business rely on IT or data teams to build and set up BI and analytics dashboards, versus handling these tasks themselves. In other words, at most companies, employees do not have direct access to data or to the tools that help them analyze it. For many businesses, data self-service is the next milestone in achieving rapid time-to-insight and real-time business agility.

Whether giving increased data access to new users or helping traditional data users achieve more advanced analytics tasks, the goal is the same: more people with the ability to quickly get the information they need, without submitting a ticket or waiting in a backlog. However, moving to a more democratized data and analytics system takes strategy and planning.

Here are six steps that can help you organize your data self-service initiatives.

Move Data to the Cloud

Any data self-service initiative should start with seriously considering moving your data from an on-prem data warehouse to the cloud, or expanding your cloud presence if you’re already there. Many obstacles to achieving data self-service are related to infrastructure and the inability to use data quickly and in a cost-effective way. On-prem data architectures often come up short when trying to handle the sheer volume and complexity of data we see today, especially within a practical budget.

Build a Data and Analytics Strategy

Before your organization moves forward with a data self-service initiative, there are several important questions you should ask to help develop a comprehensive plan that is more likely to succeed and benefit your end user:

Where is your organization now, and where is it going?
What does data self-service really mean to your organization?
What about security and data governance?
What are the blockers you might face?

Choose Where You Will Put Your Data

In order to facilitate data self-service, you need to break down silos between data sources and bring data into a centralized location. If that location is the cloud, there are a few choices of how that will look. In moving data to the cloud, you will need to focus on storage, data acquisition, preparation and transformation.

In determining where and how your data will reside in the cloud, there are many factors that come into play:

Will a cloud data warehouse or a data lake best suit your needs? Sometimes compliance or regulatory requirements will guide or dictate your choice.
How important is familiarity? For example, Amazon Redshift is based on PostgreSQL 8.0.2. A company operating an on-premises PostgreSQL data warehouse may be more comfortable moving to Amazon Redshift, since they share many aspects due to common ancestry.
What cloud platform/vendor services do you already use? Choosing a cloud platform related to your existing services may make it easier to trust, land and expand. Sometimes a vendor offers a clear differentiator, which in itself could be a winner. For example, if your company needs to share data with multiple third parties in a safe and secure manner, Snowflake’s data sharing feature may appeal to you.

When bringing data into the cloud, you have three main cloud service models to choose from:

Platform as a service (PaaS) is a managed service that runs on cloud infrastructure, giving you scalability while letting you retain some control. Examples include certain cloud data warehouses or an Amazon S3 bucket.
Infrastructure as a service (IaaS) is a bare-bones cloud infrastructure from a cloud provider such as AWS, Google Cloud or Microsoft Azure. It allows you to set up your own data infrastructure in the cloud to suit your needs. It offers the most control, but it can also be a complex choice.
Software as a service (SaaS) is an end-to-end managed service with little or no maintenance or overhead that lets you focus on business needs and not worry about maintaining a tech stack. But you have the least amount of control over factors such as security and how data is stored.

Choose Your BI and Analytics Tools

After choosing the platform that will hold your data, consider your BI and analytics tools. As with the data platform, analytics infrastructures and tools are structured in different ways, depending on whether you choose IaaS, PaaS or SaaS.

Depending on the choices you make, you may be able to continue to use some or all of your existing ETL/reporting/analytics tools. But moving to the cloud allows you to explore newer technologies and products that align better with your data analytics needs and your future strategies.

PaaS: Sometimes, on-prem services can be replaced with equivalent or similar PaaS cloud offerings. For example, if you move to a CDW such as Amazon Redshift, Snowflake or Google BigQuery, you may be able to load a CDW database driver into your reporting platform and continue to use your existing investments. Or, you could have an ELT process that loads your file directly into a table in your CDW.

IaaS: With IaaS, you could move your existing analytics infrastructure into the cloud. End users may see improved performance due to changes in topology as well as any infrastructure-related upgrades that were carried out as part of this move.

SaaS: Some SaaS offerings come with their own visualization tools. Like PaaS, depending on your choice, you may be able to use existing tools or explore alternatives.
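As a rough illustration of the ELT pattern mentioned under the PaaS option above, the sketch below loads a raw file into a staging table as-is and then transforms it with SQL inside the database. It is only a sketch under stated assumptions: Python's built-in sqlite3 module stands in for a cloud data warehouse, and the sample data, table names (stg_orders, sales_by_region) and function names are hypothetical. A real CDW would use its own driver and bulk-load mechanism.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would be a file landed in cloud storage.
RAW_CSV = """order_id,region,amount
1,east,120.50
2,west,80.00
3,east,42.25
"""


def load_raw(conn, csv_text):
    """Extract and load: copy rows into a staging table without reshaping them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stg_orders (order_id TEXT, region TEXT, amount TEXT)"
    )
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.executemany(
        "INSERT INTO stg_orders (order_id, region, amount) "
        "VALUES (:order_id, :region, :amount)",
        rows,
    )
    return len(rows)


def transform(conn):
    """Transform: run SQL inside the database to build a reporting table."""
    conn.execute("DROP TABLE IF EXISTS sales_by_region")
    conn.execute(
        """
        CREATE TABLE sales_by_region AS
        SELECT region, SUM(CAST(amount AS REAL)) AS total_amount
        FROM stg_orders
        GROUP BY region
        """
    )


def run_elt():
    # sqlite3 is an assumption here, standing in for the warehouse connection.
    conn = sqlite3.connect(":memory:")
    load_raw(conn, RAW_CSV)
    transform(conn)
    result = conn.execute(
        "SELECT region, total_amount FROM sales_by_region ORDER BY region"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for region, total in run_elt():
        print(region, total)
```

Keeping the load step free of business logic means the raw data stays available in the warehouse, so reporting tables can be rebuilt with new SQL without re-extracting from the source.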

Implement and Train

Once you select your cloud data and analytics platforms, the equally important next step is implementing your solution and getting your team up to speed.

Before wide-scale implementation, you should run a proof of concept (POC) to try the tools and determine whether they meet your requirements. Many companies with in-house technical teams tend to carry out the POC themselves. This involves researching the technologies, learning to use them and building a minimum viable product.

Many users in your organization will have limited or no experience with running their own analytics and creating their own data visualizations. Even if you think users have experience and will quickly get up to speed, training on new tools and processes benefits everyone.

Build a Support Structure

Every self-service initiative needs a well-defined support structure, and every user needs a clear path to get help if needed. There are two types of support:

Support for products and tools. Most vendors have robust support offerings available for a subscription. Most noteworthy and popular open source products tend to have active user forums or paid support. If you have a technical partner, they may also step in to fill any gaps.
Business process support. It’s nearly impossible to train an end user for every eventuality or requirement they may encounter. For example, a user may need information that is not available in their daily dashboard. To create an ad hoc report, they need to understand the underlying data model well enough to use it effectively.

Establishing a self-service data analytics platform is a multi-step process, especially if you need to modernize your data architecture to do it. Remember to take the process step by step: use the right tools to get data into the cloud and ready for analytics; lean on the expertise of your IT and data teams and your partners; and choose ELT, analytics and visualization tools that are graphical and intuitive for end users. If you follow the steps and keep these things in mind, you will be on your way to making data accessible to every person in your organization who needs it.