Why Data Vault Won’t Work Long Term Without End-to-End DataOps
- Andrew Griffin
- Sep 20, 2021
- 3 min read
Justin Mullen and Guy Adams explain why embracing DataOps principles will be key to any successful data platform project, and how tooling can help.
For the past three years, the UK Data Vault User Group has set out to share organisations' experiences of using Data Vault 2.0, and to spread knowledge of the best ways to ensure your data platforms are properly designed and operated to meet your data science and analytics needs. The presentation 'Why Data Vault Won't Work Long Term Without End-to-End DataOps' – which can be viewed here – certainly encapsulated many of the lessons our members have learned and shared in that time, condensed into a single presentation.

So when you are considering the most agile and up-to-date way either to build a new data platform or to migrate an existing data warehouse to the cloud – maximising the benefits of the Data Vault 2.0 methodology – DataOps.live utilises complete automation and exhaustive testing to ensure delivery of high-quality data products, without sacrificing stability or governance.

DataOps.live CEO Justin Mullen explained that end-to-end pipeline automation is critical to a Data Vault's successful use in data analytics and data science. Those pipelines must be robust enough to run from data extraction and preparation, through loading the raw vault and modelling the business vault, to creating the information marts that can then be shared with business users (a sketch of what such a pipeline might look like appears below). Without DataOps' full automation and regression testing at every step, an organisation's data platform will not work successfully in the long term, Justin said.

"A lack of automation, lack of robustness, lack of testing, lack of enterprise security and – more critically – lack of data testing all result in a lack of trust from stakeholders," he added. "That, effectively, is the thing that will make data platforms, data projects and data products fail."

DataOps.live makes no apology for utilising the power of Snowflake to achieve its aims, set up alongside a Git repository that tracks everything as a "single source of truth". DataOps for Snowflake provides a single user interface to carry out:

- Simplified end-to-end orchestration and management of data pipelines
- Extraction, loading and transformation (ELT)
- Automated testing via massively parallel SQL queries (see the second sketch below)

It allows for faster development and increased efficiency while reducing costs and maintaining data assurance at all times, with no compromise on data governance or security.

DataOps.live chief technical officer Guy Adams, who has 20 years' experience working in DevOps in the software development world, gave a 20-minute demonstration of how the platform works. You can start with the smallest of projects, and work in isolation or collaboratively in teams, he stressed. That means not only can maximum agility be brought to bear – and at minimal cost – but no damage can be done to the business's existing data platform. And by using Snowflake, everyone from the smallest company to the biggest multinational can guarantee that all the automated testing DataOps requires is done cheaply, avoiding very costly errors further down the line. It also prevents bad data from getting to business users.

During a quick Q&A, Guy explained why Snowflake is DataOps' chosen technology partner. "DataOps.live only supports Snowflake because it is the only data platform that has the requisite set of functionality required for good DataOps," said Guy.
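To make those stages concrete, here is a minimal sketch of how such an end-to-end pipeline might be declared in YAML. The stage names, job keys and commands are illustrative assumptions made for this article, not DataOps.live's actual configuration syntax.

```yaml
# Hypothetical end-to-end Data Vault pipeline -- illustrative only,
# not the real DataOps.live configuration syntax.
stages:
  - extract_and_prepare
  - load_raw_vault
  - model_business_vault
  - build_information_marts

extract_source_data:
  stage: extract_and_prepare
  script: run-extract --source crm --target landing_zone  # assumed CLI

load_customer_hub:
  stage: load_raw_vault
  needs: [extract_source_data]
  script: load-raw-vault --tables hub_customer,sat_customer_details

model_business_vault:
  stage: model_business_vault
  needs: [load_customer_hub]
  script: run-transform --layer business_vault

build_customer_mart:
  stage: build_information_marts
  needs: [model_business_vault]
  script: build-mart --name customer_360 --share business_users
```

Each job runs only once the step before it has completed and passed its tests, which is what makes a pipeline like this safe to automate end to end.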
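Equally important is what "automated testing via massively parallel SQL queries" might look like in practice. Below is a hypothetical sketch of data tests defined in YAML; the test names, keys and Data Vault table names are assumptions for illustration. The convention assumed here is that each query must return zero rows for the step to pass, and that independent tests can be fanned out in parallel against Snowflake.

```yaml
# Hypothetical data-test definitions -- names and tables are assumed.
data_tests:
  - name: hub_customer_unique_keys
    description: Business keys in the customer hub must be unique
    sql: |
      SELECT customer_key, COUNT(*) AS duplicates
      FROM raw_vault.hub_customer
      GROUP BY customer_key
      HAVING COUNT(*) > 1

  - name: sat_customer_no_orphans
    description: Every satellite row must reference an existing hub row
    sql: |
      SELECT s.customer_key
      FROM raw_vault.sat_customer_details AS s
      LEFT JOIN raw_vault.hub_customer AS h
        ON s.customer_key = h.customer_key
      WHERE h.customer_key IS NULL
```

Running checks like these at every stage is what stops bad data from ever reaching the information marts – the stakeholder-trust problem Justin described.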
But he said other market players are catching up, and when they do he expects DataOps.live to be available with the likes of BigQuery and Azure Synapse Analytics in the not-too-distant future. DataOps.live offers four-day training courses as part of its package, but Guy stressed that staff who can write "basic-to-moderate" SQL code and edit YAML files will find their feet within a couple of days. Justin and Guy have produced the DataOps for Dummies book for anyone wanting a crash course.