Clearing Skies for Cloud Data Warehousing
- Hannah Dowse
- Apr 4, 2024
- 3 min read
ASSESSING THE STRENGTHS AND WEAKNESSES OF DATA MESH, DATA FABRIC AND DATA LAKEHOUSE
As one of the founders of the data warehousing industry, Barry Devlin, founder of 9sight Consulting, is well placed to make sense of where data analytics is heading. Over the last four years, three key developments have emerged in the search for a faster and more efficient way to make business intelligence (BI) functions perform better, while offering key insights that deliver meaningful returns both on investment and, crucially, in terms of time.
And while the world’s economies moved towards greater digitalisation, thanks to the opportunities provided by the rapid migration of data from on-premises systems to the Cloud, the Data Lakehouse joined Data Mesh and Data Fabric as a new form of data architecture. That all came about before the huge strides made over the last 18 months in large language models driven by artificial intelligence.
Barry, who started out more than 30 years ago working for IBM, has written three books on BI and data warehousing, most recently ‘Cloud Data Warehousing, Vol. 1’, with a fourth, ‘Cloud Data Warehousing, Vol. 2’, being published later this year. Having worked for more than three decades as a data architect, manager, consultant and software evangelist, Barry continues to develop new architectural models for critical decision-making, including new thinking on the use, and the consequences, of implementing artificial intelligence in data analytics.
Barry’s talk started by recapping what data architecture has primarily looked like since the mid-1980s, before looking at what he considers the common architectural features an enterprise model needs, whether on-premises or in the Cloud.
He then considered how Databricks’ promotion of the Data Lakehouse had produced a new way of handling the streaming of large data sets – utilising graphs and image databases – before exploring the main tenets of Data Fabric and Data Mesh.
PROS AND CONS OF EACH ALTERNATIVE TO THE DATA WAREHOUSE
Barry’s presentation covered the pros and cons of each of the three data architectures he discussed.
In summary, the Data Lakehouse utilises low-cost, scalable object storage for all data types, but it does not address the ingestion typical of a data warehouse, i.e. from multiple operational and other sources of data.
Nor does it offer the full range of functions of a relational database, leaving gaps in referential integrity and multi-table transaction support, and it offers only limited support for governance and ownership.
Gartner has championed the use of Data Fabric primarily because it supports the design, deployment and utilisation of integrated and reusable data in every environment, including hybrid and multi-Cloud platforms.
But it lacks specific key software, and metadata catalogues and standards continue to lag in required function and interoperability, Barry stated.
Data Mesh’s key strength is that it removes bottlenecks in centralised data delivery teams, including in data governance, by embedding the latter in the infrastructure.
But there are still many concerns – primarily, multiple definitions of just what a Data Mesh is, and whether it will still exist in its current form years down the line.
It requires a significant in-house development effort, while lacking standards in metadata, tools and methodology.
Reconciliation of data across sources, which is the backbone of an enterprise data warehouse, is excluded in the original Data Mesh approach – although some working models have used a Data Vault to underpin the architecture.
Barry also warned that centralised governance is still needed for common data management policies and standards, while decentralisation can lead to “chaos” in organisations with immature data governance.
THE MORE THINGS CHANGE, THE MORE THEY STAY THE SAME!
Ultimately, Barry stated, committing to a Data Mesh could see development resources spread across many domains, with resulting inefficiencies and a dilution of skills in set-up and delivery.
Quoting Gartner’s 2023 Hype Cycle for Data Management, Barry believes all three data architectures will barely pass the “peak of inflated expectations” by 2028. Data Mesh risks being obsolete before even reaching the plateau of productivity, while Data Fabric could limp into the “trough of disillusionment” by 2033.
Watch the talk to hear a fascinating independent analysis of the advantages and disadvantages of the different approaches from someone who has seen the industry evolve over many years.

