Cutting Data Fabric and Mesh to Measure

  • Andrew Griffin
  • Nov 19, 2021
  • 4 min read

How to decide if Data Mesh or Data Fabric is the right solution

Arguments have raged over the last decade about whether you should stick with a data warehouse or go for a data lake. Data warehouses have been around since the 1980s, while the Data Vault method of managing your enterprise data warehouse – invented by Dan Linstedt – has undergone two incarnations over the last 20 years. A rapidly changing world – and the data it generates – has left companies and organisations worldwide grappling with ever-larger data sets, struggling to store and analyse their data to gain the insights business intelligence can bring – not just to the boardroom table but across the whole operation.

Now there are two new terms in town: Data Mesh and Data Fabric. Both aim to address the issues created by legacy data warehouses and heavily centralised data platforms, which can seriously limit an organisation’s ability to react to a rapidly changing data landscape as business teams demand new and different views of the aggregated data, and new projections of it. Data sources now include the Internet of Things (machine-generated data) and everything from videos and web traffic to tweets (human-sourced information), on top of long-standing operational and informational data (process-mediated).

Dr Barry Devlin, a member of the IBM team that created the original data warehouse architecture in the mid-1980s, now runs his own consultancy, 9sight. He has been studying these latest incarnations of the data analysis platform, and drew on his 30 years’ experience in the data world to deliver an assessment of both at this month’s UK Data Vault User Group online meeting, in a presentation titled Cutting Data Fabric and Data Mesh to Measure. Both approaches grapple with increasingly complex data domains – in the supply chain, production, and marketing – as well as the proliferation of data, in terms of both consumers and sources.
Businesses of all shapes and sizes crave increasing agility, with less demand on manpower and financial resources. So, in the quest to become more data-driven, the hunt is on for ways to satisfy data demands in a timely manner and to work around the dangers of poor collaboration – where data teams sit in silos, cut off from the domains that originate the data, or use it for decision-making in isolation.

Barry defined a Data Fabric as an “integrated layer of data and connecting processes, supporting design, deployment and utilisation of integrated and reusable data across all environments, including hybrid and multi-cloud platforms.” Data Fabrics were first spoken about by Forrester, among others, as far back as 2016, and Dr Devlin believes they are simply an evolution of the logical data warehouse. While offering the traditional benefits of a data warehouse, a Data Fabric brings the challenge of ensuring choreographed workflows and active metadata management – plus embedded machine learning.

Barry believes that deciding whether your company or organisation needs a Data Fabric depends on the number of data warehouses and data lakes it has, and on the number of operational systems its users access – coupled with the rate of change in the information-preparation processes and marts involved. A Data Fabric can limit maintenance costs for data models and information preparation through increased automation, and empower your data users to find and easily use the right information.

A Data Mesh addresses the need to unlock analytical data at scale, offering autonomy and democratisation in contrast to often highly centralised data lakes and data warehouses – and the dangers of either failing. The principles of Data Mesh borrow heavily from microservices and DevOps: data is treated very much as a product, with decentralised data ownership and architecture, and a new approach to governance that emphasises local over central control.
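To make the data-as-a-product idea concrete, here is a minimal sketch of what a domain-owned "data product" descriptor might look like. The field names and values are purely illustrative assumptions, not a standard schema from Data Mesh tooling or from Barry's presentation:

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """Hypothetical descriptor for one data product in a mesh."""
    name: str                    # e.g. "orders-enriched"
    domain: str                  # the business domain that owns the data
    owner: str                   # the domain team accountable for it
    output_ports: list = field(default_factory=list)  # how consumers read it
    policies: list = field(default_factory=list)      # computational policies embedded in the mesh

# Ownership and governance live with the domain team, not a central data team.
orders = DataProduct(
    name="orders-enriched",
    domain="supply-chain",
    owner="supply-chain-data-team",
    output_ports=["sql-view", "parquet-files"],
    policies=["pii-masking", "data-residency"],
)
print(orders.domain)  # prints "supply-chain"
```

The point of the sketch is the shift in responsibility: each product carries its own ownership and policy metadata, rather than inheriting them from a central platform.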
Technically, a Data Mesh incorporates a multi-plane data platform with computational policies embedded in the mesh, pulling data on demand through virtualisation. A Data Mesh can remove the need for centralised copies of data as far as possible, but relies on a strong focus on active context-setting information (CSI). Dr Devlin also warned that a Data Mesh requires extensive changes to design thinking and new technologies, with gaps remaining in the open-source and commercial tooling required.

Whether a Data Mesh is the right solution depends on how far data engineering bottlenecks are preventing a business from quickly delivering new data products, and on how much of a priority data governance is for the organisation. The number of data sources, the size of the data team and the number of data domains in use will also weigh heavily on the answer.

Barry finished his presentation with a series of criteria that should be met if you are seriously considering either solution for your data platform needs.
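The decision factors above can be restated as a simple checklist. This is a hedged sketch only – the questions come from the factors mentioned in the article, but the majority-vote scoring is an illustrative assumption, not Barry's actual assessment method:

```python
# Factors the article associates with each approach; answers are examples.
fabric_factors = {
    "many data warehouses and data lakes": True,
    "many operational systems accessed by users": True,
    "high rate of change in preparation processes and marts": False,
}
mesh_factors = {
    "data engineering bottlenecks delay new data products": True,
    "data governance is a high priority": True,
    "many data sources, a large data team, many data domains": False,
}

def leans_towards(factors):
    # Crude majority vote over the stated factors -- a talking point,
    # not a real evaluation framework.
    return sum(factors.values()) > len(factors) / 2

print("Consider a Data Fabric:", leans_towards(fabric_factors))  # True (2 of 3)
print("Consider a Data Mesh:", leans_towards(mesh_factors))      # True (2 of 3)
```

In practice these questions would feed a proper architecture review rather than a yes/no vote, but the sketch shows how the criteria differ between the two approaches.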