
Why Time Matters To Your Data Platform

  • Andrew Griffin
  • Feb 22, 2022
  • 3 min read

Christian Kaul tries to solve the mystery of time when working with the Data Vault Method

The subject of time has probably spawned more famous sayings than any other subject in the modern history of mankind. After all, time waits for no man. The idea of time travel has preoccupied fiction writers since HG Wells wrote The Time Machine in 1895. Certainly time never stands still, and the simple four-letter word has arguably provoked many four-letter mutterings from frustrated data analysts and engineers grappling with the best way to define it when constructing a data warehouse and interpreting its contents.

Time certainly flew when Christian Kaul, who heads the Knowledge Gap, asked the simple question ‘What time is it?’ at the latest monthly meeting of the Data Vault User Group. His presentation looked at the different kinds of time encountered in the world of data analytics and business intelligence, and at how best to use them to maximise the effective use of data for enterprises and organisations of all shapes and sizes.

Definitions of time in the context of a data warehouse have developed over the past 30 years as the architecture and implementation of different methodologies have moved forward, and Christian summarised the most important ones. One advantage of following Dan Linstedt’s Data Vault method is that every row of data carries record source and load date information, a significant help when it comes to good data governance and data lineage.

So much data is now streamed thanks to the phenomenal growth of the Internet of Things, with consumer products in the ‘Alexa period’ now loaded with sensors capturing data 24/7. As a result, the importance of time has never been greater if we are to make sense of that data in terms of the enterprise.

In working out what kind of time is most relevant and important to your business, Christian stressed that all of the many definitions he mentioned can effectively be summarised in three dimensions, which he listed concisely.
Having settled on your definitions of time, another important issue to resolve is how much time your operational system can store. Can it handle Appearance Time and Assertion Time (or Times)? Does it have its own Recording Time, and if not, why not? And if the answers are not affirmative, can the system be changed?

Which promptly moves the debate to the next key point: how much time should you have in your data warehouse? That is determined by how much your end-users want to know: the Appearance Time, the Assertion Time, the operational Recording Time and the data warehouse’s Recording Time. If they want all four, why do they? And if they do not, can that change as well?

Christian advised that your data set-up should store every measure of time the operational systems can deliver, and every kind of time needed to meet audit and regulatory requirements. Importantly, the data warehouse recording time must be stored. But when it comes to presenting the data, as few kinds of time as possible should be visible to end-users. It really comes down to the kinds of time those users need for any given use case, and NOT MORE! Records of time must also relate to real-life measurements, if not exactly, then approximately.

Learn more about the complicated subject of time by watching Christian’s presentation above.
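
For readers who think in code, the "store everything, present as little as possible" advice above might be sketched roughly as follows. This is only an illustration in Python, not something shown in the presentation: the class name, field names and the `present` helper are all hypothetical, and the exact meaning of each kind of time is left to Christian's definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical field names for the four kinds of time mentioned in the article.
TIME_FIELDS = (
    "appearance_time",
    "assertion_time",
    "source_recording_time",
    "dw_recording_time",
)


@dataclass(frozen=True)
class WarehouseRow:
    """One stored row; an assumed layout, not an official Data Vault structure."""
    business_key: str
    payload: dict
    record_source: str                      # where the row came from
    dw_recording_time: datetime             # load date: always stored
    appearance_time: Optional[datetime] = None        # stored if the source delivers it
    assertion_time: Optional[datetime] = None         # stored if the source delivers it
    source_recording_time: Optional[datetime] = None  # stored if the source delivers it


def present(row: WarehouseRow, needed_times: list) -> dict:
    """Return an end-user view exposing only the kinds of time a use case needs."""
    unknown = set(needed_times) - set(TIME_FIELDS)
    if unknown:
        raise ValueError(f"unknown time fields: {sorted(unknown)}")
    view = {"business_key": row.business_key, **row.payload}
    for name in needed_times:
        view[name] = getattr(row, name)
    return view


# The warehouse keeps every time it was given...
row = WarehouseRow(
    business_key="CUST-001",
    payload={"status": "active"},
    record_source="crm",
    dw_recording_time=datetime(2022, 2, 22, 9, 0, tzinfo=timezone.utc),
    appearance_time=datetime(2022, 2, 20, tzinfo=timezone.utc),
)

# ...but this report only needs one of them.
report_view = present(row, ["appearance_time"])
```

The stored row keeps the warehouse recording time and record source regardless of what any report asks for, while each use case names the handful of time fields it actually needs.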
