
Jump start your data warehouse

  • Writer: Rhys Hanscombe
  • Dec 3, 2019
  • 2 min read

In December 2019, Alex Higgs delivered a practical webinar on how to combine dbt, Data Vault 2.0, and Snowflake to build modern, scalable, and maintainable data warehouses. Here’s a summary of the key insights and actionable strategies from that session.


The Modern Data Warehousing Landscape

Today’s data warehousing is defined by:

  • Cloud-based infrastructure

  • Agile development and ETL automation

  • Continuous integration and shorter development cycles

  • Strong data governance

  • Self-service business intelligence and advanced dashboarding

  • AI-driven analytics


These trends demand tools and frameworks that are flexible, scalable, and easy to automate.


What Is dbt?

dbt (data build tool) is a free, open-source command-line tool built in Python. It’s designed for data analysts and engineers to transform raw data into actionable insights. dbt is the “T” in ELT, enabling you to:

  • Write modular, reusable SQL with templates and macros

  • Standardize and automate data transformations

  • Reduce errors and manual coding

  • Leverage multi-threaded and incremental processing

  • Maintain documentation and test data quality

  • Integrate with popular databases like Snowflake, Redshift, BigQuery, and more


dbt Cloud offers enterprise features such as a web-based IDE, scheduling, version control, and role-based security.
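
As a sketch of what this looks like in practice (the model, source, and column names here are illustrative, not taken from the webinar), a dbt model is just a templated SQL SELECT — dbt compiles the Jinja and handles the surrounding DDL:

```sql
-- models/staging/stg_orders.sql
-- Minimal dbt model sketch: dbt renders the Jinja, wraps the SELECT in the
-- appropriate CREATE statement, and runs it against the target warehouse.
{{ config(materialized='incremental', unique_key='order_id') }}

select
    order_id,
    customer_id,
    order_date,
    current_timestamp() as load_datetime
from {{ source('tpch', 'orders') }}

{% if is_incremental() %}
  -- On incremental runs, only process rows newer than those already loaded
  where order_date > (select max(order_date) from {{ this }})
{% endif %}
```

The `is_incremental()` block is what gives dbt its incremental processing: the first run builds the full table, and subsequent runs append only new rows.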


Introducing AutomateDV: Easy-Entry Data Vault Automation

AutomateDV is a dbt package that automates the creation of Data Vault 2.0 structures. Key features include:

  • Metadata-driven automation: Provide metadata, not SQL

  • Standardized templates for hubs, links, satellites, and transactional links

  • Macros to simplify complex SQL

  • Fully tested with documentation and hands-on examples

  • Designed for easy adoption and proof-of-concept projects


AutomateDV helps you quickly implement Data Vault 2.0 best practices, saving time and reducing complexity.
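
To give a flavour of the metadata-driven approach (entity and column names here are illustrative), a hub model in AutomateDV reduces to a handful of metadata variables and a single macro call — the package generates the loading SQL for you:

```sql
-- models/raw_vault/hub_customer.sql
-- Hub loading with AutomateDV: provide metadata, the macro writes the SQL.
{%- set source_model = "stg_customer" -%}
{%- set src_pk = "CUSTOMER_HK" -%}       -- hashed business key (hub PK)
{%- set src_nk = "CUSTOMER_ID" -%}       -- natural/business key
{%- set src_ldts = "LOAD_DATETIME" -%}   -- load date timestamp
{%- set src_source = "RECORD_SOURCE" -%} -- record source column

{{ automate_dv.hub(src_pk=src_pk, src_nk=src_nk,
                   src_ldts=src_ldts, src_source=src_source,
                   source_model=source_model) }}
```

Links, satellites, and transactional links follow the same pattern with their own macros, which is what makes the approach scale to dozens of tables.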


How Does AutomateDV Fit into the Data Pipeline?

AutomateDV sits at the heart of the modern data pipeline:

  1. Source systems feed data into persistent staging layers.

  2. Data is loaded and transformed using dbt and AutomateDV macros.

  3. Raw and business Data Vault layers are built in Snowflake.

  4. Analytics and dashboards are powered by clean, trusted data marts.


This architecture enables rapid, repeatable, and auditable data warehouse development.
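
To make step 2 concrete (the source and column names are illustrative), a staging model declares its metadata inline and hands it to the staging macro, which layers hash keys and derived columns on top of the raw source:

```sql
-- models/staging/stg_customer.sql
-- AutomateDV staging sketch: metadata in, hashed and derived columns out.
{%- set yaml_metadata -%}
source_model: raw_customer
derived_columns:
  RECORD_SOURCE: "!TPCH"
  LOAD_DATETIME: CURRENT_TIMESTAMP()
hashed_columns:
  CUSTOMER_HK: CUSTOMER_ID
{%- endset -%}

{% set metadata_dict = fromyaml(yaml_metadata) %}

{{ automate_dv.stage(include_source_columns=true,
                     source_model=metadata_dict['source_model'],
                     derived_columns=metadata_dict['derived_columns'],
                     hashed_columns=metadata_dict['hashed_columns'],
                     ranked_columns=none) }}
```

The staged model then feeds the raw vault layer in step 3, so hashing logic lives in one place rather than being repeated in every hub, link, and satellite.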


Real-World Example: Snowflake TPC-H with Data Vault 2.0

The webinar showcased a worked example using Snowflake’s TPC-H dataset:

  • Profiling source data and identifying relationships

  • Simulating transaction feeds

  • Building 25 tables: 7 hubs, 7 links, 1 transactional link, and 10 satellites


This demonstrates how AutomateDV and Snowflake can handle complex, real-world data models with ease.


Development Insights and What’s Next

  • Metadata ingestion and balancing usability with maintainability are key challenges.

  • AutomateDV is evolving, with plans for more Data Vault 2.0 structures (effectivity satellites, PIT tables, bridge tables, reference tables, and more).

  • Internal tools and a planned web app will further automate and simplify Data Vault development.


Summary: The Power of dbt, Data Vault, and Snowflake

  • dbt streamlines and automates your data warehouse workflow with templated transformations and robust features.

  • Snowflake provides a scalable, cloud-native backbone for your data warehouse.

  • AutomateDV delivers metadata-driven templates for rapid Data Vault 2.0 implementation.


Together, these tools enable you to jump start your Data Vault 2.0 data warehouse and deliver trusted, scalable analytics faster than ever.
