Latest developments in AutomateDV: pilot your Data Vault project for free
- Andrew Griffin
- May 12, 2021
The latest in dbtvault from developers Alex Higgs and Chris Fisher and a case study from Borna Almasi of Georgian
Datavault launched its dbtvault tool, which helps automate the key stages in creating a data warehouse on a Snowflake database, at the end of 2019. It was originally developed by Datavault to show clients how relatively easy it is to create a working demo or prototype of a Data Vault 2.0 data warehouse.
Using dbtvault generates a data warehouse conforming to the latest Data Vault 2.0 standards, building on dbt, an open source tool from Fishtown Analytics.
Among the advantages of using dbtvault for creating a data platform is that it is driven by metadata provided by the user, with fully templated structures for the raw vault. No SQL coding is required – it is all provided within the macros that manage the process of creating your data warehouse.
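To illustrate what "driven by metadata" means in practice, here is a minimal Python sketch of the idea: the user supplies only a small dictionary of table and column names, and a template turns it into SQL. This is not how dbtvault itself is implemented (it uses dbt macros), and the function name, the metadata keys and the table and column names below are all assumptions made for the example.

```python
def render_hub_sql(meta):
    """Render a simple hub SELECT statement from user-supplied metadata.

    Hypothetical helper for illustration only; it is not part of dbtvault.
    """
    columns = [
        meta["src_pk"],      # hashed primary key column
        meta["src_nk"],      # natural (business) key column
        meta["src_ldts"],    # load timestamp column
        meta["src_source"],  # record source column
    ]
    column_list = ",\n    ".join(columns)
    return (
        f"SELECT DISTINCT\n    {column_list}\n"
        f"FROM {meta['source_model']}\n"
        f"WHERE {meta['src_pk']} IS NOT NULL"
    )


# The only thing the user writes is metadata (all names here are made up).
hub_customer = {
    "source_model": "stg_customer",
    "src_pk": "CUSTOMER_HK",
    "src_nk": "CUSTOMER_ID",
    "src_ldts": "LOAD_DATETIME",
    "src_source": "RECORD_SOURCE",
}

print(render_hub_sql(hub_customer))
```

The point of the pattern is that adding another hub means writing another small dictionary, not another SQL statement.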
As a piece of open-source software, it is constantly being refined, and a growing community is helping to revise and improve it, as Datavault’s data engineers Alex Higgs and Chris Fisher explained during a presentation at the UK Datavault User Group online gathering, a recording of which is available online.
By working closely with Datavault’s customers, Alex and Chris’s team is constantly improving the tool’s usability, adding new features and finding improvements.
One of dbtvault’s users is Georgian, a Canadian venture capital company, which specialises in helping companies transform their businesses through machine learning and artificial intelligence (AI) by having technical specialists in-house.
Georgian’s data engineer Borna Almasi was looking to create a data warehouse for tracking and identifying possible clients, so he researched how to create the right kind of data platform for the company’s needs.
When he realised he needed a proof of concept rather than employing expensive contractors to build a data vault from scratch, Borna concluded an open source dbt tool was the way forward. Once he had found dbtvault, he was able to produce that proof of concept in little more than two weeks.
Borna told the online meeting: “For a risky project, it made sense for us to do it as fast as we could, for as little money as we could spend. Using open source allowed us to have agility and gave us options – and made it extensible.”
Borna added: “We are still using dbtvault today, and I think we are going to stick around for a long time and see where it goes.”
Alex outlined a number of new features released for 2021 that expand dbtvault’s capabilities. Recent improvements include an improved staging macro, which generates hash keys and hash differences.
Another big feature added has been extension support: people can now override some of the built-in functionality of dbtvault, enabling them to customise how the hashing works.
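For readers unfamiliar with the terms, the Python sketch below shows roughly what a staging step does when it adds a hash key and a hash difference to a record. It is an illustration of the general Data Vault technique, not dbtvault's code. The choice of MD5, the `||` delimiter, the `^^` null placeholder, and the function and column names are all assumptions for this example.

```python
import hashlib

# Assumed conventions for this sketch; real projects may configure these differently.
DELIMITER = "||"
NULL_PLACEHOLDER = "^^"


def _normalise(value):
    """Trim and upper-case a value so equivalent inputs hash identically."""
    if value is None:
        return NULL_PLACEHOLDER
    text = str(value).strip().upper()
    return text if text else NULL_PLACEHOLDER


def hash_columns(record, columns):
    """Hash the normalised values of the given columns, in the given order."""
    joined = DELIMITER.join(_normalise(record.get(col)) for col in columns)
    return hashlib.md5(joined.encode("utf-8")).hexdigest().upper()


def stage_record(record, key_columns, payload_columns):
    """Return a copy of the record with a hash key and a hash difference added."""
    staged = dict(record)
    staged["HASH_KEY"] = hash_columns(record, key_columns)      # identifies the business entity
    staged["HASHDIFF"] = hash_columns(record, payload_columns)  # changes when descriptive data changes
    return staged


row = {"customer_id": " abc ", "name": "Ann", "city": None}
print(stage_record(row, ["customer_id"], ["name", "city"]))
```

The hash key stays the same for as long as the business key does, while the hash difference changes whenever any descriptive column changes, which is how a load can cheaply detect new versions of a record.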
Looking further ahead, the dbtvault team is already exploring external table and stage support for Snowflake and user-defined naming standards.
New custom schema tests planned to ship with dbtvault will give users Data Vault-specific tests which can be applied to dbt models, leading to improved data reliability and all the other benefits of testing.
They are also working on a command-line interface (CLI) tool that will help generate and apply standard tests to existing models, and also generate documentation and the dbtvault models themselves, which users currently have to develop manually.
If you want more information or help, check out the dbtvault documentation website or join the Slack channel dedicated to the tool. The free software is available on GitHub.
It may still be early days as far as dbtvault’s history is concerned, but the development team at Datavault have big ambitions.