Understanding PIT and Bridge Tables in Data Vault 2.0

Rhys Hanscombe
Feb 11

In Data Vault 2.0, PIT (Point-In-Time) tables and Bridge tables are advanced, performance-oriented structures that live in the Business Vault. They are not part of the Raw Vault; instead, they simplify and accelerate queries for common analytical use cases.

A forum discussion clarified when and how to use PIT and Bridge tables, and the practical considerations for implementing them.

What Are PIT Tables?

PIT tables are snapshots that consolidate historical satellite data around a hub or link at specific points in time.

Purpose: Avoid complex subqueries for “latest load” logic.
Typical use case: “What did this entity look like last week?” or historical reporting.
Implementation:
- Start with a single hub or link key set.
- Include timestamps or sequence numbers to maintain uniqueness and enable restarts.
- Optionally virtualize as views until performance needs dictate physical tables.

PITs reduce join complexity. This is particularly valuable when satellites load in parallel and traditional joins would be expensive or cumbersome.
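
To make this concrete, here is a minimal sketch of a virtual PIT built as a view, using Python's built-in sqlite3 module. The table and column names (hub_customer, sat_customer_name, sat_customer_address, snapshot_dates, pit_customer) are hypothetical and only serve the illustration; a real implementation would follow your own naming standards and platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (customer_hk TEXT PRIMARY KEY);
CREATE TABLE sat_customer_name (customer_hk TEXT, load_date TEXT, name TEXT);
CREATE TABLE sat_customer_address (customer_hk TEXT, load_date TEXT, city TEXT);
CREATE TABLE snapshot_dates (snapshot_date TEXT PRIMARY KEY);

INSERT INTO hub_customer VALUES ('c1');
INSERT INTO sat_customer_name VALUES
    ('c1', '2024-01-01', 'Ann'),
    ('c1', '2024-01-10', 'Anne');
INSERT INTO sat_customer_address VALUES ('c1', '2024-01-05', 'Leeds');
INSERT INTO snapshot_dates VALUES ('2024-01-03'), ('2024-01-07'), ('2024-01-14');

-- Virtual PIT: one row per hub key and snapshot date, holding the load date
-- of the satellite row that was current at that snapshot (NULL if none yet).
CREATE VIEW pit_customer AS
SELECT h.customer_hk,
       s.snapshot_date,
       (SELECT MAX(n.load_date) FROM sat_customer_name n
         WHERE n.customer_hk = h.customer_hk
           AND n.load_date <= s.snapshot_date) AS name_load_date,
       (SELECT MAX(a.load_date) FROM sat_customer_address a
         WHERE a.customer_hk = h.customer_hk
           AND a.load_date <= s.snapshot_date) AS address_load_date
FROM hub_customer h
CROSS JOIN snapshot_dates s;
""")

# Consumers only need equi-joins against the PIT; the "latest load" logic
# lives in one place instead of being repeated in every report query.
QUERY = """
SELECT p.snapshot_date, n.name, a.city
FROM pit_customer p
LEFT JOIN sat_customer_name n
       ON n.customer_hk = p.customer_hk AND n.load_date = p.name_load_date
LEFT JOIN sat_customer_address a
       ON a.customer_hk = p.customer_hk AND a.load_date = p.address_load_date
ORDER BY p.snapshot_date
"""
pit_rows = conn.execute(QUERY).fetchall()
for row in pit_rows:
    print(row)
```

With this sample data the query returns one row per snapshot date: the first snapshot has a name but no address yet, and the last one picks up the changed name. Swapping the view for a physical table later would not change the consuming query.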

PIT tables vs. star schema: Logarithmic PITs and snapshot windows function similarly to star schema facts and dimensions. Snapshots can be taken at any time, because satellite data loads independently.
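
One way to picture a logarithmic snapshot window is as a retention rule that thins out older snapshots. The sketch below assumes a common reading of the term (daily snapshots for the most recent week, weekly for the following four weeks, monthly after that); the thresholds and the function names keep_snapshot and snapshot_window are hypothetical and did not come from the forum discussion.

```python
from datetime import date, timedelta


def keep_snapshot(snapshot: date, today: date) -> bool:
    """Decide whether a snapshot date is retained (hypothetical thresholds)."""
    age_days = (today - snapshot).days
    if age_days < 0:
        return False                     # future dates are never snapshots
    if age_days < 7:
        return True                      # most recent week: keep every day
    if age_days < 35:
        return snapshot.weekday() == 6   # next four weeks: keep Sundays only
    return snapshot.day == 1             # older: keep the first of each month


def snapshot_window(today: date, history_days: int) -> list[date]:
    """Return the retained snapshot dates within the look-back period, oldest first."""
    candidates = (today - timedelta(days=n) for n in range(history_days))
    return sorted(d for d in candidates if keep_snapshot(d, today))


window = snapshot_window(date(2024, 3, 31), history_days=90)
for d in window:
    print(d.isoformat())
```

The resulting list could populate the snapshot-date table that drives a PIT, so recent history stays fine-grained while older history costs far fewer rows.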

What Are Bridge Tables?

Bridge tables pre-join multiple, often distant, hubs and links to reduce the number of joins required in queries.

Purpose: Improve query performance and simplify analytics.
Use case example: Reduce an 8-join query to just 3 joins.
Optional features:
- Include PIT-like snapshots if the use case requires time-span facts or historical aggregation.
- Typically rebuilt per ETL run rather than persisted long-term.

Bridge tables are especially useful for analytics that span multiple business entities, where repeated join chains would otherwise slow queries.
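
As a rough illustration, the sketch below builds a bridge across a customer-to-order-to-product chain with Python's built-in sqlite3 module. All table and column names are hypothetical, and the bridge is materialized with a simple CREATE TABLE ... AS SELECT on the assumption that it is dropped and rebuilt on each ETL run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (customer_hk TEXT PRIMARY KEY, customer_id TEXT);
CREATE TABLE hub_order    (order_hk TEXT PRIMARY KEY, order_id TEXT);
CREATE TABLE hub_product  (product_hk TEXT PRIMARY KEY, product_id TEXT);
CREATE TABLE link_customer_order (customer_hk TEXT, order_hk TEXT);
CREATE TABLE link_order_product  (order_hk TEXT, product_hk TEXT);

INSERT INTO hub_customer VALUES ('c1', 'CUST-1'), ('c2', 'CUST-2');
INSERT INTO hub_order    VALUES ('o1', 'ORD-1'), ('o2', 'ORD-2'), ('o3', 'ORD-3');
INSERT INTO hub_product  VALUES ('p1', 'PROD-A'), ('p2', 'PROD-B');
INSERT INTO link_customer_order VALUES ('c1', 'o1'), ('c1', 'o2'), ('c2', 'o3');
INSERT INTO link_order_product  VALUES ('o1', 'p1'), ('o2', 'p2'), ('o3', 'p1');

-- Bridge: walk the link chain once and store the resulting key combinations.
CREATE TABLE bridge_customer_product AS
SELECT co.customer_hk, co.order_hk, op.product_hk
FROM link_customer_order co
JOIN link_order_product op ON op.order_hk = co.order_hk;
""")

# Analytical queries now start from the bridge and join straight to the hubs
# they need, instead of traversing both links every time.
bridge_rows = conn.execute("""
SELECT c.customer_id, p.product_id
FROM bridge_customer_product b
JOIN hub_customer c ON c.customer_hk = b.customer_hk
JOIN hub_product  p ON p.product_hk  = b.product_hk
ORDER BY c.customer_id, p.product_id
""").fetchall()
for row in bridge_rows:
    print(row)
```

Here the customer-to-product question needs two joins from the bridge rather than four across hubs and links. The saving grows with the length of the chain being collapsed.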

Key Implementation Considerations

Primary Keys
- PIT tables: Use a sequence number to maintain uniqueness and allow restartability.
- Virtual SCD2 scenarios: Hash the hub key + load date to create a surrogate key.

Metadata
- Embedded hash keys (BKCC, MLTID) typically do not persist beyond the Data Vault.
- Optional metadata such as task IDs can help with traceability but is not required.

Virtualization vs Physical Tables
- Both PIT and Bridge tables should initially be implemented as views.
- Physical tables are only necessary when query performance or business requirements demand them.

Security and Segmentation
- PITs and Bridges can also help enforce data security, e.g., segmenting PII from sensitive entities.
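
The surrogate-key option listed under Primary Keys (hash the hub key + load date) might look something like the sketch below. The choice of SHA-256, the "||" delimiter, the ISO timestamp format, and the function name scd2_surrogate_key are all assumptions made for illustration; use whatever hashing standard your vault already applies.

```python
import hashlib
from datetime import datetime


def scd2_surrogate_key(hub_hash_key: str, load_date: datetime, delimiter: str = "||") -> str:
    """Derive a deterministic surrogate key from a hub hash key and a load date.

    Assumption: SHA-256 over "<hub key>||<ISO timestamp>"; the delimiter keeps
    the two parts from running together ambiguously.
    """
    payload = f"{hub_hash_key}{delimiter}{load_date.isoformat()}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two versions of the same entity get different, repeatable keys.
key_v1 = scd2_surrogate_key("a1b2c3", datetime(2024, 1, 1))
key_v2 = scd2_surrogate_key("a1b2c3", datetime(2024, 1, 10))
print(key_v1)
print(key_v2)
```

Because the key is derived rather than generated, a virtual SCD2 view can compute it on the fly and still return the same value on every run.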

When to Use PIT and Bridge Tables

Structure	Main Purpose	Typical Use Cases
PIT Table	Snapshots historical data around hubs/links	Historical reporting, point-in-time analysis, SCD2 simulations
Bridge Table	Pre-joined distant hubs/links	Complex analytics requiring multiple joins, performance optimization, security segmentation

Practical Takeaways

- Start with virtual PITs/Bridges until performance justifies physical tables.
- Use sequence numbers or hashed surrogates to guarantee uniqueness.
- Keep metadata minimal: track what you need for traceability, but avoid persisting embedded hash keys.
- Understand the difference: PIT = batch snapshot, Bridge = pre-joined performance structure.
- Iterate: Both structures are part of the Business Vault, designed to accelerate analytics without altering the Raw Vault.

Join the Conversation

This post is based on a real discussion among experienced data professionals in the Data Community forum.

If you are designing PITs or Bridge tables, these decisions impact both query performance and model clarity.