Data Models in Data Warehousing
A data model is a conceptual representation of data structures and relationships within a data warehouse. It defines how data is organized, stored, and accessed, ensuring consistency and facilitating efficient querying for business intelligence and analytical purposes. Choosing the right data model is crucial for the performance and usability of your data warehouse.
Key Data Modeling Concepts
- Dimensions: Descriptive attributes that categorize facts. Examples include Time, Product, Customer, and Location.
- Facts: Quantitative measures that represent business events or transactions. Examples include Sales Amount, Quantity Sold, and Profit.
- Fact Tables: Contain the measures (facts) and foreign keys to dimension tables.
- Dimension Tables: Contain descriptive attributes that provide context for the facts.
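To make the relationship between facts and dimensions concrete, here is a minimal sketch in plain Python. The class names, field names, and sample values are hypothetical and chosen only for illustration; a real warehouse would hold these rows in database tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductDimRow:
    """One row of a dimension table: descriptive attributes only."""
    product_key: int
    name: str
    category: str


@dataclass(frozen=True)
class SalesFactRow:
    """One row of a fact table: a foreign key plus numeric measures."""
    product_key: int      # foreign key into the product dimension
    quantity_sold: int    # measure
    sales_amount: float   # measure


# Hypothetical sample data.
products = {
    1: ProductDimRow(1, "Widget", "Hardware"),
    2: ProductDimRow(2, "Gadget", "Electronics"),
}
sales = [
    SalesFactRow(1, 3, 30.0),
    SalesFactRow(2, 1, 25.0),
    SalesFactRow(1, 2, 20.0),
]


def sales_by_category(facts, dims):
    """Sum the sales_amount measure, grouped by a dimension attribute."""
    totals = {}
    for fact in facts:
        category = dims[fact.product_key].category
        totals[category] = totals.get(category, 0.0) + fact.sales_amount
    return totals


totals = sales_by_category(sales, products)
print(totals)
```

The fact rows carry only keys and numbers; every descriptive label used for grouping comes from the dimension.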
Common Data Model Types
1. Star Schema
The star schema is the simplest and most common data model for data warehouses. It consists of a central fact table surrounded by multiple dimension tables, resembling a star.
- Structure: Denormalized dimension tables.
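- Pros: Simple to understand; typical analytical queries need only one join per dimension, which keeps them fast.
- Cons: Denormalized dimension tables repeat values (for example, the same category name on many product rows), so an update must change every copy or the data becomes inconsistent.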
Star Schema Example

A central SalesFact table linked to DateDim, ProductDim, and CustomerDim tables.
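The example above can be sketched with Python's built-in sqlite3 module. The table names come from the example; the column names and sample rows are assumptions made for illustration, not a prescribed layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names and sample data are illustrative assumptions.
# Note: SQLite only enforces REFERENCES when PRAGMA foreign_keys = ON;
# here the clauses simply document the relationships.
conn.executescript("""
CREATE TABLE DateDim (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    year      INTEGER
);
CREATE TABLE ProductDim (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT,   -- denormalized: stored on every product row
    brand        TEXT    -- denormalized: stored on every product row
);
CREATE TABLE CustomerDim (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT,
    city          TEXT
);
CREATE TABLE SalesFact (
    date_key      INTEGER REFERENCES DateDim(date_key),
    product_key   INTEGER REFERENCES ProductDim(product_key),
    customer_key  INTEGER REFERENCES CustomerDim(customer_key),
    quantity_sold INTEGER,
    sales_amount  REAL
);

INSERT INTO DateDim VALUES
    (20240101, '2024-01-01', 2024),
    (20240102, '2024-01-02', 2024);
INSERT INTO ProductDim VALUES
    (1, 'Widget', 'Hardware', 'Acme'),
    (2, 'Gadget', 'Electronics', 'Globex');
INSERT INTO CustomerDim VALUES
    (1, 'Alice', 'Berlin'),
    (2, 'Bob', 'Madrid');
INSERT INTO SalesFact VALUES
    (20240101, 1, 1, 3, 30.0),
    (20240101, 2, 2, 1, 25.0),
    (20240102, 1, 2, 2, 20.0);
""")

# Sales by category: a single join from the fact table to one dimension.
star_rows = conn.execute("""
    SELECT p.category, SUM(f.sales_amount) AS total_sales
    FROM SalesFact AS f
    JOIN ProductDim AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(star_rows)
```

Each dimension is exactly one join away from the fact table, which is what keeps star-schema queries short.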
2. Snowflake Schema
The snowflake schema is an extension of the star schema where dimension tables are further normalized into sub-dimensions. This creates a more complex, snowflake-like structure.
- Structure: Normalized dimension tables.
- Pros: Reduces data redundancy, improves data integrity, can save storage space.
- Cons: More complex queries due to additional joins, potentially slower query performance compared to star schema.
Snowflake Schema Example

The ProductDim table in a snowflake schema might be normalized into ProductDim, CategoryDim, and BrandDim tables.
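A minimal sketch of that normalization, again using sqlite3. The table names follow the example; the columns, the simplified SalesFact table, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Columns and data are illustrative assumptions. Category and brand
# names are stored once each and referenced by key from ProductDim.
conn.executescript("""
CREATE TABLE CategoryDim (
    category_key  INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE BrandDim (
    brand_key  INTEGER PRIMARY KEY,
    brand_name TEXT
);
CREATE TABLE ProductDim (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES CategoryDim(category_key),
    brand_key    INTEGER REFERENCES BrandDim(brand_key)
);
CREATE TABLE SalesFact (
    product_key  INTEGER REFERENCES ProductDim(product_key),
    sales_amount REAL
);

INSERT INTO CategoryDim VALUES (1, 'Hardware'), (2, 'Electronics');
INSERT INTO BrandDim VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO ProductDim VALUES
    (1, 'Widget',   1, 1),
    (2, 'Gadget',   2, 2),
    (3, 'Sprocket', 1, 1);
INSERT INTO SalesFact VALUES (1, 30.0), (2, 25.0), (3, 20.0);
""")

# Reaching category and brand now takes three joins instead of one.
snowflake_rows = conn.execute("""
    SELECT c.category_name, b.brand_name, SUM(f.sales_amount) AS total_sales
    FROM SalesFact AS f
    JOIN ProductDim  AS p ON p.product_key  = f.product_key
    JOIN CategoryDim AS c ON c.category_key = p.category_key
    JOIN BrandDim    AS b ON b.brand_key    = p.brand_key
    GROUP BY c.category_name, b.brand_name
    ORDER BY c.category_name, b.brand_name
""").fetchall()

print(snowflake_rows)
```

Renaming a category here is a one-row update in CategoryDim, at the cost of the extra joins in every query that needs the category name.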
3. Data Vault Model
The Data Vault model is a hybrid approach that combines elements of third normal form (3NF) and star schema modeling, designed for agile and scalable data warehousing. It focuses on tracking historical changes and providing an audit trail.
- Structure: Three core table types: Hubs (business keys), Links (relationships between keys), and Satellites (descriptive attributes and history).
- Pros: Highly scalable, adaptable to changes, excellent for auditing and tracking history.
- Cons: More complex for end users to query directly because of the number of tables and joins involved; usually requires specialized BI tools or a layer of views for easier access.
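The three table types can be sketched as follows. This is a simplified illustration under stated assumptions: the table and column names (HubCustomer, LinkPurchase, SatCustomer, load_date, record_source), the hash_key helper, and the sample data are all hypothetical, and real Data Vault implementations follow more detailed conventions.

```python
import hashlib
import sqlite3


def hash_key(*business_keys: str) -> str:
    """Hypothetical helper: derive a surrogate key from business key(s)."""
    joined = "|".join(business_keys)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs: one row per unique business key.
CREATE TABLE HubCustomer (
    customer_hk   TEXT PRIMARY KEY,
    customer_id   TEXT UNIQUE,
    load_date     TEXT,
    record_source TEXT
);
CREATE TABLE HubProduct (
    product_hk    TEXT PRIMARY KEY,
    product_id    TEXT UNIQUE,
    load_date     TEXT,
    record_source TEXT
);
-- Link: a relationship between hub keys.
CREATE TABLE LinkPurchase (
    purchase_hk   TEXT PRIMARY KEY,
    customer_hk   TEXT REFERENCES HubCustomer(customer_hk),
    product_hk    TEXT REFERENCES HubProduct(product_hk),
    load_date     TEXT,
    record_source TEXT
);
-- Satellite: descriptive attributes, one row per change over time.
CREATE TABLE SatCustomer (
    customer_hk   TEXT REFERENCES HubCustomer(customer_hk),
    load_date     TEXT,
    city          TEXT,
    record_source TEXT,
    PRIMARY KEY (customer_hk, load_date)
);
""")

cust_hk = hash_key("C-001")
prod_hk = hash_key("P-100")

conn.execute("INSERT INTO HubCustomer VALUES (?, ?, ?, ?)",
             (cust_hk, "C-001", "2024-01-01", "crm"))
conn.execute("INSERT INTO HubProduct VALUES (?, ?, ?, ?)",
             (prod_hk, "P-100", "2024-01-01", "erp"))
conn.execute("INSERT INTO LinkPurchase VALUES (?, ?, ?, ?, ?)",
             (hash_key("C-001", "P-100"), cust_hk, prod_hk, "2024-01-02", "orders"))

# A changed attribute is appended as a new satellite row, never overwritten.
conn.executemany("INSERT INTO SatCustomer VALUES (?, ?, ?, ?)", [
    (cust_hk, "2024-01-01", "Berlin", "crm"),
    (cust_hk, "2024-03-01", "Madrid", "crm"),
])

# Full history for the audit trail.
history = conn.execute(
    "SELECT load_date, city FROM SatCustomer "
    "WHERE customer_hk = ? ORDER BY load_date",
    (cust_hk,),
).fetchall()

# Current value: the most recent satellite row for the business key.
current_city = conn.execute("""
    SELECT s.city
    FROM HubCustomer AS h
    JOIN SatCustomer AS s ON s.customer_hk = h.customer_hk
    WHERE h.customer_id = ?
    ORDER BY s.load_date DESC
    LIMIT 1
""", ("C-001",)).fetchone()[0]

print(history, current_city)
```

Even this small lookup needs a hub-to-satellite join and a "latest row" filter, which is why views are commonly placed on top of the raw tables for end users.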
Choosing the Right Model
The choice of data model depends on several factors:
- Project requirements: Complexity of business logic, data sources.
- Performance needs: Query speed expectations for analytical reports.
- Scalability: Future growth of data volume and complexity.
- Team expertise: Familiarity with different modeling techniques.
For many new data warehouse projects, the star schema is a good starting point because of its simplicity and query performance. A snowflake schema may be more suitable when redundancy in large dimensions becomes costly to manage, and a Data Vault model when auditability and adapting to frequent change are the main concerns.