MSDN Documentation

Understanding Facts and Dimensions in Data Warehousing

Many data warehouses are built on a dimensional model, which organizes data into facts and dimensions. This structure makes business processes easier to query and analyze. Understanding the distinction between facts and dimensions is fundamental to designing and implementing an effective data warehouse.

Facts: The Measurements of a Business Process

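Facts are the numeric measures of a business process, such as the metrics behind key performance indicators (KPIs). Most facts are additive, meaning they can be summed across any dimension, although some (such as account balances or ratios) are only semi-additive or non-additive. Each fact is tied to a specific business event or transaction.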

Common examples of facts include sales amount and quantity sold.

Facts are stored in fact tables, which are typically large and contain foreign keys linking to dimension tables.

Example Fact Table (Sales):


CREATE TABLE FactSales (
    SalesKey INT PRIMARY KEY,        -- Identifies one fact row
    DateKey INT,                     -- Dimension keys: the context of the sale
    ProductKey INT,
    StoreKey INT,
    CustomerKey INT,
    SalesAmount DECIMAL(10, 2),      -- Measures: the facts themselves
    QuantitySold INT,
    -- Each dimension key references the primary key of its dimension table
    FOREIGN KEY (DateKey) REFERENCES DimDate(DateKey),
    FOREIGN KEY (ProductKey) REFERENCES DimProduct(ProductKey),
    FOREIGN KEY (StoreKey) REFERENCES DimStore(StoreKey),
    FOREIGN KEY (CustomerKey) REFERENCES DimCustomer(CustomerKey)
);
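
Because SalesAmount and QuantitySold are additive, they can be summed along any of the dimension keys. The following is a minimal sketch of that idea, assuming FactSales has been created and loaded as defined above; the column aliases are illustrative only.

```sql
-- Total the additive measures for each store.
SELECT
    StoreKey,
    SUM(SalesAmount)  AS TotalSalesAmount,
    SUM(QuantitySold) AS TotalQuantitySold,
    COUNT(*)          AS SalesRowCount      -- number of fact rows per store
FROM FactSales
GROUP BY StoreKey
ORDER BY TotalSalesAmount DESC;
```

Replacing StoreKey with DateKey, ProductKey, or CustomerKey gives the same totals along a different dimension.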
            

Dimensions: The Context for Facts

Dimensions provide the context for the facts. They describe the who, what, where, when, why, and how of the business event. Dimension tables contain descriptive attributes that allow users to slice, dice, and filter the fact data.

Dimension attributes are typically used for grouping, filtering, and labeling in reports and queries. They are usually textual or categorical and are not directly aggregated.

Common examples of dimensions include date, product, store, and customer.

Dimension tables are usually smaller than fact tables and contain primary keys that are referenced by foreign keys in the fact table.

Example Dimension Table (Product):


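CREATE TABLE DimProduct (
    ProductKey INT PRIMARY KEY,      -- Referenced by FactSales.ProductKey
    ProductName VARCHAR(255),        -- Descriptive attributes used to
    ProductCategory VARCHAR(100),    -- group, filter, and label facts
    ProductBrand VARCHAR(100),
    SKU VARCHAR(50)                  -- Stock keeping unit
);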
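
To illustrate how dimension attributes group and label fact data, the sketch below slices the sales measures by product category and brand. It assumes the FactSales and DimProduct tables defined in this article; the aliases f and p and the output column names are illustrative.

```sql
-- Group fact rows by descriptive attributes from the product dimension.
SELECT
    p.ProductCategory,
    p.ProductBrand,
    SUM(f.QuantitySold) AS TotalQuantitySold,
    SUM(f.SalesAmount)  AS TotalSalesAmount
FROM FactSales AS f
JOIN DimProduct AS p
    ON p.ProductKey = f.ProductKey          -- foreign key to primary key
GROUP BY p.ProductCategory, p.ProductBrand
ORDER BY p.ProductCategory, p.ProductBrand;
```

The attributes appear only in GROUP BY and as labels in the result; the aggregation itself is applied to the measures from the fact table.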
            

The Relationship: A Sales Transaction

Consider a single sales transaction. The facts might be the SalesAmount and QuantitySold. The dimensions providing context would be:

  • When: The date of the sale (from DimDate).
  • What: The product sold (from DimProduct).
  • Where: The store where the sale occurred (from DimStore).
  • Who: The customer who made the purchase (from DimCustomer).

A query to find the total sales amount for a specific product category in a particular month would join the FactSales table with the DimProduct and DimDate tables, filtering by category and month.
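
Such a query might look like the sketch below. This article does not define DimDate, so the table shown here, including the FullDate, CalendarYear, and MonthNumber columns, is a hypothetical layout used only for illustration. The category value 'Beverages' and the chosen month are likewise assumed example values.

```sql
-- Hypothetical date dimension; the column names are assumptions.
-- In engines that validate foreign keys at creation time, this table
-- would need to exist before FactSales is created.
CREATE TABLE DimDate (
    DateKey INT PRIMARY KEY,
    FullDate DATE,
    CalendarYear INT,
    MonthNumber INT                         -- 1 = January ... 12 = December
);

-- Total sales amount for one product category in one month.
SELECT
    p.ProductCategory,
    d.CalendarYear,
    d.MonthNumber,
    SUM(f.SalesAmount) AS TotalSalesAmount
FROM FactSales AS f
JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
JOIN DimDate    AS d ON d.DateKey    = f.DateKey
WHERE p.ProductCategory = 'Beverages'       -- assumed example category
  AND d.CalendarYear = 2023                 -- assumed example year
  AND d.MonthNumber = 10                    -- assumed example month
GROUP BY p.ProductCategory, d.CalendarYear, d.MonthNumber;
```

The filters are applied to dimension attributes, while the only aggregated column is the SalesAmount measure from the fact table.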

Key Design Principles

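Separating facts from dimensions keeps the measures in fact tables and the descriptive context in dimension tables, so the same facts can be aggregated, filtered, and grouped along any combination of dimensions. This is what allows a data warehouse to support flexible analysis and informed business decisions.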

Last Updated: October 26, 2023