Processing Objects in Multidimensional Modeling
This document explains the concepts and methods for processing objects in SQL Server Analysis Services (SSAS) multidimensional models. Processing is the act of loading data into an Analysis Services database, cube, or dimension, and making that data available for querying.
Understanding Processing
Processing involves several key steps:
- Data Extraction: Reading data from the underlying relational data sources.
- Data Transformation: Applying any necessary transformations or calculations.
- Data Loading: Writing the processed data into the Analysis Services storage engine.
- Indexing and Aggregation: Building indexes and calculating aggregations to optimize query performance.
How an object is processed depends on its type and on its relationships with other objects. Analysis Services evaluates these dependencies and processes objects in an order that preserves data integrity.
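The four steps listed above can be pictured with a small, purely illustrative pipeline. The sketch below is not how the Analysis Services storage engine works internally; the row data and the function names (`extract`, `transform`, `load`, `aggregate`) are hypothetical and exist only to show the order of the steps.

```python
from collections import defaultdict

# Hypothetical source rows standing in for a relational fact table.
SOURCE_ROWS = [
    {"product": "Bike", "region": "West", "amount": "100.50"},
    {"product": "Bike", "region": "East", "amount": "200.00"},
    {"product": "Helmet", "region": "West", "amount": "25.25"},
]


def extract(rows):
    """Extraction: read rows from the (here, in-memory) source."""
    return list(rows)


def transform(rows):
    """Transformation: convert the text amount into a number."""
    return [{**row, "amount": float(row["amount"])} for row in rows]


def load(rows):
    """Loading: write the rows into a toy storage structure (a list)."""
    return list(rows)


def aggregate(store, attribute):
    """Aggregation: precompute the total amount per attribute value."""
    totals = defaultdict(float)
    for row in store:
        totals[row[attribute]] += row["amount"]
    return dict(totals)


store = load(transform(extract(SOURCE_ROWS)))
totals_by_product = aggregate(store, "product")
totals_by_region = aggregate(store, "region")
```

Queries that ask for totals by product or region can then read the precomputed dictionaries instead of scanning every row, which is the general idea behind building aggregations at processing time.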
Processing Methods
There are several processing methods. Each description notes the corresponding Analysis Services processing option:
- Full Processing (Process Full): This is the most comprehensive method. It drops all existing data for the object and repopulates it from the source. It is required after a structural change to the object and is typical for initial loads.
- Incremental Processing (Process Add, or Process Update for dimensions): Process Add appends new fact rows or dimension members. Process Update re-reads dimension data and applies inserts, updates, and deletions. Both are usually faster than full processing for frequently updated data.
- Clear Processing (Process Clear, also called Unprocess): This method drops the data for an object without reloading it. The object cannot be queried until it is processed again, so this is useful when you want to reset an object's data and reload it later.
- Data Rebuild (Process Index): This operation rebuilds the indexes and aggregations of an object from data that is already loaded, without re-reading the source.
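One way to request a processing method is an XMLA `Process` command, which names the target object and a `Type`. The sketch below builds such a command as text. The mapping from this document's method names to XMLA types is an interpretation made for this example (incremental processing of a dimension would normally use `ProcessUpdate` rather than `ProcessAdd`). The identifiers `SalesDB` and `Customer` are hypothetical, and the element order follows commands as commonly scripted by management tools, so check it against the XMLA reference before relying on it.

```python
import xml.etree.ElementTree as ET

XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"

# Assumed mapping from the method names in this document to XMLA Type values.
PROCESS_TYPES = {
    "full": "ProcessFull",
    "incremental": "ProcessAdd",
    "clear": "ProcessClear",
    "rebuild": "ProcessIndexes",
}


def build_process_command(method, object_ids):
    """Return an XMLA Process command as a string.

    object_ids is an ordered list of (element name, id) pairs, for example
    [("DatabaseID", "SalesDB"), ("DimensionID", "Customer")].
    """
    process = ET.Element("Process", {"xmlns": XMLA_NS})
    obj = ET.SubElement(process, "Object")
    for element_name, object_id in object_ids:
        ET.SubElement(obj, element_name).text = object_id
    ET.SubElement(process, "Type").text = PROCESS_TYPES[method]
    return ET.tostring(process, encoding="unicode")


# Hypothetical usage: fully process one dimension.
command = build_process_command(
    "full", [("DatabaseID", "SalesDB"), ("DimensionID", "Customer")]
)
print(command)
```

The function only produces the command text. Sending it to a server requires an XMLA client, which is outside the scope of this sketch.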
Processing a Database
Processing an entire Analysis Services database means processing all of its constituent objects. You can process the whole database in a single operation or process individual objects within it.
When you process a database, Analysis Services follows a defined order to ensure dependencies are met:
- Dimensions
- Cubes (including their measure groups and partitions)
- Mining structures (if any)
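The ordering above can be sketched as a simple sort by object type. This is an illustration only: the server resolves dependencies itself when you process a database, and the object names and the `TYPE_ORDER` table here are hypothetical.

```python
# Processing order taken from the list above: lower numbers go first.
TYPE_ORDER = {"dimension": 0, "cube": 1, "mining_structure": 2}


def processing_order(objects):
    """Sort (object_type, name) pairs into dependency order.

    sorted() is stable, so objects of the same type keep their input order.
    """
    return sorted(objects, key=lambda item: TYPE_ORDER[item[0]])


objects = [
    ("cube", "Sales"),
    ("mining_structure", "Customer Clusters"),
    ("dimension", "Customer"),
    ("dimension", "Date"),
]
ordered = processing_order(objects)
```

A script that processes objects one at a time could walk `ordered` from first to last so that every dimension is ready before the cubes that use it.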
Processing Tables and Partitions
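Tables and partitions are the fundamental units of data storage. (In a multidimensional model, the processable object that corresponds to a fact table is the measure group, which is divided into partitions.) When a table or partition is processed, its data is loaded from the data source.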
Table Processing
Processing a table typically involves:
- Clearing existing data in the table.
- Extracting new data from the source query.
- Loading the extracted data.
Partition Processing
Partitions allow you to divide large tables into smaller, manageable units. Processing a partition means loading data for that specific partition.
You can process partitions individually or as part of a larger processing operation for the table or cube.
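Several partitions can be processed in one request by wrapping individual `Process` commands in an XMLA `Batch` with a `Parallel` element. The sketch below builds such a batch as text. All object identifiers are hypothetical, and the function name `build_partition_batch` is invented for this example.

```python
import xml.etree.ElementTree as ET

XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"


def build_partition_batch(database_id, cube_id, measure_group_id,
                          partition_ids, process_type="ProcessFull"):
    """Return an XMLA Batch that processes the given partitions in parallel."""
    batch = ET.Element("Batch", {"xmlns": XMLA_NS})
    parallel = ET.SubElement(batch, "Parallel")
    for partition_id in partition_ids:
        process = ET.SubElement(parallel, "Process")
        obj = ET.SubElement(process, "Object")
        # A partition is addressed through its full parent chain.
        ET.SubElement(obj, "DatabaseID").text = database_id
        ET.SubElement(obj, "CubeID").text = cube_id
        ET.SubElement(obj, "MeasureGroupID").text = measure_group_id
        ET.SubElement(obj, "PartitionID").text = partition_id
        ET.SubElement(process, "Type").text = process_type
    return ET.tostring(batch, encoding="unicode")


# Hypothetical usage: reload two yearly partitions.
batch = build_partition_batch(
    "SalesDB", "Sales", "Internet Sales", ["Sales_2023", "Sales_2024"]
)
print(batch)
```

Processing only the partitions whose source data changed, rather than the whole cube, is usually the main reason to partition in the first place.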
Processing Cubes and Dimensions
Dimension Processing
Dimensions hold the descriptive attributes used for slicing and dicing data. Processing a dimension:
- Updates attribute data.
- Rebuilds dimension hierarchies.
- Updates dimension caches for faster lookups.
Key concepts for dimension processing include:
- Attribute Relationships: Processing ensures that relationships between attributes are correctly maintained.
- Key Attribute Order: The order of attributes influences processing performance.
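An attribute relationship is many-to-one: each member of the child attribute belongs to exactly one member of the parent attribute. The sketch below checks that property on sample data before processing. It is a hypothetical pre-check written for illustration, not something Analysis Services exposes, and the `city` and `state` attribute names are assumptions.

```python
def find_relationship_violations(members, child, parent):
    """Return child values that map to more than one parent value."""
    parents_seen = {}
    for member in members:
        parents_seen.setdefault(member[child], set()).add(member[parent])
    return {
        value: sorted(parents)
        for value, parents in parents_seen.items()
        if len(parents) > 1
    }


# Hypothetical dimension rows: "Portland" appears under two states.
members = [
    {"city": "Seattle", "state": "Washington"},
    {"city": "Portland", "state": "Oregon"},
    {"city": "Portland", "state": "Maine"},
]
violations = find_relationship_violations(members, "city", "state")
```

A non-empty result suggests that the child attribute's key does not uniquely identify its members, for example because the city key should include the state.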
Cube Processing
Processing a cube is a crucial step that makes the data available for analysis. When a cube is processed:
- Dimension Usage: The cube's relationships with its dimensions are considered.
- Measure Calculations: Aggregations and calculations for measures are performed.
- Aggregations: Pre-calculated aggregations are built to speed up query responses.
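You can process the entire cube or specific parts of it, such as individual measure groups or partitions. Calculated members are not processed separately; they are defined in the cube's MDX script and evaluated at query time.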
Processing Relationships
Analysis Services manages relationships between dimensions and fact tables (within cubes) to ensure data consistency. During processing, these relationships are enforced.
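When a dimension used by a cube is processed, the cube may need to be reprocessed. Fully processing a dimension (Process Full) invalidates the data in every cube that uses it, so those cubes must be processed again. An update (Process Update) keeps the cube data but can drop aggregations that depend on changed members. The management tools can list the affected objects before processing starts and can be configured to process them as part of the same operation.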
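Working out which cubes a dimension change touches amounts to a lookup in the model's dimension usage. The sketch below assumes a hypothetical `CUBE_DIMENSIONS` map maintained by hand; in a real deployment this information would come from the database metadata.

```python
# Hypothetical model: the dimensions each cube uses.
CUBE_DIMENSIONS = {
    "Sales": {"Customer", "Date", "Product"},
    "Inventory": {"Date", "Product"},
    "HR": {"Employee", "Date"},
}


def cubes_affected_by(dimension, cube_dimensions):
    """Return the sorted names of cubes that reference the given dimension."""
    return sorted(
        cube for cube, dims in cube_dimensions.items() if dimension in dims
    )


affected = cubes_affected_by("Product", CUBE_DIMENSIONS)
```

A processing script could use the result to schedule the affected cubes immediately after the dimension, so they are not left unprocessed.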
Best Practices
- Schedule processing during off-peak hours.
- Use incremental processing whenever possible.
- Monitor processing jobs for errors and performance.
- Test processing thoroughly after schema changes.
- Consider the order of processing objects (dimensions before cubes).
Understanding and effectively managing the processing of your Analysis Services objects is fundamental to maintaining a responsive and accurate data model.