Data Mining Extensions (DMX)

This document describes Data Mining Extensions (DMX), the query language in SQL Server Analysis Services (SSAS) for creating, training, querying, and managing data mining models.

Introduction to DMX

DMX is a declarative query language designed for working with data mining models in Analysis Services. It resembles SQL in syntax but is tailored to data mining tasks. DMX allows you to create mining models, generate predictions, browse model content, and score data.

Understanding DMX is essential for developers and analysts who work with the data mining features of SQL Server Analysis Services.

DMX Syntax Overview

A DMX query typically has clauses that select expressions, name the source model, filter the results, and order the output. The core statements are SELECT, INSERT INTO, CREATE MINING MODEL, ALTER MINING STRUCTURE, and DROP MINING MODEL.

A common pattern for querying is:

SELECT
    <expression list>
FROM
    <model>
[WHERE
    <condition expression>]
[ORDER BY
    <expression>]

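For predictions, the query applies a prediction function and joins the model to the input cases with a PREDICTION JOIN:

SELECT
    Predict(<predictable column>)
FROM
    <model>
[NATURAL] PREDICTION JOIN
    <source data query> AS <alias>
[ON
    <join mapping list>]
[WHERE
    <condition expression>]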

Common DMX Operations

Creating Mining Models

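DMX can be used to create mining structures and models programmatically. Defining a model involves declaring its columns, choosing the mining algorithm, and setting the algorithm parameters. Training data is supplied afterward with an INSERT INTO statement.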

Example structure (simplified):

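CREATE MINING MODEL MyNewModel
(
    [CustomerID] LONG KEY,
    [Age] DOUBLE DISCRETIZED(EQUAL_AREAS, 10),  -- bucket ages into 10 ranges
    [Gender] TEXT DISCRETE PREDICT              -- decision trees need a predictable column
)
USING
    Microsoft_Decision_Trees
(
    COMPLEXITY_PENALTY = 0.5  -- higher values restrict tree growth
);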
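
A newly created model is empty until it is trained. The following is a minimal sketch of training MyNewModel with INSERT INTO; the data source name [MyDataSource] and the dbo.Customers table are hypothetical and would need to match objects in your own Analysis Services database.

```dmx
-- Hypothetical names: [MyDataSource] is an assumed Analysis Services data
-- source, and dbo.Customers is an assumed relational table.
INSERT INTO MINING MODEL MyNewModel
(
    [CustomerID], [Age], [Gender]   -- model columns, in source-query order
)
OPENQUERY([MyDataSource],
    'SELECT CustomerID, Age, Gender FROM dbo.Customers');
```

The column list maps model columns to the columns returned by the source query by position, so the two lists should appear in the same order.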

Predicting Data

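The Predict function generates predictions for new cases, which are supplied to the model through a PREDICTION JOIN. You can predict a single value or multiple possible values.

SELECT
    Predict([Product]) AS PredictedProduct
FROM
    MyModel
NATURAL PREDICTION JOIN
    -- Singleton input case; add the case's known attribute values as further columns
    (SELECT 'ABC-123' AS [CustomerID]) AS t;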
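
To return multiple candidate values rather than a single prediction, one option is the PredictHistogram function, which returns every possible state of the column together with statistics such as $SUPPORT and $PROBABILITY. The sketch below assumes MyModel has input columns named [Age] and [Gender]; those names and values are illustrative only.

```dmx
-- [Age] and [Gender] are assumed input columns of MyModel (hypothetical).
-- FLATTENED turns the nested histogram table into ordinary rows.
SELECT FLATTENED
    PredictHistogram([Product])
FROM
    MyModel
NATURAL PREDICTION JOIN
    (SELECT 35 AS [Age], 'F' AS [Gender]) AS t;
```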

Browsing Model Content

DMX allows you to explore the internal structure of a trained mining model, such as decision tree nodes, clusters, or association rules.

SELECT * FROM MyModel.CONTENT;
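
Model content is exposed as a rowset with one row per node, so browsing can be narrowed with an ordinary WHERE clause. A minimal sketch, assuming MyModel was built with a decision tree algorithm:

```dmx
-- Assumes MyModel is a decision tree model.
-- NODE_TYPE 4 identifies leaf (distribution) nodes.
SELECT
    NODE_UNIQUE_NAME,
    NODE_CAPTION,
    NODE_SUPPORT      -- number of training cases that reached the node
FROM
    MyModel.CONTENT
WHERE
    NODE_TYPE = 4;
```

The meaning of each content column varies by algorithm, so the same query against a clustering or association model returns different kinds of nodes.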

Scoring Data

Scoring involves applying a trained model to a dataset to get predictions or probabilities. This can be done using Predict or other prediction functions.

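-- [MyDataSource] and the Customers table are hypothetical names
SELECT
    t.[CustomerID],
    Predict([IsHighValueCustomer]) AS PredictedValue,
    PredictProbability([IsHighValueCustomer], 1) AS ProbabilityOfHighValue
FROM
    MyCustomerModel
NATURAL PREDICTION JOIN
    OPENQUERY([MyDataSource],
        'SELECT CustomerID, TotalSpend FROM Customers WHERE TotalSpend > 1000') AS t;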
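
When only the strongest candidates matter, a scoring query can be combined with TOP and ORDER BY. The following sketch ranks customers by predicted probability; the data source [MyDataSource] and the Customers table are hypothetical names, not part of the model above.

```dmx
-- Hypothetical names: [MyDataSource] and the Customers table.
-- Returns the 10 customers most likely to be high value.
SELECT TOP 10
    t.[CustomerID],
    PredictProbability([IsHighValueCustomer], 1) AS ProbabilityOfHighValue
FROM
    MyCustomerModel
NATURAL PREDICTION JOIN
    OPENQUERY([MyDataSource],
        'SELECT CustomerID, TotalSpend FROM Customers') AS t
ORDER BY
    PredictProbability([IsHighValueCustomer], 1) DESC;
```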

Key DMX Statements

DMX Examples

Here are a few more practical examples:

Example 1: Predict next purchase for a customer

SELECT
    Predict([ProductBought]) AS NextProduct
FROM
    MyPurchaseModel
NATURAL PREDICTION JOIN
    -- Supply the customer's known attributes (such as order history) as further input columns
    (SELECT 'XYZ-789' AS [CustomerID]) AS t;
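
Example 2: Get probability of a customer belonging to a cluster

SELECT
    Cluster(),
    ClusterProbability() AS Probability  -- probability of the most likely cluster
FROM
    MyClusteringModel
NATURAL PREDICTION JOIN
    -- Include the customer's attribute values as input columns for a meaningful assignment
    (SELECT 'PQR-456' AS [CustomerID]) AS t;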
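
Example 3: Browse rules in an association model

SELECT
    MODEL_NAME,
    NODE_UNIQUE_NAME,   -- rule identifier
    NODE_SUPPORT,       -- number of supporting cases (a count, not a fraction)
    NODE_PROBABILITY    -- confidence of the rule
FROM
    MyAssociationModel.CONTENT
WHERE
    NODE_TYPE = 8;      -- 8 = rule node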

Best Practices