DMX Syntax Reference
This document is a reference for the Data Mining Extensions (DMX) language used in SQL Server Analysis Services. DMX is a query language for creating and training data mining models, browsing their content, and running prediction queries against them.
Overview of DMX
DMX is designed for mining structured data. It provides a rich set of functions and statements to interact with various mining models, including:
- Classification Models
- Clustering Models
- Association Rules Models
- Sequence Clustering Models
- Time Series Models
- Linear Regression Models
- Logistic Regression Models
DMX syntax is similar to SQL, but it is tailored for the unique requirements of data mining.
Core DMX Statements
The following are some of the fundamental DMX statements:
SELECT Statement
The SELECT statement retrieves content from a mining model or generates predictions by joining the model to input data with PREDICTION JOIN. It supports a WHERE clause for filtering, an ORDER BY clause for ordering, and TOP for limiting the number of rows returned.
Example: Predicting values from a classification model
SELECT
  TargetMail.[Customer Name] AS Customer,
  Predict([Generic Classification].[Product]) AS PredictedProduct
FROM
  [Generic Classification]
PREDICTION JOIN
  OPENQUERY(MyDataSource, 'SELECT * FROM vTargetMail') AS TargetMail
ON
  [Generic Classification].[Customer] = TargetMail.[Customer Name]
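As a hedged sketch of filtering and ordering, the query below extends the example above with TOP, WHERE, and ORDER BY. It assumes that [Product] is a predictable column of [Generic Classification]. The alias t, the row limit, and the 0.5 threshold are illustrative choices.

```dmx
-- Return the 10 most confident predictions with probability above 0.5.
-- Assumption: [Product] is a predictable column of the model.
SELECT TOP 10
  t.[Customer Name] AS Customer,
  Predict([Generic Classification].[Product]) AS PredictedProduct,
  PredictProbability([Generic Classification].[Product]) AS Probability
FROM
  [Generic Classification]
PREDICTION JOIN
  OPENQUERY(MyDataSource, 'SELECT * FROM vTargetMail') AS t
ON
  [Generic Classification].[Customer] = t.[Customer Name]
WHERE
  PredictProbability([Generic Classification].[Product]) > 0.5
ORDER BY
  PredictProbability([Generic Classification].[Product]) DESC
```

The WHERE clause is applied to the prediction results, so it can filter on prediction functions as well as on input columns.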
INSERT Statement
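The INSERT INTO statement trains (processes) a mining model or mining structure by feeding it source data. The source data is supplied through OPENQUERY, OPENROWSET, or SHAPE rather than a bare SELECT. Run it when a model is first trained or when it is retrained.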
Example: Inserting data into a mining model
-- MyDataSource is assumed to be a data source defined in the database.
INSERT INTO [MyMiningModel] (
  [CustomerID],
  [Age],
  [Gender],
  [Income],
  [Purchased]
)
OPENQUERY(MyDataSource,
  'SELECT CustomerID, Age, Gender, Income, Purchased FROM SalesData')
CREATE MINING MODEL Statement
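This statement creates a new mining model definition, together with its underlying mining structure, within a database. It specifies the model columns with their data types and content types, the column or columns to predict, and the algorithm and algorithm parameters to use.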
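A minimal sketch of a model definition follows. The column names are taken from the INSERT example in this document. The data types, content types, the choice of Microsoft_Decision_Trees, and the COMPLEXITY_PENALTY value are assumptions made for illustration.

```dmx
-- Define a decision tree model that predicts [Purchased].
-- Types and parameter values below are illustrative assumptions.
CREATE MINING MODEL [MyMiningModel]
(
  [CustomerID] LONG KEY,
  [Age]        LONG CONTINUOUS,
  [Gender]     TEXT DISCRETE,
  [Income]     DOUBLE CONTINUOUS,
  [Purchased]  BOOLEAN DISCRETE PREDICT
)
USING Microsoft_Decision_Trees (COMPLEXITY_PENALTY = 0.5)
```

The model is empty until it is trained with an INSERT INTO statement.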
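ALTER MINING STRUCTURE Statement
Used to add a new mining model to an existing mining structure. The new model draws on the columns already defined in the structure and can use a different algorithm or different parameters. DMX has no separate ALTER MINING MODEL statement.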
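In DMX, changes of this kind are written as ALTER MINING STRUCTURE ... ADD MINING MODEL, which adds a model built on an existing structure's columns. The sketch below assumes a hypothetical structure named [MyMiningStructure] that already contains the listed columns. The model name and the CLUSTER_COUNT value are also illustrative.

```dmx
-- Add a clustering model to an existing structure.
-- [MyMiningStructure] and [MyClusteringModel] are hypothetical names.
ALTER MINING STRUCTURE [MyMiningStructure]
ADD MINING MODEL [MyClusteringModel]
(
  [CustomerID],
  [Age],
  [Gender],
  [Income]
)
USING Microsoft_Clustering (CLUSTER_COUNT = 5)
```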
DROP MINING MODEL Statement
Removes a mining model from the database.
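For illustration, dropping the model used in the earlier examples would look like the following. The model name is the one assumed throughout this document.

```dmx
-- Remove the model definition and its trained content.
DROP MINING MODEL [MyMiningModel]
```

The underlying mining structure is a separate object and is removed with DROP MINING STRUCTURE.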
Common DMX Functions
DMX provides a rich library of built-in functions for data manipulation, calculations, and model-specific operations. Some common categories include:
Prediction Functions
Functions used to retrieve predicted values, probabilities, and related statistics from a trained model.
- Predict()
- PredictProbability()
- PredictHistogram()
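The following singleton query is a sketch of how prediction functions are typically combined. It assumes a trained model [MyMiningModel] with a predictable [Purchased] column and input columns [Age], [Gender], and [Income]. The input values are made up.

```dmx
-- Predict for a single hand-written case instead of an external data source.
SELECT
  Predict([Purchased])            AS PredictedValue,
  PredictProbability([Purchased]) AS Probability,
  PredictHistogram([Purchased])   AS Histogram  -- nested table of all states
FROM
  [MyMiningModel]
NATURAL PREDICTION JOIN
  (SELECT 35 AS [Age], 'F' AS [Gender], 52000 AS [Income]) AS t
```

NATURAL PREDICTION JOIN maps input columns to model columns by name, so no ON clause is needed.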
Model Navigation Functions
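Functions that help you work with the node structure of a mining model, for example when exploring trees, clusters, or association rules. Model content itself is browsed with SELECT ... FROM <model>.CONTENT.
- IsDescendant()
- IsInNode()
- PredictNodeId()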
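Model structure is usually explored by querying the model's content rowset. The sketch below assumes a trained model named [MyMiningModel]. The support threshold of 100 is an arbitrary illustrative value.

```dmx
-- List nodes of the model that are backed by more than 100 training cases.
SELECT
  NODE_UNIQUE_NAME,
  NODE_CAPTION,
  NODE_TYPE,
  NODE_SUPPORT
FROM
  [MyMiningModel].CONTENT
WHERE
  NODE_SUPPORT > 100
```

What each node represents (a tree split, a cluster, a rule) depends on the algorithm that built the model.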
Set Functions
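Functions for working with table expressions, such as nested tables and prediction histograms. They play a role similar to the set functions found in MDX.
- TopCount()
- BottomCount()
- TopPercent()
- TopSum()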
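As a sketch of how a table-valued result can be trimmed, the query below keeps only the two most probable states from a prediction histogram. It assumes a trained model [MyMiningModel] with a predictable [Purchased] column. The input values are made up.

```dmx
-- Keep the 2 most probable predicted states and flatten the nested result.
SELECT FLATTENED
  TopCount(PredictHistogram([Purchased]), $Probability, 2)
FROM
  [MyMiningModel]
NATURAL PREDICTION JOIN
  (SELECT 35 AS [Age], 'F' AS [Gender]) AS t
```

FLATTENED converts the nested table into ordinary rows, which is easier to consume in clients that do not support hierarchical rowsets.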
String and Numeric Functions
Standard functions for manipulating strings and performing numerical calculations.
DMX Syntax Elements
DMX uses various keywords, identifiers, and operators. Understanding these elements is crucial for writing effective DMX queries.
Keywords
Reserved words with special meanings, such as SELECT, FROM, WHERE, INSERT, CREATE, PREDICTION JOIN.
Identifiers
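Names given to mining models, columns, and other objects. An identifier can be written as is when it contains no spaces or special characters, or delimited with square brackets. Brackets are required when the name contains spaces or matches a reserved word.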
[MyMiningModel]
[Customer].[Age]
Operators
Symbols used to perform operations, such as comparison operators (=, <>, <, >, <=, >=), logical operators (AND, OR, NOT), and arithmetic operators (+, -, *, /).
Data Types in DMX
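DMX defines the following data types for mining model columns:
- LONG (integer values)
- DOUBLE (floating-point values)
- TEXT (strings)
- DATE (date and time values)
- BOOLEAN
Nested table columns are declared with the TABLE keyword.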
The data types used in DMX queries should generally correspond to the data types of the columns in the mining model and the underlying data sources.