MSDN Documentation

SQL Functions: A Comprehensive Guide

Functions are reusable routines in SQL that perform calculations or manipulate data and return a result, either a single value or a table. They promote code reuse and modularity, which keeps SQL scripts shorter and easier to maintain.

Types of SQL Functions

SQL functions can be broadly categorized into several types:

Scalar Functions

Scalar functions take zero or more input values and return a single value each time they are called. They are the most common type of function.

  • String Functions: Manipulate string data (e.g., UPPER(), LOWER(), SUBSTRING(), LEN()).
  • Numeric Functions: Perform mathematical operations (e.g., ABS(), ROUND(), CEILING(), FLOOR()).
  • Date and Time Functions: Work with date and time values (e.g., GETDATE(), DATEPART(), DATEDIFF(), DATEADD()).
  • System Functions: Provide information about the database or system (e.g., DB_NAME(), HOST_NAME()).
  • Conversion Functions: Convert values from one data type to another (e.g., CAST(), CONVERT()).

Aggregate Functions

Aggregate functions operate on a set of rows and return a single summary value (e.g., COUNT(), SUM(), AVG(), MIN(), MAX()). They are typically combined with GROUP BY to produce one value per group.
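As a quick illustration of the scalar categories listed above, the following sketch calls a few built-in functions on literal values, so it needs no tables. It assumes SQL Server, where the string-length function is LEN().

```sql
-- Built-in scalar functions applied to literal values (SQL Server syntax).
SELECT
    UPPER('sql server')                       AS UpperText,    -- 'SQL SERVER'
    LEN('sql server')                         AS TextLength,   -- 10
    SUBSTRING('sql server', 1, 3)             AS FirstThree,   -- 'sql'
    ABS(-15)                                  AS AbsoluteVal,  -- 15
    CEILING(4.2)                              AS RoundedUp,    -- 5
    FLOOR(4.8)                                AS RoundedDown,  -- 4
    DATEDIFF(DAY, '2023-01-01', '2023-12-31') AS DaysBetween,  -- 364
    CAST('42' AS INT) + 1                     AS Converted,    -- 43
    DB_NAME()                                 AS CurrentDb;    -- name of the current database
```

Each expression is evaluated once per row of the query; with no FROM clause, the statement returns a single row.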

Table-Valued Functions (TVFs)

Table-valued functions return a table as their result set. They are similar to views but can accept parameters and contain complex logic.

  • Inline TVFs: Consist of a single SELECT statement whose result set is the returned table.
  • Multi-statement TVFs: Declare a table variable, populate it with one or more statements, and return it.

Creating User-Defined Functions (UDFs)

You can create your own functions to encapsulate logic that you use frequently, and then call them from queries just like built-in functions.

Scalar UDFs

Used when you need to compute and return a single value.

CREATE FUNCTION dbo.CalculateTax
(
    @Amount DECIMAL(18, 2),
    @TaxRate DECIMAL(5, 2)
)
RETURNS DECIMAL(18, 2)
AS
BEGIN
    RETURN @Amount * @TaxRate / 100.0;
END;

Usage:

SELECT dbo.CalculateTax(100.00, 5.0) AS TaxAmount;
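A scalar UDF is more commonly applied to a column than to a literal. The sketch below assumes dbo.CalculateTax has been created as shown above; the inline VALUES list is a stand-in for a real orders table, and the name OrderTotal is hypothetical.

```sql
-- Apply the scalar UDF to every row of a small, made-up data set.
SELECT
    o.OrderTotal,
    dbo.CalculateTax(o.OrderTotal, 7.50) AS TaxAmount  -- first row: 7.50
FROM (VALUES (100.00), (249.99)) AS o(OrderTotal);      -- hypothetical sample rows
```

The function runs once per row, and its DECIMAL(18, 2) return type limits each result to two decimal places.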

Inline Table-Valued Functions (ITVF)

Efficient for returning tabular data based on parameters. The query optimizer expands an inline TVF into the calling query, much like a parameterized view.

CREATE FUNCTION dbo.GetProductsByCategory
(
    @CategoryID INT
)
RETURNS TABLE
AS
RETURN
(
    SELECT ProductID, ProductName, UnitPrice
    FROM Products
    WHERE CategoryID = @CategoryID
);

Usage:

SELECT * FROM dbo.GetProductsByCategory(3);
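Because a TVF behaves like a table, it can also be invoked once per row of another table with CROSS APPLY. This sketch assumes a Categories table with CategoryID and CategoryName columns, which does not appear in the original example.

```sql
-- List the products of every category by calling the TVF per category row.
-- Categories(CategoryID, CategoryName) is an assumed table for illustration.
SELECT
    c.CategoryName,
    p.ProductName,
    p.UnitPrice
FROM Categories AS c
CROSS APPLY dbo.GetProductsByCategory(c.CategoryID) AS p
ORDER BY c.CategoryName, p.ProductName;
```

CROSS APPLY omits categories for which the function returns no rows; OUTER APPLY would keep them with NULL product columns.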

Multi-Statement Table-Valued Functions (MSTVF)

Provide more flexibility for complex data manipulation before returning a table.

CREATE FUNCTION dbo.GetEmployeeSalesSummary
(
    @StartDate DATE,
    @EndDate DATE
)
RETURNS @SalesSummary TABLE
(
    EmployeeID INT PRIMARY KEY,
    TotalSales DECIMAL(18, 2)
)
AS
BEGIN
    INSERT INTO @SalesSummary (EmployeeID, TotalSales)
    SELECT
        e.EmployeeID,
        SUM(od.Quantity * od.UnitPrice) AS TotalSales
    FROM Employees AS e
    JOIN Orders AS o ON e.EmployeeID = o.EmployeeID
    JOIN [Order Details] AS od ON o.OrderID = od.OrderID
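    -- Half-open range so that orders placed at any time on @EndDate are included
    WHERE o.OrderDate >= @StartDate
      AND o.OrderDate < DATEADD(DAY, 1, @EndDate)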
    GROUP BY e.EmployeeID;
    RETURN;
END;

Usage:

SELECT * FROM dbo.GetEmployeeSalesSummary('2023-01-01', '2023-12-31');
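The result of a multi-statement TVF can be joined and sorted like any other table. The following sketch assumes the Employees table has a LastName column, which the original example does not show.

```sql
-- Top five employees by sales for the year, with names looked up by join.
-- Employees.LastName is an assumed column for illustration.
SELECT TOP (5)
    e.LastName,
    s.TotalSales
FROM dbo.GetEmployeeSalesSummary('2023-01-01', '2023-12-31') AS s
JOIN Employees AS e
    ON e.EmployeeID = s.EmployeeID
ORDER BY s.TotalSales DESC;
```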

Best Practices for Using Functions

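  • Performance: Be mindful of performance, especially with UDFs that are called for every row in a large dataset. Scalar UDFs are traditionally evaluated row by row and can prevent parallel query plans; SQL Server 2019 and later can inline some scalar UDFs, but not all of them. Where the logic allows it, prefer inline TVFs to multi-statement TVFs.
  • Readability: Use functions to break down complex logic into manageable, reusable units.
  • Determinism: Understand whether your functions are deterministic (always return the same result for the same input) or non-deterministic, because only deterministic functions can be used in indexed computed columns and indexed views.
  • Error Handling: T-SQL UDFs cannot contain TRY...CATCH blocks or raise errors with THROW or RAISERROR, so validate inputs inside the function (for example, return NULL for invalid arguments) and handle errors in the calling code.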
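The determinism and error-handling points can be sketched together. The function below is a hypothetical variant of dbo.CalculateTax (the name CalculateTaxSafe is invented for this illustration). It validates its inputs explicitly, since TRY...CATCH is not permitted inside a T-SQL UDF, and it is schema-bound, which SQL Server requires before it will treat a UDF as deterministic.

```sql
-- Hypothetical variant of dbo.CalculateTax, for illustration only.
CREATE FUNCTION dbo.CalculateTaxSafe
(
    @Amount  DECIMAL(18, 2),
    @TaxRate DECIMAL(5, 2)
)
RETURNS DECIMAL(18, 2)
WITH SCHEMABINDING  -- needed for the function to be considered deterministic
AS
BEGIN
    -- No TRY...CATCH in UDFs: reject invalid input by returning NULL.
    IF @Amount IS NULL OR @TaxRate IS NULL OR @TaxRate < 0
        RETURN NULL;

    RETURN @Amount * @TaxRate / 100.0;
END;
GO  -- batch separator understood by SSMS and sqlcmd

-- Returns 1 when SQL Server considers the function deterministic.
SELECT OBJECTPROPERTY(OBJECT_ID(N'dbo.CalculateTaxSafe'), 'IsDeterministic') AS IsDeterministic;
```

Returning NULL for bad input is only one possible convention; callers then decide how to treat NULL results.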

Common Built-in Functions

Here are some widely used built-in functions:

  • COUNT(*)
    Counts the number of rows in a result set.
  • SUM(column)
    Calculates the sum of values in a numeric column.
  • AVG(column)
    Computes the average of values in a numeric column.
  • GETDATE()
    Returns the current system date and time.
  • CONVERT(data_type, expression)
    Converts an expression from one data type to another.
  • ISNULL(check_expression, replacement_value)
    Replaces NULL values with a specified replacement value.
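The functions above can be combined in a short, self-contained sketch. The inline VALUES list stands in for a real table and deliberately includes a NULL to show how aggregates and ISNULL() treat it; the third argument to CONVERT() is its optional style code.

```sql
-- Aggregates over made-up sample data that includes one NULL.
SELECT
    COUNT(*)                 AS NumRows,         -- 4: counts every row, NULL included
    COUNT(t.Amount)          AS NonNullAmounts,  -- 3: NULLs are skipped
    SUM(t.Amount)            AS TotalAmount,     -- 60.00: NULLs are ignored
    AVG(t.Amount)            AS AverageAmount,   -- average of the 3 non-NULL values
    AVG(ISNULL(t.Amount, 0)) AS AverageWithZero  -- NULL treated as 0, so 4 values
FROM (VALUES (10.00), (20.00), (30.00), (NULL)) AS t(Amount);

-- Current date formatted as yyyy-mm-dd using style 120.
SELECT CONVERT(VARCHAR(10), GETDATE(), 120) AS TodayIso;
```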