What are UDFs?
User-Defined Functions (UDFs) in Azure Cosmos DB let you write custom JavaScript functions and call them from your database queries. This lets you apply data transformations, custom calculations, or business logic inside the query itself, reducing the need for client-side processing.
UDFs are written in JavaScript and run in the Cosmos DB server-side environment. In database systems generally, user-defined functions can be scalar (returning a single value) or table-valued (returning a set of rows); Cosmos DB currently supports only scalar JavaScript UDFs.
Key Concepts
- Server-Side Execution: UDFs run on the Cosmos DB server as part of query execution, so the computation needs no extra round trip to the client.
- JavaScript: UDFs are written using standard JavaScript syntax.
- Scalar Functions: Currently, only scalar UDFs are supported, meaning they return a single value.
- Integration with SQL API: UDFs are invoked within your SQL API queries using the udf. prefix, for example udf.calculateCircleArea(p.radius).
- Registering UDFs: UDFs must be registered with your Cosmos DB collection before they can be used.
Creating and Registering UDFs
You can create and manage UDFs using the Azure portal, Azure CLI, Azure PowerShell, or the Cosmos DB SDKs.
Example: Creating a Scalar UDF
Let's create a simple UDF that calculates the area of a circle given its radius.
function calculateCircleArea(radius) {
    return Math.PI * radius * radius;
}
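Because a UDF is ordinary JavaScript, one way to sanity-check it before registration is to run it locally, for example with Node.js. The sketch below does that; the sample radii are made up for illustration and are not part of any real dataset.

```javascript
// Same function as above, run locally before registering it.
function calculateCircleArea(radius) {
    return Math.PI * radius * radius;
}

// Hypothetical sample radii, chosen only for illustration.
const sampleRadii = [1, 2, 10];

for (const r of sampleRadii) {
    console.log(`radius=${r} area=${calculateCircleArea(r).toFixed(2)}`);
}
// prints:
// radius=1 area=3.14
// radius=2 area=12.57
// radius=10 area=314.16
```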
To register this UDF in Azure Cosmos DB (using the SQL API), you would typically submit a JSON document with the following structure:
{
    "id": "calculateCircleArea",
    "body": "function calculateCircleArea(radius) { return Math.PI * radius * radius; }"
}
This can be done programmatically using the SDKs or in the Azure portal through Data Explorer.
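As a rough illustration of the programmatic route, the sketch below builds the id and body fields of a registration document from a plain function and hands them to a container object. It assumes that `container` is a Container instance from the @azure/cosmos JavaScript SDK, whose `scripts.userDefinedFunctions.create` method accepts such a definition; `buildUdfDefinition` and `registerUdf` are hypothetical helper names introduced here.

```javascript
// Sketch only: `container` is assumed to be a Container object from the
// @azure/cosmos package. The helper names below are hypothetical.

function calculateCircleArea(radius) {
    return Math.PI * radius * radius;
}

// Turns a named function into an { id, body } definition,
// using the function's name as the id and its source text as the body.
function buildUdfDefinition(fn) {
    if (typeof fn !== "function" || !fn.name) {
        throw new TypeError("expected a named function");
    }
    return { id: fn.name, body: fn.toString() };
}

// Registers the UDF on the given container and returns the SDK's promise.
async function registerUdf(container, fn) {
    const definition = buildUdfDefinition(fn);
    return container.scripts.userDefinedFunctions.create(definition);
}
```

Deriving the body from `fn.toString()` keeps a single copy of the function in source control, so the same code can be tested locally and then registered unchanged.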
Using UDFs in Queries
Once registered, you can call your UDF within SQL queries. The syntax is udf.<functionName>(argument1, argument2, ...).
Example: Querying with the UDF
Assuming you have a collection named products with documents containing a radius field, you can query for the area of circles using the calculateCircleArea UDF:
SELECT
    p.id,
    p.radius,
    udf.calculateCircleArea(p.radius) AS area
FROM
    products p
WHERE
    udf.calculateCircleArea(p.radius) > 100
Note the udf. prefix before the function name in the query. The prefix is required: it distinguishes UDFs from built-in functions.
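When the threshold comes from application code, a parameterized query avoids building SQL by string concatenation. The sketch below returns a query specification in the `{ query, parameters }` shape used by the @azure/cosmos JavaScript SDK; `buildAreaQuery` and the `@minArea` parameter name are hypothetical, and the usage comment assumes a `container` object from that SDK.

```javascript
// Sketch: a parameterized version of the query above.
// buildAreaQuery and @minArea are illustrative names, not part of any API.
function buildAreaQuery(minArea) {
    if (typeof minArea !== "number" || Number.isNaN(minArea)) {
        throw new TypeError("minArea must be a number");
    }
    return {
        query:
            "SELECT p.id, p.radius, udf.calculateCircleArea(p.radius) AS area " +
            "FROM products p " +
            "WHERE udf.calculateCircleArea(p.radius) > @minArea",
        parameters: [{ name: "@minArea", value: minArea }],
    };
}

// Usage (assumes `container` is an @azure/cosmos Container):
// const { resources } = await container.items.query(buildAreaQuery(100)).fetchAll();
```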
Best Practices and Considerations
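- Performance: A UDF used in a WHERE clause cannot use the index, so the query engine must load each candidate document and run the function against it. Combine UDF predicates with indexable filters where possible, and keep the UDF logic simple.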
- Error Handling: Implement robust error handling within your UDFs to prevent unexpected query failures.
- Idempotency: Keep UDFs free of side effects and hidden state so that the same input always produces the same output, which matters if a query is retried.
- Security: Be mindful of the code you deploy as UDFs, as it runs server-side.
- Alternatives: Consider if stored procedures or client-side logic might be more appropriate for certain scenarios.
- Data Types: Ensure data types passed to and returned from UDFs are handled correctly.
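The error-handling and data-type points above can be illustrated with a defensive variant of the earlier UDF. This is a minimal sketch: the name `calculateCircleAreaSafe` is hypothetical, and returning `undefined` for bad input (rather than throwing) is a design choice made here on the assumption that an exception thrown inside a UDF would fail the whole query.

```javascript
// Hypothetical defensive variant of calculateCircleArea.
// Returns undefined for missing, non-numeric, non-finite, or negative input
// instead of throwing or returning NaN.
function calculateCircleAreaSafe(radius) {
    if (typeof radius !== "number" || !isFinite(radius) || radius < 0) {
        return undefined;
    }
    return Math.PI * radius * radius;
}
```

Documents with a missing or malformed radius field then yield no area value rather than an error, which may or may not be the behavior you want; rejecting bad data at write time is an alternative.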
Limitations
- Currently, only scalar JavaScript UDFs are supported.
- UDFs cannot perform operations that modify data.
- UDFs have resource limits (e.g., execution time, memory) to prevent runaway queries.
- Access to external resources or network calls is not permitted from UDFs.