Azure Functions Blob Trigger (Python)
This document details how to use the blob storage trigger for Azure Functions with Python. The blob trigger allows your function to run automatically in response to changes in a blob container.
When to use the blob trigger
The blob trigger is ideal for scenarios where you need to process blobs as they are created or updated in Azure Blob Storage. Common use cases include:
- Image resizing or thumbnail generation.
- Data validation and transformation.
- Processing uploaded files (e.g., CSV, JSON).
- Orchestrating workflows based on file arrival.
Creating a Blob Trigger Function
To create a blob trigger function in Python, you define a function whose trigger parameter is typed as func.InputStream. You specify the storage connection and the blob container path in the function's binding configuration.
function.json Configuration
The bindings for your Azure Function are typically defined in a function.json file. For a blob trigger, it looks like this:

```json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "myblob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "samples-workitems/{name}",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
```

Let's break down the properties:
- `name`: The name of the variable that will represent the blob content in your Python code.
- `type`: Must be `blobTrigger` for a blob trigger.
- `direction`: Must be `in` for a trigger.
- `path`: Specifies the blob container and a pattern for matching blob names. In this example, `samples-workitems/{name}` means the function will trigger for any blob in the `samples-workitems` container, and the blob name will be available in the `name` variable.
- `connection`: The name of an App Setting that contains the Azure Storage connection string. `AzureWebJobsStorage` is a common default.
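Note that `connection` names an App Setting rather than holding a raw connection string. For local development, App Settings typically live in local.settings.json; a minimal sketch, assuming the Azurite storage emulator is running locally (`UseDevelopmentStorage=true` is the emulator shortcut), might look like this:

```json
{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AzureWebJobsStorage": "UseDevelopmentStorage=true"
  }
}
```

In Azure, the same setting is configured on the Function App itself and would hold the real storage account connection string.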
Python Code (__init__.py)
Your Python function code will receive the blob data as a stream or a string, depending on how you configure it. Here's a basic example:

```python
import logging
import azure.functions as func

def main(myblob: func.InputStream):
    logging.info(f"Python blob trigger function processed blob\n"
                 f"Name: {myblob.name}\n"
                 f"URI: {myblob.uri}\n"
                 f"Size: {myblob.length} Bytes")

    # Read the blob content
    blob_content = myblob.read().decode('utf-8')
    logging.info(f"Blob content: {blob_content[:200]}...")  # Log first 200 chars
```

In this code:
- `myblob: func.InputStream` defines the input parameter, typed as an `InputStream` from the `azure.functions` library.
- `myblob.name` provides the name of the blob.
- `myblob.uri` provides the URI to access the blob.
- `myblob.length` gives the size of the blob in bytes.
- `myblob.read().decode('utf-8')` reads the entire blob content and decodes it as a UTF-8 string. For large blobs, consider processing in chunks.
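Once decoded, the content can be handed to ordinary Python parsing. As a sketch of the CSV use case mentioned earlier, the helper below (a hypothetical name, with simulated bytes standing in for `myblob.read()`) turns an uploaded CSV blob into row dictionaries:

```python
import csv
import io

def parse_csv_blob(blob_bytes: bytes) -> list[dict]:
    """Decode raw CSV blob bytes and parse them into row dictionaries."""
    text = blob_bytes.decode('utf-8')
    return list(csv.DictReader(io.StringIO(text)))

# Simulated blob content; inside the function this would be myblob.read()
rows = parse_csv_blob(b"id,name\n1,alpha\n2,beta\n")
print(rows[0]["name"])  # alpha
```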
Working with Blob Data
The func.InputStream object is a file-like binary stream:
- `read()`: Reads the entire blob content into memory as bytes.
- `read(size)`: Reads at most `size` bytes per call, which lets you process large files in chunks instead of loading them into memory whole.
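Because the trigger parameter behaves like a standard binary stream, chunked processing can be written against any file-like object. A minimal sketch, where the 64 KiB chunk size is an arbitrary choice and `io.BytesIO` stands in for the real blob stream:

```python
import io

CHUNK_SIZE = 64 * 1024  # arbitrary; tune for your workload

def count_bytes_in_chunks(stream, chunk_size=CHUNK_SIZE):
    """Consume a binary stream chunk by chunk, never holding the
    whole blob in memory at once."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes signals end of stream
            break
        total += len(chunk)  # replace with real per-chunk processing
    return total

# Simulated blob; inside the function you would pass myblob directly.
print(count_bytes_in_chunks(io.BytesIO(b"x" * 200_000)))  # 200000
```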
Blob Trigger Path Patterns
You can use binding expressions in the path property to make your trigger more flexible:
- `{name}`: Captures the blob name and makes it available as a variable.
- `%setting%`: Resolves to the value of an App Setting at startup. This is how you parameterize the container portion of the path, which cannot contain a runtime wildcard.
- `{rand-guid}`: Resolves to a new GUID; this expression is useful in output binding paths rather than in trigger paths.
For example, to trigger only for .csv files in a specific container:

```json
"path": "input-csv/{name}.csv"
```

With this pattern, `{name}` captures the blob name without the `.csv` extension.
To trigger for blobs in a container named by an App Setting, wrap the setting name in percent signs:

```json
"path": "%inputcontainer%/processed/{name}"
```

In your Python code, the full path of the triggering blob, container included, is available on the stream itself:

```python
import logging
import azure.functions as func

def main(myblob: func.InputStream):
    # myblob.name is "<container>/<blob path>", e.g. "input/processed/report.csv"
    container, _, blob_name = myblob.name.partition("/")
    logging.info(f"Blob '{blob_name}' in container '{container}' processed.")
    # ... rest of your logic
```
            
            
Error Handling and Retries
Azure Functions provides built-in retry mechanisms. If a blob-triggered execution fails, the runtime retries it, up to five times by default; if every attempt fails, a message describing the blob is written to the webjobs-blobtrigger-poison queue. A retry means the function is invoked again with the same blob, so ensure your function is idempotent to handle retries gracefully.
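One way to make the handler idempotent is to key each execution on a stable identity for the blob and skip work already done. The sketch below is illustrative only: it uses blob name plus content length as a crude identity (an ETag would be more robust), and an in-memory set stands in for the durable store (e.g., a table or marker blob) a real function would need:

```python
processed = set()  # stand-in for a durable ledger of handled blobs

def process_blob(name: str, content: bytes) -> bool:
    """Process a blob exactly once; return False if it was already handled."""
    key = (name, len(content))  # crude identity; an ETag is more robust
    if key in processed:
        return False            # a retry delivered the same blob: do nothing
    # ... real processing would happen here ...
    processed.add(key)
    return True

print(process_blob("samples-workitems/report.csv", b"a,b\n1,2\n"))  # True: first delivery
print(process_blob("samples-workitems/report.csv", b"a,b\n1,2\n"))  # False: retried delivery
```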