Azure Data Factory Documentation
Azure Data Factory (ADF) is a cloud‑based data integration service that allows you to create data‑driven workflows for orchestrating and automating data movement and transformation at scale.
Key Concepts
- Pipeline: A logical grouping of activities that together perform a task.
- Activity: A processing step in a pipeline, such as copying data or running a Spark job.
- Linked Service: The connection information (such as a connection string or credentials) that Data Factory needs to connect to an external data store or compute resource (see the sketch after this list).
- Dataset: A named view of data that points to or references the data an activity consumes or produces within a data store.
- Trigger: Defines when a pipeline execution should be started.
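For example, a linked service to Azure Blob Storage can be defined in JSON. The following is a minimal sketch; the name AzureBlobLinkedService and the placeholder connection string are assumptions for illustration, not values from a real factory:

{
  "name": "AzureBlobLinkedService",
  "properties": {
    "type": "AzureBlobStorage",
    "typeProperties": {
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
    }
  }
}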
Sample Pipeline JSON
{
  "name": "CopyPipeline",
  "properties": {
    "activities": [
      {
        "name": "CopyFromBlobToSQL",
        "type": "Copy",
        "inputs": [
          { "referenceName": "BlobDataset", "type": "DatasetReference" }
        ],
        "outputs": [
          { "referenceName": "SqlDataset", "type": "DatasetReference" }
        ],
        "typeProperties": {
          "source": { "type": "BlobSource" },
          "sink": { "type": "SqlSink" }
        }
      }
    ]
  }
}
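The pipeline above references BlobDataset by name. A matching dataset definition might look like the sketch below, which binds to the linked service from the earlier example; the AzureBlob dataset type is chosen here because it pairs with the pipeline's BlobSource, and the folder path, file name, and text format are assumptions for illustration:

{
  "name": "BlobDataset",
  "properties": {
    "type": "AzureBlob",
    "linkedServiceName": {
      "referenceName": "AzureBlobLinkedService",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "folderPath": "input",
      "fileName": "data.csv",
      "format": { "type": "TextFormat" }
    }
  }
}

A corresponding SqlDataset pointing at the target table would be defined the same way against an Azure SQL Database linked service.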
Quickstart: Create a Pipeline
- Open the Azure portal and navigate to your Data Factory instance.
- Select Launch studio (labeled Author & Monitor in older portal versions) to open the ADF authoring UI.
- In the Author tab, click + and choose Pipeline, then give the pipeline a name.
- Add a Copy data activity and configure its source and sink datasets, each backed by a linked service.
- Validate and publish the pipeline, then run it on demand or attach a trigger (see the sketch below).
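A published pipeline runs either on demand or from a trigger. The sketch below shows a schedule trigger that would run CopyPipeline once a day; the trigger name, start time, and time zone are assumptions for illustration:

{
  "name": "DailyTrigger",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "Day",
        "interval": 1,
        "startTime": "2024-01-01T00:00:00Z",
        "timeZone": "UTC"
      }
    },
    "pipelines": [
      {
        "pipelineReference": {
          "referenceName": "CopyPipeline",
          "type": "PipelineReference"
        }
      }
    ]
  }
}

Note that after publishing, a trigger must also be started before it begins firing.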