Introduction to Azure Cognitive Search
Azure Cognitive Search is a cloud search service that provides developers with a rich set of APIs and tools for adding a powerful search experience over private, heterogeneous content in web, mobile, and enterprise applications. Its infrastructure is fully managed, abstracting away the complexities of provisioning, managing, and scaling search infrastructure.
With Azure Cognitive Search, you can:
- Index diverse data: Connect to various data sources like Azure Blob Storage, Azure Cosmos DB, Azure SQL Database, and more.
- Enrich data: Use built-in cognitive skills (image analysis, OCR, natural language processing) to extract insights and add them to your search index.
- Query data: Implement sophisticated search queries with features like full-text search, faceting, filtering, fuzzy matching, and suggestions.
- Customize relevance: Tune search results using scoring profiles and analyzers.
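- Scale on demand: Add replicas to handle higher query volumes and partitions to hold more data; you configure capacity yourself rather than relying on automatic scaling.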
Creating a Search Service
Before you can use Azure Cognitive Search, you need to create a search service instance in the Azure portal.
- Sign in to the Azure portal.
- Click Create a resource.
- Search for "Azure Cognitive Search" and select it.
- Click Create.
- Fill in the required details: Subscription, Resource group, URL, Pricing tier, etc.
- Click Review + create, then Create.
Indexing Data
Indexing is the process of loading data into Azure Cognitive Search. You define an index schema, then either push documents to the index through the API or create an indexer that pulls them from a supported data source.
Index Schema Definition
An index schema defines the fields, data types, and attributes of the documents in your search index. This is often done using JSON.
{
  "name": "my-search-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    { "name": "title", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false },
    { "name": "content", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": false, "facetable": false },
    { "name": "category", "type": "Edm.String", "searchable": true, "filterable": true, "sortable": true, "facetable": true },
    { "name": "lastUpdated", "type": "Edm.DateTimeOffset", "filterable": true, "sortable": true, "facetable": true }
  ]
}
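The same schema can be assembled programmatically before it is sent to the service. The following is a minimal sketch in Python, not an official SDK usage: `make_field` and `build_index` are hypothetical helpers, and the only rule they enforce is that an index has exactly one key field of type `Edm.String`.

```python
import json

def make_field(name, edm_type, **attrs):
    """Return one field definition; attrs are index attributes such as searchable=True."""
    field = {"name": name, "type": edm_type}
    field.update(attrs)
    return field

def build_index(name, fields):
    """Assemble an index definition and check that exactly one Edm.String field is the key."""
    key_fields = [f for f in fields if f.get("key")]
    if len(key_fields) != 1:
        raise ValueError(f"expected exactly one key field, found {len(key_fields)}")
    if key_fields[0]["type"] != "Edm.String":
        raise ValueError("the key field must be of type Edm.String")
    return {"name": name, "fields": fields}

index = build_index("my-search-index", [
    make_field("id", "Edm.String", key=True),
    make_field("title", "Edm.String", searchable=True, sortable=True),
    make_field("category", "Edm.String", searchable=True, filterable=True, facetable=True),
    make_field("lastUpdated", "Edm.DateTimeOffset", filterable=True, sortable=True),
])

# Serialize to the JSON body expected by the Create Index request.
index_json = json.dumps(index, indent=2)
```

Validating the key field locally gives an early, readable error instead of a rejected request.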
Indexers
Indexers automate the process of pulling data from supported data sources into your search index. You can create indexers using the Azure portal, REST API, or SDKs.
{
  "name": "my-blob-indexer",
  "dataSourceName": "my-blob-data-source",
  "targetIndexName": "my-search-index",
  "schedule": { "interval": "PT1H" },
  "parameters": {
    "batchSize": 500,
    "configuration": {
      "dataToExtract": "contentAndMetadata"
    }
  }
}
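The schedule interval is an ISO 8601 duration (`PT1H` means every hour). As a small illustration, the sketch below derives that string from a Python `timedelta` and builds an indexer definition around it. The helper names `iso8601_interval` and `build_indexer` are assumptions made for this example.

```python
from datetime import timedelta

def iso8601_interval(delta):
    """Format a positive timedelta as an ISO 8601 duration such as PT1H or PT1H30M."""
    total = int(delta.total_seconds())
    if total <= 0:
        raise ValueError("interval must be positive")
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    return "P" + date_part + ("T" + time_part if time_part else "")

def build_indexer(name, data_source, index, every):
    """Assemble an indexer definition that runs on a fixed schedule."""
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "schedule": {"interval": iso8601_interval(every)},
    }

indexer = build_indexer(
    "my-blob-indexer", "my-blob-data-source", "my-search-index", timedelta(hours=1)
)
```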
Querying Data
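Once your data is indexed, you can search it using the Azure Cognitive Search REST API or SDKs. Queries use the simple query syntax by default; setting `queryType=full` enables the full Lucene syntax, including fuzzy, proximity, and regular-expression search.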
Simple Search
A basic search query for documents containing "azure" and "search":
GET /indexes/my-search-index/docs?api-version=2023-11-01&search=azure+search&searchMode=all
Filtered Search
Search for documents containing "cloud" within the "technology" category (URL-encode the filter expression when sending the request):
GET /indexes/my-search-index/docs?api-version=2023-11-01&search=cloud&$filter=category eq 'technology'
Faceting
Retrieve search results along with counts for each category:
GET /indexes/my-search-index/docs?api-version=2023-11-01&search=*&facet=category,count:10
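Filter and facet expressions contain spaces, quotes, commas, and colons, so they need URL encoding before they go on the wire. The sketch below builds the query string for the three query styles shown in this section with Python's standard library. `build_query_path` is a hypothetical helper, and it assumes the REST parameter names `search`, `searchMode`, `$filter`, and `facet`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API_VERSION = "2023-11-01"

def build_query_path(index, search="*", filter_expr=None, facets=(), search_mode=None):
    """Return the path and URL-encoded query string for a GET search request."""
    params = [("api-version", API_VERSION), ("search", search)]
    if search_mode:
        params.append(("searchMode", search_mode))
    if filter_expr:
        params.append(("$filter", filter_expr))
    for facet in facets:
        # Each facet expression is sent as its own "facet" parameter.
        params.append(("facet", facet))
    # safe="$" keeps the OData parameter name readable; values are still encoded.
    return f"/indexes/{index}/docs?" + urlencode(params, safe="$")

path = build_query_path(
    "my-search-index",
    search="cloud",
    filter_expr="category eq 'technology'",
    facets=["category,count:10"],
)

# Decoding the query string recovers the original expressions unchanged.
decoded = parse_qs(urlsplit(path).query)
```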
Scoring Profiles
Scoring profiles allow you to influence the relevance of search results. You can boost or demote documents based on certain criteria.
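Example of a scoring profile that boosts recently updated documents and documents whose category matches a tag, such as "premium", supplied at query time through the `scoringParameter` option (the 30-day window and the `boostCategories` parameter name are illustrative):
{
  "name": "priorityBoost",
  "functions": [
    {
      "type": "freshness",
      "fieldName": "lastUpdated",
      "boost": 2,
      "interpolation": "linear",
      "freshness": { "boostingDuration": "P30D" }
    },
    {
      "type": "tag",
      "fieldName": "category",
      "boost": 3,
      "tag": { "tagsParameter": "boostCategories" }
    }
  ],
  "functionAggregation": "sum"
}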
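A scoring profile only takes effect when a query names it. For tag functions, the tag values are passed at query time in the `scoringParameter` option, written as the parameter name, a hyphen, and a comma-separated value list. The sketch below builds such a query. It assumes the profile's tag function declares a parameter called `boostCategories`, a hypothetical name chosen for this example.

```python
from urllib.parse import urlencode

def scoring_query(index, search, profile, parameters):
    """Build a GET query path that applies a scoring profile.

    parameters maps a parameter name to a list of values; each entry is sent
    as its own scoringParameter option in the form name-value1,value2.
    """
    params = [
        ("api-version", "2023-11-01"),
        ("search", search),
        ("scoringProfile", profile),
    ]
    for name, values in parameters.items():
        params.append(("scoringParameter", f"{name}-{','.join(values)}"))
    # safe="," keeps the comma-separated value list readable in the URL.
    return f"/indexes/{index}/docs?" + urlencode(params, safe=",")

path = scoring_query(
    "my-search-index", "azure", "priorityBoost", {"boostCategories": ["premium"]}
)
```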
Analyzers
Analyzers determine how text is tokenized and processed for indexing and querying. Azure Cognitive Search supports built-in analyzers (e.g., `standard.lucene`, `whitespace`, `simple`) and custom analyzers.
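Analyzers are assigned per field in the index schema through the `analyzer` property, not chosen in the query; searchable fields without one use `standard.lucene`. The following query therefore runs against fields processed by the standard Lucene analyzer, and also applies a scoring profile (`scoring_profile_name` is a placeholder):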
GET /indexes/my-search-index/docs?api-version=2023-11-01&search=searching&searchMode=any&scoringProfile=scoring_profile_name
Synonym Maps
Synonym maps allow you to define terms that should be treated as equivalent during searches. For example, "cloud computing" and "SaaS" could be synonyms.
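Synonym maps are defined as a named resource whose rules are written in the Solr synonym format, one rule per line. The sketch below assembles such a definition in Python. The helper names, the map name `my-synonyms`, and the `aks` rule are assumptions made for illustration.

```python
import json

def equivalence_rule(terms):
    """Terms that should all match one another, e.g. 'cloud computing, SaaS'."""
    return ", ".join(terms)

def explicit_rule(sources, targets):
    """Rewrite any source term to the target terms, e.g. 'aks => Azure Kubernetes Service'."""
    return f"{', '.join(sources)} => {', '.join(targets)}"

def build_synonym_map(name, rules):
    """Assemble a synonym map body; rules are joined with newlines (Solr format)."""
    return {"name": name, "format": "solr", "synonyms": "\n".join(rules)}

synonym_map = build_synonym_map("my-synonyms", [
    equivalence_rule(["cloud computing", "SaaS"]),
    explicit_rule(["aks"], ["Azure Kubernetes Service"]),
])

# JSON body for the request that creates the synonym map.
synonym_map_json = json.dumps(synonym_map)
```

A field opts in to the map by listing its name in the field's `synonymMaps` property in the index schema.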
Suggestions & Autocomplete
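Enhance the user experience with autocomplete, which completes the partial term a user is typing, and suggestions, which return matching documents for the partial input. Both rely on a suggester defined in the index over one or more string fields, and both draw on indexed content. The following request uploads two sample documents through the push API so their text is available for matching: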
POST /indexes/my-search-index/docs/index?api-version=2023-11-01
{
  "value": [
    { "@search.action": "mergeOrUpload", "id": "doc1", "title": "Azure Functions Tutorial", "content": "Learn how to build serverless applications with Azure Functions." },
    { "@search.action": "mergeOrUpload", "id": "doc2", "title": "Getting Started with Azure Kubernetes Service", "content": "Deploy and manage containers using AKS." }
  ]
}
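Autocomplete and suggestions are requested from their own endpoints and need a suggester in the index definition. The sketch below shows one plausible shape for the suggester entry and the two request bodies. The suggester name `sg` and the choice of `title` as the source field are assumptions for this example.

```python
def suggester_definition(name, source_fields):
    """Entry for the 'suggesters' array of an index definition."""
    return {
        "name": name,
        "searchMode": "analyzingInfixMatching",
        "sourceFields": list(source_fields),
    }

def autocomplete_request(index, partial_text, suggester, mode="oneTerm"):
    """Path and body for completing the term the user is typing."""
    path = f"/indexes/{index}/docs/autocomplete?api-version=2023-11-01"
    body = {"search": partial_text, "suggesterName": suggester, "autocompleteMode": mode}
    return path, body

def suggest_request(index, partial_text, suggester, select=("title",), top=5):
    """Path and body for returning documents that match the partial input."""
    path = f"/indexes/{index}/docs/suggest?api-version=2023-11-01"
    body = {
        "search": partial_text,
        "suggesterName": suggester,
        "select": ",".join(select),
        "top": top,
    }
    return path, body

suggester = suggester_definition("sg", ["title"])
auto_path, auto_body = autocomplete_request("my-search-index", "azu", "sg")
sugg_path, sugg_body = suggest_request("my-search-index", "azu", "sg")
```

Both requests are sent with POST; with the sample documents above, a partial input such as "azu" would be matched against the `title` field.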
Geographic Search
Azure Cognitive Search supports geocoordinate data types and queries, allowing you to perform spatial searches like finding points within a radius or bounding box.
Define a `GeographyPoint` field in your index:
{ "name": "location", "type": "Edm.GeographyPoint", "filterable": true, "sortable": true, "facetable": false }
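Querying for points inside the bounding box spanned by two corner points (for a radius search, compare `geo.distance(location, geography'POINT(...)')` against a distance in kilometers instead):
GET /indexes/my-search-index/docs?api-version=2023-11-01&search=*&$filter=geo.intersects(location, geography'POLYGON((-122.1373 47.6351, -122.0000 47.6351, -122.0000 48.0000, -122.1373 48.0000, -122.1373 47.6351))')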
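Spatial filters are written with the OData functions `geo.distance` (radius search, distance in kilometers) and `geo.intersects` (point inside a polygon). The sketch below generates both filter strings. The helper names are hypothetical, and the 10 km radius in the usage lines is an arbitrary example value.

```python
def geo_distance_filter(field, lon, lat, max_km):
    """Filter for points within max_km kilometers of (lon, lat)."""
    return f"geo.distance({field}, geography'POINT({lon} {lat})') le {max_km}"

def bounding_box_filter(field, west, south, east, north):
    """Filter for points inside a box, expressed as a closed polygon.

    Vertices are listed counterclockwise (SW, SE, NE, NW) and the first
    point is repeated at the end to close the ring.
    """
    ring = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    points = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"geo.intersects({field}, geography'POLYGON(({points}))')"

radius_filter = geo_distance_filter("location", -122.1373, 47.6351, 10)
box_filter = bounding_box_filter("location", -122.1373, 47.6351, -122.0, 48.0)
```

Either string goes into the `$filter` query parameter and must be URL-encoded like any other filter expression.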
Security
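Secure your search service with API keys, which are passed in the `api-key` request header. Admin keys grant full read-write access to the service and its objects; query keys grant read-only access to the documents in an index and are the ones to give to client applications.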
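As an illustration of where the key goes, the sketch below prepares (but does not send) a request with Python's standard library. The service name `my-service` and the key placeholder are assumptions; the endpoint follows the `https://<service>.search.windows.net` pattern.

```python
import json
import urllib.request

def build_request(service, path, api_key, body=None):
    """Prepare a request that carries the API key in the api-key header."""
    url = f"https://{service}.search.windows.net{path}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)

# A read-only query needs only a query key; indexing requests need an admin key.
request = build_request(
    "my-service",
    "/indexes/my-search-index/docs?api-version=2023-11-01&search=*",
    "<query-key>",
)
# urllib.request.urlopen(request) would send it; omitted here to stay offline.
```

Keeping admin keys on the server side and shipping only query keys to clients limits what a leaked key can do.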