Overview
The Azure Speech SDK enables developers to integrate speech‑to‑text, text‑to‑speech, and speech translation capabilities directly into web applications using JavaScript.
Prerequisites
- Azure subscription with a Speech resource (key & region).
- Modern browser with Web Audio API support.
- Node.js 14+ if you plan to run the sample locally (optional).
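The browser requirement can be checked at runtime before any recognition is attempted. The following is a minimal sketch, not part of the SDK: the helper name checkAudioSupport and the shape of its return value are assumptions made for illustration. It tests only for the Web Audio API and for the getUserMedia microphone API, which browsers expose only on secure origins (HTTPS or localhost).

```javascript
// Sketch: report which browser features the microphone sample relies on.
// `scope` is normally `window`; passing it in keeps the helper easy to test.
// The helper name and return shape are illustrative, not part of the Speech SDK.
function checkAudioSupport(scope) {
  // Older WebKit browsers expose the Web Audio API under a prefixed name.
  const hasWebAudio =
    typeof (scope.AudioContext || scope.webkitAudioContext) === "function";

  // Microphone capture goes through navigator.mediaDevices.getUserMedia.
  const media = scope.navigator && scope.navigator.mediaDevices;
  const hasMicrophoneApi =
    Boolean(media) && typeof media.getUserMedia === "function";

  return {
    hasWebAudio,
    hasMicrophoneApi,
    supported: hasWebAudio && hasMicrophoneApi,
  };
}
```

In a page, calling checkAudioSupport(window) and disabling the start button when supported is false avoids a confusing failure later on.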
Installation
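Include the SDK either from the CDN with a script tag, which exposes a global SpeechSDK object, or install it from npm for use with a bundler.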
<script src="https://aka.ms/csspeech/jsbrowserpackageraw"></script>
npm install microsoft-cognitiveservices-speech-sdk
Quick Start – Speech to Text
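Below is a minimal example that captures audio from the microphone and displays the recognized text. It assumes the npm package and a page that contains a button with the id startBtn and an element with the id result. If you load the SDK from the CDN instead, omit the import line and use the global SpeechSDK object.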
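import * as SpeechSDK from "microsoft-cognitiveservices-speech-sdk";

// Replace the placeholders with your Speech resource key and region.
const speechConfig = SpeechSDK.SpeechConfig.fromSubscription("YOUR_SPEECH_KEY", "YOUR_REGION");
speechConfig.speechRecognitionLanguage = "en-US";

document.getElementById("startBtn").onclick = () => {
  // Create the recognizer per click so that closing it does not break later clicks.
  const audioConfig = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
  const recognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);
  recognizer.recognizeOnceAsync(
    result => {
      document.getElementById("result").textContent = result.text;
      recognizer.close();
    },
    error => {
      // The error callback receives a description of what went wrong.
      document.getElementById("result").textContent = "Recognition failed: " + error;
      recognizer.close();
    }
  );
};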
Once recognition completes, the recognized text appears in the result element.
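A recognition result does not always carry text: the attempt may match no speech or be canceled, for example because of an invalid key. The sketch below maps the result's reason to a message for the page. The helper name describeResult and the message strings are assumptions made for illustration. ResultReason and errorDetails are taken from the SDK as understood here, so verify them against the version you install.

```javascript
// Sketch: turn a recognition result into a message suitable for display.
// `sdk` is the Speech SDK namespace (the CDN global or the npm import).
// The helper name and the message wording are illustrative only.
function describeResult(sdk, result) {
  switch (result.reason) {
    case sdk.ResultReason.RecognizedSpeech:
      // Speech was recognized; show the transcript as-is.
      return result.text;
    case sdk.ResultReason.NoMatch:
      // Audio was received but nothing could be transcribed.
      return "No speech could be recognized.";
    case sdk.ResultReason.Canceled:
      // Typical causes include a wrong key or region, or a network failure.
      return "Recognition canceled: " + (result.errorDetails || "no details");
    default:
      return "Unexpected result reason: " + result.reason;
  }
}
```

Inside the Quick Start callback, assigning describeResult(SpeechSDK, result) to textContent instead of result.text would surface these cases to the user.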
Next Steps
- Explore Text‑to‑Speech and Speech Translation samples.
- Adjust pronunciation with SSML, or create a custom synthetic voice with Custom Voice.
- Integrate with Azure Functions for server‑side processing.
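As a starting point for the Text‑to‑Speech item, the following sketch speaks a string through the default speaker and wraps the SDK's callback API in a Promise. The helper name speak and its parameters are assumptions made for illustration. It expects a speechConfig built as in the Quick Start, and the SDK calls it uses should be verified against the installed version.

```javascript
// Sketch: synthesize `text` to the default speaker and resolve when done.
// `sdk` is the Speech SDK namespace; `speechConfig` is built as in the Quick Start.
// The helper name and the Promise wrapper are illustrative, not part of the SDK.
function speak(sdk, speechConfig, text) {
  const audioConfig = sdk.AudioConfig.fromDefaultSpeakerOutput();
  const synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);

  return new Promise((resolve, reject) => {
    synthesizer.speakTextAsync(
      text,
      result => {
        // Release the synthesizer whether or not synthesis succeeded.
        synthesizer.close();
        if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
          resolve(result);
        } else {
          reject(new Error(result.errorDetails || "Synthesis did not complete"));
        }
      },
      error => {
        synthesizer.close();
        reject(new Error(String(error)));
      }
    );
  });
}
```

A call such as speak(SpeechSDK, speechConfig, "Hello").catch(console.error) could be attached to a second button on the same page.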