JavaScript Speech SDK – API Reference
Comprehensive reference for the Azure Cognitive Services Speech SDK for JavaScript.
Installation
Install via npm or include directly from a CDN.
npm install microsoft-cognitiveservices-speech-sdk
Or use the CDN:
<script src="https://aka.ms/csspeech/jsbrowserpackageraw"></script>
Getting Started
Recognize a single utterance from the default microphone and print the recognized text.
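// Use the browser global when present; otherwise load the npm package (Node.js has no `window`).
const SpeechSDK = (typeof window !== "undefined" && window.SpeechSDK) || require('microsoft-cognitiveservices-speech-sdk');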
const speechConfig = SpeechSDK.SpeechConfig.fromSubscription("YOUR_SUBSCRIPTION_KEY", "YOUR_REGION");
speechConfig.speechRecognitionLanguage = "en-US";
const audioConfig = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
const recognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);
recognizer.recognizeOnceAsync(result => {
  console.log(`RECOGNIZED: Text=${result.text}`);
  recognizer.close();
});
Core Classes
SpeechConfig
Configuration for speech recognition and synthesis.
| Method / Property | Description |
|---|---|
| static fromSubscription(key, region) | Creates a config from an Azure subscription key and service region. |
| speechRecognitionLanguage | Locale for recognition, e.g., "en-US". |
| speechSynthesisVoiceName | Name of the voice used for synthesis. |
| speechSynthesisOutputFormat | Audio output format for synthesis (e.g., SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm). |
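The members above can be combined in a small factory function. This is a minimal sketch rather than anything the SDK ships: `createSpeechConfig` and its `options` shape are hypothetical, and the SDK module is passed in as `sdk` so the same function works with the npm package or the browser global. The voice name in the usage note is only an example value.

```javascript
// Hypothetical helper: builds a SpeechConfig from a key, region, and optional settings.
// `sdk` is the Speech SDK module: require('microsoft-cognitiveservices-speech-sdk')
// in Node.js, or window.SpeechSDK in the browser.
function createSpeechConfig(sdk, key, region, options = {}) {
  if (!key || !region) {
    throw new Error("Both a subscription key and a region are required.");
  }
  const config = sdk.SpeechConfig.fromSubscription(key, region);
  // Only set the properties the caller actually supplied.
  if (options.language) config.speechRecognitionLanguage = options.language;
  if (options.voice) config.speechSynthesisVoiceName = options.voice;
  return config;
}
```

A call might look like `createSpeechConfig(SpeechSDK, "YOUR_SUBSCRIPTION_KEY", "YOUR_REGION", { language: "en-US", voice: "en-US-JennyNeural" })`.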
AudioConfig
Defines audio input/output sources.
| Method | Description |
|---|---|
| static fromDefaultMicrophoneInput() | Uses the system's default microphone. |
| static fromDefaultSpeakerOutput() | Uses the default speaker for synthesis. |
| static fromWavFileInput(file) | Reads audio from a WAV file (a File object in the browser, or a Buffer in Node.js). |
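Choosing between the input sources above usually comes down to one branch. The sketch below is illustrative only: `createRecognizer` is a hypothetical name, `sdk` is the Speech SDK module passed in by the caller, and `wavFile` is assumed to be whatever `fromWavFileInput` accepts in your environment.

```javascript
// Hypothetical helper: picks an audio input and returns a recognizer for it.
// Pass `wavFile` to read from a WAV file; omit it to use the default microphone.
function createRecognizer(sdk, speechConfig, wavFile) {
  const audioConfig = wavFile
    ? sdk.AudioConfig.fromWavFileInput(wavFile)
    : sdk.AudioConfig.fromDefaultMicrophoneInput();
  return new sdk.SpeechRecognizer(speechConfig, audioConfig);
}
```

Keeping the source selection in one place means the rest of the code can treat file-based and live recognition the same way.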
SpeechRecognizer
Performs speech-to-text.
| Method | Description |
|---|---|
| recognizeOnceAsync(callback, errorCallback) | Recognizes a single utterance. |
| startContinuousRecognitionAsync() | Begins continuous recognition. |
| stopContinuousRecognitionAsync() | Stops continuous recognition. |
| close() | Releases resources. |
Events:
recognizing – interim results.
recognized – final result.
canceled – errors or cancellations.
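The three events above are typically assigned as handler properties before continuous recognition starts. The following is a minimal sketch under that assumption: `startTranscription` and its `handlers` object are hypothetical, and the handlers are assumed to receive `(sender, event)` with the recognized text on `event.result.text`.

```javascript
// Hypothetical helper: wires the three events and starts continuous recognition.
// Returns a function that stops recognition and then releases the recognizer.
function startTranscription(recognizer, handlers = {}) {
  recognizer.recognizing = (sender, event) => {
    if (handlers.onInterim) handlers.onInterim(event.result.text);
  };
  recognizer.recognized = (sender, event) => {
    // Skip empty results, e.g. when no speech could be matched.
    if (event.result.text && handlers.onFinal) handlers.onFinal(event.result.text);
  };
  recognizer.canceled = (sender, event) => {
    console.error(`Recognition canceled: ${event.errorDetails}`);
  };
  recognizer.startContinuousRecognitionAsync();
  return () => recognizer.stopContinuousRecognitionAsync(() => recognizer.close());
}
```

Calling the returned function, for example from a "Stop" button handler, ends the session and frees the underlying resources.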
SpeechSynthesizer
Synthesizes text to speech.
| Method | Description |
|---|---|
| speakTextAsync(text, callback, errorCallback) | Synthesizes plain text. |
| speakSsmlAsync(ssml, callback, errorCallback) | Synthesizes SSML. |
| close() | Releases resources. |
Events:
synthesisStarted
synthesisCompleted
SynthesisCanceled
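Because the synthesis methods above are callback-based, a thin Promise wrapper can make them easier to use with async/await. This is a sketch, not part of the SDK: `speak` is a hypothetical name, `sdk` is the Speech SDK module supplied by the caller, and output is assumed to go to the default speaker.

```javascript
// Hypothetical helper: speaks `text` through the default speaker and
// resolves with the synthesis result once the SDK reports completion.
function speak(sdk, speechConfig, text) {
  const audioConfig = sdk.AudioConfig.fromDefaultSpeakerOutput();
  const synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);
  return new Promise((resolve, reject) => {
    synthesizer.speakTextAsync(
      text,
      result => {
        synthesizer.close(); // release resources on success
        resolve(result);
      },
      error => {
        synthesizer.close(); // release resources on failure as well
        reject(error);
      }
    );
  });
}
```

Inside an async function this reads as `await speak(SpeechSDK, speechConfig, "Hello, world.")`. Closing the synthesizer in both branches avoids leaking it when synthesis fails.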
Error Handling
All async methods accept an optional errorCallback, which is invoked with a string describing the failure. Structured details (errorCode and errorDetails) are available on the arguments of the canceled event.
recognizer.recognizeOnceAsync(
  result => console.log(result.text),
  err => console.error(`Error: ${err}`)
);
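The callback pair above can also be folded into a Promise so that failures surface through try/catch. This is a minimal sketch with hypothetical names (`toError`, `recognizeOnce`). The exact shape of the value handed to the error callback is treated as an assumption here, so both a plain string and an object carrying errorDetails are handled.

```javascript
// Normalizes whatever the error callback receives into an Error instance.
// Assumption: the value is a string or an object with an errorDetails field.
function toError(err) {
  if (err instanceof Error) return err;
  if (typeof err === "string") return new Error(err);
  return new Error((err && err.errorDetails) || "Unknown speech error");
}

// Hypothetical wrapper: Promise-based version of recognizeOnceAsync.
function recognizeOnce(recognizer) {
  return new Promise((resolve, reject) => {
    recognizer.recognizeOnceAsync(resolve, err => reject(toError(err)));
  });
}
```

With this in place, an async function can write `const result = await recognizeOnce(recognizer)` inside a try block and log `error.message` in the catch block.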