Class SpeechRecognizer
java.lang.Object
  com.microsoft.cognitiveservices.speech.Recognizer
    com.microsoft.cognitiveservices.speech.SpeechRecognizer
All Implemented Interfaces: AutoCloseable
public final class SpeechRecognizer extends Recognizer
Performs speech recognition from microphone, file, or other audio input streams, and gets transcribed text as result. Note: close() must be called in order to release underlying resources held by the object.
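Because the class implements AutoCloseable, one way to make sure close() is always called is a try-with-resources block. The sketch below does not use the SDK types. Tracked is a hypothetical stand-in for any closeable object, and the names "input" and "recognizer" are only labels. It shows the mechanics: resources are closed automatically, in reverse order of declaration, once the body finishes.

```java
import java.util.ArrayList;
import java.util.List;

// Tracked is a hypothetical stand-in for a closeable object such as a
// recognizer; it is not part of the Speech SDK.
final class Tracked implements AutoCloseable {
    private final String name;
    private final List<String> log;

    Tracked(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override
    public void close() {
        // Record the call so the closing order can be inspected.
        log.add("close " + name);
    }
}

final class CloseOrderDemo {
    // Returns the order in which the body and the close() calls ran.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Tracked input = new Tracked("input", log);
             Tracked recognizer = new Tracked("recognizer", log)) {
            log.add("recognize");
        }
        return log;
    }
}
```

With a real SpeechRecognizer declared in the try header, the same construct would call its close() when the block exits, whether normally or through an exception.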
-
-
Field Summary
Fields (Modifier and Type, Field - Description)
EventHandlerImpl<SpeechRecognitionCanceledEventArgs> canceled - The event canceled signals that the recognition was canceled.
EventHandlerImpl<SpeechRecognitionEventArgs> recognized - The event recognized signals that a final recognition result is received.
EventHandlerImpl<SpeechRecognitionEventArgs> recognizing - The event recognizing signals that an intermediate recognition result is received.
-
Fields inherited from class com.microsoft.cognitiveservices.speech.Recognizer
disposed, eventCounter, sessionStarted, sessionStopped, speechEndDetected, speechStartDetected
-
-
Constructor Summary
Constructors (Constructor - Description)
SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig) - Initializes a new instance of Speech Recognizer for embedded speech recognition.
SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AudioConfig audioConfig) - Initializes a new instance of Speech Recognizer for embedded speech recognition.
SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig) - Initializes a new instance of Speech Recognizer for embedded speech recognition.
SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig) - Initializes a new instance of Speech Recognizer for embedded speech recognition.
SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig) - Initializes a new instance of Speech Recognizer for hybrid speech recognition.
SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AudioConfig audioConfig) - Initializes a new instance of Speech Recognizer for hybrid speech recognition.
SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig) - Initializes a new instance of Speech Recognizer for hybrid speech recognition.
SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig) - Initializes a new instance of Speech Recognizer for hybrid speech recognition.
SpeechRecognizer(SpeechConfig speechConfig) - Initializes a new instance of Speech Recognizer.
SpeechRecognizer(SpeechConfig speechConfig, AudioConfig audioConfig) - Initializes a new instance of Speech Recognizer.
SpeechRecognizer(SpeechConfig speechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig) - Initializes a new instance of Speech Recognizer.
SpeechRecognizer(SpeechConfig speechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig) - Initializes a new instance of Speech Recognizer.
SpeechRecognizer(SpeechConfig speechConfig, SourceLanguageConfig sourceLanguageConfig) - Initializes a new instance of Speech Recognizer.
SpeechRecognizer(SpeechConfig speechConfig, SourceLanguageConfig sourceLanguageConfig, AudioConfig audioConfig) - Initializes a new instance of Speech Recognizer.
SpeechRecognizer(SpeechConfig speechConfig, String sourceLanguage) - Initializes a new instance of Speech Recognizer.
SpeechRecognizer(SpeechConfig speechConfig, String sourceLanguage, AudioConfig audioConfig) - Initializes a new instance of Speech Recognizer.
-
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method - Description)
protected void dispose(boolean disposing) - This method performs cleanup of resources.
String getAuthorizationToken() - Gets the authorization token used to communicate with the service.
String getEndpointId() - Gets the endpoint ID of a customized speech model that is used for speech recognition.
OutputFormat getOutputFormat() - Gets the output format of recognition.
PropertyCollection getProperties() - The collection of properties and their values defined for this SpeechRecognizer.
String getSpeechRecognitionLanguage() - Gets the spoken language of recognition.
Future<SpeechRecognitionResult> recognizeOnceAsync() - Starts speech recognition, and returns after a single utterance is recognized.
void setAuthorizationToken(String token) - Sets the authorization token used to communicate with the service.
Future<Void> startContinuousRecognitionAsync() - Starts speech recognition on a continuous audio stream, until stopContinuousRecognitionAsync() is called.
Future<Void> startKeywordRecognitionAsync(KeywordRecognitionModel model) - Configures the recognizer with the given keyword model.
Future<Void> stopContinuousRecognitionAsync() - Stops a running recognition operation as soon as possible and immediately requests a result based on the input that has been processed so far.
Future<Void> stopKeywordRecognitionAsync() - Ends the keyword-initiated recognition.
-
Methods inherited from class com.microsoft.cognitiveservices.speech.Recognizer
canceledSetCallback, close, doAsyncRecognitionAction, getImpl, getPropertyBagFromRecognizerHandle, recognize, recognizedSetCallback, recognizingSetCallback, sessionStartedEventCallback, sessionStartedSetCallback, sessionStoppedEventCallback, sessionStoppedSetCallback, speechEndDetectedEventCallback, speechEndDetectedSetCallback, speechStartDetectedEventCallback, speechStartDetectedSetCallback, startContinuousRecognition, startKeywordRecognition, stopContinuousRecognition, stopKeywordRecognition
-
-
-
-
Field Detail
-
recognizing
public final EventHandlerImpl<SpeechRecognitionEventArgs> recognizing
The event recognizing signals that an intermediate recognition result is received.
-
recognized
public final EventHandlerImpl<SpeechRecognitionEventArgs> recognized
The event recognized signals that a final recognition result is received.
-
canceled
public final EventHandlerImpl<SpeechRecognitionCanceledEventArgs> canceled
The event canceled signals that the recognition was canceled.
-
-
Constructor Detail
-
SpeechRecognizer
public SpeechRecognizer(SpeechConfig speechConfig)
Initializes a new instance of Speech Recognizer.
Parameters:
speechConfig - speech configuration.
-
SpeechRecognizer
public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig)
Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.19.0.
Parameters:
embeddedSpeechConfig - embedded speech configuration.
-
SpeechRecognizer
public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig)
Initializes a new instance of Speech Recognizer for hybrid speech recognition.
Parameters:
hybridSpeechConfig - hybrid speech configuration.
-
SpeechRecognizer
public SpeechRecognizer(SpeechConfig speechConfig, AudioConfig audioConfig)
Initializes a new instance of Speech Recognizer.
Parameters:
speechConfig - speech configuration.
audioConfig - audio configuration.
-
SpeechRecognizer
public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AudioConfig audioConfig)
Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.19.0.
Parameters:
embeddedSpeechConfig - embedded speech configuration.
audioConfig - audio configuration.
-
SpeechRecognizer
public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AudioConfig audioConfig)
Initializes a new instance of Speech Recognizer for hybrid speech recognition.
Parameters:
hybridSpeechConfig - hybrid speech configuration.
audioConfig - audio configuration.
-
SpeechRecognizer
public SpeechRecognizer(SpeechConfig speechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)
Initializes a new instance of Speech Recognizer.
Parameters:
speechConfig - speech configuration.
autoDetectSourceLangConfig - the configuration for auto detecting source language.
-
SpeechRecognizer
public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)
Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.20.0.
Parameters:
embeddedSpeechConfig - embedded speech configuration.
autoDetectSourceLangConfig - configuration for auto detecting the source language.
-
SpeechRecognizer
public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)
Initializes a new instance of Speech Recognizer for hybrid speech recognition.
Parameters:
hybridSpeechConfig - hybrid speech configuration.
autoDetectSourceLangConfig - the configuration for auto detecting source language.
-
SpeechRecognizer
public SpeechRecognizer(SpeechConfig speechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig)
Initializes a new instance of Speech Recognizer.
Parameters:
speechConfig - speech configuration.
autoDetectSourceLangConfig - the configuration for auto detecting source language.
audioConfig - audio configuration.
-
SpeechRecognizer
public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig)
Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.20.0.
Parameters:
embeddedSpeechConfig - embedded speech configuration.
autoDetectSourceLangConfig - configuration for auto detecting the source language.
audioConfig - audio configuration.
-
SpeechRecognizer
public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig)
Initializes a new instance of Speech Recognizer for hybrid speech recognition.
Parameters:
hybridSpeechConfig - hybrid speech configuration.
autoDetectSourceLangConfig - the configuration for auto detecting source language.
audioConfig - audio configuration.
-
SpeechRecognizer
public SpeechRecognizer(SpeechConfig speechConfig, SourceLanguageConfig sourceLanguageConfig)
Initializes a new instance of Speech Recognizer.
Parameters:
speechConfig - speech configuration.
sourceLanguageConfig - the configuration for source language.
-
SpeechRecognizer
public SpeechRecognizer(SpeechConfig speechConfig, SourceLanguageConfig sourceLanguageConfig, AudioConfig audioConfig)
Initializes a new instance of Speech Recognizer.
Parameters:
speechConfig - speech configuration.
sourceLanguageConfig - the configuration for source language.
audioConfig - audio configuration.
-
SpeechRecognizer
public SpeechRecognizer(SpeechConfig speechConfig, String sourceLanguage)
Initializes a new instance of Speech Recognizer.
Parameters:
speechConfig - speech configuration.
sourceLanguage - the recognition source language.
-
SpeechRecognizer
public SpeechRecognizer(SpeechConfig speechConfig, String sourceLanguage, AudioConfig audioConfig)
Initializes a new instance of Speech Recognizer.
Parameters:
speechConfig - speech configuration.
sourceLanguage - the recognition source language.
audioConfig - audio configuration.
-
-
Method Detail
-
getEndpointId
public String getEndpointId()
Gets the endpoint ID of a customized speech model that is used for speech recognition.
Returns:
The endpoint ID of a customized speech model that is used for speech recognition.
-
setAuthorizationToken
public void setAuthorizationToken(String token)
Sets the authorization token used to communicate with the service. Note: The caller must ensure that the authorization token is valid. Before the token expires, the caller must refresh it by calling this setter with a new valid token. Otherwise, the recognizer will encounter errors during recognition.
Parameters:
token - Authorization token.
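The refresh requirement above can be met with a periodic task. The following is a minimal sketch under stated assumptions, not part of the SDK: TokenRefresher, fetchToken, and applyToken are hypothetical names, and the refresh period is left to the caller because this reference does not state a token lifetime. It applies one token immediately and then re-applies a fresh one at a fixed rate.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper: periodically fetches a token and hands it to a setter.
final class TokenRefresher implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "token-refresher");
                t.setDaemon(true); // do not keep the JVM alive just for refreshes
                return t;
            });

    TokenRefresher(Supplier<String> fetchToken, Consumer<String> applyToken,
                   long period, TimeUnit unit) {
        // Apply a token right away so the first request already has one.
        applyToken.accept(fetchToken.get());
        scheduler.scheduleAtFixedRate(() -> {
            try {
                applyToken.accept(fetchToken.get());
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so log and continue.
                System.err.println("Token refresh failed: " + e.getMessage());
            }
        }, period, period, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow(); // stop refreshing
    }
}
```

In use, applyToken would typically be a method reference to this setter, for example `recognizer::setAuthorizationToken`, and the period should be chosen shorter than the token's validity window.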
-
getAuthorizationToken
public String getAuthorizationToken()
Gets the authorization token used to communicate with the service.
Returns:
Authorization token.
-
getSpeechRecognitionLanguage
public String getSpeechRecognitionLanguage()
Gets the spoken language of recognition.
Returns:
The spoken language of recognition.
-
getOutputFormat
public OutputFormat getOutputFormat()
Gets the output format of recognition.
Returns:
The output format of recognition.
-
getProperties
public PropertyCollection getProperties()
The collection of properties and their values defined for this SpeechRecognizer.
Returns:
The collection of properties and their values defined for this SpeechRecognizer.
-
recognizeOnceAsync
public Future<SpeechRecognitionResult> recognizeOnceAsync()
Starts speech recognition and returns after a single utterance is recognized. The end of a single utterance is determined by listening for silence at the end or until a maximum of 15 seconds of audio is processed. The task returns the recognized text as its result. Note: Because recognizeOnceAsync() returns only a single utterance, it is suitable only for single-shot recognition such as a command or query. For long-running multi-utterance recognition, use startContinuousRecognitionAsync() instead.
Returns:
A task representing the recognition operation. The task returns a value of SpeechRecognitionResult.
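Since the method returns a standard java.util.concurrent.Future, callers usually block on it with a bounded wait. The sketch below is a generic helper rather than SDK code. FutureResults and await are hypothetical names, and it deliberately does not cancel the future on timeout because this reference does not say what cancellation does to the underlying recognition.

```java
import java.util.Optional;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for waiting on any Future with a time limit.
final class FutureResults {
    private FutureResults() {}

    // Returns the result if it arrives in time, or empty on timeout or interrupt.
    static <T> Optional<T> await(Future<T> future, long timeout, TimeUnit unit) {
        try {
            return Optional.ofNullable(future.get(timeout, unit));
        } catch (TimeoutException e) {
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return Optional.empty();
        } catch (ExecutionException e) {
            // Surface the underlying failure instead of the wrapper.
            throw new IllegalStateException("Asynchronous operation failed", e.getCause());
        }
    }
}
```

A call such as `FutureResults.await(recognizer.recognizeOnceAsync(), 30, TimeUnit.SECONDS)` would then yield an `Optional<SpeechRecognitionResult>`. The 30-second limit is an arbitrary illustration.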
-
startContinuousRecognitionAsync
public Future<Void> startContinuousRecognitionAsync()
Starts speech recognition on a continuous audio stream, until stopContinuousRecognitionAsync() is called. The caller must subscribe to events to receive recognition results.
Returns:
A task representing the asynchronous operation that starts the recognition.
-
stopContinuousRecognitionAsync
public Future<Void> stopContinuousRecognitionAsync()
Stops a running recognition operation as soon as possible and immediately requests a result based on the input that has been processed so far. This works for all recognition operations, not just continuous ones, and facilitates the use of push-to-talk or "finish now" buttons for manual audio endpointing.
Returns:
A future that will complete when input processing has been stopped. Result generation, if applicable for the input provided, may happen after this task completes and should be handled with the appropriate event.
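Because results may still arrive through events after the stop future completes, it helps to gather them in a thread-safe place and wait for an explicit end signal. The sketch below is one possible shape for that, using only the standard library. TranscriptCollector and its method names are hypothetical, and it assumes that the inherited sessionStopped event marks the end of result delivery, which this reference does not state.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical collector that event handlers can forward into.
final class TranscriptCollector {
    private final List<String> finals = new CopyOnWriteArrayList<>();
    private final CountDownLatch stopped = new CountDownLatch(1);

    // Forward the text of each final recognition result here.
    void onRecognized(String text) {
        if (text != null && !text.isEmpty()) {
            finals.add(text);
        }
    }

    // Forward the end-of-session signal here.
    void onSessionStopped() {
        stopped.countDown();
    }

    // Blocks until the session has stopped or the timeout elapses.
    boolean awaitStopped(long timeout, TimeUnit unit) {
        try {
            return stopped.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    // Joins all final results collected so far with single spaces.
    String transcript() {
        return String.join(" ", finals);
    }
}
```

Handlers registered on the recognized field and on the inherited sessionStopped field would call onRecognized and onSessionStopped. The registration call itself belongs to EventHandlerImpl and is not covered in this section. After stopContinuousRecognitionAsync() completes, calling awaitStopped before reading transcript() gives late results a chance to arrive.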
-
startKeywordRecognitionAsync
public Future<Void> startKeywordRecognitionAsync(KeywordRecognitionModel model)
Configures the recognizer with the given keyword model. After this method is called, the recognizer listens for the keyword to start the recognition. Call stopKeywordRecognitionAsync() to end the keyword-initiated recognition. The caller must subscribe to events to receive recognition results.
Parameters:
model - The keyword recognition model that specifies the keyword to be recognized.
Returns:
A task representing the asynchronous operation that starts the recognition.
-
stopKeywordRecognitionAsync
public Future<Void> stopKeywordRecognitionAsync()
Ends the keyword-initiated recognition.
Returns:
A task representing the asynchronous operation that stops the recognition.
-
dispose
protected void dispose(boolean disposing)
Description copied from class: Recognizer
This method performs cleanup of resources. The Boolean parameter disposing indicates whether the method is called from Dispose (if disposing is true) or from the finalizer (if disposing is false). Derived classes should override this method to dispose of resources if needed.
Overrides:
dispose in class Recognizer
Parameters:
disposing - Flag to request disposal.
-
-