Class SpeechRecognizer

  • All Implemented Interfaces:
    AutoCloseable

    public final class SpeechRecognizer
    extends Recognizer
    Performs speech recognition on audio from a microphone, file, or other input stream, and returns the transcribed text as the result. Note: close() must be called to release the underlying resources held by the object.
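    Because the class implements AutoCloseable, a try-with-resources statement is one way to guarantee the close() call. The sketch below uses a stand-in class, DemoRecognizer, which is hypothetical and is not part of the SDK; it models only the AutoCloseable behavior so that the pattern can be run without the SDK on the classpath.

```java
// Stand-in for illustration only: NOT the SDK class. It models just the one
// documented trait used here, namely that the recognizer is AutoCloseable.
class DemoRecognizer implements AutoCloseable {
    private boolean closed = false;

    String recognize() {
        if (closed) {
            throw new IllegalStateException("recognizer already closed");
        }
        return "hello";
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true; // a real recognizer would release its resources here
    }
}

class CloseExample {
    // Uses the recognizer inside try-with-resources and reports whether
    // close() had been called by the time the block exited.
    static boolean closedAfterUse() {
        DemoRecognizer recognizer = new DemoRecognizer();
        try (DemoRecognizer r = recognizer) {
            r.recognize();
        } // close() runs here, even if recognize() throws
        return recognizer.isClosed();
    }
}
```

    The same statement shape should apply to any AutoCloseable recognizer; only the construction line would change.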
    • Constructor Detail

      • SpeechRecognizer

        public SpeechRecognizer(SpeechConfig speechConfig)
        Initializes a new instance of Speech Recognizer.
        Parameters:
        speechConfig - speech configuration.
      • SpeechRecognizer

        public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig)
        Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.19.0.
        Parameters:
        embeddedSpeechConfig - embedded speech configuration.
      • SpeechRecognizer

        public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig)
        Initializes a new instance of Speech Recognizer for hybrid speech recognition.
        Parameters:
        hybridSpeechConfig - hybrid speech configuration.
      • SpeechRecognizer

        public SpeechRecognizer(SpeechConfig speechConfig,
                                AudioConfig audioConfig)
        Initializes a new instance of Speech Recognizer.
        Parameters:
        speechConfig - speech configuration.
        audioConfig - audio configuration.
      • SpeechRecognizer

        public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig,
                                AudioConfig audioConfig)
        Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.19.0.
        Parameters:
        embeddedSpeechConfig - embedded speech configuration.
        audioConfig - audio configuration.
      • SpeechRecognizer

        public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig,
                                AudioConfig audioConfig)
        Initializes a new instance of Speech Recognizer for hybrid speech recognition.
        Parameters:
        hybridSpeechConfig - hybrid speech configuration.
        audioConfig - audio configuration.
      • SpeechRecognizer

        public SpeechRecognizer(SpeechConfig speechConfig,
                                AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)
        Initializes a new instance of Speech Recognizer.
        Parameters:
        speechConfig - speech configuration.
        autoDetectSourceLangConfig - configuration for auto detecting the source language.
      • SpeechRecognizer

        public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig,
                                AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)
        Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.20.0.
        Parameters:
        embeddedSpeechConfig - embedded speech configuration.
        autoDetectSourceLangConfig - configuration for auto detecting the source language.
      • SpeechRecognizer

        public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig,
                                AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)
        Initializes a new instance of Speech Recognizer for hybrid speech recognition.
        Parameters:
        hybridSpeechConfig - hybrid speech configuration.
        autoDetectSourceLangConfig - configuration for auto detecting the source language.
      • SpeechRecognizer

        public SpeechRecognizer(SpeechConfig speechConfig,
                                AutoDetectSourceLanguageConfig autoDetectSourceLangConfig,
                                AudioConfig audioConfig)
        Initializes a new instance of Speech Recognizer.
        Parameters:
        speechConfig - speech configuration.
        autoDetectSourceLangConfig - configuration for auto detecting the source language.
        audioConfig - audio configuration.
      • SpeechRecognizer

        public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig,
                                AutoDetectSourceLanguageConfig autoDetectSourceLangConfig,
                                AudioConfig audioConfig)
        Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.20.0.
        Parameters:
        embeddedSpeechConfig - embedded speech configuration.
        autoDetectSourceLangConfig - configuration for auto detecting the source language.
        audioConfig - audio configuration.
      • SpeechRecognizer

        public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig,
                                AutoDetectSourceLanguageConfig autoDetectSourceLangConfig,
                                AudioConfig audioConfig)
        Initializes a new instance of Speech Recognizer for hybrid speech recognition.
        Parameters:
        hybridSpeechConfig - hybrid speech configuration.
        autoDetectSourceLangConfig - configuration for auto detecting the source language.
        audioConfig - audio configuration.
      • SpeechRecognizer

        public SpeechRecognizer(SpeechConfig speechConfig,
                                SourceLanguageConfig sourceLanguageConfig)
        Initializes a new instance of Speech Recognizer.
        Parameters:
        speechConfig - speech configuration.
        sourceLanguageConfig - configuration for the source language.
      • SpeechRecognizer

        public SpeechRecognizer(SpeechConfig speechConfig,
                                SourceLanguageConfig sourceLanguageConfig,
                                AudioConfig audioConfig)
        Initializes a new instance of Speech Recognizer.
        Parameters:
        speechConfig - speech configuration.
        sourceLanguageConfig - configuration for the source language.
        audioConfig - audio configuration.
      • SpeechRecognizer

        public SpeechRecognizer(SpeechConfig speechConfig,
                                String sourceLanguage)
        Initializes a new instance of Speech Recognizer.
        Parameters:
        speechConfig - speech configuration.
        sourceLanguage - the recognition source language.
      • SpeechRecognizer

        public SpeechRecognizer(SpeechConfig speechConfig,
                                String sourceLanguage,
                                AudioConfig audioConfig)
        Initializes a new instance of Speech Recognizer.
        Parameters:
        speechConfig - speech configuration.
        sourceLanguage - the recognition source language.
        audioConfig - audio configuration.
    • Method Detail

      • getEndpointId

        public String getEndpointId()
        Gets the endpoint ID of a customized speech model that is used for speech recognition.
        Returns:
        the endpoint ID of a customized speech model that is used for speech recognition.
      • setAuthorizationToken

        public void setAuthorizationToken(String token)
        Sets the authorization token used to communicate with the service. Note: The caller needs to ensure that the authorization token is valid. Before the authorization token expires, the caller needs to refresh it by calling this setter with a new valid token. Otherwise, the recognizer will encounter errors during recognition.
        Parameters:
        token - Authorization token.
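        The refresh requirement described for setAuthorizationToken can be met with a periodic task. The following is a minimal sketch under stated assumptions: TokenTarget is a hypothetical stand-in interface that copies only the setter signature documented above, and the token source is a caller-supplied Supplier whose implementation (for example, a call to a token service) is outside the scope of this page.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical stand-in: copies only the documented setter signature.
interface TokenTarget {
    void setAuthorizationToken(String token);
}

class TokenRefresher {
    // Fetches a fresh token and hands it to the target.
    static void refreshOnce(TokenTarget target, Supplier<String> tokenSource) {
        String token = tokenSource.get();
        if (token == null || token.isEmpty()) {
            throw new IllegalStateException("token source returned no token");
        }
        target.setAuthorizationToken(token);
    }

    // Repeats refreshOnce every periodMinutes. The period is the caller's
    // choice and should be shorter than the token lifetime. Note that
    // scheduleAtFixedRate cancels later runs if one run throws.
    static ScheduledFuture<?> schedule(ScheduledExecutorService executor,
                                       TokenTarget target,
                                       Supplier<String> tokenSource,
                                       long periodMinutes) {
        return executor.scheduleAtFixedRate(
                () -> refreshOnce(target, tokenSource),
                periodMinutes, periodMinutes, TimeUnit.MINUTES);
    }
}
```

        Since TokenTarget has a single method, a method reference to a recognizer's setAuthorizationToken could presumably be passed as the target; that wiring is an assumption and is not shown here.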
      • getAuthorizationToken

        public String getAuthorizationToken()
        Gets the authorization token used to communicate with the service.
        Returns:
        Authorization token.
      • getSpeechRecognitionLanguage

        public String getSpeechRecognitionLanguage()
        Gets the spoken language that is being recognized.
        Returns:
        The spoken language that is being recognized.
      • getOutputFormat

        public OutputFormat getOutputFormat()
        Gets the output format of recognition.
        Returns:
        The output format of recognition.
      • getProperties

        public PropertyCollection getProperties()
        Gets the collection of properties and their values defined for this SpeechRecognizer.
        Returns:
        The collection of properties and their values defined for this SpeechRecognizer.
      • recognizeOnceAsync

        public Future<SpeechRecognitionResult> recognizeOnceAsync()
        Starts speech recognition, and returns after a single utterance is recognized. The end of a single utterance is determined by listening for silence at the end or until a maximum of 15 seconds of audio is processed. The task returns the recognized text as its result. Note: Since recognizeOnceAsync() returns only a single utterance, it is suitable only for single-shot recognition such as a command or query. For long-running multi-utterance recognition, use startContinuousRecognitionAsync() instead.
        Returns:
        A task representing the recognition operation. The task returns a value of SpeechRecognitionResult.
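        Since recognizeOnceAsync() returns a java.util.concurrent.Future, the caller decides how long to block for the result. The helper below is a sketch, not SDK code: FutureResults and await are hypothetical names, and wrapping failures in IllegalStateException is one design choice among several. It works with any Future, so it can be exercised without the SDK.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureResults {
    // Blocks for at most timeoutSeconds and returns the future's value.
    // Checked exceptions are wrapped so that call sites stay short.
    static <T> T await(Future<T> future, long timeoutSeconds) {
        try {
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            throw new IllegalStateException(
                    "no result within " + timeoutSeconds + " s", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("operation failed", e.getCause());
        }
    }
}
```

        Given the 15-second audio limit described above, a timeout somewhat longer than that (30 seconds would be an arbitrary but plausible choice) leaves room for the result to arrive.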
      • startContinuousRecognitionAsync

        public Future<Void> startContinuousRecognitionAsync()
        Starts speech recognition on a continuous audio stream, which runs until stopContinuousRecognitionAsync() is called. The caller must subscribe to events to receive recognition results.
        Returns:
        A task representing the asynchronous operation that starts the recognition.
      • stopContinuousRecognitionAsync

        public Future<Void> stopContinuousRecognitionAsync()
        Stops a running recognition operation as soon as possible and immediately requests a result based on the input that has been processed so far. This works for all recognition operations, not just continuous ones, and facilitates the use of push-to-talk or "finish now" buttons for manual audio endpointing.
        Returns:
        A future that will complete when input processing has been stopped. Result generation, if applicable for the input provided, may happen after this task completes and should be handled with the appropriate event.
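        The start and stop calls are naturally paired, and a finally block is one way to make sure the stop request is always issued. The sketch below assumes a hypothetical stand-in interface, ContinuousRecognition, that copies only the two method signatures documented above; ContinuousSession is likewise a made-up helper name. Results are not returned by this helper, because, as noted above, they are delivered through events and may arrive after the stop future completes.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical stand-in: copies only the two documented method signatures.
interface ContinuousRecognition {
    Future<Void> startContinuousRecognitionAsync();
    Future<Void> stopContinuousRecognitionAsync();
}

class ContinuousSession {
    // Starts recognition, runs the caller's work while it is active,
    // and requests a stop afterwards even if that work throws.
    static void run(ContinuousRecognition recognizer, Runnable whileListening) {
        waitFor(recognizer.startContinuousRecognitionAsync());
        try {
            whileListening.run();
        } finally {
            waitFor(recognizer.stopContinuousRecognitionAsync());
        }
    }

    // Blocks until the given start or stop operation has completed.
    private static void waitFor(Future<Void> future) {
        try {
            future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("operation failed", e.getCause());
        }
    }
}
```

        To use this with a real recognizer, a small adapter implementing the stand-in interface would be needed; that adapter is left out because it depends on the SDK being available.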
      • startKeywordRecognitionAsync

        public Future<Void> startKeywordRecognitionAsync(KeywordRecognitionModel model)
        Configures the recognizer with the given keyword model. After this method is called, the recognizer listens for the keyword to start the recognition. Call stopKeywordRecognitionAsync() to end the keyword-initiated recognition. The caller must subscribe to events to receive recognition results.
        Parameters:
        model - The keyword recognition model that specifies the keyword to be recognized.
        Returns:
        A task representing the asynchronous operation that starts the recognition.
      • stopKeywordRecognitionAsync

        public Future<Void> stopKeywordRecognitionAsync()
        Ends the keyword-initiated recognition.
        Returns:
        A task representing the asynchronous operation that stops the recognition.
      • dispose

        protected void dispose(boolean disposing)
        Description copied from class: Recognizer
        This method performs cleanup of resources. The Boolean parameter disposing indicates whether the method is called from Dispose (if disposing is true) or from the finalizer (if disposing is false). Derived classes should override this method to dispose of resources if needed.
        Overrides:
        dispose in class Recognizer
        Parameters:
        disposing - Flag to request disposal.