In today’s technology world, keeping up with every new trend and tool is extremely hard, and that includes native browser APIs. One such API that has been making waves recently is the SpeechRecognition Web API. It lets developers integrate speech recognition into web applications, opening up possibilities for intuitive, hands-free user experiences.

I heard about it from one of my colleagues, and I thought I would share my findings to help you keep up with tech progress.

Understanding SpeechRecognition Web API

The SpeechRecognition Web API (the SpeechRecognition interface of the Web Speech API) is a browser-based technology that enables web applications to convert spoken words into text. It lets developers add voice commands, dictation, and other speech-driven features to their web projects. This not only enhances accessibility but also gives users a new way to interact with applications.

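What this actually means is that you can now transcribe audio from the microphone in real time, right in the browser, without having to bring your own speech model or AI tooling. Keep in mind that some browsers, Chrome for example, send the audio to a server-based recognition service, so it may not work offline.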

Key Features of SpeechRecognition Web API:

  1. Speech to Text Conversion: The primary function of the API is to transcribe spoken words into text, making it an invaluable tool for applications that require user input through speech.
  2. Real-time Recognition: The API supports real-time recognition, allowing applications to respond dynamically as users speak, fostering a more natural and engaging user experience.
  3. Language Specification: Developers can specify the language for recognition, ensuring accurate interpretation of speech in multilingual applications.
  4. Confidence Levels: The API provides confidence levels for each recognition result, allowing developers to gauge the accuracy of the transcription.
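To make the confidence feature more concrete, here is a minimal sketch of how an application might pick the most confident alternative from one recognition result and ignore guesses that are too uncertain. The helper name pickBestAlternative and the 0.6 threshold are assumptions made for illustration, not part of the API. The input only needs to look like a SpeechRecognitionResult: an array-like list of objects with transcript and confidence.

```javascript
// Hypothetical helper: choose the alternative with the highest confidence.
// `result` is array-like, as a SpeechRecognitionResult is.
function pickBestAlternative(result, minConfidence = 0.6) {
  let best = null;
  for (const alternative of Array.from(result)) {
    if (best === null || alternative.confidence > best.confidence) {
      best = alternative;
    }
  }
  // Return null when nothing was recognized or the best guess is too uncertain
  if (best === null || best.confidence < minConfidence) return null;
  return { transcript: best.transcript, confidence: best.confidence };
}

// A plain array stands in for a real result here
const sample = [
  { transcript: "turn on the lights", confidence: 0.91 },
  { transcript: "turn on the light", confidence: 0.72 },
];
console.log(pickBestAlternative(sample));
```

The right threshold depends on the application: a dictation field can accept almost anything, while a voice command that deletes data probably should not.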

Building a Vue.js Composable for SpeechRecognition

To harness the capabilities of the SpeechRecognition Web API in a Vue.js application, we can create a composable using the Composition API. Let’s delve into a simple yet powerful Vue.js composable for speech recognition.


import { ref } from 'vue'
export function useVueSpeech(){
    // Initialize SpeechRecognition API
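    // Prefer the unprefixed constructor and fall back to the WebKit-prefixed one
    const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    const recognition = new SpeechRecognition();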
    recognition.continuous = false;
    recognition.lang = "en-US";
    recognition.interimResults = false;
    recognition.maxAlternatives = 1;
    
    // Setup Vue refs
    const listening = ref(false);
    const transcript = ref("");
    const confidence = ref(0);

    // Create method to trigger the API
    const listen = () => {
        listening.value = true;
        recognition.start();
    }

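    // Handle API result
    recognition.onresult = (event) => {
        listening.value = false;
        // First result, first (and only) alternative
        const result = event.results[0][0];
        transcript.value = result.transcript;
        confidence.value = result.confidence;
    }

    // Reset the flag whenever recognition ends, even if nothing was recognized
    recognition.onend = () => {
        listening.value = false;
    }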

    return {
        listening,
        listen,
        transcript,
        confidence
    }
};

In the first part of this composable we initialize the SpeechRecognition API. Because browser support is still uneven, the constructor is exposed under the WebKit prefix in browsers such as Chrome, which is why the code refers to webkitSpeechRecognition.
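Since support varies between browsers, it can help to check for the constructor before using it. The following is a minimal sketch of that check. The helper name getSpeechRecognition and its scope parameter are assumptions made for illustration; only the two constructor names come from the browser.

```javascript
// Hypothetical helper: return the SpeechRecognition constructor, or null when unsupported.
// `scope` defaults to globalThis (window in a browser); passing it in keeps the lookup testable.
function getSpeechRecognition(scope = globalThis) {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition ?? null;
}

const Recognition = getSpeechRecognition();
if (Recognition === null) {
  console.warn("Speech recognition is not supported in this environment.");
} else {
  const recognition = new Recognition();
  recognition.lang = "en-US";
}
```

A composable could run this check once and expose the outcome, so the UI can hide the microphone button where the API is missing.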

SpeechRecognition has several settings, all listed in the MDN documentation. Here we set four of them: the language (lang), the maximum number of alternatives per result (maxAlternatives), whether recognition returns a single result or keeps returning results (continuous), and whether the API should report interim results (interimResults).
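If the composable should accept these settings from its caller, one option is to merge caller options over a set of defaults. This is only a sketch: the names DEFAULT_OPTIONS and configureRecognition are hypothetical, while the four property names are the ones used in the composable above.

```javascript
// Defaults mirror the values used in the composable above
const DEFAULT_OPTIONS = {
  lang: "en-US",
  continuous: false,     // return a single result, then stop
  interimResults: false, // only report final results
  maxAlternatives: 1,    // one alternative per result
};

// Hypothetical helper: apply defaults first, then whatever the caller overrides
function configureRecognition(recognition, options = {}) {
  return Object.assign(recognition, DEFAULT_OPTIONS, options);
}

// A plain object stands in for a real SpeechRecognition instance here
const configured = configureRecognition({}, { lang: "it-IT", interimResults: true });
console.log(configured);
```

In the browser you would pass the real recognition instance as the first argument instead of an empty object.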

In the second block we initialize the Vue refs that our composable will expose.

Next, we write a small method that starts the API. It is exposed to users of the composable and will probably be attached to a button click.
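One detail worth knowing when wiring this to a button: calling start() on a recognition object that is already running throws an InvalidStateError, so a double click can break things. Below is a minimal sketch of a guard against that. The function createListenControls and its return shape are assumptions for illustration; it only relies on the start(), stop(), and "end" event members of the recognition object.

```javascript
// Hypothetical wrapper: track whether recognition is running and ignore repeated starts
function createListenControls(recognition) {
  let active = false;

  // "end" fires whenever the service disconnects, with or without a result
  recognition.addEventListener("end", () => {
    active = false;
  });

  return {
    isActive: () => active,
    listen() {
      if (active) return false; // already listening, skip the second start()
      active = true;
      recognition.start();
      return true;
    },
    stop() {
      if (active) recognition.stop();
    },
  };
}
```

In the composable, the active flag would simply be the listening ref, and listen would be the method returned to the caller.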

Next up is the event handler. The API fires the result event, handled here through onresult, when it has finished transcribing our audio. The event carries a list of results, each holding one or more alternatives. Because continuous is false and maxAlternatives is 1, we get a single result with a single alternative, which is why the code reads event.results[0][0].

The result includes a transcription string and a confidence level of that result.
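If interimResults were switched on, the results list would contain entries that are not final yet, each marked with an isFinal flag. The following sketch shows one way to separate the two kinds of text. The helper name splitTranscript is hypothetical, and it reads index 0 of every result just as the composable above does.

```javascript
// Hypothetical helper: join final and interim text from event.results.
// Each entry is array-like (the alternatives) and carries an isFinal flag.
function splitTranscript(results) {
  let finalText = "";
  let interimText = "";
  for (const result of Array.from(results)) {
    const first = result[0]; // same alternative the composable reads
    if (!first) continue;
    if (result.isFinal) finalText += first.transcript;
    else interimText += first.transcript;
  }
  return { finalText, interimText };
}

// Plain arrays with an isFinal property stand in for real results here
const fakeResults = [
  Object.assign([{ transcript: "hello ", confidence: 0.9 }], { isFinal: true }),
  Object.assign([{ transcript: "wor", confidence: 0.4 }], { isFinal: false }),
];
console.log(splitTranscript(fakeResults));
```

A UI could show the final text normally and the interim text in a lighter color, replacing it as the recognizer settles on a final result.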

The above composable can be used with the following implementation:

<script setup>
import { useVueSpeech } from "./index.js";

// Destructure so the refs are top-level and get unwrapped in the template
const { listening, listen, transcript, confidence } = useVueSpeech()
</script>

<template>
  <h1>Vue Speech</h1>
  <div>Listening: {{ listening }}</div>
  <div>Confidence: {{ confidence }}</div>
  <button @click="listen()">Listen now</button>
  <div>
    <textarea rows="5" v-model="transcript"></textarea>
  </div>
</template>

A working version of the above code can be seen in the Vue playground following this link.

Before we conclude, note that the code above is shared only to help you learn about the API; it is not production-ready. If you need a composable for the SpeechRecognition API, you can use the one provided by VueUse: https://vueuse.org/core/useSpeechRecognition/

Conclusion

The SpeechRecognition Web API opens up a world of possibilities for creating innovative and user-friendly web applications. With the power of Vue.js and the Composition API, developers can easily integrate speech recognition features, enhancing the accessibility and usability of their projects. Keep coding and stay tuned!
