Introduction to the Web Speech API - Web Apps That Talk

The HTML5 specification includes voice recognition, which lets users interact with our website through their computer's microphone. When speech recognition is enabled on a form field, the browser gives the field an option to start recognition. How this option is presented varies by browser: for example, Google Chrome on the desktop displays a microphone icon inside the text field, while Android displays a microphone on the virtual keyboard.

With the development of new technologies, we seek to simplify human-computer interaction. An example of this is the development of Voice API for web (“Web Speech API”).

Initially proposed by Google, the API has already gained popularity and has even led to the creation of the “Speech API W3C Community Group”. Unfortunately, it is not yet standardized across browsers, so to use it you need a recent version of Google Chrome.

A specification for a voice web API was recently released by the W3C Speech API Community Group, with a call to produce a final group specification. The specification defines a JavaScript API that gives web developers the ability to add text-to-speech and speech recognition to web pages. Recognition can be used to enter data, take continuous dictation, and control computers.

The HTML Speech Incubator Group was originally formed in August 2010 with members from Microsoft, Google, Voxeo, AT&T, Mozilla and OpenReach. Both Google and Microsoft submitted proposals for the API specification. A final report followed in December 2011, where the proposal covers the JavaScript API and its limitations in HTML.

The diagram in that report gives the general outline the voice API would follow once appropriate consensus is reached. Two weeks after the report, Google came out with its proposal for a JavaScript voice API that supports 15 of the 17 use cases identified in the Incubator Group's final report, which are:

  • Voice search
  • Voice command interface
  • Domain-specific grammars contingent on earlier inputs
  • Continuous recognition of open dialog
  • Domain-specific grammars
  • Voice interfaces present when a visible UI is not needed
  • Voice activity detection
  • Hello world
  • Speech translation
  • Voice-enabled mail client
  • Dialog systems
  • Spoken driving directions
  • Multimodal interaction
  • Multimodal video

The two use cases left out, in order to keep the API to a minimal size, are:

  • Re-recognition
  • Temporal structure of the synthesis to give visual feedback

The Speech API Community Group was formed in April 2012 to continue working on this specification. It is headed by Glen Shires of Google, one of the editors of the draft speech API, and has five members plus representatives of W3C, Mozilla and OpenReach, among others. The voice API specification has been edited by Glen Shires and Hans Wennborg of Google.

At the time of writing, the API specification does not have the status of a W3C standard. Chrome is also the only browser that implements the Speech API. Other browsers are expected to follow suit and unify this technology, which today is fragmented because there is neither a guide nor a standard to follow.

The simplest implementation is to add voice recognition to a text field (an “input” of type text). You need to add the “x-webkit-speech” attribute to the text field. Because this is a vendor-prefixed, experimental attribute, I also suggest adding the unprefixed “speech” attribute, which is the candidate for standardization.

<input type="text" x-webkit-speech speech />

This will add a microphone at the end of the field which when clicked, will allow the user to dictate the value of the field.

But what if we want to use the voice for other activities? It is also possible to obtain a transcript of the user's speech through JavaScript, and recognition can be triggered by any action.

A voice recognition object is created in the following manner:

var recognition = new webkitSpeechRecognition();
// "new SpeechRecognition()" is proposed for the future

We can assign handlers to the following events on this object:

recognition.onstart = function () {…}
recognition.onresult = function (event) {…}
recognition.onerror = function (event) {…}
recognition.onend = function () {…}

In addition, we can specify the language to be recognized:

recognition.lang = "es-MX";

To get the result, read the value of event.results[0][0].transcript inside the onresult handler. To start voice recognition, call the start() function of the speech recognition object.
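As a rough sketch of how that value is laid out: event.results is a list of results, and each result holds one or more alternatives, each carrying a transcript. The helper name bestTranscript below is hypothetical (it is not part of the API), and the fake event in the usage lines only mimics the shape the browser would deliver.

```javascript
// Hypothetical helper: returns the transcript of the first alternative
// of the first result, or an empty string if nothing was recognized.
function bestTranscript(event) {
  var results = event && event.results;
  if (!results || results.length === 0 || results[0].length === 0) {
    return '';
  }
  return results[0][0].transcript;
}

// Usage with a hand-made object shaped like a recognition event:
var fakeEvent = { results: [[{ transcript: 'hello world', confidence: 0.9 }]] };
console.log(bestTranscript(fakeEvent)); // prints "hello world"
```

Guarding against an empty list keeps the handler from throwing if it is ever called without a usable result.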

For example, to insert the result into an element with id="test", do the following:

var recognition = new webkitSpeechRecognition();
recognition.lang = "en-US";
recognition.onstart = function () {};
recognition.onerror = function (event) {};
recognition.onend = function () {};
recognition.onresult = function (event) {
  // Show the first transcript of the first result inside the element
  document.getElementById('test').innerHTML = event.results[0][0].transcript;
};
recognition.start();

For the other events, you just have to add the actions you want inside their handlers.
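The specification also mentions continuous dictation. The following is only a sketch, assuming the continuous and interimResults properties, the resultIndex field and the isFinal flag behave as described in the draft specification. The helper name collectFinalText and the variable name dictation are hypothetical.

```javascript
// Hypothetical helper: joins the transcripts of all results marked final,
// starting at event.resultIndex (the first result that changed).
function collectFinalText(event) {
  var text = '';
  for (var i = event.resultIndex || 0; i < event.results.length; i++) {
    if (event.results[i].isFinal) {
      text += event.results[i][0].transcript;
    }
  }
  return text;
}

// Browser wiring (only runs where the prefixed constructor exists):
if (typeof webkitSpeechRecognition !== 'undefined') {
  var dictation = new webkitSpeechRecognition();
  dictation.continuous = true;      // keep listening after each phrase
  dictation.interimResults = true;  // also deliver results that may still change
  dictation.onresult = function (event) {
    console.log(collectFinalText(event));
  };
  // dictation.start(); // call this from a user action such as a click
}
```

Skipping results that are not yet final avoids showing text that the recognizer may still revise.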

The proposal also includes an API for converting text to speech, so the machine can “talk” to the user. Unfortunately, this functionality is not yet implemented in any browser except Chrome.
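A minimal sketch of the text-to-speech side, assuming the browser exposes the speechSynthesis object and the SpeechSynthesisUtterance constructor from the draft specification. The helper name speak and its env parameter are hypothetical; env exists only so the function can be tried outside a browser.

```javascript
// Hypothetical helper: speaks `text` if speech synthesis is available in `env`.
// `env` defaults to the global object (window in a browser).
function speak(text, lang, env) {
  env = env || globalThis;
  if (!env.speechSynthesis || !env.SpeechSynthesisUtterance) {
    return false; // synthesis is not supported here
  }
  var utterance = new env.SpeechSynthesisUtterance(text);
  utterance.lang = lang || 'en-US';
  env.speechSynthesis.speak(utterance);
  return true;
}

// Usage in a browser: speak('Hello, I can talk', 'en-US');
```

Returning false instead of throwing lets the page fall back to plain text when synthesis is missing.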

Enabling speech recognition in a form field

The speech attribute is the one proposed for HTML5 to enable speech recognition on a form field.

<input speech x-webkit-speech name="recognition" />

In this example we see a text field with voice recognition enabled. Currently only Google Chrome implements speech recognition, and other browsers do not support the speech attribute, so we also add the x-webkit-speech attribute alongside it. If your browser does not support recognition, a normal text field is displayed.

Getting the result with JavaScript

The onspeechchange event lets us know that the user has used voice recognition. For compatibility we use it together with onwebkitspeechchange:

<script type="text/javascript">
  function texto(input) {
    // Clear the field if the user rejects the recognized text
    if (!confirm('Did you say ' + input.value + '?')) input.value = '';
  }
</script>
<input speech x-webkit-speech name="number" onspeechchange="texto(this)" onwebkitspeechchange="texto(this)" />

In this example, the user is asked whether the recognized text is correct, using both onspeechchange and onwebkitspeechchange.

Recognizing text input field

We can recognize speech with the window.SpeechRecognition API object. Chrome currently exposes this API experimentally under another name, window.webkitSpeechRecognition. For security reasons, the user must authorize speech recognition before our website can use it.

<script type="text/javascript">
function obtenerTexto() {
    if (window.SpeechRecognition || window.webkitSpeechRecognition) {
        // Get the recognition object in a way that works across browsers
        var reconocimientoTexto = new (window.SpeechRecognition || window.webkitSpeechRecognition)();
        reconocimientoTexto.onresult = function (event) {
            // Show the recognized text
            alert(event.results[0][0].transcript);
        };
        // Start recognizing speech
        reconocimientoTexto.start();
    } else {
        alert('browser not supported');
    }
}
</script>
<input onclick="obtenerTexto()" type="button" value="listen" />

Here is another example, similar to the one above, but it uses a button and calls the speech-to-text API from JavaScript. The start function requests the user's permission and begins voice recognition. The onresult handler is called when all goes well and we have the recognized text of what the user said. You may also be interested in the onerror event, which fires, for example, when the user does not give permission for speech recognition.
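As a sketch of what an onerror handler might do, the function below maps the error code carried by the event to a short message. The helper name describeRecognitionError and the message texts are hypothetical. The codes themselves ('not-allowed', 'service-not-allowed', 'no-speech', 'audio-capture', 'network') are ones I understand the draft specification to define, so treat the list as an assumption to check against your browser.

```javascript
// Hypothetical helper: turns the `error` code of a recognition error event
// into a short, user-facing message.
function describeRecognitionError(event) {
  switch (event.error) {
    case 'not-allowed':
    case 'service-not-allowed':
      return 'Permission to use speech recognition was denied.';
    case 'no-speech':
      return 'No speech was detected.';
    case 'audio-capture':
      return 'No microphone could be used.';
    case 'network':
      return 'A network error interrupted recognition.';
    default:
      return 'Recognition failed: ' + event.error;
  }
}

// Usage with the object from the example above:
// reconocimientoTexto.onerror = function (event) {
//   alert(describeRecognitionError(event));
// };
```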

 

 
