JavaScript Text to Speech Using Synthesis API

Friendly user interfaces sometimes call for a machine-generated voice, and Text-to-Speech in JavaScript meets that need well. When a web page reads its text aloud, people can listen to the content without looking at a monitor or mobile screen.

The speech synthesis part of the Web Speech API is the simplest way to make a browser-based web page read text aloud. The approach depends on browser support. Fortunately, popular browsers such as Chrome, Edge, and Firefox support it.

The code here is not complicated, so you can follow it easily even if you are still a student. To help your learning, we provide a download link to a zip file containing all the source code for future use.

Estimated reading time: 5 minutes

EXPLORE THIS ARTICLE
TABLE OF CONTENTS

BONUS Source Code Download

1 The Basics

2 Text To Speech

FINAL Conclusion

TRY IT Quick Experience

BONUS
Source Code Download

We have released it under the MIT license, so feel free to use it in your own project or your school homework.

Download Guideline

Prepare an HTTP server such as XAMPP or WAMP in your Windows environment.
Download the zip file and unzip it into a folder that the HTTP server can access.

DOWNLOAD SOURCE

SECTION 1
The Basics

Let’s study an experimental technology called the speech synthesis API, which relies on two major interfaces, SpeechSynthesis and SpeechSynthesisUtterance, to achieve Text-to-Speech conversion in JavaScript. It works in Chrome, Edge, and Firefox.

Speech Synthesis API

The Web Speech API, as implemented by browsers, provides interfaces to read web page content aloud. Mozilla's MDN Web Docs describe it as follows.

Speech synthesis is accessed via the SpeechSynthesis interface, a text-to-speech component that allows programs to read out their text content (normally via the device’s default speech synthesiser.) Different voice types are represented by SpeechSynthesisVoice objects, and different parts of text that you want to be spoken are represented by SpeechSynthesisUtterance objects. You can get these spoken by passing them to the SpeechSynthesis.speak() method.
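The flow described in that definition can be sketched in a few lines. This is only a minimal illustration: `sayAloud` is a hypothetical helper name, and the synthesiser and utterance constructor are passed in as parameters so the function can also be exercised outside a browser.

```javascript
// Minimal sketch: wrap a string in an utterance and hand it to the synthesiser.
// `sayAloud` is a hypothetical helper, not part of the Web Speech API.
function sayAloud(text, synth, UtteranceClass) {
    // An utterance object represents the piece of text to be spoken.
    const utterance = new UtteranceClass(text);
    // speak() appends the utterance to the synthesiser's queue.
    synth.speak(utterance);
    return utterance;
}

// Typical browser usage, for example inside a click handler:
// sayAloud("Hello there", window.speechSynthesis, window.SpeechSynthesisUtterance);
```

Some browsers only produce sound after a user gesture, which is why the usage line is shown inside a click handler rather than at page load.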

Utterance

An utterance is the act of saying or expressing something aloud, so the Web Speech API defines the SpeechSynthesisUtterance object as a speech request. The speech request object has the following major properties.

lang: sets the language of the utterance.
pitch: sets how high or low the voice sounds.
rate: sets the speed of the speech.
text: holds the content to be spoken.

With the page running, press F12 to open the browser's developer tools and check the utterance properties in the console log.
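As a rough sketch of how these properties might be filled in safely, the helper below copies options onto an utterance and keeps rate and pitch inside the ranges given in the Web Speech API specification (0.1 to 10 for rate, 0 to 2 for pitch). The names `clamp` and `configureUtterance` and the fallback defaults are assumptions made for illustration.

```javascript
// Value ranges documented for SpeechSynthesisUtterance.
const RATE_RANGE = { min: 0.1, max: 10 };
const PITCH_RANGE = { min: 0, max: 2 };

// Convert a value to a number and keep it inside the range.
// Non-numeric input falls back to the given default.
function clamp(value, range, fallback) {
    const n = Number(value);
    if (!Number.isFinite(n)) return fallback;
    return Math.min(range.max, Math.max(range.min, n));
}

// Hypothetical helper: copy options onto an utterance-like object.
function configureUtterance(utterance, options) {
    utterance.text = String(options.text ?? "");
    utterance.lang = options.lang ?? "en-US"; // assumed default language
    utterance.rate = clamp(options.rate, RATE_RANGE, 1);
    utterance.pitch = clamp(options.pitch, PITCH_RANGE, 1);
    return utterance;
}

// In a browser: configureUtterance(new SpeechSynthesisUtterance(), { text: "Hi", rate: "0.8" });
```

Form controls return strings, so converting to a number before assigning keeps the logged utterance easy to read.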

Text to Speech

The JavaScript SpeechSynthesis object controls the speech service. It has methods to get information about the synthesis voices available on your device, such as their languages. The SpeechSynthesis object also starts, pauses, and cancels speech requests. The following methods briefly show how the SpeechSynthesis and SpeechSynthesisUtterance objects work together.

getVoices: returns the voices available on the current device, so you can choose the language you want.

[Figure: Supported Voices for Text to Speech in JavaScript]

speak: adds an utterance to the utterance queue. Speech requests are spoken in turn.
cancel: stops speech by removing all utterances from the utterance queue.
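Choosing a voice from the list returned by getVoices could look like the sketch below. The helper names `pickVoice` and `loadVoices` are hypothetical. The second helper assumes the browser fires the `voiceschanged` event on the synthesiser once its voices are loaded, which matters because the list can be empty on the first call in some browsers.

```javascript
// Pick a voice for a language tag such as "en-US".
// Falls back to any voice sharing the primary language ("en"), else null.
function pickVoice(voices, lang) {
    const wanted = lang.toLowerCase();
    const primary = wanted.split("-")[0];
    return (
        voices.find((v) => v.lang.toLowerCase() === wanted) ||
        voices.find((v) => v.lang.toLowerCase().split("-")[0] === primary) ||
        null
    );
}

// Resolve with the voice list, waiting for "voiceschanged" if it is still empty.
function loadVoices(synth) {
    return new Promise((resolve) => {
        const voices = synth.getVoices();
        if (voices.length > 0) {
            resolve(voices);
            return;
        }
        synth.addEventListener("voiceschanged", () => resolve(synth.getVoices()), { once: true });
    });
}

// In a browser:
// loadVoices(window.speechSynthesis).then((voices) => {
//     const utterance = new SpeechSynthesisUtterance("Hello");
//     const voice = pickVoice(voices, "en-US");
//     if (voice) utterance.voice = voice;
//     window.speechSynthesis.speak(utterance);
// });
```

When no voice matches, leaving `utterance.voice` unset lets the browser use its default voice for the utterance language.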

SECTION 2
Text To Speech

This section guides you through implementing the speech synthesis API in JavaScript. When the button is clicked, the text on the web page is read aloud. The language, rate, and pitch properties configure the speech.

Read Text Aloud

The web page can read text aloud much as a person does. To speed up the speech, increase the rate. Likewise, adjusting the pitch makes the tone higher or lower.

index.html

<div data-role="content">
    <div class="config-data">
        <div>Read Aloud:</div>
        <textarea class="ui-input-text" data-clear-btn="true" id="input"></textarea>
        <button id="speak" class="ui-btn ui-corner-all">SPEAK</button>
        <label for="rate">Rate</label>
        <input type="range" data-highlight="true" min="0.5" max="2" value="0.8" step="0.1" id="rate">
        <label for="pitch">Pitch</label>
        <input type="range" data-highlight="true" min="0" max="2" value="1.6" step="0.1" id="pitch">
    </div>
</div>

Implemented as Window Object

Remember to obtain the speech synthesis object from the Window object as below.

index.html

var speechSynthesis = window.speechSynthesis;
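Because the API is not available everywhere, it may be worth checking for it before use. The following is a minimal sketch under that assumption. `getSpeechSynthesis` is a hypothetical helper name, and the global object is passed in as a parameter so the check also runs outside a browser.

```javascript
// Return the speech synthesis object if the environment provides the API, else null.
function getSpeechSynthesis(globalObject) {
    const supported =
        globalObject != null &&
        "speechSynthesis" in globalObject &&
        "SpeechSynthesisUtterance" in globalObject;
    return supported ? globalObject.speechSynthesis : null;
}

// Use the real window in a browser; pass null elsewhere (for example in Node.js).
const synth = getSpeechSynthesis(typeof window !== "undefined" ? window : null);
if (synth === null) {
    console.log("Speech synthesis is not available in this environment.");
}
```

A page could use the null result to hide or disable the SPEAK button instead of failing on click.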

The Example

When the button is clicked, the speech content, language, speed, and tone define an utterance object. Next, all utterances in the utterance queue are removed. Finally, the new speech request is executed to produce voice output through the speakers. In addition, the example logs the voice information available on the device.

index.html

$(document).ready(function () {
    // `testing` is assumed to be a sample string defined elsewhere in the downloaded source.
    $("#input").val(testing).keyup();
    $("#speak").on("click", function () {
        // Prepare the speech synthesis utterance
        var speechSynthesisUtterance = new SpeechSynthesisUtterance();
        speechSynthesisUtterance.text = $("#input").val();
        speechSynthesisUtterance.lang = 'en-US';
        // Slider values are strings, so convert them to numbers
        speechSynthesisUtterance.rate = Number($('#rate').val());
        speechSynthesisUtterance.pitch = Number($('#pitch').val());
        console.log(speechSynthesisUtterance);
        // Clear the utterance queue, then start speaking the new utterance
        speechSynthesis.cancel();
        speechSynthesis.speak(speechSynthesisUtterance);
        // Log voice information (the list may be empty until the browser has loaded its voices)
        var voices = speechSynthesis.getVoices();
        console.log(voices);
    });
});
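If the page needs to know when the speech has finished, the cancel-then-speak step from the example could be wrapped in a Promise. This is a sketch rather than part of the downloaded source: `speakAndWait` is a hypothetical name, and it relies on the utterance's `onend` and `onerror` event handlers.

```javascript
// Clear the queue, speak one utterance, and settle when it ends or fails.
function speakAndWait(synth, utterance) {
    return new Promise((resolve, reject) => {
        // Fired when the utterance has been spoken completely.
        utterance.onend = () => resolve();
        // Fired on failure; the event carries an error code string.
        utterance.onerror = (event) => reject(new Error("Speech failed: " + event.error));
        // Same order as the example: empty the queue, then queue the new request.
        synth.cancel();
        synth.speak(utterance);
    });
}

// In a browser, inside the click handler:
// speakAndWait(window.speechSynthesis, new SpeechSynthesisUtterance("Done reading"))
//     .then(() => console.log("finished"))
//     .catch((err) => console.log(err.message));
```

This could be used, for instance, to re-enable the SPEAK button only after the current text has been read out.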

FINAL
Conclusion

Text-to-Speech conversion in JavaScript is easy to implement. It runs in the browser and works in most popular browsers, such as Chrome, Edge, and Firefox. For the opposite direction, our previous article about speech recognition in JavaScript discusses Speech-to-Text in detail.

Thank you for reading, and we have suggested more helpful articles here. If you want to share anything, please feel free to comment below. Good luck and happy coding!

Learning Tips

Let us suggest an excellent way to learn HTML and scripts here. Using Google Chrome's F12 Inspect or Inspect Element will help you study the code.

In Google Chrome, there are two ways to inspect a web page using the browser's built-in Chrome DevTools:

Right-click an element on the page or in a blank area, then select Inspect.
Go to the Chrome menu, then select More Tools > Developer Tools.

TRY IT
Quick Experience

That is all for this project. Here is a link that lets you try the program. Please leave your comments to help us improve it.

Try It Yourself

Click here to run the source code, so you can check whether it is worth your time before studying the downloaded code.
