Ivona text to speech - modifying pronunciation of words.

Ivona text to speech modifier

A way to change how the speech synthesiser pronounces specific words by supplying their phonetic notation. The synthetic speech is modified with SSML tags.

Speech2Go online reading program

A web service that converts any text to audio files. The files can be used later for any purpose, including commercial use and broadcast (for example phone exchange prompts, dubbing for YouTube videos, audio books, reading programs for kids). In a nutshell, it is a text-to-speech website.



Recordings made with the Speech2Go online service quite often include special terms, such as expressions in other languages or industry-specific terms, that are difficult for synthetic speech to read. Sometimes there are homographs that should be spoken differently depending on context (e.g. read, the present tense of a verb, and read, the past tense of the same verb).

When the text-to-speech generator can’t resolve such cases automatically, SSML tags can help. There are a number of SSML tags, and they are described on the web. Only one SSML tag is needed to fix the problem described above, though. Below you will find a description of this “magic” tag. When the pronunciation of the text-to-speech voice isn’t perfect, use the tag with phonetic notation instead.

SSML tag for modifying IVONA text-to-speech - examples

Let’s look for an English word that could be spoken better than the standard IVONA voice manages: voluntarism. Please try it at Speech2go.online. Not perfect? Let’s try to improve it:

(for the tags to be recognised, please use the My recordings/New recording function at: https://speech2go.online/cloud/step/1)

‘voluntarism’ <phoneme alphabet='ipa' ph='vä-lən-tə-ˌri-zəm'/>

Any better? Well, it depends on the voice you originally used.

Another example – would you like to make your English voice try some Polish? Here is an example:

‘chrząszcz’ <phoneme alphabet='ipa' ph='[xʃɔ̃w̃ʃʧ̑]'/>

The “ph” parameter is crucial here. This is how you point to the right pronunciation.
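If you have many words to fix, typing the tag by hand each time gets tedious. Below is a minimal Python sketch that builds the tag in the self-closing form shown above and swaps it in for whole words in a text. The names phoneme_tag, apply_pronunciations and lexicon are made up for this illustration, and the sketch assumes the tag stands in place of the word in the text you submit (standard SSML also allows wrapping the word inside the phoneme element).

```python
import re
from xml.sax.saxutils import escape


def phoneme_tag(ph, alphabet="ipa"):
    """Build a self-closing <phoneme> tag in the style used in this article."""
    # Escape characters that would break the single-quoted XML attribute.
    safe_ph = escape(ph, {"'": "&apos;"})
    return f"<phoneme alphabet='{alphabet}' ph='{safe_ph}'/>"


def apply_pronunciations(text, lexicon):
    """Replace each whole-word match from the lexicon with its phoneme tag."""
    if not lexicon:
        return text
    # One pass over the text, so already inserted tags are never re-scanned.
    words = "|".join(re.escape(word) for word in lexicon)
    pattern = re.compile(rf"\b({words})\b")
    return pattern.sub(lambda m: phoneme_tag(lexicon[m.group(1)]), text)


# Illustrative entry only: "fly" with the IPA transcription "flaɪ".
lexicon = {"fly": "flaɪ"}
print(apply_pronunciations("Birds fly south.", lexicon))
# prints: Birds <phoneme alphabet='ipa' ph='flaɪ'/> south.
```

The result can then be pasted into the New recording form like any other text.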

Where do I get the phonetic notation that is needed after the “ph” parameter?

Because the notation is difficult (virtually impossible) to type on a keyboard, you will want to copy and paste it, so you need to get it from the Internet. Look for IPA notation at Wiktionary: en.wiktionary.org/wiki/fly

or in another dictionary, such as this one: www.merriam-webster.com/dictionary/voluntarism
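Dictionaries usually print a transcription between slashes or square brackets, for example /flaɪ/. Whether a given voice tolerates those delimiters inside the ph value is not something this article can promise, so the small Python sketch below assumes you would rather remove them after copying. The function name clean_transcription is invented for this illustration.

```python
def clean_transcription(raw):
    """Strip whitespace and one pair of surrounding / / or [ ] delimiters."""
    text = raw.strip()
    for opener, closer in (("/", "/"), ("[", "]")):
        if len(text) >= 2 and text.startswith(opener) and text.endswith(closer):
            return text[1:-1].strip()
    # Nothing to remove: return the trimmed text unchanged.
    return text


print(clean_transcription("/flaɪ/"))
# prints: flaɪ
```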

If you look carefully, you will find some text-to-phonetic converters. Also, there are more notation systems than IPA: you could use X-SAMPA or some specialised systems for language groups. If you want to use X-SAMPA, you need to change the value of the ‘alphabet’ parameter from ‘ipa’ to ‘X-SAMPA’.

Using the speech modifier to improve or change the pronunciation of a word usually brings good results. Some expertise in finding the right phonetic notation helps but is not essential. You will learn how to do that soon.

This method is universal: it should work with many text-to-speech engines, not only with IVONA, and is the “right” way to make the modification. If done well, it will work with many text-to-speech voices.
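X-SAMPA writes the same sounds with plain keyboard characters, so an IPA transcription can be converted symbol by symbol. The Python sketch below covers only a handful of common symbols and refuses anything it does not know; the names IPA_TO_XSAMPA, ipa_to_xsampa and xsampa_tag are invented for this illustration, and the alphabet value is the one given in this article.

```python
from xml.sax.saxutils import escape

# Deliberately partial table: a few common IPA symbols and their X-SAMPA forms.
IPA_TO_XSAMPA = {
    "ɪ": "I", "ə": "@", "ɛ": "E", "æ": "{", "ɑ": "A", "ɔ": "O",
    "ʊ": "U", "ʌ": "V", "ʃ": "S", "ʒ": "Z", "θ": "T", "ð": "D",
    "ŋ": "N", "ː": ":", "ˈ": '"', "ˌ": "%",
}


def ipa_to_xsampa(ipa):
    """Convert an IPA string to X-SAMPA using the partial table above."""
    out = []
    for ch in ipa:
        if ch in IPA_TO_XSAMPA:
            out.append(IPA_TO_XSAMPA[ch])
        elif ch.isascii() and ch.islower():
            # Plain lowercase letters are written the same way in X-SAMPA.
            out.append(ch)
        else:
            raise ValueError(f"No X-SAMPA mapping in this sketch for {ch!r}")
    return "".join(out)


def xsampa_tag(ipa):
    """Build the article's self-closing tag with an X-SAMPA ph value."""
    ph = escape(ipa_to_xsampa(ipa), {"'": "&apos;"})
    return f"<phoneme alphabet='X-SAMPA' ph='{ph}'/>"


print(xsampa_tag("flaɪ"))
# prints: <phoneme alphabet='X-SAMPA' ph='flaI'/>
```

Raising an error for unknown symbols is a deliberate choice here: a silently wrong transcription is harder to spot than a missing table entry.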

The Speech2Go online service offers test recordings. You can test how a recording of 200 characters will perform, without losing credits, 3 times in 10 minutes. We advise trying the IVONA modifier this way!