TextToSpeech procedure

This documentation is valid for:
GeneXus 18 Help
GeneXus 17 Help
GeneXus 16 Help

Converts plain text into an audio stream.

Parameters

in:&Text :: Text, GeneXusAI
The plain text to be synthesized.
in:&locale :: Locale, GeneXusAI
The language locale of the output speech.
in:&voiceType :: VoiceType, GeneXusAI
The output voice type (female or male).
in:&provider :: Provider, GeneXusAI.Configuration
Provider settings.
inout:&Messages :: Messages, GeneXus.Common
A collection of warning and error messages returned by the task. Check in your code whether an error was returned. Refer to error codes and descriptions for more information.
out:&Audio :: Audio data type
The audio stream synthesized from the input text.

Configuration

The following table summarizes the configuration properties (access credentials) you must set in order to use this AI task.

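	PropertyKey
ProviderType	Id	Key	SecretKey
Alibaba	智能语音交互 (Intelligent Speech Interaction) app-key	用户AccessKey (user AccessKey)	用户AccessKey (user AccessKey)
Amazon	-	Polly	Polly
Baidu	百度语音 (Baidu Speech)	百度语音 (Baidu Speech)	百度语音 (Baidu Speech)
Google	-	Cloud Speech API	-
IBM	-	TextToSpeech API	-
Microsoft	-	Speech API	-
SAP	-	-	-
Tencent	音合成 (speech synthesis)	音合成 (speech synthesis)	-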

Sample

For the following plain text, the table below shows the audio synthesized by each provider and the time each one took to process it.

"The first question that comes up is: What is GeneXus? GeneXus is a tool that automatically generates software programs such as applications for the Web, and Smart Devices, always at the forefront of technological evolution."

Provider	Output (audio clip)	Processing time
Alibaba	TextToSpeech - Alibaba output	3325 ms
Amazon	TextToSpeech - Amazon output	1486 ms
Baidu	TextToSpeech - Baidu output	4634 ms
Google	TextToSpeech - Google output	1887 ms
IBM	TextToSpeech - IBM output	3205 ms
Microsoft	TextToSpeech - Microsoft output	3412 ms
SAP	N/A	N/A
Tencent	TextToSpeech - Tencent output	4614 ms

The &Text input parameter also accepts SSML inner nodes (excluding the <speak> root element) to control pronunciation, intonation, and so on. For example:

"<emphasis level='strong'>GeneXus</emphasis> is a tool that <prosody pitch='high'>automatically generates software programs</prosody> such as applications for the Web, and <sub alias='Smart Devices'>SD</sub>, with over <say-as interpret-as='cardinal'>30</say-as> years of experience."

This gives the following result:

[Audio clip: TextToSpeech - SSML input]

Notes

For SSML input, not every provider currently supports all of the elements and options of the W3C SSML specification. In particular, GeneXusAI does not allow the <voice> tag; set the voice through the &voiceType input parameter instead.
For Google Cloud AI, when you enable the Cloud Speech API to use this task, you must select the 'Standard Voices' option.
For the Microsoft Speech API, when you want to use another locale or voice type, you can use the FromString method to load them. For instance, if you want to use the "es-UY" locale with the female voice "ValentinaNeural", you can call &locale.FromString("es-UY") and &voiceType.FromString("ValentinaNeural") when setting the input parameters.
The Tencent AI and Baidu AI providers accept only Chinese, English, or mixed Chinese and English text input.
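
The SSML restrictions described on this page (no <speak> root and no <voice> tag in &Text) can be checked before the text is passed to the procedure. The following is a minimal sketch in Python and is not part of GeneXusAI: the function name check_ssml_fragment and the FORBIDDEN_TAGS set are hypothetical names introduced for illustration. It only verifies that the fragment is well-formed XML and that it avoids those two tags; it does not know which SSML elements each provider supports.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical helper, not a GeneXusAI API.
# Tags this page says must not appear in &Text: the <speak> root is excluded,
# and <voice> is replaced by the &voiceType input parameter.
FORBIDDEN_TAGS = {"speak", "voice"}


def check_ssml_fragment(fragment: str) -> List[str]:
    """Return a list of problems found in an SSML fragment (empty if none)."""
    try:
        # Wrap the fragment in a throwaway root so that mixed text and
        # inner nodes can be parsed as a single XML document.
        root = ET.fromstring("<root>" + fragment + "</root>")
    except ET.ParseError as exc:
        return ["not well-formed XML: " + str(exc)]

    problems = []
    for element in root.iter():
        if element is root:
            continue  # skip the throwaway wrapper itself
        if element.tag in FORBIDDEN_TAGS:
            problems.append("<" + element.tag + "> is not allowed in &Text")
    return problems


if __name__ == "__main__":
    sample = (
        "<emphasis level='strong'>GeneXus</emphasis> is a tool that "
        "<prosody pitch='high'>automatically generates software programs</prosody>."
    )
    print(check_ssml_fragment(sample))
```

A check of this kind catches unclosed elements and disallowed tags locally, before any provider call is made. Provider-specific rejections would still have to be read from the &Messages collection.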

Scope

Generators:	.NET, .NET Framework, Java, Apple, Android, Angular
Connectivity:	Online

Availability

This procedure is available as of GeneXus 16.

As of GeneXus 16 upgrade 1:
- Google Cloud AI is available.
As of GeneXus 16 upgrade 2:
- Amazon WS and Tencent AI are available.
As of GeneXus 16 upgrade 3:
- Baidu AI is available.
As of GeneXus 16 upgrade 4:
- Alibaba AI is available.