Speech | MyRobotLab

The Speech service generates speech from text input. It is a proxy service: several implementations can be used (MarySpeech, AcapelaSpeech and GoogleSpeech), and the Speech service offers a common interface to them.

AcapelaSpeech and GoogleSpeech need an internet connection the first time a phrase is spoken, since the mp3 file is generated by their online services. Once the file is retrieved it is cached, after which the connection is no longer needed for that utterance.

MarySpeech generates speech locally and never needs an internet connection. However, it might soon cache its files like the other services, for performance reasons.

Speech also uses an AudioFile service. The AudioFile service is responsible for playing the speech files. It is also responsible for the file cache.

The cache structure is described below:

audioFile --
---- {{Speech Implementation}}
--- {{ voice name }}
--- {{ mp3 file }}
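
As a rough illustration of the layout above, the sketch below builds a cache path from the three levels, assuming they nest as implementation, then voice, then file. The function name cache_path, the hashed file name, and the example voice are assumptions for illustration only; the naming scheme MyRobotLab actually uses may differ.

```python
import hashlib
from pathlib import Path


def cache_path(root, implementation, voice, text):
    """Return the path where the audio for `text` would be cached."""
    # Assumption: the file name is a hash of the spoken text.
    # MyRobotLab's real naming scheme may be different.
    file_name = hashlib.md5(text.encode("utf-8")).hexdigest() + ".mp3"
    # audioFile / {{ speech implementation }} / {{ voice name }} / {{ mp3 file }}
    return Path(root) / implementation / voice / file_name


# Example values only: "cmu-slt-hsmm" is used here as a sample voice name.
print(cache_path("audioFile", "MarySpeech", "cmu-slt-hsmm", "hello world"))
```

Because the path depends only on the implementation, the voice and the text, the same phrase always maps to the same file, which is what lets a cached utterance be reused later without a connection.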

References

http://www.acapela-group.com/text-to-speech-interactive-demo.html Another back-end
FreeTTS - local backend
Changing Voices
CMU Speech Software

Example code (from branch develop):

#file : Speech.py (github)

#########################################
# Speech.py
# description: starts the available speech services
# categories: general
# more info @: http://myrobotlab.org/service/Speech
#########################################
 
# start all speech services ( to test them )
 
# local :
marySpeech = runtime.start("marySpeech", "MarySpeech")
localSpeech = runtime.start("localSpeech", "LocalSpeech")
mimicSpeech = runtime.start("mimicSpeech", "MimicSpeech")
 
# online : an API key is needed
polly = runtime.start("polly", "Polly")
voiceRss = runtime.start("voiceRss", "VoiceRss")
indianTts = runtime.start("indianTts", "IndianTts")

prefetching

Do you think it would be a good idea for the service to start prefetching all of the phrases it needs when it starts up so that if it doesn't need to speak right away it can respond quicker later when it does?

Caching is almost always a good idea for time consuming tasks...

What delay are you speaking of though? Let me list them out.

The first time a phrase is uttered by the Speech service with the defaults in place, Speech goes out to Google to get the file. This takes by far the longest. The file is saved locally as an mp3 file under the AudioFile directory.

The other delay is starting the thread which plays the audio file. I've looked into caching this before too, but not successfully... yet.

I'm talking about the delay the first time a phrase is uttered. I'm thinking it would be good to have the service cache them all before they're needed, while a good internet connection is available. That way they are all cached, and you don't have to worry that only one or two have been used when the connection goes down later.

Does that make it clearer? Are the files named in a way that a new instance of MRL will understand what they are? For example, I put together a Python script in MRL and test it at home with my internet connected. Then I shut the whole system down and take it to a Maker exhibit where the internet is dead. Will MRL be able to use the cached files from the last time?

Yes KMC... we had a very shitty internet connection at Maker Faire Rome... Gael and I used the files "downloaded" at home!

Yes kmc, it is a good idea to cache all the voice commands at once when a good internet connection is available.
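
The prefetching idea discussed in this thread could be sketched roughly as follows. This is not MyRobotLab code: prefetch, fake_fetch and the hashed file names are hypothetical, and the network call to a speech back-end is replaced by a stand-in function so the example runs on its own.

```python
import hashlib
import tempfile
from pathlib import Path


def prefetch(phrases, cache_dir, fetch_audio):
    """Cache every phrase that is missing; return the phrases that were fetched."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for phrase in phrases:
        # Hypothetical naming: a hash of the phrase, so a later run
        # (even a new process) finds the same file again.
        name = hashlib.md5(phrase.encode("utf-8")).hexdigest() + ".mp3"
        target = cache_dir / name
        if target.exists():
            continue  # already cached, no connection needed
        # Fetch first, write second: a failed fetch leaves no partial file.
        target.write_bytes(fetch_audio(phrase))
        fetched.append(phrase)
    return fetched


def fake_fetch(phrase):
    # Stand-in for the online request to a speech service.
    return b"audio for " + phrase.encode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    phrases = ["hello", "goodbye"]
    print(prefetch(phrases, tmp, fake_fetch))  # ['hello', 'goodbye']
    print(prefetch(phrases, tmp, fake_fetch))  # []
```

The second call fetches nothing because every file is already on disk, which is the situation described above: phrases cached at home remain usable at an event with no connection.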