Speech


The Speech service generates speech from text input.  It is a proxy service, since several implementations can be used.  MarySpeech, AcapelaSpeech, and GoogleSpeech provide the actual speech generation, while the Speech service offers a common interface.
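The proxy idea can be illustrated with a minimal sketch. The classes below (FakeMarySpeech, FakeGoogleSpeech, SpeechProxy) and their speak method are hypothetical stand-ins written for this illustration; they are not the actual MyRobotLab classes or API.

```python
# Hypothetical sketch of a proxy with swappable backends.
# None of these classes are part of MyRobotLab.

class FakeMarySpeech(object):
    """Stand-in for a local speech backend."""

    def speak(self, text):
        return "MarySpeech says: " + text


class FakeGoogleSpeech(object):
    """Stand-in for an online speech backend."""

    def speak(self, text):
        return "GoogleSpeech says: " + text


class SpeechProxy(object):
    """Common interface that forwards calls to the chosen backend."""

    def __init__(self, backend):
        self.backend = backend

    def set_backend(self, backend):
        # Swap the implementation without changing the calling code.
        self.backend = backend

    def speak(self, text):
        return self.backend.speak(text)


proxy = SpeechProxy(FakeMarySpeech())
first = proxy.speak("hello")

proxy.set_backend(FakeGoogleSpeech())
second = proxy.speak("hello")
```

The calling code only ever talks to the proxy, so switching from one implementation to another does not require changing the script that uses it.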

AcapelaSpeech and GoogleSpeech initially need an internet connection, since the mp3 file is generated by their online services.  Once the file is retrieved it is cached, after which the connection is no longer needed for that utterance.

MarySpeech generates speech locally and never needs an internet connection.  However, it might soon cache its files like the other services, for performance reasons.

Speech also uses an AudioFile service.  The AudioFile service is responsible for playing the speech files.  It is also responsible for the file cache.

The cache structure is described below:

audioFile
  --- {{ Speech implementation }}
        --- {{ voice name }}
              --- {{ mp3 file }}
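As a rough illustration of the layout above, the following sketch builds a cache path from the implementation name, the voice name, and the text to speak. The function names, the "en" voice name, and the md5-based file name are assumptions made for this example; the naming scheme actually used by AudioFile is not described on this page.

```python
import hashlib
import os


def cache_path(root, implementation, voice, text):
    """Build root/implementation/voice/<name>.mp3 for an utterance.

    The md5-based file name is an assumption made for this sketch.
    """
    name = hashlib.md5(text.encode("utf-8")).hexdigest() + ".mp3"
    return os.path.join(root, implementation, voice, name)


def is_cached(root, implementation, voice, text):
    """Return True if the mp3 file for this utterance already exists."""
    return os.path.isfile(cache_path(root, implementation, voice, text))


# "en" is a placeholder voice name used only for this example.
path = cache_path("audioFile", "GoogleSpeech", "en", "hello world")
```

Because the path depends only on the implementation, the voice, and the text, the same utterance always maps to the same file, which is what allows a cached file to be reused later.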

Examples:


#file : service/Speech.py
#########################################
# Speech.py
# description: used as a general template
# categories: general
# more info @: http://myrobotlab.org/service/Speech
#########################################

# start the service
speech = Runtime.start('speech','Speech')

kmcgerald:

prefetching

Do you think it would be a good idea for the service to prefetch all of the phrases it needs when it starts up, so that even if it doesn't need to speak right away, it can respond more quickly later when it does?

GroG:

Caching is almost always a good idea for time-consuming tasks...

What delay are you speaking of, though?  Let me list them out.

The first delay occurs the first time a phrase is uttered by the Speech service.  With the defaults in place, Speech goes out to Google to get the file.  This takes by far the longest.  The file is saved locally as an mp3 file under the AudioFile directory.

The other delay is starting the thread which plays the audio file.  I've looked into caching this before too, but not successfully... yet...

kmcgerald:

I'm talking about the delay the first time a phrase is uttered.  I'm thinking it would be good to have the service cache them all before they're needed, if a good internet connection is available.  That way you can get them all cached and not worry about the connection going down later, when only one or two have been used.

Does that make it clearer?  Are the files named in a way that a new instance of MRL will understand what they are?  For example, I put together a Python script in MRL and test it at home with my internet connected.  Then I shut the whole system down and take it to a Maker exhibit, and their internet is dead.  Will MRL be able to use the cached files from the last time?

Alessandruino:

Yes KMC... we had a very shitty internet connection at Maker Faire Rome... Gael and I used the files "downloaded" at home!

hairygael:

Yes kmc, it is a good idea: cache all the voice commands at once when a good internet connection is available.
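
The prefetching idea discussed in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: prefetch, demo_path, and demo_fetch are hypothetical names, the demo file naming is invented, and the "download" is replaced by writing dummy bytes so that the sketch runs without a network connection. It does not use the real Speech or AudioFile API.

```python
import os
import tempfile


def prefetch(phrases, path_for, fetch):
    """Fetch every phrase that is not cached yet; return the ones fetched.

    path_for(text) returns the cache file path for a phrase, and
    fetch(text, path) writes the audio file. Both are hypothetical
    callables supplied by the caller.
    """
    fetched = []
    for text in phrases:
        path = path_for(text)
        if os.path.isfile(path):
            continue  # already cached, usable without a connection
        directory = os.path.dirname(path)
        if directory and not os.path.isdir(directory):
            os.makedirs(directory)
        fetch(text, path)
        fetched.append(text)
    return fetched


# Demo with a temporary cache directory and a fake download.
root = tempfile.mkdtemp()


def demo_path(text):
    # Invented naming scheme, for this demo only.
    name = text.replace(" ", "_") + ".mp3"
    return os.path.join(root, "DemoSpeech", "demo-voice", name)


def demo_fetch(text, path):
    # Placeholder for the network download; writes dummy bytes.
    with open(path, "wb") as handle:
        handle.write(b"fake mp3 data")


first_run = prefetch(["hello", "good bye"], demo_path, demo_fetch)
second_run = prefetch(["hello", "good bye"], demo_path, demo_fetch)
```

On the first run every phrase is fetched; on the second run nothing is fetched because the files already exist, which mirrors the case of preparing at home and then running somewhere without a connection.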