Any hello world examples for speech (and recognition)

Hi,

I've gotten the basics of tracking worked out thanks to help from the forum, so now on to my next quest.

I'm trying to work through speech (and speech recognition).  The examples I've found use methods that have since been replaced.  For example, NaturalReaderSpeech is now being used, and there is MarySpeech, and so on.

One suggestion was to look at the info page, but the info page, if it exists, doesn't always have a clear example of usage.  For example, the info page for MarySpeech (at least the one that I wind up at) doesn't have a clear example (or I'm too tired from my trip :)).

I found a more elaborate script for working with speech with InMoov, but I'm looking for a simple "hello world", walk-before-running example so I can get familiar with how things work.

Are there any current simple examples that someone could point me to?

 

Thanks,

Burt

 

kwatters's picture

speech synthesis and speech recognition.

Hi Burt,

So, to get a handle on how things work, it's good to have a common vocabulary.  It makes talking about things less ambiguous in the future.  Here's a quick intro to this space...

Services that implement SpeechRecognition can be used to convert speech to text.  Speech recognition uses a microphone to listen to what is being spoken.  The recognized text is published and routed to the services that are listening for it.  Examples of these are Sphinx and WebKitSpeechRecognition.

Services that implement SpeechSynthesis use the speakers to play back an audio representation of the text passed to them.  These provide text-to-speech (TTS) functionality.  Examples of this are MarySpeech, NaturalReader, Mimic, and a few others.

MarySpeech is a reasonable speech synthesis service.  You can use it to give your robot a voice and to speak back to you.  It has a few methods on it, like speak and speakBlocking.  You should be able to start an instance of MarySpeech and get MRL to play back something on the speakers, like "hello world":

maryspeech = Runtime.createAndStart("maryspeech", "MarySpeech")
maryspeech.speak("hello world")
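If you want to try both methods mentioned above, here is a minimal sketch.  It assumes, as the name suggests, that speakBlocking waits until the audio has finished while speak returns right away.  The helper name say_hello and the Runtime guard are just for illustration (not part of MRL); the guard only lets the script load outside of MRL without failing.

```python
def say_hello(mouth):
    # Assumption: speakBlocking returns only after the phrase has been played,
    # so the second phrase is not queued until the first one is done.
    mouth.speakBlocking("hello world")
    # speak hands the text to the service and returns immediately.
    mouth.speak("I can talk")

# Inside MRL's Python tab, Runtime is already available.
# Outside MRL it is not defined, so skip starting the service.
try:
    Runtime
except NameError:
    Runtime = None

if Runtime is not None:
    maryspeech = Runtime.createAndStart("maryspeech", "MarySpeech")
    say_hello(maryspeech)
```

Passing the service into a small function like this also makes it easy to swap in another SpeechSynthesis service later without touching the rest of the script.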

For speech recognition, you can try Sphinx, but I'd recommend using the WebGui with WebkitSpeechRecognition:

webgui = Runtime.start("webgui","WebGui")
webkitspeechrecognition = Runtime.start("webkitspeechrecognition","WebkitSpeechRecognition")
 
You'll need to use the Google Chrome web browser to use WebkitSpeechRecognition (it's the built-in speech recognition from Google).
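To see the "published and routed" part in action, a rough sketch of sending the recognized text to a Python callback might look like this.  Treat the details as assumptions to check against your build: that the recognized text is published on a method named "publishText", that the Python service is named "python", and that addListener takes (published method, listener service name, callback name).  The callback name heard is just an example.

```python
def heard(text):
    # Called with the recognized text (assuming the listener is wired as below).
    reply = "you said " + text
    print(reply)
    return reply

# Inside MRL's Python tab, Runtime is already available.
# Outside MRL it is not defined, so skip starting the services.
try:
    Runtime
except NameError:
    Runtime = None

if Runtime is not None:
    webgui = Runtime.start("webgui", "WebGui")
    ear = Runtime.start("webkitspeechrecognition", "WebkitSpeechRecognition")
    # Assumption: "publishText" is the published method and "python" is the
    # name of the running Python service; adjust both if your build differs.
    ear.addListener("publishText", "python", "heard")
```

Once this prints what you say, replacing the print with a call to maryspeech.speak(reply) would give you a simple echo loop.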
 
 
burtbick's picture

Thanks Ken,

Actually many, many moons ago back in the late 80's and early 90's I worked on a team doing embedded software for custom voice recognition hardware.  Of course things have evolved a LOT since then.

I know that on the GUI tab I can click on in and out and then get a list of methods on the loaded services, and there is a bit of documentation elsewhere online that lists some of the methods as well.  But things appear to be somewhat spotty, which makes it a bit difficult to track down what is required to get things working.  And of course, with the changes over time, a number of examples online and on YouTube can tend to lead one down the primrose path a bit.

 

I was hoping to find a complete list of the methods for all of the services, but that doesn't appear to exist, at least at this time.

Right now I'm chasing a problem with the latest build, downloaded a couple of days ago.  It has suddenly started crashing where it was working fine before.

I had downloaded another latest build a day or so ago, but that one doesn't install at all. 

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/http/HttpEntity
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.myrobotlab.framework.repo.ServiceData.generate(ServiceData.java:158)
    at org.myrobotlab.framework.repo.ServiceData.getLocalInstance(ServiceData.java:104)
    at org.myrobotlab.service.Runtime.<init>(Runtime.java:132)
    at org.myrobotlab.service.Runtime.getInstance(Runtime.java:611)
    at org.myrobotlab.service.Runtime.createService(Runtime.java:362)
    at org.myrobotlab.service.Runtime.create(Runtime.java:257)
    at org.myrobotlab.service.Runtime.createAndStart(Runtime.java:270)
    at org.myrobotlab.service.Runtime.start(Runtime.java:1692)
    at org.myrobotlab.service.Agent.main(Agent.java:1053)
Caused by: java.lang.ClassNotFoundException: org.apache.http.HttpEntity
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 11 more

 

And the one that I just downloaded is choking on install with that same error.  Java 8 (Oracle) is installed.

Going to try the stable build and see if things will settle down a bit.  I like the latest build, but it's probably throwing more confusion into the mix right now.

 

Burt

burtbick's picture

Looks like something happened during a Linux update that broke Java, so I had to reinstall Java.  I've been able to get the Webkit speech recognition to load and work just fine.

But MarySpeech is being fussy.

In the Java debug dump in the Python tab it says:

[INFO] loading language resources for de from jtok/de

[ERROR]

Then java.lang.reflect.InvocationTargetException, blah blah.

Is it trying to load a German voice?

I'm assuming that I should specify a voice, but this is happening when the

maryspeech = Runtime.createAndStart("maryspeech", "MarySpeech");

line is executed.  Right now I have the speak() line commented out.

Note that while troubleshooting the Java issue I dropped back to the 1.0.1758 stable build of MRL for this test.

Thanks,

Burt