Hi

The current version of the Speech service works very well in English. But if you try to use it in any language other than English, it doesn't really work. I identified 3 different issues.

1. For Google to be able to convert text to speech, the text in the URI needs to be correctly encoded. That encoding is case sensitive. Since the string to be spoken is converted to lower case, the encoding will not be correct. However, if I only remove the conversion to lower case, the cached mp3 file will not be saved. So building the string sent to Google and building the filename of the mp3 file need to be separated.

2. The URI needs to contain the parameter ?ie=UTF-8 for Google to understand the encoding.

3. The two changes above solve the problem for Java. However, Python v2.5 uses ASCII, not UTF-8, so you have to create a UTF-8 encoded string in Python for it to work. Python v3 works with UTF-8, just like Java, so if Python is upgraded to v3.x, it will work. But that may not be possible for other reasons, like OpenCV also using Python v2.x.
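
To illustrate points 1 and 2, here is a minimal Java sketch. It is not the actual Speech.java code. The endpoint, the parameter names `q` and `tl`, and the class and method names are all hypothetical placeholders; only `ie=UTF-8` comes from the description above. The idea is that the request text keeps its original case, while the cache filename is derived separately and may be lower-cased.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;

// Hypothetical sketch: separates request encoding from cache-file naming.
final class SpeechRequestSketch {
    // Placeholder endpoint (assumption), not the real service address.
    static final String BASE_URL = "http://tts.example.invalid/speak";

    // Percent-encodes a value as UTF-8; wraps the checked exception.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Builds the request URI. The text is NOT lower-cased here.
    // "tl" and "q" are assumed parameter names; "ie=UTF-8" is from the post.
    static String buildUri(String text, String language) {
        return BASE_URL + "?ie=UTF-8&tl=" + encodeUtf8(language)
                + "&q=" + encodeUtf8(text);
    }

    // Derives the cache filename independently of the request string,
    // so lower-casing it cannot affect what is sent to the service.
    static String cacheFileName(String text, String language) {
        String normalized = text.toLowerCase(Locale.ROOT);
        return language + "_" + encodeUtf8(normalized) + ".mp3";
    }

    public static void main(String[] args) {
        String text = "Hej p\u00e5 dig"; // Swedish sample with a non-ASCII letter
        System.out.println(buildUri(text, "sv"));
        System.out.println(cacheFileName(text, "sv"));
    }
}
```

Percent-encoding the filename is just one way to keep it ASCII-only; any other stable mapping from text to filename would work as long as it is kept apart from the request string.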

Can I upload the updated Speech.java file somewhere, or is my description of the problem / solution detailed enough to make the changes?

Greetings

Mats

 

 

kwatters

9 years 4 months ago

Hi Mats! 

  It's been great to see your contributions here!  Thanks for that.  So, you are quite correct, the current Sphinx-based speech recognition is ... limited..

 As part of a much larger refactor we are integrating the google speech recognition.  This seems to do well for multi-language free form speech recognition.  (Much better than sphinx by a long shot.)

  We're in the process of building the new web-based user interface for MRL based on AngularJS.  It looks like it will work out perfectly for us, however it does require that the person viewing the web GUI is using Chrome.  (Google webspeech isn't integrated into the other browsers.)

  If you have a way to call the google speech api directly from java, we should consider that approach also.  When we last looked, there wasn't an easy way to do it.

  I think when the new webgui is ready, people will be very happy with the speech recognition capabilities.

Welcome and keep it up!

  -Kevin

GroG

9 years 4 months ago

Hello and Thanks again Mats !

Once again you show your code-Fu ..  I bow to your quick understanding and deft patching skills

1. Makes sense Google wants UTF-8 for all language support.  Heh, and I forgot the strings were being lower-cased.  Initially Kwatters/Kevin and I had some issues saving the mp3 because the filenames were based on the utterances..  I thought the last time we changed it to base64 encoding - would have to check.  Anyway.. looking at my Windows 7, I think it supports UTF-8 with the exception of some characters so maybe we can go back to utterance=filename.

Google is not the only backend - there is at least 1 other site which many people have found more desirable, specifically because of the inability to change the gender of the voice with google.  These and other sites may not handle UTF-8, so changing for Google may break others .. it would probably be best that a configuration set be created for each site.. including encoding details.

2. Ya - part of the encoding details

3. MyRobotLab (henceforth mrl) - is Java based - the Python running in it as a service is Jython - current release is 2.7 - http://www.jython.org/downloads.html.
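
As a rough illustration of the per-site configuration idea from point 1, here is a small Java sketch. Everything in it is an assumption made for the example: the class name, the two backends, their URLs and their parameter names are placeholders, not real services or existing mrl code. It only shows how each backend could carry its own charset and fixed parameters, so that a change for Google does not affect the others.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical per-backend settings, including encoding details.
final class BackendConfigSketch {
    final String baseUrl;     // placeholder endpoint (assumption)
    final String charset;     // charset used to percent-encode the text
    final String fixedParams; // extra query parameters, may be empty
    final String textParam;   // name of the parameter carrying the text

    BackendConfigSketch(String baseUrl, String charset,
                        String fixedParams, String textParam) {
        this.baseUrl = baseUrl;
        this.charset = charset;
        this.fixedParams = fixedParams;
        this.textParam = textParam;
    }

    // Builds the request URI using this backend's own charset.
    String buildUri(String text) {
        try {
            StringBuilder sb = new StringBuilder(baseUrl).append('?');
            if (!fixedParams.isEmpty()) {
                sb.append(fixedParams).append('&');
            }
            return sb.append(textParam).append('=')
                     .append(URLEncoder.encode(text, charset)).toString();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // Two made-up backends: one expects UTF-8, one expects ISO-8859-1.
        BackendConfigSketch a = new BackendConfigSketch(
                "http://tts-a.example.invalid/speak", "UTF-8", "ie=UTF-8", "q");
        BackendConfigSketch b = new BackendConfigSketch(
                "http://tts-b.example.invalid/say", "ISO-8859-1", "", "text");
        System.out.println(a.buildUri("p\u00e5"));
        System.out.println(b.buildUri("p\u00e5"));
    }
}
```

With a setup along these lines, the same utterance is encoded differently for each site, and adding another backend would only mean adding another configuration entry.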

Do you have a GitHub account ? If so just make a pull request with the file !

It's great you have looked into this with such detail !  

YOU ROCK !

Hi

I'm really new to using GitHub. I used the Quick Start instructions ( Developer ) to download all the source code without creating an account. I will create one and try to make a pull request. I will get back when I have tried.

Now I have an account on GitHub. My user is MatsMRL. What do I do next ?

 

 

Hi Mats !

Honestly, I am new to Git & GitHub - I have much more experience in subversion. 
Also, I can tell your code-Fu is strong and I would welcome you to the "Crack Dev Team" , but if you would humor me, maybe we both can learn some things about Git & GitHub.  

I suspect there will be many more people interested in sending us changes when the new WebGUI is released, and I'd like to take the opportunity to see what that process would be.

Submitting changes directly when you're a Crack Dev Team member is trivial - you just work on a branch and commit remotely.

I avoided branching much with subversion in the past, but git LOVES BRANCHES & it handles them pretty well and so does GitHub.

So I believe "Forking" is just another branch outside of the project members - and when you commit you should be able to submit a Pull Request .. I suspect you want to submit it to "master" as that is what you're probably working with ...  

I'll experiment with this process too...

Also - I've sent you an invite to the Dev-Crack Team.   If you lose interest in how to commit using a pull request - just accept the invite and commit remotely 

Cheers !

Thanks GroG.

I see what you mean. As this project grows, more people will want to contribute, but you don't want the core team to grow with every contributor. If I can help test that, I'm happy to do so. I also think it is a really good option, since that way someone can contribute some changes during a period without taking on any other responsibility or risking unwanted changes to the project.

What is confusing with Git is the way "pull" and "push" are the reverse of what you would think.

I already made the mistake of using "push" when I should have only used "pull". 

Next time I will be more careful.

/Mats

 

 

 

All Good Mats !

and Welcome to the Crack Dev Team ;) 

I'm excited about seeing your changes, they are committed on "master", but we will need to put them on the API2 branch too. 

Git can take some getting used to..   and you're right, I would like to be able to give someone in the future a clear and SIMPLE set of instructions for doing a pull request against a branch, but that means I really have to figure out how to do it :D