WebkitSpeechRecognition and German umlauts

I have a problem matching patterns that contain umlauts (ä, ö, ü) on my InMoov.

In my bot's Swing GUI tab I can enter e.g. "löwe" or "loewe" (lion, I want to hear it roarrrrr).

"loewe" matches the pattern "LOEWE". However, "löwe" matches neither "LOEWE", nor "LÖWE" (uppercase Ö), nor "LöWE" (lowercase ö).

WebkitSpeechRecognition (also set to German) shows me "löwe" in the web GUI when I speak the word, but this does not match my prepared patterns.

A solution would be to replace the umlauts coming from WebkitSpeechRecognition with their transliterated form (ö -> oe) before the text is sent to ProgramAB, so that it matches the patterns. However, I do not know how to accomplish this, and I could not find any setting that would force the replacement automatically.

calamity's picture

juerg

Look at my script:

https://github.com/MyRobotLab/pyrobotlab/blob/develop/home/Calamity/pinocchio.py

I capture the WebkitSpeechRecognition output so that I can do some processing before sending it to ProgramAB.

You can probably do your substitution that way in Python.

 

juerg's picture

Hi Christian

I struggled for a while to see how the chaining of the messages works.

I have now added a function "replaceUmlaute", which is called from the "ear" (wksr), and removed the direct connection from the ear to the bot.

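def replaceUmlaute(data):
  # Transliterate the lowercase umlauts so the text matches
  # the bot's patterns, e.g. "LOEWE".
  data = data.replace(chr(228),"AE")  # a-umlaut
  data = data.replace(chr(246),"OE")  # o-umlaut
  data = data.replace(chr(252),"UE")  # u-umlaut
  print(data)
  # Forward the converted text to the chatbot.
  marvin.getResponse(data)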

# send text from listener to chatbot
#ear.addTextListener(marvin)
# route text over Umlaut replacing function
ear.addListener("publishText","python","replaceUmlaute")
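
The function above only covers the three lowercase umlauts. A slightly more general variant could look like the sketch below. The helper name normalize_umlauts and the mapping table are my own assumptions for illustration, not part of MyRobotLab or ProgramAB. The source is kept ASCII-only (Unicode escapes instead of literal umlauts) so that the encoding the file is saved in should not matter.

```python
# Hypothetical helper, not part of MyRobotLab or ProgramAB.
# Maps German umlauts and sharp s to their ASCII transliterations.
UMLAUT_MAP = {
    u"\u00e4": u"ae",  # lowercase a-umlaut
    u"\u00f6": u"oe",  # lowercase o-umlaut
    u"\u00fc": u"ue",  # lowercase u-umlaut
    u"\u00c4": u"Ae",  # uppercase A-umlaut
    u"\u00d6": u"Oe",  # uppercase O-umlaut
    u"\u00dc": u"Ue",  # uppercase U-umlaut
    u"\u00df": u"ss",  # sharp s
}


def normalize_umlauts(text):
    """Return text with umlauts and sharp s replaced by ASCII letters."""
    for char, replacement in UMLAUT_MAP.items():
        text = text.replace(char, replacement)
    return text


print(normalize_umlauts(u"l\u00f6we"))          # loewe
print(normalize_umlauts(u"L\u00d6WE").upper())  # LOEWE
```

In the setup above, this helper would be called inside replaceUmlaute before the text is handed to marvin.getResponse.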

I found the character codes (228, 246, 252) only by trial and error, as they do not correspond to the values in the standard ASCII table.
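
A side note on those numbers: ASCII only defines the codes 0 to 127, so the umlauts cannot appear in an ASCII table. The values 228, 246 and 252 are the Unicode code points of ä, ö and ü, which are the same as their byte values in ISO 8859-1. A minimal check, assuming Python 3 (the dictionary keys are just descriptive labels chosen here):

```python
# ASCII-only source: the umlauts are written as Unicode escapes.
umlauts = {
    "a-umlaut": "\u00e4",
    "o-umlaut": "\u00f6",
    "u-umlaut": "\u00fc",
}

# ord() returns the Unicode code point of a single character.
code_points = {name: ord(char) for name, char in umlauts.items()}
print(code_points)

# In ISO 8859-1 each of these characters is one byte with the same value.
latin1_values = {name: char.encode("iso-8859-1")[0] for name, char in umlauts.items()}
print(latin1_values == code_points)
```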

Thanks a lot for your help

Mats's picture

UTF-8

Hi 

Did you try saving the file in UTF-8 encoding, as it says in the header? You can do that in Notepad, for example.

If you don't, the file will probably be saved in your locale's encoding, most likely ISO-8859-1 or ISO-8859-15.
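
To illustrate why the encoding matters, here is a small sketch, assuming Python 3, of how the same word is stored in the two encodings and what happens when UTF-8 bytes are read as ISO-8859-1. The variable names are only for illustration.

```python
# "loewe" written with an o-umlaut, as a Unicode escape.
text = "l\u00f6we"

# The o-umlaut is two bytes in UTF-8 but a single byte in ISO-8859-1.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")
print(len(utf8_bytes), len(latin1_bytes))

# Reading UTF-8 bytes as ISO-8859-1 turns the umlaut into two wrong
# characters, so a comparison with the original word fails.
misread = utf8_bytes.decode("iso-8859-1")
print(misread == text)
```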

/Mats