To make it short:
I found a free, open-source French chat bot based on ALICE2.0 that offers a lot of possibilities.
The problem is that all the AIML files are encoded in "ISO-8859-1".
Therefore, when I use French with WebKitRecognition, the patterns don't match correctly and I only get the default answers.
#############################
Here is what the AIML files look like:
<?xml version="1.0" encoding="ISO-8859-1"?>
<aiml version="1.0">
<category><pattern>PEUT ETRE *</pattern><template>Tu semble incertain. <sr/> </template></category>
<category><pattern>PEUT SEULEMENT *</pattern><template><srai>peut <star/></srai></template></category>
<category><pattern>PEUT ÊTRE *</pattern><template>Tu semble incertain. <sr/> </template></category>
<category><pattern>PEUT</pattern><template>Peut que? </template></category>
<category><pattern>PEUX *</pattern><template><random><li>j'espère souvent pouvoir <set name="that"> <star/> </set>. </li><li>Un livre peut il n'avoir aucun titre? </li><li>Peux quoi? </li></random></template></category>
....
##############################
So I set up a Python file that is called at startup and tries to convert some of the input. The first line has some success, but the second line apparently isn't working:
def onText(text):
    #print text.replace("'", " ")
    inmoovFrench.getResponse(text.replace("'", " "))
    inmoovFrench.getResponse(text.replace("-", " "))
##############################
I'm wondering what method could be used to convert what comes from Chrome.
I tried replacing "ISO-8859-1" with "UTF-8" at the top of my AIML, but it gives me an error.
If you guys have an idea...
I found this in a post by Anthony, though I still need to figure out which parts I need:
import io
import glob, os
oridir=os.getcwd().replace("\\", "/")
dir=oridir+"/ProgramAB/bots/rachel/aiml"
os.chdir(dir)
for file in glob.glob("*.aiml"):
    with io.open(dir+'/'+file,'r',encoding='iso-8859-1') as f:
        text = f.read()
    with io.open(dir+'/'+file,'wb') as f:
        f.write(text.encode('utf8'))
    print file+' converted'
os.chdir(oridir)
change encoding
That looks like a script that reads all the .aiml files in a directory, assumes they are in iso-8859-1 encoding, re-encodes them as utf-8, and writes each one back over the original file, printing the filename followed by "converted".
So I think there is still one change to make after that: the XML / AIML needs to have its encoding declaration at the top updated to say:
<?xml version="1.0" encoding="utf-8" ?>
I think you can probably just run that code in the Python service in MRL. The only thing you need to update is "dir", to make sure it points at the directory where your AIML files are.
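The two steps described above (re-encode, then update the declaration) could be combined roughly as follows. This is only a sketch under assumptions: the helper names convert_aiml_bytes and convert_directory are made up for illustration, the files are assumed to really be ISO-8859-1, and the converted copies go to a separate directory so the originals stay untouched. It avoids the print statement so that it should run under both Python 2.7 and Python 3.

```python
import io
import os
import re

# Matches the encoding attribute inside the XML declaration only.
_DECL = re.compile(r'(<\?xml[^>]*encoding=")[^"]*(")')

def convert_aiml_bytes(raw, source_encoding="iso-8859-1"):
    """Decode raw AIML bytes, rewrite the declaration, return UTF-8 bytes."""
    text = raw.decode(source_encoding)
    text = _DECL.sub(r"\g<1>UTF-8\g<2>", text, count=1)
    return text.encode("utf-8")

def convert_directory(src_dir, dst_dir):
    """Write converted copies into dst_dir (hypothetical output folder)."""
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    for name in sorted(os.listdir(src_dir)):
        if not name.endswith(".aiml"):
            continue
        with io.open(os.path.join(src_dir, name), "rb") as f:
            raw = f.read()
        with io.open(os.path.join(dst_dir, name), "wb") as f:
            f.write(convert_aiml_bytes(raw))
```

Calling convert_directory("aiml", "aiml_utf8") would leave the "aiml" folder as it was, so a bad conversion can simply be thrown away.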
Hi gael I will test your code
Hi Gael, I will test your Python code and AIML sample file to see if they are compatible with ProgramAB. Do you have them on GitHub? You can send a zip too, no problem.
(I don't remember this piece of code at all :) happy if it's worky)
@++
The issue I have is at the input text level. For example, WebKitRecognition hears:
"m'écoutes-tu", but to make it work with the AIML I have, I need to input:
"m ecoutes tu"
Therefore I tried to create a function that does the replacements:
def onText(text):
    inmoovFrench.getResponse(text.replace("-", " ").replace("'", " ").replace("é", "e").replace("è", "e").replace("ê", "e").replace("à", "a").replace("û", "u").replace("ï", "i"))
This is kind of worky, but the robot repeats its own answer twice, and not all of it works: only the ("-", " ") and ("'", " ") replacements take effect.
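An alternative to listing every accented letter by hand is to let Python decompose the characters and drop the accent marks. The sketch below uses the standard unicodedata module. The function names are made up for illustration, and it assumes the recognized text arrives as a unicode string (not raw bytes), which I have not checked for WebKitRecognition.

```python
import unicodedata

def strip_accents(text):
    """Split accented letters into base letter + mark, then drop the marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return u"".join(ch for ch in decomposed if not unicodedata.combining(ch))

def normalize_for_aiml(text):
    """Replace apostrophes and hyphens with spaces, then strip accents."""
    text = text.replace(u"'", u" ").replace(u"-", u" ")
    return strip_accents(text)
```

With this, normalize_for_aiml(u"m'écoutes-tu") gives u"m ecoutes tu", which is the form the AIML patterns above expect.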
##################
When I ran moz4r's code, it really ruined all the AIML files by replacing every "é" with "ÂÂÂÂÂÂÂÂ@".
And other things too...
Luckily I had saved a copy beforehand :)
The best conversion result came from decoding from csISO20022JP to UTF-8, but it still doesn't solve my input problem.
http://string-functions.com/encodedecode.aspx
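One possible explanation for this kind of damage (a guess, not verified against these files) is that the conversion ran on files that were already UTF-8, or ran more than once: every pass that decodes UTF-8 bytes as ISO-8859-1 and re-encodes them makes each accented letter longer. The sketch below reproduces that effect and adds a simple check that could be run before converting. Both function names are made up for illustration.

```python
def is_valid_utf8(raw):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def mis_convert(raw):
    """Simulate one conversion pass that wrongly assumes ISO-8859-1 input."""
    return raw.decode("iso-8859-1").encode("utf-8")
```

A file containing accented letters for which is_valid_utf8 returns True is probably UTF-8 already and should be skipped. A plain ASCII file also passes the check, but converting it changes nothing.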
###################
@moz4r, this is one of the AIML files on GitHub:
https://github.com/MyRobotLab/pyrobotlab/blob/master/home/hairygael/peuxtu.aiml
So I'm still looking for a solution...
Hi Gaël
Maybe my workaround for German umlauts helps:
The result of wksr (ear) is routed through an umlaut replacement function and then sent to marvin (ProgramAB).
wksr understands and produces umlauts, e.g. glücklich (happy). replaceUmlaute turns it into glUEcklich, which matches my AIML pattern GLUECKLICH.
Careful: the chr numbers, e.g. chr(228), are not the ASCII code of ä; I had to print out the codes for the umlauts coming from wksr. Use print ord(data[0]) to see the actual code wksr is using.
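The replacement function described above is not shown in the thread, so here is a rough sketch of what it could look like. The names replace_umlaute and code_points are my own, and the code assumes wksr delivers a unicode string with the usual code points (228, 246, 252), which is worth checking with the helper as suggested.

```python
# Mapping from lowercase umlaut to its two-letter replacement.
UMLAUT_MAP = {
    u"\xe4": u"AE",  # a-umlaut, code point 228
    u"\xf6": u"OE",  # o-umlaut, code point 246
    u"\xfc": u"UE",  # u-umlaut, code point 252
}

def replace_umlaute(data):
    """Replace each umlaut with the uppercase form used in the patterns."""
    for char, replacement in UMLAUT_MAP.items():
        data = data.replace(char, replacement)
    return data

def code_points(data):
    """List the numeric code of every character, for debugging the input."""
    return [ord(ch) for ch in data]
```

Printing code_points(data) on the recognized text shows which numbers actually arrive, so the mapping can be adjusted if they differ.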
Worky script
Thanks guys!
I'm leaning towards Anthony's option first, because I already have most of my script configured the same way as his example.
Although the line:
htmlFilter.addListener("publishText", python.name, "talk")
returns an error, expected 1 arg, got 3.
Unfortunately, I personally find the methods for publishing and listening for messages rather confusing in MRL.
I am not able to follow the path of the messages in your example, e.g. is htmlFilter sending data to "talk"? How is onText() connected to the rest? And what is python.subscribe() good for?
Lots of inside knowledge required, I assume.
test ok
OK, it works. It's better when it isn't blind coding :)
I just start the ProgramAB session at the beginning:
https://github.com/moz4r/aiml/raw/master/bots/BOTS-FRENCH/Inmoov_AI/tes…
@juerg I agree with you, sometimes it's hard to follow the path! But there are so many functions, and it's so powerful. Do you want me to publish a graphical description of these 2 or 3 functions? I will do that when I have a little time.
So finally I used the method of juerg and got it WORKY!!
#ear.addTextListener(inmoovFrench)
# route text over to replacing function
ear.addListener("publishText","python","replacer")
#We intercept what the robot is listening to change some values
#here we replace ' by space because AIML doesn't like '
def replacer(data):
    data = data.replace("'", " ")
    data = data.replace("-", " ")
    data = data.replace(chr(232),"E")#è
    data = data.replace(chr(233),"E")#é
    data = data.replace(chr(234),"E")#ê
    data = data.replace(chr(235),"E")#ë
    data = data.replace(chr(249),"U")#ù
    data = data.replace(chr(251),"U")#û
    data = data.replace(chr(224),"A")#à
    data = data.replace(chr(226),"A")#â
    data = data.replace(chr(244),"O")#ô (chr(212) is the uppercase Ô)
    data = data.replace(chr(239),"I")#ï
    print data
    #print ord(data[0])
    inmoovFrench.getResponse(data)
    #German replacer
    #data = data.replace(chr(228),"AE")
    #data = data.replace(chr(246),"OE")
    #data = data.replace(chr(252),"UE")
A few minutes after I got it working, Kwatters proposed adding an option to the WebKitRecognition service that magically strips all accents, enabled by adding this line to your script:
OK, I don't need to do that; it's strange that you have problems with accents! The most important thing is that it's worky! Have a great evening, guys.
You don't get the error with the args?
There is no need to replace all the special characters.
Try this in AIML; mine works with Polish ĄĘĆŚŃŹŻŁ etc.:
<?xml version="1.0" encoding="Windows-1250"?>
If you use special characters in MyRobotLab Python, try:
mouth.speak(u'ĄĘĆŚŃ')
It works!!
I was struggling with Polish and had to solve it somehow.
Simple is better ;)
The problem in my case was that wksr understands and returns the lowercase letter "ü".
This does not match anything in AIML, as the patterns are uppercase.
It works well for me; WKSR is not returning U for me.
Post some sample code here; maybe I will be able to understand your problem.