Junior sings Happy Birthday

How I made this work

I called this “How I made this work” rather than “How to” because this is probably not the best way to accomplish it, but it did work. My development philosophy has always been:

Step 1) Make it work

Step 2) Make it better

Step 3) Repeat Step 2…

This project brings together a few different services to accomplish its entertaining task. I am using ProgramAB, MarySpeech, WebKitVoiceRecognition and, of course, MyRobotLab. Before this little experiment I had never really used AIML, let alone modified or even looked at an AIML file. To get my head around working with AIML files, I looked at a couple of great tutorials from KWatters:

http://myrobotlab.org/content/programab-aiml-and-mrl-support-oob-tags

https://www.youtube.com/watch?v=Nn634aUZeeE#t=18

I decided to start with a new bot and a super simple AIML file so I didn’t get any unexpected responses while I was testing. I also gave it a distinctly different default response for input that doesn’t match any pattern. That makes it easy to spot when ProgramAB has actually loaded the default Alice 2.0 bot instead of your test bot. As I said, my AIML file is really simple; the most exciting part is the Out of Band (<oob>) tagged area. This is what allows you to bridge out of ProgramAB back to some other service via your main python script. So here is my AIML:

<?xml version="1.0" encoding="UTF-8"?>
<aiml>
  <category>
    <pattern>*</pattern>
    <template>I am sorry I do not have a response for that</template>
  </category>
  <category>
    <pattern>HELLO</pattern>
    <!-- template missing from the original listing -->
  </category>
  <category>
    <!-- pattern missing from the original listing -->
    <template>Hello, <star/>. Glad to meet you</template>
  </category>
  <category>
    <pattern>IT IS * BIRTHDAY</pattern>
    <template>Happy Birthday, <star/>. Should we sing them Happy Birthday?</template>
  </category>
  <category>
    <pattern>SING * HAPPY BIRTHDAY</pattern>
    <template>OK
      <oob>
        <mrl>
          <service>python</service>
          <param>loadHappyBirthday("<star/>")</param>
        </mrl>
      </oob>
    </template>
  </category>
</aiml>
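To make the “*” and “<star/>” pairing in that last category a little more concrete, here is a minimal Python sketch of the idea. It is only an illustration and not how ProgramAB matches input internally; the function name fill_star is made up for this example, and it assumes a pattern with a single “*”.

```python
import re

def fill_star(pattern, param, utterance):
    """Return param with <star/> replaced by the words that matched '*'.

    Simplified illustration only: this is not ProgramAB's matcher.
    Returns None when the utterance does not match the pattern.
    """
    # Turn the '*' into a capture group and escape every other word.
    words = ["(.+)" if w == "*" else re.escape(w) for w in pattern.split()]
    regex = "^" + r"\s+".join(words) + "$"
    match = re.match(regex, utterance.strip(), re.IGNORECASE)
    if match is None:
        return None
    return param.replace("<star/>", match.group(1))

call = fill_star("SING * HAPPY BIRTHDAY",
                 'loadHappyBirthday("<star/>")',
                 "sing Kyle happy birthday")
print(call)  # loadHappyBirthday("Kyle")
```

The resulting text is what ends up being handed to the python service, which is why the function it names has to exist in the main script.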

The first thing I tested, as you will see in there, was that I could grab a name from a pattern and then use it in a template, which also made Junior seem more polite. Next I needed to figure out how to pass along the name of the person we are singing to. You will see that the <oob> block executes a python function, “loadHappyBirthday()”, and “<star/>” puts the name in. Now, let’s look over at my python script to see what gets called.

from java.lang import String
from time import sleep

python = Runtime.getService("python")

# I would rather do this when it is needed rather than always adding it in...
python.execFile("[path to your python scripts]:.myrobotlab:happy_birthday.py")

# create a ProgramAB service and start a session
junior = Runtime.createAndStart("junior", "ProgramAB")
junior.startSession("ProgramAB", "kyle", "junior")

#### REMOVED A BUNCH OF CODE NOT REALLY IMPORTANT YET...

# called from the <oob> block in the AIML with the name matched by <star/>
def loadHappyBirthday(name):
    # handy when tracing through the log
    print "Now sing " + name + " Happy Birthday"
    # give the spoken "OK" response time to finish before the song starts
    sleep(2)
    singHappyBirthday(name)

As we see, “loadHappyBirthday()” is really simple. I don’t actually need the print statement, but it is nice to have when tracing through the log to see where something isn’t working. The “sleep(2)” is in there because, if you look up at the AIML, there is a template response of “OK” and then it runs this function. Without the sleep, the voice talks over itself, with that response overlapping the beginning of the Happy Birthday song. (There is definitely room for improvement in this code.)

One of the other elements I highlighted, defining “python” with Runtime.getService("python"), I believe is important so that the Out of Band <oob> call works properly. If you were able to see the whole script here, you would not find “singHappyBirthday()” defined anywhere. That is because I am adding it in via “python.execFile(…)”. Eventually I would like to load it only when needed, but for the first pass it is here. I did try putting the execFile call inside “loadHappyBirthday()”, but loading the script and then immediately calling the function seemed to cause a race condition, and it would have required some sort of arbitrary sleep time, which isn’t a good option either.

Before we look at singing, let’s talk about basic MarySpeech integration. I decided to use MarySpeech because I really want to develop a solution that doesn’t require the internet to work. With the work of MaVo, MarySpeech is a pretty nice solution. MaVo just recently added the ability to pass the variables that allow you to tweak the voices, which is really what makes “singing” possible. When you load the MarySpeech service from MRL you only get one “voice” option. I will have to look for the link to the description of how to compile other languages, but you can get several other voices from the MaryTTS GitHub repo. The MarySpeech site is often down, which is a pain, but there is an online test page that lets you “test-drive” the voices and modify them. This is what I used to develop Junior’s specific voice.

From the online tool and the compiled voice file here is how I defined Junior’s voice in my main python script:

# create a Speech service

mouth = Runtime.createAndStart("MarySpeech", "MarySpeech")

mouth.setVoice("cmu-bdl-hsmm")

mouth.setAudioEffects("TractScaler(amount=1.4) + F0Add(f0Add=60.0) + Robot(amount=8.0) + Rate(amount=1.75)")

I was not going for an overly human-sounding voice, though that is possible with MarySpeech. I am tweaking his voice by raising the pitch to make it more child-like and adding a bit of a Robot effect. In the earlier videos this is what Junior used to “talk” and “sing”. As we learned from the movie Elf, “Singing is just like talking but you move your voice up and down”. This is the direction I took to make Junior “sing” rather than just “rap”. Now, I am not overly musically inclined and this is a first attempt. I tried to come up with a way to define different notes and came up with this method:

mouth = Runtime.createAndStart("MarySpeech", "MarySpeech")

mouth.setVoice("cmu-bdl-hsmm")

mouth.setAudioEffects("TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=60.0) + Robot(amount=8.0) + Rate(amount=1.75)")

singLowA = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=-9.0) + Robot(amount=8.0) + Rate(amount=1.75)"

singLowB = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=4.0) + Robot(amount=8.0) + Rate(amount=1.75)"

singC = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=10.0) + Robot(amount=8.0) + Rate(amount=1.75)"

singD = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=28.0) + Robot(amount=8.0) + Rate(amount=1.75)"

singE = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=45.0) + Robot(amount=8.0) + Rate(amount=1.75)"

singF = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=58.0) + Robot(amount=8.0) + Rate(amount=1.75)"

singG = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=77.0) + Robot(amount=8.0) + Rate(amount=1.75)"

singA = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=102.0) + Robot(amount=8.0) + Rate(amount=1.75)"

singB = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=130.0) + Robot(amount=8.0) + Rate(amount=1.75)"

singHighC = "TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=142.0) + Robot(amount=8.0) + Rate(amount=1.75)"

I have to repeat all of these settings for each note, which seems a bit much, and they are currently based directly on Junior’s voice. I will eventually modify it so that you define the voice once and the specific “note” definitions are derived from that basic voice, but that is a future enhancement. Also, this is defined inside a specific song, and it NEEDS to be pushed out to a separate file so I do not define it in every “song” python script. The rest of the script is tucked inside a function so that you can fire it off deliberately and it doesn’t just start singing when you run “python.execFile(….happy_birthday.py)”.
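One possible direction for that enhancement, sketched here and not what the script currently does: keep the shared part of the effect string in one place and look up only the F0Add offset per note. The names BASE_EFFECTS, NOTE_F0_ADD and note_effects are invented for this sketch; the offsets are copied from the definitions above.

```python
# Hypothetical refactor: one template string plus a table of pitch offsets.
BASE_EFFECTS = ("TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + "
                "F0Add(f0Add=%.1f) + Robot(amount=8.0) + Rate(amount=1.75)")

# f0Add offsets taken from the singLowA ... singHighC definitions
NOTE_F0_ADD = {
    "LowA": -9.0, "LowB": 4.0, "C": 10.0, "D": 28.0, "E": 45.0,
    "F": 58.0, "G": 77.0, "A": 102.0, "B": 130.0, "HighC": 142.0,
}

def note_effects(note):
    """Return the audio effect string for a named note."""
    return BASE_EFFECTS % NOTE_F0_ADD[note]

singF = note_effects("F")
print(singF)
```

With something like this, changing Junior’s base voice would mean editing one string instead of ten, and the table could live in a separate file shared by every song.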

Here is the rest of the script:

def singHappyBirthday(name):
    # "Happy birthday to you" (first time)
    mouth.setAudioEffects(singF)
    mouth.speakBlocking("Happ, pee")
    mouth.setAudioEffects(singG)
    mouth.speakBlocking("birth")
    mouth.setAudioEffects(singF)
    mouth.speakBlocking("day")
    mouth.setAudioEffects(singB)
    mouth.speakBlocking("to")
    mouth.setAudioEffects(singA)
    mouth.speakBlocking("you!")

    # "Happy birthday to you" (second time)
    mouth.setAudioEffects(singF)
    mouth.speakBlocking("Happ, pee")
    mouth.setAudioEffects(singG)
    mouth.speakBlocking("birth")
    mouth.setAudioEffects(singF)
    mouth.speakBlocking("day")
    mouth.setAudioEffects(singB)
    mouth.speakBlocking("to")
    mouth.setAudioEffects(singA)
    mouth.speakBlocking("you!")

    # "Happy birthday dear <name>"
    mouth.setAudioEffects(singF)
    mouth.speakBlocking("Happ, pee")
    mouth.setAudioEffects(singHighC)
    mouth.speakBlocking("birth")
    mouth.setAudioEffects(singA)
    mouth.speakBlocking("day")
    mouth.setAudioEffects(singF)
    mouth.speakBlocking("dear")
    mouth.setAudioEffects(singE)
    mouth.speakBlocking(name)

    # "Happy birthday to you" (last time)
    mouth.setAudioEffects(singB)
    mouth.speakBlocking("Happ, pee")
    mouth.setAudioEffects(singA)
    mouth.speakBlocking("birth")
    mouth.setAudioEffects(singF)
    mouth.speakBlocking("day")
    mouth.setAudioEffects(singG)
    mouth.speakBlocking("to")
    mouth.setAudioEffects(singF)
    mouth.speakBlocking("yooou")

####Test Happy Birthday

#singHappyBirthday("kyle")

At the bottom you will notice I have a commented-out call to the function. Uncommenting it lets me run this script by itself and test it over and over again, separate from loading all of the other pieces. This was nice when trying to tweak the pronunciation of the different words. Let’s now look at the actual singing:

mouth.setAudioEffects(singB)

mouth.speakBlocking("Happ, pee")

So this is not just reading text: each word has to be broken into distinct syllables (“Happ, pee”), which is a bit of a challenge. Also, when there are two notes for a single word you end up with something like:

mouth.setAudioEffects(singA)

mouth.speakBlocking("birth")

mouth.setAudioEffects(singF)

mouth.speakBlocking("day")
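Since every sung syllable is the same two calls, the song could also be written as a list of (note, syllable) pairs and played with a loop. The sketch below is one way that might look, not what the script above does. RecordingMouth is a stand-in I made up so the example runs outside MyRobotLab; inside MRL you would pass the real MarySpeech service and use the effect strings (singF and friends) in place of the placeholder note names.

```python
# Stand-in for the MarySpeech service so the sketch runs outside MyRobotLab.
class RecordingMouth(object):
    def __init__(self):
        self.effects = None
        self.calls = []

    def setAudioEffects(self, effects):
        self.effects = effects

    def speakBlocking(self, text):
        # Record which effect was active for each spoken syllable.
        self.calls.append((self.effects, text))

# First line of the song; the note names are placeholders for the
# effect strings defined earlier in the song script.
FIRST_LINE = [("singF", "Happ, pee"), ("singG", "birth"), ("singF", "day"),
              ("singB", "to"), ("singA", "you!")]

def sing(mouth, score):
    """Set the note, then speak the syllable, for every pair in the score."""
    for note, syllable in score:
        mouth.setAudioEffects(note)
        mouth.speakBlocking(syllable)

mouth = RecordingMouth()
sing(mouth, FIRST_LINE)
print(mouth.calls)
```

Keeping the score as data would make it easier to tweak a syllable or a note without hunting through pairs of calls.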

Well, I think that is about all I can tell you about the process I went through, the services I used and a bit of the code. I think I am going to add some GitHub Gists for this eventually too, so you may want to come back and check this post again.

One piece of future work is the ability to define a different rate so that notes can be sustained. I thought it should be super simple with the Rate(amount=…) part of

mouth.setAudioEffects("TractScaler(amount=1.4) + F0Scale(f0Scale=0.0) + F0Add(f0Add=60.0) + Robot(amount=8.0) + Rate(amount=1.75)")

but it didn’t seem to work as I expected. I will keep iterating on this until it works better, I promise.

Thanks for your interest!

Kyle