I recently attended some training classes on a deep learning framework called Deeplearning4j.  The training was provided by SkyMind.IO, the company that created, maintains, and supports the dl4j open source project.
 
The class covered many topics around neural networks and how to train them.  As some background, a neural network is a data structure that tries to model how a brain works.  It models individual neurons and their connections.  The idea is that each neuron has a bias, and its connections to other neurons have weights.  These networks have shown that, with training data and some fancy math, you can adjust the biases and weights so that the network gets really good at modeling the training data.  This means you can use the trained network to classify new data it hasn't seen yet.  There are many types of neural networks, such as feed-forward networks or convolutional networks.
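As a toy illustration of that idea (the numbers and names here are made up; real networks stack many neurons into layers), a single neuron is just a weighted sum of its inputs plus a bias, pushed through an activation function:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs, shifted by the neuron's bias...
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ...squashed by a sigmoid activation so the output lands in (0, 1).
    return 1.0 / (1.0 + math.exp(-total))

# Training is the "fancy math" that nudges these weights and the bias
# until the outputs match the training data.
output = neuron([0.5, -1.0], weights=[0.8, 0.2], bias=0.1)
```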
 
One network I found particularly interesting is a type of recurrent neural network (RNN) called a Long Short Term Memory (LSTM) network.  LSTM networks change the neuron model slightly so that each neuron carries a little bit of short-term memory, such as a previous input value.  Because there is some memory, these networks are good at dealing with data that has a definite direction in time.  One example is temperature measurements over time.  Textual data has a temporal nature in the same way, in that sentences are read from left to right.  (except for a few languages)
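A real LSTM cell adds input, forget, and output gates to control that memory, but the core recurrent idea can be sketched in a few lines (toy weights, a plain recurrent unit rather than an actual LSTM):

```python
import math

def rnn_step(x, h_prev, w_in=0.5, w_rec=0.9, bias=0.0):
    # The new hidden state mixes the current input with the previous
    # hidden state -- that carried-over h_prev is the "memory".
    return math.tanh(w_in * x + w_rec * h_prev + bias)

# Feed a sequence one value at a time (e.g. temperature readings);
# the hidden state carries context forward through time.
h = 0.0
for x in [0.1, 0.2, 0.3]:
    h = rnn_step(x, h)
```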
 
So, I began looking for training data I could use to train a chatbot with one of these LSTM networks as its brain.  Luckily, Captain GroG has about 300k messages from the shoutbox over the past few years!  
 
I made a few changes to one of the examples so that it could read the training data in from the shoutbox history.  The way it works is: we give the model some text and ask it what it thinks the next letter in the sequence is.  We repeat this until the model has generated a few hundred characters.  
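That loop looks roughly like this (the `model` below is a hypothetical stand-in that returns a probability for each possible next character; in the real example it's the trained LSTM):

```python
import random

def generate(model, seed, length=200):
    text = seed
    for _ in range(length):
        # Ask the model for a distribution over the next character...
        probs = model(text)                      # dict: char -> probability
        chars, weights = zip(*probs.items())
        # ...sample one character from it, append it, and repeat.
        text += random.choices(chars, weights=weights)[0]
    return text

# Stand-in "model" that predicts uniformly over a tiny alphabet.
def dummy(text):
    return {c: 1.0 for c in "ab "}

sample = generate(dummy, '  "msg": "', length=20)
```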
 
At first the model just generates garbage.. but after a few iterations of training, it begins generating things that look like words.. a few more iterations, and the words start being spelled correctly; a few more iterations, and some loose grammar begins appearing in its responses.
 
One other thing that's very interesting: the input training data was in a JSON format that looked like the following:
 
...
  "msg": "I am GroG",
  "msg": "GroG I am",  
...
 
Interestingly, very early on in the training iterations, the model was able to generate sequences of text that contained valid JSON syntax...  Imagine that: not only did the model learn how to generate words and sentences, but it did so by generating valid JSON, including the ending quote mark and comma.
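This is less mysterious once you remember that the model only ever sees characters: quotes, colons, and commas are entries in its vocabulary just like letters, so JSON punctuation gets learned the same way words do.  A sketch of the character encoding (hypothetical helper names):

```python
def build_vocab(corpus):
    # Every distinct character in the training text gets an index.
    return {c: i for i, c in enumerate(sorted(set(corpus)))}

def one_hot(char, vocab):
    # The network's input at each time step: a vector with a single 1
    # at the position of the current character.
    vec = [0] * len(vocab)
    vec[vocab[char]] = 1
    return vec

corpus = '  "msg": "I am GroG",\n  "msg": "GroG I am",\n'
vocab = build_vocab(corpus)
```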
 
So, we started training, and along the way we would pass it the following string:
 
  "msg": "ahoy!",
  "msg": "
 
and ask it what comes next.  Early on in the training (after iteration 39) it spit out its first words..
 
----- Sample 39 -----
  "msg": "ahoy!",
  "msg": "nig  e"oevt uaol0uoto asoFdssd s naletn i eo m"ao"to"eii tnir
nlh i,im sMomy Oi arilttae o ta?e"ta  nn,sktjgscydv irim ! geo"aeoe hsis:.lb"ydyns!t dnh  
"nggsdoe"dtoss libiastgm ostcsyr hm iol itd "D e"hd" gumde vh"haoi"i gItoe""w, :"  ke"L   !ui umemosrl "ettfhnd,: oat uw"-a)dks "iZg noytannte," t 
"ge  eig
bf"; ee,,n hdmiotl3, iha i:b .1ths 
wT""s "Fueelgs eattgsnobo ""efr mn"nnb "  l ta suo .lo
-------------------
 
 
Here you can see that everything after the first quote mark on the second line was generated by the model, and it's basically gibberish...
 
After iteration 359, it has learned that each line should start with "msg"... and should end with a ",
 
----- Sample 359 -----
  "msg": "ahoy!",
  "msg": "Soudyatthe lig",
  "msg": "it !. gowert bag Erid smovar lon G",
  "msg": "macune pausshien.ber seorempg"s "te"ts andesthai't"nu ...",
  "msg": "wout, move I's ald is tou pa farting far.s. gutny mich the noruoly fouler ffuEZG.nac",
  "msg": "the Sy en ",
  "msg": "os  u deun",
  "msg": "Uhit meredeore?1",
  "msg": "A -) grot a meging the Gtuy innimeis",
  "msg": "Ahamy gawa sorn sORYJn'c  wo I gwha
-------------------
 
 
But in deep learning, assuming you're not overfitting, the model gets better the more you train it... so we fast forward.. at iteration 759 we see the model, for the first time, say InMoov 
 
 
----- Sample 759 -----
  "msg": "ahoy!",
  "msg": "I onle think instarded with in servict. AzayC modeh the degand in the enMoov is nick",
  "msg": "NaKy gail too it was a good works..",
  "msg": "i have about be but lestart experamese",
  "msg": "a yug morning thenk and leg ), quesies enMoov night of device buy and InMoov like got and there.. doal and aps.. the uld fieging & csaye befoution out what you expel i can resound API us finentt servacD.i
-------------------
 
Around iteration 1599 it seems to say MyRobotLab (almost) for the first time.
 
----- Sample 1599 -----
  "msg": "ahoy!",
  "msg": "even no need it of erfors",
  "msg": "I'm a connerter. Rack from the ElMost printer rome, org.myrobotlab.duffioulter(maxper is python, but it simple tomoopher ;-)" a booth year was home ouch of it",
  "msg"
 ",
  "msg": "ambet tri frem the leart the wrrit - I coat cbrogucuitillulfiot Atipror prifrtrot pro ercoro .. morrl com. lle morrech ull combit troig trgrt cocrere ;",c .citroplero "u)alft.",c 
-------------------
 
At iteration 1759 it says Gael for the first time..
 
----- Sample 1759 -----
  "msg": "ahoy!",
  "msg": "hi vy... How gael was done I've cectallel partial",
  "msg": "at the logictone and if we can have starter 1.4... And ransis has soldering the robbares can wanted",
  "msg": "Rabyvanderehered",
  "msg": "msgathaith ite whatedee Wiave oneydrarestialaraththt )'t weehay suwerwwey they sphai ante mitaG"t way whyky'ataywytitwiowe ta-yphaate tahat i't?kathaydyttyrayymak phaanthquantwvewray'sakarader waut
-------------------
 
 
At iteration 5239 it seems to associate GroG with being a bike rider?!  What, how?!
 
----- Sample 5239 -----
  "msg": "ahoy!",
  "msg": "it works :-) ",
  "msg": "ok flexer bike GroG?!",
  "msg": "oh .. yes to put a picture of get there",
  "msg": "comy so doing a hoy great - do wear does program als on tas you do to Op to row in the controller.. Wolls som  WHOk ang Movoo us",
  "msg": "so kovo you just boul just  atto conoroloo,, 9! -o!!! You knov! ....ooooutobboteex jow! Wooobbooullo botlooob. Io booto joo! Wooo!!!!!!!!!!7!Alvoub
-------------------
 
 
After 20,000 iterations.. it almost seemed like it was talking..
 
 
----- Sample 19999 -----
  "msg": "ahoy!",
  "msg": "Hi kevin i hope that took the x year - why no your post hem push",
  "msg": "I have no one keep it pleased into my lead in everyone with the arduino and the other single  in the webgui",
  "msg": "one shoutbox worked in I've got the MRL join'  for InMoov...",
  "msg": "You're hope I'm storing more of InMoov hand :)   How'ran'on I'm  I have'' ")     )')') ) ) )))))", y e) ) )", - oK8050000000000000
-------------------
 
 
So, keep in mind that the first message, "ahoy!", is the seed that generates the rest of the text.  In the examples above you'll see it usually generated about 4 or 5 additional messages.  As you ask it to generate more text, it starts losing its mind a bit, which is why the last message in each of these outputs looks like the bot got drunk somewhere between the first thing it said and the last thing it said...
 
 
I just wanted to share some of these responses.  I'll be playing with this technology a bit more and seeing how we can make it more useful.  Another thing that might be interesting is to train it on the blog posts here, so it would generate a blog post rather than a shoutbox message.  I'm still blown away by how the network figured out the JSON syntax...  
 
This LSTM network is largely modeled after the work documented here:
 
 
I for one...
 

Kakadu31

7 years ago

Wow I like it, just amazing. How many "neurons" are there in your model? I'd like to see the responses after reading thousands of dialogues from movies or literature, so it would get a wide span of different data with fewer typing errors than it gets in the shoutbox =) But the blog posts are equally interesting. I just hope it won't turn evil and convert the auto testing elves from GroG to turn against us together!

GroG

7 years ago

Hey,  very cool. 
How fun ... a DL4J class !

Is there a way to change the starting symbols of manipulation from letters to words?
I was just thinking of speeding up the process.  Instead of "characters", if words are used for the base symbols, then it would only need to deal with learning appropriate grammar vs a valid vocabulary.

Excited to see where this goes ..  Interesting too how to interleave the different data; for example, if pictures are included in the ingested blog posts, perhaps relevant pictures will be shown in 200 x Y thumbs in the shoutbox ;)

Will be an adventure getting the new Mr Turing brain online ! :)

hairygael

7 years ago

WAAOOOO, this is just blowing me away!!

I always like to read the posts on MRL because it makes me always discover and learn things.

I had seen on the shoutbox that you were attending a class for DL4J, but I would have never expected you could get that far in such a short time!

Can these strange words be transformed from text to speech during the iterations?

It must be so nerdy to hear those neurons slowly learning and evolving...

It reminds me of the time you were talking about adding ProgramAB to InMoov, and now it's there; next, DL4J will start to write some posts or improve the shoutbox.

Fantastic!! Let's feed it with knowledge.

THANKS Kevin for this great post!!

Ray.Edgley

7 years ago

Hello Kevin,

I know you have been training your model for chatting in a sensible manner, and are making great progress.

Is it possible to feed it other data, such as from sensors, and have it operate outputs as a result of the input and the previous state?

If this is possible, then maybe we have a way of taking in two or more IMU inputs and a dozen or so position feedback sensors, and using that information to maintain balance and even walk!

This would mean it might be possible to teach the robot to walk like you would a child....

When will MRT be the new AI?

Keep up the good work

I love the idea, but it would make sense to use the virtual InMoov. Letting a 6 ft tall, 200 pound baby learn to walk may be a bit dangerous!

Especially if it looks like this...

https://youtu.be/gn4nRCC9TwQ

Still make sure we record any attempts, it would be great to see!

Keep up the great work Kevin!

If you think learning to walk would be bad, just wait for a six foot going through the terrible two's.

 

Won't that be fun :-)

Just food for thought.... ;-)