So recently I have been learning a thing or two about machine learning, one of the most basic forms of artificial intelligence. Well, the best way to learn is by doing. Hence, I decided to do a little project: getting a robot (in this case, my computer) to write me a story in full Singlish.
At a very high level, machine learning is a process where you feed data to a machine and, through efficient algorithms, the machine learns to recognize patterns in that data. In return, it outputs a set of data that it thinks fits those patterns. It does this through many training iterations over the original data set.
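To make the "learn patterns, then output something that fits them" idea concrete, here is a minimal sketch in Python. It is not the neural network used for this project. It is a much simpler stand-in (a word-pair lookup table, sometimes called a bigram or Markov chain model), and the tiny corpus and the function names are made up for illustration.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Record, for every word, the words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, length, seed=0):
    """Build a sentence by repeatedly picking a word that was seen after the current one."""
    rng = random.Random(seed)  # fixed seed so the same call gives the same text
    word = start
    output = [word]
    for _ in range(length - 1):
        candidates = followers.get(word)
        if not candidates:  # this word never had a follower in the training text
            break
        word = rng.choice(candidates)
        output.append(word)
    return " ".join(output)


# Made-up toy corpus, only for illustration
corpus = "God say come lah . God say go lah . He say come here lah ."
model = train_bigram_model(corpus)
print(generate(model, "God", 8))
```

Every pair of neighbouring words in the output was seen somewhere in the training text, which is the "pattern" this toy model has learned. A neural network learns far richer patterns, but the feed-data-in, get-similar-data-out loop is the same idea.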
If only my machine had Emma’s personality
Well, I initially wanted to work with Malaysian English, aka Manglish (since I am actually from Malaysia), as I think it is a unique creole that I grew up with. Unfortunately, I couldn’t find a steady stream of Manglish excerpts or texts, so I settled for the next closest thing: Singlish (Singapore English). Heh, close enough!
After browsing the internet for a bit, I found a collection of text excerpts in the form of the Singlish Bible, which can be found here.
Well, let’s just say this was the first source I found that had at least enough data (text) and happened to be in a sort-of usable format. So don’t be surprised if the story has a lot of Bible terms, because that was the data I used.
With some formatting, here are some parts of the output story produced using machine learning:
I very buay priests and char steady is come Galilee, God’s take dun light Then the kaki-lang when you hum “Wah how then like was in me with and the darkness. issit? He bluff lah, loud loud to them, “I come lah. Then a kaki is over the very God really God’s kind.
But when I donkey, one I so in Nazereth and Sin by over the is recharge the earth and said: very ma will from damn sins and all all live in him wine one, lah.
So ask back but must do snake say it on to no tree so not tell where dao to evil. But God say tree dinner see. virgin pang, you come until prisoner
Got look closed now I say you clouds you kee and every world got not times lah. And he is damn zhong who happy He eat one. The want, also dunno Jesus
In one nonsense after Theophilus in the God Go to the Word by according of front in His taukay tan I think, go and won’t da house.already. “Mai pai kong got sompah kay old condition time Him, me again.
He blew cannot all I go like damn make ah I jin evidence lak Brudder The teeth black, your tua inside and box on his wake And then I will say him Jesus.
“If I night work we damn kia from good. you hor seh. Then I other some leh? back but You see back, it prepare then also tok stupid is is can make lorr. you want they again. just guy chin “Got say to also will gai my sai both liddat. zha though hard revealed there cunning Who very find gong. People is go things but kiao sun All also temple leh?
He wisdom got children every strong day. time must kind, and grow liao until this mix The like love, you, I life dono zhong while on can gather day.
Spiritual regret and three paiseh; house ah because everywhere towkay world this already now like the is kee Peh is kind. So all my (means animals. They find lah.” one watering the Lord lah, God work robe fresh)
This qiu with we know lah sure Joseph’s Peter’s river kia, air make or shoot “Make and James and tear you la). chu But you did mah. The lion You one lie kambing,.
Dedication for Me towkay chop sabo catch front but hor us think swee there We give they know!” and your won’t wind also will not girl because but come and myself, “Eh, My jie, until shine until come to the Kia man come in the own go. lah!” God say them towkay heart that one one one at the tree.
“Alamak, I carry them of the garden sure can eat one der cane “Barabbas!” to the cheng kia kia, he enjoy or come and tell already ah, I leh! on him, Gethsemane I sompah to fall Then the man combine and know up wack you not gao one. there out ah. happen, want to the Jin of his land already from him, Got one of the Boh stay with wa thing he say one time if of out mah. and your kaki.
The machine actually did better than I expected.
The most crucial part of machine learning is the data you feed it. More data generally means more accurate predictions.
Since I was only able to feed the model about 12k words (typical range for good results would be about 200k words and above) from sparse chapters of the Singlish Bible, the story it produced was very random and at times incomprehensible.
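If you want to check how much data you actually have before training, a quick word count is enough. The sketch below is an assumption about how one might do it, not part of the original project code. The sample string is made up; in practice you would read your own training text from a file instead.

```python
from collections import Counter


def corpus_stats(text):
    """Return (total word count, number of distinct words) for a training text."""
    tokens = text.lower().split()  # naive whitespace tokenization
    counts = Counter(tokens)
    return len(tokens), len(counts)


# Made-up sample; replace with the contents of your own training file
sample_text = "God say come lah God say go lah"
total, unique = corpus_stats(sample_text)
print(f"{total} words in total, {unique} distinct words")
```

A small total and a large share of words that appear only once or twice is a hint that the model will not have much to learn from.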
However, the machine managed to learn quite a bit. It recognized that some terms are used as question particles at the end of a sentence, and so it put a question mark “?” after them. Examples are words like anot, leh, etc.
It also learned to differentiate the Singlish terms and can somewhat pinpoint which terms are adjectives, verbs or helper words.
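The question-mark pattern is simple enough that you can see it in the data with a few lines of Python. This is only a rough sketch for inspecting a text, not how the model itself works, and the sample sentences are invented for illustration.

```python
import re
from collections import Counter


def words_before_question_mark(text):
    """Count which words appear immediately before a '?' in the text."""
    matches = re.findall(r"(\w+)\s*\?", text)
    return Counter(word.lower() for word in matches)


# Invented sample sentences, not taken from the training data
sample = "You want makan anot? Why like that leh? He go already. Can anot?"
counts = words_before_question_mark(sample)
print(counts.most_common())
```

Words that show up again and again in this count are the ones a model is likely to associate with a following question mark.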
Of course, the machine will only get smarter if I feed it more data.
Hence, my next step would be to search the internet for more Singlish text, excerpts, transcripts and dialogs to feed into the model.
I wrote this blog post to spark the interest of my peers in machine learning and its various applications. I am still at the very beginning of my own journey in machine learning and have not reached the level of creating my own machine learning model yet.
For this project, I used the model by Sung Kim (a professor teaching computer science at HKUST), which is built on TensorFlow, Google’s machine learning library. You can think of a neural network as a sort of systematic brain for computers, where a bunch of neurons connect together as nodes to process and learn information.
I forked the code onto my GitHub and tweaked the code and files to allow my machine to learn how to write Singlish. If you are interested in how the code works, you can check it out here.
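To give a feel for what "neurons connected together" means, here is a tiny sketch in plain Python. It is not the model from this project and does not use TensorFlow. The weights are made-up numbers, whereas a real network learns its weights during training.

```python
import math


def neuron(inputs, weights, bias):
    """One neuron: a weighted sum of its inputs, squashed into the range 0..1."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation


def layer(inputs, weight_rows, biases):
    """A layer is just several neurons looking at the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Made-up weights: two inputs -> two hidden neurons -> one output neuron
hidden = layer([1.0, 0.0], [[0.5, -0.5], [-1.0, 1.0]], [0.0, 0.0])
output = layer(hidden, [[1.0, 1.0]], [0.0])
print(hidden, output)
```

Training a network means nudging those weights, over many iterations, until the outputs fit the patterns in the data.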