##Prelude

Researching modern information theory recently sparked my interest in artificial intelligence. More specifically, I was curious to see if a bot could blend in and gain ‘popularity’ among an online audience. It is a tried-and-true concept, used mainly for selling followers or other boring profiteering. My endeavors, however, have nothing to do with money or attempting to pass the Turing Test. Hacking on an enjoyable project, no matter how useless, is simply satisfying.
A few days ago, a friend introduced me to Yik Yak. For those of you living outside a college town, imagine Twitter and Reddit drunkenly had a one-night stand then proceeded to raise that mistaken offspring until it was the most popular kid in school. Oh, and it’s completely anonymous. Which makes it an ideal playground for my newfound interest in AI.
With the platform in place, I just had to decide on a means of random text generation. Markov chains kept popping up, with applications in a range of fields. Because Markov chains are memoryless (read: each word only lives in the present, so the next word depends only on the current one), they can produce a yak with the coherence of an inebriated college freshman (Yik Yak’s demographic). Naturally, a perfect fit.
Andrey Markov: my muse
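To make the "memoryless" part concrete, here is a minimal word-level Markov chain sketched in plain Python. It is not the code behind the bot, just an illustration of the idea; the function names `build_chain` and `generate` and the tiny sample corpus are made up for this example.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words=20, rng=random):
    """Walk the chain; each step looks only at the current word."""
    word = start
    output = [word]
    # Stop at the length cap or when the current word has no known successor.
    while len(output) < max_words and chain.get(word):
        word = rng.choice(chain[word])
        output.append(word)
    return " ".join(output)


# Made-up sample corpus, purely for illustration.
corpus = [
    "the library is closed again",
    "the moon is far away",
    "the library is far from the dorms",
]
chain = build_chain(corpus)
print(generate(chain, "the", rng=random.Random(0)))
```

Because repeated successors stay in the list, `rng.choice` picks common follow-up words more often than rare ones. The `max_words` cap matters too: a chain can loop back on itself ("the ... from the ...") and would otherwise ramble forever.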
##Setup

For hassle-free Markov chains, I utilized the Marky Markov (and the Funky Sentences) gem. All it takes to get started is a data set and a few lines of code.
Documentation on Yik Yak’s API is currently nonexistent. I resorted to scraping the Yik Yak homepage, which contains 200 of the top yaks. The first iteration had a fairly small data set to work with, but it was broad enough to still be worthwhile.
I look to the fucking library to get your goddamned life together already. Sometimes the moon is 252,088 miles or two cvs receipts.
Even though they made no sense at all, I decided to post a few of these.
Hint: don’t wear your letters when you’re passing a duck inbetween ponds hoping it doesn’t attack you.
The results were interesting. Not quite ‘frat stars passing around an angry duck’ interesting. But still interesting.
It didn’t take long before they were down-voted off the app (apparently that’s possible). Yik Yakers saw right through the sham. The only way I was ever going to get a seat at the popular table was with more data.
With some help from @marycutrali, I pulled a bit of JSON and got another 300 or so yaks to add to the Markov dictionary.
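Growing the dictionary is mostly a matter of feeding more text into the same word-to-next-words table. The sketch below shows that step in plain Python under loud assumptions: the payload shape (a `messages` list whose items have a `message` field) and the helper `add_text` are hypothetical, since the actual response format isn't shown here.

```python
import json
from collections import defaultdict

# Hypothetical payload: the "messages" and "message" keys are assumptions,
# not a documented response format.
payload = """
{"messages": [
  {"message": "the dining hall is out of coffee"},
  {"message": "the library is out of chairs"}
]}
"""


def add_text(chain, text):
    """Record every adjacent word pair from text in the chain."""
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)


chain = defaultdict(list)
add_text(chain, "the library is closed")  # stand-in for the existing data

for item in json.loads(payload)["messages"]:
    add_text(chain, item["message"])  # new yaks extend the same dictionary

print(sorted(set(chain["is"])))
```

Every new yak adds more possible next words for each word, which is why a bigger data set gives the generator more (and stranger) paths to wander down.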
Shit just kept getting weirder.
For the curious, Wernicke’s aphasia (also called receptive aphasia) causes patients to sound awfully similar to this bot. Here is a Monty Python sketch to illustrate the condition.
##Third time’s the charm
Entirely unrelated to sketch comedy: a Python wrapper for the API eventually fell into my search results, allowing me to pull in more relevant local data.
Mark Udall is a UCLA fan
Equally nonsensical, but the local references made it way more trustworthy. Coloradicals’ distaste for Mark Udall is rivaled only by their hatred of UCLA. Neither of which I fully understand.
It’s still a ways away from generating something believable every time, but it’s going down the right path. If you want to check out the codes or contribute some datas, you can do so here.