Using Markov Chains in JavaScript

To see how it’s done, you can view the full source on GitHub. If you want an explanation of what Markov Chains are, how to use them to generate text, and an overview of the data structure and algorithm I used, then keep reading.

Why Use Markov Chains To Generate Text?

Markov Chains are useful for generating text based on a source text. Given a large enough dataset, you can produce text that sounds remarkably similar to something the author would write. This is because, as the algorithm is fed more data, the probability of producing combinations of words the author would actually use increases.

History of Markov Chains

For an in-depth look at Andrey Markov, the math behind some of the interesting properties of Markov Chains, and other applications, you can visit the Wikipedia page on the subject.

TL;DR

  • The Russian mathematician Andrey Markov developed the theory of stochastic (random) processes, which was later extended into Markov Chains and Markov Processes.
  • A Markov Chain is a memory-less random process over a finite number of states.
  • The Markov property: memory-less in our context means that the next state is affected only by the current state; previous states do not affect future states.
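
Written as a formula (this is just the standard statement of the property, not anything specific to this project):

    P(X_{n+1} = x | X_n = x_n, ..., X_0 = x_0) = P(X_{n+1} = x | X_n = x_n)

That is, conditioning on the entire history of the process gives the same distribution over next states as conditioning on the current state alone.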

How Does This Apply To Written Language?

Generally speaking, most English words appear in specific contexts, so if one word commonly follows another, joining sentences at that word will usually still read naturally. For example, we can combine the following sentences by selecting a common word and splicing them together at that point:

  • “I ride my bicycle to work.”
  • “I want to ride the roller-coaster.”

A few possibilities this could result in are:

  • “I ride the roller-coaster.”
  • “I want to ride my bicycle to work.”

However, this can and often does lead to grammatically weird sentences such as: “I ride my bicycle to ride the roller-coaster.” These issues can be reduced by increasing the amount of data used when creating the chain.
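
As a rough sketch of that splicing idea (the spliceAt helper and the hard-coded sentences here are mine for illustration, not part of the project):

// Splice two sentences together at a word they share.
function spliceAt(a, b, word) {
    const wordsA = a.split(" ");
    const wordsB = b.split(" ");
    const i = wordsA.indexOf(word);
    const j = wordsB.indexOf(word);
    if (i === -1 || j === -1) return null; // the word must appear in both
    // Keep the first sentence up to and including the shared word, then
    // continue with whatever follows that word in the second sentence.
    return wordsA.slice(0, i + 1).concat(wordsB.slice(j + 1)).join(" ");
}

console.log(spliceAt("I ride my bicycle to work.", "I want to ride the roller-coaster.", "ride"));
// -> "I ride the roller-coaster."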

Markov Chains are built by repeating this process until a constraint is hit. To see how this works in practice, suppose we parse the source text “A B C D E F B A F C D A C B D E” into tokens (individual words); our possible states would then appear as follows:

Current State | Possible Future States
A             | B, F, C
B             | C, A, D
C             | D, D, B
D             | E, A, E
E             | F
F             | B, C

State “A”, for example, has three possible future states, each with a 1/3 chance of being chosen. State “C” has a 2/3 chance of being followed by “D” and a 1/3 chance of being followed by “B”, because “D” follows “C” twice. “F” has a 1/1 chance of coming after “E”, since it is the only token that appears after “E”.

So, by choosing a random starting state, we can come up with a combination such as: “C D E F B A C B D”.
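
Here is a quick sketch of how that table and walk might look in JavaScript (I’m using a plain object keyed by token for brevity; the data structure the project actually uses is described in the next section):

const tokens = "A B C D E F B A F C D A C B D E".split(" ");

// Map each token to the list of tokens that follow it in the source.
// Repeats are kept, so more frequent successors are more likely to be picked.
const transitions = {};
for (let i = 0; i < tokens.length - 1; i++) {
    (transitions[tokens[i]] = transitions[tokens[i]] || []).push(tokens[i + 1]);
}
// transitions["C"] is ["D", "D", "B"], so "D" is twice as likely as "B".

// Walk the chain from a random starting state.
let state = tokens[Math.floor(Math.random() * tokens.length)];
const output = [state];
for (let step = 0; step < 8; step++) {
    const nextStates = transitions[state];
    if (!nextStates) break; // a token seen only at the very end has no successors
    state = nextStates[Math.floor(Math.random() * nextStates.length)];
    output.push(state);
}
console.log(output.join(" ")); // e.g. "C D E F B A C B D"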

The Data Structure

I’ve chosen to use an array of objects, each of which contains three properties: a key, which identifies the current state; a values property, which holds an array of the possible next states and how often each appears; and a total, which is the sum of those counts.

This is an array of objects that itself contains an array of objects, and it looks like the following:

[
    {
        "key": "token0",
        "values": [
            {
                "next": "nextToken0",
                "count": 1
            },
            {
                "next" nextToken1,
                "count": 2
            }
        ],
        "total": 3
    },
    {
        "key": "token1",
        "values": [
            {
                "next": "nextToken0",
                "count": 2
            }
        ],
        "total": 2
    }
]
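
As a sketch of how this structure might be built from a list of tokens (the buildChain name is mine; the real implementation lives in the repo):

// Build the array-of-objects structure from an ordered list of tokens.
function buildChain(tokens) {
    const chain = [];
    for (let i = 0; i < tokens.length - 1; i++) {
        const key = tokens[i];
        const next = tokens[i + 1];

        let entry = chain.find(e => e.key === key);
        if (!entry) {
            entry = { key: key, values: [], total: 0 };
            chain.push(entry);
        }

        let value = entry.values.find(v => v.next === next);
        if (!value) {
            value = { next: next, count: 0 };
            entry.values.push(value);
        }

        value.count++;
        entry.total++; // total stays equal to the sum of the counts
    }
    return chain;
}

const chain = buildChain("A B C D E F B A F C D A C B D E".split(" "));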

The Algorithm

  1. Select a random entry in the data structure as the current state.
  2. Append the current state’s key property to the generated text.
  3. Set the current state to one of the entries in its values property, chosen at random and weighted by each count.
  4. Repeat from step 2 until you hit the constraint you set.
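
Here is a minimal sketch of that loop (the generate name and the word-count constraint are my choices for illustration; see the repo for the real version):

// Generate text by walking the chain, choosing each next state with
// probability proportional to its count.
function generate(chain, maxWords) {
    // Step 1: pick a random entry as the starting state.
    let entry = chain[Math.floor(Math.random() * chain.length)];
    const words = [];
    while (entry && words.length < maxWords) {
        // Step 2: add the current state's key to the text.
        words.push(entry.key);
        // Step 3: pick the next state, weighted by count.
        let roll = Math.floor(Math.random() * entry.total);
        let next = null;
        for (const value of entry.values) {
            roll -= value.count;
            if (roll < 0) { next = value.next; break; }
        }
        // Step 4: repeat; a key with no entry of its own ends the walk.
        entry = chain.find(e => e.key === next);
    }
    return words.join(" ");
}

console.log(generate(chain, 10)); // using the chain built in the previous section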

To see the actual implementation of this algorithm, check out the repo on GitHub. I use a response from Twitter’s statuses/user_timeline endpoint as the source text to generate tweets.

Wrap Up

The Markov Property is extremely interesting and useful in a wide array of applications, such as text generation and probability simulations. A few notable examples you may be interested in are Dissociated Press and Google’s PageRank; a nice visual explanation can be found at setosa.io.