Random Walks In R

Hi. This page shows how to simulate random walks in R and plot the results with the ggplot2 package.

 


Sections

The Random Walk

Plotting The Random Walk With ggplot2

A More Efficient Method

The Final Random Walk Plotting Function

References

 


The Random Walk

On this page, "random walk" refers to the symmetric random walk. A random walk is a stochastic (random) process built from a sequence of random variables indexed by time.

Suppose we have a random variable X_t at time t where

    \[X_t = +1 \text{ with a probability of 0.5 }\]

and

    \[X_t =  -1  \text{ with a probability of 0.5 }\]

The random walk is the sum of these X_t random variables and can be seen as the position of the random process at time t.

In math notation, this sum can be represented as:

    \[M_t = \displaystyle\sum_{i=1}^{t} X_i\]

with M_0 = 0 (Starting point at 0.)
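
As a quick worked example (the step values here are made up for illustration), suppose the first four steps come out as X_1 = +1, X_2 = -1, X_3 = -1 and X_4 = +1. Adding them up one at a time gives the positions

    \[M_1 = 1, \quad M_2 = 0, \quad M_3 = -1, \quad M_4 = 0\]

Each step has a mean of zero, so the position does too:

    \[E[X_i] = (+1)(0.5) + (-1)(0.5) = 0 \quad \Rightarrow \quad E[M_t] = \displaystyle\sum_{i=1}^{t} E[X_i] = 0\]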

 

 


Plotting The Random Walk With ggplot2

Once you understand how the random walk works, you can simulate it in R and plot the result with the ggplot2 data visualization package.

Assuming ggplot2 is installed into R, loading ggplot2 can be done by typing in:
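
A minimal sketch of the loading step, using R's standard library() call. The commented-out install line is only needed if the package is missing.

```r
# install.packages("ggplot2")  # run once if ggplot2 is not installed yet
library(ggplot2)               # attach ggplot2 for the current R session
```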

 

Creating The Random Walk Function

In R, I create a random walk function which takes a time (in seconds) and returns a vector of the random walk positions at each time t. Each position is:

    \[M_t = \displaystyle\sum_{i=1}^{t} X_i\]

with M_0 = 0 (Starting point at 0.)

Because each position is a running sum, the function builds it up with a for loop.

To model the random variable, which is +1 or -1 with a probability of 0.5 each, the sample() function is used.

Once this function is written, it can be used to plot a random walk path.
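
Here is one way such a function could look. This is a sketch under my own assumptions, not necessarily the original code: the name random_walk, the argument n_steps, and the choice to return the positions for times 0 through n_steps (so M_0 = 0 comes first) are all illustrative.

```r
# Hypothetical helper: simulates a symmetric random walk with a for loop.
# Returns a vector of length n_steps + 1 holding M_0, M_1, ..., M_n.
random_walk <- function(n_steps) {
  positions <- numeric(n_steps + 1)       # positions[1] is M_0 = 0
  for (t in seq_len(n_steps)) {
    step <- sample(c(-1, 1), size = 1)    # +1 or -1, each with probability 0.5
    positions[t + 1] <- positions[t] + step
  }
  positions
}

set.seed(123)                             # arbitrary seed, only for reproducibility
walk <- random_walk(10)
print(walk)
```

Every consecutive pair of entries in walk differs by exactly 1, which is a quick sanity check that each step really is +1 or -1.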

 

Once the coding for the random walk function is done, the ggplot2 code portion is not that bad. Here is the code and plot.
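
A minimal sketch of what the plotting code could look like. The loop-based helper is defined again so the block runs on its own. The data frame columns (time, position), the title, the axis labels and the styling are my assumptions rather than the exact original code.

```r
library(ggplot2)

# Hypothetical loop-based helper (same idea as described above)
random_walk <- function(n_steps) {
  positions <- numeric(n_steps + 1)       # start at M_0 = 0
  for (t in seq_len(n_steps)) {
    positions[t + 1] <- positions[t] + sample(c(-1, 1), size = 1)
  }
  positions
}

set.seed(42)                              # arbitrary seed for a repeatable path
n_steps <- 100
walk_df <- data.frame(time = 0:n_steps, position = random_walk(n_steps))

walk_plot <- ggplot(walk_df, aes(x = time, y = position)) +
  geom_line() +                                                     # the path itself
  geom_hline(yintercept = 0, linetype = "dotted", colour = "red") + # mean line at 0
  labs(title = "Symmetric Random Walk", x = "Time (seconds)", y = "Position") +
  theme(plot.title = element_text(hjust = 0.5))                     # centre the title
print(walk_plot)
```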

The main pieces in the above code lie in this portion:
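
My best guess at that core portion is the ggplot() call plus geom_line(), sketched below. The small data frame holds made-up positions, used only so the snippet runs by itself.

```r
library(ggplot2)

# Made-up positions for times 0..5, only to demonstrate the two core layers
walk_df <- data.frame(time = 0:5, position = c(0, 1, 0, -1, 0, 1))

core_plot <- ggplot(walk_df, aes(x = time, y = position)) +  # map columns to the axes
  geom_line()                                                # connect the points as a path
print(core_plot)
```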

The code after geom_line() sets the labels and fonts and draws the red horizontal dotted line at position 0. This red dotted line marks the mean of the symmetric random walk, since every step has an expected value of 0.


A More Efficient Method

Instead of having the for loop portion in the random walk function above, you can use the cumsum() function (short for cumulative sum).

The cumsum() function returns a vector of running totals: its t-th element is the sum of the first t inputs, which is exactly M_t in the sum/Sigma notation mentioned above.
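
To see what cumsum() does, here is a tiny example on a hand-picked vector of steps (the values are made up for illustration):

```r
steps <- c(1, -1, -1, 1, 1)      # five illustrative steps
running_total <- cumsum(steps)   # position after each step
print(running_total)
# [1]  1  0 -1  0  1
```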

The for loop portion of the function can be replaced by a single call to cumsum() applied to the vector of sampled steps.
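
The swap can be sketched as follows. Both versions are fed the same sampled steps so they can be compared directly. The variable names and the seed are my own choices.

```r
set.seed(2024)                                   # arbitrary seed
n_steps <- 20
steps <- sample(c(-1, 1), size = n_steps, replace = TRUE)

# Loop version: accumulate the steps one at a time
loop_positions <- numeric(n_steps)
running <- 0
for (t in seq_len(n_steps)) {
  running <- running + steps[t]
  loop_positions[t] <- running
}

# Vectorised version: one call does the same accumulation
cumsum_positions <- cumsum(steps)

print(cumsum_positions)
stopifnot(identical(loop_positions, cumsum_positions))  # both approaches agree
```

Besides being shorter, the cumsum() version draws all the steps with one sample() call instead of calling it once per iteration.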

 

Here is the full code and output with the cumsum() function.
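
A sketch of how the full cumsum() version might look, with a hypothetical helper name (random_walk_cumsum) and assumed column names and labels. No seed is set, so each run draws a new path.

```r
library(ggplot2)

# Hypothetical cumsum()-based helper: returns M_0, M_1, ..., M_n
random_walk_cumsum <- function(n_steps) {
  steps <- sample(c(-1, 1), size = n_steps, replace = TRUE)  # all steps at once
  c(0, cumsum(steps))                                        # prepend M_0 = 0
}

n_steps <- 100
walk_df <- data.frame(time = 0:n_steps, position = random_walk_cumsum(n_steps))

walk_plot <- ggplot(walk_df, aes(x = time, y = position)) +
  geom_line() +
  geom_hline(yintercept = 0, linetype = "dotted", colour = "red") +  # mean line
  labs(x = "Time (seconds)", y = "Position")
print(walk_plot)
```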

Notice that the path differs from the earlier one because each run produces a different realization of the random walk.

 


The Final Random Walk Plotting Function

This final random walk plotting function combines items from the previous two sections. I have included the full code and one example.
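
One possible shape for such a function is sketched below. The name plot_random_walk, its arguments and the styling choices are assumptions on my part. It simulates the walk with cumsum() and returns the ggplot object so it can be printed or customized further.

```r
library(ggplot2)

# Hypothetical all-in-one function: simulate a symmetric random walk and plot it
plot_random_walk <- function(n_steps, title = "Symmetric Random Walk") {
  steps <- sample(c(-1, 1), size = n_steps, replace = TRUE)   # +1 / -1 steps
  walk_df <- data.frame(time = 0:n_steps,
                        position = c(0, cumsum(steps)))       # M_0 = 0 first
  ggplot(walk_df, aes(x = time, y = position)) +
    geom_line() +
    geom_hline(yintercept = 0, linetype = "dotted", colour = "red") +
    labs(title = title, x = "Time (seconds)", y = "Position") +
    theme(plot.title = element_text(hjust = 0.5))
}

# Example: a 500-step walk
example_plot <- plot_random_walk(500)
print(example_plot)
```

Returning the plot object rather than printing it inside the function leaves room to add more layers afterwards, for example example_plot + theme_minimal().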

 


References

  • R Graphics Cookbook By Winston Chang
  • https://stackoverflow.com/questions/21991130/simulating-a-random-walk
