# Creating A Mode Function In R

Hello. This is a statistics post on calculating the mode in R. The mode is the most frequent value in a dataset. The statistical programming language R has built-in functions for the mean and median (mean() and median()) but not for the statistical mode; the built-in mode() function reports an object's storage mode instead.

## The R Function Pieces

It is easy to see that in a sequence of numbers such as x = {1, 3, 3, 5, 9, 11, 2, 4}, the number 3 is the most frequent value in that set. We will use this sequence as an example.

1) The table function in R will give a table of values and their frequencies:

Inputting table(c(1, 3, 3, 5, 9, 11, 2, 4)) in R prints a two-row table. The top row lists the distinct values and the bottom row lists their corresponding frequencies.
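To make the two rows concrete, here is a small sketch of the call on the example sequence. The variable names x and freq are only illustrative choices.

```r
# Example vector from the text
x <- c(1, 3, 3, 5, 9, 11, 2, 4)

# Count how often each distinct value occurs
freq <- table(x)
print(freq)
# x
#  1  2  3  4  5  9 11
#  1  1  2  1  1  1  1
```

The value 3 is the only one with a count of 2, which matches what we saw by eye.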

2) We then sort this table from highest to lowest frequency (decreasing) using R’s sort function. By default it sorts from lowest to highest; supplying the argument decreasing = TRUE reverses the order.

With x = c(1, 3, 3, 5, 9, 11, 2, 4) stored as a vector in R, the sort call that we will use is something like sort(table(x), decreasing = TRUE).
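As a sketch of this sorting step, the snippet below applies sort with decreasing = TRUE to the frequency table of the example vector. The name sorted_freq is an illustrative choice.

```r
# Example vector from the text
x <- c(1, 3, 3, 5, 9, 11, 2, 4)

# Sort the frequency table so the largest count comes first
sorted_freq <- sort(table(x), decreasing = TRUE)
print(sorted_freq)

# The leading entry is the value 3 with a count of 2;
# the remaining values all have a count of 1.
```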

3) We have now sorted from highest to lowest. Notice that the mode or value with the highest frequency is just the first element (of index 1).

We can just extract the first element of the sorted table and read off its label, for example with names(sort(table(x), decreasing = TRUE))[1].
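One way to sketch the extraction step is shown below. It assumes we want the label of the first entry rather than its count, which is why names() is used. The variable names are illustrative.

```r
# Example vector from the text
x <- c(1, 3, 3, 5, 9, 11, 2, 4)
sorted_freq <- sort(table(x), decreasing = TRUE)

# The first entry holds the most frequent value and its count
first_entry <- sorted_freq[1]

# names() returns the value itself, as a character string
mode_value <- names(sorted_freq)[1]
print(mode_value)  # "3"
```

Note that the result is the string "3" rather than the number 3, because table labels are stored as text.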

## Putting It All Together: The Mode Function in R

The resulting mode function is (# means a comment):

or alternatively

Defined as a function in R, the first version looks like this:

I call the mode function getMode. You can use a different name.
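A minimal sketch of such a function, assembled from the three steps described earlier, might look like the following. Its exact form is an assumption and may differ from the listing this post originally showed; only the name getMode comes from the text.

```r
# Sketch of a mode function built from table(), sort() and names()
getMode <- function(x) {
  freq <- table(x)                         # frequency of each distinct value
  sorted <- sort(freq, decreasing = TRUE)  # most frequent value first
  names(sorted)[1]                         # label of the top entry (character)
}

getMode(c(1, 3, 3, 5, 9, 11, 2, 4))  # "3"
```

If several values are tied for the highest frequency, this sketch returns only one of them.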

## Examples in R

a) x <- c(1, 3, 3, 5, 9, 11, 2, 4)

b) foods <- c("pizza", "salad", "pasta", "pasta", "sushi", "KFC", "pasta")

c) samplePoissons <- rpois(100, lambda = 1) # Generating 100 random numbers from a Poisson distribution with lambda = 1

R Code:

The results were “3”, “pasta” and “0” respectively. (Your output for example c) may vary, as the numbers are randomly generated.)
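The calls behind these results can be sketched as follows. The getMode definition here is an assumed reconstruction from the three steps described earlier, repeated so that the snippet runs on its own.

```r
# Assumed sketch of getMode, built from table(), sort() and names()
getMode <- function(x) {
  names(sort(table(x), decreasing = TRUE))[1]
}

# a) numeric vector
x <- c(1, 3, 3, 5, 9, 11, 2, 4)
result_a <- getMode(x)        # "3"

# b) character vector
foods <- c("pizza", "salad", "pasta", "pasta", "sushi", "KFC", "pasta")
result_b <- getMode(foods)    # "pasta"

# c) random Poisson draws; the result depends on the draws
samplePoissons <- rpois(100, lambda = 1)
result_c <- getMode(samplePoissons)

print(c(result_a, result_b, result_c))
```

For example c), calling set.seed() before rpois() would make the draws repeatable between runs.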

## Notes

I have not tested this mode function with missing values represented by NA in R.
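As a starting point for such a test, the sketch below relies on the documented default of table(), which leaves NA out of the counts unless the useNA argument is set. The getMode definition is again an assumed reconstruction, and the variable names are illustrative.

```r
# Assumed sketch of getMode, built from table(), sort() and names()
getMode <- function(x) {
  names(sort(table(x), decreasing = TRUE))[1]
}

# NA appears twice, but table() ignores it by default
with_na <- c(1, NA, NA)
default_result <- getMode(with_na)   # "1"

# Counting NA as a value requires useNA; the top label is then NA
counted <- sort(table(with_na, useNA = "ifany"), decreasing = TRUE)
na_result <- names(counted)[1]       # NA

print(default_result)
print(na_result)
```

So with this sketch, missing values are silently skipped rather than reported as the mode.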

This alternative function for getting the mode, from here, has the following form:

This works as well but it is not as intuitive. It multiplies every frequency in the table by (-1) and sorts the result from lowest to highest (ascending order). It then takes the first element, which has the most negative frequency and therefore belongs to the most frequent value.
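Based on that description, the alternative can be sketched as below. The function name getModeAlt is hypothetical, and the body is a reconstruction from the description rather than a copy of the original listing.

```r
# Hypothetical name; negate the counts, sort ascending, take the first label
getModeAlt <- function(x) {
  negated <- -table(x)     # every frequency multiplied by -1
  sorted <- sort(negated)  # ascending, so the most negative count is first
  names(sorted)[1]         # label of the most frequent value
}

getModeAlt(c(1, 3, 3, 5, 9, 11, 2, 4))  # "3"
```

Both approaches put the most frequent value first; the version with decreasing = TRUE states that intent more directly.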

The featured image is from https://www.rstudio.com/wp-content/uploads/2014/06/RStudio-Ball.png.