Working With A Two-By-Two Table In R

Hello everyone. In this post, I work with a two-by-two table in the statistical programming language R.





Creating Sample Data 

I start by creating sample (fake) data in which males and females are surveyed on whether or not they like sushi.
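A minimal sketch of what such sample data could look like in long format. The counts and the names sushi_df, gender, likes_sushi, and count are assumptions chosen for illustration, not the numbers from the original survey data.

```r
# Hypothetical survey counts in long format (one row per gender/answer pair).
# All values here are made up for illustration.
sushi_df <- data.frame(
  gender      = c("Male", "Male", "Female", "Female"),
  likes_sushi = c("Yes", "No", "Yes", "No"),
  count       = c(52, 48, 55, 45)
)

print(sushi_df)
```

Each row holds one combination of gender and answer, which is the shape ggplot2 and glm() work with most easily.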

From the survey data, you can easily create bar graphs with the ggplot2 package in R.
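As a rough sketch, a stacked bar graph could be drawn as follows. The data frame sushi_df and its counts are hypothetical, and the labels are placeholders.

```r
library(ggplot2)

# Hypothetical survey counts (made up for illustration).
sushi_df <- data.frame(
  gender      = c("Male", "Male", "Female", "Female"),
  likes_sushi = c("Yes", "No", "Yes", "No"),
  count       = c(52, 48, 55, 45)
)

# geom_col() uses the count column as bar heights and stacks bars by default.
p_stacked <- ggplot(sushi_df, aes(x = gender, y = count, fill = likes_sushi)) +
  geom_col() +
  labs(x = "Gender", y = "Count", fill = "Likes sushi?",
       title = "Sushi Survey Results")

print(p_stacked)
```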


The above plot is a stacked bar graph. An alternative is a side-by-side bar graph.

I have also changed the colour palette to mix things up.
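A side-by-side version might look like the sketch below. The counts are hypothetical, and the "Set2" ColorBrewer palette is just one possible choice, not necessarily the palette used in the original plot.

```r
library(ggplot2)

# Hypothetical survey counts (made up for illustration).
sushi_df <- data.frame(
  gender      = c("Male", "Male", "Female", "Female"),
  likes_sushi = c("Yes", "No", "Yes", "No"),
  count       = c(52, 48, 55, 45)
)

# position = "dodge" places the bars next to each other instead of stacking them.
p_dodged <- ggplot(sushi_df, aes(x = gender, y = count, fill = likes_sushi)) +
  geom_col(position = "dodge") +
  scale_fill_brewer(palette = "Set2") +  # assumed palette, swap in any other
  labs(x = "Gender", y = "Count", fill = "Likes sushi?",
       title = "Sushi Survey Results")

print(p_dodged)
```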


A Two By Two Contingency Table

Instead of the long format used at the beginning, you can display the data as a two-by-two contingency table.
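One way to build such a table from long-format counts is xtabs(). This is a sketch with hypothetical counts, and the original post may have built its table differently.

```r
# Hypothetical survey counts (made up for illustration).
sushi_df <- data.frame(
  gender      = c("Male", "Male", "Female", "Female"),
  likes_sushi = c("Yes", "No", "Yes", "No"),
  count       = c(52, 48, 55, 45)
)

# xtabs() sums the count column over each gender/answer combination,
# giving gender in the rows and the answer in the columns.
sushi_tab <- xtabs(count ~ gender + likes_sushi, data = sushi_df)

print(sushi_tab)
```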

From this contingency table, you can create a mosaic plot.
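A minimal sketch using mosaicplot() from base R, again with hypothetical counts and a placeholder title:

```r
# Hypothetical survey counts (made up for illustration).
sushi_df <- data.frame(
  gender      = c("Male", "Male", "Female", "Female"),
  likes_sushi = c("Yes", "No", "Yes", "No"),
  count       = c(52, 48, 55, 45)
)
sushi_tab <- xtabs(count ~ gender + likes_sushi, data = sushi_df)

# Each tile's area is proportional to the count in that cell of the table.
mosaicplot(sushi_tab, main = "Sushi Survey Results", color = TRUE)
```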

Since the counts are really close to each other, it is hard to see a difference between the tile sizes.

An alternative mosaic plot comes from the vcd package in R.
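A sketch of the vcd version, assuming the package is installed and using hypothetical counts. The shade and legend settings are one possible choice, not necessarily the ones used for the original plot.

```r
library(vcd)

# Hypothetical survey counts (made up for illustration).
sushi_df <- data.frame(
  gender      = c("Male", "Male", "Female", "Female"),
  likes_sushi = c("Yes", "No", "Yes", "No"),
  count       = c(52, 48, 55, 45)
)
sushi_tab <- xtabs(count ~ gender + likes_sushi, data = sushi_df)

# shade = TRUE colours the tiles by their residuals under independence.
mosaic(sushi_tab, shade = TRUE, legend = TRUE)
```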

Other than the colours and labels, this mosaic plot does not look much different. Also, I have not yet figured out how to adjust the label titles.


Poisson Regression

The counts are non-negative whole numbers. Ordinary linear regression assumes a continuous response that can take any real value, so it is a poor fit for the cell counts of a two-by-two table. For count data like this, a Poisson regression model is used instead.

In R, Poisson regression is fit with the glm() function, where glm stands for generalized linear model. Make sure to specify family = "poisson" in the glm() call.
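A minimal sketch of such a fit on hypothetical counts. The main-effects formula count ~ gender + likes_sushi is an assumption, since the exact model used in the original analysis is not shown here.

```r
# Hypothetical survey counts (made up for illustration).
sushi_df <- data.frame(
  gender      = c("Male", "Male", "Female", "Female"),
  likes_sushi = c("Yes", "No", "Yes", "No"),
  count       = c(52, 48, 55, 45)
)

# Poisson regression with a log link (the default for this family).
# Assumed model: main effects only, i.e. gender and answer are independent.
fit <- glm(count ~ gender + likes_sushi, family = "poisson", data = sushi_df)

summary(fit)
```

Adding the interaction term gender:likes_sushi would give a saturated model that reproduces the four counts exactly, so the main-effects model is the one that leaves something to test.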




