Reading In Various File Types Into R

Hi there. On this page, I show how to read various file types into R. The code here is experimental, as I was testing things out.


Note that the first three sections deal with R’s haven package.

I have a separate page where I play around with loading in a .JSON file into R.


1) Reading In STATA Files Into R

Not every workplace uses R. Some places still use STATA.

I once used STATA for a survival analysis course where the instructor was familiar with STATA and not R. A classmate told me that STATA is used a lot in econometrics (the application of statistics to economic data).

There are two ways to load a STATA file into R: read_stata() or read_dta(). Both come from the haven package, and read_stata() is an alias for read_dta(). Both accept a URL as well as a local file path.
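As a minimal sketch of the two functions, the code below reads the small iris.dta example file that is bundled with the haven package, so it runs without a network connection. The object names stata_path, stata_a and stata_b are my own, and the URL in the final comment is only a placeholder.

```r
library(haven)

# Example .dta file bundled with haven (stands in for your own file or URL).
stata_path <- system.file("examples", "iris.dta", package = "haven")

# Both calls read the same file; read_stata() is an alias for read_dta().
stata_a <- read_dta(stata_path)
stata_b <- read_stata(stata_path)

head(stata_a)

# A URL string goes in the same position, for example:
# read_dta("https://example.com/some_file.dta")  # placeholder URL, not a real file
```

Both objects come back as tibbles, so the usual dplyr and tidyr tools work on them right away.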


2) Reading In SPSS Files Into R

The main function for reading in an SPSS file is the read_spss() function from the haven package in R.
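Here is a small sketch of read_spss() in action, assuming the iris.sav example file that is bundled with haven. The names spss_path and spss_data are my own. As far as I know, read_spss() picks read_sav() or read_por() based on the file extension.

```r
library(haven)

# Example .sav file bundled with haven (stands in for your own SPSS file).
spss_path <- system.file("examples", "iris.sav", package = "haven")

# read_spss() chooses the right reader from the file extension.
spss_data <- read_spss(spss_path)

head(spss_data)
```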


3) SAS Files Into R

Loading in a SAS file into R is pretty straightforward as well. The key function here is the read_sas() function.
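A minimal sketch of read_sas(), again assuming an example file bundled with haven (iris.sas7bdat). The names sas_path and sas_data are my own.

```r
library(haven)

# Example .sas7bdat file bundled with haven (stands in for your own SAS file).
sas_path <- system.file("examples", "iris.sas7bdat", package = "haven")

sas_data <- read_sas(sas_path)

head(sas_data)
```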



4) Excel Files Into R

From my experimentation, I have found loading Excel files into R somewhat difficult. As far as I can tell, there is no way to load an .xlsx file directly from a URL. The solution I propose is to save the file into a folder first. Then you set that folder as the working directory in RStudio, using the Set As Working Directory option in the More drop-down menu.


Once you have set the working directory, you can load the Excel file into R. Be mindful of spelling and punctuation in the file name.
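The steps above can be sketched as follows. This assumes the readxl package, which may not be the package used originally. The datasets.xlsx workbook that ships with readxl stands in for the saved file, and setwd() does in code what Set As Working Directory does in RStudio. The names xlsx_path, old_wd and excel_data are my own.

```r
library(readxl)

# Sample workbook bundled with readxl (stands in for the file you saved).
xlsx_path <- readxl_example("datasets.xlsx")

# Same effect as "Set As Working Directory" on that folder in RStudio.
# setwd() returns the previous working directory, which is kept for later.
old_wd <- setwd(dirname(xlsx_path))

# With the working directory set, the bare file name is enough.
excel_data <- read_excel("datasets.xlsx")

# Restore the previous working directory.
setwd(old_wd)

head(excel_data)
```

If you would rather skip the manual save, one possible workaround is to call download.file() with mode = "wb" and a tempfile() destination, and then pass that temporary path to read_excel().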


5) Loading In .csv Files Into R

Loading .csv files is not very difficult. Make sure to include header = TRUE if the first row contains the column titles, and set the sep = argument to the delimiter used in the file.
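A minimal sketch with base R's read.csv() is below. The sample rows are made up for illustration and written to a temporary file, so the code runs on its own. Note that read.csv() already defaults to header = TRUE and sep = ",", so spelling them out mainly matters with read.table() or with other delimiters.

```r
# Hypothetical sample data, written to a temporary .csv file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Ann,31", "Bob,27"), csv_path)

# header = TRUE: the first row holds the column titles.
# sep = ",": the values are separated by commas.
csv_data <- read.csv(csv_path, header = TRUE, sep = ",")

str(csv_data)
```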




6) Loading In .data Files Into R

From time to time I do play around with some datasets from the UCI Machine Learning Repository website. Some of them have this .data file extension. For this case, the read.table() function is used.

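It appears that balloons_data comes in as a single column, with each value containing four commas (that is, five comma-separated fields). This happens because read.table() splits on whitespace by default. The single column can be split with the separate() function from the tidyr package in R.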
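The single-column problem and two ways around it can be sketched as below. The three rows are made-up lines in the style of the balloons dataset, and the column names are my own labels, not names taken from the file.

```r
library(tidyr)

# Hypothetical rows in the style of a comma-separated .data file.
data_path <- tempfile(fileext = ".data")
writeLines(c("YELLOW,SMALL,STRETCH,ADULT,T",
             "YELLOW,SMALL,DIP,CHILD,F",
             "PURPLE,LARGE,STRETCH,ADULT,T"), data_path)

# Default read.table() splits on whitespace, so each line lands in one column (V1).
balloons_data <- read.table(data_path, header = FALSE, stringsAsFactors = FALSE)

# Assumed column labels for illustration.
col_names <- c("color", "size", "act", "age", "inflated")

# Option 1: split the single column on commas with tidyr::separate().
balloons_split <- separate(balloons_data, V1, into = col_names, sep = ",")

# Option 2: tell read.table() about the comma when reading the file.
balloons_direct <- read.table(data_path, header = FALSE, sep = ",",
                              col.names = col_names)
```

With option 1 every column stays as character. With option 2, read.table() guesses the column types, so a column of T and F values is read as logical.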




