Export all datasets in an R package
Lots of R packages come with datasets that are easily loaded into R using the data()
function.
I’ve recently been reading An Introduction to Statistical Learning by James, Witten, Hastie, and Tibshirani (available for free from their website), and the book has exercises that readers can do in R.
They have an R package, called ISLR2
, that includes several of the datasets used in the book and exercises.
While these data are easy to load into R, I wanted to have a way that I could use them anywhere.
So, I wrote this R script to save each dataset into a .csv, and put a repo with all of the .csv files on GitHub.
# File: save_ISLR_datasets.R
library(ISLR2)
library(readr)
# Load list of datasets
datasets <- data(package = "ISLR2")
# Pull out vector of dataset names
dataset_names <- datasets$results[,"Item"]
# Create directory to save the .csv files
dir_for_data <- "data"
dir.create(dir_for_data)
# Function to save a dataset to a csv given its name
save_dataset_to_csv <- function(dataset_name){
data(list = dataset_name)
dataset_to_save <- get(dataset_name)
if(!is.data.frame(dataset_to_save)){
return(paste0("Failed to save ", dataset_name))
}
write_csv(dataset_to_save, file = paste0(dir_for_data, "/", dataset_name, ".csv"))
return(paste0("Saved ", dataset_name))
}
sapply(dataset_names, save_dataset_to_csv)
At present, it only saves datasets that are data.frame
s.
This code could be easily modified to save datasets from any package or save them to formats other than .csv.