ST 558 Project 2
Kaylee Frazier and Rebecca Voelker 10/31/2021
This repo contains an analysis of the online news popularity data set. The data are split by data channel, producing six reports, one per channel. In each report, we summarize the data and then try to predict the number of shares using predictive modeling.
This is the list of the packages used.
tidyverse: useful features for data science
caret: set of functions that help to streamline the process for creating predictive models
knitr: a markdown friendly way to display tables
ggplot2: a package for making graphs and visualizations
randomForest: helps create random forest models
readr: a fast and easy way to read in rectangular data
dplyr: aids with data manipulation
rmarkdown: adds enhancements to R Markdown
shiny: makes it easy to create interactive webpages from R

These are links to the generated analyses.
# Render one report per data channel from ST558_Project2.Rmd.
# rawDataNew is assumed to be loaded already.
# load packages: tibble() comes with the tidyverse, render() with rmarkdown
library(tidyverse)
library(rmarkdown)
# get unique channel names
channelIDs <- unique(rawDataNew$data_channel)
# create file names
output_file <- paste0(channelIDs, ".md")
# create a list for each channel with just the channel name parameter
params <- lapply(channelIDs, FUN = function(x){list(data_channel = x)})
# put into a data frame
reports <- tibble(output_file, params)
# need to use x[[1]] to get at elements since tibble doesn't simplify
apply(reports, MARGIN = 1,
      FUN = function(x){
        render(input = "ST558_Project2.Rmd", output_file = x[[1]], params = x[[2]])
      })
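
To check what the automation above would do before rendering anything, the same file names and parameter lists can be built in a dry run. This is only a sketch using base R: the channel names below are hypothetical stand-ins for `unique(rawDataNew$data_channel)`, and `render_plan` is an illustrative name that does not appear in the project.

```r
# Hypothetical channel names, standing in for unique(rawDataNew$data_channel)
channelIDs <- c("lifestyle", "entertainment", "bus", "socmed", "tech", "world")

# One output file and one parameter list per channel, as in the code above
output_file <- paste0(channelIDs, ".md")
params <- lapply(channelIDs, function(x) list(data_channel = x))

# Pair each file name with its parameters; nothing is rendered here
render_plan <- Map(
  function(file, p) {
    list(input = "ST558_Project2.Rmd", output_file = file, params = p)
  },
  output_file, params
)

# Inspect the planned calls
for (plan in render_plan) {
  cat(plan$output_file, "<- data_channel =", plan$params$data_channel, "\n")
}

# To render for real (requires rmarkdown and the .Rmd file):
# for (plan in render_plan) do.call(rmarkdown::render, plan)
```

Each element of `render_plan` holds the arguments for one `render()` call, so a wrong file name or parameter can be spotted before any report is built.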