Assignment 2
This assignment will use the same data that you will use in the retail project later in the semester. Each student will use a different time series, selected using their student ID number as follows.
library(fpp3)
get_my_data <- function(student_id) {
  set.seed(student_id)
  all_data <- readr::read_rds("https://bit.ly/monashretaildata")
  while(TRUE) {
    retail <- filter(all_data, `Series ID` == sample(`Series ID`, 1))
    if(!any(is.na(fill_gaps(retail)$Turnover))) return(retail)
  }
}
# Replace the argument with your student ID
retail <- get_my_data(12345678)
- Using a test set of 2019–2022, fit an automatically chosen ETS model and three benchmark methods to the training data. Which gives the best forecasts on the test set, based on RMSE?
- Check the residuals from the best model using an ACF plot and a Ljung-Box test. Do the residuals appear to be white noise?
- Now use time-series cross-validation with a minimum sample size of 15 years, a step size of 1 year, and a forecast horizon of 5 years. Calculate the RMSE of the results. Does it change the conclusion you reach based on the test set?
- Which of these two methods of evaluating accuracy is more reliable? Why?
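The workflow in the steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a model answer: it uses one series from `aus_retail` (shipped with fpp3 via tsibbledata, ending in December 2018) rather than your assigned series, so the test set here is 2015–2018 instead of 2019–2022. The object names (`demo`, `train`, `fit`, `test_acc`, `lb`, `cv_acc`), the choice of the three benchmarks, and the use of the ETS model in the residual checks are all placeholders chosen for the example.

```r
library(fpp3)

# ASSUMPTION: illustrative series only; replace with your own `retail` data.
demo <- aus_retail |>
  filter(State == "Victoria", Industry == "Food retailing")

# ASSUMPTION: aus_retail ends in 2018, so the last four years form the test set.
train <- demo |> filter(year(Month) <= 2014)

# Fit an automatically selected ETS model and three benchmark methods.
fit <- train |>
  model(
    ets    = ETS(Turnover),
    naive  = NAIVE(Turnover),
    snaive = SNAIVE(Turnover),
    drift  = RW(Turnover ~ drift())
  )

# Test-set accuracy: forecast over the held-out period, compare with actuals.
test_acc <- fit |>
  forecast(h = "4 years") |>
  accuracy(demo) |>
  select(.model, RMSE) |>
  arrange(RMSE)
test_acc

# Residual checks, shown for the ETS model purely as an example;
# use whichever model performed best on your test set.
resids <- fit |> select(ets) |> augment()
resids |> ACF(.innov) |> autoplot()
lb <- resids |> features(.innov, ljung_box, lag = 24)
lb

# Time-series cross-validation: training sets start at 15 years (180 months)
# and grow by one year (12 months); forecasts extend 5 years ahead.
cv_acc <- demo |>
  stretch_tsibble(.init = 15 * 12, .step = 12) |>
  model(
    ets    = ETS(Turnover),
    naive  = NAIVE(Turnover),
    snaive = SNAIVE(Turnover),
    drift  = RW(Turnover ~ drift())
  ) |>
  forecast(h = "5 years") |>
  accuracy(demo) |>
  select(.model, RMSE) |>
  arrange(RMSE)
cv_acc
```

The later cross-validation folds forecast beyond the end of the data, so `accuracy()` warns that the future data are incomplete and treats the missing observations as NA; the RMSE is then computed from the forecasts that do have matching actuals. The lag of 24 in the Ljung-Box test is an assumed choice for monthly seasonal data and may need adjusting.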
Submit a Quarto (qmd) file which carries out the above analysis. You need to submit one file which implements all steps above. You may use this file as a starting point.
To receive full marks, the qmd file must compile without errors.
Due: 17 April 2025