Assignment 3

This assignment will use the same data that you will use in the retail project later in the semester. Each student will use a different time series, selected using their student ID number as follows.

library(fpp3)
get_my_data <- function(student_id) {
  set.seed(student_id)
  all_data <- readr::read_rds("https://bit.ly/monashretaildata")
  while(TRUE) {
    retail <- filter(all_data, `Series ID` == sample(`Series ID`, 1))
    if(!any(is.na(fill_gaps(retail)$Turnover))) return(retail)
  }
}
# Replace the argument with your student ID
retail <- get_my_data(12345678)

Use a training set up to and including 2018.

What transformations (Box-Cox and/or differencing) would be required to make the data stationary? You should use a unit-root test as part of the discussion.
Use a plot of the ACF and PACF of the (possibly differenced) data to determine two plausible ARIMA models for this data set.
Fit both models, along with an automatically chosen model, and produce forecasts for 2019–2022.
Which model is best based on AIC? Which model is best based on the test set RMSE? Which do you think is best to use for future forecasts? Why?
Check the residuals from your preferred model, using an ACF plot and a Ljung-Box test. Do the residuals appear to be white noise?

Submit a Quarto (qmd) file which carries out the above analysis. You need to submit one file which implements all steps above. You may use this file as a starting point.

Due: 20 May 2025
Submit