Exercise Week 12: Solutions

fpp3 10.7, Ex 2

Repeat Exercise 4 from Section 7.10, but this time adding in ARIMA errors to address the autocorrelations in the residuals.

  1. How much difference does the ARIMA error process make to the regression coefficients?
fit <- souvenirs |>
  mutate(festival = month(Month) == 3 & year(Month) != 1987) |>
  model(
    reg = TSLM(log(Sales) ~ trend() + season() + festival),
    dynreg = ARIMA(log(Sales) ~ trend() + season() + festival)
  )
tidy(fit) |> print(n=50)
# A tibble: 31 × 6
   .model term           estimate std.error statistic  p.value
   <chr>  <chr>             <dbl>     <dbl>     <dbl>    <dbl>
 1 reg    (Intercept)      7.62    0.0742     103.    4.67e-78
 2 reg    trend()          0.0220  0.000827    26.6   2.32e-38
 3 reg    season()year2    0.251   0.0957       2.63  1.06e- 2
 4 reg    season()year3    0.266   0.193        1.38  1.73e- 1
 5 reg    season()year4    0.384   0.0957       4.01  1.48e- 4
 6 reg    season()year5    0.409   0.0957       4.28  5.88e- 5
 7 reg    season()year6    0.449   0.0958       4.69  1.33e- 5
 8 reg    season()year7    0.610   0.0958       6.37  1.71e- 8
 9 reg    season()year8    0.588   0.0959       6.13  4.53e- 8
10 reg    season()year9    0.669   0.0959       6.98  1.36e- 9
11 reg    season()year10   0.747   0.0960       7.79  4.48e-11
12 reg    season()year11   1.21    0.0960      12.6   1.29e-19
13 reg    season()year12   1.96    0.0961      20.4   3.39e-31
14 reg    festivalTRUE     0.502   0.196        2.55  1.29e- 2
15 dynreg ar1              0.556   0.179        3.11  2.53e- 3
16 dynreg ma1             -0.129   0.192       -0.670 5.05e- 1
17 dynreg ma2              0.340   0.114        2.99  3.68e- 3
18 dynreg trend()          0.0226  0.00150     15.1   1.17e-25
19 dynreg season()year2    0.252   0.0574       4.38  3.40e- 5
20 dynreg season()year3    0.297   0.118        2.51  1.42e- 2
21 dynreg season()year4    0.377   0.0729       5.17  1.56e- 6
22 dynreg season()year5    0.400   0.0789       5.07  2.30e- 6
23 dynreg season()year6    0.438   0.0817       5.36  7.19e- 7
24 dynreg season()year7    0.598   0.0827       7.23  2.04e-10
25 dynreg season()year8    0.573   0.0821       6.98  6.45e-10
26 dynreg season()year9    0.651   0.0799       8.16  2.94e-12
27 dynreg season()year10   0.725   0.0746       9.71  2.18e-15
28 dynreg season()year11   1.18    0.0629      18.7   1.14e-31
29 dynreg season()year12   1.93    0.0599      32.2   5.41e-49
30 dynreg festivalTRUE     0.461   0.119        3.86  2.19e- 4
31 dynreg intercept        7.60    0.0857      88.7   8.60e-85

The coefficient estimates from the two models are all relatively close.
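
For a direct comparison, the estimates can be placed side by side (a sketch using tidyr::pivot_wider(); tidyr is loaded with the fpp3/tidyverse packages):

tidy(fit) |>
  select(.model, term, estimate) |>
  # terms appearing in only one model (e.g. ar1, ma1, ma2) will show NA in the other column
  tidyr::pivot_wider(names_from = .model, values_from = estimate)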

  2. How much difference does the ARIMA error process make to the forecasts?
future_souvenirs <- new_data(souvenirs, n = 24) |>
  mutate(festival = month(Month) == 3)
fit |>
  forecast(new_data = future_souvenirs)  |>
  autoplot(souvenirs, level=95)

The forecasts are also extremely close.
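
To quantify the difference, the point forecasts from the two models can be compared directly (a sketch):

fit |>
  forecast(new_data = future_souvenirs) |>
  as_tibble() |>
  select(.model, Month, .mean) |>
  tidyr::pivot_wider(names_from = .model, values_from = .mean) |>
  # percentage difference between the dynamic regression and TSLM point forecasts
  mutate(pct_difference = 100 * (dynreg - reg) / reg)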

  3. Check the residuals of the fitted model to ensure the ARIMA process has adequately addressed the autocorrelations seen in the TSLM model.
fit |>
  select(dynreg) |>
  gg_tsresiduals()

These look fine: there is no significant remaining autocorrelation in the residuals.
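
A Ljung-Box test gives a more formal check (a sketch; dof is set to 3 for the ar1, ma1 and ma2 coefficients of the dynamic regression model):

fit |>
  select(dynreg) |>
  augment() |>
  features(.innov, ljung_box, dof = 3, lag = 24)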

fpp3 10.7, Ex 3

Repeat the daily electricity example, but instead of using a quadratic function of temperature, use a piecewise linear function with the “knot” around 25 degrees Celsius (use predictors Temperature & Temp2). How can you optimize the choice of knot?

vic_elec_daily <- vic_elec |>
  filter(year(Time) == 2014) |>
  index_by(Date = date(Time)) |>
  summarise(
    Demand = sum(Demand) / 1e3,
    Temperature = max(Temperature),
    Holiday = any(Holiday)
  ) |>
  mutate(
    Temp2 = I(pmax(Temperature - 25, 0)),
    Day_Type = case_when(
      Holiday ~ "Holiday",
      wday(Date) %in% 2:6 ~ "Weekday",
      TRUE ~ "Weekend"
    )
  )

First, we will leave the knot at 25 and find the best ARIMA model. This will take a while, but we only have to do it once.

vic_elec_daily |>
  model(
    fit = ARIMA(Demand ~ Temperature + Temp2 + (Day_Type == "Weekday"))) |>
  report()
Series: Demand 
Model: LM w/ ARIMA(3,1,1)(2,0,0)[7] errors 

Coefficients:
         ar1      ar2     ar3      ma1    sar1    sar2  Temperature   Temp2
      0.7914  -0.0108  0.0110  -0.9762  0.1995  0.2936      -0.7025  4.4068
s.e.  0.0730   0.0831  0.0568   0.0213  0.0542  0.0569       0.1744  0.3069
      Day_Type == "Weekday"TRUE
                        31.4647
s.e.                     1.3758

sigma^2 estimated as 61.42:  log likelihood=-1262.4
AIC=2544.81   AICc=2545.43   BIC=2583.78

Now we will vary the knot until the AICc is optimized. (In the code below, ARIMA() re-selects the error model at each candidate knot, so the final error structure may differ from the ARIMA(3,1,1)(2,0,0)[7] model found above.)

To optimize the knot position, we need to try many knot locations and select the model with the smallest AICc value. The fit is sensitive to the knot placement, with errors occurring at some knot locations. So we need to do a grid search, allowing for null models to be returned.

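# Grid search over candidate knots: refit the dynamic regression at each knot
# and keep the model with the smallest AICc, skipping knots where ARIMA() fails.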
best_aicc <- Inf
for(knot in seq(20, 30, by=0.1)) {
  elec_model <- vic_elec_daily |>
    mutate(Temp2 = pmax(Temperature - knot, 0)) |>
    model(fit = ARIMA(Demand ~ Temperature + Temp2 + (Day_Type == "Weekday")))
  if(!is_null_model(elec_model$fit[[1]])) {
    aicc <- glance(elec_model) |> pull(AICc)
    if(aicc < best_aicc) {
      best_aicc <- aicc
      best_knot <- knot
      best_model <- elec_model
    }
  }
}
best_model |> report()
Series: Demand 
Model: LM w/ ARIMA(2,1,1)(2,0,0)[7] errors 

Coefficients:
         ar1     ar2      ma1    sar1    sar2  Temperature   Temp2
      0.7432  0.0468  -0.9699  0.1947  0.3143       0.0120  5.4051
s.e.  0.0676  0.0645   0.0241  0.0531  0.0564       0.1326  0.3328
      Day_Type == "Weekday"TRUE
                        31.0127
s.e.                     1.3634

sigma^2 estimated as 59.41:  log likelihood=-1256.87
AIC=2531.75   AICc=2532.25   BIC=2566.82

The optimal knot (to 1 decimal place) is 28.4 degrees Celsius.

augment(best_model) |>
  left_join(vic_elec_daily) |>
  ggplot(aes(x = Temperature)) +
  geom_point(aes(y = Demand)) +
  geom_point(aes(y = .fitted), col = "red")

augment(best_model) |>
  left_join(vic_elec_daily) |>
  ggplot(aes(x = Demand)) +
  geom_point(aes(y = .fitted)) +
  geom_abline(aes(intercept = 0, slope = 1))

best_model |> gg_tsresiduals()

augment(best_model) |>
  features(.innov, ljung_box, dof = 6, lag = 21)
# A tibble: 1 × 3
  .model lb_stat lb_pvalue
  <chr>    <dbl>     <dbl>
1 fit       31.0   0.00875

The model fails the Ljung-Box test, but the significant autocorrelations are relatively small, so it should still give reasonable forecasts. The residuals appear to show some heteroskedasticity, but otherwise look OK.
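
One way to address the heteroskedasticity would be to refit the model on a log scale (a sketch, not evaluated here; ideally the knot search above would also be repeated on the transformed data):

vic_elec_daily |>
  mutate(Temp2 = pmax(Temperature - best_knot, 0)) |>
  model(ARIMA(log(Demand) ~ Temperature + Temp2 + (Day_Type == "Weekday")))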

vic_next_day <- new_data(vic_elec_daily, 1) |>
  mutate(
    Temperature = 26,
    Temp2 = I(pmax(Temperature - best_knot, 0)),
    Day_Type = "Holiday"
  )
forecast(best_model, vic_next_day)
# A fable: 1 x 7 [1D]
# Key:     .model [1]
  .model Date           Demand .mean Temperature    Temp2 Day_Type
  <chr>  <date>         <dist> <dbl>       <dbl> <I<dbl>> <chr>   
1 fit    2015-01-01 N(161, 59)  161.          26        0 Holiday 
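
An explicit 95% prediction interval can be extracted from the fable with hilo() (a sketch):

forecast(best_model, vic_next_day) |>
  hilo(level = 95)
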
vic_elec_future <- new_data(vic_elec_daily, 14) |>
  mutate(
    Temperature = 26,
    Temp2 = I(pmax(Temperature - best_knot, 0)),
    Holiday = c(TRUE, rep(FALSE, 13)),
    Day_Type = case_when(
      Holiday ~ "Holiday",
      wday(Date) %in% 2:6 ~ "Weekday",
      TRUE ~ "Weekend"
    )
  )
forecast(best_model, vic_elec_future) |>
  autoplot(vic_elec_daily) + labs(y = "Electricity demand (GW)")

fpp3 10.7, Ex 4

This exercise concerns aus_accommodation: the total quarterly takings from accommodation and the room occupancy level for hotels, motels, and guest houses in Australia, between January 1998 and June 2016. Total quarterly takings are in millions of Australian dollars.

  1. Compute the CPI-adjusted takings and plot the result for each state.

aus_accommodation <- aus_accommodation |>
  mutate(
    adjTakings = Takings / CPI * 100
  )
aus_accommodation |>
  autoplot(adjTakings)

  2. For each state, fit a dynamic regression model of CPI-adjusted takings with seasonal dummy variables, a piecewise linear time trend with one knot at 2008 Q1, and ARIMA errors.
fit <- aus_accommodation |>
  model(
    ARIMA(adjTakings ~ season() + trend(knot = yearquarter("2008 Q1")))
  )
fit
# A mable: 8 x 2
# Key:     State [8]
  State                        ARIMA(adjTakings ~ season() + trend(knot = year…¹
  <chr>                                                                  <model>
1 Australian Capital Territory                       <LM w/ ARIMA(1,0,0) errors>
2 New South Wales                          <LM w/ ARIMA(1,0,0)(0,0,1)[4] errors>
3 Northern Territory                       <LM w/ ARIMA(0,0,1)(1,0,0)[4] errors>
4 Queensland                               <LM w/ ARIMA(1,0,0)(0,0,1)[4] errors>
5 South Australia                          <LM w/ ARIMA(1,0,0)(1,0,0)[4] errors>
6 Tasmania                                 <LM w/ ARIMA(0,0,1)(1,0,0)[4] errors>
7 Victoria                                 <LM w/ ARIMA(1,0,0)(0,0,1)[4] errors>
8 Western Australia                                  <LM w/ ARIMA(1,0,0) errors>
# ℹ abbreviated name:
#   ¹​`ARIMA(adjTakings ~ season() + trend(knot = yearquarter("2008 Q1")))`

The seasonal dummy variables have not adequately handled the seasonality, so there are seasonal ARIMA components.

  3. Check that the residuals of the model look like white noise.
fit |>
  filter(State == "Victoria") |>
  gg_tsresiduals()

No apparent problems. Similar plots should be checked for the other states.
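
Rather than plotting every state, a quick check across all states can be done with a Ljung-Box test (a sketch; dof is left at its default of 0 as a rough approximation, since the number of ARMA coefficients differs across states):

augment(fit) |>
  features(.innov, ljung_box, lag = 8)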

  4. Forecast the takings for each state to the end of 2017. (Hint: You will need to produce forecasts of the CPI first.)
# CPI forecasts
cpif <- aus_accommodation |>
  model(ARIMA(CPI)) |>
  forecast(h = 6) |>
  as_tsibble() |>
  select(Date, State, CPI = .mean)
fit |>
  forecast(new_data = cpif) |>
  mutate(Takings = adjTakings * CPI / 100)
# A fable: 48 x 7 [1Q]
# Key:     State, .model [8]
   State                    .model    Date   adjTakings .mean   CPI      Takings
   <chr>                    <chr>    <qtr>       <dist> <dbl> <dbl>       <dist>
 1 Australian Capital Terr… "ARIM… 2016 Q3   N(62, 9.9)  61.6  109.    N(67, 12)
 2 Australian Capital Terr… "ARIM… 2016 Q4    N(59, 12)  58.9  110.    N(65, 15)
 3 Australian Capital Terr… "ARIM… 2017 Q1    N(59, 13)  59.0  110.    N(65, 16)
 4 Australian Capital Terr… "ARIM… 2017 Q2    N(59, 13)  59.4  111.    N(66, 16)
 5 Australian Capital Terr… "ARIM… 2017 Q3    N(61, 13)  60.9  111.    N(68, 16)
 6 Australian Capital Terr… "ARIM… 2017 Q4    N(59, 13)  58.8  112.    N(66, 16)
 7 New South Wales          "ARIM… 2016 Q3 N(791, 1254) 791.   109. N(863, 1494)
 8 New South Wales          "ARIM… 2016 Q4 N(844, 1589) 844.   110. N(926, 1914)
 9 New South Wales          "ARIM… 2017 Q1 N(829, 1679) 829.   110. N(915, 2043)
10 New South Wales          "ARIM… 2017 Q2 N(734, 1703) 734.   111. N(814, 2094)
# ℹ 38 more rows
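
The adjusted-takings forecasts can also be plotted for each state (a sketch):

fit |>
  forecast(new_data = cpif) |>
  autoplot(aus_accommodation) +
  labs(y = "CPI-adjusted takings")
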
  5. What sources of uncertainty have not been taken into account in the prediction intervals?
  • The uncertainty in the CPI forecasts has been ignored (one way to incorporate it is sketched below).
  • As usual, the estimation of the parameters and the choice of models have also not been accounted for.
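
One way to incorporate the CPI uncertainty would be to simulate many future CPI paths and produce takings forecasts conditional on each path (a rough sketch of the first step only; combining the conditional forecasts is not shown):

aus_accommodation |>
  model(ARIMA(CPI)) |>
  # simulate 100 possible CPI sample paths for the next six quarters
  generate(h = 6, times = 100)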

fpp3 10.7, Ex 5

We fitted a harmonic regression model to part of the us_gasoline series in Exercise 6 in Section 7.10. We will now revisit this model, and extend it to include more data and ARMA errors.

  1. Using TSLM(), fit a harmonic regression with a piecewise linear time trend to the full gasoline series. Select the position of the knots in the trend and the appropriate number of Fourier terms to include by minimising the AICc or CV value.

Let’s optimize using two knots and an unknown number of Fourier terms. Because the number of Fourier terms is an integer, we can’t simply use optim(). Instead, we will loop over a large number of possible values for the knots and Fourier terms. More than 2000 models are fitted here, but TSLM is relatively fast.

Note that the possible values of the knots must be restricted so that knot2 is always much larger than knot1. We have set them to be at least 2 years apart here.

us_gasoline |> autoplot(Barrels)

# Function to compute CV given K and knots.
get_cv <- function(K, knot1, knot2) {
  us_gasoline |>
    model(TSLM(Barrels ~ fourier(K = K) + trend(c(knot1, knot2)))) |>
    glance() |>
    pull(CV)
}

models <- expand.grid(
  K = seq(25),
  knot1 = yearweek(as.character(seq(1991, 2017, 2))),
  knot2 = yearweek(as.character(seq(1991, 2017, 2)))
) |>
  filter(knot2 - knot1 > 104) |>
  as_tibble()
models <- models |>
  mutate(cv = purrr::pmap_dbl(models, get_cv)) |>
  arrange(cv)

# Best combination
(best <- head(models, 1))
# A tibble: 1 × 4
      K    knot1    knot2     cv
  <int>   <week>   <week>  <dbl>
1     6 2007 W01 2013 W01 0.0641
fit <- us_gasoline |>
  model(
    TSLM(Barrels ~ fourier(K = best$K) + trend(c(best$knot1, best$knot2)))
  )
  2. Now refit the model using ARIMA() to allow for correlated errors, keeping the same predictor variables as you used with TSLM().
fit <- us_gasoline |>
  model(ARIMA(Barrels ~ fourier(K = best$K) + trend(c(best$knot1, best$knot2)) + PDQ(0, 0, 0)))
fit |> report()
Series: Barrels 
Model: LM w/ ARIMA(1,0,1) errors 

Coefficients:
         ar1      ma1  fourier(K = best$K)C1_52  fourier(K = best$K)S1_52
      0.9277  -0.8414                   -0.1144                   -0.2306
s.e.  0.0256   0.0357                    0.0133                    0.0132
      fourier(K = best$K)C2_52  fourier(K = best$K)S2_52
                        0.0418                    0.0309
s.e.                    0.0105                    0.0105
      fourier(K = best$K)C3_52  fourier(K = best$K)S3_52
                        0.0836                    0.0343
s.e.                    0.0097                    0.0097
      fourier(K = best$K)C4_52  fourier(K = best$K)S4_52
                        0.0187                    0.0399
s.e.                    0.0094                    0.0094
      fourier(K = best$K)C5_52  fourier(K = best$K)S5_52
                       -0.0315                    0.0011
s.e.                    0.0092                    0.0092
      fourier(K = best$K)C6_52  fourier(K = best$K)S6_52
                       -0.0523                    0.0001
s.e.                    0.0092                    0.0092
      trend(c(best$knot1, best$knot2))trend
                                     0.0028
s.e.                                 0.0001
      trend(c(best$knot1, best$knot2))trend_831
                                        -0.0051
s.e.                                     0.0002
      trend(c(best$knot1, best$knot2))trend_1144  intercept
                                          0.0055     7.1065
s.e.                                      0.0006     0.0352

sigma^2 estimated as 0.06051:  log likelihood=-13.38
AIC=64.76   AICc=65.33   BIC=163.78
  3. Check the residuals of the final model using the gg_tsdisplay() function and a Ljung-Box test. Do they look sufficiently like white noise to continue? If not, try modifying your model, or removing the first few years of data.
gg_tsresiduals(fit)

augment(fit) |> features(.innov, ljung_box, dof = 2, lag = 24)
# A tibble: 1 × 3
  .model                                                       lb_stat lb_pvalue
  <chr>                                                          <dbl>     <dbl>
1 "ARIMA(Barrels ~ fourier(K = best$K) + trend(c(best$knot1, …    23.6     0.369

The model looks pretty good, although there is some heteroskedasticity. So the prediction intervals may not have accurate coverage.

  4. Once you have a model with white noise residuals, produce forecasts for the next year.
fit |>
  forecast(h = "1 year") |>
  autoplot(us_gasoline)

fpp3 10.7, Ex 6

Electricity consumption is often modelled as a function of temperature. Temperature is measured by daily heating degrees and cooling degrees. Heating degrees is 18^\circC minus the average daily temperature when the daily average is below 18^\circC; otherwise it is zero. This provides a measure of our need to heat ourselves as temperature falls. Cooling degrees measures our need to cool ourselves as the temperature rises. It is defined as the average daily temperature minus 18^\circC when the daily average is above 18^\circC; otherwise it is zero. Let y_t denote the monthly total of kilowatt-hours of electricity used, let x_{1,t} denote the monthly total of heating degrees, and let x_{2,t} denote the monthly total of cooling degrees.

An analyst fits the following model to a set of such data: y^*_t = \beta_1x^*_{1,t} + \beta_2x^*_{2,t} + \eta_t, where (1-\Phi_{1}B^{12} - \Phi_{2}B^{24})(1-B)(1-B^{12})\eta_t = (1+\theta_1 B)\varepsilon_t and y^*_t = \log(y_t), x^*_{1,t} = \sqrt{x_{1,t}} and x^*_{2,t}=\sqrt{x_{2,t}}.

  1. What sort of ARIMA model is identified for \eta_t?

ARIMA(0,1,1)(2,1,0)_{12}

  2. The estimated coefficients are
Parameter    Estimate     s.e.      Z   P-value
\beta_1        0.0077   0.0015   4.98     0.000
\beta_2        0.0208   0.0023   9.23     0.000
\theta_1       0.5830   0.0720   8.10     0.000
\Phi_{1}      -0.5373   0.0856  -6.27     0.000
\Phi_{2}      -0.4667   0.0862  -5.41     0.000

Explain what the estimates of \beta_1 and \beta_2 tell us about electricity consumption.

b_1 is the increase in y_t^* when x_{1,t}^* increases by one unit. This is a little hard to interpret directly, but it is clear that monthly total electricity usage goes up when monthly heating degrees goes up. Similarly, for b_2, monthly total electricity usage goes up when monthly cooling degrees goes up.
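
As a rough worked example using the estimates above: a one-unit increase in x^*_{1,t} = \sqrt{x_{1,t}} increases y^*_t = \log(y_t) by about 0.0077, which multiplies monthly electricity usage by e^{0.0077} \approx 1.008, i.e. roughly a 0.8% increase; the corresponding effect for cooling degrees is e^{0.0208} \approx 1.021, about a 2.1% increase.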

  3. Write the equation in a form more suitable for forecasting.

This turned out to be way more messy than I expected, but for what it’s worth, here it is in all its ugliness.

First apply the differences to the regression equation:

(1-B)(1-B^{12}) y_t^* = b_1(1-B)(1-B^{12})x_{1,t}^* + b_2(1-B)(1-B^{12})x_{2,t}^* + (1-B)(1-B^{12})\eta_{t}.

So, writing \eta'_t = (1-B)(1-B^{12})\eta_t,

\begin{align*} (y^*_{t} - y^*_{t-1} - y^*_{t-12} +y^*_{t-13}) =& b_1(x^*_{1,t} - x^*_{1,t-1} - x^*_{1,t-12} + x^*_{1,t-13}) + b_2(x^*_{2,t} - x^*_{2,t-1} - x^*_{2,t-12} + x^*_{2,t-13}) + \eta'_t \end{align*}

Multiplying by the AR polynomial gives

\begin{align*} (y^*_{t} & - y^*_{t-1} - y^*_{t-12} +y^*_{t-13}) -\Phi_{1}(y^*_{t-12} - y^*_{t-13} - y^*_{t-24} +y^*_{t-25}) -\Phi_{2}(y^*_{t-24} - y^*_{t-25} - y^*_{t-36} +y^*_{t-37})\\ = & ~ b_1(x^*_{1,t} - x^*_{1,t-1} - x^*_{1,t-12} + x^*_{1,t-13}) -\Phi_{1}b_1(x^*_{1,t-12} - x^*_{1,t-13} - x^*_{1,t-24} + x^*_{1,t-25}) -\Phi_{2}b_1(x^*_{1,t-24} - x^*_{1,t-25} - x^*_{1,t-36} + x^*_{1,t-37}) \\ & ~ + b_2(x^*_{2,t} - x^*_{2,t-1} - x^*_{2,t-12} + x^*_{2,t-13}) - \Phi_{1}b_2(x^*_{2,t-12} - x^*_{2,t-13} - x^*_{2,t-24} + x^*_{2,t-25}) - \Phi_{2}b_2(x^*_{2,t-24} - x^*_{2,t-25} - x^*_{2,t-36} + x^*_{2,t-37}) \\ & ~ + \varepsilon_t + \theta_1 \varepsilon_{t-1}. \end{align*}

Finally, we move all but y_t^* to the right hand side:

\begin{align*} y^*_{t} = & ~ y^*_{t-1} + y^*_{t-12} - y^*_{t-13} +\Phi_{1}(y^*_{t-12} - y^*_{t-13} - y^*_{t-24} +y^*_{t-25}) +\Phi_{2}(y^*_{t-24} - y^*_{t-25} - y^*_{t-36} +y^*_{t-37})\\ & + b_1(x^*_{1,t} - x^*_{1,t-1} - x^*_{1,t-12} + x^*_{1,t-13}) -\Phi_{1}b_1(x^*_{1,t-12} - x^*_{1,t-13} - x^*_{1,t-24} + x^*_{1,t-25}) -\Phi_{2}b_1(x^*_{1,t-24} - x^*_{1,t-25} - x^*_{1,t-36} + x^*_{1,t-37}) \\ & + b_2(x^*_{2,t} - x^*_{2,t-1} - x^*_{2,t-12} + x^*_{2,t-13}) - \Phi_{1}b_2(x^*_{2,t-12} - x^*_{2,t-13} - x^*_{2,t-24} + x^*_{2,t-25}) - \Phi_{2}b_2(x^*_{2,t-24} - x^*_{2,t-25} - x^*_{2,t-36} + x^*_{2,t-37}) \\ & + \varepsilon_t + \theta_1 \varepsilon_{t-1}. \end{align*}

  4. Describe how this model could be used to forecast electricity demand for the next 12 months.

For t=T+1, we use the above equation to find a point forecast of y_{T+1}^*, setting \varepsilon_{T+1}=0 and \hat{\varepsilon}_T to the last residual. The actual y_t^* values are all known, as are all the x_{1,t}^* and x_{2,t}^* values up to time t=T. For x_{1,T+1}^* and x_{2,T+1}^*, we could use a forecast (for example, a seasonal naive forecast).

For t=T+2,\dots,T+12, we do something similar, but both \varepsilon values are set to 0 and y^*_{T+k} (k\ge1) is replaced by the forecasts just calculated.

  5. Explain why the \eta_t term should be modelled with an ARIMA model rather than modelling the data using a standard regression package. In your discussion, comment on the properties of the estimates, the validity of the standard regression results, and the importance of the \eta_t model in producing forecasts.
  • The non-stationarity of \eta_t means the coefficients in the regression model will be inconsistent if OLS is used.
  • The standard errors will be incorrectly computed.
  • Consequently, all the p-values will be wrong.
  • Using an ARIMA structure for \eta_t allows these problems to be corrected, plus the short-term forecasts will be more accurate.