Exercise Week 6: Solutions

library(fpp3)
library(broom)

fpp3 8.8, Ex1

Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

  1. Use the ETS() function in R to estimate the equivalent model for simple exponential smoothing. Find the optimal values of \alpha and \ell_0, and generate forecasts for the next four months.
fit <- aus_livestock |>
  filter(Animal == "Pigs", State == "Victoria") |>
  model(ses = ETS(Count ~ error("A") + trend("N") + season("N")))
report(fit)
Series: Count 
Model: ETS(A,N,N) 
  Smoothing parameters:
    alpha = 0.3221247 

  Initial states:
     l[0]
 100646.6

  sigma^2:  87480760

     AIC     AICc      BIC 
13737.10 13737.14 13750.07 

The optimal values are \alpha = 0.3221247 and \ell_0 = 100646.6.

fc <- fit |> forecast(h = "4 months")
fc
# A fable: 4 x 6 [1M]
# Key:     Animal, State, .model [1]
  Animal State    .model    Month             Count  .mean
  <fct>  <fct>    <chr>     <mth>            <dist>  <dbl>
1 Pigs   Victoria ses    2019 Jan N(95187, 8.7e+07) 95187.
2 Pigs   Victoria ses    2019 Feb N(95187, 9.7e+07) 95187.
3 Pigs   Victoria ses    2019 Mar N(95187, 1.1e+08) 95187.
4 Pigs   Victoria ses    2019 Apr N(95187, 1.1e+08) 95187.
fc |>
  autoplot(filter(aus_livestock, Month >= yearmonth("2010 Jan")))

  2. Compute a 95% prediction interval for the first forecast using \hat{y} \pm 1.96s, where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.
s <- augment(fit) |>
  pull(.resid) |>
  sd()
yhat <- fc |>
  pull(.mean) |>
  head(1)
yhat + c(-1, 1) * 1.96 * s
[1]  76871.01 113502.10
fc |>
  head(1) |>
  mutate(interval = hilo(Count, 95)) |>
  pull(interval)
<hilo[1]>
[1] [76854.79, 113518.3]95

The intervals are close but not identical. This is because R estimates the residual variance with a proper degrees-of-freedom adjustment, and uses a more precise critical value than 1.96.

Try the following.

# Estimate sigma with a degrees-of-freedom adjustment: divide by n minus
# the number of estimated parameters (alpha and ell_0).
res <- augment(fit) |> pull(.resid)
s <- sqrt(sum(res^2) / (length(res) - NROW(tidy(fit))))
yhat + c(-1, 1) * qnorm(0.975) * s
[1]  76854.79 113518.33

fpp3 8.8, Ex2

Write your own function to implement simple exponential smoothing. The function should take arguments y (the response data), alpha (the smoothing parameter \alpha) and level (the initial level \ell_0). It should return the forecast of the next observation in the series. Does it give the same forecast as ETS()?

my_ses <- function(y, alpha, level) {
  # One-step-ahead SES: yhat[t+1] = alpha * y[t] + (1 - alpha) * yhat[t],
  # started from the initial level ell_0.
  yhat <- numeric(length(y) + 1)
  yhat[1] <- level
  for (i in 2:length(yhat)) {
    yhat[i] <- alpha * y[i - 1] + (1 - alpha) * yhat[i - 1]
  }
  return(last(yhat))
}

vic_pigs_vec <- aus_livestock |>
  filter(Animal == "Pigs", State == "Victoria") |>
  pull(Count)

ses_fc <- vic_pigs_vec |>
  my_ses(alpha = 0.3221, level = 100646.6)

c(my_ses = ses_fc, fable = fc$.mean[1])
  my_ses    fable 
95186.59 95186.56 

Yes, the same forecasts are obtained. The slight differences are due to rounding of \alpha and \ell_0.
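
As a quick check (a sketch reusing the objects above), re-running my_ses() with the full-precision estimates pulled from the fitted model removes the discrepancy:

est <- tidy(fit) |> pull(estimate)   # alpha, then l[0], at full precision
vic_pigs_vec |> my_ses(alpha = est[1], level = est[2])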

fpp3 8.8, Ex3

Modify your function from the previous exercise to return the sum of squared errors rather than the forecast of the next observation. Then use the optim() function to find the optimal values of \alpha and \ell_0. Do you get the same values as the ETS() function?

my_ses_sse <- function(par, y) {
  # par = c(alpha, ell_0); returns the sum of squared one-step errors.
  alpha <- par[1]
  level <- par[2]
  n <- length(y)
  yhat <- numeric(n)
  yhat[1] <- level
  for (i in 2:n) {
    yhat[i] <- alpha * y[i - 1] + (1 - alpha) * yhat[i - 1]
  }
  return(sum((y - yhat)^2))
}

optim(c(0.1, vic_pigs_vec[1]), my_ses_sse, y = vic_pigs_vec)$par
[1] 3.220344e-01 1.005254e+05
tidy(fit)
# A tibble: 2 × 5
  Animal State    .model term    estimate
  <fct>  <fct>    <chr>  <chr>      <dbl>
1 Pigs   Victoria ses    alpha      0.322
2 Pigs   Victoria ses    l[0]  100647.   

The estimates are similar but not identical. The differences are due to the different starting values (and numerical tolerances) used by the two optimisation routines.
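
One way to confirm this (a sketch reusing the objects above): start optim() at the ETS() estimates, and the optimiser stays essentially where it began.

# Starting values taken from the fitted ETS model: c(alpha, l[0]).
optim(tidy(fit) |> pull(estimate), my_ses_sse, y = vic_pigs_vec)$par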

fpp3 8.8, Ex4

Combine your previous two functions to produce a function that both finds the optimal values of \alpha and \ell_0, and produces a forecast of the next observation in the series.

my_ses <- function(y) {
  # Optimise (alpha, ell_0) by minimising the SSE, then run the SES
  # recursion to forecast the next observation.
  par <- optim(c(0.1, y[1]), my_ses_sse, y = y)$par
  alpha <- par[1]
  level <- par[2]
  yhat <- numeric(length(y) + 1)
  yhat[1] <- level
  for (i in 2:length(yhat)) {
    yhat[i] <- alpha * y[i - 1] + (1 - alpha) * yhat[i - 1]
  }
  return(last(yhat))
}
my_ses(vic_pigs_vec)
[1] 95186.69
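
As a final check, this is close to the first fable forecast from Ex1; the small remaining gap again comes down to the optimiser:

c(my_ses = my_ses(vic_pigs_vec), fable = fc$.mean[1])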

fpp3 8.8, Ex16

Show that the forecast variance for an ETS(A,N,N) model is given by \sigma^2\left[1+\alpha^2(h-1)\right].

An ETS(A,N,N) model is defined as
\begin{align*}
  y_t    & = \ell_{t-1} + \varepsilon_{t} \\
  \ell_t & = \ell_{t-1} + \alpha\varepsilon_{t},
\end{align*}
where \varepsilon_t \sim \text{N}(0,\sigma^2), and h-step forecasts are given by \hat{y}_{T+h|T} = \ell_T. Unrolling the level recursion,
\begin{align*}
  y_{T+h} & = \ell_{T+h-1} + \varepsilon_{T+h} \\
          & = \ell_{T+h-2} + \alpha \varepsilon_{T+h-1} + \varepsilon_{T+h} \\
          & = \ell_{T+h-3} + \alpha \varepsilon_{T+h-2} + \alpha \varepsilon_{T+h-1} + \varepsilon_{T+h} \\
          & \;\;\vdots \\
          & = \ell_{T} + \alpha \sum_{j=1}^{h-1} \varepsilon_{T+h-j} + \varepsilon_{T+h}.
\end{align*}
Since the errors are independent with common variance \sigma^2,
\begin{align*}
  \text{Var}(y_{T+h} \mid y_1,\dots,y_T) & = \alpha^2 \sum_{j=1}^{h-1} \sigma^2 + \sigma^2 \\
  & = \sigma^2\left[ 1 + \alpha^2 (h-1) \right].
\end{align*}
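
The algebra can also be verified numerically. The following sketch (with illustrative values \alpha = 0.3, \sigma = 1, h = 6) simulates the ETS(A,N,N) recursion many times and compares the empirical forecast variance with \sigma^2[1 + \alpha^2(h-1)]:

# Monte Carlo check of the forecast variance formula (illustrative values).
set.seed(1)
alpha <- 0.3; sigma <- 1; h <- 6; nsim <- 1e5
yTh <- replicate(nsim, {
  l <- 0                     # ell_T; any constant leaves the variance unchanged
  for (j in 1:(h - 1)) l <- l + alpha * rnorm(1, 0, sigma)  # level updates
  l + rnorm(1, 0, sigma)     # y_{T+h} = ell_{T+h-1} + eps_{T+h}
})
c(empirical = var(yTh), theoretical = sigma^2 * (1 + alpha^2 * (h - 1)))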

fpp3 8.8, Ex17

Write down 95% prediction intervals for an ETS(A,N,N) model as a function of \ell_T, \alpha, h and \sigma, assuming normally distributed errors.

Using the previous result, a 95% prediction interval is given by \ell_T \pm 1.96 \sigma \sqrt{1 + \alpha^2 (h-1)}.
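
As a sketch, this can be checked against the intervals produced for the pigs model fitted in Ex1: \ell_T is the final fitted level from components(), and \sigma^2 comes from glance(). Any small remaining gap is due to using 1.96 rather than qnorm(0.975).

alpha <- tidy(fit) |> filter(term == "alpha") |> pull(estimate)
lT <- components(fit) |> pull(level) |> last()   # final level, ell_T
sigma2 <- glance(fit) |> pull(sigma2)
h <- 1:4
tibble(
  h = h,
  lower = lT - 1.96 * sqrt(sigma2 * (1 + alpha^2 * (h - 1))),
  upper = lT + 1.96 * sqrt(sigma2 * (1 + alpha^2 * (h - 1)))
)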