Activities: Week 1

Time series data and patterns

Exercise 1

The pedestrian dataset contains hourly pedestrian counts from 2015-01-01 to 2016-12-31 at 4 sensors in the city of Melbourne.

The data is shown below:

# A tibble: 66,037 × 5
   Sensor         Date_Time           Date        Time Count
   <chr>          <dttm>              <date>     <int> <int>
 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01     0  1630
 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01     1   826
 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01     2   567
 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01     3   264
 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01     4   139
 6 Birrarung Marr 2015-01-01 05:00:00 2015-01-01     5    77
 7 Birrarung Marr 2015-01-01 06:00:00 2015-01-01     6    44
 8 Birrarung Marr 2015-01-01 07:00:00 2015-01-01     7    56
 9 Birrarung Marr 2015-01-01 08:00:00 2015-01-01     8   113
10 Birrarung Marr 2015-01-01 09:00:00 2015-01-01     9   166
# ℹ 66,027 more rows

Your turn!

Identify the index variable, key variable(s), and measured variable(s) of this dataset.

Hint
  • The index variable contains the complete time information.
  • The key variable(s) identify each time series.
  • The measured variable(s) are what you want to explore/forecast.
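To make the three roles concrete, here is a minimal sketch on a small made-up dataset (the `station`, `time`, and `temp` columns are hypothetical and not part of pedestrian). It assumes the tsibble package is installed and uses its `index_var()`, `key_vars()`, and `measured_vars()` helpers to report how each column is classified.

```r
library(tsibble)

# Hypothetical toy data: hourly temperatures at two made-up stations.
times <- as.POSIXct("2015-01-01 00:00:00", tz = "UTC") + 3600 * 0:2
weather <- tibble::tibble(
  station = rep(c("Station A", "Station B"), each = 3),
  time    = c(times, times),
  temp    = c(18.2, 17.9, 17.5, 21.0, 20.4, 19.8)
)

# Declare which column carries the time information (index)
# and which column identifies each series (key).
weather_ts <- as_tsibble(weather, index = time, key = station)

index_var(weather_ts)     # name of the index variable
key_vars(weather_ts)      # name(s) of the key variable(s)
measured_vars(weather_ts) # everything that is neither index nor key
```

Any column that is not declared as the index or a key is treated as a measured variable, so a dataset can have several of them.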

index variable

key variable(s)

measured variable(s)

Exercise 2

The aus_accommodation dataset contains quarterly data on Australian tourist accommodation, covering short-term non-residential establishments with 15 or more rooms, from 1998 Q1 to 2016 Q2.

The units of the measured variables are as follows:

  • Takings are in millions of Australian dollars.
  • Occupancy is the percentage of rooms occupied.
  • CPI is an index with value 100 in 2012 Q1.

Your turn!

Complete the code to convert this dataset into a tsibble.
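As a hedged illustration of the general pattern (not the solution for aus_accommodation), the sketch below converts a small made-up data frame to a tsibble with `as_tsibble()`. The `shop`, `start_date`, and `sales` columns are hypothetical; each quarter is recorded as the date of its first day.

```r
library(tsibble)

# Hypothetical toy data: quarterly sales for two made-up shops,
# with each quarter stored as the date of its first day.
quarter_starts <- as.Date(c("2020-01-01", "2020-04-01",
                            "2020-07-01", "2020-10-01"))
sales <- tibble::tibble(
  shop       = rep(c("North", "South"), each = 4),
  start_date = c(quarter_starts, quarter_starts),
  sales      = c(12.1, 13.4, 11.8, 15.0, 8.3, 9.1, 8.7, 10.2)
)

# index: the time column; key: the column(s) identifying each series.
sales_ts <- as_tsibble(sales, index = start_date, key = shop)

# Printing shows the interval tsibble inferred from the index in the header.
sales_ts
```

The `key` argument accepts several columns via `c()` when more than one variable is needed to identify a series.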


    

Exercise 3

Temporal granularity

The previous exercise produced a tsibble with daily frequency, although the data is clearly quarterly! This is because the index uses a daily granularity, which is inappropriate for this data.

Common temporal granularities can be created with these functions:

Granularity   Function
Annual        as.integer()
Quarterly     yearquarter()
Monthly       yearmonth()
Weekly        yearweek()
Daily         as_date(), ymd()
Sub-daily     as_datetime()

Your turn!

Use the appropriate granularity for the aus_accommodation dataset, and verify that the frequency is now quarterly.
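The same idea can be sketched on made-up data (the `shop`, `start_date`, and `sales` columns are hypothetical, and the native pipe assumes R 4.1 or later): the date column is converted with `yearquarter()` before building the tsibble, and `interval()` reports the interval that tsibble infers from the new index.

```r
library(tsibble)
library(dplyr)

# Hypothetical toy data: each quarter stored as the date of its first day.
quarter_starts <- as.Date(c("2020-01-01", "2020-04-01",
                            "2020-07-01", "2020-10-01"))
sales <- tibble(
  shop       = rep(c("North", "South"), each = 4),
  start_date = c(quarter_starts, quarter_starts),
  sales      = c(12.1, 13.4, 11.8, 15.0, 8.3, 9.1, 8.7, 10.2)
)

sales_q <- sales |>
  mutate(quarter = yearquarter(start_date)) |>  # quarterly granularity
  select(-start_date) |>                        # drop the daily column
  as_tsibble(index = quarter, key = shop)

interval(sales_q)  # interval inferred from the yearquarter index
sales_q            # the header of the printout also shows the interval
```

The other functions in the table are used the same way: apply the one matching the data's granularity to the time column, then pass the result as the index.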


    

Exercise 4

The tourism dataset contains the quarterly overnight trips from 1998 Q1 to 2016 Q4 across Australia.

It is disaggregated by 3 key variables:

  • State: States and territories of Australia
  • Region: The tourism regions are formed through the aggregation of Statistical Local Areas (SLAs) which are defined by the various State and Territory tourism authorities according to their research and marketing needs
  • Purpose: Stopover purpose of visit (“Holiday”, “Visiting friends and relatives”, “Business”, “Other reason”).

Calculate the total quarterly overnight trips to Victoria from the tourism dataset.
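A minimal sketch of this kind of calculation on made-up data (the `state`, `purpose`, `quarter`, and `trips` columns and their values are hypothetical; R 4.1 or later is assumed for the native pipe): `filter()` keeps one state, and `summarise()` on a tsibble sums over the remaining keys while keeping one row per time point.

```r
library(tsibble)
library(dplyr)

# Hypothetical toy data: trips for two made-up states and two purposes.
visits <- tibble(
  state   = rep(c("East", "West"), each = 4),
  purpose = rep(rep(c("Business", "Holiday"), each = 2), times = 2),
  quarter = yearquarter(rep(c("2020 Q1", "2020 Q2"), times = 4)),
  trips   = c(1, 2, 3, 4, 5, 6, 7, 8)
) |>
  as_tsibble(index = quarter, key = c(state, purpose))

east_total <- visits |>
  filter(state == "East") |>     # keep a single state
  summarise(trips = sum(trips))  # sum over keys, one row per quarter

east_total
```

Because a tsibble is always implicitly grouped by its index, no explicit `group_by(quarter)` is needed here.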


    

Find which combination of Region and Purpose had the highest average number of overnight trips.
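One possible approach, sketched on made-up data (the `region`, `purpose`, `quarter`, and `trips` columns are hypothetical; R 4.1 or later is assumed): since the average is taken over time, the tsibble is first converted with `as_tibble()` so that `summarise()` no longer keeps the index, and `slice_max()` then picks the largest mean.

```r
library(tsibble)
library(dplyr)

# Hypothetical toy data: trips for two made-up regions and two purposes.
visits <- tibble(
  region  = rep(c("Coast", "Hills"), each = 4),
  purpose = rep(rep(c("Business", "Holiday"), each = 2), times = 2),
  quarter = yearquarter(rep(c("2020 Q1", "2020 Q2"), times = 4)),
  trips   = c(1, 2, 3, 4, 5, 6, 7, 8)
) |>
  as_tsibble(index = quarter, key = c(region, purpose))

best <- visits |>
  as_tibble() |>                       # drop the time-series structure
  group_by(region, purpose) |>
  summarise(mean_trips = mean(trips), .groups = "drop") |>
  slice_max(mean_trips, n = 1)         # combination with the largest mean

best
```

Skipping `as_tibble()` would return a mean per quarter instead, because a tsibble keeps its index in every summary.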


    

Create a new tsibble that aggregates over Purpose and Region, leaving only total trips by State.
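A hedged sketch of this kind of aggregation on made-up data (the `state`, `region`, `quarter`, and `trips` columns are hypothetical; R 4.1 or later is assumed): grouping a tsibble by one key and summarising collapses the other keys, and the result is again a tsibble keyed by the grouping variable.

```r
library(tsibble)
library(dplyr)

# Hypothetical toy data: three made-up regions nested in two states.
visits <- tibble(
  state   = rep(c("East", "East", "West"), each = 2),
  region  = rep(c("Coast", "Hills", "Plains"), each = 2),
  quarter = yearquarter(rep(c("2020 Q1", "2020 Q2"), times = 3)),
  trips   = c(1, 2, 3, 4, 5, 6)
) |>
  as_tsibble(index = quarter, key = c(state, region))

by_state <- visits |>
  group_by(state) |>             # the key to keep
  summarise(trips = sum(trips))  # sum over all other keys, per quarter

by_state
```

`key_vars(by_state)` can be used to confirm that only the grouping variable remains as a key.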