Tidy Time Series and Forecasting in R

  conf2020

  Rob J Hyndman

tl;dr: all workshop materials are available here:
https://rstd.io/conf20-ts
📖 Forecasting: Principles and Practice
License: CC BY-SA 4.0

For the last couple of years I’ve watched the Twitter feed from the RStudio conference with jealousy: not just FOMO but KIWMO (Knowing I Was Missing Out). So when I was asked to teach a workshop on Tidy Time Series as part of rstudio::conf2020, I agreed without hesitation. The idea was to teach a tidy approach to time series, something that has only become possible very recently.

It is now very common for organizations to collect huge amounts of data over time, and existing time series analysis tools are often unable to handle the scale, frequency and structure of the data collected. In response, I have been working with Earo Wang, Mitch O’Hara-Wild and Di Cook to develop a suite of packages to handle modern time series data in a tidy framework. This workshop was an introduction to how to use these packages.

All the materials for the workshop are in a GitHub repository. The workshop was based on the new edition of my textbook with George Athanasopoulos, Forecasting: Principles and Practice.

I was very lucky to have four awesome teaching assistants in Mitch O’Hara-Wild, Rhian Davies, Phillip Lear, and Steven Lawrence.

Day 1

On day 1, we looked at the tsibble, lubridate and feasts packages (along with the tidyverse of course). We introduced the tsibble data structure for flexibly managing collections of related time series, and explored how to do data wrangling, data visualizations and exploratory data analysis, along with some feature-based methods to explore time series data in high dimensions.
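
As a minimal sketch of the tsibble idea, the following builds one from made-up monthly sales for two hypothetical stores, then aggregates each series to quarters. The numbers, column names and store labels are assumptions for illustration only and do not come from the workshop materials. It assumes the tsibble and dplyr packages are installed.

```r
library(tsibble)
library(dplyr)

# Six months of invented sales for two hypothetical stores, "A" and "B"
months <- seq(as.Date("2019-01-01"), by = "1 month", length.out = 6)

sales_ts <- tsibble(
  month = yearmonth(rep(months, times = 2)),
  store = rep(c("A", "B"), each = 6),
  units = c(10, 12, 9, 14, 15, 13, 20, 18, 22, 25, 24, 27),
  key = store,    # identifies each individual series
  index = month   # the time variable
)

# Aggregate each store's series from months to quarters
quarterly <- sales_ts %>%
  group_by_key() %>%
  index_by(quarter = ~ yearquarter(.)) %>%
  summarise(units = sum(units))

quarterly
```

The `key` identifies each series and the `index` holds the time variable, which is what lets a single object carry many related series through the usual tidyverse verbs.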

Links to slides for day 1 are given below.

  1. Background
  2. Introduction to tsibbles
  3. Time series graphics
  4. Transformations
  5. Seasonality and trends
  6. Time series features

Day 2

Day 2 was about forecasting using the fable package. We looked at several well-known time series forecasting models and how they are automated in the fable package. We also discussed ensemble forecasts. Finally, we looked at forecast reconciliation, allowing millions of time series to be forecast in a relatively short time while accounting for constraints on how the series are related.
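
A sketch of this workflow on a single series is given below. It uses the `AirPassengers` data that ships with base R rather than the workshop's own data sets, and the model names `ets`, `arima` and `combo` are arbitrary labels chosen for this example. It assumes the fable, tsibble and dplyr packages are installed; `ARIMA()` may also need feasts to be installed for its automatic differencing selection.

```r
library(fable)    # also attaches fabletools, which provides model() and forecast()
library(tsibble)
library(dplyr)

# Convert the base R monthly ts object to a tsibble (columns: index, value)
air <- as_tsibble(AirPassengers)

# With no model specification given, ETS() and ARIMA() select a model automatically
fit <- air %>%
  model(
    ets   = ETS(value),
    arima = ARIMA(value)
  ) %>%
  # A simple ensemble: the equally weighted average of the two models
  mutate(combo = (ets + arima) / 2)

# Forecast 24 months ahead for all three models
fc <- fit %>% forecast(h = 24)
fc
```

The result is a forecast table with one row per model and future month, so the individual and combined forecasts can be compared or plotted side by side.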

Links to slides for day 2 are given below.

  1. Introduction to forecasting
  2. Exponential smoothing
  3. ARIMA models
  4. Dynamic regression
  5. Hierarchical forecasting

Lab Sessions

The workshop alternated between my presentations and lab sessions in which participants worked with time series data in R. The instructions for the lab sessions are available on the repo.

R code providing solutions to the lab exercises is also available here.