Here, we have the Total Monthly Consumption of Electricity Generated via Natural Gas in the United States provided by the U.S. Energy Information Administration. Data Link.

#Read Data and Library
data = read.csv("data.csv")[,3]
data = ts(data, start=c(2001, 1), frequency = 12)

library(ggplot2); library(forecast)

# First six values
head(data)
##           Jan      Feb      Mar      Apr      May      Jun
## 2001 323638.8 296964.9 346591.9 369900.8 418573.2 477079.6

The Electricity Consumed is generated by Natural Gas. Mcf is used to measure quantity of Natural Gas. M means one thousand (Roman Numeral). Thus, 400 Mcf is 400,000 cubic feet of natural gas.In terms of energy, 1 Mcf equals 1,000,000 BTU (British Thermal Units).

Regardless, we will look at some plots to gain some insights into our dataset.

ggtsdisplay

The ggtsdisplay gives us 3 plots:

  1. Original Plot
  2. ACF Plot
  3. PACF or histogram, spectrum, scatter plot.

The argument smooth = T tells R to add an smooth blue line across the time series to highlight the underlying trend in the data.

ggtsdisplay(data, smooth = T)

ggAcf

The ggAcf gives us the Autocorrelation plot. We can similarly, use ggPacf for a Partial Autocorrelation plot.

ggAcf(data, lag.max = 12)

We can specify how many lags we wish to look at using the lag.max argument. Looking at the first 12 lags, we see they are all significant. This is indicative of a strong seasonality. And this makes sense as well because we are dealing with a monthly data set.

ggseasonplot

The ggseasonplot gives us a plot of year-wise seasonal pattern

ggseasonplot(data, year.labels = T)

ggsubseriesplot

The ggsubseriesplot gives us a plot also known as Month plot.

ggsubseriesplot(data)

gglagplot

The gglagplot gives us plot of Yt plotted against different lagged values Ytk.

gglagplot(data, do.lines = FALSE, lags = 12) 

STL Decomposition

autoplot(stl(data, s.window = "periodic"))

Let’s fit a SARIMA Model

auto.arima(data) #SARIMA(1,0,1)(0,1,1)[12]
## Series: data 
## ARIMA(1,0,1)(0,1,1)[12] with drift 
## 
## Coefficients:
##          ar1      ma1     sma1     drift
##       0.7982  -0.1586  -0.7700  2230.886
## s.e.  0.0551   0.0925   0.0517   262.937
## 
## sigma^2 estimated as 1.795e+09:  log likelihood=-2852.96
## AIC=5715.91   AICc=5716.18   BIC=5733.23

Equation of fitted SARIMA Model:

(1LS)(1ϕL)Yt=(1+θL)(1+θSLS)et

Let Zt=(1LS)Yt

Then, Zt=ϕ1Zt1+et+θ1Se(t1)s+θ1et1+θ1Sθ1e(t1)S1