A Vector Autoregression Weather Model for Electricity Supply and
Demand Modeling
Yixian Liu (a), Matthew C. Roberts (b,c), Ramteen Sioshansi (a,c,∗)
(a) Integrated Systems Engineering Department, The Ohio State University, 1971 Neil Avenue, Columbus, OH 43210, United States of America
(b) Department of Agricultural, Environmental, and Development Economics, The Ohio State University, 2120 Fyffe Road, Columbus, Ohio 43210, United States of America
(c) Center for Automotive Research, The Ohio State University, 930 Kinnear Road, Columbus, OH 43212, United States of America
Abstract
Weather forecasting is crucial to both the demand and supply sides of electricity systems. Temperature has
a great effect on energy demand. Moreover, solar and wind are very promising renewable energy sources.
In this paper, a large vector autoregression model is built to forecast three important weather variables for
61 cities around the United States. We estimate the vector autoregression model with 16 years of hourly
historical data and use two additional years of data for out-of-sample validation. Forecasts of up to six hours ahead are generated with good forecasting performance. Our results show that the proposed time
series approach is appropriate for short-term forecasting of solar radiation, temperature, and wind speed.
Keywords: Forecasting, solar irradiance, wind speed, temperature
1. Introduction
Electricity supply and demand are greatly influenced by weather conditions. Temperature, wind speed,
and solar radiation are among the most influential factors. Temperature has a great effect on energy use
by individuals, and thus on the demand side of the electricity system. Heating and cooling loads depend
largely on ambient temperature. Wind and solar generation are increasingly important as renewable energy
gains in popularity. Wind power is growing at a rate of 30% annually, with a worldwide installed capacity
of 283 GW at the end of 2012. The installed capacity of solar photovoltaic (PV) grew by 41% in 2012,
reaching 100 GW. However, the stochastic nature of wind speed and solar radiation raises power system
operational challenges as the penetrations of these technologies increase. Accurate short-term forecasting of
the two resources could improve power system operational efficiency.
There are many works dealing with weather forecasting. Most works focusing on temperature forecasting analyze financial weather derivatives as the prime application. Besides atmospheric models, models
attempting to capture these dynamics can be divided into two categories: stochastic approaches (Monte
Carlo simulation) and time-series models. Examples of the former include the works of Alaton et al.
(2002); Benth and Šaltytė-Benth (2005); Oetomo and Stevenson (2005); Svec and Stevenson (2007) and
Taylor and Buizza (2004), while examples of the latter include the works of Campbell and Diebold (2005);
Oetomo and Stevenson (2005); Šaltytė-Benth et al. (2007); Svec and Stevenson (2007); Taylor and Buizza
(2004, 2006). According to Oetomo and Stevenson (2005), although a model that relies on auto-regressive
moving average processes exhibits a better goodness-of-fit than Monte Carlo simulation models, such models
do not necessarily generate better forecasts.
∗ Corresponding author
Email addresses: [email protected] (Yixian Liu), [email protected] (Matthew C. Roberts), [email protected] (Ramteen Sioshansi)
Preprint submitted to Solar Energy, June 28, 2015
Another important issue, which Taylor and Buizza (2004) and Campbell and Diebold (2005) discuss, is point
and density forecasting. While time-series models are more popular for wind and temperature forecasting,
these techniques are not as widely used for solar radiation forecasting. Numerical weather prediction (NWP)
models are a popular approach for solar radiation forecasting, and can be used to generate forecasts up
to several days ahead. Chowdhury and Rahman (1987); Hammer et al. (1999); Heinemann et al. (2006);
Perez et al. (2010) note that most short-term solar radiation forecasts range from 30 minutes to six hours
ahead and rely on satellite-derived cloud-motion forecasts. Perez et al. (2007) use sky cover predictions
as inputs. Heinemann et al. (2006); Remund et al. (2008) note that comparing the forecasts of different
methods is useful in providing comparative statistics to validate a forecasting model. Wind speeds are
typically forecasted several minutes to several days ahead, often using statistical methods. For example,
Erdem and Shi (2011) use auto-regression moving average-based approaches whereas Li and Shi (2010) use
artificial neural networks. Other works, such as those of Traiteur et al. (2012); Chen et al. (2013), combine
multiple numerical techniques to produce ensemble wind forecasts. Giebel et al. (2011) provide a detailed
review of the available techniques for wind speed forecasting.
In this paper, we use time-series methods to model and generate hourly temperature, wind speed,
and solar radiation forecasts at 61 locations in the United States. Figure 1 shows the locations modeled.
The three weather variables at the 61 locations are response variables in a vector autoregression (VAR)
model. In addition to estimating the model, we also conduct out-of-sample validation to test the quality
of the forecasts produced. We compare our forecasting errors to those reported for other techniques in the
literature, including persistence forecasts, showing that our method performs as well or better. We also
compare the performance of our VAR, which captures spatial correlation in the response variables, to a
model without spatial relationships. We demonstrate the benefits of modeling spatial correlations through
better forecasting performance.
Figure 1: 61 Locations in the United States that are Modeled
The remainder of this paper is organized as follows. In Section 2 we provide descriptive statistics for the
three weather variables. In Section 3 the model and estimation methods are introduced. For a large model of
this form, we seek a suitable number of lagged terms to include to ensure good forecasting performance
while maintaining reasonable model size and degrees of freedom. Thirty lags for each time series are utilized
and each equation is estimated separately with either ordinary or weighted least squares. In Section 4 we
examine the forecasting performance up to six hours ahead and provide comparative statistics with other
models. Conclusions and suggestions for future research are provided in Section 5.
2. Weather Data
The data we use are from the National Solar Radiation Database (NSRDB), which is produced by the
National Renewable Energy Laboratory and distributed by the National Climatic Data Center. The NSRDB
contains ground-based solar and meteorological data for 1,454 sites around the United States. Nearly all of
the solar data are modeled while meteorological elements, including wind speed and dry bulb temperature,
are observed values. Wilcox (2012) provides further details regarding the NSRDB. We use modeled global
horizontal irradiance as our solar data.
We model hourly wind speed, global solar radiation, and dry bulb temperature at the 61 locations shown
in Figure 1 in one single VAR model. The 61 locations are chosen to provide roughly even coverage of the
continental United States. Moreover, locations that are close to population centers and areas with good
solar and wind resource availability are also included in the dataset. Data covering the years 1991 to 2008
are used, since these data are complete and do not require any modification. Among the 18 years of hourly
data, 16 years are used for model estimation and two years are used for out-of-sample model validation.
To get an overall feel for the data, Tables 1 through 3 summarize some simple descriptive statistics of
the wind speed, solar radiation, and temperature data at six locations. Moreover, Figures 2 through 4 show
plots of wind speed, temperature, and solar radiation in Las Vegas, NM (not to be confused with the popular
gambling destination) from 2006 to 2008. These figures clearly show seasonal patterns for the three weather
variables.
Table 1: Descriptive statistics of wind speed data [m/s]
Location         Maximum   Median   Mean   Standard Deviation
Bismarck, ND       22.70     3.60   4.26                 2.77
Las Vegas, NM      28.80     4.30   4.74                 2.96
Dallas, TX         19.60     4.10   4.59                 2.50
Denver, CO         26.80     3.60   3.85                 2.25
Chicago, IL        30.08     4.10   4.38                 2.31
New York, NY       23.20     4.60   5.05                 2.50
Table 2: Descriptive statistics of global solar radiation data [Wh/m^2]
Location         Maximum   Median     Mean   Standard Deviation
Bismarck, ND      975.00     6.00   158.92               242.14
Las Vegas, NM    1073.00    10.00   212.45               296.51
Dallas, TX       1047.00     7.00   194.97               279.82
Denver, CO       1036.00     8.00   181.05               262.89
Chicago, IL       998.00     5.00   155.47               237.70
New York, NY      996.00     6.00   160.79               242.82
Table 3: Descriptive statistics of dry bulb temperature data [°C]
Location         Minimum   Maximum   Median    Mean   Standard Deviation
Bismarck, ND      -40.00     43.90     6.70    6.32                13.41
Las Vegas, NM     -22.80     36.60    10.30    9.96                 9.71
Dallas, TX        -13.30     43.30    20.60    4.59                 9.56
Denver, CO        -25.60     38.00    10.00    9.88                10.70
Chicago, IL       -29.40     39.40    10.60   10.29                11.28
New York, NY      -19.40     39.40    13.30   13.23                 9.77
Figure 2: Time Series of Wind Speed in Las Vegas, NM from 2006 to 2008 (vertical axis: Wind Speed [m/s]; horizontal axis: Time, 1/2006 to 12/2008)
Figure 3: Time Series of Global Solar Radiation in Las Vegas, NM from 2006 to 2008 (vertical axis: Solar Radiation [Wh/m^2]; horizontal axis: Time, 1/2006 to 12/2008)
3. Methodology
A time series approach is proposed to capture the characteristics of the three weather variables. Our
approach consists of three parts: (1) a linear trend, (2) a seasonal component, which is represented by
Fourier series and Chebyshev polynomials, and (3) a VAR to model the stochastic component of the time
series. We detail these three components below.
Figure 4: Time Series of Dry Bulb Temperature in Las Vegas, NM from 2006 to 2008 (vertical axis: Dry Bulb Temperature [°C]; horizontal axis: Time, 1/2006 to 12/2008)
3.1. Trend
To check for the presence of a linear trend, we run a simple linear regression of the weather data against
hourly time. Both the intercept and time parameters are significant at the 1% level. Hence, a linear trend,
though slight, should be included in our model. We represent this trend component by including a term of
the form:
\[ \mathrm{trend}_t = \beta_0 + \beta_1 t \]
in our model.
3.2. Seasonality
As discussed in Section 2 and illustrated in Figures 2 through 4, there are strong seasonal variations in
all three of the weather variables. Because of the hourly time step in our data, it is important to model
both diurnal and season-of-the-year seasonality. Since the three weather variables exhibit different diurnal patterns,
we use different approaches to represent their diurnal seasonality.
For wind and temperature, diurnal seasonality is represented by a Fourier series of the form:
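\[ \mathrm{daySeas}_t = \sum_{p=1}^{P} \left[ \delta_{c,p} \cdot \cos\left( 2 \pi p \frac{d(t)}{24} \right) + \delta_{s,p} \cdot \sin\left( 2 \pi p \frac{d(t)}{24} \right) \right], \]
where $P$ is the order of the Fourier series, $\delta_{c,p}$ and $\delta_{s,p}$ are coefficients on the cosine and sine terms,
respectively, and:
\[ d(t) = (t \bmod 24), \tag{1} \]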
converts t to hours of the day. Season-of-the-year seasonality is similarly captured by a Fourier series of the
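form:
\[ \mathrm{annSeas}_t = \sum_{p=1}^{\hat{P}} \left[ \hat{\delta}_{c,p} \cdot \cos\left( 2 \pi p \frac{\hat{d}(t)}{365} \right) + \hat{\delta}_{s,p} \cdot \sin\left( 2 \pi p \frac{\hat{d}(t)}{365} \right) \right], \tag{2} \]
where $\hat{P}$ is the order of the Fourier series and $\hat{\delta}_{c,p}$ and $\hat{\delta}_{s,p}$ are coefficients on the cosine and sine terms,
respectively, and:
\[ \hat{d}(t) = \frac{t}{24}, \tag{3} \]
converts $t$ into days of the year.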
As discussed by Campbell and Diebold (2005), Fourier series can produce a smooth seasonal pattern
with a significant reduction in the number of parameters to be estimated as compared to dummy variables.
To find the proper order of the Fourier series, we estimate models with between first- and fifth-order terms.
Examining modeled and observed seasonality with different-ordered Fourier series shows that a third-order
series is sufficient to capture the seasonality dynamics. We also compare the forecasting performance of the
model with third- and fifth-order Fourier series terms, finding them to be similar, further suggesting that
third-order terms are sufficient. Thus, we include third-order Fourier series terms for daily and season-of-the-year seasonality of wind and temperature.
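As an illustration of the trend and Fourier components, the following is a minimal sketch of how the deterministic regressors for wind and temperature could be assembled. It assumes NumPy, and the function name `seasonal_design` and its column ordering are hypothetical choices made for this example rather than part of the original implementation.

```python
import numpy as np

def seasonal_design(t, order=3):
    """Build trend and Fourier regressors for hourly time indices t.

    Columns: intercept, linear trend, diurnal cos/sin pairs (period 24 hours),
    then annual cos/sin pairs (period 365 days). Names are illustrative only.
    """
    t = np.asarray(t, dtype=float)
    hour_of_day = np.mod(t, 24.0)   # d(t) in equation (1)
    day_index = t / 24.0            # d-hat(t) in equation (3)
    cols = [np.ones_like(t), t]     # intercept and linear trend
    for p in range(1, order + 1):   # diurnal Fourier terms
        cols.append(np.cos(2.0 * np.pi * p * hour_of_day / 24.0))
        cols.append(np.sin(2.0 * np.pi * p * hour_of_day / 24.0))
    for p in range(1, order + 1):   # season-of-the-year Fourier terms, equation (2)
        cols.append(np.cos(2.0 * np.pi * p * day_index / 365.0))
        cols.append(np.sin(2.0 * np.pi * p * day_index / 365.0))
    return np.column_stack(cols)

# Two days of hourly indices with third-order series: 2 + 4 * 3 = 14 columns.
X = seasonal_design(np.arange(48), order=3)
```

With third-order terms, each seasonal component contributes only six coefficients, compared with 23 for hourly dummy variables, which is the parameter reduction noted above.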
The season-of-the-year seasonality of solar radiation is given by the same Fourier series shown in (2).
Daily seasonality is modeled using second-order Chebyshev polynomials, as opposed to Fourier series. Following the method outlined by Miranda and Fackler (2002), to define the Chebyshev polynomials we first
convert our independent variable, x, where we assume x ∈ [a, b], to the normalized variable:
\[ z = \frac{2(x - a)}{b - a} - 1. \]
By definition we have $z \in [-1, 1]$. We then define the Chebyshev polynomials recursively as:
\[ T_j(z) = 2 \cdot z \cdot T_{j-1}(z) - T_{j-2}(z), \]
where:
\[ T_0(z) = 1, \]
and:
\[ T_1(z) = z. \]
Thus, the second-order Chebyshev polynomial used to model diurnal solar radiation seasonality is given by:
\[ \mathrm{daySeas}_t = \alpha_0 + \alpha_1 \cdot \left[ \frac{2(x_t - a_t)}{b_t - a_t} - 1 \right] + \alpha_2 \cdot \left\{ 2 \cdot \left[ \frac{2(x_t - a_t)}{b_t - a_t} - 1 \right]^2 - 1 \right\}. \tag{4} \]
We use Chebyshev polynomials, as opposed to Fourier series, to model the diurnal seasonality for a
number of reasons. First, we only need to model solar radiation during daytime hours, since there is by
definition zero solar radiation at night. Moreover, solar radiation follows a predictable diurnal pattern,
insomuch as it peaks in the middle of the day. A second-order Chebyshev polynomial is better able to
produce this shape of a seasonal pattern than a Fourier series. This is confirmed by our model estimates,
since second-order Chebyshev polynomials provide much better goodness-of-fit than Fourier series do.
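The recursion defining the Chebyshev polynomials can be sketched as follows. This is an illustrative example only: it assumes NumPy, and the function name `chebyshev_basis` is hypothetical. The columns it returns are the regressors multiplying the coefficients of the second-order polynomial when `order=2`.

```python
import numpy as np

def chebyshev_basis(x, a, b, order=2):
    """Return columns T_0(z), ..., T_order(z) for x in [a, b].

    z is the normalized variable 2(x - a)/(b - a) - 1, and the polynomials
    follow the recursion T_j(z) = 2 z T_{j-1}(z) - T_{j-2}(z).
    """
    x = np.asarray(x, dtype=float)
    z = 2.0 * (x - a) / (b - a) - 1.0
    T = [np.ones_like(z), z]
    for j in range(2, order + 1):
        T.append(2.0 * z * T[j - 1] - T[j - 2])
    return np.column_stack(T[: order + 1])

# At the start, middle, and end of the interval, z = -1, 0, 1.
B = chebyshev_basis([0.0, 0.5, 1.0], a=0.0, b=1.0, order=2)
# Cross-check against NumPy's Chebyshev pseudo-Vandermonde matrix.
reference = np.polynomial.chebyshev.chebvander(np.array([-1.0, 0.0, 1.0]), 2)
```

Because $T_2(z) = 2z^2 - 1$ is symmetric about $z = 0$, a negative coefficient on this term yields a curve that peaks in the middle of the interval, matching the midday peak in solar radiation.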
Based on these properties of the diurnal pattern, we define:
\[ x_t = \frac{d(t) - r_{\hat{d}(t)}}{s_{\hat{d}(t)} - r_{\hat{d}(t)}}, \]
in equation (4), where $d(t)$ and $\hat{d}(t)$ are as defined in equations (1) and (3), $r_{\hat{d}(t)}$ and $s_{\hat{d}(t)}$ are the sunrise
and sunset times, respectively, on day $\hat{d}(t)$, and $a_t$ and $b_t$ are the minimum and maximum values,
respectively, that $x_t$ takes on day $\hat{d}(t)$. Sunrise and sunset times are computed, based on the day of the
year and geographic coordinates of each location modeled, using MATLAB functions developed by the U.S.
Geological Survey.1
1 These functions are available at http://woodshole.er.usgs.gov/operations/sea-mat/air_sea-html/index.html.
3.3. VAR Model
VAR is a statistical model capturing the linear interdependencies among multiple time series. Hence,
it is beneficial in modeling the temporal and spatial correlation among wind speed, solar radiation, and
temperature in different locations. Each variable at each location has an equation explaining its evolution
based on time-lagged values of all of the weather variables at all locations.
Modeling the three weather variables at 61 locations in a single VAR gives 183 response variables in
total. Given the large model size, it is important to determine a suitable number of autoregressive lags and
which time-lagged values to include in the model. To do this, we regard one week’s lag as the maximum
number to be considered. We estimate multiple VAR(168) models with two response variables only. After
estimating several pairs, we find that regardless of the distance between locations, autoregressive lags of 1
and multiples of 24 are significant for most location pairs. This lag structure gives us the spatial relationship
among locations. Furthermore, Akaike and Bayesian information criteria are used to determine the lag
structure. To fully capture the relationship between observations in adjacent periods, we use a VAR model
with lags one to 24 and multiples of 24 up to 168 of the form:
\[ Y_t = \mathrm{trend}_t + \mathrm{daySeas}_t + \mathrm{annSeas}_t + \sum_{l \in L} A_l \cdot Y_{t-l} + U_t, \]
where:
\[ Y_t = (y_{1,t}, y_{2,t}, \cdots, y_{183,t})^\top, \]
is a 183 × 1 vector of hour-$t$ response variables,
\[ L = \{1, 2, \cdots, 24, 48, 72, 96, 120, 144, 168\}, \]
is the set of lags modeled, $A_l$ are 183 × 183 coefficient matrices for the lagged response variables, and:
\[ U_t = (u_{1,t}, u_{2,t}, \cdots, u_{183,t})^\top, \]
is a 183 × 1 vector of residuals. Since our data set covers 16 years of hourly observations, $t = 1, 2, \cdots, 140256$.
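The size of the model follows directly from these definitions. The short sketch below, using only the Python standard library, counts the lags in $L$, the lagged coefficients in each equation, and the hourly observations in the 1991 to 2006 estimation period. The variable names are illustrative.

```python
from datetime import datetime

# Lag set L: every lag from 1 to 24, then multiples of 24 up to 168.
lags = list(range(1, 25)) + list(range(48, 169, 24))

n_series = 3 * 61                        # three weather variables at 61 locations
n_lag_coeffs = n_series * len(lags)      # lagged coefficients in a single equation

# Hourly observations from the start of 1991 to the end of 2006 (four leap years).
span = datetime(2007, 1, 1) - datetime(1991, 1, 1)
n_hours = int(span.total_seconds() // 3600)

print(len(lags), n_lag_coeffs, n_hours)  # 30 5490 140256
```

The count of 30 lags matches the thirty lags per time series mentioned in Section 1, and the 5,490 lagged coefficients account for nearly all of the parameters in each equation.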
3.4. Parameter Estimation
A VAR of the size proposed is difficult to estimate as a whole system due to computational and memory
limitations of computers (the entire system consists of more than 25 million equations). Since the model
is actually a seemingly unrelated regression system, we solve this problem by estimating each equation
separately. The data used for estimation are hourly observations from 1991 to 2006. The variance/covariance
matrix of the residuals is calculated after the estimation.
For wind and temperature, ordinary least squares is used for parameter estimation. Weighted least
squares is applied for solar radiation. The weights assigned to night observations are zero whereas weights
of one are given to daytime observations. We do this because the VAR model is only used to forecast solar
radiation during the day—solar radiation is fixed equal to zero during the night since by definition there is
no sunlight at night. By applying these weights, the estimated coefficients are better for forecasting solar
radiation during the day since nighttime observations are ignored. As discussed in Section 3.2, we calculate
sunrise and sunset times for each location based on geographic coordinates and the day of the year.
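Equation-by-equation estimation with zero/one weights can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code used for the results reported here: it assumes NumPy, the function name `fit_equation` is hypothetical, and the regressor matrix `X` is taken to already contain the trend, seasonal, and lagged terms for one response variable.

```python
import numpy as np

def fit_equation(X, y, weights=None):
    """Least-squares fit of one VAR equation.

    With weights=None this is ordinary least squares. With weights, each row is
    scaled by the square root of its weight (weighted least squares), so a
    weight of zero removes that observation from the fit.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if weights is not None:
        sw = np.sqrt(np.asarray(weights, dtype=float))
        X = X * sw[:, None]
        y = y * sw
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic check: corrupt the "night" rows and give them zero weight.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
beta_true = np.array([2.0, -1.0])
is_day = np.arange(50) % 2 == 0
y = X @ beta_true
y[~is_day] += 100.0                      # night rows carry no usable signal
beta_hat = fit_equation(X, y, weights=is_day.astype(float))
```

In this synthetic example the daytime rows are noise-free, so the weighted fit recovers the true coefficients despite the corrupted nighttime rows. With zero/one weights, weighted least squares is equivalent to ordinary least squares on the daytime observations alone.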
Figure 5 shows the residuals of the three weather variables in Chicago, IL and Las Vegas, NM. It is clear
that the residuals display heteroskedasticity. However, Durbin’s alternative test reveals no serial correlation
in the residuals.
Figure 5: Residuals in Chicago, IL and Las Vegas, NM from 1991 to 2006. Panels: (a) Chicago, IL Wind; (b) Las Vegas, NM Wind; (c) Chicago, IL Solar; (d) Las Vegas, NM Solar; (e) Chicago, IL Temperature; (f) Las Vegas, NM Temperature. Horizontal axes: Time; vertical axes: Residuals in [m/s], [Wh/m^2], and [°C], respectively.
4. Forecasting and Validation
To validate our model, we generate out-of-sample forecasts and compare the performance of our VAR
model to a number of benchmark competitors. In doing so, we consider forecasts that are up to six hours
ahead and use two years of out-of-sample data covering the years 2007 and 2008. As noted in Section 3.2,
we fix solar radiation forecasts equal to zero between sunset and sunrise on each day. We further truncate
any negative wind speed and solar radiation forecasts to zero, since it is physically impossible for these
values to be negative.
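The two post-processing rules applied to the raw forecasts can be sketched as follows. The function name `postprocess_forecast` and the boolean daytime mask are assumptions made for illustration. In practice the mask would be derived from the sunrise and sunset times discussed in Section 3.2.

```python
import numpy as np

def postprocess_forecast(forecast, is_daytime=None):
    """Truncate negative forecasts to zero and, if a mask is given, zero out night hours."""
    out = np.maximum(np.asarray(forecast, dtype=float), 0.0)
    if is_daytime is not None:
        out = np.where(np.asarray(is_daytime, dtype=bool), out, 0.0)
    return out

# Solar radiation: negative value truncated, nighttime value fixed at zero.
solar = postprocess_forecast([-5.0, 120.0, 30.0], is_daytime=[True, True, False])
# Wind speed: only the truncation rule applies.
wind = postprocess_forecast([-1.0, 2.5])
```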
We use three types of benchmark competitors in this validation. The first is to compare the VAR model
to persistence-type forecasting methods. The second compares the VAR model, which includes spatial
correlation among the weather variables, to a model that does not include spatial terms, which we term the
VARNS model. This benchmark is meant to demonstrate the benefit of taking spatial relationships into
account in weather forecasting. The third compares the performance of our VAR model to other forecasting
techniques appearing in the literature.
Numerous metrics are used in the literature to evaluate forecast accuracy. These include mean absolute
error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), and relative root
mean square error (RMSE%), which we use in our validation. If we let $F_i$ and $O_i$ denote the hour-$i$ forecast
and observation, respectively, of a given variable and $N$ the number of out-of-sample forecasts used, MAE
is defined as:
\[ \frac{1}{N} \sum_{i=1}^{N} |F_i - O_i|, \]
MAPE is defined as:
\[ \frac{1}{N} \sum_{i=1}^{N} \left| \frac{F_i - O_i}{O_i} \right|, \]
RMSE is defined as:
\[ \sqrt{\frac{1}{N} \sum_{i=1}^{N} (F_i - O_i)^2}, \]
and RMSE% is defined as:
\[ \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} (F_i - O_i)^2}}{\frac{1}{N} \sum_{i=1}^{N} O_i}. \]
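The four metrics can be computed as in the following sketch, which assumes NumPy. The function names are illustrative, and the values are returned as fractions, so MAPE and RMSE% would be multiplied by 100 to express them in percent. Skipping observations equal to zero in the MAPE calculation is an assumption of this example, made because the ratio is undefined there.

```python
import numpy as np

def mae(f, o):
    f, o = np.asarray(f, dtype=float), np.asarray(o, dtype=float)
    return np.mean(np.abs(f - o))

def mape(f, o):
    f, o = np.asarray(f, dtype=float), np.asarray(o, dtype=float)
    valid = o != 0                       # assumption: skip zero observations
    return np.mean(np.abs((f[valid] - o[valid]) / o[valid]))

def rmse(f, o):
    f, o = np.asarray(f, dtype=float), np.asarray(o, dtype=float)
    return np.sqrt(np.mean((f - o) ** 2))

def rmse_pct(f, o):
    # RMSE relative to the mean observation.
    return rmse(f, o) / np.mean(np.asarray(o, dtype=float))

forecasts = [2.0, 4.0]
observations = [1.0, 5.0]
print(mae(forecasts, observations))      # 1.0
print(rmse(forecasts, observations))     # 1.0
```

For this two-point example the errors are +1 and -1, so MAE and RMSE both equal 1, MAPE is (1/1 + 1/5)/2 = 0.6, and RMSE% is 1/3.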
4.1. Persistence Forecasts
We compare two kinds of persistence-type forecasting methods to our VAR model. The first one is
the simple persistence method, which we denote the SP model. The SP model relies upon the weather
condition at the current time to forecast future conditions. Again, letting $O_i$ denote the hour-$i$ observation,
the simple persistence forecast of the hour-$(i + \Delta i)$ weather variable generated at hour $i$ is defined as $O_i$
(i.e., SP assumes that the weather variable will have the same value at hour $i + \Delta i$ as it does at hour $i$).
This persistence forecast is applied to all three weather variables for comparison with the VAR model.
We also use the clearness persistence forecast proposed by Marquez and Coimbra (2012), which we denote
the CP model, to provide an additional benchmark for solar irradiance forecasts generated by our VAR.
The clearness persistence model relies on extraterrestrial solar radiation and takes the solar zenith angle
as an input. Let $\theta_i$ represent the solar zenith angle at hour $i$. We then define hour-$i$ extraterrestrial solar
radiation as:
\[ S_i = C \cdot \cos(\theta_i), \]
where $C = 1367$ W/m$^2$ is the solar constant. The clearness persistence forecast of the hour-$(i + \Delta i)$ solar
irradiance is then given by:
\[ O_i \cdot \frac{S_{i+\Delta i}}{S_i}. \]
We use the hourly mean solar zenith angle recorded in the NSRDB to generate our clearness persistence
forecasts.
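Both persistence benchmarks can be sketched in a few lines. The example below assumes NumPy, zenith angles given in degrees, and daytime hours only (zenith angles below 90 degrees, so that $S_i$ is positive). The function names are hypothetical.

```python
import numpy as np

SOLAR_CONSTANT = 1367.0  # C, in W/m^2

def simple_persistence(obs, horizon):
    """SP forecast: the value observed at hour i is the forecast for hour i + horizon."""
    obs = np.asarray(obs, dtype=float)
    return obs[:-horizon]

def clearness_persistence(obs, zenith_deg, horizon):
    """CP forecast: scale O_i by the ratio S_{i+horizon} / S_i of extraterrestrial radiation."""
    obs = np.asarray(obs, dtype=float)
    s = SOLAR_CONSTANT * np.cos(np.radians(np.asarray(zenith_deg, dtype=float)))
    return obs[:-horizon] * s[horizon:] / s[:-horizon]

obs = [100.0, 200.0, 300.0]
zenith = [60.0, 0.0, 60.0]               # illustrative daytime zenith angles in degrees
sp = simple_persistence(obs, horizon=1)  # forecasts for hours 1 and 2
cp = clearness_persistence(obs, zenith, horizon=1)
```

Entry `k` of each returned array is the forecast for hour `k + horizon`, so it is compared against `obs[horizon:]` when computing error metrics. Here the clearness persistence forecasts are 200 and 100, because the extraterrestrial radiation doubles and then halves between consecutive hours.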
4.2. Model Without Spatial Relationships
The VARNS model has a similar structure to the VAR model introduced in Section 3; however, the
spatial relationships between different locations are excluded from the model. Correlations between the
three weather variables at each individual location are kept in the model. For each location modeled,
the VARNS model has the form:
\[ Z_t = \mathrm{trend}_t + \mathrm{daySeas}_t + \mathrm{annSeas}_t + \sum_{l \in L} A_l \cdot Z_{t-l} + \Xi_t, \]
where:
\[ Z_t = (y_{1,t}, y_{2,t}, y_{3,t})^\top, \]
is a 3 × 1 vector of hour-$t$ weather response variables for the single location being modeled,
\[ L = \{1, 2, \cdots, 24, 48, 72, 96, 120, 144, 168\}, \]
is the same set of lags that are modeled in the VAR introduced in Section 3, $A_l$ are 3 × 3 coefficient matrices
for the lagged response variables, and:
\[ \Xi_t = (\xi_{1,t}, \xi_{2,t}, \xi_{3,t})^\top, \]
is a 3 × 1 vector of residuals.
It is important to stress that the first index of the subscripts on the $y_{1,t}$, $y_{2,t}$, and $y_{3,t}$ terms used to
define $Z_t$ does not directly correspond to the first index of the $y_{1,t}$, $y_{2,t}$, and $y_{3,t}$ terms used to define $Y_t$. We
are merely stressing in our definitions that the full VAR, which has Yt as the response variable, captures all
of the spatial relationships between the weather variables at different locations. The VARNS, which has Zt
as the response variable, only captures correlations between weather variables at each individual location.
4.3. Results
Tables 4 through 6 summarize the forecasting performance of our VAR model using the different metrics
discussed above. For some of the metrics, the table reports the average (among the 61 locations modeled)
as well as minimum and maximum values. These results are compared with results in the literature in
Section 4.4.
Table 4: Average, minimum, and maximum (among 61 locations modeled) MAE, MAPE, and RMSE of
wind forecasts produced by VAR model
                   MAE [m/s]              MAPE [%]                RMSE [m/s]
Forecast Horizon   Mean   Min    Max      Mean    Min     Max     Mean
1-Hour Ahead       1.03   0.18   1.39     27.03    5.54   38.26   1.38
2-Hours Ahead      1.18   0.36   1.69     30.52   11.19   40.13   1.58
3-Hours Ahead      1.28   0.54   1.91     32.80   17.12   42.28   1.70
4-Hours Ahead      1.35   0.67   2.06     34.36   21.15   45.43   1.79
5-Hours Ahead      1.40   0.74   2.17     35.49   23.71   47.77   1.85
6-Hours Ahead      1.44   0.79   2.27     36.36   25.30   49.66   1.90
Table 5: Average, minimum, and maximum (among 61 locations modeled) MAE, MAPE, RMSE, and
RMSE% of solar forecasts produced by VAR model
                   MAE [Wh/m^2]            MAPE [%]                RMSE [Wh/m^2]   RMSE% [%]
Forecast Horizon   Mean    Min     Max     Mean    Min     Max     Mean            Mean
1-Hour Ahead       34.67   21.90   48.15   37.03   26.53   51.58    73.19          21.31
2-Hours Ahead      42.55   26.43   57.33   40.81   28.70   58.84    85.83          24.99
3-Hours Ahead      47.08   28.79   61.86   42.81   29.50   60.25    93.34          27.17
4-Hours Ahead      49.77   30.13   64.75   43.98   29.77   61.47    98.07          28.55
5-Hours Ahead      51.33   30.76   66.12   44.69   29.77   61.78   101.05          29.42
6-Hours Ahead      52.24   31.08   66.86   45.04   29.84   62.30   102.81          29.93
Tables 7 through 9 summarize the average forecasting performance of the benchmarks mentioned above.
Tables 7 and 8 show that the VAR model slightly outperforms the VARNS for both temperature and wind.
Both the VAR and VARNS models outperform the simple persistence model.
Table 6: Average, minimum, and maximum (among 61 locations modeled) MAE and MAPE of temperature
forecasts produced by VAR model
                   MAE [°C]               MAPE [%]
Forecast Horizon   Mean   Min    Max      Mean    Min    Max
1-Hour Ahead       0.68   0.31   1.06     11.21   2.26   29.98
2-Hours Ahead      0.98   0.63   1.41     16.39   3.10   46.59
3-Hours Ahead      1.22   0.87   1.77     20.62   3.70   59.84
4-Hours Ahead      1.42   0.98   2.10     24.06   4.14   71.49
5-Hours Ahead      1.57   1.05   2.41     26.89   4.48   81.47
6-Hours Ahead      1.70   1.10   2.67     29.20   4.74   89.80
Table 7: Average (among 61 locations modeled) MAE of VAR, VARNS, and SP models
                   Temperature [°C]       Solar Radiation [Wh/m^2]   Wind Speed [m/s]
Forecast Horizon   VAR    VARNS   SP      VAR     VARNS   SP         VAR    VARNS   SP
1-Hour Ahead       0.68   0.70    0.99    34.67   34.26    63.08     1.03   1.03    1.10
2-Hours Ahead      0.98   1.06    1.77    42.55   43.64   109.19     1.18   1.20    1.36
3-Hours Ahead      1.22   1.35    2.50    47.08   49.11   152.96     1.28   1.32    1.56
4-Hours Ahead      1.42   1.59    3.18    49.77   52.36   194.04     1.35   1.40    1.73
5-Hours Ahead      1.57   1.79    3.80    51.33   54.22   231.74     1.40   1.46    1.88
6-Hours Ahead      1.70   1.95    4.36    52.24   55.25   265.39     1.44   1.51    2.01
Table 8: Average (among 61 locations modeled) MAPE [%] of VAR, VARNS, and SP models
                   Temperature             Solar Radiation            Wind Speed
Forecast Horizon   VAR     VARNS   SP      VAR     VARNS   SP         VAR     VARNS   SP
1-Hour Ahead       11.21   11.65   14.52   37.03   41.80   117.65     27.03   27.08   29.48
2-Hours Ahead      16.39   17.74   25.13   40.81   49.20   295.33     30.52   30.80   36.12
3-Hours Ahead      20.62   22.82   35.00   42.81   52.01   474.65     32.80   33.24   41.21
4-Hours Ahead      24.06   27.06   44.22   43.98   53.45   638.94     34.36   34.97   45.54
5-Hours Ahead      26.89   30.68   52.83   44.69   54.41   775.06     35.49   36.22   49.24
6-Hours Ahead      29.20   33.71   60.67   45.04   55.03   871.71     36.36   37.20   52.44
For solar radiation, the MAE of one-hour-ahead forecasts produced by the VARNS is slightly lower than
that of the VAR; however, the VAR model outperforms the VARNS in terms of MAPE and RMSE. Note that
the calculation of MAPE would require a division by zero when actual values are zero. Thus, the MAPE cannot
be calculated for hours in which the observed solar irradiance is zero but the forecast is not. Table 9 shows
that in general, the VAR model with spatial information is better than the model without in terms of solar
forecasting performance. Both the VAR and VARNS are better than the two persistence models, especially
when the forecasting horizon increases. This is also illustrated in Figure 6, which shows the average RMSE
of the different models as a function of the forecasting horizon.
4.4. Comparative Studies
Our VAR model provides good forecasting performance compared to other methods reported in the
literature, showing that our model can be used for providing short-term temperature, wind speed, and solar
radiation forecasts. The average (across the 61 locations modeled) performance of our model is comparable
to other works, while our model performs significantly better at some locations (this is indicated by the
minimum values of the performance metrics reported in Tables 4 through 6). Moreover, Tables 4 through 6
suggest that our VAR model provides relatively robust weather forecasts up to six hours ahead.
Table 9: Average (among 61 locations modeled) RMSE [Wh/m^2] of solar radiation forecasts produced by VAR,
VARNS, CP, and SP models
Forecast Horizon   VAR      VARNS    CP       SP
1-Hour Ahead        73.19    73.20    79.61   111.06
2-Hours Ahead       85.83    87.48   105.28   178.99
3-Hours Ahead       93.34    96.26   136.22   242.22
4-Hours Ahead       98.07   101.82   171.39   298.01
5-Hours Ahead      101.05   105.21   207.38   345.16
6-Hours Ahead      102.81   107.09   240.79   383.29
Figure 6: Average (Among 61 Locations Modeled) RMSE of Solar Radiation Forecasts Produced by VAR,
VARNS, CP, and SP Models (vertical axis: RMSE [Wh/m^2]; horizontal axis: Forecast Horizon [Hours Ahead], 1 to 6)
Traiteur et al. (2012) forecast wind speed using a blended ensemble, which consists of the Weather Research
and Forecasting Single Column Model and time series forecasts that are calibrated with Bayesian model
averaging. The MAEs of their hour-ahead wind speed forecasts are between 0.9 m/s and 0.95 m/s during the
day and are between 1.01 m/s and 1.07 m/s overnight. Erdem and Shi (2011) compare four approaches based
on an autoregressive moving average method for hour-ahead forecasting. Their method has MAEs ranging
from 0.8 m/s to 2.3 m/s. Li and Shi (2010) present a comparison study on the application of different
artificial neural networks in hour-ahead wind speed forecasting and measure forecasting performance in
terms of MAE, MAPE, and RMSE. The best MAE, MAPE, and RMSE among the locations they model
are 0.95 m/s, 19.4%, and 1.254 m/s, respectively. Chen et al. (2013) produce wind speed forecasts using a
Gaussian process applied to the outputs of an NWP model. Their hour-ahead and five-hour-ahead forecasts
have RMSEs of 1.8 m/s and 2.2 m/s, respectively.
Most short-term solar radiation forecasting is done using cloud motion derived from satellite images.
Examples of this method include the works of Heinemann et al. (2006); Perez et al. (2010); Traiteur et al.
(2012). Perez et al. (2010) report an increase in the RMSE% from 25% to 42% as the forecasting horizon
goes from hour-ahead to six-hours-ahead. Traiteur et al. (2012) compare their forecasts against single point
ground-truth stations and report RMSEs that vary from 68 Wh/m^2 to 120 Wh/m^2 for hour-ahead forecasts
and 140 Wh/m^2 to 200 Wh/m^2 for six-hours-ahead forecasts. Erdem and Shi (2011) generate one-, two-, and
three-hours-ahead solar forecasts and report RMSE%s of 23%, 32%, and 38%, respectively. Remund et al.
(2008) compare short-term global radiation forecasts of three different models and find that ECMWF (Global
Model of the European Centre for Medium-Range Weather Forecasts) is the best, with an RMSE% that
stays at about 38% for one- to five-hours-ahead forecasting.
Taylor and Buizza (2004) compare point forecasts of daily air temperature generated by six different
models to actual observations. The best MAE of an hour-ahead forecast that they report is 0.9 °C, as
opposed to an average of 0.68 °C generated by our model.
5. Conclusions
In this paper, we propose a time series VAR model to forecast temperature, solar radiation, and wind
speed at 61 locations around the United States. The forecasting performance is good for all three
weather variables. Given the influence of the three weather variables on electricity systems, the model is
able to provide proper inputs for electricity supply and demand modeling.
The consideration of spatial relationships allows the model to provide good forecasts for multiple locations
while considering cross correlations among the locations modeled. The comparison of the VAR and VARNS
models shows that the spatial terms do help in improving forecasting performance. The VAR model proposed
is also flexible in size. The forecasting performance is similar when it is used to forecast the three weather
variables for fewer locations (results for these more limited models are excluded for the sake of brevity). We also
show that the VAR model performs similarly to or better than other methods proposed in the literature,
including persistence forecasts.
Another important contribution of this paper is that it shows that a time series approach can be used
to provide robust short-term solar radiation forecasts with good forecasting performance.
This work does suggest several areas of future research. Although the VAR model proposed provides
good forecasts, it may be redundant given its large size. Each equation has about five thousand parameters
to be estimated, and not every one of these parameters contributes to the overall forecast. Thus, it may be
possible to tailor the model and its autoregressive structure to better exploit the
correlations in the data. The residuals also display heteroskedasticity, which weighted or generalized least-squares techniques may reduce.
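As a rough illustration of the weighted least-squares idea, the sketch below applies a two-step feasible weighted least-squares procedure to a synthetic regression. It is not part of the model estimated in this paper. The log-linear variance model, the simulated data, and the function names are all assumptions made for the example.

```python
import numpy as np


def wls(X, y, weights):
    """Weighted least squares: minimize sum_i weights[i] * (y[i] - X[i] @ beta)**2."""
    sw = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta


def feasible_wls(X, y):
    """Two-step feasible WLS; returns (WLS estimate, OLS estimate)."""
    # Step 1: ordinary least squares (unit weights) to obtain residuals.
    beta_ols = wls(X, y, np.ones(len(y)))
    resid = y - X @ beta_ols
    # Step 2: ASSUMED variance model, log(resid^2) linear in the regressors.
    # The fitted variances are off by a constant factor, which does not
    # change the weighted least-squares solution.
    gamma, *_ = np.linalg.lstsq(X, np.log(resid ** 2 + 1e-12), rcond=None)
    var_hat = np.exp(X @ gamma)
    # Step 3: re-estimate with weights inversely proportional to variance.
    return wls(X, y, 1.0 / var_hat), beta_ols


rng = np.random.default_rng(1)
n = 5000
x = rng.uniform(0.0, 3.0, n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 2.0])
sigma = np.exp(0.5 * x)  # noise standard deviation grows with x
y = X @ beta_true + sigma * rng.standard_normal(n)

beta_fwls, beta_ols = feasible_wls(X, y)
print("OLS  estimate:", beta_ols)
print("FWLS estimate:", beta_fwls)
```

Both estimators are consistent here. The weighted version down-weights the noisier observations, which typically yields more precise coefficient estimates when the variance model is reasonable.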
Acknowledgments
The authors thank Armin Sorooshian and the editor for helpful suggestions and conversations. The work
presented in this paper was supported by the National Science Foundation under Grant Number 1029337.
This work was also supported by an allocation of computing time from the Ohio Supercomputer Center.
References
Alaton, P., Djehiche, B., Stillberger, D., 2002. On modelling and pricing weather derivatives. Applied Mathematical Finance
9, 1–20.
Benth, F. E., Šaltytė-Benth, J., 2005. Stochastic Modelling of Temperature Variations with a View Towards Weather Derivatives. Applied Mathematical Finance 12, 53–85.
Campbell, S. D., Diebold, F. X., March 2005. Weather Forecasting for Weather Derivatives. Journal of the American Statistical
Association 100, 6–16.
Chen, N., Qian, Z., Meng, X., Nabney, I. T., 3-9 August 2013. Short-Term Wind Power Forecasting Using Gaussian Processes.
In: 23rd International Joint Conference on Artificial Intelligence. Beijing, China.
Chowdhury, B. H., Rahman, S., 4-8 May 1987. Forecasting sub-hourly solar irradiance for prediction of photovoltaic output. In:
19th IEEE Photovoltaic Specialists Conference. Institute of Electrical and Electronics Engineers, New Orleans, Louisiana,
USA.
Erdem, E., Shi, J., April 2011. ARMA based approaches for forecasting the tuple of wind speed and direction. Applied Energy
88, 1405–1414.
Giebel, G., Brownsword, R., Kariniotakis, G., Denhard, M., Draxl, C., January 2011. The State of the Art in Short-Term
Prediction of Wind Power: A Literature Overview, 2nd Edition. Tech. rep., ANEMOS.plus.
Hammer, A., Heinemann, D., Lorenz, E., Lückehe, B., July 1999. Short-term forecasting of solar radiation: a statistical
approach using satellite data. Solar Energy 67, 139–150.
Heinemann, D., Lorenz, E., Girodo, M., 2006. Forecasting of Solar Radiation. In: Dunlop, E. D., Wald, L., Suri, M. (Eds.),
Solar Energy Resource Management for Electricity Generation from Local Level to Global Scale. Nova Publishers, Ch. 7,
pp. 83–94.
Li, G., Shi, J., July 2010. On comparing three artificial neural networks for wind speed forecasting. Applied Energy 87,
2313–2320.
Marquez, R., Coimbra, C. F. M., 13-17 May 2012. Comparison of Clear-Sky Models for Evaluating Solar Forecasting Skill. In:
Proceedings of the World Renewable Energy Forum 2012. American Solar Energy Society, Denver, pp. 4443–4449.
Miranda, M. J., Fackler, P. L., 2002. Applied Computational Economics and Finance. MIT Press, Cambridge, Massachusetts.
Oetomo, T., Stevenson, M. J., August 2005. Hot or Cold? A Comparison of Different Approaches to the Pricing of Weather
Derivatives. Journal of Emerging Market Finance 4, 101–133.
Perez, R., Kivalov, S., Schlemmer, J., Hemker Jr., K., Renné, D., Hoff, T. E., December 2010. Validation of short and medium
term operational solar radiation forecasts in the US. Solar Energy 84, 2161–2172.
Perez, R., Moore, K., Wilcox, S. M., Renné, D., Zelenka, A., June 2007. Forecasting solar radiation—Preliminary evaluation
of an approach based upon the national forecast database. Solar Energy 81, 697–838.
Remund, J., Perez, R., Lorenz, E., 2008. Comparison of Solar Radiation Forecasts for the USA. In: 2008 European PV
Conference. Valencia, Spain.
Šaltytė-Benth, J., Benth, F. E., Jalinskas, P., 2007. A Spatial-temporal Model for Temperature with Seasonal Variance. Journal
of Applied Statistics 34, 823–841.
Svec, J., Stevenson, M. J., 2007. Modelling and forecasting temperature based weather derivatives. Global Finance Journal 18,
185–204.
Taylor, J. W., Buizza, R., August 2004. A comparison of temperature density forecasts from GARCH and atmospheric models.
Journal of Forecasting 23, 337–355.
Taylor, J. W., Buizza, R., January-March 2006. Density forecasting for weather derivative pricing. International Journal of
Forecasting 22, 29–42.
Traiteur, J. J., Callicutt, D. J., Smith, M., Roy, S. B., October 2012. A Short-Term Ensemble Wind Speed Forecasting System
for Wind Power Applications. Journal of Applied Meteorology and Climatology 51, 1763–1774.
Wilcox, S. M., August 2012. National Solar Radiation Database 1991-2010 Update: User’s Manual. Tech. Rep. NREL/TP-5500-54824, National Renewable Energy Laboratory.