# Sunny Days

## Summary

Frost has been identified as one of the leading causes of green seed in Western Canada. As the effects of climate change threaten to destabilize Albertan climate conditions, the weather-dependent exigencies of the canola industry can be remedied with long-term weather prediction models. The 2019 Calgary iGEM team has established such a weather prediction model through the use of a novel machine learning algorithm, the principal component neural network mean model (PNMM), which utilizes principal component analysis (PCA) and recurrent neural network (RNN) machine learning methodologies in tandem with traditional weather prediction factors. The PNMM was able to predict minimum daily temperature for the 2018 growing season with a mean absolute error of 2.101°C. Therefore, the PNMM can be utilized to predict inclement weather with a degree of accuracy for any location with sufficient historical weather data, allowing Albertan farmers to optimize their agricultural decision-making and minimize the occurrence of green seed.

## Inspiration

After speaking to Craig Shand and Angela Brakeenreed, we learned that farmers can use a few agronomic techniques, such as swathing and crop rotation, to protect their fields from inclement weather that would cause green seed production. However, current weather forecasting systems are not adequate for farmers to make these choices. With Alberta’s climate becoming increasingly volatile due to the effects of global warming, the high variability in contemporary weather patterns has turned the management and planning of canola crops into a risky and complicated affair (Deser et al. 2012), and modern temperature prediction methods cannot accurately predict frost events. Generally, current methods of long-term weather prediction, such as Environment and Climate Change Canada’s (ECCC’s) long-term probabilistic temperature and precipitation forecast maps (ECCC, 2019), give only the probabilities of temperatures being below, at, or above normal. This lack of daily temperature resolution limits the usefulness of these models in agricultural contexts. Additionally, applying predictions over a large area, as with ECCC’s and the Old Farmer’s Almanac’s predictions, reduces accuracy in comparison to localized predictions (Old Farmer’s Almanac, 2019).

To combat this problem, we utilized our experience in machine learning with MATLAB and TensorFlow to create a novel machine learning algorithm, dubbed the principal component neural network mean model (PNMM). The model predicts daily minimum temperatures for localized areas, achieving a mean absolute error of 2.101°C for the 2018 growing season. To generate an example use case, we randomly selected the southern Albertan town of Vulcan on which to train and test our model. To train our model, we used multiple years’ worth of weather data from climate measurement stations around Vulcan.

By predicting daily minimum temperatures, our weather modelling methodology can be used to forecast inclement weather. Therefore, the knowledge gained from this model can aid farmers in planning their seed choice, crop planting, and harvest timing.

# Methodology

## General Assumptions

- The averaged data from the five examined weather stations surrounding Vulcan, Alberta, are representative of the weather in Vulcan, Alberta.
- Interpolated values for missing weather data are accurate to the real values.

## Data Used

Input weather data was collected using the Alberta Agriculture and Forestry’s Alberta Climate Information Service (ACIS). Special thanks to ACIS for allowing the use of their data in our research. The dataset involved weather data from 5 different weather stations surrounding the town of Vulcan, Alberta. These weather stations were Blackie AGCM, Champion AGDM, Mossleigh AGCM, Queenstown, and Travers AGCM, where most measurements besides those attributed to precipitation were collected two metres above the ground. The data parameters obtained comprised the following: average relative humidity (%), average air temperature (°C), maximum air temperature (°C), accumulated precipitation (mm), minimum relative humidity (%), maximum relative humidity (%), minimum air temperature (°C), precipitation (mm), and average wind speed (km/h) (ACIS, 2019).

The target weather data were labeled as the daily minimum temperature values of the five examined weather stations 153 days into the future from input dates. The training data of the PCA model used weather data from the dates of June 13, 2012 to April 30, 2018. The labelled testing data was selected as the weather data from the dates of May 1, 2018, to September 30, 2018.
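The labelling scheme described above can be sketched in a few lines of Python. This is a minimal illustration rather than the team's actual code: the function name `make_labelled_pairs` and the dictionary-based data layout are assumptions made for the example, and only the 153-day horizon comes from the text.

```python
from datetime import date, timedelta

HORIZON_DAYS = 153  # forecast horizon stated in the text


def make_labelled_pairs(daily_min_temp, horizon_days=HORIZON_DAYS):
    """Pair each input date with the minimum temperature horizon_days later.

    daily_min_temp: dict mapping datetime.date -> minimum temperature (°C).
    Input dates whose target date is not in the data are skipped.
    (Hypothetical helper; the data layout is an assumption.)
    """
    pairs = {}
    for input_date in sorted(daily_min_temp):
        target_date = input_date + timedelta(days=horizon_days)
        if target_date in daily_min_temp:
            pairs[input_date] = (target_date, daily_min_temp[target_date])
    return pairs


# Made-up temperatures, used only to show the date arithmetic.
example = {date(2017, 11, 29): -8.0, date(2018, 5, 1): 3.5}
print(make_labelled_pairs(example))
```

Under this scheme, the label for May 1, 2018 (the first testing date) is attached to the input data of November 29, 2017.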

Figure 1. Location map of the five examined weather stations (Blackie AGCM, Champion AGDM, Mossleigh AGCM, Queenstown, and Travers AGCM) around Vulcan, Alberta, Canada (ACIS, 2019)

# Recurrent Neural Network

## Assumptions made for RNN

- The local minimum in error reached by training the model is equal to, or close to, the global minimum in error.
- Extrema of weather data outside of the training data do not excessively differ in magnitude from the extrema of the training weather data, ensuring the validity of using min-max normalization.

## Recurrent Neural Network Methodology

Recurrent neural networks (RNNs) are artificial neural networks (ANNs) where the hidden layer nodes feed back into the neural network. This architecture allows the neural network to develop internal memory states, allowing it to exploit time series data to inform the predictions of the RNN.

However, typical RNNs suffer from the “vanishing gradient problem,” as described by Hochreiter, Bengio, Frasconi, and Schmidhuber (2001). The “vanishing gradient problem” describes the issue where the rate of change of a model’s parameters during training decreases to negligible values, hence “vanishing.”

Deeply layered models, such as unrolled RNNs with long-term memory, are highly prone to the “vanishing gradient problem.” Therefore, many RNNs are limited in the temporal memory they can possess, which can lower their capacity for learning. In order to nullify this difficulty, the team needed to explore more advanced solutions.
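To make the effect concrete, the following toy sketch tracks how a gradient shrinks as it is propagated back through a scalar recurrence \(h_{t}=\tanh(w\,h_{t-1})\). The recurrence, the weight of 0.9, and the function name are illustrative assumptions, not part of the team's model.

```python
import math


def gradient_through_time(weight, h0, steps):
    """Return d h_T / d h_0 for the scalar recurrence h_t = tanh(weight * h_{t-1})."""
    h = h0
    gradient = 1.0
    for _ in range(steps):
        h = math.tanh(weight * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(w * h_prev)^2)
        gradient *= weight * (1.0 - h * h)
    return gradient


# The gradient with respect to the initial state shrinks as the number of
# unrolled time steps grows.
for steps in (1, 10, 100):
    print(steps, gradient_through_time(weight=0.9, h0=0.5, steps=steps))
```

Each step multiplies the gradient by a factor no larger than the weight, so with a weight below 1 the product decays geometrically with the number of time steps.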

## GRUs: A Solution to the Vanishing Gradient Problem

Therefore, our team implemented gated recurrent units (GRUs), as introduced by Cho et al. (2014). By replacing the standard nodes of an RNN with GRUs, which utilize update and reset gates, the neural network was able to maintain a long-term memory of temporal data to inform the decisions of the neural network.

The architecture of a GRU can be described as:

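For an iteration \(t\), where \(x_{t}\) is the input vector, \(h_{t}\) is the output vector, \(z_{t}\) is the update gate vector, \(r_{t}\) is the reset gate vector, \(W\) and \(U\) are matrices of parameters, \(b\) is a parameter vector, \(\sigma _{g}\) is the sigmoid activation function, and \(\phi _{h}\) is the hyperbolic tangent activation function, the general formulas for the update gate, reset gate, and output vectors are given as:

$$ z_{t}=\sigma _{g}(W_{z}x_{t}+U_{z}h_{t-1}+b_{z}), $$

$$ r_{t}=\sigma _{g}(W_{r}x_{t}+U_{r}h_{t-1}+b_{r}), \text{ and} $$

$$ h_{t}=z_{t}\odot h_{t-1}+(1-z_{t})\odot \phi _{h}(W_{h}x_{t}+U_{h}(r_{t}\odot h_{t-1})+b_{h}), $$

where \(\odot\) denotes element-wise multiplication and the output equation is written in the convention of Cho et al. (2014).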
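A single GRU update can be sketched in plain Python as follows. This is a minimal illustration and not the TensorFlow implementation used by the team: the helper names (`affine`, `gru_step`) and the parameter layout are assumptions, and the output vector is combined as \(h_{t}=z_{t}\odot h_{t-1}+(1-z_{t})\odot \tilde{h}_{t}\), following the convention of Cho et al. (2014).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, U, h, b):
    """Return W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_i
        for W_row, U_row, b_i in zip(W, U, b)
    ]


def gru_step(x_t, h_prev, params):
    """One GRU update; params maps 'z', 'r', 'h' to (W, U, b) triples.

    Hypothetical helper for illustration only.
    """
    W_z, U_z, b_z = params["z"]
    W_r, U_r, b_r = params["r"]
    W_h, U_h, b_h = params["h"]
    # Update gate and reset gate, each squashed into (0, 1).
    z_t = [sigmoid(a) for a in affine(W_z, x_t, U_z, h_prev, b_z)]
    r_t = [sigmoid(a) for a in affine(W_r, x_t, U_r, h_prev, b_r)]
    # Candidate state: the reset gate scales how much past state is used.
    gated_h = [r * h for r, h in zip(r_t, h_prev)]
    candidate = [math.tanh(a) for a in affine(W_h, x_t, U_h, gated_h, b_h)]
    # The update gate blends the previous state with the candidate state.
    return [z * h + (1.0 - z) * c for z, h, c in zip(z_t, h_prev, candidate)]


zeros2x2 = [[0.0, 0.0], [0.0, 0.0]]
params = {
    "z": (zeros2x2, zeros2x2, [20.0, 20.0]),  # large bias: update gate near 1
    "r": (zeros2x2, zeros2x2, [0.0, 0.0]),
    "h": (zeros2x2, zeros2x2, [0.0, 0.0]),
}
print(gru_step([1.0, 2.0], [0.4, -0.2], params))
```

When the update gate saturates near 1, as in the usage example, the previous state is carried forward almost unchanged, which is the mechanism that lets a GRU retain information over many time steps.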

A final neural network architecture of the form 21-256-256-256-5 was utilized by the team: 21 input features, 3 consecutive GRU layers of 256 units each as the hidden layers of the RNN, and a final output layer of 5 temperature predictions, one for each examined weather station.

## Data Used

The input data for the RNN model were collected by using Environment and Climate Change Canada’s (ECCC) Historical Data tool. The data consisted of hourly weather data from December 24, 2012, to September 30, 2018, for the five examined weather stations. The data fields of each weather station in the input data were air temperature (°C), dew point temperature (°C), relative humidity (%), and wind speed (km/h), with the exception of the wind speed fields for Mossleigh AGCM and Travers AGCM. Some data points were missing at certain times and had to be interpolated; the interpolation method used was hot deck imputation.
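Hot deck imputation fills a missing value with an observed value taken from the same dataset. The text does not state which variant was used, so the sketch below assumes one common form, in which each gap is filled with the most recent observed value in the same series (last observation carried forward). The function name is hypothetical.

```python
def hot_deck_locf(values):
    """Fill None entries with the most recent observed value.

    Gaps before the first observation are filled with the first observed value.
    This is one possible hot deck variant, assumed for illustration.
    """
    filled = list(values)
    # Donor for any leading gap: the first value that was actually observed.
    last_seen = next((v for v in values if v is not None), None)
    for i, v in enumerate(filled):
        if v is None:
            filled[i] = last_seen
        else:
            last_seen = v
    return filled


# Made-up hourly air temperatures with gaps.
print(hot_deck_locf([None, 1.5, None, None, 2.0, None]))
```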

The time interval of the collected training and validation weather data for the RNN was May 26, 2013, to April 30, 2018. Additionally, the test data was selected as the weather data from May 1, 2018, to September 30, 2018.

## Data Preprocessing

Data preprocessing is an indispensable initial step for neural network training. In order to enable compatibility between the input data and the RNN, input data was converted into numeric feature columns, which was especially relevant for providing date input information.

Additionally, min-max scaling was used to scale all input values to be in or almost in the range of [0,1]. Data normalization of this type was utilized to increase the efficiency of the training of the neural network, as numerically large inputs can disproportionately influence node values in early stages of training. The general formula for this type of scaling is given as:

$$x_{\text{normalized}}=\frac{x-\min(v)}{\max(v)-\min(v)},$$

where \(x_{\text{normalized}}\) is the normalized value of \(x\), \(x\) is a value of a data field \(v\), and \(\max(v)\) and \(\min(v)\) are the maximum and minimum values of \(v\), respectively.
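As a brief sketch of this scaling, the following Python functions compute the minimum and maximum of a data field and apply the formula. Fitting the extrema on the training data only is an assumption made for the example (it is consistent with the RNN assumption about extrema outside the training data), and the function names and temperature values are illustrative.

```python
def fit_min_max(training_values):
    """Return (min, max) of a data field, computed from training data."""
    return min(training_values), max(training_values)


def min_max_scale(x, v_min, v_max):
    """Scale x using the min-max formula (x - min) / (max - min)."""
    if v_max == v_min:
        raise ValueError("min-max scaling is undefined for a constant field")
    return (x - v_min) / (v_max - v_min)


# Made-up air temperatures (°C) standing in for a training data field.
v_min, v_max = fit_min_max([-30.0, 0.0, 30.0])
print(min_max_scale(0.0, v_min, v_max))   # inside the training range
print(min_max_scale(33.0, v_min, v_max))  # slightly above the training maximum
```

A value slightly beyond the training extrema maps slightly outside [0,1], which is why the scaled inputs are described as being in or almost in that range.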

## Training Data

Training data was selected as the least recent 80% of the original preprocessed input data. The remaining most recent 20% of the original preprocessed input data was taken as the validation data. In neural network training, validation data is not used for training, but is instead used to evaluate the neural network’s ability to generalize to new weather data.
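A chronological split of this kind can be sketched as below, assuming the samples are already sorted from oldest to newest. The function name is hypothetical.

```python
def chronological_split(samples, train_fraction=0.8):
    """Split time-ordered samples into (training, validation) without shuffling."""
    split_index = int(len(samples) * train_fraction)
    return samples[:split_index], samples[split_index:]


# Ten time-ordered placeholder samples: the first 8 train, the last 2 validate.
train, validation = chronological_split(list(range(10)))
print(train, validation)
```

Keeping the split chronological, rather than shuffling, means the validation data always lies after the training data in time, which mirrors how the model is used for forecasting.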

Additionally, the weather model training used a batch generator that produced each batch of data used in the training as a set of 64 continuous time series of 6 weeks of input data each, which were randomly sampled with replacement from the training data. These were all bundled with their corresponding labels. The neural network was trained to minimize the mean squared error (MSE) of its predicted temperature values versus the true temperature values for the training data, while ignoring a warm up period of the first 2 days (48 hours) of predictions. This period was ignored due to those early predictions lacking enough time series information to provide stable predictions.
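The batch sampling and the warm-up handling can be sketched as follows. This is an illustrative simplification rather than the team's TensorFlow code: it assumes hourly time steps (so 6 weeks is 1008 steps), uses one scalar label per time step instead of five station temperatures, and the function names are hypothetical.

```python
import random

BATCH_SIZE = 64
SEQUENCE_LENGTH = 6 * 7 * 24  # 6 weeks of hourly time steps
WARMUP_STEPS = 48             # first 2 days of predictions are ignored


def sample_batch(inputs, labels, rng, batch_size=BATCH_SIZE, seq_len=SEQUENCE_LENGTH):
    """Draw batch_size contiguous windows, with replacement, from aligned series."""
    max_start = len(inputs) - seq_len
    starts = [rng.randint(0, max_start) for _ in range(batch_size)]
    x_batch = [inputs[s:s + seq_len] for s in starts]
    y_batch = [labels[s:s + seq_len] for s in starts]
    return x_batch, y_batch


def mse_after_warmup(y_true, y_pred, warmup=WARMUP_STEPS):
    """Mean squared error over one sequence, skipping the first warmup steps."""
    errors = [(t - p) ** 2 for t, p in zip(y_true[warmup:], y_pred[warmup:])]
    return sum(errors) / len(errors)


# Placeholder series standing in for preprocessed inputs and labels.
inputs = list(range(2000))
labels = [2 * v for v in inputs]
x_batch, y_batch = sample_batch(inputs, labels, random.Random(0))
print(len(x_batch), len(x_batch[0]))
```

Because the start positions are drawn independently, the same window can appear more than once across batches, which is what sampling with replacement means here.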

## Training

In order to prevent overfitting, techniques like dropout and early stopping were utilized to preserve the ability of the model to generalize. Dropout is a technique whereby some nodes in the neural network are disabled during training, preventing them from memorizing the training data. The model used dropout rates of 0.1, 0.15, and 0.2 for the consecutive GRU layers in the RNN, respectively, corresponding to approximately 25, 38, and 51 randomly selected disabled nodes out of the 256 in each layer during each training update.
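The idea behind dropout can be sketched with a few lines of Python. This shows the commonly used "inverted" form, in which surviving activations are rescaled so their expected value is unchanged; whether the team's TensorFlow layers apply dropout in exactly this way is an assumption, and the function name is hypothetical. For a 256-node layer, the expected numbers of disabled nodes are 256 × 0.1 = 25.6, 256 × 0.15 = 38.4, and 256 × 0.2 = 51.2.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability rate; rescale the survivors."""
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


# One layer of 256 placeholder activations with a dropout rate of 0.2.
dropped = dropout([1.0] * 256, rate=0.2, rng=random.Random(0))
print(dropped.count(0.0), "of", len(dropped), "nodes disabled in this update")
```

The number of disabled nodes varies from one training update to the next, since each node is dropped independently at random.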

Each epoch of the training involved only 10 learning steps. The training encompassed 53 epochs, using early stopping with a patience of 6 epochs: training was halted once 6 consecutive epochs had passed without improving on the best validation error observed so far, and the best-performing network was selected as the final model. The early stopping technique was used in order to prevent the validation error from increasing due to model overfitting while training.

Additionally, the optimizer used in the RNN training was the Adam optimizer, as introduced by Kingma and Ba (2014). Adam was chosen due to its efficient performance when applied to problems with “non-stationary objectives and problems with very noisy and/or sparse gradients” (Kingma and Ba, 2014).
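The early stopping rule can be sketched as a small function over a list of per-epoch validation errors. This is a simplified illustration of the patience logic, not the callback used in the team's training code; the function name and the error values are made up.

```python
def early_stopping_epochs(validation_errors, patience=6):
    """Return (best_epoch, stop_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs fail to improve on the
    best validation error seen so far. (Hypothetical helper.)
    """
    best_error = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, error in enumerate(validation_errors, start=1):
        if error < best_error:
            best_error, best_epoch = error, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, len(validation_errors)


# Validation error improves for 3 epochs, then stalls.
errors = [5.0, 4.0, 3.0] + [3.5] * 10
print(early_stopping_epochs(errors))
```

In this sketch the model from the best epoch is the one that would be kept, and the stalled epochs after it are discarded.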

Once the training of the neural network concluded, the RNN weather model could then be used to forecast weather when given new preprocessed input data of the same dimensionality as the training data.

# PNMM

## PNMM Methodology

In order to help alleviate the individual shortcomings of the RNN and PCA models, we devised a hybrid model composed of the two aforementioned models and the historical normal temperature of Vulcan, collected from ACIS (2018). Temperature normals were incorporated into this model because of the integral role they play in most temperature models; ECCC and ACIS temperature predictions are expressed relative to temperature normals (ECCC, 2018; ACIS, 2018).

The designation given to this combined model was the principal component neural network mean model (PNMM). The benefit of hybrid models lies in the use of multiple approaches to the same problem, which increases confidence in a prediction when the component models corroborate each other. Conversely, when the component models’ predictions diverge, a hybrid model dampens the variance spikes of its component models across the time series.

Mathematically, we can represent the PNMM as follows:

- PNMM Prediction = \(D(t)\), the PNMM Prediction Function
- PCA Prediction = \(A(t|\Omega _{\alpha})\), the PCA Prediction Function supported by \(\Omega _{\alpha}\)
- RNN Prediction = \(B(t|\Omega _{\beta})\), the RNN Prediction Function supported by \(\Omega _{\beta}\)
- Normal Prediction = \(C(t|\Omega _{\alpha})\), the Temperature Normal Prediction Function supported by \(\Omega _{\alpha}\)
- \(\Omega\) = Information Space (\(\alpha\) for daily data, \(\beta\) for hourly data)
- \(m\) = (Number of Component Models) \(- 1\)

$$ D(t)=\frac{1}{m+1}\left[A(t|\Omega _{\alpha})+B(t|\Omega _{\beta})+C(t|\Omega _{\alpha})\right] $$

*The above equation is not to be taken too seriously. It is just an average.

Because the PNMM prediction is the mean of its component predictions, its error at any given time is never larger than that of the worst-performing component model at that time. Additionally, when considered across a long enough time scale, the reliability and robustness of the PNMM typically result in it outperforming its individual component models.
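Since the PNMM is described as an average of its component predictions, the combination step can be sketched as below. The function name and the example temperatures are hypothetical, chosen only to show the arithmetic.

```python
def pnmm_predict(pca_pred, rnn_pred, normal_pred):
    """Return the unweighted mean of the three component predictions (°C)."""
    components = (pca_pred, rnn_pred, normal_pred)
    return sum(components) / len(components)


# Made-up component predictions and observed value for a single day.
pca, rnn, normal, actual = 2.0, 5.0, 3.5, 4.0
combined = pnmm_predict(pca, rnn, normal)
component_errors = [abs(p - actual) for p in (pca, rnn, normal)]
print(combined, abs(combined - actual), max(component_errors))
```

For any single day, the absolute error of the mean cannot exceed the largest absolute error among the components, which is why the combined prediction is never the worst of the set on that day.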

# Results

## Model Evaluations

In order to evaluate the performance of the models, minimum temperature predictions for each day of the hindcasted time period of May 1, 2018 to September 30, 2018 were compiled into a time series and compared to the actual daily minimum temperatures for those dates. The results of this comparison are displayed in Figure 2.

Figure 2. Hindcasts of 2018 growing season average daily minimum temperature (°C) of Blackie AGCM, Champion AGDM, Mossleigh AGCM, Queenstown, and Travers AGCM weather stations

The metrics used for evaluating the performance of the models are mean absolute error (MAE) and mean squared error (MSE), given as:

$$ MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_{actual,i}-y_{predicted,i}\right| $$

and

$$ MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_{actual,i}-y_{predicted,i}\right)^{2}, $$

respectively,

where \(n\) is the number of days in the examined hindcast time interval, and \(y\) is the minimum temperature value on day \(i\) of the examined time interval. Therefore, lower MAE and MSE values indicate lower average error in the models’ predictions, whereas higher MAE and MSE values indicate inaccuracy in the models’ predictions. The results of these metrics are displayed in Table 1.
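Both metrics are straightforward to compute. The following sketch uses plain Python lists and made-up temperature values; the function names are illustrative.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all days."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of (actual - predicted)^2 over all days."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up daily minimum temperatures (°C) for three days.
actual = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 5.0]
print(mean_absolute_error(actual, predicted))
print(mean_squared_error(actual, predicted))
```

Because the MSE squares each error, a few large misses raise it much more than they raise the MAE, so the two metrics together indicate both the typical error and the presence of outliers.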

| Model | MAE (°C) | MSE (°C²) |
|---|---|---|
| PCA | 3.491 | 18.624 |
| RNN | 2.926 | 14.311 |
| PNMM | 2.101 | 6.621 |

Table 1. Evaluation results from the hindcasts of 2018 growing season average daily minimum temperature (°C) of Blackie AGCM, Champion AGDM, Mossleigh AGCM, Queenstown, and Travers AGCM weather stations.

The daily and monthly absolute errors of each model are displayed in **Figure 3** and **Figure 4**. From these results as well as the results displayed in **Table 1**, it can be seen that the PNMM demonstrates the greatest overall accuracy of the three models. The PNMM results alone can be seen in **Figure 5** and **Figure 6**.

Figure 3. Absolute error of hindcasts of 2018 growing season average daily minimum temperature (°C) of Blackie AGCM, Champion AGDM, Mossleigh AGCM, Queenstown, and Travers AGCM weather stations

Figure 4. Total monthly absolute error of hindcasts of 2018 growing season average daily minimum temperature (°C) of Blackie AGCM, Champion AGDM, Mossleigh AGCM, Queenstown, and Travers AGCM weather stations

Figure 5. PNMM hindcast of 2018 growing season average daily minimum temperature (°C) of Blackie AGCM, Champion AGDM, Mossleigh AGCM, Queenstown, and Travers AGCM weather stations

Figure 6. Absolute error of PNMM hindcast of 2018 growing season average daily minimum temperature (°C) of Blackie AGCM, Champion AGDM, Mossleigh AGCM, Queenstown, and Travers AGCM weather stations

It can be seen from **Figure 4** that the three models are comparable until August, when the PCA model underpredicts the minimum average temperature, resulting in its most inaccurate month of predictions. Additionally, in September, the RNN model fails to accurately predict the seasonal decrease in minimum average temperatures. Despite these setbacks, the PNMM managed to accurately follow the general trend of the weather data where the other models diverged.

# Conclusion

In order to model localized long-term temperature values to anticipate cold weather events, an RNN model, a PCA model, and a mean-based composite model named PNMM were created. When hindcasting the 2018 growing season, the MAE values of the RNN, PCA, and PNMM models were determined to be 2.917°C, 3.491°C, and 2.101°C, respectively. Additionally, the RNN, PCA, and PNMM models’ MSE values were computed to be 14.311 °C², 18.624 °C², and 6.621 °C², respectively. These results for predicting daily minimum temperature values 153 days into the future support the PNMM being the most accurate of the three models for future forecasting of weather conditions. Finally, the weather models developed here can be used as supplementary resources for farmers with needs for long-term weather forecasting in the interests of minimizing agriculturally harmful weather impacts.

# References

[ACIS] Alberta Climate Information Service. (2019). Current and Historical Alberta Weather Station Data Viewer [Data Viewer]. Alberta (AB): Alberta Agriculture and Forestry (AF); [updated 2019 May 23; accessed 2019 May 29]. Available from: http://climate.weather.gc.ca/historical_data/search_historic_data_e.html

Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. In *Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)*. Doha, Qatar: Association for Computational Linguistics.

Deser, C., Phillips, A., Bourdette, V., & Teng, H. (2012). Uncertainty in Climate Change Projections: The Role of Internal Variability. *Climate Dynamics, 38*, 527-546.

[ECCC] Environment and Climate Change Canada. (2019). National Archive Historical Data [Data Tool]. Canada: Environment and Climate Change Canada; [updated Jun 10, 2019; accessed 2019 Jun 12]. Available from: http://climate.weather.gc.ca/historical_data/search_historic_data_e.html

[ECCC] Environment and Climate Change Canada. [Internet]. (2019). Temperature and Precipitation Probabilistic Forecasts. Canada. [updated 2019 July 10; cited 2019 July 10] Available from: https://weather.gc.ca/saisons/prob_e.html

Old Farmer’s Almanac. (2019). *2019-2020 LONG RANGE WEATHER FORECAST FOR VULCAN, AB.* Retrieved from: https://www.almanac.com/weather/longrange/AB/Vulcan

Hochreiter, S., Bengio, Y., Frasconi, P., & Schmidhuber, J. (2001). Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In S. C. Kremer & J. F. Kolen (Eds.), *A Field Guide to Dynamical Recurrent Neural Networks* (pp. 237-243). New York, NY: IEEE Press.

Kingma, D. P., & Ba, J. (2014). Adam: A Method for Stochastic Optimization. Retrieved from https://arxiv.org/abs/1412.6980

Williams, R. J., & Zipser, D. (1995). Gradient-based learning algorithms for recurrent networks and their computational complexity. In Y. Chauvin & D. E. Rumelhart (Eds.), *Developments in connectionist theory. Backpropagation: Theory, architectures, and applications* (pp. 443-486). New Jersey, NJ: Lawrence Erlbaum Associates, Inc.