Predict Number of Active Cases by Covid-19 Pandemic based on Medical Facilities (Volume of Testing, ICU beds, Ventilators, Isolation Units, etc.) using Multi-variate LSTM based Multi-Step Forecasting Models

Introduction and Motivation

The intensity of the growth of the COVID-19 pandemic worldwide has propelled researchers to evaluate the best machine learning model that could predict the number of people affected in the distant future, by considering the current statistics and predicting the near-future terms in subsequent stages. While different univariate models like ARIMA/SARIMA and traditional time-series models are capable of predicting the number of active cases, daily recoveries, and number of deaths, they do not take into consideration other time-varying factors like medical facilities (volume of testing, ICU beds, hospital admissions, ventilators, isolation units, quarantine centres, etc.). As these factors become important, we build a predictive model that can predict the number of active cases, deaths, and recoveries based on changes in medical facilities as well as other changes in infrastructure. In this blog, we model multi-step time series prediction using deep learning models on the basis of the medical information available for different states of India.

Multi-Step Time Series Prediction

A typical multi-step predictive model looks like the figure below, where each predicted outcome from the previous step is treated as the input of the next step, deriving the outcome for the second step and so forth. (Source)

Deep Learning-based Multi-variate Time Series Training and Prediction

The following figure illustrates the important steps involved in selecting the best deep learning model for time-series based single/multi-step prediction:

1. Feeding data from a single source, or from aggregated sources available directly from the cloud or other 3rd-party providers, into the ML modeling data ingestion system.
2. Cleaning, preprocessing, and feature engineering of the multi-variate data, involving scaling and normalization.
3. Conversion of the data to a supervised time-series.
4. Feeding the data to a deep learning training source that can train different time-series models like LSTM, CNN, BI-LSTM, and CNN+LSTM using different combinations of hidden layers, neurons, batch size, and other hyper-parameters.
5. Forecasting in the near term or the far distant terms in the future, using Single-Step or Multi-Step Forecasting respectively.
6. Evaluation of error metrics (MAPE, MAE, ME, RMSE, MPE) by comparing the forecasts with the actual data, when it comes in.
7. Re-training the model, and continuous improvements when the threshold of error is exceeded.

Import Necessary Tensorflow Libraries

The code snippet below gives an overview of the necessary TensorFlow libraries, together with the supporting imports (NumPy, pandas, matplotlib, scikit-learn) used by the snippets throughout this post.

from math import sqrt
from numpy import array, split
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

from tensorflow.python.keras import Sequential
from tensorflow.python.keras.layers import Dense, LSTM, RepeatVector, TimeDistributed, Flatten, Bidirectional
from tensorflow.python.keras.layers.convolutional import Conv1D, Conv2D, MaxPooling1D, ConvLSTM2D

Data Loading and Selecting Features

As Delhi had high Covid-19 case counts, here we model the different DL models for the "DELHI" state (National Capital of India). Further, we keep the scope of dates from 25th March to 6th June 2020. Data till 29th April has been used for training, whereas data from 30th April to 6th June has been used for testing/prediction. The test data has been used to predict for 7 days at each of 3 subsequent stages of prediction.
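The split code below relies on a helper, closestNumber, that is not defined in the original snippet. A minimal sketch, under the assumption that it returns the multiple of m closest to n (which reproduces the 49/21 day split shown below):

def closestNumber(n, m):
    # candidate multiple of m at or below n
    q = int(n / m)
    n1 = m * q
    # candidate multiple of m on the other side of n
    n2 = m * (q + 1) if (n * m) > 0 else m * (q - 1)
    # return whichever multiple is closer to n
    return n1 if abs(n - n1) < abs(n - n2) else n2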
This code demonstrates how the data is first split into a 70:30 ratio between training and testing (by finding the closest multiple of 7), after which each set is restructured into weekly samples of data.

def split_dataset(data):
    # split into standard weeks
    print(np.shape(data))
    split_factor = int((np.shape(data)[0] * 0.7))
    print("Split Factor no is", split_factor)
    m = 7
    trn_close_no = closestNumber(split_factor, m)
    te_close_no = closestNumber((np.shape(data)[0] - split_factor), m)
    train, test = data[0:trn_close_no], data[trn_close_no:(trn_close_no + te_close_no)]
    print("Initials Train-Test Split --", np.shape(train), np.shape(test))
    len_train = np.shape(train)[0]
    len_test = np.shape(test)[0]
    # restructure into windows of weekly data
    train = array(split(train[0:len_train], len(train[0:len_train]) // 7))
    test = array(split(test, len(test) // 7))
    print("Final Train-Test Split --", np.shape(train), np.shape(test))
    return train, test

Running it on the Delhi dataset prints:

Initials Train-Test Split -- (49, 23) (21, 23)   <-- training and test datasets
Final Train-Test Split -- (7, 7, 23) (3, 7, 23)  <-- arranged into 7 and 3 weekly samples respectively

The dataset and the features have been scaled using the Min-Max scaler:

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_dataset = scaler.fit_transform(dataset)

Convert Time-Series to a Supervised DataSet

The tricky part lies in converting the time-series to a supervised time-series for multi-step prediction, i.e. incorporating the number of past days (the historic data) that the weekly data has to consider. The supervised series is built from windows of shape (7, 7, 23) for training and (3, 7, 23) for testing (as the data got split into 7 and 3 weekly samples), where 22 of the 23 features act as inputs and one is the predicted output series. This historic data helps the model to learn and predict for any day of the week. The code snippet below demonstrates what is described above.

Note 1: This is the most important step of formulating a time-series data to a multi-step model.

# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train.reshape((train.shape[0] * train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            X.append(data[in_start:in_end, :])
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)
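As a quick sanity check of the windowing (a usage sketch, assuming the (7, 7, 23) training split produced above), the 49 flattened training days yield 36 overlapping windows of 7 input days and 7 output days:

train_x, train_y = to_supervised(train, n_input=7, n_out=7)
print(np.shape(train_x))  # (36, 7, 23) -- 36 windows of 7 days x 23 features
print(np.shape(train_y))  # (36, 7)     -- 7-day-ahead targets (first column: active cases)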
Training Different Deep Learning Models using Tensorflow

In this section, we describe how we train different DL models using TensorFlow's Keras APIs.

Convolution Neural Network (CNN Model)

The following figure recollects the structure of a Convolution Neural Network (CNN), with a code snippet showing how a 1D CNN with 16 filters and a kernel size of 3 has been used to train the network over 7 weekly samples, where each sample spans 7 days. (Source)

# train CNN model
def build_model_cnn(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 200, 4
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # define model
    model = Sequential()
    model.add(Conv1D(filters=16, kernel_size=3, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(10, activation='relu'))
    model.add(Dense(n_outputs))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

[Figure: CNN Prediction]

LSTM

The following code snippet demonstrates how we train an LSTM model and plot the training and validation loss before making a prediction.

# train LSTM model
def build_model_lstm(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    print(np.shape(train_x))
    print(np.shape(train_y))
    # define parameters
    verbose, epochs, batch_size = 0, 50, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

The figure below illustrates the actual vs predicted outcome of the multi-step LSTM model, after the predicted outcome has been inverse-transformed (to remove the effect of scaling).

[Figure: LSTM Prediction]

Bi-Directional LSTM

The following code snippet demonstrates how we train a BI-LSTM model and plot the training and validation loss before making a prediction. (Source)

# train Bi-Directional LSTM model
def build_model_bi_lstm(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    print(np.shape(train_x))
    print(np.shape(train_y))
    # define parameters
    verbose, epochs, batch_size = 0, 50, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(Bidirectional(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features))))
    model.add(RepeatVector(n_outputs))
    model.add(Bidirectional(LSTM(200, activation='relu', return_sequences=True)))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

The figure below illustrates the actual vs predicted outcome of the multi-step Bi-LSTM model, after the predicted outcome has been inverse-transformed (to remove the effect of scaling).

[Figure: BI-LSTM Prediction]
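The loss-plotting code itself is not shown in the snippets above; a minimal sketch of how it could be done (an assumption on my part, using Keras' History object with a validation_split that the original fit calls do not pass):

hist = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size,
                 validation_split=0.2, verbose=0)
# plot training vs validation loss per epoch
plt.plot(hist.history['loss'], label='training loss')
plt.plot(hist.history['val_loss'], label='validation loss')
plt.legend()
plt.show()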
Stacked LSTM + CNN

Here we have used Conv1D layers whose output is fed to a single layer of LSTM to predict the different output sequences, as illustrated by the figure below. The CNN encoder is built first; its flattened output is repeated once per forecast step and decoded by the LSTM, with TimeDistributed Dense layers producing the per-step outputs. (Source)

# train Stacked CNN + LSTM model
def build_model_cnn_lstm(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 500, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

The prediction and inverse scaling yield the actual predicted outcomes, as illustrated below.

[Figure: LSTM With CNN]

Multi-Step Forecasting and Evaluation

The snippet below shows how the history is flattened and reshaped into (1, n_input, n) to forecast the following week. For the multi-variate time series (of 23 features), the history grows by one week at each of the 3 prediction stages, so the flattened history of shapes (49, 23), (56, 23) and (63, 23) is reshaped to (7, 7, 23), (8, 7, 23) and (9, 7, 23) respectively: a prediction for 3 weeks, where the data of previous weeks feeds each subsequent forecast.

# make a forecast
def forecast(model, history, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0] * data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, :]
    # reshape into [1, n_input, n]
    input_x = input_x.reshape((1, input_x.shape[0], input_x.shape[1]))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat

Note 2: If you wish to see the evaluation results and plots for each step as stated below, please check the notebook tseries_deeplearning_singlestep_forecats.ipynb at Github: https://github.com/sharmi1206/covid-19-analysis
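The exact inverse-transform helper is also not shown in the snippets; a sketch of one common approach (a hypothetical inverse_transform_active_cases, assuming active cases is the first of the 23 scaled columns) could look like:

def inverse_transform_active_cases(predicted_week, scaler, n_features=23):
    # hypothetical helper: pad the single predicted column back into the
    # 23-column feature space the scaler was fitted on, then invert the scaling
    padded = np.zeros((len(predicted_week), n_features))
    padded[:, 0] = predicted_week
    return scaler.inverse_transform(padded)[:, 0]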
Here, at each step at the granularity of every week, we evaluate the model and compare it against the actual output.

# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual, predicted):
    print("Actual Results", np.shape(actual))
    print("Predicted Results", np.shape(predicted))
    scores = list()
    # calculate an RMSE score for each day
    for i in range(actual.shape[1]):
        # calculate mse
        mse = mean_squared_error(actual[:, i], predicted[:, i])
        # calculate rmse
        rmse = sqrt(mse)
        # store
        scores.append(rmse)
        plt.figure(figsize=(14, 12))
        plt.plot(actual[:, i], label='actual')
        plt.plot(predicted[:, i], label='predicted')
        plt.title(ModelType + ' based Multi-Step Time Series Active Cases Prediction for step ' + str(i))
        plt.legend()
        plt.show()
    # calculate overall RMSE
    s = 0
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col]) ** 2
    score = sqrt(s / (actual.shape[0] * actual.shape[1]))
    return score, scores

# evaluate a single model
def evaluate_model(train, test, n_input):
    model = None
    # fit the model selected by ModelType
    if ModelType == 'LSTM':
        print('lstm')
        model = build_model_lstm(train, n_input)
    elif ModelType == 'BI_LSTM':
        print('bi_lstm')
        model = build_model_bi_lstm(train, n_input)
    elif ModelType == 'CNN':
        print('cnn')
        model = build_model_cnn(train, n_input)
    elif ModelType == 'LSTM_CNN':
        print('lstm_cnn')
        model = build_model_cnn_lstm(train, n_input)
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast(model, history, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i, :])
    # evaluate predictions for the days of each week
    predictions = array(predictions)
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores, test[:, :, 0], predictions
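Putting the pieces together, a typical driver call would be (a sketch; it assumes train and test come from split_dataset above, and ModelType is a module-level variable, since the functions above reference it globally):

ModelType = 'LSTM'   # one of 'LSTM', 'BI_LSTM', 'CNN', 'LSTM_CNN'
n_input = 7          # one week of history as input
score, scores, actual, predicted = evaluate_model(train, test, n_input)
print('Overall RMSE:', score)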
Here we show a uni-variate and a multi-variate multi-step time-series prediction.

Multi-Step Conv2D + LSTM (Uni-variate & Multi-Variate) based Prediction for State Delhi

A variation of the CNN-LSTM is the ConvLSTM (primarily for two-dimensional spatial-temporal data), where the convolutional reading of input is built directly into each LSTM unit. (Source)

Here, for this particular uni-variate time-series, the input vector is reshaped into subsequences of the form [samples, time steps, rows, cols, channels] before being fed to the ConvLSTM2D layer.

# train CONV LSTM2D model
def build_model_cnn_lstm_2d(train, n_steps, n_length, n_input):
    # prepare data
    train_x, train_y = to_supervised_2cnn_lstm(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 750, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape into subsequences [samples, time steps, rows, cols, channels]
    train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(ConvLSTM2D(filters=64, kernel_size=(1, 3), activation='relu',
                         input_shape=(n_steps, 1, n_length, n_features)))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

# convert history into inputs and outputs
def to_supervised_2cnn_lstm(train, n_input, n_out=7):
    # flatten data
    data = train.reshape((train.shape[0] * train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)
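With n_steps=2 and n_length=7 (as set in the driver code further below), each 14-day input window is split into 2 subsequences of 7 days. A quick shape check for the uni-variate case (a usage sketch, reusing the (7, 7, 23) training split from earlier):

train_x, train_y = to_supervised_2cnn_lstm(train, n_input=14)
print(np.shape(train_x))  # (29, 14, 1) -- 49 days give 29 windows of 14 days, 1 feature
train_x = train_x.reshape((train_x.shape[0], 2, 1, 7, 1))
print(np.shape(train_x))  # (29, 2, 1, 7, 1) -- [samples, time steps, rows, cols, channels]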
# make a forecast
def forecast_2cnn_lstm(model, history, n_steps, n_length, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0] * data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, 0]
    # reshape into [samples, time steps, rows, cols, channels]
    input_x = input_x.reshape((1, n_steps, 1, n_length, 1))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat

# evaluate a single model
def evaluate_model_2cnn_lstm(train, test, n_steps, n_length, n_input):
    # fit model
    model = build_model_cnn_lstm_2d(train, n_steps, n_length, n_input)
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast_2cnn_lstm(model, history, n_steps, n_length, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i, :])
    # evaluate predictions for the days of each week
    predictions = array(predictions)
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores, test[:, :, 0], predictions

The driver code loads the Delhi data, selects the features, scales and splits the dataset, and then evaluates the model:

df_state_all = pd.read_csv('all_states/all.csv')
df_state_all = df_state_all.drop(columns=['Latitude', 'Longitude', 'index'])
stateName = unique_states[8]
dataset = df_state_all[df_state_all['Name of State / UT'] == unique_states[8]]
dataset = dataset.sort_values(by='Date', ascending=True)
dataset = dataset[(dataset['Date'] >= '2020-03-25') & (dataset['Date'] <= '2020-06-06')]
print(np.shape(dataset))

daterange = dataset['Date'].values
no_Dates = len(daterange)
dateStart = daterange[0]
dateEnd = daterange[no_Dates - 1]
print(dateStart)
print(dateEnd)

dataset = dataset.drop(columns=['Unnamed: 0', 'Date', 'source1', 'state',
                                'Name of State / UT', 'tagpeopleinquarantine', 'tagtotaltested'])
print(np.shape(dataset))
n = np.shape(dataset)[0]

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_dataset = scaler.fit_transform(dataset)
# split into train and test
train, test = split_dataset(scaled_dataset)
# define the number of subsequences and the length of subsequences
n_steps, n_length = 2, 7
# define the total days to use as input
n_input = n_length * n_steps
score, scores, actual, predicted = evaluate_model_2cnn_lstm(train, test, n_steps, n_length, n_input)
# summarize scores
summarize_scores(ModelType, score, scores)

The model parameters can be summarized as shown below.

[Figure: Model Summary Conv2D + LSTM]

The evaluate_model function appends the model's forecasting score at each step and returns them all at the end. The figure below illustrates the actual vs predicted outcome of the multi-step ConvLSTM2D model, after the predicted outcome has been inverse-transformed (to remove the effect of scaling).

[Figure: Uni-Variate ConvLSTM2D]

For a multi-variate time series with 22 input features and one output prediction, we take into consideration the following changes. In the function forecast_2cnn_lstm, we replace the input data shaping to constitute the multi-variate features:

# in function forecast_2cnn_lstm
# retrieve last observations for input data (replacing column 0 with :)
input_x = data[-n_input:, :]
# reshape into [samples, time steps, rows, cols, channels]
# (replacing channel size 1 with data.shape[1] for multi-variate)
input_x = input_x.reshape((1, n_steps, 1, n_length, data.shape[1]))

Further, in the function to_supervised_2cnn_lstm, we replace x_input's feature slice from column 0 to : and its feature size from 1 to all 23 features, as follows:

x_input = data[in_start:in_end, :]
x_input = x_input.reshape((len(x_input), x_input.shape[1]))

[Figure: Multi-Variate ConvLSTM2D]

Conv2D + BI_LSTM

We can further try out a Bi-Directional LSTM with a 2D convolution layer, as depicted in the figure below. The model stacking and subsequent layers remain the same as in the previous step, with the exception of using a BI-LSTM in place of a single LSTM. (Source)

Now, let's look at the comparison metrics of the different deep learning models.

Conclusion

In this blog, I have discussed multi-step time-series prediction using deep learning mechanisms and compared/evaluated the models based on RMSE. Here, we notice that for a forecasting time-period of 7 days, the stacked ConvLSTM2D works the best, followed by the LSTM with CNN, CNN, and LSTM networks. More extensive model evaluation with different hidden layers and neurons, along with efficient hyperparameter tuning, can further improve accuracy.

Though we see that model accuracy decreases for multi-step models, they can be a useful tool for long-term forecasts, where the predicted outcomes of previous weeks play a dominant role in the predicted outputs. For complete source code, check out https://github.com/sharmi1206/covid-19-analysis

Acknowledgements

Special thanks to machinelearningmastery.com, as some of the concepts have been taken from there.
References

https://arxiv.org/pdf/1801.02143.pdf
https://machinelearningmastery.com/multi-step-time-series-forecasting/
https://machinelearningmastery.com/multi-step-time-series-forecasting-with-machine-learning-models-for-household-electricity-consumption/
https://machinelearningmastery.com/how-to-develop-lstm-models-for-multi-step-time-series-forecasting-of-household-power-consumption/
https://machinelearningmastery.com/convert-time-series-supervised-learning-problem-python/
https://www.tensorflow.org/tutorials/structured_data/time_series
https://www.aiproblog.com/index.php/2018/11/13/how-to-develop-lstm-models-for-time-series-forecasting/