Abstract— Investors take on considerable risk when making investments because they lack information about the future sales of a company; in other words, they lack forecasting. As a result, they may face losses in the future. The same holds for transportation, especially the airline sector, because of its large investments and high risk factor. Keeping the above issue in mind, this paper deals with forecasting and proposes the use of a Long Short-Term Memory (LSTM) network. Unlike existing schemes, the proposed scheme minimises the Root Mean Square Error to a great extent, which ultimately leads to better forecasting results. The proposed scheme consists of four steps: constructing the LSTM model, processing the data to make it suitable for the LSTM model, fitting a stateful LSTM network to the training data, and finally evaluating the static LSTM model on the test data and reporting forecast performance. Furthermore, the obtained results indicate that the proposed scheme reduces the chances of false positives to a great extent and is practical enough to be implemented in a real-time scenario.

Keywords- LSTM, time series prediction, forecasting

I. Introduction

Time series modeling is a dynamic research area which has attracted the attention of the research community over the last few decades. The main aim of time series modeling is to carefully collect and rigorously study the past observations of a time series in order to develop an appropriate model which describes the inherent structure of the series. This model is then used to generate future values for the series, that is, to make forecasts. Time series forecasting can thus be termed the act of predicting the future by understanding the past [1].

Nowadays travelling plays a major role in people's lives, and a huge amount of investment is made in it. The faster one can travel from one place to another, the better it is for the user, which is why airlines are major targets for investors. This paper deals with an airline dataset which gives the current and previous records of the number of passengers who travelled by airline. With the help of these records, this paper analyses the number of passengers in the upcoming months and years. This information is a key factor for investors in deciding whether or not to invest in the airline business, depending on whether the rate of change in the number of customers is positive or negative.

To solve the above problem we use a stacked stateful LSTM with memory between batches. In the proposed scheme we construct an LSTM model. LSTM is a type of Recurrent Neural Network (RNN). The benefit of this type of network is that it can learn and remember over long sequences and does not rely on a pre-specified window of lagged observations. The loaded dataset contains the number of passengers who travelled by the airline in a particular month and year. The dataset needs to be transformed (normalised and reshaped) to make it more suitable for the LSTM model.
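As an illustration, the reframing of the raw series into input/output pairs for the LSTM can be sketched as follows. The passenger values shown are placeholders rather than the actual dataset, and the `look_back` window size of one is an assumption:

```python
import numpy as np

def create_dataset(series, look_back=1):
    """Frame a univariate series as supervised pairs:
    X = values at times t .. t+look_back-1, y = the value at t+look_back."""
    X, y = [], []
    for i in range(len(series) - look_back):
        X.append(series[i:i + look_back])
        y.append(series[i + look_back])
    return np.array(X), np.array(y)

# illustrative values, not the actual airline dataset
series = np.array([112.0, 118.0, 132.0, 129.0, 121.0])
X, y = create_dataset(series, look_back=1)
# Keras LSTM layers expect 3-D input of shape [samples, time steps, features]
X = X.reshape((X.shape[0], 1, X.shape[1]))
```

Each month's passenger count becomes the input used to predict the next month's count, which is what makes the problem suitable for supervised learning.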

A. MOTIVATION

The risk factor of any investment increases as the amount of the investment increases, which is why investors keep track of a company's profile and measure its stability. But predictions from the above method turn out to carry a large amount of error. This paper proposes a scheme to reduce this error and to achieve more accurate forecasting. The major motivation of this paper is to reduce the risk factor and also to analyse the company.

B. CONTRIBUTION

The primary contribution of this paper is a Long Short-Term Memory model that reduces the Root Mean Square Error, implemented with the Keras deep learning library. Another major contribution of this paper is the combination of the Keras deep learning library with LSTM.

C. ORGANIZATION

The rest of the paper is organized as follows. Section II covers related work done in this field. Section III gives a brief description of the working of the proposed scheme. Section IV elaborates the proposed methodology. The results and discussion are presented in Section V. The paper is finally concluded in Section VI.

II. RELATED WORK

The model we present in the next section is the result of inspiration taken from prior work on airline prediction based on various factors. [18] used a back-propagation neural network and a genetic algorithm to forecast air passenger demand in Egypt (international and domestic). The factors that influence air passenger numbers are identified, evaluated and analyzed by applying the back-propagation neural network to monthly data from 1970 to 2013. [19] is a comparative study of a new method against traditional forecasting techniques such as moving average, exponential smoothing, regression, etc. All methods were compared on the basis of a standard error measure, the mean absolute percentage error. [20] proposes an ensemble empirical mode decomposition (EEMD) based support vector machines (SVMs) modeling framework incorporating a slope-based method to restrain the end-effect issue occurring during the sifting process of EEMD.

Along with the above research papers, one of the most popular and frequently used stochastic time series models is the Autoregressive Integrated Moving Average (ARIMA) model [3], [5]. The basic assumption made to implement this model is that the considered time series is linear and follows a particular known statistical distribution, such as the normal distribution. The ARIMA model has subclasses of other models, such as the Autoregressive (AR) [6], [8], Moving Average (MA) [5] and Autoregressive Moving Average (ARMA) [4], [5] models.

For seasonal time series forecasting, Box and Jenkins [8] proposed a quite successful variation of the ARIMA model, viz. the Seasonal ARIMA (SARIMA) [5], [8]. The popularity of the ARIMA model is mainly due to its flexibility in representing several varieties of time series with simplicity, as well as the associated Box-Jenkins methodology [3], [5] for the optimal model building process. But the severe limitation of these models is the pre-assumed linear form of the associated time series, which becomes inadequate in many practical situations.

III PROPOSED SCHEME

This section illustrates the proposed scheme. To solve the problem of airline passenger time series prediction, we have developed a model which consists of a Sequential container, two LSTM layers and one Dense layer. Fig. 1 shows the flow of data across the various layers of the model.


SEQUENTIAL LAYER:

It is used to create a sequential model and acts as a linear stack of layers. All other layers, such as LSTM and Dense, are added to it to create the model.

LSTM LAYER:

Long Short-Term Memory networks, usually just called "LSTMs", are a special kind of RNN capable of learning long-term dependencies. They were introduced by Hochreiter & Schmidhuber (1997) [21]. LSTMs are explicitly designed to avoid the long-term dependency problem. All recurrent neural networks have the form of a chain of repeating modules of neural network. In standard RNNs, this repeating module has a very simple structure, such as a single tanh layer.

Fig. 2

LSTMs also have this chain-like structure, but the repeating module has a different structure. Instead of a single neural network layer, there are four, interacting in a very special way.

Fig. 3

DENSE LAYER:

A dense layer is simply a layer in which each unit or neuron is connected to every neuron in the next layer.
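The model described above can be sketched in Keras as follows. This is a minimal sketch, not the paper's exact code: the unit count (4) and the batch and time-step settings are illustrative assumptions.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def build_model(batch_size=1, time_steps=1, features=1, units=4):
    """Sequential container with two stacked LSTM layers and one Dense layer."""
    model = Sequential()
    # A stateful LSTM keeps its cell state between batches, so it needs a
    # fixed batch_input_shape; return_sequences=True passes the whole
    # sequence of hidden states on to the second LSTM layer.
    model.add(LSTM(units,
                   batch_input_shape=(batch_size, time_steps, features),
                   stateful=True, return_sequences=True))
    model.add(LSTM(units, stateful=True))
    model.add(Dense(1))  # single regression output: the next value of the series
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
```

Stacking the second LSTM on top of the first is what makes the network "deep" in the sequence dimension, while statefulness provides the memory between batches mentioned in the proposed scheme.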

IV PROPOSED METHODOLOGY

The steps that we took to solve the problem at hand were:

1. In the first step we imported the required modules, namely numpy, matplotlib, keras, math, pandas and sklearn.

2. In the second step the dataset "international-airlines-passenger.csv" was loaded using the pandas read_csv() function.

3. In the third step the data was normalised using the MinMaxScaler() function of the sklearn module.

4. In the fourth step the dataset was broken into two parts, one for training (67%) and the other for testing (33%), and the resulting data chunks were reshaped so that they could be fed into the model.

5. In the fifth step we trained and fitted our model.

6. In the sixth step we made predictions for the training and testing datasets.

7. Finally, in the last step graphs were plotted for the predicted values.
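Steps 3 and 4 above can be sketched as follows. The series here is a stand-in for the airline dataset, and the exact reshaping depends on the model's input shape:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# stand-in passenger counts, not the actual "international-airlines-passenger.csv"
data = np.linspace(104.0, 622.0, 30).reshape(-1, 1)

# step 3: scale the series into [0, 1] so the LSTM's activations behave well
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(data)

# step 4: chronological 67% / 33% train/test split (no shuffling for time series)
train_size = int(len(scaled) * 0.67)
train, test = scaled[:train_size], scaled[train_size:]

# after forecasting, predictions are mapped back to passenger counts with
# scaler.inverse_transform(...) before computing the RMSE
```

Splitting chronologically rather than randomly matters here: the test set must consist of months that come after the training months, since the goal is to forecast the future.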

V RESULTS

The result obtained after working through the above methodology can be summarised using a graph. The graph thus plotted shows how accurately the above model predicts the actual data points.

Fig. 4

The key to reading the graph shown in Fig. 4 is presented in Table 1.

COLOUR   DATA POINTS
BLUE     ACTUAL DATA POINTS
ORANGE   PREDICTED POINTS OF TRAINING DATA
GREEN    PREDICTED POINTS OF TESTING DATA

Table 1

VI CONCLUSION

From the graph shown in Fig. 4 we can conclude that the model works quite well at recreating the values that were used in training, but is less accurate at predicting unseen data. Overall, however, the model works fairly well. Also, a sudden jump can be seen when shifting from the training to the testing data.

VII FUTURE WORK

In future work one might extend the model by incorporating convolutional layers and making the model even deeper, so that the efficiency of the network improves and the sudden change observed when moving from the training to the testing data points fades away.

VIII REFERENCES

[1] T. Raicharoen, C. Lursinsap, P. Sanguanbhoki, "Application of critical support vector machine to time series prediction", Circuits and Systems, 2003. ISCAS '03. Proceedings of the 2003 International Symposium on, Volume 5, 25-28 May 2003, pages V-741–V-744.

[2] G.P. Zhang, "A neural network ensemble method with jittered training data for time series forecasting", Information Sciences 177 (2007), pages 5329–5346.

[3] G.P. Zhang, "Time series forecasting using a hybrid ARIMA and neural network model", Neurocomputing 50 (2003), pages 159–175.

[4] John H. Cochrane, "Time Series for Macroeconomics and Finance", Graduate School of Business, University of Chicago, Spring 1997.

[5] K.W. Hipel, A.I. McLeod, "Time Series Modelling of Water Resources and Environmental Systems", Amsterdam, Elsevier, 1994.

[6] J. Lee, "Univariate time series modeling and forecasting (Box-Jenkins Method)", Econ 413, lecture 4.

[7] C. Hamzacebi, "Improving artificial neural networks' performance in seasonal time series forecasting", Information Sciences 178 (2008), pages 4550–4559.

[8] G.E.P. Box, G. Jenkins, "Time Series Analysis, Forecasting and Control", Holden-Day, San Francisco, CA, 1970.

[9] H. Tong, "Threshold Models in Non-Linear Time Series Analysis", Springer-Verlag, New York, 1983.

[10] K.L. Chu, K.S.M. Sahari, "Behavior recognition for humanoid robots using long short-term memory", 2016;13(6):172988141666336.

[11] H. Palangi, L. Deng, Y.L. Shen, J.F. Gao, X.D. He, J.S. Chen, et al., "Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval", IEEE-ACM Trans. Audio Speech Lang. 2016;24(4):694–707.

[12] H. Palangi, R. Ward, L. Deng, "Distributed Compressive Sensing: A Deep Learning Approach", IEEE Transactions on Signal Processing. 2016;64(17):4504–18.

[13] J. Yue, W. Zhao, S. Mao, H. Liu, "Spectral–spatial classification of hyperspectral images using deep convolutional neural networks", Remote Sensing Letters. 2015;6(6):468–77.

[14] G. Hinton, L. Deng, D. Yu, G.E. Dahl, A. Mohamed, N. Jaitly, et al., "Deep Neural Networks for Acoustic Modeling in Speech Recognition", IEEE Signal Processing Magazine. 2012;29(6):82–97.

[15] G.E. Dahl, D. Yu, L. Deng, A. Acero, "Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition", IEEE Transactions on Audio Speech & Language Processing. 2012;20(1):30–42.

[16] G. Mesnil, X. He, L. Deng, Y. Bengio, "Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding", Interspeech, 2013.

[17] S. Hochreiter, J. Schmidhuber, "Long Short-Term Memory", Neural Computation. 1997;9(8):1735–80. pmid:9377276.

[18] M.M. Mohie El-Din, M.S. Farag, A.A. Abouzeid, "Airline Passenger Forecasting in EGYPT (Domestic and International)", International Journal of Computer Application (0975-8887), Department of Mathematics, Faculty of Science, Al-Azhar University, Nasr City, Cairo 31884, Egypt.

[19] Lawrence R. Weatherford, Travis W. Gentry, Bogdan Wilamowski, "Neural network forecasting for airlines: A comparative analysis", Journal of Revenue and Pricing Management, Volume 1, Number 4.

[20] Yukun Bao, Tao Xiong, Zhongyi Hu, "Forecasting Air Passenger Traffic by Support Vector Machines with Ensemble Empirical Mode Decomposition and Slope-Based Method", Department of Management Science and Information System, School of Management, Huazhong University of Science and Technology, Wuhan 430074, China.

[21] In addition to the original authors, many people contributed to the modern LSTM. A non-comprehensive list includes: Felix Gers, Fred Cummins, Santiago Fernandez, Justin Bayer, Daan Wierstra, Julian Togelius, Faustino Gomez, Matteo Gagliolo, and Alex Graves.