Abstract – As
technology advances, the cost of main memory is shrinking significantly. This
changes how Big Data is processed and managed in comparison to
traditional, ad hoc parallel data processing methods. In-memory data
processing and management has recently emerged and is starting to play a major role in
today’s data-driven arena. Among in-memory systems, one of the most well-known is
Apache Spark. Spark has been increasingly adopted by industry in recent years
for big data analysis because it provides a fault-tolerant, scalable and easy-to-use
in-memory abstraction. However, there are unique computational and statistical
challenges, such as data locality, fault tolerance and consistency, which need
to be identified and classified given the rapid development of in-memory
applications. This paper presents a state-of-the-art Systematic Literature Review
(SLR) of in-memory data processing and management, covering the methods
theorized, proposed or employed by
organizations and academia, to help readers understand this landscape in a systematic
manner. This SLR is carried out by observing and evaluating contributions,
summarizing knowledge, and thereby identifying limitations, implications and
potential research paths to support the choice of a research theme. Thus, to
trace the implementation of Big Data strategies, a profiling method is employed
to analyse articles (published in peer-reviewed journals between 2005 and 2017) extracted
from various databases such as IEEE, ScienceDirect, ACM Digital Library and