In order to understand ‘Big Data’, we first need to know what ‘data’ is. The Oxford dictionary defines ‘data’ as "the quantities, characters, or symbols on which operations are performed by a computer, which may be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media."

Big data is a term used to describe data that is high volume, high velocity, and/or high variety; that requires new technologies and techniques to capture, store, and analyze; and that is used to enhance decision making, provide insight and discovery, and support and optimize processes. Here, big data is used to better understand customers and their behaviors and preferences. Companies are keen to expand their traditional data sets with social media data, browser logs, text analytics, and sensor data to get a more complete picture of their customers.

Big Data Sources. Big data sources are repositories of large volumes of data.
…
This brings more information to users’ applications without requiring that the data be held in a single repository or a cloud vendor’s proprietary data store. Examples of big data sources are Amazon Redshift, HP Vertica, and MongoDB; a minimal access sketch follows this paragraph. The general consensus of the day is that there are specific attributes that define big data. In most big data circles, these are called the four V’s: volume, variety, velocity, and veracity. (You might consider a fifth V, value.)
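To make the "no single repository" point concrete, here is a minimal sketch, assuming a local MongoDB instance; the database and collection names and the "channel" field are hypothetical illustrations. It shows an application aggregating customer events directly where they are stored, rather than copying them into a central warehouse first.

```python
# A minimal sketch, assuming a local MongoDB instance; the names
# "analytics", "customer_events", and the "channel" field are hypothetical.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
events = client["analytics"]["customer_events"]

# Count events per channel (e.g. web, mobile, in-store) to sketch how an
# application can build a fuller customer picture in place.
pipeline = [
    {"$group": {"_id": "$channel", "total": {"$sum": 1}}},
    {"$sort": {"total": -1}},
]
for row in events.aggregate(pipeline):
    print(row["_id"], row["total"])

client.close()
```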
That’s why big data analytics technology is so important to healthcare. By quickly analyzing large amounts of information, both structured and unstructured, health care providers can deliver lifesaving diagnoses or treatment options almost immediately.

Big data tools: Talend Open Studio.
Talend also offers an Eclipse-based IDE for stringing together data processing jobs with Hadoop. Its tools are designed to help with data integration, data quality, and data management, all with subroutines tuned to these jobs.

So, ‘Big Data’ is also data, but of a huge size. ‘Big Data’ is a term used to describe a collection of data that is huge in size and yet growing exponentially with time. In short, such data is so large and complex that none of the traditional data management tools can store or process it efficiently. Statistics show that 500+ terabytes of new data are ingested into the databases of the social media site Facebook every day. This data is mainly generated through photo and video uploads, message exchanges, posting of comments, and so on.
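To illustrate the kind of distributed processing job such tools string together, here is a minimal sketch, not Talend’s actual output: it uses PySpark, a related distributed engine chosen here for brevity instead of raw Hadoop MapReduce, and assumes a hypothetical plain-text log (uploads.log) with one event type per line, such as "photo", "video", or "comment". It counts events by type in parallel, the sort of aggregation that Facebook-scale volumes would make impractical on a single machine.

```python
# A minimal sketch, assuming PySpark is installed and "uploads.log" is a
# hypothetical log file with one event type per line ("photo", "video", ...).
from pyspark.sql import SparkSession
from pyspark.sql.functions import count

spark = SparkSession.builder.appName("upload-counts").getOrCreate()

# Read the raw event log; at real scale this would sit in HDFS or a similar
# distributed store rather than a local file.
events = spark.read.text("uploads.log")

# Group identical event types and count them in parallel across the cluster.
counts = events.groupBy("value").agg(count("*").alias("total"))

counts.orderBy("total", ascending=False).show()
spark.stop()
```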