Big Data In 5 Minutes | What Is Big Data?| Big Data Analytics | Big Data Tutorial | Simplilearn
Summary
This Simplilearn video introduces Big Data, highlighting its immense scale and the challenges it poses to traditional computing systems. The video states that a single smartphone user generates about 40 exabytes of data every month through activities such as social media, searches, and emails, so the global output is enormous. It explains the Five V's of Big Data (Volume, Velocity, Variety, Veracity, and Value) using the healthcare industry as an example. It also introduces frameworks such as Hadoop for storing and processing big data, and discusses real-world applications of Big Data in gaming and disaster management. The video concludes with a trivia question about Hadoop and an invitation to viewers to think about Big Data's future impact.
Highlights
According to the video, a single smartphone user generates about 40 exabytes of data monthly.
The Five V's help define and manage Big Data effectively.
Hadoop's Distributed File System offers fault-tolerant, distributed data storage.
MapReduce breaks a task into smaller tasks that run in parallel for efficient processing.
Big Data's impact is significant in diverse areas like healthcare and natural disaster prediction.
Key Takeaways
Big Data is a massive amount of data beyond traditional computing's capacity to manage.
The Five V's of Big Data (Volume, Velocity, Variety, Veracity, and Value) guide its classification and handling.
Hadoop, using its Distributed File System, stores copies of data across machines, so the data remains safe even if one machine fails.
Parallel processing with MapReduce makes data processing faster and more efficient.
Big Data applications span sectors like healthcare, gaming, and disaster management, improving outcomes in each.
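The storage takeaway above can be made concrete with a small sketch. This is a toy model in plain Python, not Hadoop's actual API: the function names, the tiny block size, the node names, and the round-robin placement rule are all assumptions chosen for illustration. It only shows the idea from the video that a file is cut into chunks and each chunk is copied to more than one machine.

```python
from typing import Dict, List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut a file's bytes into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: List[str], replication: int) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes, round-robin (hypothetical rule)."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def survives_failure(placement: Dict[int, List[str]], failed_node: str) -> bool:
    """True if every block still has at least one copy on a healthy node."""
    return all(
        any(node != failed_node for node in holders)
        for holders in placement.values()
    )


# A 1000-byte "file" split into 256-byte blocks, two copies of each block.
blocks = split_into_blocks(b"x" * 1000, block_size=256)
placement = place_replicas(len(blocks), ["node1", "node2", "node3"], replication=2)
print(len(blocks), placement)
print(survives_failure(placement, "node2"))
```

With two copies of every block, losing any single node still leaves one copy of each block; with only one copy, the same failure would lose data.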
Overview
Imagine the vast data ocean created by our daily digital habits! Every text, call, photo, and search adds to a massive digital universe. If 40 exabytes per month per smartphone isn't mind-boggling enough, multiply it by 5 billion users globally. That's the realm of Big Data, far exceeding traditional computing's grasp.
Big Data is characterized by the Five V's: Volume, Velocity, Variety, Veracity, and Value. Take the healthcare industry as an example: enormous amounts of data, from patient records to medical images, demand swift and accurate handling to offer real value in medical insights and treatment enhancements.
Hadoop knocks out Big Data challenges! By slicing data into smaller chunks stored across different machines (thanks, HDFS!) and leveraging parallel processing (hello, MapReduce), it ensures not only fault-tolerant storage but also super-speedy data analysis. From gaming to forecasting hurricanes, Big Data's influence is as extensive as it is exciting.
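The parallel-processing idea can be sketched in a few lines of Python. This is a minimal illustration of the map-then-reduce pattern, not Hadoop MapReduce itself: the word-count task, the function names, and the use of worker threads to stand in for separate machines are assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import List


def map_count(chunk: str) -> Counter:
    """Map step: count the words in one chunk independently of the others."""
    return Counter(chunk.lower().split())


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(chunks: List[str], workers: int = 3) -> Counter:
    # Each worker thread stands in for one machine handling one sub-task.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_count, chunks))
    # Assemble the partial results at the end.
    return reduce(reduce_counts, partials, Counter())


chunks = [
    "big data is big",
    "data is stored in chunks",
    "chunks are processed in parallel",
]
totals = word_count(chunks)
print(totals)
```

Each chunk is counted on its own, so the three sub-tasks never need to coordinate until the final merge, which is what lets the work be spread across machines.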
Big Data In 5 Minutes | What Is Big Data?| Big Data Analytics | Big Data Tutorial | Simplilearn Transcription
00:00 - 00:30 We all use smartphones, but have you ever wondered how much data it generates in the form of texts, phone calls, emails, photos, videos, searches, and music? Approximately 40 exabytes of data gets generated every month by a single smartphone user. Now imagine this number multiplied by 5 billion smartphone users. That's a lot for our mind to even process, isn't it? In fact, this amount of data is quite a lot for traditional computing systems to handle, and this massive amount of data
00:30 - 01:00 is what we term as big data. Let's have a look at the data generated per minute on the internet: 2.1 million snaps are shared on Snapchat, 3.8 million search queries are made on Google, one million people log onto Facebook, 4.5 million videos are watched on YouTube, 188 million emails are sent. That's a lot of data. So how do you classify any data as big data? This is possible with the
01:00 - 01:30 concept of five V's: volume, velocity, variety, veracity, and value. Let us understand this with an example from the healthcare industry. Hospitals and clinics across the world generate massive volumes of data. 2,314 exabytes of data are collected annually in the form of patient records and test results. All this data is generated at a very high speed, which attributes to the velocity of big data.
01:30 - 02:00 Variety refers to the various data types, such as structured, semi-structured, and unstructured data. Examples include Excel records, log files, and X-ray images. Accuracy and trustworthiness of the generated data is termed as veracity. Analyzing all this data will benefit the medical sector by enabling faster disease detection, better treatment, and reduced cost. This is known as the value of big data.
02:00 - 02:30 But how do we store and process this big data? To do this job, we have various frameworks such as Cassandra, Hadoop, and Spark. Let us take Hadoop as an example and see how Hadoop stores and processes big data. Hadoop uses a distributed file system known as Hadoop Distributed File System to store big data. If you have a huge file, your file will be broken down into smaller chunks and stored in various machines. Not only that, when you break
02:30 - 03:00 the file, you also make copies of it, which goes into different nodes. This way you store your big data in a distributed way and make sure that even if one machine fails, your data is safe on another. MapReduce technique is used to process big data. A lengthy task A is broken into smaller tasks B, C, and D. Now instead of one machine, three machines take up each task and complete it in a parallel fashion and assemble
03:00 - 03:30 the results at the end. Thanks to this, the processing becomes easy and fast. This is known as parallel processing. Now that we have stored and processed our big data, we can analyze this data for numerous applications. In games like Halo 3 and Call of Duty, designers analyze user data to understand at which stage most of the users pause, restart, or quit playing. This insight can help them rework on the storyline of the game and improve the
03:30 - 04:00 user experience, which in turn reduces the customer churn rate. Similarly, big data also helped with disaster management during Hurricane Sandy in 2012. It was used to gain a better understanding of the storm's effect on the East Coast of the U.S., and necessary measures were taken. It could predict the hurricane's landfall five days in advance, which wasn't possible earlier. These are some of the clear indications of how valuable big data can be once it is accurately processed and analyzed.
04:00 - 04:30 So here's a question for you: which of the following statements is not correct about Hadoop Distributed File System (HDFS)? A) HDFS is the storage layer of Hadoop. B) Data gets stored in a distributed manner in HDFS. C) HDFS performs parallel processing of data. D) Smaller chunks of data are stored on multiple data nodes in HDFS. Give it a thought and leave your answers
04:30 - 05:00 in the comment section below. Three lucky winners will receive Amazon gift vouchers. Now that you have learned what big data is, what do you think will be the most significant impact of big data in the future? Let us know in the comments below. If you enjoyed this video, it would only take a few seconds to like and share it. Also, to subscribe to our channel if you haven't yet, and hit the bell icon to get instant notifications about our new content. Stay tuned and keep learning.