Data Scientist | AI Practitioner | Software Developer. Giving talks, teaching, writing.
We live in a world where billions of data points are generated every single day by sources such as banks, telecommunication companies, industry, tourism, the agriculture sector, educational institutions (primary and secondary schools, colleges, and universities), and mobile devices. Any organization can start using its data to make data-driven decisions that are effective and that support its mission and vision.
Regardless of the size of the business you’re running, you need valuable data to provide you with business insights. These insights help you understand your target audience and their preferences, so your business can anticipate their needs. You can also use insights from big data to outperform your competition by capturing big data and innovating with it.
Companies like Google and Alibaba use big data to discover flaws in their services and products, to learn about their suppliers and buyers, and to understand consumer intent and preferences so they can create newer, better products and services.
“There were 5 Exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days.” Eric Schmidt
The world's data is doubling every 1.2 years. Here in Tanzania, there are more than 60.9 million people, more than 43.46 million mobile subscribers (72% penetration), more than 23 million internet users (38% penetration), and 4.40 million social media users. Each day we send an average of 27.6 million texts. We are not consuming data but creating it: we are Data Agents. 80% of this new data is unstructured; it is too large, too complex, and too disorganized to be analyzed by traditional tools.
“Everything we do generate data” — Dr. Michael Rappa
The question is: how can we benefit from this huge amount of data generated every day?
In this article, you will learn what big data is, what big data analytics is, and how the life cycle of big data analytics works.
Big Data is the massive amount of data that cannot be stored, processed, and analyzed using traditional tools due to its volume, velocity, value, and veracity. Big Data sets are generally huge (tens of terabytes) and sometimes cross the threshold of petabytes.
Many big companies around the world like Alibaba, Amazon, Netflix, General Electric, and others use big data for decision making. There’s no doubt that Big Data will continue to play an important role in many different industries around the world. In order to reap more benefits from Big Data, it’s important for any company to train its employees about Big Data management. With proper management of Big Data, your business will be more productive and efficient.
The importance of Big Data does not revolve around how much data a company has, but around how the company uses the data it collects. Every company uses data in its own way; the more efficiently a company uses its data, the more potential it has for business growth. For example, telecommunication companies in Tanzania can use data generated by mobile subscribers and internet users to provide better services to their customers and gain a competitive advantage.
But Big Data in its raw form has no value to us. To benefit from it, we must derive meaningful information from it by using Big Data analytics.
Big Data analytics is the process of extracting useful information by analyzing different types of big data sets. Big Data analytics is used to discover hidden patterns, market trends, consumer preferences, and unknown correlations for the benefit of organizational decision-making.
Big Data analytics is more than technology: it is a new way of thinking. It helps organizations make data-driven decisions and increase their operational efficiency. It helps companies better understand their customers' experience, supports product development and innovation, uncovers hidden opportunities, and can even help our government better serve citizens.
In order to provide a framework to organize the work needed by an organization and deliver clear insights from Big Data, it’s useful to think of it as a cycle with different stages. These stages will help you reach your objectives. The upcoming sections explore a specific data analytics lifecycle that organizes and manages the tasks and activities associated with the analysis of Big Data.
(a) Business case evaluation
The first and most important stage is to have a business case that defines the reasons and purpose behind the analysis. The business case must be identified, created, assessed, and approved. We need to understand why we are analyzing the data so that we know how to do it and which parameters have to be looked into. The business case helps determine assessment criteria and guidance for the evaluation of the analytic results.
(b) Identification of data
In this stage, a broad variety of data sources is identified, and all the data required for the analysis is gathered. Identifying a wider variety of data sources may increase the probability of finding hidden patterns and correlations. The required datasets and their sources can be internal and/or external to the enterprise, depending on the scope of the analysis.
(c) Data filtering
All the data identified in the previous stage is gathered and filtered to remove corrupt data and data that has no value for the analysis objectives. Keep in mind that not all data collected during the previous stage will contain meaningful information.
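The filtering step can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed implementation: the record layout, the field names (`subscriber_id`, `date`, `data_mb`), and the validity rules are all assumptions made up for the example.

```python
# Hypothetical raw records, e.g. rows collected from a subscriber usage log.
raw_records = [
    {"subscriber_id": "A1", "date": "2021-03-01", "data_mb": "120.5"},
    {"subscriber_id": "", "date": "2021-03-01", "data_mb": "88.0"},    # missing ID
    {"subscriber_id": "A2", "date": "2021-03-01", "data_mb": "n/a"},   # corrupt value
    {"subscriber_id": "A3", "date": "2021-03-02", "data_mb": "-5"},    # impossible value
    {"subscriber_id": "A4", "date": "2021-03-02", "data_mb": "64"},
]

# Assumed rule: every record must carry all three fields.
REQUIRED_FIELDS = ("subscriber_id", "date", "data_mb")


def is_valid(record):
    """Return True when a record is complete and its usage is a non-negative number."""
    if any(not record.get(field) for field in REQUIRED_FIELDS):
        return False
    try:
        usage = float(record["data_mb"])
    except ValueError:
        return False
    return usage >= 0


def filter_records(records):
    """Keep only the records that pass the validity check."""
    return [record for record in records if is_valid(record)]


clean_records = filter_records(raw_records)
print(len(clean_records), "of", len(raw_records), "records kept")
```

In a real project the validity rules would come from the business case defined in the first stage, since that is what decides which data has value for the analysis.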
(d) Data extraction
Some of the meaningful data identified from the third stage can be incompatible with the analysis tools. The data that is not compatible with the tool is extracted and then transformed into a form that is compatible with the analysis tools you plan to use.
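As a rough sketch of extraction, suppose the analysis tool only reads flat CSV tables while the source system exports nested JSON. The JSON structure, the field names, and the choice of CSV as the target format are assumptions for illustration only.

```python
import csv
import io
import json

# Hypothetical nested JSON export that a table-based analysis tool cannot read directly.
raw_json = """
[
  {"id": "A1", "profile": {"region": "Dar es Salaam"}, "usage": {"sms": 12, "data_mb": 120.5}},
  {"id": "A4", "profile": {"region": "Arusha"}, "usage": {"sms": 3, "data_mb": 64.0}}
]
"""

FIELDNAMES = ["id", "region", "sms", "data_mb"]


def flatten(record):
    """Pull the nested fields of one record up into a single flat dictionary."""
    return {
        "id": record["id"],
        "region": record["profile"]["region"],
        "sms": record["usage"]["sms"],
        "data_mb": record["usage"]["data_mb"],
    }


def to_csv(records):
    """Serialize flat dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


flat_records = [flatten(record) for record in json.loads(raw_json)]
csv_text = to_csv(flat_records)
print(csv_text)
```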
(e) Data aggregation
In this stage, data with the same fields across different datasets is integrated to arrive at a unified view. Examples of such shared fields are Date and ID.
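To make the idea of integrating datasets on shared fields concrete, here is a small sketch that joins two datasets on ID and date. The datasets and the helper `merge_on_keys` are hypothetical, and the sketch assumes each (ID, date) pair appears at most once in the second dataset.

```python
# Two hypothetical datasets that share the "id" and "date" fields.
sms_usage = [
    {"id": "A1", "date": "2021-03-01", "sms": 12},
    {"id": "A4", "date": "2021-03-02", "sms": 3},
]
data_usage = [
    {"id": "A1", "date": "2021-03-01", "data_mb": 120.5},
    {"id": "A4", "date": "2021-03-02", "data_mb": 64.0},
    {"id": "A7", "date": "2021-03-02", "data_mb": 10.0},  # no SMS record for this key
]


def merge_on_keys(left, right, keys=("id", "date")):
    """Inner-join two lists of dictionaries on the given shared fields."""
    # Index the right-hand rows by their key tuple for constant-time lookup.
    index = {tuple(row[k] for k in keys): row for row in right}
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            merged.append({**row, **match})
    return merged


unified = merge_on_keys(sms_usage, data_usage)
for row in unified:
    print(row)
```

Rows without a partner in the other dataset (subscriber `A7` here) are dropped, which is the behavior of an inner join. Whether unmatched rows should be kept instead depends on the analysis objectives.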
(f) Data analysis
This is a very important stage in the life cycle of big data analytics. Data analysis is the process of evaluating data using analytical and statistical tools to discover patterns or correlations. This stage can be iterative, meaning the analysis can be repeated until the desired results are uncovered.
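One simple statistical tool for discovering a correlation is the Pearson correlation coefficient. The sketch below computes it by hand for two made-up series (daily data usage and monthly spend per subscriber); the numbers are invented for illustration and do not come from any real dataset.

```python
import math

# Hypothetical cleaned, unified observations: one pair of values per subscriber.
data_mb = [120.5, 64.0, 200.0, 15.0, 90.0]
monthly_spend = [9.0, 5.5, 14.0, 2.0, 7.5]


def pearson(xs, ys):
    """Compute the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


r = pearson(data_mb, monthly_spend)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests none. A correlation alone does not show that one variable causes the other, which is one reason this stage is often repeated with different questions.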
(g) Visualization of data
The results of the Data Analysis stage are then communicated graphically using tools like Tableau, PowerBI, and QlikView. The generated graphs help communicate the analysis results so that business users can interpret them effectively and obtain value from the analysis.
(h) Utilization of analysis results
The analytical results obtained are made available to different business stakeholders to support business decision-making. This helps the company make data-driven decisions rather than depending on personal experience.
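Dedicated tools produce far richer graphics, but the basic idea of turning numbers into a picture can be sketched with plain text bars. The regional figures and the helper `text_bar_chart` below are invented for illustration and are not a substitute for the tools named above.

```python
# Hypothetical aggregated result: average daily data usage (MB) per region.
usage_by_region = {"Dar es Salaam": 120.5, "Arusha": 64.0, "Mwanza": 90.0}


def text_bar_chart(values, width=40):
    """Render a dictionary of label -> number as horizontal text bars, largest first."""
    longest_label = max(len(label) for label in values)
    largest = max(values.values())
    lines = []
    for label, value in sorted(values.items(), key=lambda item: item[1], reverse=True):
        # Scale each bar relative to the largest value.
        bar = "#" * round(width * value / largest) if largest > 0 else ""
        lines.append(f"{label.ljust(longest_label)} | {bar} {value}")
    return "\n".join(lines)


chart = text_bar_chart(usage_by_region)
print(chart)
```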
Congratulations 👏👏, you have made it to the end of this article! I hope you have learned something new that will help you in your career.
If you learned something new or enjoyed reading this article, please share it so that others can see it. Until then, see you in the next post! You can also find me on Twitter @Davis_McDavid.
This article was also published on Medium.