Stages of Data Analysis: 1

Let’s understand the stages of data analysis in this topic and then dive deeper into the first 2.

Data analysis

The steps

Data analysis is the act of examining, cleaning, converting, and modeling data in order to uncover relevant information, inform conclusions, and support decision-making.

The whole process can be summarized in 4 or 5 main steps. We shall briefly go through an overview of each step. But first, watch this video on the process of analytics.

To clarify, the process can be divided into 4 or 5 steps depending on which resource you use. Some textbooks break the process up into more stages.

For this course, however, we shall focus on 4 particular stages, and the order will vary slightly. Here is a map that shows the 4 stages.

Step 1: Data Preparation

Data collection + Data preparation

As an agent of LM13, you will need a strategy for collecting the data you need. You will also need to know how to organize it and thereby “prepare” it for the next stage.

This stage is said to be the most crucial stage of the whole process, as it directly affects every outcome.

Data collection

For this course, the data is already available and collected! Data collection is a massive topic in its own right, but to keep things simple and maintain focus on data analysis with Power BI, we shall not go into data collection.

If you are interested in learning more about data collection, here is a link and a video.

  • A step-by-step guide to data collection – click here
  • A video to understand data collection – click here

Data preparation

Let us imagine we have data that we have already collected from different sources. Next, we need to ensure that this data is ready for use in analysis. To do this, we first need to check that the data is:

  • accurate, 
  • relevant and 
  • consistent. 

Data preparation and cleaning is the act of transforming raw data (from many data sources) into a format that can be easily and correctly analyzed. 

Data preparation activities:

Loading the data involves adding it to a common storage location where you can easily retrieve it for use. In this course, we shall be using a tool called Power BI for analysis, so we will need to know how to load data into that tool before we can do anything with the data.
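
The course performs this step inside Power BI. Purely as an illustration of what "loading" means, here is a minimal Python sketch that reads a small CSV export into memory. The column names and values are hypothetical and are not taken from the course datasets.

```python
import csv
import io

# Hypothetical raw export; in practice this would be a file on disk.
RAW_CSV = """order_id,region,amount
1,North,120.50
2,South,80.00
3,North,45.25
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader)


rows = load_rows(RAW_CSV)
print(len(rows), "rows loaded; columns:", list(rows[0].keys()))
```

Note that every value arrives as text at this point; converting amounts to numbers and dates to a single format belongs to the later preparation activities.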

As we shall be handling different datasets from different places and in different formats, we shall also need to bring all that data together, combine it, and make it consistent.
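
To make the idea of combining sources concrete, the sketch below maps two hypothetical sources, which describe the same kind of record under different column names, onto one shared layout. It is written in Python only for illustration; the source names, column names, and values are assumptions.

```python
# Two hypothetical sources that record sales with different column names.
shop_sales = [{"Date": "2023-01-05", "Total": 120.5}]
web_sales = [{"order_date": "2023-01-06", "amount": 80.0}]


def to_common(record, date_key, amount_key, source):
    """Map one source-specific record onto a shared set of column names."""
    return {
        "date": record[date_key],
        "amount": record[amount_key],
        "source": source,
    }


# Bring both sources together into a single, consistently named table.
combined = (
    [to_common(r, "Date", "Total", "shop") for r in shop_sales]
    + [to_common(r, "order_date", "amount", "web") for r in web_sales]
)

for row in combined:
    print(row)
```

Keeping a `source` column is a design choice in this sketch: it lets you trace each row back to where it came from after the data has been merged.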

We need to make sure the data we have is consistent and relevant. This means we can trim what we know we don’t need from the data. We can also format the data consistently, which is especially important for values like dates and times, which can be written in different ways. We will also need to decide what to do with data that is incomplete, as this can affect our accuracy.
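
The three cleaning ideas above (trimming, consistent formatting, and handling incomplete data) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's Power BI method: the rows are made up, only two date formats are assumed, and dropping incomplete rows is just one possible policy.

```python
from datetime import datetime

# Hypothetical raw rows: mixed date formats, a missing amount, an unneeded column.
raw = [
    {"date": "05/01/2023", "amount": "120.50", "notes": "n/a"},
    {"date": "2023-01-06", "amount": "", "notes": "call back"},
    {"date": "2023-01-07", "amount": "45.25", "notes": ""},
]

# Assumed input formats: ISO (year-month-day) and day/month/year.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Return the date as an ISO string, trying each assumed format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date: {text!r}")


def clean(rows):
    """Drop incomplete rows, trim the unused column, and normalise formats."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # one possible policy: skip rows with no amount
        cleaned.append(
            {"date": parse_date(row["date"]), "amount": float(row["amount"])}
        )
    return cleaned


cleaned = clean(raw)
print(cleaned)
```

Skipping incomplete rows is the simplest choice, but it discards information; depending on the analysis, filling in a default value or flagging the row for review may be more appropriate.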

Here is where the fun begins: we can actually create models using the data or run calculations on it. You can think of it as using the existing data to generate new data, which is usually more high-level.
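
As a small example of generating new, higher-level data from existing data, the Python sketch below totals individual sales by region. The rows and the `total_by_region` helper are hypothetical and serve only to illustrate the idea of a calculation over prepared data.

```python
from collections import defaultdict

# Hypothetical prepared rows, one per sale.
sales = [
    {"region": "North", "amount": 120.5},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 45.25},
]


def total_by_region(rows):
    """Sum the amounts per region, producing one summary value per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


totals = total_by_region(sales)
print(totals)
```

The result has one entry per region rather than one per sale, which is what "more high-level" means here: the detail is summarised into a form that is easier to reason about.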

Here, we look at how we can further format the data for our use, guided by one question: what are we expecting to get from the data?

So now you know the first stage: Data Collection and Preparation.

Next up, let’s learn about the next 3 stages of data analysis.