Data analysis is the act of examining, cleaning, converting, and modeling data in order to uncover relevant information, inform conclusions, and support decision-making.
The whole process can be summarized in four or five main steps. We shall briefly go through an overview of each step. But first, watch this video on the process of analytics.
To clarify, the process can be divided into four or five steps depending on which resource you use. Some textbooks break the process up into even more stages.
For this course, however, we shall focus on four particular stages, and the order will vary slightly. Here is a map that shows the four stages.
As an agent of LM13, you will need a strategy for collecting the data you need. You will also need to know how to organize it and therefore “prepare” it for the next stage.
This stage is said to be the most crucial stage of the whole process, as any outcome is directly affected by it.
For this course, the data is already available and collected! Data collection is a massive topic all by itself, but to keep things simple and maintain focus on data analysis with Power BI, we shall not go into data collection.
If you are interested in learning more about data collection, here is a link and a video.
Let us imagine we have data that we have already collected from different sources. Next – we need to ensure the data we have is ready for use in analysis. To do this, we first need to check that the data we have is
Data preparation and cleaning is the act of transforming raw data (from many data sources) into a format that can be easily and correctly analyzed.
Each data preparation activity is described below:
This involves adding the data to a common storage location where you can easily retrieve it for use. In this course, we shall be using a tool called Power BI for analysis, and therefore we will need to know how to load data into that tool before we can do anything with the data.
As we shall be handling different datasets from different places and in different formats, we shall also need to bring all that data together to combine it and make it consistent.
We need to make sure the data we have is consistent and relevant. This means we can trim what we know we don’t need from the data. We can also format the data consistently, which is especially important for values like dates and times, since these can be written in different ways. We will also need to consider what to do with data that is incomplete, as this can affect our accuracy.
Here is where the fun begins: we can actually create models using the data, or run calculations on it. You can think of it as using the existing data to generate new data, which is usually more high-level.
Here, we look at how we can further format the data for our use. What are we expecting to get from the data?
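The preparation activities described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the course itself uses Power BI, and the two sample sources, their field names, the date formats, and the helper names parse_date and prepare are all invented for this example.

```python
from datetime import datetime

# Hypothetical raw data from two sources (invented for illustration).
shop_sales = [
    {"date": "2022-09-16", "amount": "120.50", "notes": "walk-in"},
    {"date": "2022-09-17", "amount": "", "notes": "till offline"},
]
online_sales = [
    {"date": "18/09/2022", "amount": "80.00", "notes": "promo code"},
]


def parse_date(text):
    """Try each known date format and return a date object."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {text!r}")


def prepare(*sources):
    """Combine the sources, trim unused fields, and make values consistent."""
    cleaned = []
    for source in sources:
        for row in source:
            if row["amount"] == "":
                continue  # one possible choice: skip incomplete rows
            cleaned.append({
                "date": parse_date(row["date"]),
                "amount": float(row["amount"]),
                # "notes" is left out because this analysis does not need it
            })
    return cleaned


rows = prepare(shop_sales, online_sales)
total = sum(row["amount"] for row in rows)  # new, higher-level data
print(len(rows), total)
```

Skipping incomplete rows is only one option. Filling in a default value or estimating the missing amount are others, and the right choice depends on the analysis.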
So now you know the first stage: data collection and preparation.
Next up, let’s learn about the other three stages of data analysis.
Whitespace – empty space (such as spaces, tabs, or line breaks) that makes code look neat and organized. The computer often ignores it, although in Python the indentation at the start of a line does matter.
User – any person who interacts with a program (by giving it inputs) without having to write the code directly. For example, you are the user of the code that makes your browser or website run. When you clicked on this pop-up, that click was the input the code detected, which made this explanation appear.
To book your spot – click here
All learners who are aspiring web developers will have an opportunity to build a website for a live NGO or charity client as part of their community service hours. This project will be run jointly with Community Hours – so all your time spent counts towards your LO credits. This event is suitable for learners, parents and their teachers.
TechWays will be providing the WordPress course and web dev resources for free to any learner wanting to participate.
Besides the amazing community service you’ll be doing for a charity in need – you’ll also be building your portfolio of web dev skills. Who knows – web dev could become a side hustle for extra income?
Book your spot HERE
Indentation – In the written form of many languages, an indentation or indent is an empty space at the beginning of a line to signal the start of a new paragraph.
Text editor – the part of the IDE where you write the code. Most text editors highlight words with different roles, such as functions, to help you tell them apart.
Homogeneous – of the same kind; alike throughout.
Heterogeneous – diverse in character or content; containing different things.
Prompt – to cause or bring about; to make something happen. For example, making someone say or write something.
Troubleshooting is a form of problem solving, often applied to repair failed products or processes on a machine or a system. It is a logical, systematic search for the source of a problem in order to solve it, and make the product or process operational again.
There are a lot of string functions/methods in Python. Find the full list in the course manual. Here are some that you will find useful in this course:
Functions
Methods
There are a number of special string characters that have different functions when used inside quotation marks (" "). Here are some useful and common ones:
In programming, concatenation is the process of appending one string to another.
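As a minimal sketch of concatenation in Python, the + operator joins strings end to end. The variable names and values here are made up for illustration.

```python
first_name = "Ada"
last_name = "Lovelace"

# The " " in the middle adds the space between the two names.
full_name = first_name + " " + last_name
print(full_name)
```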
\ – the escape character is a string character that tells Python to treat the character after it differently from normal. For example, \" puts a literal quotation mark inside a string instead of ending the string, and \n starts a new line.
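A short sketch of the escape character at work. The example strings, including the file path, are invented for illustration.

```python
# \" keeps the quotation mark as part of the text instead of ending the string.
quote = "She said \"hello\" to the class."

# \n is read as a single new-line character.
two_lines = "First line\nSecond line"

# \\ produces one literal backslash.
path = "C:\\data\\sales.csv"

print(quote)
print(two_lines)
print(path)
```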
str() is a built-in function that converts a value (numbers especially) into text, that is, into a string.
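One common reason to use str() is that Python will not join a number directly onto a string with +. A minimal sketch, with made-up values:

```python
age = 15

# "I am " + age would raise a TypeError, so the number is converted first.
message = "I am " + str(age) + " years old."
print(message)
```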
Mad Libs is a phrasal template word game created by Leonard Stern and Roger Price. It consists of one player prompting others for a list of words to substitute for blanks in a story before reading aloud.
type() is a built-in function (built-in functions are covered later) that reports the data type of any value passed to it.
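A small sketch of type() applied to a few sample values. The values are chosen only for illustration.

```python
samples = [42, 3.14, "hello", True]

# type(value).__name__ gives the short name of each value's data type.
type_names = [type(value).__name__ for value in samples]

for value, name in zip(samples, type_names):
    print(value, "is of type", name)
```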
input() is a built-in function (built-in functions are covered later) that allows a user to enter information into a program.
print() is a built-in function (built-in functions are covered later) that displays the data inside the brackets. The results are printed out on the console/results section.
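The sketch below shows input() and print() working together. The greet function and its ask parameter are hypothetical helpers introduced only so the example can run without anyone typing at the keyboard. In an ordinary program you would simply call input() directly.

```python
def greet(ask=input):
    """Ask for a name and return a greeting.

    `ask` defaults to the built-in input(), which waits for the user to type.
    """
    name = ask("What is your name? ")
    return "Hello, " + name + "!"


# In a real program you would write: print(greet())
# Here a stand-in supplies the answer instead of the keyboard.
print(greet(ask=lambda prompt: "Thandi"))
```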
Integrated Development Environment (IDE) – a digital environment used to develop games, software, and hardware, bringing together tools that range from debugging to compiling. Essentially, it is where you write, edit, and run your code to test it.
#WOW – What Outstanding Work – Awards: join us to learn from our students.
Our top 20 learners are from St Andrews for Girls, Reddam Umhlanga, Evolve Online, Nova Pioneer and Sutherland High.
Learners will be presenting their final projects. Come celebrate their successes and lessons learnt with us at our TechWays #WOW Awards.
This event is suitable for learners, parents and their teachers. Book your spot HERE
Calling on all high schoolers interested in tech as a career. Join us on Thursday 22 September at 5:30pm.
We will be sharing:
There are only 100 spaces, so book your spot now. Please RSVP here.
To access the recording – click here
Calling on all high schoolers interested in tech as a career to join us on 16 September at 5:30pm. If you missed it, we’ll host another one on 18 November.
We covered the following:
We will be talking to Noelene Kinsley from GC Network. Noelene has specialized in the exciting career of genetic counseling and wants to share her passion for making the world a healthier place using genetics and data science technology.
Let’s hear more about the trends in the health/genetics industries, where jobs are moving to and what kind of skills you’ll need in this exciting world of opportunities out there.
This event is suitable for learners, parents and their teachers. Book your spot HERE
We will be talking to Jason Suttie from Devson. Jason has been in the tech world since he was six years old. He headed the IT innovation unit at RMB and has since left to start his own software consulting company, solving problems and building solutions for clients around the world.
Let’s hear more about the trends in the software and programming industries, where jobs are moving to and what kind of skills you’ll need in this exciting world of opportunities out there.
Introduces Linux as an operating system, along with basic open source concepts and an understanding of the Linux commands. Linux is crucial for cybersecurity.
Gives you the baseline skills you need to secure a company’s systems, software and hardware. This certificate gives practical hands-on skills to pursue a career in cybersecurity.
Will give you skills in Information Security Threats and Attack Vectors, Attack Detection, Attack Prevention, Procedures, Methodologies and more.