Day 5 of this challenge series comprises four challenges on the Poisson and normal distributions. In this blog, we will cover the normal distribution.
To ensure a solid foundation, let’s begin by clarifying the fundamental concept before delving into the challenges.
Normal Distribution
- The normal distribution, also known as the Gaussian distribution or bell curve, is a continuous probability distribution that is symmetric and characterized by its mean and standard deviation.
- Imagine you have a group of students and you’re measuring their heights. You plot the heights on a graph, with the shortest students on the left and the tallest students on the right.
- Now, let’s say most of the students have an average height, and as you move towards the shorter or taller heights, there are fewer and fewer students. In a normal distribution, the majority of the students’ heights will be clustered around the middle, with fewer students being either very tall or very short.
- A normal distribution is symmetrical, which means that if you were to fold the graph in half, the left side would look very similar to the right side. This symmetry tells us that there is an equal likelihood of finding a student with a height slightly below the average as there is of finding a student with a height slightly above the average.
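To make the clustering and symmetry concrete, here is a minimal sketch that evaluates the normal density directly. The class mean of 165 cm and standard deviation of 8 cm are made-up numbers for illustration, and `normal_pdf` is a helper name introduced here, not something from the challenge.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical class: mean height 165 cm, standard deviation 8 cm (illustrative values).
mu, sigma = 165.0, 8.0

# Symmetry: heights equally far below and above the mean have the same density.
below = normal_pdf(mu - 5, mu, sigma)
above = normal_pdf(mu + 5, mu, sigma)
print(math.isclose(below, above))  # True

# Clustering: the density at the mean is higher than the density far from it.
print(normal_pdf(mu, mu, sigma) > normal_pdf(mu + 20, mu, sigma))  # True
```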
Standard Normal Distribution
- If the mean is 0 and the variance is 1, the normal distribution is known as the standard normal distribution.
💡 Cumulative probability is described by the cumulative distribution function (CDF); its sample-based counterpart is the cumulative frequency distribution.
- Imagine you are waiting for a friend to meet you at a coffee shop, and you know they usually arrive around 2:00 PM. However, they might be a bit early or late sometimes. You want to figure out the likelihood of them arriving at or before a certain time.
- Cumulative Probability is like calculating the chance of your friend arriving by a specific time. For example, let’s say you want to know the probability of them arriving before 2:15 PM. You can look at the past data or observations and count how many times they arrived before 2:15 PM in those instances. If they arrived before that time in 80% of the past observations, then the cumulative probability of them arriving before 2:15 PM is 80%.
- In essence, cumulative probability is the likelihood that a value falls at or below a given point. Subtracting one cumulative probability from another gives the likelihood of landing within a specific range.
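The counting idea from the coffee shop example can be sketched in a few lines. The arrival times below are invented for illustration (minutes after 2:00 PM), and `empirical_cumulative_probability` is a hypothetical helper name.

```python
# Hypothetical arrival times, in minutes after 2:00 PM (negative means early).
arrivals = [-5, 0, 3, 8, 10, 12, 14, 14, 20, 25]

def empirical_cumulative_probability(observations, cutoff):
    """Fraction of observations that fall at or before the cutoff."""
    return sum(1 for t in observations if t <= cutoff) / len(observations)

# Share of past visits where the friend arrived by 2:15 PM.
print(empirical_cumulative_probability(arrivals, 15))  # 0.8
```

Eight of the ten made-up observations fall at or before the 15-minute mark, which gives the 80% figure used in the example.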
For a random variable X, the cumulative distribution function (CDF) is denoted as F(x) and is defined as follows:
F(x) = P(X ≤ x)
For a normal distribution with mean μ and standard deviation σ, the CDF is given by the formula:
F(x) = Φ((x - μ) / σ)
Where:
- Φ is the cumulative distribution function of the standard normal distribution (mean = 0 and standard deviation = 1). The standard normal distribution itself is commonly denoted as N(0, 1).
- x is the value of the random variable for which we want to find the probability.
- μ is the mean of the normal distribution.
- σ is the standard deviation of the normal distribution.
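As a rough illustration of the formula, the sketch below standardizes x and feeds the result to the standard normal CDF, then compares it with evaluating the CDF of the original distribution directly. It assumes Python 3.8+ for `statistics.NormalDist`; the values μ = 100, σ = 15, x = 115 and the helper name `normal_cdf` are chosen here for illustration only.

```python
from statistics import NormalDist

def normal_cdf(x, mu, sigma):
    """F(x) for a normal distribution, computed as Phi((x - mu) / sigma)."""
    z = (x - mu) / sigma          # standardize x
    return NormalDist().cdf(z)    # NormalDist() defaults to mean 0, std dev 1

# Illustrative values (not from the challenge).
mu, sigma, x = 100.0, 15.0, 115.0

via_standardizing = normal_cdf(x, mu, sigma)
direct = NormalDist(mu, sigma).cdf(x)
print(round(via_standardizing, 4), round(direct, 4))  # 0.8413 0.8413
```

Both routes agree because x = 115 sits exactly one standard deviation above the mean, so each evaluates Φ(1).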
Challenge #1 — Normal Distribution I
Question —
Answer — 0.401, 0.341
Explanation —
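Here, we use F(x) = Φ((x - μ) / σ), with Φ computed from the error function as Φ(z) = 0.5 * (1 + erf(z / √2)). With mean 20 and standard deviation 2, the first answer is F(19.5) and the second is F(22) - F(20).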
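import math as m
def cdf(mn, std, x):
    # Normal CDF via the error function: 0.5 * (1 + erf((x - mean) / (std * sqrt(2))))
    return 0.5 * (1 + m.erf((x - mn) / (std * m.sqrt(2))))
print(f'{cdf(20, 2, 19.5):.3f}')  # P(X <= 19.5)
print(f'{cdf(20, 2, 22) - cdf(20, 2, 20):.3f}')  # P(20 <= X <= 22)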
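One way to sanity-check the two probabilities above is a quick simulation: draw many samples from a normal distribution with mean 20 and standard deviation 2, then count how often each event occurs. This is only a sketch; the sample size and the seed are arbitrary choices, and the estimates will be close to 0.401 and 0.341 rather than exactly equal to them.

```python
import random

random.seed(0)   # fixed seed so the run is repeatable (arbitrary choice)
N = 200_000      # number of simulated draws (arbitrary choice)

# Draw samples from a normal distribution with mean 20 and standard deviation 2.
samples = [random.gauss(20, 2) for _ in range(N)]

# Estimate each probability as the fraction of samples where the event occurs.
p_below = sum(s < 19.5 for s in samples) / N
p_between = sum(20 <= s <= 22 for s in samples) / N

print(f'{p_below:.3f} {p_between:.3f}')
```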
Challenge #2 — Normal Distribution II
Question —
Answer — 15.87, 84.13, 15.87
Code —
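import math as m
def cdf(mn, std, x):
    # Normal CDF via the error function: 0.5 * (1 + erf((x - mean) / (std * sqrt(2))))
    return 0.5 * (1 + m.erf((x - mn) / (std * m.sqrt(2))))
print(f'{(1 - cdf(70, 10, 80)) * 100:.2f}')  # P(X > 80), as a percentage
print(f'{(1 - cdf(70, 10, 60)) * 100:.2f}')  # P(X >= 60), as a percentage
print(f'{cdf(70, 10, 60) * 100:.2f}')  # P(X < 60), as a percentage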
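The three printed values are related through symmetry and complements, which the following sketch makes explicit. It uses `statistics.NormalDist` (Python 3.8+) instead of the hand-written `cdf`, purely as a cross-check; the variable names are introduced here for illustration.

```python
import math
from statistics import NormalDist

dist = NormalDist(mu=70, sigma=10)

above_80 = 1 - dist.cdf(80)     # P(X > 80)
at_least_60 = 1 - dist.cdf(60)  # P(X >= 60)
below_60 = dist.cdf(60)         # P(X < 60)

# Symmetry: 80 and 60 are both one standard deviation from the mean of 70,
# so the upper tail beyond 80 matches the lower tail below 60.
print(math.isclose(above_80, below_60))  # True

# Complement: being at least 60 and being below 60 cover every outcome.
print(math.isclose(at_least_60 + below_60, 1.0))  # True

print(f'{above_80 * 100:.2f} {at_least_60 * 100:.2f} {below_60 * 100:.2f}')  # 15.87 84.13 15.87
```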
The blog journey continues! Follow me for more captivating content in the upcoming blogs.