Day 4 (Part 1) — 10 days of Statistics — (HackerRank)

Jun 17, 2023

Day 4 of this challenge series consists of four challenges covering the binomial distribution and the geometric distribution. This post covers the binomial distribution; the next one will cover the geometric distribution.

To build a solid foundation, let’s go through the fundamental concepts before tackling the challenges.

Binomial Distribution

Random variable

A random variable is a variable in statistics and probability theory that takes on different numerical values based on the outcome of a random event or experiment. It represents the uncertain or random aspect of a particular phenomenon.
Tossing a coin: let X be a random variable that represents the outcome of a single coin toss, with X = 1 for “Heads” (H) and X = 0 for “Tails” (T). Because X can take only two values, it is a discrete random variable.

Random variable Types

Discrete: can take only a finite or countably infinite set of values, e.g. the number of heads in 10 coin tosses.
Continuous: can take any value within a specified range, e.g. a person's height.

Bernoulli Random Variable

A Bernoulli random variable represents a single trial or experiment with two possible outcomes: success or failure.
It is a discrete random variable that can take on only two values: 1 (representing success) or 0 (representing failure).
It is characterized by a single parameter, p, which is the probability of success in a single trial.
The probability mass function (PMF) of a Bernoulli random variable is given by P(X = k) = p^k * (1-p)^(1-k), where k is the outcome (0 or 1).
The mean of a Bernoulli random variable is E(X) = p, and the variance is Var(X) = p * (1-p).
Bernoulli random variables are often used to model a single binary outcome, such as flipping a coin or the success/failure of a single event.
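
As a quick check of the formulas above, here is a minimal sketch that evaluates the Bernoulli PMF and recomputes the mean and variance from the definition. The helper name bernoulli_pmf and the value p = 0.3 are illustrative assumptions, not part of the challenge.

```python
def bernoulli_pmf(k, p):
    """P(X = k) for a Bernoulli(p) variable; k must be 0 or 1."""
    if k not in (0, 1):
        raise ValueError("k must be 0 or 1")
    return p**k * (1 - p)**(1 - k)

p = 0.3  # illustrative success probability (assumption)

# Mean and variance computed directly from the PMF
mean = sum(k * bernoulli_pmf(k, p) for k in (0, 1))
variance = sum((k - mean)**2 * bernoulli_pmf(k, p) for k in (0, 1))

print(f"P(X=1)={bernoulli_pmf(1, p):.2f}  P(X=0)={bernoulli_pmf(0, p):.2f}")
print(f"mean={mean:.2f} (p)  variance={variance:.2f} (p * (1-p))")
```

The computed mean and variance should agree with p and p * (1-p) up to floating-point rounding.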

Binomial Distribution

The binomial distribution models the number of successes in a fixed number of independent Bernoulli trials.
It is a discrete probability distribution that describes the probabilities associated with different numbers of successes.
It is characterized by two parameters: n (the number of trials) and p (the probability of success in each trial).
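The probability mass function (PMF) of the binomial distribution is given by P(X = k) = C(n, k) * p^k * (1-p)^(n-k), where X is the random variable representing the number of successes, k is a value from 0 to n, and C(n, k) = n! / (k! * (n-k)!) is the number of ways to choose which k of the n trials are successes.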
The mean of the binomial distribution is E(X) = n * p, and the variance is Var(X) = n * p * (1-p).
The binomial distribution is commonly used to model situations where there are a fixed number of independent trials with a binary outcome, such as the number of successes in multiple coin flips or the number of defective items in a sample.
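
To make the PMF, mean, and variance concrete, here is a small sketch for 10 fair coin flips. The function name binomial_pmf and the parameters n = 10, p = 0.5 are assumptions chosen for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5  # illustrative: 10 fair coin flips (assumption)

# PMF over every possible number of successes 0..n
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(probs)                                   # should be 1
mean = sum(k * pk for k, pk in enumerate(probs))     # should be n * p
variance = sum((k - mean)**2 * pk for k, pk in enumerate(probs))  # n * p * (1-p)

print(f"total={total:.3f} mean={mean:.3f} variance={variance:.3f}")
# prints: total=1.000 mean=5.000 variance=2.500
```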

PMF

The probability mass function is a function that gives the probability of a discrete random variable taking on a specific value.

Cumulative distribution function

The cumulative distribution function (CDF), F(x) = P(X ≤ x), gives the probability that a random variable X takes on a value less than or equal to x.
The CDF is often used to calculate percentiles, to find the probability that a value falls within a certain range, and to analyze the behavior of random variables.
For example, if a random variable X follows a normal distribution with mean 0 and standard deviation 1, the CDF gives the probability that X is less than or equal to a specific value; by symmetry, F(0) = 0.5.
Properties
The CDF is a non-decreasing function.
For any value x, 0 ≤ F(x) ≤ 1.
As x approaches negative infinity, F(x) approaches 0.
As x approaches positive infinity, F(x) approaches 1.
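
For a discrete distribution, the CDF is just a running sum of the PMF. The sketch below builds it for a binomial variable and checks the properties listed above; the parameters n = 4, p = 0.5 are illustrative assumptions.

```python
from itertools import accumulate
from math import comb

n, p = 4, 0.5  # illustrative parameters (assumption)

# PMF for k = 0..n, then the running total gives cdf[k] = P(X <= k)
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
cdf = list(accumulate(pmf))

print(cdf)  # [0.0625, 0.3125, 0.6875, 0.9375, 1.0]

# Property checks: non-decreasing, bounded by 0 and 1, reaches 1 at the largest value
is_non_decreasing = all(a <= b for a, b in zip(cdf, cdf[1:]))
is_bounded = all(0 <= value <= 1 for value in cdf)
print(is_non_decreasing, is_bounded, cdf[-1] == 1.0)
```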

Challenge #1 — Binomial Distribution I

Question —

Answer —

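Simply plug the given values into the binomial PMF: with n = 6 trials and success probability p = 1.09 / (1.09 + 1), sum P(X = k) over k = 3, 4, 5, 6 to get P(X ≥ 3).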

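from math import comb

# b : g is the ratio given in the problem (1.09 : 1); n is the number of trials
n, b, g = 6, 1.09, 1
p, q = b / (b + g), g / (b + g)  # probability of success and of failure
# P(X >= 3): sum the PMF over k = 3..6 (range excludes its end value, 7)
print(f'{sum(comb(n, k) * p**k * q**(n - k) for k in range(3, 7)):.3f}')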
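
One way to sanity-check an answer like this is a quick Monte Carlo simulation. The sketch below is only an illustration, not part of the expected solution: the seed and the number of trials are arbitrary assumptions, and the simulated value should land close to the exact one rather than match it digit for digit.

```python
import random
from math import comb

n, p = 6, 1.09 / (1.09 + 1)  # same parameters as the challenge

# Exact P(X >= 3) from the binomial PMF
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3, n + 1))

random.seed(0)     # arbitrary seed, used only for repeatability (assumption)
trials = 100_000   # arbitrary number of simulated experiments (assumption)
hits = 0
for _ in range(trials):
    # One experiment: n independent Bernoulli trials, count the successes
    successes = sum(random.random() < p for _ in range(n))
    if successes >= 3:
        hits += 1

estimate = hits / trials
print(f"exact={exact:.3f} simulated={estimate:.3f}")
```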

Challenge #2 — Binomial distribution II

Question —

Answer —

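from math import comb

n, p, q = 10, 0.12, 0.88  # number of trials, P(reject), P(no reject)

def pmf(k):
    return comb(n, k) * p**k * q**(n - k)  # binomial P(X = k)

print(f'{sum(pmf(k) for k in range(0, 3)):.3f}')   # P(X <= 2): k = 0, 1, 2
print(f'{sum(pmf(k) for k in range(2, 11)):.3f}')  # P(X >= 2): k = 2..10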

The first print call computes the probability of having no more than 2 rejects, P(X ≤ 2). The formula stays the same; only the range changes. Because Python's range(start, stop) excludes stop, range(0, 3) covers k = 0, 1, 2.

The second print call computes the probability of having at least 2 rejects, P(X ≥ 2), by summing over k = 2, ..., 10 with range(2, 11).
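
An "at least" probability can also be computed with the complement rule, P(X ≥ 2) = 1 - P(X ≤ 1), which needs only two PMF terms instead of nine. The sketch below compares both routes with the same n and p as the challenge; the names direct and via_complement are illustrative.

```python
from math import comb

n, p = 10, 0.12  # same parameters as the challenge

def pmf(k):
    """Binomial P(X = k) for the fixed n and p above."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

direct = sum(pmf(k) for k in range(2, n + 1))  # P(X >= 2), summed term by term
via_complement = 1 - (pmf(0) + pmf(1))         # 1 - P(X <= 1)

print(f"{direct:.3f} {via_complement:.3f}")
```

Both values should agree up to floating-point rounding.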

Keep an eye out for my upcoming part 2! Make sure to follow me for the latest updates.

Written by Celestial