Random Variables and Probability Distributions

Harshal Pawar
4 min read · Jun 29, 2021

Hello everyone. This is Harshal Pawar, an aspiring Data Scientist. I am currently building the skills required to become one. In this article, I will put down all that I have learned about Random Variables and Probability Distributions so far.

So let's get started.

Why is Probability Distribution Important?

A probability distribution tells us how likely each possible outcome of a random phenomenon is, and how much those outcomes vary. This lets us estimate the probability of an event and derive useful insights from the phenomenon.

So, for us to dive deep into probability distributions, understanding random variables is a prerequisite.

1. Random Variables

Random Variables in the context of statistics are not the same as those we study in Algebra.

A Random Variable is a function that maps each outcome of a random process to a numeric value.

We can also say that,

A Random Variable is a rule for associating a number with each element in a sample space.

Where a sample space is the set of all possible outcomes of an experiment.

So what does that exactly mean? Let us understand it with a very typical example of tossing a coin.

Here, tossing a coin is a random process. The possible outcomes in this process will be Heads and Tails. So, a random variable, in this case, can be the number of Heads/Tails while tossing a coin.

Note: A Random Variable is usually denoted by a capital letter.

Here,

a) Experiment is tossing a coin.

b) Sample space(S) is {H, T}.

c) X(Random Variable) is the number of heads when we toss a coin.

d) Then, X(H) = 1 and X(T) = 0.
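The mapping in a) to d) can be written as a small Python function. This is only a minimal sketch: the names `sample_space`, `X`, and `values` are chosen here for illustration.

```python
# Sample space of a single coin toss (from the example above).
sample_space = ["H", "T"]

def X(outcome):
    """Random variable: number of heads in a single toss."""
    return 1 if outcome == "H" else 0

# Apply the random variable to every outcome in the sample space.
values = {outcome: X(outcome) for outcome in sample_space}
print(values)  # {'H': 1, 'T': 0}
```

The point is that X is just a rule (a function) from outcomes to numbers. The randomness comes from the experiment, not from X itself.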

Why do we do this?

A simple answer to this would be that if we quantify the outcomes, it becomes easy to apply Maths to it.

Random variables come in two varieties.

i. Discrete Random Variables.

ii. Continuous Random Variables.

Consider the below examples of random variables.

Example A. X= Birth year of a random student in a school.

Can we say that the birth year of someone in the school will be 1995.897?

Absolutely not! This brings us to the conclusion that X, in this case, takes distinct numeric values that can be counted. This type of random variable is called a Discrete Random Variable.

A Discrete Random Variable takes values that are countable.

Example B. Y= The exact weight of any random student in a school.

Can we really tell the “exact” weight of a student? You might think ‘yes we can!’ But remember that the weight you see on the weighing machine is rounded off to a few decimal places, while the actual weight can be any value in a range, with no limit on the number of decimal places. This type of random variable is called a Continuous Random Variable.

A continuous random variable is one which can take any value within an interval, so its possible values are infinite and cannot be counted.

2. Probability Distributions

As we have already gone through the importance of Probability Distributions, let us see what they mean and how random variables are an important part of them.

A) For Discrete Random Variables

Consider a random variable X= number of heads after tossing a coin thrice.

Before we move further, let me introduce the use of x (small x) with random variables. ‘x’ denotes a particular value that the random variable X can take. Here, in this example, x ∈ {0,1,2,3}, as only these can be the possible numbers of heads occurring.

All the possible outcomes after a coin is flipped thrice are, {HHH,HHT,HTT,TTT,TTH,THH,THT,HTH}.

What will be the probability that 0 heads occur?

a) Only one of the eight outcomes (TTT) has 0 heads. We denote this as P(X=0)=1/8=0.125

b) Similarly, the probability of getting exactly 1 head= P(X=1)=3/8=0.375

c) P(X=2)=3/8=0.375

d) P(X=3)=1/8=0.125

These probabilities, taken together, give us the Probability Distribution of that random variable. If we sum up the probabilities of all possible values, the total will be equal to one.
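The numbers above can be reproduced by listing all eight outcomes and counting heads. The sketch below assumes a fair coin, so that every outcome is equally likely. The variable names (`outcomes`, `head_counts`, `distribution`) are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely outcomes of three tosses, e.g. ('H', 'H', 'T').
outcomes = list(product("HT", repeat=3))

# X maps each outcome to its number of heads; count how often each value occurs.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = (number of outcomes with x heads) / (total number of outcomes).
distribution = {
    x: Fraction(count, len(outcomes))
    for x, count in sorted(head_counts.items())
}

for x, p in distribution.items():
    print(f"P(X={x}) = {p} = {float(p)}")

# The probabilities of all possible values add up to one.
print(sum(distribution.values()))  # 1
```

Using `Fraction` instead of floats keeps the results exact, so the sum is exactly 1 rather than something like 0.9999999.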

Figure: Probability Distribution

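In the case of Discrete Random Variables, the function that gives the probability P(X=x) for each x in the range of X is known as the Probability Mass Function (PMF). It is sometimes loosely called the probability distribution function, but it should not be confused with the Probability Density Function (PDF) used for continuous random variables.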

B) For Continuous Random Variables

What will be the probability of X=x in the case of a continuous random variable? Since X can take infinitely many values that cannot be counted, the probability of any single exact value is zero. This is why, unlike discrete random variables, we spread the probability over intervals and always talk about the probability of an interval instead of a single number.

P(a ≤ X ≤ b) = ∫fₓ(x)dx; in the interval of a to b.

Where fₓ(x) is the Probability Density Function (PDF).

To find the probability that X falls in a certain interval, say a to b, we find the area under the curve of fₓ(x) by integrating the PDF over that interval.
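As a rough illustration of "area under the curve", the sketch below integrates a density numerically. The density used here, f(x) = e^(-x) for x ≥ 0, is an assumption picked only because its area is easy to check by hand. It is not from the example above. The helper name `interval_probability` is also made up for this sketch.

```python
import math

def pdf(x):
    """Assumed example density: f(x) = e^(-x) for x >= 0, and 0 otherwise."""
    return math.exp(-x) if x >= 0 else 0.0

def interval_probability(f, a, b, steps=100_000):
    """Approximate P(a <= X <= b) as the area under f using the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

p = interval_probability(pdf, 1.0, 2.0)
exact = math.exp(-1) - math.exp(-2)  # closed-form area for this particular density
print(round(p, 4), round(exact, 4))  # 0.2325 0.2325

# An interval of zero width has zero area, so a single point has probability 0.
print(interval_probability(pdf, 1.5, 1.5))  # 0.0
```

The last line shows why we talk about intervals: for a continuous random variable, the area over a single point is always zero.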

3. Cumulative Distribution Function

The Cumulative Distribution Function of X, evaluated at x is the probability that X will take a value less than or equal to x.

A) For Discrete Random Variables

Consider the above example of tossing a coin thrice.

The probability that the number of heads will be less than or equal to 0 is denoted as Fₓ(0).

Hence,

Fₓ(0) = P(X ≤ 0) = 1/8

Fₓ(1) = P(X ≤ 1) = P(X =0)+P(X=1)=1/8+3/8=4/8

Fₓ(2) = P(X ≤ 2) = P(X =0)+P(X=1)+P(X=2)=1/8+3/8+3/8=7/8

Fₓ(3) = P(X ≤ 3) = P(X =0)+P(X=1)+P(X=2)+P(X=3)=1/8+3/8+3/8+1/8=8/8=1
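The same running totals can be computed in a few lines. This is a minimal sketch that takes the four probabilities of the three-toss example as given (1/8, 3/8, 3/8, 1/8). The names `xs`, `probabilities`, and `cdf` are chosen here for illustration.

```python
from fractions import Fraction
from itertools import accumulate

# Probabilities of X = number of heads in three tosses of a fair coin.
xs = [0, 1, 2, 3]
probabilities = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

# F(x) = P(X <= x) is the running total of the probabilities up to x.
cdf = dict(zip(xs, accumulate(probabilities)))

for x, value in cdf.items():
    print(f"F({x}) = {value}")
```

Note that `Fraction` reduces 4/8 to 1/2, so F(1) prints as 1/2. The last value of a CDF is always 1, because it covers every possible value of X.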

B) For Continuous Random Variables

For Continuous Random Variables, the CDF is defined in the same way.

Fₓ(x) = P(X ≤ x)=∫fₓ(t)dt; in the interval of -∞ to x.
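To make this concrete, here is a small sketch using the standard normal distribution. This is an assumption for illustration only and is not discussed above. Its CDF can be written with the error function `math.erf` from Python's standard library, and the function names below are made up for this example.

```python
import math

def standard_normal_cdf(x):
    """F(x) = P(X <= x) for a standard normal X, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def interval_probability(a, b):
    """P(a <= X <= b) = F(b) - F(a)."""
    return standard_normal_cdf(b) - standard_normal_cdf(a)

print(standard_normal_cdf(0.0))               # 0.5
print(round(interval_probability(-1, 1), 4))  # 0.6827
```

Once the CDF is known, the probability of any interval is just a difference of two CDF values, so no further integration is needed.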

This concludes the blog! I will surely come up with more such topics.
