Probability Distribution
Random Variable
1. Explain random variables (RV).
2. Explain discrete random variable and provide example(s).
3. Explain continuous random variable and provide example(s).
4. What is an expected value?
5. What does the expected value represent, and how should we interpret it?
6. Provide the formula to calculate the expected value of a discrete random variable.
Formula: $E[X] = \sum_i x_i\cdot P(X=x_i)$
- $E[X]$: Expected value of $X$
- $X$: Random variable
- $x_i$: Each possible value that the random variable $X$ can take on
- $P(X=x_i)$: Probability mass function (PMF) that assigns the probability of the random variable $X$ taking on the specific value $x_i$
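As a minimal sketch of the formula above, the snippet below computes $E[X]$ for a fair six-sided die. The die, the `expected_value` helper, and the dictionary representation of the PMF are assumptions made for illustration only.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return E[X] = sum of x_i * P(X = x_i) for a PMF given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Hypothetical example: a fair die, where each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(die_pmf))  # 7/2
```

Exact fractions are used here so that the result (3.5) is not affected by floating-point rounding.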
7. Explain the similarities between expected value and the standard mean calculation $(\frac{1}{n} \cdot \sum_i x_i)$.
Both formulas involve summing up the product of each value and its corresponding weight or frequency.
- In the case of the expected value, the weights are the probabilities assigned by the probability mass function (PMF).
- In the case of the mean, every observation carries the same weight $\frac{1}{n}$, so the weight can be factored out of the sum and applied as a single division by the number of observations $n$.
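The comparison can be checked numerically. The sketch below uses a small made-up data set and shows that the standard mean equals the expected value obtained when each distinct value is weighted by its relative frequency. The data and variable names are illustrative assumptions.

```python
from fractions import Fraction

values = [2, 4, 4, 10]  # hypothetical observations
n = len(values)

# Standard mean: every observation gets the same weight 1/n.
mean = Fraction(sum(values), n)

# Expected value: weight each distinct value by its relative frequency.
pmf = {x: Fraction(values.count(x), n) for x in set(values)}
expected = sum(x * p for x, p in pmf.items())

print(mean, expected)  # 5 5
```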
8. Provide the formula to calculate the variance of a discrete random variable.
Formula: $Var(X) = \sum_i(x_i-E[X])^2\cdot P(X=x_i)$
- $Var(X)$: Variance
- $E[X]$: Expected value of $X$
- $X$: Random variable
- $x_i$: Each possible value that the random variable $X$ can take on
- $P(X=x_i)$: Probability mass function (PMF) that assigns the probability of the random variable $X$ taking on the specific value $x_i$
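A short sketch of the variance formula, again assuming a fair six-sided die as the example. The `variance` helper and the dictionary form of the PMF are hypothetical, not part of the original notes.

```python
from fractions import Fraction

def variance(pmf):
    """Return Var(X) = sum of (x_i - E[X])^2 * P(X = x_i) for a PMF dict."""
    mu = sum(x * p for x, p in pmf.items())  # E[X]
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Hypothetical example: a fair die, where each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

print(variance(die_pmf))  # 35/12
```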
9. What is the mode of a discrete random variable?
Probability Distribution Introduction
1. Explain probability distribution.
2. Provide the 3 key properties of a probability distribution.
- Non-negativity.
- Each probability is between 0 and 1.
- The sum of all the probabilities is 1.
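For a discrete distribution, these properties can be checked mechanically. The helper below is a hypothetical sketch that assumes the distribution is stored as a dictionary of outcome to probability. The tolerance is an arbitrary choice to absorb floating-point error.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Check that every probability lies in [0, 1] and that they sum to 1."""
    probs = list(pmf.values())
    in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

print(is_valid_pmf({"heads": 0.5, "tails": 0.5}))  # True
print(is_valid_pmf({"a": 0.7, "b": 0.6}))          # False (sums to 1.3)
```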
3. Explain Probability Mass Function (PMF) and provide example(s) of PMF.
- Definition: Describes the probability distribution over a discrete random variable (RV). It is a function that returns the probability of a RV being exactly equal to a specific outcome or value.
- Example(s): Bernoulli Distribution, Binomial Distribution, Poisson Distribution
4. Explain Probability Density Function (PDF) and provide example(s) of PDF.
- Definition: Describes the probability distribution of a continuous random variable (RV), that is, a random variable that can take on any value within a certain range.
- Example(s): Normal Distribution, Exponential Distribution
5. What is the total area under a continuous and discrete probability distribution curve?
6. In a probability density function, is it possible to find the probability of a single, specific point?
7. Explain Cumulative Distribution Function (CDF).
Bernoulli Distribution
1. Explain the Bernoulli distribution.
2. Provide some examples (experiments) where a Bernoulli distribution can be applied.
- Flipping a coin (heads or tails)
- Passing an exam (pass or fail)
3. Provide the Bernoulli distribution function.
$P(k;p) = p^k(1-p)^{1-k}$
Where:
- $k$ is the outcome (0 for failure, 1 for success)
- $p$ is the probability of success
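The function can be written directly in code. This is a minimal sketch in which the function name `bernoulli_pmf` and the value $p = 0.25$ are assumptions chosen for illustration.

```python
def bernoulli_pmf(k, p):
    """P(k; p) = p^k * (1 - p)^(1 - k), defined only for k in {0, 1}."""
    if k not in (0, 1):
        raise ValueError("k must be 0 (failure) or 1 (success)")
    return p ** k * (1 - p) ** (1 - k)

p = 0.25  # hypothetical probability of success

print(bernoulli_pmf(1, p))  # 0.25 -> probability of success
print(bernoulli_pmf(0, p))  # 0.75 -> probability of failure
```

Setting $k = 1$ leaves only $p$, and setting $k = 0$ leaves only $1 - p$, which is why a single expression covers both outcomes.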
4. Derive the expected value for the Bernoulli distribution.
5. Derive the variance for the Bernoulli distribution.
$Var(X) = E(X^2) - [E(X)]^2$
$E(X^2) = 1^2 \times p + 0^2 \times (1-p) = p$
$[E(X)]^2 = p^2$
$Var(X) = p - p^2 = p(1-p)$
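The derivation can be sanity-checked by evaluating each term from the PMF. The sketch below assumes $p = 1/4$ purely as an example and uses exact fractions so the comparison with $p(1-p)$ is exact.

```python
from fractions import Fraction

p = Fraction(1, 4)  # hypothetical probability of success
pmf = {1: p, 0: 1 - p}

e_x = sum(k * prob for k, prob in pmf.items())        # E(X)
e_x2 = sum(k ** 2 * prob for k, prob in pmf.items())  # E(X^2)
var = e_x2 - e_x ** 2                                 # E(X^2) - [E(X)]^2

print(e_x, e_x2, var, p * (1 - p))  # 1/4 1/4 3/16 3/16
```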
Binomial Distribution
1. Explain the Binomial distribution.
2. Provide some examples (experiments) where a Binomial distribution can be applied.
- Counting the number of heads in a series of coin flips.
- Finding the number of students that will pass a series of independent exams.
3. State the key characteristics of a Binomial Distribution.
- Binary Outcomes: Each trial results in one of two possible outcomes, often referred to as success and failure.
- Fixed Number of Trials ($n$): The number of trials is predetermined and remains constant throughout the experiment.
- Independence: The outcome of one trial does not affect the outcome of another. Each trial is considered independent.
- Constant Probability of Success ($p$): The probability of success (denoted by $p$) remains the same for each trial.
4. Provide the Binomial distribution function.
$P(X=k) = \binom{n}{k} p^k(1-p)^{n-k}$
Where:
- $k$ is the number of successes
- $n$ is the number of independent trials
- $p$ is the probability of success
- $\binom n k$ is the binomial coefficient, representing the number of ways to choose $k$ successes from $n$ trials.
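A minimal sketch of the binomial probability function, using `math.comb` for the binomial coefficient and raising the failure probability $(1-p)$ to the number of failures $n-k$. The parameters (10 fair coin flips, 3 heads) are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.5  # hypothetical: 10 flips of a fair coin

print(binomial_pmf(3, n, p))                             # 0.1171875
print(sum(binomial_pmf(k, n, p) for k in range(n + 1)))  # 1.0
```

The second line confirms that the probabilities over all possible values of $k$ sum to 1.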
5. Provide the expected value of the Binomial distribution.
A binomially distributed random variable $X$ can be written as the sum of $n$ independent, identically distributed Bernoulli random variables, each with expected value $p$. By linearity of expectation, this gives:
$E(X)= np$
6. Provide the variance for the Binomial distribution.
Similarly, $X$ is the sum of $n$ independent Bernoulli random variables, each with variance $p(1-p)$. Because the variance of a sum of independent random variables is the sum of their variances, this gives:
$Var(X) = npq = np(1-p)$, where $q = 1-p$
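Both results can be verified by computing the mean and variance directly from the binomial probabilities. The sketch below assumes $n = 5$ and $p = 1/3$ as arbitrary example parameters and uses exact fractions.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical parameters, chosen only for illustration.
n, p = 5, Fraction(1, 3)

pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
mean = sum(k * prob for k, prob in pmf.items())
var = sum((k - mean) ** 2 * prob for k, prob in pmf.items())

print(mean, n * p)           # 5/3 5/3
print(var, n * p * (1 - p))  # 10/9 10/9
```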
7. What happens to the binomial distribution as the number of trials increases?
Normal Distribution
1. Explain normal distribution (ND).
2. What makes the normal distribution such an important concept?
3. In the context of a normal distribution, how are the mean, median, and mode interrelated?
4. What is the skewness of a normal distribution?
5. Explain the empirical rule and the main use-case.
The 68-95-99.7 rule, also known as the empirical rule, is used to estimate the spread of data in a normal distribution.
- About 68% of the data falls within 1 standard deviation of the mean.
- About 95% of the data falls within 2 standard deviations of the mean.
- About 99.7% of the data falls within 3 standard deviations of the mean.
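These percentages can be reproduced from the standard normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+). Using the standard normal is an assumption that loses no generality, since the rule is stated in units of standard deviations.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

for k in (1, 2, 3):
    # Probability of falling within k standard deviations of the mean.
    within = z.cdf(k) - z.cdf(-k)
    print(f"within {k} sd: {within:.4f}")

# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973
```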
6. Define standard normal distribution in the context of mean and standard deviation.
7. How do you transform a normal distribution into a standard normal distribution?
$Z= \frac{x-\mu}{\sigma}$
- $Z$: Z-score
- $x$: Individual observation
- $\mu$: Mean
- $\sigma$: Standard deviation
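A one-line implementation of the transformation, shown with made-up numbers (an observation of 85 from a distribution with mean 70 and standard deviation 10). The function name `z_score` is an illustrative assumption.

```python
def z_score(x, mu, sigma):
    """Return Z = (x - mu) / sigma for a single observation x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical observation: 85, from a distribution with mean 70 and sd 10.
print(z_score(85, 70, 10))  # 1.5
```

Here the observation lies 1.5 standard deviations above the mean.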
8. How does the $Z$ formula transform a normal distribution's mean to 0 and standard deviation to 1?
Formula: $Z= \frac{x-\mu}{\sigma}$
- Shifting by the mean $(x-\mu)$:
- Subtracting the mean shifts the distribution so that the mean becomes 0.
- Scaling by the standard deviation $(\frac{1}{\sigma})$:
- Dividing each shifted data point by the standard deviation expresses it in units of standard deviations from the mean. Hence the standard deviation of the transformed values is 1.
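The shift-and-scale effect can be observed on a small data set. The sketch below is an empirical illustration on made-up observations rather than on a theoretical normal distribution. It uses the population standard deviation (`statistics.pstdev`) so that the standardised values have standard deviation 1.

```python
from statistics import fmean, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

mu = fmean(data)      # mean of the original data
sigma = pstdev(data)  # population standard deviation of the original data

# Shift by the mean, then scale by the standard deviation.
z_scores = [(x - mu) / sigma for x in data]

print(mu, sigma)                          # original mean and standard deviation
print(fmean(z_scores), pstdev(z_scores))  # approximately 0 and 1
```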
9. What does the $Z$ score tell us?
10. What do positive and negative $Z$ scores indicate?
- Positive $Z$ score indicates that the data point is above the mean of the distribution.
- Negative $Z$ score indicates that the data point is below the mean of the distribution.
11. What are the main use-cases of transforming a normal distribution into a standard normal distribution?
- Comparisons and Standardisation:
- Standardising the data allows for meaningful comparisons across different normal distributions. It simplifies the analysis by providing a common scale for measurements, regardless of the original distribution’s parameters.
- Probability Calculations:
- The transformation allows us to calculate probabilities associated with specific values using the standard normal distribution.
- Statistical Testing:
- Many statistical tests and hypothesis testing procedures rely on test statistics that follow, at least approximately, a standard normal distribution.
12. Describe the steps to find probability using the $Z$ distribution.
- State the problem:
- Decide whether the probability required is for values greater than a value, less than a value, or within an interval.
- Standardise the value:
- Calculate the Z-score.
- Determine the probability:
- Find the corresponding cumulative probability of that $Z$ score.
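The steps above can be sketched end to end. The problem below (exam scores assumed to be normally distributed with mean 70 and standard deviation 10) is hypothetical, and `statistics.NormalDist().cdf` stands in for a lookup in a Z table.

```python
from statistics import NormalDist

# Hypothetical problem: exam scores are assumed normal with mean 70 and
# standard deviation 10. What is the probability of a score above 85?
mu, sigma, x = 70, 10, 85

# Step 1: state the problem -> we need P(X > 85).
# Step 2: standardise the value.
z = (x - mu) / sigma

# Step 3: find the cumulative probability P(Z <= z), then take the
# complement because the question asks for "greater than".
cumulative = NormalDist().cdf(z)
prob_above = 1 - cumulative

print(f"z = {z}, P(X > {x}) = {prob_above:.4f}")  # z = 1.5, P(X > 85) = 0.0668
```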
13. In the context of the standard normal distribution, provide the formula to find the probability of a value less than or equal to a certain value. Explain your answer.
- Formula: $P(Z\le a)$
- Explanation: The Z table provides the cumulative probability, which is the probability that a randomly selected value from a standard normal distribution is less than or equal to $a$.
14. In the context of the standard normal distribution, provide the formula to find the probability of a value greater than a certain value. Explain your answer.
- Formula: $P(Z>a) =1- P(Z\le a)$
- Explanation: Subtracting $P(Z\le a)$ from 1 gives the probability that $Z$ is greater than $a$.
15. In the context of the standard normal distribution, provide the formula to find the probability of a value falling within a certain interval. Explain your answer.
- Formula: $P(a \le Z \le b) = P(Z \le b)-P(Z \le a)$
- Explanation: Subtracting $P(Z\le a)$ from $P(Z\le b)$ gives the probability of $Z$ falling within the interval $[a,b]$. We are removing the portion of the distribution less than a, leaving the portion between $a$ and $b$.