## Karnataka 2nd PUC Statistics Notes Chapter 5 Theoretical Distribution

Highlights of the Topic:

→ Probability distributions of a random variable that are obtained on the basis of theoretical assumptions are known as theoretical (or probability) distributions.

→ Discrete probability distributions: The probability distribution of a discrete random variable is known as a discrete probability distribution.

Ex: Number of heads obtained when three coins are tossed, number of female children in a family, number of accidents occurring in a city in a day, number of balls of a given colour obtained when drawing without replacement from a bag of different coloured balls, etc. are examples of discrete variables. The following probability distributions are used to deal with such examples.

• Bernoulli distribution
• Binomial distribution
• Poisson distribution
• Hyper-geometric distribution

Continuous Probability Distributions:
The probability distribution of a continuous random variable is known as a continuous probability distribution. Ex: Height/Weight/Marks obtained by a class of students, Age/Wages/Income of employees of a factory, etc. are all examples of continuous variables. The following probability distributions are used to deal with such examples.

• Normal distribution
• Chi-square distribution
• Student’s t-distribution

Discrete Probability Distributions:

Bernoulli Distribution:

{Both the Bernoulli and Binomial distributions were introduced by James Bernoulli.} A random experiment which has only two outcomes, ‘success’ and ‘failure’, where
P(success) = p and P(failure) = q = (1 – p), is called a Bernoulli trial or experiment.

Examples:
1. Tossing a fair coin once and getting the outcome Head (success, p) or Tail (failure, q)

2. A newborn baby may be male (p) or female (q)

3. A bomb dropped on a target may hit (p) or may not hit (q)

4. An item chosen at random may be defective or not

5. Rolling a die and getting the number 6 (success) or not (other numbers).

The probability mass function (p.m.f.) is:
P(x) = $$p^{x}(1-p)^{1-x}$$; where x = 0, 1
or P(x) = $$p^{x} q^{1-x}$$; x = 0, 1, where p is the probability of success (0 < p < 1) and q = 1 – p

→ Here x is discrete and is called a Bernoulli variate.

• The Bernoulli distribution with parameter p is denoted by B(p).
• The distribution can also be written as: P(x = 1) = p and P(x = 0) = 1 – p, where p is the probability of success.

→ A random variable x which assumes the values 1 and 0 with respective probabilities p and (1 – p) is called a Bernoulli variate.

Properties/Features:

• Here p is the parameter and is a constant
• Mean = E(x) = p
• Var(x) = p(1 – p) or pq
• For the distribution, Mean (p) > Variance (pq)
• S.D.(x) = $$\sqrt{p(1-p)}$$ or $$\sqrt{p q}$$

Binomial Distribution:

Bernoulli distribution tends to Binomial distribution:
If x1, x2, x3, …, xn are independent and identically distributed (i.i.d.) Bernoulli variates with parameter p, then (x1 + x2 + x3 + … + xn) is a Binomial variate with parameters n and p.
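The statement above can be checked numerically. The following is a minimal sketch, not part of the original notes: it builds the p.m.f. of a sum of n independent Bernoulli variates by repeated convolution and compares it with the Binomial formula. The function names and the values n = 3, p = 1/2 (three fair coins) are chosen here only for illustration.

```python
from fractions import Fraction
from math import comb


def bernoulli_pmf(p):
    """Bernoulli p.m.f. as a list: index x holds P(x) for x = 0, 1."""
    return [1 - p, p]


def convolve(pmf_a, pmf_b):
    """p.m.f. of the sum of two independent variates, given their p.m.f.s."""
    out = [0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, pa in enumerate(pmf_a):
        for j, pb in enumerate(pmf_b):
            out[i + j] += pa * pb
    return out


def sum_of_bernoulli_pmf(n, p):
    """p.m.f. of x1 + x2 + ... + xn for n i.i.d. Bernoulli(p) variates."""
    pmf = [1]  # p.m.f. of an empty sum: the value 0 with probability 1
    for _ in range(n):
        pmf = convolve(pmf, bernoulli_pmf(p))
    return pmf


n, p = 3, Fraction(1, 2)  # illustrative values: three fair coins
from_sum = sum_of_bernoulli_pmf(n, p)
from_formula = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

print(from_sum)      # P(0), P(1), P(2), P(3) obtained by summing Bernoulli variates
print(from_formula)  # the same probabilities from the Binomial p.m.f.
```

Exact fractions are used so that the two lists can be compared for equality without rounding concerns.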

Conditions/Assumptions under which the Binomial distribution can be applied:

• Trials are repeated a number of times and are independent.
• Each trial is a Bernoulli trial with two outcomes, success and failure.
• The probability of success ‘p’ should be constant for each of the trials.
• The experiment should be conducted under similar conditions for a fixed number of trials, say ‘n’.

Examples:

• Number of heads obtained when 5 coins are tossed
• Number of male children in a family of 3 children
• Number of defective articles in a random sample of 7 articles
• Number of bombs hitting a target when 4 bombs are dropped on it.

Similarly, the number of accidents, deaths, infections, persons contracting a disease, literates, mango trees among the trees, etc.
The p.m.f. is: P(x) = $${ }^{n} C_{x} p^{x} q^{n-x}$$; where x = 0, 1, 2, 3, …, n, and the range of p is 0 < p < 1.
Here x is discrete and is called a Binomial variate.
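As an illustration of the p.m.f. above, here is a small sketch for the example "number of heads obtained when 5 coins are tossed" (n = 5, p = 1/2, assuming fair coins). The helper name `binomial_pmf` is chosen here for illustration; the mean and variance are computed directly from the probabilities so that they can be compared with np and npq.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n - x) with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)


n, p = 5, 0.5  # five fair coins (assumed fair for this sketch)
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed from the definition, using the p.m.f.
mean = sum(x * px for x, px in enumerate(probs))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(probs))

for x, px in enumerate(probs):
    print(f"P({x}) = {px:.5f}")
print("mean =", mean, " np =", n * p)
print("variance =", variance, " npq =", n * p * (1 - p))
```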

Properties / Features:-
→ n & p are the parameters

→ Range: 0, 1, 2, …, n

→ The Binomial distribution with the parameters n, p denoted by B(n, p)

→ Mean = np, var(x) = npq, sd(x) = √var(x) = $$\sqrt{\mathrm{npq}}$$

→ Relation between mean and variance: mean > variance, ie. np > npq

→ Binomial distribution is symmetric when p = $$\frac{1}{2}$$ (i.e., β1 = 0 non-skewed).

→ Expected/Theoretical frequency = Tx = P(x) · N, where N is the total frequency

→ The distribution is called symmetric when p = q

→ Recurrence relation to get theoretical frequencies: Tx = $$\frac{n+1-x}{x} \cdot \frac{p}{q} T_{x-1}$$

→ Recurrence relation to get theoretical probabilities: P(x) = $$\frac{n+1-x}{x} \cdot \frac{p}{q} P(x-1)$$
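The recurrence relation can be sketched in code as follows. This is an illustrative example only: the total frequency N = 100 and the parameters n = 4, p = 0.5 are hypothetical values, and the starting term T0 = N · q^n follows from P(0) = q^n.

```python
from math import comb


def binomial_frequencies(N, n, p):
    """Theoretical frequencies T0..Tn using Tx = (n+1-x)/x * (p/q) * T(x-1)."""
    q = 1 - p
    freqs = [N * q**n]  # T0 = N * P(0) = N * q^n
    for x in range(1, n + 1):
        freqs.append((n + 1 - x) / x * (p / q) * freqs[-1])
    return freqs


N, n, p = 100, 4, 0.5  # hypothetical values for illustration
by_recurrence = binomial_frequencies(N, n, p)

# Cross-check against Tx = N * P(x) computed directly from the p.m.f.
direct = [N * comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

for x, (t_rec, t_dir) in enumerate(zip(by_recurrence, direct)):
    print(f"T{x}: recurrence = {t_rec:.4f}, direct = {t_dir:.4f}")
```

The recurrence avoids recomputing the combinations for every x, which is why it is convenient for hand calculation when fitting a Binomial distribution.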

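→ The terms of the B.D. are the successive terms of the expansion of (q + p)^n: $$q^{n},\ { }^{n} C_{1} q^{n-1} p,\ { }^{n} C_{2} q^{n-2} p^{2},\ \ldots,\ p^{n}$$

→ If p < $$\frac{1}{2}$$ (i.e. q > $$\frac{1}{2}$$), then the binomial distribution is positively skewed (i.e., β1 > 0).

→ If p > $$\frac{1}{2}$$ (i.e. q < $$\frac{1}{2}$$), then the binomial distribution is negatively skewed (i.e., β1 < 0).

Poisson Distribution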

{The French mathematician S. D. Poisson introduced this distribution in 1837; it is used to describe the behaviour of rarely occurring events.}

Examples:

• Number of telephone calls received in one minute
• No. of printing mistakes in a book/typing mistakes (typographical errors) in a page.
• No. of accidents/deaths occurring in a city in a day
• No. of defective articles manufactured in a lot by a firm.
• Number of vehicles crossing a junction in one minute.

The Binomial distribution tends to the Poisson distribution under the following conditions:
(i) When n is large, i.e. n → ∞
(ii) When p is very small, i.e. p → 0, and
(iii) Mean = np = λ is fixed/constant, which is the parameter of the Poisson distribution.

Poisson distribution is:
A distribution which has the following p.m.f.:
P(x) = $$\frac{e^{-\lambda} \lambda^{x}}{x !}$$; where x = 0, 1, 2, …, ∞ and λ > 0 (λ is read ‘lambda’).
Here x is discrete and is called a Poisson variate.
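The limiting behaviour described above can be illustrated numerically. The sketch below is only an illustration under assumed values (λ = 2 and n = 10, 100, 10000, none of which come from the notes): it keeps np = λ fixed while n grows and reports the largest gap between the Binomial and Poisson probabilities for x = 0 to 10. It also builds the Poisson probabilities with the recurrence P(x) = (λ/x) · P(x – 1), the probability form of the frequency recurrence listed among the properties.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """P(x) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)


def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * (1 - p)^(n - x)"""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_terms(lam, count):
    """First `count` Poisson probabilities via P(x) = (lam / x) * P(x - 1)."""
    terms = [exp(-lam)]  # P(0) = e^(-lam)
    for x in range(1, count):
        terms.append(lam / x * terms[-1])
    return terms


lam = 2.0  # assumed value of the parameter
gaps = []
for n in (10, 100, 10000):
    p = lam / n  # keeps the mean np = lam fixed as n grows
    gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))
    gaps.append(gap)
    print(f"n = {n:5d}, p = {p:.4f}, largest gap = {gap:.6f}")

first_three = poisson_terms(lam, 3)
print("first three terms:", [round(t, 4) for t in first_three])
```

The printed gap shrinks as n increases, which is what "Binomial tends to Poisson" means in practice.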

Properties/Features:

• e ≈ 2.7183 (Euler's number) is the base of the natural logarithm
• Range: 0, 1, 2, …, ∞
• λ is the parameter
• Mean = E(x) = λ, Var(x) = λ
• Here mean = variance is the relation between the mean and the variance
• Theoretical frequency/Expected frequency = Tx = P(x).N
• Recurrence relation to get theoretical frequencies: Tx = $$\frac{\lambda}{x} T_{x-1}$$
• First three terms of the distribution: P(0) = $$e^{-\lambda}$$, P(1) = $$\lambda e^{-\lambda}$$, P(2) = $$\frac{\lambda^{2} e^{-\lambda}}{2}$$

Hyper-geometric distribution:

Examples:-

• Number of girls in student representatives when 6 students are selected from 50 boys and 30 girls of a class.
• Number of coffee drinkers in a sample of 5 selected from a teaching staff of 15 coffee drinkers and 12 tea drinkers.
• Number of red balls drawn in a draw of 3 balls from an urn with 5 red and 4 black balls.
• Number of computer illiterates in a selection of 5 persons from an office of 10 men and 8 women.

A probability distribution which has the following probability mass function (p.m.f.):
P(x) = $$\frac{{ }^{a} C_{x}{ }^{b} C_{n-x}}{{ }^{a+b} C_{n}}$$; where x = 0, 1, 2, …, min(a, n), and a, b and n are positive integers (> 0).
Here x is discrete and is called a Hypergeometric variate.
Note: Here n ≤ (a + b).
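A short sketch of the p.m.f. above, using the values a = 3, b = 5 and n = 2 that appear in the properties list. The function name is chosen here for illustration. The mean and variance are computed from the probabilities and compared with the formulas na/(a + b) and nab(a + b – n)/((a + b)²(a + b – 1)) given under Properties.

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(x, a, b, n):
    """P(x) = aCx * bC(n-x) / (a+b)Cn, returned as an exact fraction."""
    return Fraction(comb(a, x) * comb(b, n - x), comb(a + b, n))


a, b, n = 3, 5, 2
probs = [hypergeometric_pmf(x, a, b, n) for x in range(min(a, n) + 1)]

# Mean and variance from the definition
mean = sum(x * px for x, px in enumerate(probs))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(probs))

# Mean and variance from the closed-form expressions
mean_formula = Fraction(n * a, a + b)
variance_formula = Fraction(n * a * b * (a + b - n), (a + b) ** 2 * (a + b - 1))

print("terms:", probs)  # fractions are shown in lowest terms
print("mean:", mean, "formula:", mean_formula)
print("variance:", variance, "formula:", variance_formula)
```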

Properties/Features:

1. a, b and n are the parameters.

2. Range: 0, 1, 2, ……….. min (a, n).

3. For a hyper-geometric distribution mean = $$\frac{\mathrm{na}}{\mathrm{a}+\mathrm{b}}$$

4. Var(x) = $$\frac{n a b(a+b-n)}{(a+b)^{2}(a+b-1)}$$ and S.D. = √Var(x)

5. Hypergeometric distribution tends to Binomial distribution when:
(i) a is large, i.e. a → ∞
(ii) b is large, i.e. b → ∞, and
(iii) $$\frac{a}{a+b}$$ = p is fixed
{Binomial distribution is a limiting form of Hyper-geometric distribution with parameters n and p = $$\frac{a}{a+b}$$}.

6. A hyper-geometric distribution with parameters a, b and n is denoted by H(x; a, b, n) or H(a, b, n).

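7. If a = 3, b = 5 and n = 2, the Hypergeometric distribution can be written as:
P(x) = $$\frac{{ }^{3} C_{x}{ }^{5} C_{2-x}}{{ }^{8} C_{2}}$$; where x = 0, 1, 2
The terms: P(0) = $$\frac{10}{28}$$, P(1) = $$\frac{15}{28}$$, P(2) = $$\frac{3}{28}$$

Continuous Probability Distributions

Normal Distribution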

[Introduced and developed by De Moivre, Pierre Laplace and Carl F. Gauss; this distribution is also called the Gaussian distribution]

→ The Normal distribution is a limiting case of the Binomial distribution, i.e. Binomial tends to Normal, under the following conditions:

• The number of trials ‘n’ becomes very large, i.e. n → ∞
• Neither p nor q is very small; then µ = np and σ = $$\sqrt{\mathrm{npq}}$$

→ When the parameter λ of a Poisson distribution becomes large, the Normal distribution is used as an approximation, i.e. Poisson tends to Normal when λ → ∞, with mean µ = λ and σ = √λ
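As a rough illustration of the Binomial-to-Normal approximation, the sketch below compares an exact Binomial probability with its Normal approximation using µ = np and σ = √(npq). The values n = 100, p = 0.5 and the interval 45 ≤ X ≤ 55 are hypothetical. The half-unit continuity correction (44.5 to 55.5) is a standard refinement that is not mentioned in these notes and is included here as an assumption.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # hypothetical values: n large, p not close to 0 or 1
q = 1 - p
mu, sigma = n * p, sqrt(n * p * q)  # mu = np, sigma = sqrt(npq)

# Exact Binomial probability P(45 <= X <= 55)
exact = sum(comb(n, x) * p**x * q ** (n - x) for x in range(45, 56))

# Normal approximation with a continuity correction of 0.5 (an assumption here)
approx_dist = NormalDist(mu, sigma)
approx = approx_dist.cdf(55.5) - approx_dist.cdf(44.5)

print(f"exact Binomial probability = {exact:.4f}")
print(f"Normal approximation       = {approx:.4f}")
```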

Examples:

• Ht. / Wt. of students of a class
• Wt. of apples grown in an orchard
• I.Q. of a large group of children.
• Marks scored by students in an examination.
• Wages / Income of employees.

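A probability distribution which has the following probability density function (p.d.f.):
f(x) = $$\frac{1}{\sigma \sqrt{2 \pi}} \mathrm{e}^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$; where – ∞ < x < ∞, – ∞ < µ < ∞ and σ > 0
Here x is continuous and is called a Normal variate.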

For a N.D.:

• Range: (– ∞, ∞)
• µ and σ are the parameters
• In the distribution, π ≈ 3.14 and e ≈ 2.718 (Euler's number)
• Mean = E(x) = µ, Var(x) = σ², S.D. = σ
• A normal variate with parameters µ and σ is denoted by N(µ, σ²)

Properties of Normal distribution/Normal curve: A Normal distribution with parameters µ and σ has the following properties:
1. The curve is bell shaped:

• The curve is symmetrical (non-skew) β1 = 0
• Mean = Median = Mode, ie. Mean, Median and Mode are all equal.

2. The Quartiles Q1 and Q3 are equidistant from the Median and are given by:
Q1 = µ – 0.6745σ and Q3 = µ + 0.6745σ (Here, Median (Q2) = Mode (Z) = µ = $$\frac{\mathrm{Q}_{1}+\mathrm{Q}_{3}}{2}$$)

3. The curve is asymptotic to the x-axis, i.e. the curve approaches the x-axis as x → – ∞ and x → + ∞ but never touches it.

4. The curve has Points of Inflexion at µ ± σ.

5. For the distribution: S.D = σ, Q.D ≈ $$\frac{2}{3}$$σ, M.D ≈ $$\frac{4}{5}$$σ (approximately). Here Q.D = $$\frac{\mathrm{Q}_{3}-\mathrm{Q}_{1}}{2}$$

6. The distribution is mesokurtic β2 = 3.

7. The total area under the curve is one (1). In particular:
(a) P(µ – σ < X < µ + σ) = 0.6826,
(b) P(µ – 2σ < X < µ + 2σ) = 0.9544,
(c) P(µ – 3σ < X < µ + 3σ) = 0.9974

Standard Normal Variate (SNV): A Normal variate with mean µ = 0 and S.D. σ = 1 is called a
S.N.V., denoted by Z; i.e. Z = $$\frac{x-\mu}{\sigma}$$ ~ N(0, 1).

The p.d.f. of the SNV is f(z) = $$\frac{1}{\sqrt{2 \pi}} \mathrm{e}^{-\frac{z^{2}}{2}}$$; where – ∞ < z < + ∞. Here Z = $$\frac{x-\mu}{\sigma}$$.
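The area and quartile properties can be checked with Python's standard library, as in the sketch below. The parameters µ = 50, σ = 10 and the cut-off 65 are hypothetical values chosen for illustration. The computed areas agree with the rounded table values quoted in these notes to about three decimal places.

```python
from statistics import NormalDist

mu, sigma = 50.0, 10.0  # hypothetical parameters
X = NormalDist(mu, sigma)  # the Normal variate N(mu, sigma^2)
Z = NormalDist()  # the standard Normal variate N(0, 1)


def area_within(k):
    """P(mu - k*sigma < X < mu + k*sigma)."""
    return X.cdf(mu + k * sigma) - X.cdf(mu - k * sigma)


areas = [area_within(k) for k in (1, 2, 3)]
for k, area in zip((1, 2, 3), areas):
    print(f"P(mu - {k} sigma < X < mu + {k} sigma) = {area:.4f}")

# Quartiles: Q1 = mu - 0.6745 sigma and Q3 = mu + 0.6745 sigma
q1, q3 = X.inv_cdf(0.25), X.inv_cdf(0.75)
print(f"(mu - Q1) / sigma = {(mu - q1) / sigma:.4f}")
print(f"(Q3 - mu) / sigma = {(q3 - mu) / sigma:.4f}")

# A probability about X found through Z = (x - mu) / sigma
x = 65.0
z = (x - mu) / sigma
p_via_z = 1 - Z.cdf(z)  # area under the standard Normal curve from z to infinity
p_direct = 1 - X.cdf(x)
print(f"P(X > {x}) = {p_via_z:.4f} (via Z = {z}) and {p_direct:.4f} (directly)")
```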

Let x be a normal variate with mean µ and S.D. σ; then Z = $$\frac{x-\mu}{\sigma}$$ is a Standard Normal Variate. To find any probability regarding X, the S.N.V. is used: the probability is read as the area under the standard Normal curve from 0 to z or from z to ∞.

Chi-Square Distribution

Note: Definition of the χ2 – distribution: Let Z1, Z2, Z3, …, Zn be n independent S.N.V.s; then
$$\chi^{2}=Z_{1}^{2}+Z_{2}^{2}+Z_{3}^{2}+\ldots+Z_{n}^{2} \sim \chi^{2}(n)$$

Features/Properties:

• Parameter = n
• Range: (0, ∞)
• Mean = n, Variance = 2n, S.D. = √Var(x) = $$\sqrt{2 n}$$
• Mode = (n – 2) for n > 2
• The curve is positively skewed for n > 2 (β1 > 0).
• χ2 – distribution is leptokurtic (β2 > 3).
• Total area under the χ2 – curve is equal to 1.
• χ2 – distribution tends to the Normal distribution when n is large, i.e. n → ∞
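The definition of the χ2 variate as a sum of squares of n independent standard normal variates can be illustrated by simulation. The sketch below is only an illustration: n = 5, the sample size of 20000 and the random seed are arbitrary choices made here, and the sample mean and variance will only be approximately equal to n and 2n.

```python
import random
from statistics import fmean, pvariance


def chi_square_draw(n, rng):
    """One chi-square(n) value: the sum of squares of n independent N(0, 1) draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


rng = random.Random(12345)  # fixed seed so that the run is repeatable
n = 5  # arbitrary choice of the parameter
draws = [chi_square_draw(n, rng) for _ in range(20000)]

sample_mean = fmean(draws)
sample_var = pvariance(draws)

print(f"sample mean     = {sample_mean:.3f} (theory: n = {n})")
print(f"sample variance = {sample_var:.3f} (theory: 2n = {2 * n})")
print(f"smallest value  = {min(draws):.4f} (range is 0 to infinity)")
```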

Application:

• Test for population variance
• Test for Goodness of Fit
• Test for Independence of Attributes.

Student’s t-Distribution

This distribution was developed by W. S. Gosset in 1908. It is derived from the Normal distribution.

Note 1: The t-distribution can also be written:
If k = $$\frac{1}{\sqrt{n} \, B\left(\frac{1}{2}, \frac{n}{2}\right)}$$, where B denotes the beta function,

then f(t) = k × $$\frac{1}{\left(1+\frac{t^{2}}{n}\right)^{\frac{n+1}{2}}}$$; Range: – ∞ < t < ∞

Note 2: t – variate with n d.f. is denoted by t(n).
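The density in Note 1 can be evaluated numerically, as in the following sketch. It assumes the standard identity B(a, b) = Γ(a)Γ(b)/Γ(a + b) for the beta function, uses n = 5 as an arbitrary choice, and applies a simple trapezoidal rule on the interval from – 100 to 100 (also an arbitrary choice) to check that the total area is close to 1 and that the variance is close to n/(n – 2), the value listed among the properties.

```python
from math import gamma, sqrt


def beta(a, b):
    """Beta function via the gamma function: B(a, b) = G(a) G(b) / G(a + b)."""
    return gamma(a) * gamma(b) / gamma(a + b)


def t_pdf(t, n):
    """f(t) = k / (1 + t^2 / n)^((n + 1) / 2) with k = 1 / (sqrt(n) * B(1/2, n/2))."""
    k = 1 / (sqrt(n) * beta(0.5, n / 2))
    return k / (1 + t * t / n) ** ((n + 1) / 2)


def trapezoid(f, lo, hi, steps):
    """Trapezoidal approximation of the integral of f from lo to hi."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


n = 5  # degrees of freedom, chosen for illustration
area = trapezoid(lambda t: t_pdf(t, n), -100.0, 100.0, 20000)
variance = trapezoid(lambda t: t * t * t_pdf(t, n), -100.0, 100.0, 20000)

print(f"total area = {area:.6f}")
print(f"variance   = {variance:.4f} (theory: n / (n - 2) = {n / (n - 2):.4f})")
print(f"f(0) = {t_pdf(0.0, n):.4f}, f(1) = {t_pdf(1.0, n):.4f}, f(-1) = {t_pdf(-1.0, n):.4f}")
```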

Features / Properties:

• Parameter ‘n’ is called the degrees of freedom
• Range: (– ∞, ∞)
• The t-curve is bell shaped
• Mean = 0 (Mean = Median = Mode = 0)
• Var(x) = $$\frac{\mathrm{n}}{\mathrm{n}-2}$$ for n > 2; and S.D(x) = $$\sqrt{V(x)}$$
• The t-distribution is symmetrical about t = 0, i.e. β1 = 0.
• The distribution is leptokurtic (β2 > 3).
• t-distribution is asymptotic to the x-axis.
• t-distribution tends to the Normal distribution when n is large.

Application: The t-distribution is used in small-sample tests of hypotheses:

• To test for mean,
• Test for equality of means,
• Test for equality of population means when observations are paired (paired t-test).
