Mathematics Revision Notes Of Statistics For NDA

Statistics

Statistics is the science of collection, organisation, presentation, analysis and interpretation of the numerical data.

(i) Data : Data is a collection of facts or information from which conclusions may be drawn.

e.g.

The data shown below are Sameer's scores on five Math tests conducted in 10 weeks.
45, 23, 67, 82, 71

(ii) Primary and secondary data : Data collected by the investigator himself is known as primary data, while data which has already been
collected by other persons is known as secondary data. e.g. An investigator who collects data related to industries through government organisations is using secondary data.

(iii) Variable or Variate : A characteristic that varies in magnitude from observation to observation is called a variable or variate. e.g. Weight, height, income, age, etc. are variables.

(iv) Frequency The number of times an observation occurs in the given data, is called the frequency of the observation.
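
As a rough illustration of the definition of frequency, the sketch below counts how often each observation occurs. The function name `frequencies` and the list `marks` are hypothetical and not part of the notes.

```python
from collections import Counter

def frequencies(observations):
    """Return a dict mapping each distinct observation to its frequency."""
    return dict(Counter(observations))

# Hypothetical data chosen only for illustration.
marks = [3, 5, 3, 7, 5, 3]
print(frequencies(marks))  # the observation 3 occurs three times, 5 twice, 7 once
```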

Classification of Data

(i) Grouped data : Data which can be organised into classes is called grouped data.

e.g.,

`0-10 qquad \ \ 3`
`10-20 qquad 4`
`20-30 qquad 5`

(a) Continuous data : If the upper limit of each class interval is equal to the lower limit of the next interval, then the data is called continuous data. The example above is continuous data.

(b) Discontinuous data : If the upper limit of each class interval is not equal to the lower limit of the next interval, then the data is called discontinuous data.

e.g.,
`0-9 qquad \ \ \ 3`
`10-19 qquad 2`
`20-29 qquad 1`

(ii) Ungrouped data : Data which cannot be organised into classes, or which is just a list of numbers, is called ungrouped data.
e.g. `3, 5, 11, 15, ...`

Frequency Distribution

There are two types of frequency distribution which are as follows:

Discrete frequency distribution

A frequency distribution is called a discrete frequency distribution, if data is presented in such a way that exact measurements of
the units are clearly shown.

Continuous frequency distribution

A frequency distribution in which data are arranged in classes or groups which are not exactly measurable is called a continuous frequency distribution.

Cumulative Frequency Distribution

If the frequency of the first class is added to that of the second, this sum is added to that of the third, and so
on, then the frequencies so obtained are known as cumulative frequencies (`cf`).

There are two types of cumulative frequency, viz. less than and greater than. For less than cumulative frequency, we add up the frequencies from above, and for greater than cumulative frequency, we add up the frequencies from below.
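
The two kinds of cumulative frequency can be sketched in Python as running totals. This is only an illustration: the function names are hypothetical, and the frequencies 3, 4, 5 are taken from the grouped data example given earlier in these notes.

```python
from itertools import accumulate

def less_than_cf(freqs):
    """Running totals taken from the first class downwards (added from above)."""
    return list(accumulate(freqs))

def greater_than_cf(freqs):
    """Running totals taken from the last class upwards (added from below)."""
    return list(accumulate(reversed(freqs)))[::-1]

freqs = [3, 4, 5]  # classes 0-10, 10-20, 20-30
print(less_than_cf(freqs))     # [3, 7, 12]
print(greater_than_cf(freqs))  # [12, 9, 5]
```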

The cumulative frequency distributions are as follows :

Relative Frequency Distribution

Relative frequencies are very useful for the comparison
of two or more frequency distributions. To express the
relative frequency of a class as a percentage of the total
frequency, the following formula is used:


Relative frequency % `= [(" Class frequency")/(" Total Frequency")] xx 100`
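
A minimal Python sketch of this formula follows. The function name is hypothetical, and the class frequencies 3, 4, 5 are reused from the grouped data example earlier in the notes.

```python
def relative_frequency_percent(freqs):
    """Relative frequency % = (class frequency / total frequency) x 100."""
    total = sum(freqs)
    return [f / total * 100 for f in freqs]

freqs = [3, 4, 5]  # total frequency is 12
for f, pct in zip(freqs, relative_frequency_percent(freqs)):
    print(f, round(pct, 2))
```

The percentages of all classes should add up to 100, which is a quick check on the arithmetic.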

MEASURES OF CENTRAL TENDENCY

An average or central value of a statistical series is the value of the variable which describes the characteristic
of the entire distribution.

The following are the five measures of central tendency

1. Mathematical averages

(i) Arithmetic Mean or Mean
(ii) Geometric Mean
(iii) Harmonic mean

2. Positional averages:

(i) Median
(ii) Mode

Out of these measures of central tendency, arithmetic mean, median and mode are sometimes known as
measures of location.

Arithmetic Mean (AM)

The sum of all the values in a series divided by the total number of values in the series is called the arithmetic mean.

Arithmetic Mean of Ungrouped or Individual Observations

If `x_1, x_2, .. . , x_n` are `n` values of a variable `X`, then the arithmetic mean of these values is given by

(i) Direct method :

`bar X = (x_1 +x_2 +......+x_n)/n ` or `bar X = 1/n (sum_(i=1)^n x_i)`

(ii) Shortcut method :

`bar X = A+1/n sum_(i=1)^n d_i`

where, ` A=` Assumed mean and `d_i =x_i-A`
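
Both methods can be compared with a short Python sketch. The function names are hypothetical, the data are Sameer's five scores from the start of these notes, and the assumed mean `A = 60` is an arbitrary choice made only for illustration.

```python
def mean_direct(values):
    """Direct method: sum of the values divided by their count."""
    return sum(values) / len(values)

def mean_shortcut(values, assumed_mean):
    """Shortcut method: A + (1/n) * sum of the deviations d_i = x_i - A."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)

scores = [45, 23, 67, 82, 71]
print(mean_direct(scores))                      # 288 / 5
print(mean_shortcut(scores, assumed_mean=60))   # 60 + (-12) / 5
```

Both calls should agree up to floating-point rounding, whatever value is chosen for the assumed mean.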

Weighted Arithmetic Mean

If `w_1, w_2, w_3, ..., w_n` are the weights assigned to the `n` values `x_1, x_2, ..., x_n`, respectively, then the weighted
arithmetic mean is given by

`bar X =(w_1x_1 +w_2x_2 +.........+ w_nx_n)/(w_1+w_2+........+w_n) = (sum_(i=1)^n w_i x_i)/(sum_(i=1)^n w_i)` or `bar X = (Sigma wx)/(Sigma w)`
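
As an illustration, the weighted mean formula can be sketched as below. The function name and the sample values and weights are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted AM = sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Hypothetical values with weights 1, 2 and 3.
print(weighted_mean([10, 20, 30], [1, 2, 3]))  # (10 + 40 + 90) / 6
```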

Arithmetic Mean of a Discrete Frequency Distribution

In a discrete frequency distribution, the arithmetic mean may be computed by any one of the following methods:

(i) Direct method : If a variable X takes values `x_1, x_2 , .... , x_n` with corresponding frequencies `f_1, f_2 , ... , f_n` respectively, then the arithmetic mean of the values is

`bar X =(f_1x_1 +f_2x_2 +.........+ f_nx_n)/(f_1+f_2 +.......+ f_n)` or `bar X = (sum_(i=1)^n f_i x_i)/N`

where, `N = f_1 + f_2 + ... + f_n = sum_(i=1)^n f_i`

(ii) Shortcut method : If the values of `x` or (and) `f` are large, the calculation of the arithmetic mean by the
formula used above is quite tedious and time consuming. In such a case, we use the formula

`bar X = A+ 1/N sum_(i=1)^n f_i d_i`

where, `d_i =x_i -A` and `A` is assumed mean.


Arithmetic Mean of a Grouped or Continuous Frequency Distribution

For computing the arithmetic mean in a continuous frequency distribution, we need to compute the mid-points of the class
intervals (`x`). The mid-points are multiplied by the corresponding frequencies to give the products `fx`.

The sum of these products is obtained and is divided by the sum of the frequencies. The arithmetic mean may be
computed by applying any of the methods used in a discrete frequency distribution.
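
The procedure can be sketched in Python as follows, using both the direct and the shortcut method. The function names are hypothetical, the classes and frequencies are those of the grouped data example earlier in these notes, and the assumed mean `A = 15` is an arbitrary choice.

```python
def grouped_mean(class_limits, freqs):
    """Direct method: sum(f * x) / sum(f), where x is the class mid-point."""
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    return sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)

def grouped_mean_shortcut(class_limits, freqs, assumed_mean):
    """Shortcut method: A + (1/N) * sum(f_i * d_i), with d_i = x_i - A."""
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    total = sum(freqs)
    return assumed_mean + sum(
        f * (x - assumed_mean) for f, x in zip(freqs, midpoints)
    ) / total

classes = [(0, 10), (10, 20), (20, 30)]
freqs = [3, 4, 5]
print(grouped_mean(classes, freqs))               # 200 / 12
print(grouped_mean_shortcut(classes, freqs, 15))  # 15 + 20 / 12
```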

Combined Mean

If we are given the AM of the two data sets and their sizes, then the combined AM of two data sets can be
obtained by the formula

`bar x_(12)= (n_1bar x_1+n_2 bar x_2)/(n_1+n_2)`

where, `bar x_(12) =` Combined mean of two data sets 1 and 2

`bar x_1 =` Mean of the first data

`bar x_2 =` Mean of the second data

`n_1 =` Size of the first data

`n_2 =` Size of the second data
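
A small Python sketch of the combined mean formula is given below. The function name and the sample sizes and means are hypothetical.

```python
def combined_mean(size1, mean1, size2, mean2):
    """Combined AM of two data sets from their sizes and individual means."""
    return (size1 * mean1 + size2 * mean2) / (size1 + size2)

# Hypothetical example: 10 values with mean 50 and 30 values with mean 70.
print(combined_mean(10, 50.0, 30, 70.0))  # (500 + 2100) / 40
```

Note that the combined mean lies closer to the mean of the larger data set, since each mean is weighted by its size.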

Properties of Arithmetic Mean

(i) Mean is dependent on change of origin as well as change of scale.
(ii) Algebraic sum of the deviations of a set of values from their arithmetic mean is zero.
(iii) The sum of the squares of the deviations of a set of values is minimum, when taken about the mean.
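
Properties (ii) and (iii) can be checked numerically with a sketch like the one below. The function names are hypothetical and the data are Sameer's scores from the start of these notes. This is a check on one data set, not a proof.

```python
def sum_of_deviations(values, about):
    """Algebraic sum of the deviations of the values from the point `about`."""
    return sum(x - about for x in values)

def sum_of_squared_deviations(values, about):
    """Sum of the squares of the deviations from the point `about`."""
    return sum((x - about) ** 2 for x in values)

scores = [45, 23, 67, 82, 71]
mean = sum(scores) / len(scores)

# Property (ii): the deviations from the mean cancel out (up to rounding).
print(sum_of_deviations(scores, mean))

# Property (iii): moving away from the mean increases the sum of squares.
print(sum_of_squared_deviations(scores, mean))
print(sum_of_squared_deviations(scores, mean + 1))
```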

Geometric Mean (GM)

The `n`th root of the product of `n` values is called the geometric mean.

(i) Geometric mean for ungrouped data : If `x_1, x_2 , .... , x_n` are `n` positive values of a variate `X`,
then geometric mean is

GM `= (x_1 * x_2 .......x_n)^(1//n)`

`log GM = 1/n (log x_1 + log x_2 +.......+ log x_n) = 1/n sum_(i=1)^n log x_i`

`GM =` antilog `(1/n sum_(i=1)^n log x_i)`
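
The logarithmic form of the geometric mean can be sketched in Python as below. The function name and sample values are hypothetical. Natural logarithms are used here with `math.exp` as the antilog. Any base gives the same result, provided the antilog uses the same base.

```python
import math

def geometric_mean(values):
    """GM = antilog of the mean of the logs; values must all be positive."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs positive values")
    mean_log = sum(math.log(x) for x in values) / len(values)
    return math.exp(mean_log)

# Hypothetical values: the product is 27, and its cube root is 3.
print(geometric_mean([1, 3, 9]))
```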

(ii) Geometric mean for grouped data : If `x_1, x_2,....x_n` are `n` observations whose corresponding frequencies
are `f_ 1, f_ 2 , ...., f_n`, then geometric mean is given by

`GM= ( x_1^(f_1) * x_2^(f_2) .........x_n^(f_n))^(1//N)`

`log GM = 1/N (f_1 log x_1 +f_2 log x_2 +.........+ f_n log x_n)`

`log GM = 1/N sum_(i=1)^n f_i log x_i , GM =` antilog `(1/N sum_(i=1)^n f_i log x_i)`

(iii) Combined geometric mean : If `G_1` and `G_2` are the geometric means of two series of sizes `n_1` and `n_2`
respectively, then the geometric mean GM of the combined series is given by

`log GM =(n_1 log G_1 + n_2 log G_2)/(n_1+n_2)`
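
The combined geometric mean can be sketched in the same way. The function name and the sample figures are hypothetical, and natural logarithms are again an arbitrary choice of base.

```python
import math

def combined_geometric_mean(size1, gm1, size2, gm2):
    """log GM = (n_1 log G_1 + n_2 log G_2) / (n_1 + n_2)."""
    log_gm = (size1 * math.log(gm1) + size2 * math.log(gm2)) / (size1 + size2)
    return math.exp(log_gm)

# Hypothetical series: [2, 8] has GM 4 and [1, 3, 9] has GM 3.
# The combined series [2, 8, 1, 3, 9] has product 432, so its GM is 432 ** (1/5).
print(combined_geometric_mean(2, 4.0, 3, 3.0))
print(432 ** (1 / 5))
```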

In the formulas for grouped data above, `N = f_1 + f_2 + ... + f_n`.

Harmonic Mean (HM)

The harmonic mean of any series is the reciprocal of the arithmetic mean of the reciprocals of the observations.

(i) Harmonic mean for ungrouped data : If `x_1, x_2 , .... , x_n` are `n` non-zero values of a variate
`X`, then harmonic mean is

`HM= n/(1/x_1 +1/x_2 + ...+1/x_n)= n/(sum_(i=1)^n (1/x_i))`

(ii) Harmonic mean for grouped data : If `x_1 , x _2 , ... , x_n` are `n` observations, whose corresponding
frequencies are `f_1, f_2 ,.... , f_n`, then

`HM= (f_1 +f_2 +....+f_n)/(f_1/x_1 +f_2/x_2 +........+ f_n/x_n)`

`= N/(sum_(i=1)^n (f_i/x_i))= (sum_(i=1)^n f_i)/(sum_(i=1)^n f_i/x_i)`

where, `N = f_1 + f_2 + ... + f_n`
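
Both harmonic mean formulas can be covered by one Python sketch, treating ungrouped data as the case where every frequency is 1. The function name and sample data are hypothetical.

```python
def harmonic_mean(values, freqs=None):
    """HM = N / sum(f_i / x_i); with no freqs given, every f_i is taken as 1."""
    if freqs is None:
        freqs = [1] * len(values)
    if any(x == 0 for x in values):
        raise ValueError("harmonic mean needs non-zero values")
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))

print(harmonic_mean([1, 2, 4]))       # ungrouped: 3 / (1 + 1/2 + 1/4)
print(harmonic_mean([2, 4], [1, 2]))  # grouped: 3 / (1/2 + 2/4)
```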


Median

Median of a distribution is the value of the variable such that the number of observations above it is equal to the
number of observations below it.

Median of Ungrouped Individual Observations

To find the median of individual observations `x_1, x_2, .... , x_n`, we use the following method:

(i) Arrange the observations `x _1, x _2, .. . , x_n` in ascending or descending order.

(ii) If `n` is odd, then median is the value of the `((n+1)/2)`th observation. If `n` is even, then median is the arithmetic mean of the `(n/2)`th and `(n/2 +1)`th observations.
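
The two steps can be sketched in Python as below. The function name is hypothetical. The odd case uses Sameer's scores from the start of these notes, and the even case uses a hypothetical list of four values.

```python
def median(observations):
    """Median of individual observations, for odd or even n."""
    ordered = sorted(observations)  # step (i): arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # the ((n + 1) / 2)th observation
    # mean of the (n / 2)th and (n / 2 + 1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([45, 23, 67, 82, 71]))  # 67
print(median([3, 5, 11, 15]))        # 8.0
```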

Median of Discrete Frequency Distribution

In case of a discrete frequency distribution `x_i, f_i, i = 1, 2, ... , n`, we calculate the median by using the
following method

(i) First arrange the data in ascending or descending order and then find the cumulative frequencies (`cf`).

(ii) Find `N/2` , where `N= sum_(i=1)^n f_i`

(iii) See the cumulative frequency (`cf`) just greater than `N/2`. The corresponding value of `x` is median

e.g. Suppose `N = 120`, so that `N/2 = 60`. If the cumulative frequency just greater than `N/2` is 65 and the value of `x` corresponding to 65 is 5, then the median is `5`.
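
Steps (i) to (iii) for a discrete frequency distribution can be sketched in Python as follows. The function name and the sample distribution are hypothetical.

```python
def median_discrete(values, freqs):
    """Value of x whose cumulative frequency is the first to exceed N / 2."""
    pairs = sorted(zip(values, freqs))  # step (i): ascending order of x
    half = sum(freqs) / 2               # step (ii): N / 2
    cumulative = 0
    for x, f in pairs:
        cumulative += f
        if cumulative > half:           # step (iii): cf just greater than N / 2
            return x
    raise ValueError("frequencies must have a positive total")

# Hypothetical distribution: N = 12, N / 2 = 6, cumulative frequencies 2, 5, 10, 12.
print(median_discrete([1, 2, 3, 4], [2, 3, 5, 2]))  # 3
```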

Median of a Grouped or Continuous Frequency Distribution

To calculate the median of a grouped or continuous frequency distribution, we use the following method

(i) Prepare the cumulative frequency column and obtain `N= Sigma f_i` and find `N/2`

(ii) See the cumulative frequency just greater than `N/2` and determine the corresponding class. This class is
known as the median class.

(iii) Use the following formula:

Median `=l+((N//2-C)/f) xx h`

where,

`l =` Lower limit of the median class
`f =`Frequency of the median class
`h =`Width (size) of the median class
`C =` Cumulative frequency of the class preceding the median class
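
The method can be sketched in Python as below, assuming all classes have the same width `h`. The function name is hypothetical, and the classes 0-10, 10-20, 20-30 with frequencies 3, 4, 5 are taken from the grouped data example earlier in these notes.

```python
def median_grouped(lower_limits, width, freqs):
    """Median = l + ((N / 2 - C) / f) * h for equal-width classes."""
    half = sum(freqs) / 2
    preceding_cf = 0  # C: cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if preceding_cf + f > half:  # this is the median class
            return lower + (half - preceding_cf) / f * width
        preceding_cf += f
    raise ValueError("frequencies must have a positive total")

# N = 12, N / 2 = 6; the median class is 10-20 with l = 10, C = 3, f = 4, h = 10.
print(median_grouped([0, 10, 20], 10, [3, 4, 5]))  # 10 + (3 / 4) * 10 = 17.5
```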

Symmetric Distribution

A distribution is a symmetric distribution, if the values of mean, mode and median coincide. In a symmetric
distribution, frequencies are symmetrically distributed on both sides of the centre point of the frequency curve

 