Mathematics: Introduction, Measures of Dispersion, Range, Mean Deviation, Mean deviation for ungrouped data and Mean deviation for grouped data

### Topics covered

★ Introduction
★ Measures of Dispersion
★ Range
★ Mean Deviation
★ Mean deviation for ungrouped data
★ Mean deviation for grouped data

### Introduction

\color{red} ✍️ We know that statistics deals with data collected for specific purposes. We can make decisions about the data by analysing and interpreting it.

\color{red} ✍️ In earlier classes, we have studied methods of representing data graphically and in tabular form.

\color{red} ✍️ This representation reveals certain salient features or characteristics of the data. We have also studied the methods of finding a representative value for the given data. color(blue)("This value is called the measure of central tendency. ")

Recall that the mean (arithmetic mean), median and mode are three measures of central tendency.

\color{red} ✍️ A measure of central tendency gives us a rough idea where data points are centred. But, in order to make better interpretation from the data, we should also have an idea how the data are scattered or how much they are bunched around a measure of central tendency.

color(red)(=>"Consider now the runs scored by two batsmen") in their last ten matches as follows:

color{orange}("Batsman A" : 30, 91, 0, 64, 42, 80, 30, 5, 117, 71)

color{orange}("Batsman B" : 53, 46, 48, 50, 53, 53, 58, 60, 57, 52)

Clearly, the mean and median of the data are

color{navy}(\ \ \ \ \ \ \ \ \ \ \ "Batsman A" \ \ \ \ \ \ \ \ \ \ \ "Batsman B")

color{orange}("Mean" \ \ \ \ \ \ \ \ \ \ 53\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 53)

color{orange}("Median" \ \ \ \ \ \ \ \ \53\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 53)

Recall that we calculate the mean of the data (denoted by color(blue)(barx) ) by dividing the sum of the observations by the number of observations, i.e.,

color{blue}(barx = 1/n sum_(i=1)^(n) x_i)

Also, the median is obtained by first arranging the data in ascending or descending order and applying the following rule.

\color{red} ✍️ If the number of observations is odd, then the median is color{orange}(((n+1)/2)^(th)) observation.

\color{red} ✍️ If the number of observations is even, then the median is the mean of the color{orange}((n/2)^(th)) and color{orange}((n/2+1)^(th)) observations.
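As a quick check of these definitions, the following minimal Python sketch computes the mean and the median of the two batsmen's scores. The function names `mean` and `median` are illustrative choices, not part of the lesson, and the even case uses the two middle observations of the sorted data.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their number."""
    return sum(values) / len(values)


def median(values):
    """Median by the positional rule applied to the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th observation (1-based position).
        return ordered[(n + 1) // 2 - 1]
    # Even n: mean of the (n / 2)-th and (n / 2 + 1)-th observations.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


batsman_a = [30, 91, 0, 64, 42, 80, 30, 5, 117, 71]
batsman_b = [53, 46, 48, 50, 53, 53, 58, 60, 57, 52]

for name, runs in (("A", batsman_a), ("B", batsman_b)):
    print(name, mean(runs), median(runs))
```

Both lines of output show 53.0 for the mean and 53.0 for the median.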

\color{red} ✍️ We find that the mean and median of the runs scored by both the batsmen A and B are same i.e., 53. Can we say that the performance of two players is same?

Clearly not, because the scores of batsman A vary from 0 (minimum) to 117 (maximum), whereas the runs scored by batsman B range only from 46 to 60.

Let us now plot the above scores as dots on a number line. We find the following diagrams:

\color{red} ✍️ We can see that the dots corresponding to batsman B are close to each other and are clustering around the measure of central tendency (mean and median), while those corresponding to batsman A are scattered or more spread out.

\color{red} ✍️ Thus, the measures of central tendency are not sufficient to give complete information about a given data.

\color{red} ✍️ Variability is another factor which is required to be studied under statistics. Like ‘measures of central tendency’ we want to have a single number to describe variability. This single number is called a ‘measure of dispersion’.

\color{red} ✍️ In this Chapter, we shall learn some of the important measures of dispersion and their methods of calculation for ungrouped and grouped data.

### Measures of Dispersion

The dispersion or scatter in data is measured on the basis of the observations and the type of measure of central tendency used. The following are the measures of dispersion:

(i) Range, (ii) Quartile deviation, (iii) Mean deviation, (iv) Standard deviation.

In this Chapter, we shall study all of these measures of dispersion except the quartile deviation.

### Range

\color{fuchsia} {ul ★ "Range"}

We find the difference of maximum and minimum values of each series. This difference is color(blue)("called the ‘Range’ of the data.")

Thus, color(red)("Range of a series = Maximum value – Minimum value. ")

\color{red} ✍️ Recall that, in the example of runs scored by two batsmen A and B, we had some idea of variability in the scores on the basis of minimum and maximum runs in each series. To obtain a single number for this, we compute the range of each series.

In case of color(navy)("batsman A,") color{orange}("Range" = 117 – 0 = 117) and for color(navy)("batsman B,") color{orange}("Range" = 60 – 46 = 14.)

Clearly, color(navy)("Range of A > Range of B.") Therefore, the scores are scattered or dispersed in case of A while for B these are close to each other.
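The two ranges can be reproduced with a short Python sketch. The helper name `data_range` is an assumption made for illustration.

```python
def data_range(values):
    """Range of a series = maximum value - minimum value."""
    return max(values) - min(values)


batsman_a = [30, 91, 0, 64, 42, 80, 30, 5, 117, 71]
batsman_b = [53, 46, 48, 50, 53, 53, 58, 60, 57, 52]

range_a = data_range(batsman_a)  # 117 - 0
range_b = data_range(batsman_b)  # 60 - 46
print(range_a, range_b)
```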

\color{red} ✍️ The range of data gives us a rough idea of variability or scatter but does not tell us about the dispersion of the data from a measure of central tendency. For this purpose, we need some other measure of variability. Clearly, such a measure must depend upon the difference (or deviation) of the values from the central tendency.

\color{red} ✍️ The important measures of dispersion, which depend upon the deviations of the observations from a central tendency are mean deviation and standard deviation. Let us discuss them in detail.

### Mean Deviation

color{green} ✍️ Recall that the deviation of an observation x from a fixed value ‘a’ is the difference color{orange}(x – a.)

In order to find the dispersion of values of x from a central value ‘a’ , we find the deviations about a.

An absolute measure of dispersion is the mean of these deviations.

To find the mean, we must obtain the sum of the deviations.

But, we know that a measure of central tendency lies between the maximum and the minimum values of the set of observations.

Therefore, some of the deviations will be negative and some positive. Thus, the sum of deviations may vanish.

Moreover, the sum of the deviations from mean color{orange}(( barx )) is zero.

Also color{blue}(\ \ \ \ \ \ \ \ \ \ \ \ \ \ "Mean of deviations" = ("Sum of deviations")/("Number of observations") = 0/n = 0)

Thus, finding the mean of deviations about mean is not of any use for us, as far as the measure of dispersion is concerned.
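The cancellation of positive and negative deviations can be seen numerically. The sketch below uses batsman A's scores from the introduction. With other data, floating-point rounding may leave a tiny non-zero remainder instead of an exact zero.

```python
runs = [30, 91, 0, 64, 42, 80, 30, 5, 117, 71]

mean = sum(runs) / len(runs)            # 53.0
deviations = [x - mean for x in runs]   # signed deviations x_i - mean
total = sum(deviations)                 # positives and negatives cancel

print(deviations)
print(total, total / len(runs))
```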

color{green} ✍️ Remember that, in finding a suitable measure of dispersion, we require the distance of each value from a central tendency or a fixed number ‘a’.

Recall, that the absolute value of the difference of two numbers gives the distance between the numbers when represented on a number line.

Thus, to find the measure of dispersion from a fixed number ‘a’ we may take the mean of the absolute values of the deviations from the central value. This mean is called the color(blue)("‘mean deviation’.")

Thus mean deviation about a central value ‘a’ is the mean of the absolute values of the deviations of the observations from ‘a’.

The mean deviation from ‘a’ is denoted as M.D. (a). Therefore,

color{blue}(M.D.(a) = ("Sum of absolute values of deviations from 'a'")/("Number of observations"))

color{orange}("Remark : ") Mean deviation may be obtained from any measure of central tendency. However, mean deviation from mean and median are commonly used in statistical studies.

Let us now learn how to calculate mean deviation about mean and mean deviation about median for various types of data.

### Mean deviation for ungrouped data

Let n observations be color{blue}(x_1, x_2, x_3, ...., x_n.)

The following steps are involved in the calculation of mean deviation about mean or median:

color{red}("Step 1 : ") Calculate the measure of central tendency about which we are to find the mean deviation. Let it be ‘a’.

color{red}("Step 2 : ") Find the deviation of each color{blue}(x_i) from color{blue}(a), i.e., color{green}(x_1 – a, x_2 – a, x_3 – a,. . . , x_n– a)

color{red}("Step 3 : ") Find the absolute values of the deviations, i.e., drop the minus sign (–) if it is there: color{blue}(|x_1-a|,|x_2-a|,|x_3-a|,....,|x_n-a|)

color{red}("Step 4 : ") Find the mean of the absolute values of the deviations. This mean is the mean deviation about color{blue}(a), i.e.,

color{blue}(M.D.(a) = (sum_(i=1)^(n) |x_i-a|)/n)

Thus color{green}(M.D.(barx) = 1/n sum_(i=1)^(n) |x_i-barx|,) where color{orange}(barx) = Mean

and color{navy}(M.D.(M) = 1/n sum_(i=1)^(n) |x_i - M|,) where color{orange}(M) = Median
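The four steps translate almost line for line into code. The following is a minimal Python sketch, with `mean_deviation` as a hypothetical helper name, applied to the two batsmen's scores from the introduction.

```python
def mean_deviation(values, center):
    """Mean of the absolute deviations of `values` from `center`."""
    # Steps 2 and 3: deviation of each observation, then its absolute value.
    absolute_deviations = [abs(x - center) for x in values]
    # Step 4: mean of the absolute deviations.
    return sum(absolute_deviations) / len(values)


batsman_a = [30, 91, 0, 64, 42, 80, 30, 5, 117, 71]
batsman_b = [53, 46, 48, 50, 53, 53, 58, 60, 57, 52]

# Step 1: the central value, here the mean of each series.
mean_a = sum(batsman_a) / len(batsman_a)
mean_b = sum(batsman_b) / len(batsman_b)

md_a = mean_deviation(batsman_a, mean_a)
md_b = mean_deviation(batsman_b, mean_b)
print(md_a, md_b)
```

Both means are 53, yet the mean deviations come out near 31.6 for A and 3.2 for B, which agrees with the earlier observation that A's scores are far more scattered.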

Find the mean deviation about the mean for the following data:
6, 7, 10, 12, 13, 4, 8, 12

Solution:

We proceed step-wise and get the following:

Step 1 Mean of the given data is

bar x = ( 6 + 7 + 10 + 12 + 13+ 4 + 8 +12)/8 =72/8 = 9

Step 2 The deviations of the respective observations from the mean bar x, i.e., x_i– bar (x) are

6 – 9, 7 – 9, 10 – 9, 12 – 9, 13 – 9, 4 – 9, 8 – 9, 12 – 9,

or –3, –2, 1, 3, 4, –5, –1, 3

Step 3 The absolute values of the deviations, i.e.,  | x_i − bar (x) | are
3, 2, 1, 3, 4, 5, 1, 3

Step 4 The required mean deviation about the mean is

M.D.  ( bar x ) = (sum_(i=1)^8 | x_i - bar x | )/8

 = (3+2+1+3+4+5+1+3)/8 = 22/8 = 2.75

Find the mean deviation about the mean for the following data :
12, 3, 18, 17, 4, 9, 17, 19, 20, 15, 8, 17, 2, 3, 16, 11, 3, 1, 0, 5

Solution:

We have to first find the mean (bar x ) of the given data :

bar x = 1/20 sum_(i=1)^20 x_i = 200/20 = 10

The respective absolute values of the deviations from mean, i.e.,  | x_i - bar x| are

2, 7, 8, 7, 6, 1, 7, 9, 10, 5, 2, 7, 8, 7, 6, 1, 7, 9, 10, 5

Therefore  sum_(i=1)^20 | x_i - bar x | =124

and M.D. (bar x ) = 124/20 = 6.2

Find the mean deviation about the median for the following data:
3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21.

Solution:

Here the number of observations is 11, which is odd. Arranging the data in ascending order, we have 3, 3, 4, 5, 7, 9, 10, 12, 18, 19, 21

Now median  = ((11 +1)/2)^(th) or 6th observation = 9

The absolute values of the respective deviations from the median, i.e., | x_i − M | are

6, 6, 5, 4, 2, 0, 1, 3, 9, 10, 12

Therefore sum_( i=1)^11 | x_i -M| =58

and M.D.(M) = 1/11 sum_(i=1)^11 | x_i - M | = 1/11 xx 58 = 5.27
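The solution above can be re-traced in a few lines of Python. This sketch relies on `statistics.median` from the standard library, and the variable names are illustrative.

```python
from statistics import median

data = [3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21]

ordered = sorted(data)                      # ascending order
m = median(data)                            # 6th of the 11 sorted observations
abs_deviations = [abs(x - m) for x in ordered]
total = sum(abs_deviations)
md_median = total / len(data)

print(ordered)
print(m, abs_deviations, total, round(md_median, 2))
```

The printed values are the median 9, the total 58 and, rounded to two decimals, the mean deviation 5.27.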

### Mean deviation for grouped data

We know that data can be grouped in two ways :

color(red)((a)) color(blue)(" Discrete frequency distribution,")

color(red)((b)) color(blue)(" Continuous frequency distribution.")

Let us discuss the method of finding mean deviation for both types of the data.

color{red}("(a) Discrete frequency distribution")

Let the given data consist of n distinct values color{orange}(x_1, x_2, ..., x_n) occurring with frequencies color{orange}(f_1, f_2 , ..., f_n) respectively.

This data can be represented in the tabular form as given below, and is color(blue)("called discrete frequency distribution :")

color{orange}(x : " " x_1" "x_2" "x_3 ........ x_n)

color{orange}(f : " " f_1" "f_2" " f_3 ........ f_n)

color{red}("(i) Mean deviation about mean")

First of all we find the mean color(blue)(barx) of the given data by using the formula

color{orange}(barx = (sum_(i=1)^(n) x_i f_i)/(sum_(i=1)^(n)f_i) = 1/N sum_(i=1)^(n) x_i f_i,)

where color{navy}(sum_(i=1)^(n) x_i f_i) denotes the sum of the products of the observations color{orange}(x_i) with their respective frequencies color{orange}(f_i)

and color{navy}(N = sum_(i=1)^(n) f_i) is the sum of the frequencies.

Then, we find the deviations of observations x_i from the mean barx and take their absolute values, i.e., color{orange}(|x_i - bar x|) for all i =1, 2,..., n.

After this, find the mean of the absolute values of the deviations, which is the required mean deviation about the mean. Thus

color{blue}(M.D .(barx) = (sum_(i=1)^(n) f_i|x_i-barx|)/(sum_(i=1)^(n) f_i) = 1/N * sum_(i=1)^(n) f_i|x_i - barx|)

color{red}("(ii) Mean deviation about median")

To find mean deviation about median, we find the median of the given discrete frequency distribution.

For this, the observations are arranged in ascending order. After this, the cumulative frequencies are obtained. Then, we identify the observation whose cumulative frequency is equal to or just greater than color{orange}(N/2), where color{orange}(N) is the sum of the frequencies. This value of the observation lies in the middle of the data; therefore, it is the required median. After finding the median, we obtain the mean of the absolute values of the deviations from the median. Thus,

color{blue}(M.D .(M) = 1/N sum_(i=1)^(n) f_i |x_i - M|)
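The cumulative-frequency rule and the formula for M.D.(M) can be sketched in Python as follows. The data are hypothetical, chosen only for illustration, and `median_discrete` and `mean_deviation_discrete` are assumed helper names. The first function follows the rule exactly as stated, so when N is even and a cumulative frequency equals N/2 exactly, it returns the lower of the two middle observations.

```python
from itertools import accumulate


def median_discrete(xs, fs):
    """First observation whose cumulative frequency reaches N/2.

    Assumes `xs` is already in ascending order.
    """
    n_total = sum(fs)
    for x, cumulative in zip(xs, accumulate(fs)):
        if cumulative >= n_total / 2:
            return x


def mean_deviation_discrete(xs, fs, center):
    """(1/N) * sum of f_i * |x_i - center|."""
    n_total = sum(fs)
    return sum(f * abs(x - center) for x, f in zip(xs, fs)) / n_total


# Hypothetical distribution (not taken from the lesson).
xs = [2, 4, 6, 8]
fs = [3, 5, 8, 4]          # cumulative frequencies: 3, 8, 16, 20; N/2 = 10

m = median_discrete(xs, fs)
md_median = mean_deviation_discrete(xs, fs, m)
print(m, md_median)
```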

color{red}((b) "Continuous frequency distribution")

A continuous frequency distribution is a series in which the data are classified into different class-intervals without gaps along with their respective frequencies.

For example, marks obtained by 100 students are presented in a continuous frequency distribution as follows :

color{red}((i) "Mean deviation about mean")

While calculating the mean of a continuous frequency distribution, we had made the assumption that the frequency in each class is centred at its mid-point.

Here also, we write the mid-point of each given class and proceed further as for a discrete frequency distribution to find the mean deviation.
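A minimal Python sketch of this mid-point approach follows. The class intervals and frequencies are hypothetical values chosen for illustration, not data from the lesson.

```python
# Hypothetical continuous frequency distribution (illustration only).
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [2, 3, 4, 1]

# Each class is represented by its mid-point.
midpoints = [(low + high) / 2 for low, high in classes]

n_total = sum(freqs)
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n_total

# From here on, proceed exactly as for a discrete frequency distribution.
md_mean = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / n_total

print(midpoints, mean, md_mean)
```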

color{red}((ii) "Mean deviation about median")

The process of finding the mean deviation about median for a continuous frequency distribution is similar to that for the mean deviation about the mean.

The only difference lies in the replacement of the mean by median while taking deviations.

Let us recall the process of finding median for a continuous frequency distribution. The data is first arranged in ascending order. Then, the median of continuous frequency distribution is obtained by first identifying the class in which median lies (median class) and then applying the formula

color{blue}("Median" = l + (N/2 - C)/f xx h)

where the median class is the class interval whose cumulative frequency is just greater than or equal to color{orange}(N/2), color{orange}(N) is the sum of the frequencies, and color{orange}(l, f, h) and color{orange}(C) are, respectively, the lower limit, the frequency and the width of the median class, and the cumulative frequency of the class just preceding the median class.

After finding the median, the absolute values of the deviations of mid-point color{orange}(x_i) of each class from the median i.e., color{orange}(|x_i - M|) are obtained.

Then color{blue}(M.D. (M) = 1/N sum_(i=1)^(n) f_i |x_i-M|)


Find mean deviation about the mean for the following data :
x_i 2 5 6 8 10 12
f_i 2 8 10 7 8 5

Solution:

Let us make Table 15.1 from the given data and append the other columns after calculation.

N= sum_(i=1)^6 f_i = 40 , sum_(i=1)^6 f_i x_i=300 , sum_(i=1)^6 f_i | x_i - bar x | = 92

Therefore bar x = 1/N sum_(i=1)^6 f_i x_i = 1/40 xx 300 = 7.5

and M.D. ( bar x ) =1/N sum_(i=1)^6 f_i | x_i - bar x | =1/40 xx 92 =2.3
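The columns used in this solution can be regenerated from the given x_i and f_i. The Python sketch below prints one row per observation in the order x_i, f_i, f_i x_i, | x_i - bar x |, f_i | x_i - bar x |. This column layout is an assumption about how the working table is arranged.

```python
xs = [2, 5, 6, 8, 10, 12]
fs = [2, 8, 10, 7, 8, 5]

n_total = sum(fs)                                # N = 40
sum_fx = sum(f * x for x, f in zip(xs, fs))      # 300
mean = sum_fx / n_total                          # 7.5

# One row per observation: x, f, f*x, |x - mean|, f*|x - mean|
rows = [(x, f, f * x, abs(x - mean), f * abs(x - mean)) for x, f in zip(xs, fs)]
sum_f_abs_dev = sum(row[4] for row in rows)      # 92.0
md_mean = sum_f_abs_dev / n_total                # 2.3

for row in rows:
    print(*row)
print(n_total, sum_fx, mean, sum_f_abs_dev, md_mean)
```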

Find the mean deviation about the median for the following data:

Solution:

Adding a row corresponding to cumulative frequencies to the given data, we get Table 15.2.

Now, N=30 which is even.

Median is the mean of the 15th and 16th observations. Both of these observations lie in the cumulative frequency 18, for which the corresponding observation is 13.

Therefore, Median M = ("15th observation + 16th observation")/2 = (13+13)/2 =13

Now, the absolute values of the deviations from the median, i.e., | x_i − M |, are shown in Table 15.3.

sum_(i=1)^8 f_i =30 and sum_(i=1)^8 f_i | x_i -M| = 149

Therefore M.D.(M) =1/N sum_(i=1)^8 f_i | x_i -M|

 =1/30 xx 149 =4.97

Find the mean deviation about the mean for the following data.

Solution:

We make the following Table 15.4 from the given data :

Here N= sum_(i=1)^7 f_i =40, sum_(i=1)^7 f_i x_i = 1800, sum_(i=1)^7 f_i | x_i - bar x| =400

Therefore bar x = 1/N sum_(i=1)^7 f_i x_i = 1800/40 =45

and M.D.(bar x ) = 1/N sum_(i=1)^7 f_i | x_i - bar x | = 1/40 xx 400 = 10

Calculate the mean deviation about median for the following data :

Solution:

Form the following Table 15.6 from the given data :

The class interval containing the (N/2)^(th) or 25th item is 20–30. Therefore, 20–30 is the median class. We know that

Median = l + (N/2 -C)/f xx h

Here l = 20, C = 13, f = 15, h = 10 and N = 50

Therefore, Median=20 + (25-13)/15 xx 10 = 20 +8 =28

Thus, Mean deviation about median is given by

M.D.(M) = 1/N sum_(i=1)^6 f_i | x_i -M| =1/50 xx 508 = 10.16
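The median step and the final division in this example can be checked with a small Python sketch. The function name `grouped_median` and its parameter names are illustrative. The total 508 is taken as quoted in the solution, since the underlying table is not reproduced here.

```python
def grouped_median(lower, cum_before, freq, width, n_total):
    """Median = l + ((N/2 - C) / f) * h for a continuous frequency distribution."""
    return lower + (n_total / 2 - cum_before) / freq * width


# Values quoted in the solution: l = 20, C = 13, f = 15, h = 10, N = 50.
median = grouped_median(lower=20, cum_before=13, freq=15, width=10, n_total=50)

sum_f_abs_dev = 508                  # sum of f_i * |x_i - M| quoted in the solution
md_median = sum_f_abs_dev / 50

print(median, md_median)
```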