 Class 10 Introduction And Mean of Grouped Data

### Topics Covered

♦ Introduction
♦ Mean of Grouped Data

### Introduction

You have studied the classification of given data into ungrouped as well as grouped frequency distributions. You have also learnt to represent the data pictorially in the form of various graphs such as bar graphs, histograms (including those of varying widths) and frequency polygons.

In fact, you went a step further by studying certain numerical representatives of the ungrouped data, also called measures of central tendency, namely, mean, median and mode.

Here, we shall extend the study of these three measures, i.e., mean, median and mode from ungrouped data to that of grouped data.

We shall also discuss the concept of cumulative frequency, the cumulative frequency distribution and how to draw cumulative frequency curves, called ogives.

### Mean of Grouped Data

The mean (or average) of observations, as we know, is the sum of the values of all the observations divided by the total number of observations.

Recall that if x_1, x_2, . . ., x_n are observations with respective frequencies f_1, f_2, . . ., f_n, then this means observation x_1 occurs f_1 times, x_2 occurs f_2 times, and so on.

Now, the sum of the values of all the observations = f_1 x_1 + f_2 x_2 + . . . + f_n x_n, and
the number of observations = f_1 + f_2 + . . . + f_n.

So, the mean bar x of the data is given by

bar x = (f_1 x_1 + f_2 x_2 + . . . + f_n x_n)/(f_1 + f_2 + . . . + f_n)

Recall that we can write this in short form by using the Greek letter Σ (capital sigma) which means summation. That is,

bar x = (sum_(i=1)^n f_i x_i)/(sum_(i=1)^n f_i)

which, more briefly, is written as bar x = (sum f_i x_i)/(sum f_i), if it is understood that i varies from 1 to n.

Let us apply this formula to find the mean in the following example.

Example 1 : The marks obtained by 30 students of Class X of a certain school in a
Mathematics paper consisting of 100 marks are presented in the table below. Find the
mean of the marks obtained by the students.

Solution :

Recall that to find the mean marks, we require the product of each x_i with
the corresponding frequency f_i. So, let us put them in a column as shown in Table 14.1.

Now, bar x = (sum f_i x_i)/(sum f_i) = 1779/30 = 59.3

Therefore, the mean marks obtained is 59.3.
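The calculation above can be sketched in a few lines of Python. The marks and frequencies listed here are an assumption, since the table itself is not reproduced in this text; they are chosen to agree with the totals quoted above (Σ f_i = 30 and Σ f_i x_i = 1779). The name `weighted_mean` is only an illustrative helper.

```python
# Marks x_i and number of students f_i.
# ASSUMPTION: these values are not given in the text; they are chosen so that
# sum(f_i) = 30 and sum(f_i * x_i) = 1779, the totals quoted in the solution.
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]


def weighted_mean(values, freqs):
    """Return sum(f_i * x_i) / sum(f_i)."""
    total = sum(f * x for x, f in zip(values, freqs))
    count = sum(freqs)
    return total / count


sum_fx = sum(f * x for x, f in zip(marks, students))  # sum of f_i * x_i
sum_f = sum(students)                                 # sum of f_i
mean_marks = weighted_mean(marks, students)

print(sum_fx, sum_f, round(mean_marks, 1))  # 1779 30 59.3
```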

In most of our real life situations, data is usually so large that to make a meaningful
study it needs to be condensed as grouped data. So, we need to convert given ungrouped
data into grouped data and devise some method to find its mean.

Let us convert the ungrouped data of Example 1 into grouped data by forming
class-intervals of width, say 15. Remember that, while allocating frequencies to each
class-interval, students falling in any upper class-limit would be considered in the next
class, e.g., 4 students who have obtained 40 marks would be considered in the class-interval
40-55 and not in 25-40. With this convention in our mind, let us form a grouped
frequency distribution table (see Table 14.2).

Now, for each class-interval, we require a point which would serve as the
representative of the whole class. It is assumed that the frequency of each class-interval
is centred around its mid-point. So the mid-point (or class mark) of each
class can be chosen to represent the observations falling in the class. Recall that we
find the mid-point of a class (or its class mark) by finding the average of its upper and
lower limits. That is,

Class mark = (Upper class limit + Lower class limit)/2

With reference to Table 14.2, for the class 10-25, the class mark is (10+25)/2, i.e., 17.5. Similarly, we can find the class marks of the remaining class intervals. We put
them in Table 14.3. These class marks serve as our x_i’s. Now, in general, for the i^(th)
class interval, we have the frequency f_i corresponding to the class mark x_i. We can
now proceed to compute the mean in the same manner as in Example 1.

The sum of the values in the last column gives us Σ f_i x_i. So, the mean bar x of the
given data is given by

bar x = (sum f_i x_i )/( sum f_i) = (1860.0)/30 = 62

This new method of finding the mean is known as the Direct Method.
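As a rough sketch, the grouping step and the Direct Method can be carried out together in Python. The ungrouped marks and frequencies below are an assumption (the tables are not reproduced in this text); they are chosen to agree with the totals quoted above, namely 30 students and Σ f_i x_i = 1860 after grouping. The variable names are illustrative only.

```python
# ASSUMPTION: ungrouped marks x_i and frequencies f_i, chosen to agree with
# the totals quoted in the text (they are not listed in the text itself).
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]

width = 15
lower_limits = [10, 25, 40, 55, 70, 85]  # classes 10-25, 25-40, ..., 85-100

# Allocate frequencies: a mark equal to an upper class limit goes to the
# next class, so each class holds marks with lower <= x < lower + width.
class_freqs = [0] * len(lower_limits)
for x, f in zip(marks, students):
    for k, low in enumerate(lower_limits):
        if low <= x < low + width:
            class_freqs[k] += f
            break

# Class mark = (upper class limit + lower class limit) / 2
class_marks = [(low + (low + width)) / 2 for low in lower_limits]

# Direct Method: mean = sum(f_i * x_i) / sum(f_i), with class marks as x_i
sum_fx = sum(f * x for x, f in zip(class_marks, class_freqs))
grouped_mean = sum_fx / sum(class_freqs)

print(class_freqs)           # [2, 3, 7, 6, 6, 6]
print(class_marks)           # [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
print(sum_fx, grouped_mean)  # 1860.0 62.0
```

Note that the 4 students with 40 marks land in the class 40-55, as the convention in the text requires.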

We observe that Tables 14.1 and 14.3 are using the same data and employing the
same formula for the calculation of the mean but the results obtained are different.
Can you think why this is so, and which one is more accurate? The difference in the
two values is because of the mid-point assumption in Table 14.3, 59.3 being the exact
mean, while 62 an approximate mean.

Sometimes when the numerical values of x_i and f_i are large, finding the product
of x_i and f_i becomes tedious and time consuming. So, for such situations, let us think of
a method of reducing these calculations.

We can do nothing with the f_i’s, but we can change each x_i to a smaller number
so that our calculations become easy. How do we do this? What about subtracting a
fixed number from each of these x_i’s? Let us try this method.

The first step is to choose one among the x_i’s as the assumed mean, and denote
it by ‘a’. Also, to further reduce our calculation work, we may take ‘a’ to be that x_i
which lies in the centre of x_1, x_2, . . ., x_n. So, we can choose a = 47.5 or a = 62.5. Let
us choose a = 47.5.

The next step is to find the difference d_i between a and each of the x_i’s, that is,
the deviation of ‘a’ from each of the x_i’s.

i.e., d_i = x_i – a = x_i – 47.5

The third step is to find the product of d_i with the corresponding f_i, and take the sum
of all the f_i d_i’s. The calculations are shown in Table 14.4.

So, from Table 14.4, the mean of the deviations, bar d = (sum f_i d_i )/( sum f_i ).

Now, let us find the relation between bar d and bar x .

Since in obtaining d_i, we subtracted ‘a’ from each x_i, so, in order to get the mean
bar x , we need to add ‘a’ to bar d . This can be explained mathematically as:

Mean of deviations, bar d = (sum f_i d_i )/(sum f_i)

So, bar d = (sum f_i (x_i -a) )/(sum f_i )

= (sum f_i x_i)/(sum f_i) - ( sum f_i a)/( sum f_i )

= bar x - a (sum f_i )/(sum f_i)

= bar x - a

So, bar x = a+ bar d

i.e., bar x = a + (sum f_i d_i )/( sum f_i )

Substituting the values of a , sum f_i d_i  and sum f_i from Table 14.4, we get

bar x = 47.5 + 435/30 = 47.5 + 14.5 = 62.

Therefore, the mean of the marks obtained by the students is 62.

The method discussed above is called the Assumed Mean Method.

### Activity 1 :

From the Table 14.3 find the mean by taking each of x_i (i.e., 17.5, 32.5, and so on) as ‘a’. What do you observe? You will find that the mean determined in each case is the same, i.e., 62.

So, we can say that the value of the mean obtained does not depend on the choice of ‘a’.
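Activity 1 can also be checked with a short Python sketch of the Assumed Mean Method. The class frequencies below are an assumption (Table 14.3 is not reproduced in this text); they are chosen to agree with the totals quoted above, Σ f_i = 30 and Σ f_i d_i = 435 for a = 47.5. The function name `assumed_mean_method` is hypothetical.

```python
# Class marks x_i for classes of width 15 starting at 10, as in the text.
class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
# ASSUMPTION: class frequencies f_i, chosen to match the quoted totals.
freqs = [2, 3, 7, 6, 6, 6]


def assumed_mean_method(xs, fs, a):
    """Return a + sum(f_i * d_i) / sum(f_i), where d_i = x_i - a."""
    sum_fd = sum(f * (x - a) for x, f in zip(xs, fs))
    return a + sum_fd / sum(fs)


# With a = 47.5, sum(f_i * d_i) should equal 435
sum_fd_at_47_5 = sum(f * (x - 47.5) for x, f in zip(class_marks, freqs))
print(sum_fd_at_47_5)  # 435.0

# Take each class mark in turn as the assumed mean 'a'
means = [assumed_mean_method(class_marks, freqs, a) for a in class_marks]
print(means)  # [62.0, 62.0, 62.0, 62.0, 62.0, 62.0]
```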

Observe that in Table 14.4, the values in Column 4 are all multiples of 15. So, if we divide the values in the entire Column 4 by 15, we would get smaller numbers to multiply with f_i. (Here, 15 is the class size of each class interval.)

So, let u_i = (x_i -a)/h , where a is the assumed mean and h is the class size.

Now, we calculate u_i in this way and continue as before (i.e., find f_i u_i and
then Σ f_i u_i). Taking h = 15, let us form Table 14.5.

Let bar u = (sum f_i u_i )/(sum f_i )

Here, again let us find the relation between bar u and bar x .

We have,  u_i = (x_i -a )/h

Therefore, bar u = (sum f_i (x_i -a)/h )/( sum f_i) = 1/h [ (sum f_i x_i - a sum f_i )/(sum f_i ) ]

= 1/h [ (sum f_i x_i )/(sum f_i) - a (sum f_i )/(sum f_i) ]

= 1/h [bar x -a ]

So, h bar u = bar x - a

i.e., bar x = a + h bar u

so, bar x = a+h ( (sum f_i u_i )/(sum f_i ) )

Now, substituting the values of a, h, Σf_i u_i and Σ f_i from Table 14.5, we get

bar x = 47.5 +15 xx (29/30)

=47.5 +14.5 = 62

So, the mean marks obtained by a student is 62.

The method discussed above is called the Step-deviation method.
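The Step-deviation method can be sketched in Python in the same way. Again, the class frequencies are an assumption (Table 14.5 is not reproduced in this text); they are chosen to agree with the quoted totals Σ f_i = 30 and Σ f_i u_i = 29. The function name `step_deviation_mean` is hypothetical.

```python
class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
# ASSUMPTION: class frequencies f_i, chosen to match the quoted totals.
freqs = [2, 3, 7, 6, 6, 6]


def step_deviation_mean(xs, fs, a, h):
    """Return (sum of f_i * u_i, mean), where u_i = (x_i - a) / h."""
    sum_fu = sum(f * (x - a) / h for x, f in zip(xs, fs))
    mean = a + h * sum_fu / sum(fs)
    return sum_fu, mean


# a = 47.5 (assumed mean) and h = 15 (class size), as in the text
sum_fu, mean = step_deviation_mean(class_marks, freqs, a=47.5, h=15)
print(sum_fu, mean)  # 29.0 62.0

# The relation bar x = a + h * bar u does not depend on this particular
# choice: another a and another non-zero h lead to the same mean
# (up to floating-point rounding).
_, other_mean = step_deviation_mean(class_marks, freqs, a=50, h=7)
print(round(other_mean, 6))
```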

We note that :

♦ the step-deviation method will be convenient to apply if all the d_i’s have a
common factor.

♦ The mean obtained by all the three methods is the same.

♦ The assumed mean method and step-deviation method are just simplified
forms of the direct method.

♦ The formula bar x = a + h bar u still holds if a and h are not as given above, but are any non-zero numbers such that u_i = (x_i -a )/h.

Let us apply these methods in another example.

Example 2 : The table below gives the percentage distribution of female teachers in
the primary schools of rural areas of various states and union territories (U.T.) of
India. Find the mean percentage of female teachers by all the three methods discussed
in this section.

Solution :

Let us find the class marks, x_i, of each class, and put them in a column
(see Table 14.6).

Here we take a = 50, h = 10, then d_i = x_i – 50 and u_i = (x_i - 50)/10.

We now find d_i and u_i and put them in Table 14.7.

From the table above, we obtain Σ f_i = 35, Σ f_i x_i = 1390, Σ f_i d_i = – 360, Σ f_i u_i = – 36.

Using the direct method, bar x = (sum f_i x_i )/( sum f_i) = 1390/35 = 39.71

Using the assumed mean method,

bar x = a + (sum f_i d_i )/(sum f_i ) =50 + ( -360)/(35) = 39.71

Using the step-deviation method,

bar x = a + ( (sum f_i u_i )/( sum f_i ) ) xx h = 50 + ( (-36)/(35) ) xx 10 = 39.71

Therefore, the mean percentage of female teachers in the primary schools of
rural areas is 39.71.
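A small Python sketch can confirm that the three methods agree for this example. The class limits and frequencies below are an assumption (Tables 14.6 and 14.7 are not reproduced in this text); they are chosen to agree with the quoted sums Σ f_i = 35, Σ f_i x_i = 1390, Σ f_i d_i = – 360 and Σ f_i u_i = – 36.

```python
# ASSUMPTION: class limits (percentage of female teachers) and frequencies
# (number of states/U.T.), chosen to match the sums quoted in the solution.
class_limits = [(15, 25), (25, 35), (35, 45), (45, 55),
                (55, 65), (65, 75), (75, 85)]
freqs = [6, 11, 7, 4, 4, 2, 1]

a, h = 50, 10  # assumed mean and class size, as in the text

xs = [(low + high) / 2 for low, high in class_limits]  # class marks x_i
ds = [x - a for x in xs]                               # d_i = x_i - a
us = [d / h for d in ds]                               # u_i = (x_i - a) / h

n = sum(freqs)
sum_fx = sum(f * x for x, f in zip(xs, freqs))
sum_fd = sum(f * d for d, f in zip(ds, freqs))
sum_fu = sum(f * u for u, f in zip(us, freqs))

direct = sum_fx / n              # direct method
assumed = a + sum_fd / n         # assumed mean method
step = a + h * sum_fu / n        # step-deviation method

print(n, sum_fx, sum_fd, sum_fu)  # 35 1390.0 -360.0 -36.0
print(round(direct, 2), round(assumed, 2), round(step, 2))  # 39.71 39.71 39.71
```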

Remark : The result obtained by all the three methods is the same. So the choice of
method to be used depends on the numerical values of x_i and f_i. If x_i and f_i are
sufficiently small, then the direct method is an appropriate choice. If x_i and f_i are
numerically large numbers, then we can go for the assumed mean method or
step-deviation method. If the class sizes are unequal, and x_i are large numerically, we
can still apply the step-deviation method by taking h to be a suitable divisor of all the d_i’s.

Example 3 : The distribution below shows the number of wickets taken by bowlers in
one-day cricket matches. Find the mean number of wickets by choosing a suitable
method. What does the mean signify?

Solution :

Here, the class size varies, and the x_i’s are large. Let us still apply the step-deviation
method with a = 200 and h = 20. Then, we obtain the data as in Table 14.8.

So, bar u = (-106)/45. Therefore, bar x = 200 + 20 ( (-106)/45 ) = 200 - 47.11 = 152.89.

This tells us that, on an average, the number of wickets taken by these 45 bowlers
in one-day cricket is 152.89.
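The same calculation with unequal class sizes can be sketched in Python. The class limits and frequencies below are an assumption (Table 14.8 is not reproduced in this text); they are chosen to agree with the quoted values of 45 bowlers and Σ f_i u_i = – 106 for a = 200 and h = 20.

```python
# ASSUMPTION: class limits (number of wickets) and frequencies (number of
# bowlers), chosen to match the totals quoted in the solution.
class_limits = [(20, 60), (60, 100), (100, 150),
                (150, 250), (250, 350), (350, 450)]
freqs = [7, 5, 16, 12, 2, 3]

a, h = 200, 20  # as chosen in the text

# Class marks still come from the class limits, even though widths differ
xs = [(low + high) / 2 for low, high in class_limits]
us = [(x - a) / h for x in xs]  # u_i need not be whole numbers here

n = sum(freqs)
sum_fu = sum(f * u for u, f in zip(us, freqs))
mean_wickets = a + h * sum_fu / n

# Cross-check with the direct method
direct = sum(f * x for x, f in zip(xs, freqs)) / n

print(n, sum_fu)               # 45 -106.0
print(round(mean_wickets, 2))  # 152.89
print(round(direct, 2))        # 152.89
```

Since the class sizes differ, h = 20 is simply a convenient divisor here, not the class size.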

Now, let us see how well you can apply the concepts discussed in this section!

### Activity 2 :

Divide the students of your class into three groups and ask each group to do one of the
following activities.

1. Collect the marks obtained by all the students of your class in Mathematics in the
latest examination conducted by your school. Form a grouped frequency distribution
of the data obtained.

2. Collect the daily maximum temperatures recorded for a period of 30 days in your
city. Present this data as a grouped frequency table.

3. Measure the heights of all the students of your class (in cm) and form a grouped
frequency distribution table of this data.

After all the groups have collected the data and formed grouped frequency
distribution tables, the groups should find the mean in each case by the method which
they find appropriate.
