# Construction of Frequency Distribution

The following steps are involved in the construction of a frequency distribution.

(1) Find the range of the data: The range is the difference between the largest and the smallest values.

(2) Decide the approximate number of classes into which the data are to be grouped. There are no hard and fast rules for the number of classes; in most cases $5$ to $20$ classes are used. H.A. Sturges provides a formula for determining the approximate number of classes:

$K = 1 + 3.322\log N$

where $K$ = number of classes
and $\log N$ = logarithm (base $10$) of the total number of observations.

Example: If the total number of observations is $50$, the number of classes would be

$K = 1 + 3.322\log N$
$K = 1 + 3.322\log 50$
$K = 1 + 3.322(1.69897)$
$K = 1 + 5.644$
$K = 6.644$
that is, approximately $7$ classes.
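The calculation above can be reproduced with a minimal Python sketch. The function name `sturges_classes` is hypothetical, and rounding the result up to the next whole number is an assumption; the text only says "approximately".

```python
import math

def sturges_classes(n):
    """Return (K, K rounded up) using Sturges' formula K = 1 + 3.322 * log10(N)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    k = 1 + 3.322 * math.log10(n)
    # Assumption: round up so that no fraction of a class is lost.
    return k, math.ceil(k)

k_exact, k_rounded = sturges_classes(50)
print(round(k_exact, 3), k_rounded)  # 6.644 7
```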

(3) Determine the approximate class interval size: The size of the class interval, denoted by $h$, is obtained by dividing the range of the data by the number of classes:

$h = \dfrac{\text{Range}}{\text{Number of classes}}$

In the case of fractional results, the next higher whole number is taken as the size of the class interval.
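As an illustration of step (3), the sketch below computes the class interval size for a small data set. The list `marks` and the function name `class_interval_size` are made up for this example and do not come from the text.

```python
import math

def class_interval_size(data, num_classes):
    """Divide the range by the number of classes and round up to a whole number."""
    data_range = max(data) - min(data)
    # A fractional result is raised to the next higher whole number.
    return math.ceil(data_range / num_classes)

marks = [12, 47, 33, 25, 8, 41, 19, 30]  # hypothetical raw data
h = class_interval_size(marks, 7)
print(h)  # range 39 divided by 7 is about 5.57, so h = 6
```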

(4) Decide the starting point: The lower class limit or class boundary of the first class should be at or below the smallest value in the raw data, so that this value is covered. It is usually taken as a multiple of the class interval size.

Example: $0$, $5$, $10$, $15$, $20$, etc. are commonly used.

(5) Determine the remaining class limits (boundaries): Once the lowest class boundary has been decided, the upper class boundary is computed by adding the class interval size to it. The remaining lower and upper class limits are determined by adding the class interval size repeatedly until the largest value of the data falls within a class.
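Steps (4) and (5) can be sketched as follows. The sketch assumes that the starting point is the largest multiple of $h$ not exceeding the smallest value, and that each class includes its lower boundary and excludes its upper boundary; both conventions are assumptions, and the data and function name are hypothetical.

```python
def class_boundaries(data, h):
    """Build (lower, upper) class boundaries of width h covering all the data."""
    # Assumption: start at the largest multiple of h that is <= the smallest value.
    lower = (min(data) // h) * h
    largest = max(data)
    boundaries = []
    while True:
        upper = lower + h
        boundaries.append((lower, upper))
        # Stop once the largest value falls inside the current class.
        if largest < upper:
            break
        lower = upper
    return boundaries

marks = [12, 47, 33, 25, 8, 41, 19, 30]  # hypothetical raw data
classes = class_boundaries(marks, 6)
print(classes)
```

For this data the first class starts at $6$ and the last class ends at $48$, giving seven classes of width $6$.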

(6) Distribute the data into the respective classes: Each observation is assigned to its class using the tally bar (tally mark) method, which is well suited to tabulating observations. The tally bars in each class are counted to get the frequency of that class. The frequencies of all the classes together give the grouped data, or frequency distribution, of the data. The total of the frequency column must equal the number of observations.
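The tallying in step (6) might look like the sketch below. The data and class boundaries are hypothetical, and the convention that a value equal to a boundary belongs to the class where it is the lower boundary is an assumption.

```python
def frequency_distribution(data, boundaries):
    """Count how many observations fall in each (lower, upper) class."""
    counts = [0] * len(boundaries)
    for x in data:
        for i, (lower, upper) in enumerate(boundaries):
            # Assumption: lower boundary included, upper boundary excluded.
            if lower <= x < upper:
                counts[i] += 1
                break
    return counts

marks = [12, 47, 33, 25, 8, 41, 19, 30]  # hypothetical raw data
boundaries = [(6, 12), (12, 18), (18, 24), (24, 30), (30, 36), (36, 42), (42, 48)]
freq = frequency_distribution(marks, boundaries)

# Print one tally bar per observation next to each class.
for (lower, upper), f in zip(boundaries, freq):
    print(f"{lower}-{upper}: {'|' * f} ({f})")

# The total of the frequency column must equal the number of observations.
print(sum(freq) == len(marks))
```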