Point Estimation

In point estimation we attempt to compute a single numerical value from the sample observations that can be taken as an approximation to the parameter. The estimators are also referred to as statistics (plural of statistic), since they are computed from observations, which are themselves random variables. A number of estimation methods, such as the method of least squares, the method of maximum likelihood and the method of moments, are available, each with its own specific properties.

Method of Least Squares

The method of least squares is used mainly in regression analysis to estimate the regression coefficients. To understand the technique, let us consider the following simple example. Note that a formal treatment of the least squares method, which involves including a disturbance term, has been omitted for simplicity.

Suppose that consumption expenditure $Y$ is linearly related to only one variable, family income $X$. This can be written mathematically as $Y = a + bX$.

In economics, this relation is known as a consumption function, where $a$ is a measure of the consumption expenditure at zero level of income and $b$ is a measure of the marginal propensity to consume, i.e., it gives a measure of how much will be consumed from each additional unit of income. The consumption function is in the parametric form, specifying a different relationship for different values of the parameters ($a$ and $b$). The parameters ($a$, $b$) are not known and need to be estimated on the basis of a sample. A random sample of $n$ households is drawn from the population under study. The information about consumption and income is recorded as follows for each of these households.

$\begin{array}{cc} \text{Consumption Expenditure} & \text{Family Income} \\ Y_1 & X_1 \\ Y_2 & X_2 \\ \vdots & \vdots \\ Y_n & X_n \\ \end{array}$

On the basis of these sample observations we wish to estimate the consumption function. Let the estimating equation be $\widehat Y = \widehat a + \widehat bX$, where $\widehat Y$ ($Y$-hat), $\widehat a$ ($a$-hat) and $\widehat b$ ($b$-hat) are the estimates of $Y$, $a$ and $b$ respectively.

Since $\widehat Y$ is only an estimate of $Y$, the two will generally differ; they coincide only by chance. The difference between the observed value $Y$ and the estimated value $\widehat Y$ is denoted by $e$ and is usually termed the “residual”, “deviation” or “error term”. This residual may be positive or negative.

$\begin{gathered} e = Y - \widehat Y \\ e = Y - \widehat a - \widehat bX \\ \end{gathered}$
The smaller the residuals are, the closer the estimating equation $\widehat Y = \widehat a + \widehat bX$ is to the original model $Y = a + bX$. Hence, to have a closer estimating equation for $Y = a + bX$ we should minimize the residuals. The residuals are minimized according to the following principle, which states that:

“Those values of $\widehat a$ and $\widehat b$ should be chosen which minimize the sum of squared residuals”. This principle is known as the “principle of least squares”.

Thus, the sum of the squared residuals may be written as $\sum {e^2} = \sum {\left( {Y - \widehat a - \widehat bX} \right)^2}$
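As a small numerical illustration of this quantity, the following Python sketch evaluates the sum of squared residuals, with $e = Y - \widehat a - \widehat bX$, for two candidate pairs $(\widehat a, \widehat b)$. The data values and the function name `sum_squared_residuals` are hypothetical and chosen only for illustration; they do not come from any actual survey.

```python
def sum_squared_residuals(a_hat, b_hat, xs, ys):
    """Return the sum of e^2, where e = Y - a_hat - b_hat * X for each pair."""
    return sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample of n = 4 households (assumed values, for illustration only).
incomes = [10.0, 20.0, 30.0, 40.0]      # X: family income
consumption = [10.0, 17.0, 24.0, 33.0]  # Y: consumption expenditure

# Two candidate estimating equations, Y-hat = a_hat + b_hat * X.
sse_first = sum_squared_residuals(2.0, 0.75, incomes, consumption)
sse_second = sum_squared_residuals(5.0, 0.5, incomes, consumption)

print("candidate (2.0, 0.75):", sse_first)
print("candidate (5.0, 0.50):", sse_second)
```

The first candidate gives the smaller sum of squared residuals, so by the principle of least squares it is preferred to the second. The principle asks for the pair that makes this sum smallest among all possible candidates.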

In order to minimize the quantity $\sum {e^2}$, we use the technique of differential calculus. We differentiate $\sum {e^2} = \sum {\left( {Y - \widehat a - \widehat bX} \right)^2}$ partially with respect to $\widehat a$ and $\widehat b$ and equate the resulting derivatives to zero.

$\begin{gathered} \frac{{\partial \sum {e^2}}}{{\partial \widehat a}} = - 2\sum \left( {Y - \widehat a - \widehat bX} \right) = 0 \\ \frac{{\partial \sum {e^2}}}{{\partial \widehat b}} = - 2\sum X\left( {Y - \widehat a - \widehat bX} \right) = 0 \\ \end{gathered}$

Simplifying the above equations, we have

$\begin{gathered} \sum Y = n\widehat a + \widehat b\sum X \\ \sum XY = \widehat a\sum X + \widehat b\sum {X^2} \\ \end{gathered}$

These two equations are called the “normal equations”. If we substitute the values of $\sum Y,\,\,\sum X,\,\,\sum {X^2},\,\,\sum XY$ and $n$ from our sample observations, the two estimates $\widehat a$ and $\widehat b$ of the unknown parameters $a$ and $b$ can be determined by solving the two equations simultaneously.
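Solving the two equations simultaneously gives $\widehat b = \dfrac{n\sum XY - \sum X\sum Y}{n\sum {X^2} - \left( \sum X \right)^2}$ and $\widehat a = \dfrac{\sum Y - \widehat b\sum X}{n}$, provided the $X$ values are not all equal. The Python sketch below applies these formulas to a small sample. The data and the function name `solve_normal_equations` are hypothetical and serve only to illustrate the computation.

```python
def solve_normal_equations(xs, ys):
    """Solve  sum(Y)  = n*a_hat + b_hat*sum(X)
              sum(XY) = a_hat*sum(X) + b_hat*sum(X^2)
    for (a_hat, b_hat)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # The denominator is zero when every X is the same; then b_hat is undetermined.
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all X values are equal; the slope cannot be estimated")

    b_hat = (n * sum_xy - sum_x * sum_y) / denom
    a_hat = (sum_y - b_hat * sum_x) / n
    return a_hat, b_hat


# Hypothetical sample of n = 4 households (assumed values, for illustration only).
incomes = [10.0, 20.0, 30.0, 40.0]      # X: family income
consumption = [10.0, 17.0, 24.0, 33.0]  # Y: consumption expenditure

a_hat, b_hat = solve_normal_equations(incomes, consumption)
print("a_hat =", a_hat, "b_hat =", b_hat)
```

For this sample, $n = 4$, $\sum X = 100$, $\sum Y = 84$, $\sum {X^2} = 3000$ and $\sum XY = 2480$, so the estimates work out to $\widehat b = 1520/2000 = 0.76$ and $\widehat a = (84 - 76)/4 = 2$, up to floating-point rounding in the printed values.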