# Linear Model

Regression involves the study of equations. We begin with some simple equations, known as linear models. The simplest mathematical model is the equation of a straight line.

Example:

Suppose a shopkeeper is selling pencils, and he sells one pencil for 2 cents. The table below gives the number of pencils sold and the sale price of the pencils.

| Number of pencils sold | $0$ | $1$ | $2$ | $3$ | $4$ | $5$ |
|---|---|---|---|---|---|---|
| Sale price (cents) | $0$ | $2$ | $4$ | $6$ | $8$ | $10$ |

Let us examine the two variables given in the table. For the sake of convenience, we can give some names to the variables given in the table. Let $X$ denote the number of pencils sold and $S$ ($S$ for sale) denote the amount realized by selling $X$ pencils. Thus,

| $X$ | $0$ | $1$ | $2$ | $3$ | $4$ | $5$ |
|---|---|---|---|---|---|---|
| $S$ | $0$ | $2$ | $4$ | $6$ | $8$ | $10$ |

The information above can be presented in other forms as well. For example, we can write an algebraic equation describing the relation between $X$ and $S$: $S = 2X$.

This is called a mathematical equation or mathematical model, in which $S$ depends upon $X$. Here $X$ is called the independent variable and $S$ is called the dependent variable. For example, when $X = 2$ the equation gives $S = 4$ cents, neither less than $4$ nor more than $4$.

The above model is called a deterministic mathematical model because we can determine the value of $S$ without any error by putting the value of $X$ in the equation. The sale amount $S$ is said to be a function of $X$. This statement in symbolic form is written as: $S = f\left( X \right)$.

It is read as “$S$ is a function of $X$”. It means that $S$ depends upon $X$, and only on $X$ and no other element. The data in the table can be presented in the form of a graph as shown in the figure below.
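The relation $S = f(X)$ can also be sketched in code. The following is a minimal illustration, assuming Python; the function name `sale_amount` and its parameter names are hypothetical and chosen only for this example.

```python
def sale_amount(pencils_sold, price_per_pencil=2):
    """Return the sale amount S in cents for X pencils, using S = 2X."""
    return price_per_pencil * pencils_sold

# Rebuild the table for X = 0, 1, ..., 5.
table = [(x, sale_amount(x)) for x in range(6)]
for x, s in table:
    print(f"X = {x}, S = {s}")
```

Each value of $X$ gives exactly one value of $S$, which is what makes the model deterministic.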

The main features of the graph in the figure are:

1. The graph lies in the first quadrant because none of the values of $X$ and $S$ is negative.
2. It is an exact straight line. However, not all graphs are in the form of a straight line; there could also be a curve.
3. All the points (pairs of $X$ and $S$) lie on the straight line.
4. The line passes through the origin.
5. Take any point $P$ on the line and draw a perpendicular line $PQ$ which joins $P$ with the X-axis. Let us find the ratio $\frac{{PQ}}{{OQ}}$. Here $PQ = 6$ units and $OQ = 3$ units. Thus $\frac{{PQ}}{{OQ}} = \frac{6}{3} = 2$. This is called the slope of the line and in general it is denoted by “$b$”. The slope of the line is the same at all points on the line. The slope “$b$” is equal to the change in $S$ for a unit change in $X$. The relation $S = 2X$ is also called the linear equation between $X$ and $S$.
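The claim that the slope is the same everywhere on the line can be checked numerically. The sketch below, assuming Python, computes the ratio of rise to run for every pair of points in the pencil table; the helper name `slope` is hypothetical.

```python
from itertools import combinations

# (X, S) pairs from the pencil table.
points = [(0, 0), (1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]

def slope(p, q):
    """Change in S divided by change in X between two points with different X."""
    (x1, s1), (x2, s2) = p, q
    return (s2 - s1) / (x2 - x1)

# Collect the distinct slopes over all pairs of points.
slopes = {slope(p, q) for p, q in combinations(points, 2)}
print(slopes)  # {2.0}
```

Only one distinct value appears, so every pair of points on the line gives the same slope $b = 2$.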

Example:

Suppose a carpenter wants to make some wooden toys for small children. He has purchased some wood and some other materials for $20$ dollars. The cost of making each toy is $5$ dollars. The table below gives the number of toys made and the cost of the toys.

| Number of toys | $0$ | $1$ | $2$ | $3$ | $4$ | $5$ |
|---|---|---|---|---|---|---|
| Cost of toys (dollars) | $20$ | $25$ | $30$ | $35$ | $40$ | $45$ |

Let $X$ denote the number of toys and $Y$ denote the cost of the toys. What is the algebraic relationship between $X$ and $Y$? When $X = 0$, $Y = 20$. This is called a fixed or starting cost and it may be denoted by “$a$”. Each additional toy adds $5$ dollars to the cost. Thus $Y$ and $X$ are connected through the following equation: $Y = 20 + 5X$.

This is called the equation of a straight line. It is also a mathematical model of a deterministic nature. The figure below is a graph of the data in the table. We note some important features of the graph.

1. The line $AB$ does not pass through the origin; it passes through the point $A$ on the Y-axis. The distance between $A$ and the origin $O$ is called the intercept and is usually denoted by “$a$”.
2. Take any point $P$ on the line and complete a triangle $PQA$ as shown in the figure. Let us find the ratio between the perpendicular $PQ$ and the base $AQ$ of this triangle. The ratio is $\frac{{PQ}}{{AQ}} = \frac{{15}}{3} = 5$.

This ratio is denoted by “$b$” in the equation of a straight line. Thus the equation of a straight line $Y = 20 + 5X$ has the intercept $a = 20$ and slope $b = 5$. In general, when the values of the intercept and slope are not known, we write the equation of a straight line as $Y = a + bX$. It is also called a linear equation between $X$ and $Y$, and the relationship between $X$ and $Y$ is called linear. The equation $Y = a + bX$ may also be called an exact linear model between $X$ and $Y$ or simply a linear model between $X$ and $Y$. The value of $Y$ can be determined completely when $X$ is given. The relationship $Y = a + bX$ is therefore called the deterministic linear model between $X$ and $Y$. In statistics, when we use the term linear model, we do not mean a mathematical model as described above.
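As a rough illustration of the deterministic linear model $Y = a + bX$, the sketch below (assuming Python) evaluates the toy-cost equation and then recovers $a$ and $b$ from two rows of the table. The function names `linear_model` and `fit_through` are hypothetical and used only for this example.

```python
def linear_model(x, a, b):
    """Deterministic linear model Y = a + bX."""
    return a + b * x

def fit_through(p, q):
    """Return (a, b) for the line through two points with different X values."""
    (x1, y1), (x2, y2) = p, q
    b = (y2 - y1) / (x2 - x1)  # slope: change in Y per unit change in X
    a = y1 - b * x1            # intercept: value of Y when X = 0
    return a, b

# Toy example from the text: fixed cost 20, cost per toy 5.
costs = [linear_model(x, a=20, b=5) for x in range(6)]
print(costs)  # [20, 25, 30, 35, 40, 45]

# Recover a and b from two rows of the table, here (1, 25) and (4, 40).
a, b = fit_through((1, 25), (4, 40))
print(a, b)  # 20.0 5.0
```

Because the model is exact, any two distinct rows of the table give the same intercept and slope.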