Independence of Attributes

Let us consider certain examples before we discuss independence in a formal manner.

 

Example:

Consider the following sample data on the liking of fish for males and females.

| Liking \ Gender  | Males | Females | Total |
|------------------|-------|---------|-------|
| Like Fish        | 80    | 80      | 160   |
| Do Not Like Fish | 20    | 20      | 40    |
| Total            | 100   | 100     | 200   |

Discussion: Out of 100 males, 80 like fish, and out of 100 females, 80 like fish. The proportion who like fish is the same (80%) for both genders, so knowing a person's gender tells us nothing about whether that person likes fish. We say that gender and liking fish are independent. Another way of saying the same thing is that there is no association between gender and liking fish.
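The comparison in this discussion can be sketched in a few lines of Python. This is only an illustration under the assumption that independence is judged by comparing the share of each gender that likes fish; the helper name `share_with_attribute` is hypothetical and not part of the original text.

```python
from fractions import Fraction


def share_with_attribute(with_attr, without_attr):
    """Return the exact share of a group that has the attribute."""
    return Fraction(with_attr, with_attr + without_attr)


# Counts from the fish table: (like fish, do not like fish) for each gender.
males = (80, 20)
females = (80, 20)

male_share = share_with_attribute(*males)
female_share = share_with_attribute(*females)

# Equal shares in both groups indicate independence in this sample.
independent = male_share == female_share
print(male_share, female_share, independent)  # 4/5 4/5 True
```

Exact fractions are used instead of floating-point numbers so that the equality test is not affected by rounding.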

 

Example:
           
Consider the following sample data on smoking for adult males and females:

| Smoking \ Gender | Males | Females | Total |
|------------------|-------|---------|-------|
| Smokers          | 20    | 1       | 21    |
| Non-Smokers      | 80    | 99      | 179   |
| Total            | 100   | 100     | 200   |

Discussion: Out of 100 males, 20 are smokers, and out of 100 females, only 1 is a smoker. In this sample there are 20 times as many male smokers as female smokers. Smoking is therefore far more common among males, and we say that there is a strong positive association between being male and smoking.

There are 99 female non-smokers as compared to 80 male non-smokers, so females are inclined towards not smoking. The association between being female and not smoking is also positive. Since there is only 1 female smoker as compared to 20 male smokers, there is a negative association between being female and smoking, and likewise a negative association between being male and not smoking.

Thus, when a contingency table shows a positive association between one pair of attributes, the same table shows a negative association between some other pairs of attributes. In symbols, let A and B be two attributes, and let \alpha and \beta denote the absence of A and of B respectively. If there is a positive association between A and B, then \alpha and \beta are also positively associated. In this case there is a negative association between A and \beta , and between \alpha and B.

 

The data may be written as

|                  | Males, A         | Females, \alpha       | Total |
|------------------|------------------|-----------------------|-------|
| Non-Smokers, B   | (AB) = 80        | (\alpha B) = 99       | 179   |
| Smokers, \beta   | (A\beta) = 20    | (\alpha \beta) = 1    | 21    |
| Total            | 100              | 100                   | 200   |

In this table both genders have the same total of 100, so the cell frequencies can be compared directly: (AB) = 80 is less than (\alpha B) = 99, and (A\beta) = 20 is greater than (\alpha \beta) = 1. There is therefore a negative association between A and B and between \alpha and \beta , and a positive association between A and \beta and between \alpha and B. In general, if the attributes on one diagonal of the table have a positive association, then the attributes on the other diagonal have a negative association.
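The diagonal pattern can be checked numerically. The sketch below assumes the usual criterion from the theory of attributes, which this section does not state explicitly: a cell is positively associated when its observed frequency exceeds (row total)(column total)/N, and negatively associated when it falls below. The function name `association_signs` is hypothetical.

```python
from fractions import Fraction


def association_signs(table):
    """Classify each cell of a contingency table by comparing the observed
    frequency with the frequency expected under independence."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    signs = []
    for i, row in enumerate(table):
        sign_row = []
        for j, observed in enumerate(row):
            # Expected frequency if the two attributes were independent.
            expected = Fraction(row_totals[i] * col_totals[j], n)
            if observed > expected:
                sign_row.append("positive")
            elif observed < expected:
                sign_row.append("negative")
            else:
                sign_row.append("independent")
        signs.append(sign_row)
    return signs


# Rows: B (non-smokers), beta (smokers); columns: A (males), alpha (females).
smoking = [[80, 99],
           [20, 1]]
print(association_signs(smoking))
# [['negative', 'positive'], ['positive', 'negative']]
```

Here the expected frequencies are 89.5 for the non-smoker cells and 10.5 for the smoker cells, so the signs alternate across the two diagonals as described above. Applying the same function to the fish table, `[[80, 80], [20, 20]]`, labels every cell as independent.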