Dimensionality Reduction Problem

Dimension?

To understand the dimension of a pattern classification problem, let's consider the following example.
Example 1:
Consider a classroom of 100 students. We want to measure the BMI of each student and classify the students into four classes, say Normal, Overweight, Obese, and Morbidly Obese. To do this, we measure the weight and height of each student and compute their BMI, which is simply weight (measured in kilograms) divided by the square of height (measured in meters). This is a toy example of a classification problem with four classes and two measurements (viz., height and weight). The total number of such measurements is termed the dimension of the problem.
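The BMI example can be sketched in a few lines of Python. This is only an illustration: the function names and the sample students are made up, and the cut-off values (25, 30, and 40) are assumed from commonly quoted BMI ranges, since the text does not give any. Anything below 25 is lumped into Normal here, because the example has only four classes.

```python
def bmi(weight_kg, height_m):
    """BMI = weight (kg) divided by the square of height (m)."""
    return weight_kg / height_m ** 2


def classify(weight_kg, height_m):
    """Map the two measurements (weight, height) to one of four classes.

    The cut-offs 25, 30 and 40 are assumptions, not values from the text.
    """
    value = bmi(weight_kg, height_m)
    if value < 25:
        return "Normal"
    if value < 30:
        return "Overweight"
    if value < 40:
        return "Obese"
    return "Morbidly Obese"


# Hypothetical students, each described by two measurements: (weight in kg, height in m).
students = [(60.0, 1.70), (80.0, 1.70), (100.0, 1.70), (130.0, 1.70)]
labels = [classify(w, h) for w, h in students]
print(labels)
```

Each student is described by exactly two numbers, so the dimension of this problem is two.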


Now, I guess we have an idea of what dimension means. But life is rarely as simple as the previous example. Very often we have no prior idea about the class boundaries, so we need some decision rule to solve such problems. Bayesian decision theory provides one such rule.

Dimensionality reduction: is it really necessary?

Example 2:
To answer this question, let us consider another example. Assume that you are a bank manager and you want to set up a surveillance system like the following. A camera installed above the entrance of your branch captures images of each person entering the bank. An image processing system then decides whether the person is carrying a gun or not. If the system decides that the person is carrying a gun, the door will not open and an emergency message will be sent to the nearest police station. Since a machine is making the decision, mistakes are bound to happen. Now let's see what the consequences of these mistakes are.
Mistake 1: The person is actually not carrying a gun, but your system says that they are (a false alarm). The police will come and an innocent customer will be searched and harassed.
Mistake 2: The person is actually carrying a gun, but the system says that they are not (a missed detection). The whole point of building such a system is lost.
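To make the two kinds of mistakes concrete, here is a minimal sketch that counts them from a list of true situations and a list of system decisions. The function name and the sample data are hypothetical and only serve as an illustration.

```python
def count_mistakes(actual, predicted):
    """Count both kinds of mistakes.

    actual[i]    -- True if person i really carries a gun
    predicted[i] -- True if the system says person i carries a gun
    """
    # Mistake 1: no gun, but the system raises the alarm.
    false_alarms = sum(1 for a, p in zip(actual, predicted) if not a and p)
    # Mistake 2: a gun is present, but the system misses it.
    missed_guns = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return false_alarms, missed_guns


# Made-up data for six visitors.
actual = [False, False, True, False, True, False]
predicted = [False, True, True, False, False, False]

false_alarms, missed_guns = count_mistakes(actual, predicted)
print(false_alarms, missed_guns)
```

In this made-up sample, the second visitor triggers a false alarm and the fifth visitor's gun goes undetected.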
So we have to minimize mistakes. One general prescription for doing so is to include as many attributes as possible (i.e., to increase the dimension of the problem). But this increases the complexity of the system, and as a result the processing time goes up. There will be a long queue in front of the entrance, and instead of, say, 1000 customers, only 100 will be able to enter the bank. That means a huge monetary loss for you. So in practice we cannot deal with very high dimensions. Somehow we have to reduce the dimension in such a way that we still retain the useful information.
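The arithmetic behind the queue argument can be sketched as follows. Everything here is an assumption chosen for illustration: the processing time is taken to grow linearly with the number of attributes, the entrance handles one person at a time, the bank is open for 8 hours, and the per-attribute cost of 288 ms is picked only so that the numbers match the 1000 and 100 customers mentioned above.

```python
def customers_served(num_attributes, cost_per_attribute_ms, window_ms):
    """Number of customers that fit into the time window.

    Assumes one customer is processed at a time and that the processing
    time per customer is proportional to the number of attributes.
    """
    time_per_customer_ms = num_attributes * cost_per_attribute_ms
    return window_ms // time_per_customer_ms


WINDOW_MS = 8 * 60 * 60 * 1000   # assumed 8-hour working day, in milliseconds
COST_PER_ATTRIBUTE_MS = 288      # hypothetical cost of handling one attribute

few = customers_served(100, COST_PER_ATTRIBUTE_MS, WINDOW_MS)    # low dimension
many = customers_served(1000, COST_PER_ATTRIBUTE_MS, WINDOW_MS)  # ten times more attributes
print(few, many)
```

Under this simple linear model, ten times as many attributes means one tenth of the customers get through the door. A real system may scale better or worse than linearly, but the trade-off has the same shape.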
