Monday, October 27, 2014

Predictions - Effect of the number of unique target classes on accuracy




When we perform classification, the target variable is a categorical (nominal) variable with a set of unique values, or classes. It could be a simple two-class target such as "approve application?" with the classes "yes" and "no". Sometimes the classes indicate ranges, like "Excellent", "Good", etc., for a target variable such as a satisfaction score. We might also convert continuous variables like test scores (1 - 100) into classes like grades (A, B, C, etc.).

This experiment examines the effect of the number of unique classes in the target variable on prediction accuracy. The hypothesis is that accuracy will go down as the number of classes increases, because each additional class boundary adds another chance for a predicted sample to end up on the wrong side of a boundary.
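The boundary argument can be illustrated with a small simulation. This is only a sketch on synthetic numbers, not the data or model used in this post: the value range (90 to 180), the Gaussian prediction error, and the function names are all assumptions made for illustration. A "prediction" is the actual value plus noise, and it counts as correct when it lands in the same equal-width bin as the actual value.

```python
import random

def bin_index(value, low, high, n_bins):
    """Map a value to one of n_bins equal-width classes over [low, high]."""
    width = (high - low) / n_bins
    idx = int((value - low) / width)
    return min(max(idx, 0), n_bins - 1)  # clamp values that fall outside the range

def simulated_accuracy(n_bins, n_samples=20000, noise_sd=10.0, seed=0):
    """Share of noisy predictions that land in the same bin as the actual value."""
    rng = random.Random(seed)
    low, high = 90.0, 180.0  # assumed value range, for illustration only
    hits = 0
    for _ in range(n_samples):
        actual = rng.uniform(low, high)
        predicted = actual + rng.gauss(0.0, noise_sd)  # assumed error model
        if bin_index(actual, low, high, n_bins) == bin_index(predicted, low, high, n_bins):
            hits += 1
    return hits / n_samples

accuracies = {k: simulated_accuracy(k) for k in (2, 4, 8, 16)}
for k, acc in accuracies.items():
    print(f"{k:2d} classes: accuracy {acc:.3f}")
```

Because the prediction error stays the same while the bins get narrower, more predictions cross a boundary as the number of classes grows, so the simulated accuracy falls.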

For this experiment, I used a data set of blood pressure levels. Each observation contains the patient's demographics and the measured systolic blood pressure. The blood pressure value is then binned into multiple classes (blood pressure ranges), and the range is predicted for a varying number of bins (classes). The results are tabulated as follows.


The experiment confirms the hypothesis. Accuracy drops sharply as the number of classes in the target variable increases. The drop does taper off beyond a size of 8.
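For reference, the binning step used to build the target classes might look like the sketch below. The post does not say how the bins were chosen, so equal-width bins, the 90 to 170 range, the sample readings, and the helper names `make_edges` and `to_class` are all assumptions for illustration.

```python
from bisect import bisect_right

def make_edges(low, high, n_bins):
    """Return the n_bins - 1 interior boundaries of equal-width bins."""
    width = (high - low) / n_bins
    return [low + i * width for i in range(1, n_bins)]

def to_class(value, edges):
    """Return the 0-based class index of value, given sorted interior edges."""
    return bisect_right(edges, value)

# Hypothetical systolic readings (mmHg), invented for this example.
readings = [98, 112, 121, 134, 147, 163]

binned = {}
for n_bins in (2, 4, 8):
    edges = make_edges(90, 170, n_bins)  # 90-170 is an assumed range
    binned[n_bins] = [to_class(r, edges) for r in readings]
    print(n_bins, "classes:", binned[n_bins])
```

With these made-up readings, 112 and 121 share a class when there are 4 bins but fall into different classes when there are 8, which is the kind of extra boundary the hypothesis refers to.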





