Thursday, 13 September 2018

Classification III: Bayes' Theorem



Bayes' Theorem:

This is arguably one of the most important concepts in probability for making predictions from an existing data set, and the reason I got started with butterfly predictions.
Bayes' Theorem gives the probability of an event based on prior knowledge of conditions that might be related to the event.



P(A|B) = P(B|A) * P(A) / P(B)

Mathematics 

Now let's take a use case. I have two email accounts: one with Gmail, calling it G, and the other with Outlook, calling it O.

The goal is to find the probability that the next Outlook mail is spam.

Details

G receives 20 mails per day.
O receives 10 mails per day.
Out of all mails, 5% are spam; out of all spam, 50% is from G and 50% from O.

Solution:

1. I receive 30 mails a day, so any new mail has the following probabilities of coming from each account:

P(G) = 20/30 = 0.66
P(O) = 10/30 = 0.33

2. The probability that a mail is spam is:

P(S) = 5% = 5/100 = 0.05

3. Since we know that 50% of spam comes from Gmail, the probability that a spam mail's source is Gmail is 50% (of total spam), which implies:

P(G|S) = 50% = 0.5

and similarly for Outlook:

P(O|S) = 50% = 0.5

4. Now the probability that a mail is spam, given that it comes from Outlook, is written as:

P(S|O) = ?

Applying Bayes' Theorem:

P(Spam|Outlook) = P(Outlook|Spam) * P(Spam) / P(Outlook)

P(S|O) = P(O|S) * P(S) / P(O) = (0.5 * 0.05) / 0.33 = 0.075 (7.5%)

Conclusion: For every 100 mails coming from Outlook, 7 to 8 will be spam.

P(Spam|Gmail) = P(Gmail|Spam) * P(Spam) / P(Gmail)

P(S|G) = P(G|S) * P(S) / P(G) = (0.5 * 0.05) / 0.66 = 0.0375 (3.75%)

Conclusion: For every 100 mails coming from Gmail, nearly 4 will be spam.
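
The whole worked example can be verified in a few lines of R; here is a minimal sketch (the variable names are just illustrative):

p_g <- 20/30        # P(G): probability a mail comes from Gmail
p_o <- 10/30        # P(O): probability a mail comes from Outlook
p_s <- 0.05         # P(S): overall probability a mail is spam
p_o_given_s <- 0.5  # P(O|S): share of spam coming from Outlook
p_g_given_s <- 0.5  # P(G|S): share of spam coming from Gmail

# Bayes' Theorem: P(S|O) = P(O|S) * P(S) / P(O), and likewise for Gmail
p_s_given_o <- p_o_given_s * p_s / p_o
p_s_given_g <- p_g_given_s * p_s / p_g
p_s_given_o   # 0.075  -> 7.5%
p_s_given_g   # 0.0375 -> 3.75%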

Part II : Naive Bayes in Machine Learning

The Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' Theorem with naive independence assumptions between the features.

Abstractly, given n features (independent variables) f1, f2, f3, …, fn and an outcome class C, the probability (even for a large number of features) can be written as:

P(C|f1, …, fn) = P(f1, …, fn|C) * P(C) / P(f1, …, fn)

In Bayes terms:

Posterior = (Prior * Likelihood) / Evidence
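
Because of the naive independence assumption, the likelihood factorizes into a product of per-feature terms, so the posterior is proportional to P(C) * P(f1|C) * … * P(fn|C). A minimal sketch of that computation for a two-class, two-feature case (the numbers here are purely illustrative, not from a trained model):

# Priors P(C) and per-feature likelihoods P(fi = observed value | C)
prior <- c("0" = 0.5,  "1" = 0.5)
p_f1  <- c("0" = 0.50, "1" = 0.75)
p_f2  <- c("0" = 0.75, "1" = 0.50)

# Prior * Likelihood (naive product), then divide by the evidence to normalize
unnorm    <- prior * p_f1 * p_f2
posterior <- unnorm / sum(unnorm)
posterior   # posterior probability of each class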


How to apply the Theorem: Example

We have a data set that records the sales of a car with respect to a few attributes. Using a Naive Bayes classifier, we would like to predict, based on these features, the chances of selling the car. The following R code explains in a few simple lines how we can achieve that.




R Code: Naive Bayes Classifier

Loading the libraries

library(e1071)
library(MASS)
library(caTools)
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
library(xlsx)

Loading the data into frame DA, removing the continuous key column (SNos), which is not required for classification.

Data<-read.xlsx(file="NBDS.xlsx",sheetName="Sheet2",header=TRUE)
DA<- Data[,2:5]
DA
##     Color   Type   Origin Sold
## 1     Red Sports Domestic    1
## 2     Red Sports Domestic    0
## 3     Red Sports Domestic    1
## 4  Yellow Sports Domestic    0
## 5  Yellow Sports Imported    1
## 6  Yellow    SUV Imported    0
## 7  Yellow    SUV Imported    1
## 8  Yellow    SUV Domestic    0
## 9     Red    SUV Imported    0
## 10    Red Sports Imported    1

Splitting the data into training and test sets with a 75% split ratio. Although this is a very small set, the split is required for validation.

DA$Sold<-as.factor(DA$Sold)
set.seed(123)
split = sample.split(DA$Sold, SplitRatio = 0.75)
train = subset(DA, split == TRUE)
test = subset(DA, split == FALSE)
str(DA)
## 'data.frame':    10 obs. of  4 variables:
##  $ Color : Factor w/ 2 levels "Red","Yellow": 1 1 1 2 2 2 2 2 1 1
##  $ Type  : Factor w/ 2 levels "Sports","SUV": 1 1 1 1 1 2 2 2 2 1
##  $ Origin: Factor w/ 2 levels "Domestic","Imported": 1 1 1 1 2 2 2 1 2 2
##  $ Sold  : Factor w/ 2 levels "0","1": 2 1 2 1 2 1 2 1 1 2

Creating the Model

NB_Model<-naiveBayes(x=train[,1:3],y=train$Sold)
NB_Model
## 
## Naive Bayes Classifier for Discrete Predictors
## 
## Call:
## naiveBayes.default(x = train[, 1:3], y = train$Sold)
## 
## A-priori probabilities:
## train$Sold
##   0   1 
## 0.5 0.5 
## 
## Conditional probabilities:
##           Color
## train$Sold Red Yellow
##          0 0.5    0.5
##          1 0.5    0.5
## 
##           Type
## train$Sold Sports  SUV
##          0   0.50 0.50
##          1   0.75 0.25
## 
##           Origin
## train$Sold Domestic Imported
##          0     0.75     0.25
##          1     0.50     0.50

Scoring and Confusion Matrix

pred <- predict(NB_Model,newdata=DA)
cm=confusionMatrix(data=pred,reference = DA$Sold)
cm
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction 0 1
##          0 5 3
##          1 0 2
##                                           
##                Accuracy : 0.7             
##                  95% CI : (0.3475, 0.9333)
##     No Information Rate : 0.5             
##     P-Value [Acc > NIR] : 0.1719          
##                                           
##                   Kappa : 0.4             
##  Mcnemar's Test P-Value : 0.2482          
##                                           
##             Sensitivity : 1.000           
##             Specificity : 0.400           
##          Pos Pred Value : 0.625           
##          Neg Pred Value : 1.000           
##              Prevalence : 0.500           
##          Detection Rate : 0.500           
##    Detection Prevalence : 0.800           
##       Balanced Accuracy : 0.700           
##                                           
##        'Positive' Class : 0               
## 
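
If you also want the posterior probabilities behind each label, predict() for e1071's naiveBayes supports type = "raw". A quick sketch scoring one hypothetical new car (the feature values are made up for illustration):

# Score a single hypothetical new car and inspect its class probabilities
new_car <- data.frame(Color  = factor("Red",      levels = levels(DA$Color)),
                      Type   = factor("Sports",   levels = levels(DA$Type)),
                      Origin = factor("Imported", levels = levels(DA$Origin)))
predict(NB_Model, newdata = new_car, type = "raw")  # posterior P(Sold = 0) and P(Sold = 1)
predict(NB_Model, newdata = new_car)                # predicted class label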

  
Summary: We get 70% accuracy with our set; since the data is scattered, this is fine for a very small sample. In the next articles we will look at how to measure accuracy and achieve better results from classification models.

Let me know in the comments section if you need additional details.
