## Classification III : Bayes' Theorem

**Bayes' Theorem:**

Bayes' Theorem gives the probability of an event based on prior knowledge of conditions that might be related to the event:

**P(A|B) = P(B|A) * P(A) / P(B)**

**Mathematics**

Now let's take a use case. I have two email accounts: one with Gmail, call it G, and the other with Outlook, call it O.

The goal is to find the probability that the next Outlook mail is spam.

**Details**

G: 20 mails per day

O: 10 mails per day

Out of all mails, 5% are spam; of all spam, 50% comes from G and 50% from O.

**Solution:**

1. I receive 30 mails a day, so any new mail comes from Gmail or Outlook with probabilities:

P(G) = 20/30 ≈ 0.67

P(O) = 10/30 ≈ 0.33

2. The probability that any mail is spam is:

P(S) = 5% = 5/100 = 0.05

3. Since 50% of spam comes from Gmail, the probability that a spam mail's source is Gmail is 50% (of total spam):

P(G|S) = 50% = 0.5

and similarly for Outlook:

P(O|S) = 50% = 0.5

4. The probability that a mail is spam given it comes from Outlook is written as:

P(S|O) = ?

Applying Bayes' Theorem:

P(Spam|Outlook) = P(Outlook|Spam) * P(Spam) / P(Outlook)

P(S|O) = P(O|S) * P(S) / P(O) = (0.5 * 0.05) / 0.33 = 0.075 (7.5%)

**Conclusion: For every 100 mails coming from Outlook, 7-8 will be spam.**

Similarly for Gmail:

P(Spam|Gmail) = P(Gmail|Spam) * P(Spam) / P(Gmail)

P(S|G) = P(G|S) * P(S) / P(G) = (0.5 * 0.05) / 0.67 ≈ 0.0375 (3.75%)

**Conclusion: For every 100 mails coming from Gmail, nearly 4 will be spam.**
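The two calculations above can be checked with a few lines of R (a quick sketch; the variable names are my own):

```r
# Priors from mail volume
p_g <- 20/30          # P(G)
p_o <- 10/30          # P(O)
p_s <- 0.05           # P(S), overall spam rate
p_g_given_s <- 0.5    # P(G|S), share of spam coming from Gmail
p_o_given_s <- 0.5    # P(O|S), share of spam coming from Outlook

# Bayes' Theorem: P(S|source) = P(source|S) * P(S) / P(source)
p_s_given_o <- p_o_given_s * p_s / p_o
p_s_given_g <- p_g_given_s * p_s / p_g

p_s_given_o   # 0.075  -> roughly 7-8 spam mails per 100 Outlook mails
p_s_given_g   # 0.0375 -> roughly 4 spam mails per 100 Gmail mails
```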

**Part II : Naive Bayes in Machine Learning**

**Naive Bayes classifiers** are simple probabilistic classifiers based on applying Bayes' Theorem with naive independence assumptions between the features.

Abstractly, if we have n features (independent variables) f1, f2, f3, ..., fn, collectively F, and an outcome class C, the posterior probability is:

P(C|F) = P(F|C) * P(C) / P(F)

In Bayes' terms:

**Posterior = (Prior * Likelihood) / Evidence**
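The "naive" part is that the likelihood factorizes into a product of per-feature terms, P(F|C) = P(f1|C) * P(f2|C) * ... * P(fn|C). A minimal R sketch of Posterior = (Prior * Likelihood) / Evidence, using hypothetical per-feature probabilities:

```r
# Hypothetical per-feature likelihoods P(f_i | C) for one class
likelihoods <- c(f1 = 0.8, f2 = 0.6, f3 = 0.9)
prior    <- 0.5   # P(C)
evidence <- 0.3   # P(F), assumed known for this illustration

# Naive independence assumption: multiply the per-feature likelihoods
posterior <- prior * prod(likelihoods) / evidence
posterior   # 0.72
```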

**How to apply the Theorem: Example**

We have a data set that records car sales with respect to a few attributes. Using a Naive Bayes classifier, we would like to predict, based on these features, the chances of a car being sold. The following R code shows in a few simple lines how we can achieve that.

### R Code: Naive Bayes Classifier

#### Loading the libraries

```
library(e1071)
library(MASS)
library(caTools)
library(caret)
```

`## Loading required package: lattice`

`## Loading required package: ggplot2`

`library(xlsx)`

#### Loading the data into frame DA, dropping the serial-number key column (SNo), which is not required for classification

```
Data<-read.xlsx(file="NBDS.xlsx",sheetName="Sheet2",header=TRUE)
DA<- Data[,2:5]
DA
```

```
## Color Type Origin Sold
## 1 Red Sports Domestic 1
## 2 Red Sports Domestic 0
## 3 Red Sports Domestic 1
## 4 Yellow Sports Domestic 0
## 5 Yellow Sports Imported 1
## 6 Yellow SUV Imported 0
## 7 Yellow SUV Imported 1
## 8 Yellow SUV Domestic 0
## 9 Red SUV Imported 0
## 10 Red Sports Imported 1
```
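If the NBDS.xlsx file is not at hand, the same data frame (as printed above) can be built inline; a sketch:

```r
# Inline equivalent of the DA frame read from NBDS.xlsx;
# stringsAsFactors = TRUE keeps the columns as factors, matching str(DA) below
DA <- data.frame(
  Color  = c("Red","Red","Red","Yellow","Yellow",
             "Yellow","Yellow","Yellow","Red","Red"),
  Type   = c("Sports","Sports","Sports","Sports","Sports",
             "SUV","SUV","SUV","SUV","Sports"),
  Origin = c("Domestic","Domestic","Domestic","Domestic","Imported",
             "Imported","Imported","Domestic","Imported","Imported"),
  Sold   = c(1, 0, 1, 0, 1, 0, 1, 0, 0, 1),
  stringsAsFactors = TRUE
)
```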

#### Splitting the data into training and test sets with a 75% split ratio; although this is a very small set, the split is required for validation

```
DA$Sold<-as.factor(DA$Sold)
set.seed(123)
split = sample.split(DA$Sold, SplitRatio = 0.75)
train = subset(DA, split == TRUE)
test = subset(DA, split == FALSE)
str(DA)
```

```
## 'data.frame': 10 obs. of 4 variables:
## $ Color : Factor w/ 2 levels "Red","Yellow": 1 1 1 2 2 2 2 2 1 1
## $ Type : Factor w/ 2 levels "Sports","SUV": 1 1 1 1 1 2 2 2 2 1
## $ Origin: Factor w/ 2 levels "Domestic","Imported": 1 1 1 1 2 2 2 1 2 2
## $ Sold : Factor w/ 2 levels "0","1": 2 1 2 1 2 1 2 1 1 2
```

#### Creation of Model

```
NB_Model<-naiveBayes(x=train[,1:3],y=train$Sold)
NB_Model
```

```
##
## Naive Bayes Classifier for Discrete Predictors
##
## Call:
## naiveBayes.default(x = train[, 1:3], y = train$Sold)
##
## A-priori probabilities:
## train$Sold
## 0 1
## 0.5 0.5
##
## Conditional probabilities:
## Color
## train$Sold Red Yellow
## 0 0.5 0.5
## 1 0.5 0.5
##
## Type
## train$Sold Sports SUV
## 0 0.50 0.50
## 1 0.75 0.25
##
## Origin
## train$Sold Domestic Imported
## 0 0.75 0.25
## 1 0.50 0.50
```
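The printed tables are enough to score a car by hand. For example, for a Yellow SUV Imported car, multiply the a-priori probability by the per-feature conditional probabilities for each class (values copied from the model output above):

```r
# Unnormalized class scores for a Yellow, SUV, Imported car
score_0 <- 0.5 * 0.5 * 0.50 * 0.25  # P(0) * P(Yellow|0) * P(SUV|0) * P(Imported|0)
score_1 <- 0.5 * 0.5 * 0.25 * 0.50  # P(1) * P(Yellow|1) * P(SUV|1) * P(Imported|1)

# Normalize to get the posterior for class 0 (not sold)
post_0 <- score_0 / (score_0 + score_1)
post_0   # ~0.667 -> the model favours class 0
```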

#### Scoring and Confusion Matrix

```
pred <- predict(NB_Model,newdata=DA)
cm=confusionMatrix(data=pred,reference = DA$Sold)
cm
```

```
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 5 3
## 1 0 2
##
## Accuracy : 0.7
## 95% CI : (0.3475, 0.9333)
## No Information Rate : 0.5
## P-Value [Acc > NIR] : 0.1719
##
## Kappa : 0.4
## Mcnemar's Test P-Value : 0.2482
##
## Sensitivity : 1.000
## Specificity : 0.400
## Pos Pred Value : 0.625
## Neg Pred Value : 1.000
## Prevalence : 0.500
## Detection Rate : 0.500
## Detection Prevalence : 0.800
## Balanced Accuracy : 0.700
##
## 'Positive' Class : 0
##
```
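The headline statistics follow directly from the 2x2 counts in the matrix; recomputing a few of them (a sketch, with the positive class being 0 as reported above):

```r
# Counts read off the confusion matrix above (positive class = 0)
tp <- 5   # actual 0, predicted 0
fn <- 0   # actual 0, predicted 1
fp <- 3   # actual 1, predicted 0
tn <- 2   # actual 1, predicted 1

accuracy    <- (tp + tn) / (tp + tn + fp + fn)   # 0.7
sensitivity <- tp / (tp + fn)                    # 1.0
specificity <- tn / (tn + fp)                    # 0.4
```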

Summary: We get 70% accuracy on our set; since the data is scattered, that is fine for such a small sample. In the next articles we will look at how to measure the accuracy of classification models and achieve better results.

Let me know in the comments section if you need additional details.
