
Data Mining: Naive Bayes Classifier

Author: Cache_wood | Published 2022-04-10 16:28

@[toc]

A probabilistic framework for solving classification problems

Conditional Probability:
P(C|A) = \frac{P(A,C)}{P(A)}
P(A|C) = \frac{P(A,C)}{P(C)}
Bayes theorem:
P(C|A) = \frac{P(A|C)P(C)}{P(A)}
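
Bayes theorem follows directly from the two conditional probability identities above: both express the joint probability P(A,C), so

P(C|A)P(A) = P(A,C) = P(A|C)P(C) \implies P(C|A) = \frac{P(A|C)P(C)}{P(A)}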
Consider each attribute and class label as random variables

Given a record with attributes (A_1,A_2,…,A_n)

  • Goal is to predict class C
  • Specifically, we want to find the value of C that maximizes P(C|A_1,A_2,…,A_n)

Approach:

  • Compute the posterior probability P(C|A_1,A_2,…,A_n) for all values of C using Bayes theorem
    P(C|A_1A_2…A_n) = \frac{P(A_1A_2…A_n|C)P(C)}{P(A_1A_2…A_n)}

  • Choose value of C that maximizes
    P(C|A_1,A_2,…,A_n)

  • Equivalent to choosing value of C that maximizes
    P(A_1,A_2,…,A_n|C)P(C)
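
The equivalence holds because the evidence P(A_1,A_2,…,A_n) is the same for every class, so it can be dropped when ranking classes:

\arg\max_C P(C|A_1,…,A_n) = \arg\max_C \frac{P(A_1,…,A_n|C)P(C)}{P(A_1,…,A_n)} = \arg\max_C P(A_1,…,A_n|C)P(C)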

Naive Bayes Classifier

Assume independence among attributes A_i when class is given:

  • P(A_1,A_2,…,A_n|C_j) = P(A_1|C_j)P(A_2|C_j)…P(A_n|C_j)
  • Can estimate P(A_i|C_j) for all A_i and C_j.
  • New point is classified to C_j if P(C_j)\Pi P(A_i|C_j) is maximal.
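
A minimal Python sketch of this classification rule for categorical attributes, assuming a small in-memory training set; the `train`/`classify` helpers and the toy weather-style data are illustrative, not from the original notes:

```python
from collections import Counter, defaultdict

def train(records, labels):
    """Estimate P(C_j) and the counts needed for P(A_i = v | C_j) from categorical data."""
    class_counts = Counter(labels)
    # cond_counts[(i, value, c)] = number of class-c records whose i-th attribute equals value
    cond_counts = defaultdict(int)
    for record, c in zip(records, labels):
        for i, value in enumerate(record):
            cond_counts[(i, value, c)] += 1
    priors = {c: n / len(labels) for c, n in class_counts.items()}
    return priors, cond_counts, class_counts

def classify(record, priors, cond_counts, class_counts):
    """Return the class C_j that maximizes P(C_j) * prod_i P(A_i | C_j)."""
    best_class, best_score = None, -1.0
    for c, prior in priors.items():
        score = prior
        for i, value in enumerate(record):
            score *= cond_counts[(i, value, c)] / class_counts[c]
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# toy usage
records = [("sunny", "hot"), ("rainy", "cool"), ("sunny", "cool"), ("rainy", "hot")]
labels = ["no", "yes", "yes", "no"]
priors, cond_counts, class_counts = train(records, labels)
print(classify(("sunny", "cool"), priors, cond_counts, class_counts))  # -> "yes"
```

Note that these unsmoothed ratio estimates can hit the zero-probability problem discussed below.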

How to Estimate Probabilities from Data

For continuous attributes:

  • Discretize the range into bins
    • one ordinal attribute per bin
    • violates independence assumption
  • Two-way split: (A<v) or (A>v)
    • choose only one of the two splits as the new attribute
  • Probability density estimation
    • Assume attribute follows a normal distribution
    • Use data to estimate the parameters of the distribution (e.g., mean and standard deviation)
    • Once probability distribution is known, can use it to estimate the conditional probability P(A_i|c)

Normal distribution: P(A_i|c_j) = \frac{1}{\sqrt{2\pi\sigma_{ij}^2}}e^{-\frac{(A_i-\mu_{ij})^2}{2\sigma_{ij}^2}}

One distribution for each (A_i, c_j) pair
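
A minimal sketch of this density estimate, assuming the class-conditional sample mean and standard deviation for A_i have already been computed; the `gaussian_likelihood` helper and the numbers in the example are illustrative:

```python
import math

def gaussian_likelihood(x, mean, std):
    """P(A_i = x | c_j) under a normal distribution with class-conditional mean and std."""
    var = std ** 2
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

# e.g. a continuous attribute with class-conditional mean 110 and std 54.54 (illustrative values)
print(gaussian_likelihood(120, mean=110, std=54.54))  # ~0.0072
```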

If one of the conditional probabilities is zero, the entire expression becomes zero

Probability estimation:

N_c: number of training records of class C, N_{ic}: number of class-C records with attribute value A_i, c: number of classes, p: prior probability, m: parameter

Original: P(A_i|C) = \frac{N_{ic}}{N_c}
Laplace: P(A_i|C) = \frac{N_{ic}+1}{N_c+c}
m-estimate: P(A_i|C) = \frac{N_{ic}+mp}{N_c+m}
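
A minimal sketch of the three estimators, assuming the counts N_{ic} and N_c are already available; function and parameter names are illustrative:

```python
def original_estimate(n_ic, n_c):
    """P(A_i|C) = N_ic / N_c -- zero whenever the value never occurs with class C."""
    return n_ic / n_c

def laplace_estimate(n_ic, n_c, c):
    """Laplace correction, with c as defined above (number of classes)."""
    return (n_ic + 1) / (n_c + c)

def m_estimate(n_ic, n_c, m, p):
    """m-estimate: p is a prior for P(A_i|C), m controls how strongly the prior is weighted."""
    return (n_ic + m * p) / (n_c + m)

# with a zero count the smoothed estimates stay positive, so the product is not wiped out
print(original_estimate(0, 10))         # 0.0
print(laplace_estimate(0, 10, c=3))     # ~0.077
print(m_estimate(0, 10, m=3, p=1/3))    # ~0.077
```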

Naive Bayes (Summary)

  • Robust to isolated noise points
  • Handles missing values by ignoring the instance during probability estimate calculations
  • Robust to irrelevant attributes
  • Independence assumption may not hold for some attributes
    • Use other techniques such as Bayesian Belief Networks (BBN)
