Virtual Labs

Movie Review Sentiment Analysis using Naïve Bayes

Understanding Naive Bayes for Sentiment Analysis

1. Introduction

Sentiment analysis is a natural language processing (NLP) technique used to determine whether a given piece of text expresses a positive, negative, or neutral sentiment. It is widely applied in various domains such as customer feedback analysis, social media monitoring, and market research.

2. Importance of Sentiment Analysis

Enables businesses to understand customer emotions and opinions.
Automates the analysis of large volumes of text data.
Improves decision-making by identifying trends and customer satisfaction levels.
Supports recommendation systems, brand monitoring, and product review analysis.

3. What is Naive Bayes?

Naive Bayes is a supervised learning algorithm based on Bayes' Theorem. It is particularly effective for text classification tasks like spam detection and sentiment analysis.

3.1 Bayes' Theorem

Bayes' Theorem describes the probability of an event occurring based on prior knowledge of related conditions.

Formula:
P(A|B) = [P(B|A) × P(A)] / P(B)

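P(A|B) → Posterior probability of A given that B has been observed
P(B|A) → Likelihood of observing B when A holds
P(A) → Prior probability of A (in classification, the class)
P(B) → Evidence (the overall probability of B)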
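As a small worked example of the formula, the sketch below plugs in made-up numbers. The events (A = "the review is positive", B = "the review contains the word great") and all three probabilities are illustrative assumptions, not values from a real dataset.

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Hypothetical numbers, chosen only for illustration.
p_pos = 0.5                # P(A): prior probability that a review is positive
p_great_given_pos = 0.2    # P(B|A): "great" appears in 20% of positive reviews
p_great_given_neg = 0.05   # P(B|not A): "great" appears in 5% of negative reviews

# Evidence P(B) via the law of total probability.
p_great = p_great_given_pos * p_pos + p_great_given_neg * (1 - p_pos)

p_pos_given_great = posterior(p_great_given_pos, p_pos, p_great)
print(round(p_pos_given_great, 3))  # 0.8
```

With these assumed numbers, seeing the word "great" raises the probability of a positive review from 0.5 to 0.8.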

3.2 Why is it Called "Naive"?

The algorithm assumes that all features (words in a review) are conditionally independent given the class label — a "naive" assumption that is rarely true in language, yet the classifier still performs remarkably well.

3.3 Mathematical Formulation of Naive Bayes for Sentiment Analysis

Given a document X with words w₁, w₂, …, wₙ, we want to find the most probable class C:

P(C|X) = [P(X|C) × P(C)] / P(X)

Since P(X) is the same for every class C, it does not affect which class scores highest, so we maximize:

P(C|X) ∝ P(C) × P(X|C)

Under the naive independence assumption:

P(X|C) = Π(i=1 to n) P(w_i|C)

Final scoring function:

P(C|X) ∝ P(C) × Π(i=1 to n) P(w_i|C)
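The scoring function can be sketched directly in Python. The class labels, the `priors` table, and the `word_probs` table below are hypothetical values picked by hand for illustration; in practice they would be estimated from training data.

```python
from math import prod

# Hypothetical, hand-picked probabilities (not learned from data).
priors = {"pos": 0.5, "neg": 0.5}
word_probs = {
    "pos": {"great": 0.20, "boring": 0.02, "movie": 0.10},
    "neg": {"great": 0.03, "boring": 0.15, "movie": 0.10},
}

def score(words, label):
    """Unnormalized posterior: P(C) times the product of P(w_i|C)."""
    return priors[label] * prod(word_probs[label][w] for w in words)

def classify(words):
    """Return the class with the highest unnormalized score."""
    return max(priors, key=lambda label: score(words, label))

print(classify(["great", "movie"]))   # pos
print(classify(["boring", "movie"]))  # neg
```

Because only the ranking of the classes matters, the scores are never divided by P(X).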

Laplace (Add-One) Smoothing

To avoid zero probabilities for unseen words:

P(w_i|C) = (count(w_i, C) + 1) / (total words in class C + |V|)

where |V| = vocabulary size.
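A minimal sketch of the smoothed estimate, assuming word counts for one class are kept in a `collections.Counter`. The function name `smoothed_likelihood` and the toy counts are illustrative.

```python
from collections import Counter

def smoothed_likelihood(word, class_counts, vocab_size):
    """Add-one estimate: (count(w, C) + 1) / (total words in C + |V|)."""
    total_words = sum(class_counts.values())
    # Counter returns 0 for words it has never seen, so no special case is needed.
    return (class_counts[word] + 1) / (total_words + vocab_size)

# Toy data: word counts for the positive class and the full vocabulary.
pos_counts = Counter(["great", "great", "fun", "movie"])
vocab = {"great", "fun", "movie", "boring", "dull"}

print(smoothed_likelihood("great", pos_counts, len(vocab)))   # (2 + 1) / (4 + 5)
print(smoothed_likelihood("boring", pos_counts, len(vocab)))  # (0 + 1) / (4 + 5)
```

The word "boring" never occurs in the positive class, yet it receives a small non-zero probability, and the smoothed probabilities over the whole vocabulary still sum to 1.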

Log Probabilities (for Numerical Stability)

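Instead of multiplying many tiny numbers, which can underflow to zero in floating-point arithmetic, we sum their logarithms:

log P(C|X) = log P(C) + Σ(i=1 to n) log P(w_i|C) − log P(X)

The last term is the same for every class, so it can be dropped when comparing classes.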

We pick the class with the highest log score.
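The underflow problem is easy to reproduce. The sketch below assumes a hypothetical 100-word review in which every word has probability 1e-5 under some class; the numbers are illustrative only.

```python
import math

# Hypothetical setup: a 100-word review, each word with probability 1e-5.
word_probs = [1e-5] * 100
prior = 0.5

# Direct product: (1e-5) ** 100 is far below the smallest positive float,
# so the score underflows to exactly 0.0 and classes become indistinguishable.
raw_score = prior * math.prod(word_probs)

# Log form: summing logarithms keeps the score finite and comparable.
log_score = math.log(prior) + sum(math.log(p) for p in word_probs)

print(raw_score)                 # 0.0
print(math.isfinite(log_score))  # True
```

Since the logarithm is monotonically increasing, the class with the highest log score is also the class with the highest probability.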

4. Steps in Sentiment Analysis Using Naive Bayes

Step 4: Train the Naive Bayes Model

Prior probability:
P(C) = (number of documents in class C) / (total number of documents)
Likelihoods computed using Laplace smoothing (formula above)
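The training step can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names `train` and `predict`, the toy reviews, and the choice to skip words that never appeared in training are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (tokens, label) pairs. Returns priors, per-class word counts, vocabulary."""
    doc_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    # P(C) = documents in class C / total documents
    priors = {label: n / len(docs) for label, n in doc_counts.items()}
    return priors, word_counts, vocab

def predict(tokens, priors, word_counts, vocab):
    """Pick the class with the highest log score, using add-one smoothing."""
    scores = {}
    for label, prior in priors.items():
        total_words = sum(word_counts[label].values())
        log_score = math.log(prior)
        for w in tokens:
            if w not in vocab:
                continue  # assumption: ignore words never seen in training
            log_score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        scores[label] = log_score
    return max(scores, key=scores.get)

# Toy training set (illustrative only).
docs = [
    (["great", "fun", "movie"], "pos"),
    (["loved", "great", "acting"], "pos"),
    (["boring", "dull", "movie"], "neg"),
    (["terrible", "boring", "plot"], "neg"),
]
priors, word_counts, vocab = train(docs)
print(priors)                                                    # {'pos': 0.5, 'neg': 0.5}
print(predict(["great", "acting"], priors, word_counts, vocab))  # pos
print(predict(["boring", "plot"], priors, word_counts, vocab))   # neg
```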

(The remaining steps are not shown here; only the step that involves formulas is highlighted.)

5. Advantages of Naive Bayes in Sentiment Analysis

Very fast training and prediction
Works well with high-dimensional text data
Surprisingly effective despite strong independence assumption
Requires relatively little training data

6. Limitations of Naive Bayes

Feature independence assumption is violated in real language
Poor at capturing sarcasm, negation ("not good"), and context
Zero probability problem if a word wasn't seen in training (mitigated by smoothing)
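The negation problem follows from the bag-of-words representation: the score depends only on which words occur and how often, not on their order. The two example reviews below are made up to illustrate this.

```python
from collections import Counter

# Two hypothetical reviews with opposite meanings but the same words.
review_a = "not good , actually bad".split()
review_b = "not bad , actually good".split()

# Naive Bayes sees only word counts, so these two reviews are identical to it
# and necessarily receive the same score for every class.
print(Counter(review_a) == Counter(review_b))  # True
```

One common workaround, beyond the scope of this sketch, is to add bigram features such as "not good" so that some local word order is preserved.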

7. Key Takeaways

Naive Bayes is a simple, fast, and surprisingly powerful baseline for sentiment analysis
Text preprocessing significantly impacts performance
Laplace smoothing and log probabilities are essential practical tricks
Always evaluate on held-out data not seen during training, using metrics such as accuracy, precision, recall, and F1-score
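As a sketch of what evaluation can look like, the function below computes accuracy, precision, recall, and F1 for one class from lists of true and predicted labels. The function name `evaluate`, the label names, and the example predictions are hypothetical.

```python
def evaluate(y_true, y_pred, positive="pos"):
    """Return accuracy, precision, recall and F1, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Hypothetical labels for five held-out reviews.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
accuracy, precision, recall, f1 = evaluate(y_true, y_pred)
print(accuracy)  # 0.6
```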