AI Core Concepts (Part 12): Bayesian Learning

Bayesian Learning is a probabilistic approach to modeling uncertainty in machine learning. Instead of finding a single best model, it estimates a distribution over possible models given the data.


1. Core Idea: Bayes’ Theorem

Bayesian learning updates our beliefs about model parameters using Bayes' theorem:

P(θ | D) = [ P(D | θ) * P(θ) ] / P(D)

Where:

- P(θ | D) is the posterior: the updated belief about the parameters θ after seeing the data D.
- P(D | θ) is the likelihood: how probable the observed data is under parameters θ.
- P(θ) is the prior: the belief about the parameters before seeing any data.
- P(D) is the evidence (marginal likelihood): a normalizing constant ensuring the posterior integrates to 1.
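
For intuition, here is a minimal worked example of a Bayesian update: estimating a coin's bias with a Beta prior, the conjugate prior for binomial data. The prior parameters and flip counts are made-up illustrative numbers.

from scipy.stats import beta

# Prior belief about the coin's bias: Beta(2, 2), mildly centered on 0.5
a_prior, b_prior = 2, 2

# Observed data: 7 heads in 10 flips (illustrative numbers)
heads, flips = 7, 10

# Conjugacy: posterior is Beta(a + heads, b + tails); no integral required
a_post, b_post = a_prior + heads, b_prior + (flips - heads)

posterior = beta(a_post, b_post)
print(posterior.mean())          # posterior mean of the bias, here ~0.64
print(posterior.interval(0.95))  # 95% credible interval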

2. Why Bayesian?

- Uncertainty quantification: predictions come with a measure of confidence, not just a point value.
- Prior knowledge: domain expertise can be encoded directly in the prior.
- Small-data regimes: priors act as regularizers when data is scarce.
- Principled model comparison: the evidence P(D) can be used to compare models.

3. Example: Bayesian Linear Regression

Instead of estimating a single best line, Bayesian regression infers a distribution over possible lines.

import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import train_test_split

# Synthetic data: y = 3x plus Gaussian noise
X = np.random.randn(100, 1)
y = 3 * X[:, 0] + np.random.randn(100) * 0.5

X_train, X_test, y_train, y_test = train_test_split(X, y)

# Bayesian regression
model = BayesianRidge()
model.fit(X_train, y_train)

# Predict with uncertainty: mean and standard deviation of the predictive distribution
y_mean, y_std = model.predict(X_test, return_std=True)

The model outputs not only point predictions (y_mean) but also a per-point uncertainty (y_std): the standard deviation of the predictive distribution at each test input.
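
A common use of y_std is to form an approximate 95% interval around each prediction (a small follow-on sketch reusing y_mean and y_std from above):

# Approximate 95% interval under the Gaussian predictive distribution
lower = y_mean - 1.96 * y_std
upper = y_mean + 1.96 * y_std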


4. Priors and Posteriors in Practice

# Prior over the weights:     w ~ N(0, α⁻¹ I)
# Likelihood of the targets:  y ~ N(Xw, β⁻¹ I)
# Posterior over the weights: computed via a closed-form Gaussian update
#                             (conjugate case) or approximated by sampling
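
In scikit-learn's BayesianRidge, the two precisions are themselves estimated from the data. Assuming the model fitted in section 3, a quick sketch of inspecting the learned quantities (note that scikit-learn's naming is the reverse of the notation above):

# alpha_  is the estimated noise precision  (β above)
# lambda_ is the estimated weight precision (α above)
print(model.alpha_, model.lambda_)

print(model.coef_)   # posterior mean of the weights
print(model.sigma_)  # posterior covariance of the weights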

5. Approximate Bayesian Inference

Exact computation of the posterior is intractable in most deep models. Common approximations:

- Variational inference: fit a simpler parametric distribution to the posterior by optimization.
- Markov chain Monte Carlo (MCMC): draw samples from the posterior.
- Monte Carlo dropout: keep dropout active at test time and average several stochastic forward passes (shown below).
- Deep ensembles: train several networks and treat their disagreement as uncertainty.

# Monte Carlo Dropout example (Keras)
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout

model = Sequential([
    Input(shape=(1,)),
    Dense(64, activation='relu'),
    Dropout(0.5),  # kept active at inference via training=True below
    Dense(1)
])

# Several stochastic forward passes; their spread estimates uncertainty
X_new = np.random.randn(10, 1)
samples = np.stack([model(X_new, training=True).numpy() for _ in range(100)])
y_mean, y_std = samples.mean(axis=0), samples.std(axis=0)
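
Each pass samples a different dropout mask, so the network behaves like an ensemble of thinned sub-networks: the mean of the passes is the prediction, and their standard deviation serves as an approximate predictive uncertainty.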

6. Bayesian vs Frequentist Learning

Concept     | Bayesian                              | Frequentist
------------|---------------------------------------|----------------------------------
Parameters  | Distributions (random variables)      | Fixed but unknown
Inference   | Posterior from prior + data           | Maximum likelihood / optimization
Output      | Probabilistic predictions + variance  | Point estimates

7. When to Use Bayesian Learning

✅ Use when:

- Uncertainty estimates matter (e.g., medical, financial, or safety-critical decisions).
- Data is scarce and priors can encode domain knowledge.
- You need calibrated confidence rather than bare point predictions.

⚠️ Less practical when:

- Datasets and models are very large, since exact or sampled posteriors become computationally expensive.
- Strict low-latency inference rules out multiple forward passes or sampling.


8. Libraries for Bayesian Learning

- PyMC: probabilistic programming in Python with MCMC and variational inference.
- Stan (via CmdStanPy or PyStan): high-performance MCMC.
- TensorFlow Probability: distributions and probabilistic layers for TensorFlow/Keras.
- Pyro / NumPyro: deep probabilistic programming on PyTorch / JAX.
- scikit-learn: simple Bayesian models such as BayesianRidge and GaussianProcessRegressor.
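
As a taste of the probabilistic-programming style, here is a minimal PyMC sketch of the linear model from section 3, reusing X and y from that example; the prior scales are illustrative choices, not recommendations.

import pymc as pm

with pm.Model():
    # Priors over the slope and the noise scale (illustrative choices)
    w = pm.Normal("w", mu=0, sigma=10)
    sigma = pm.HalfNormal("sigma", sigma=1)

    # Likelihood of the observed targets
    pm.Normal("y_obs", mu=w * X[:, 0], sigma=sigma, observed=y)

    # Draw posterior samples with MCMC
    trace = pm.sample(1000)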

📚 Further Resources

- Bishop, Pattern Recognition and Machine Learning (chapter 3 covers Bayesian linear regression).
- Murphy, Machine Learning: A Probabilistic Perspective.
- Gal & Ghahramani, "Dropout as a Bayesian Approximation" (the paper behind Monte Carlo dropout).
