Machine Learning (Chapter 7): Bias-Variance Tradeoff

In machine learning, understanding the bias-variance tradeoff is crucial for building effective models. This chapter delves into the concepts of bias and variance, explores their implications, and provides practical examples with mathematical formulas and Python code.

1. Understanding Bias and Variance

Bias refers to the error introduced by approximating a real-world problem, which may be complex, by a simplified model. High bias can cause an algorithm to miss important patterns in the data, leading to underfitting.

Variance refers to the error introduced by the model’s sensitivity to small fluctuations in the training set. High variance can lead to overfitting, where the model performs well on the training data but poorly on unseen data.

Bias-Variance Tradeoff: There is a tradeoff between bias and variance: a model with low bias typically has high variance, and vice versa. The goal is to find a level of model complexity at which the combined error from bias and variance is as small as possible.

2. Mathematical Formulation

The error of a machine learning model can be decomposed into three components:

  1. Bias Squared: \( (\mathbb{E}[f(x)] - f^*(x))^2 \)
  2. Variance: \( \mathbb{E}[(f(x) - \mathbb{E}[f(x)])^2] \)
  3. Irreducible Error: \( \sigma^2 \)

The total error \( \text{Error}(x) \) can be expressed as:

\[ \text{Error}(x) = \text{Bias}^2(x) + \text{Variance}(x) + \sigma^2 \]

where:

  • \( f(x) \) is the predicted value from the model.
  • \( f^*(x) \) is the true value of the function.
  • \( \sigma^2 \) is the variance of the noise.
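These three terms can also be estimated empirically. The sketch below is a minimal illustration of that idea, assuming a synthetic setting in which the true function is \( f^*(x) = \sin(x) \) and the noise has standard deviation 0.2 (the same setup as the example in the next section): it repeatedly resamples training sets, fits a polynomial regression of a chosen degree, and averages the resulting predictions to approximate the bias squared and variance terms. The helper name estimate_bias_variance and the specific constants are illustrative choices, not a standard API.

python:
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
sigma = 0.2                                 # noise standard deviation (assumed known here)
x_eval = np.linspace(0, 5, 50)[:, None]     # fixed evaluation points
f_true = np.sin(x_eval).ravel()             # true function f*(x)

def estimate_bias_variance(degree, n_repeats=200, n_samples=80):
    # Estimate average bias^2 and variance by resampling training sets
    preds = np.empty((n_repeats, len(x_eval)))
    for i in range(n_repeats):
        # Draw a fresh training set from the same data-generating process
        X = np.sort(5 * rng.rand(n_samples, 1), axis=0)
        y = np.sin(X).ravel() + rng.normal(0, sigma, n_samples)
        poly = PolynomialFeatures(degree=degree)
        model = LinearRegression().fit(poly.fit_transform(X), y)
        preds[i] = model.predict(poly.transform(x_eval))
    avg_pred = preds.mean(axis=0)                 # approximates E[f(x)]
    bias_sq = ((avg_pred - f_true) ** 2).mean()   # average Bias^2(x)
    variance = preds.var(axis=0).mean()           # average Variance(x)
    return bias_sq, variance

for degree in [1, 3, 10]:
    b2, var = estimate_bias_variance(degree)
    print(f'Degree {degree:2d}: bias^2 ~ {b2:.4f}, variance ~ {var:.4f}, '
          f'expected error ~ {b2 + var + sigma**2:.4f}')

With this setup, low degrees tend to show a larger bias term and high degrees a larger variance term, mirroring the decomposition above.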

3. Illustrative Example in Python

Let's create a simple example to illustrate the bias-variance tradeoff using polynomial regression.

python:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Generate synthetic data
np.random.seed(0)
X = np.sort(5 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.2, X.shape[0])

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Function to plot polynomial regression
def plot_polynomial_regression(degree):
    poly = PolynomialFeatures(degree=degree)
    X_poly = poly.fit_transform(X_train)
    model = LinearRegression()
    model.fit(X_poly, y_train)

    X_fit = np.linspace(0, 5, 100)[:, np.newaxis]
    X_fit_poly = poly.transform(X_fit)
    y_fit = model.predict(X_fit_poly)

    plt.scatter(X, y, color='blue', label='Data')
    plt.plot(X_fit, y_fit, color='red', label=f'Degree {degree}')
    plt.title(f'Polynomial Regression (Degree {degree})')
    plt.xlabel('X')
    plt.ylabel('y')
    plt.legend()
    plt.show()

    # Predict and calculate mean squared error
    X_test_poly = poly.transform(X_test)
    y_pred = model.predict(X_test_poly)
    mse = mean_squared_error(y_test, y_pred)
    print(f'Degree {degree} - Mean Squared Error: {mse:.4f}')

# Plot for different polynomial degrees
for degree in [1, 3, 10]:
    plot_polynomial_regression(degree)

4. Analysis of Results

  • Degree 1 Polynomial (Linear Regression): This will likely show high bias and low variance, resulting in underfitting. The model cannot capture the curvature of the underlying sine function.
  • Degree 3 Polynomial: This intermediate complexity typically provides a better fit to the data, striking a good compromise between bias and variance.
  • Degree 10 Polynomial: This often yields low bias but high variance, resulting in overfitting. The model fits the training data very well but performs poorly on unseen data; a degree sweep that makes this contrast explicit is sketched after this list.
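
To see this numerically, the short sketch below sweeps over polynomial degrees and prints training versus test error. It is a minimal illustration that assumes the variables X_train, X_test, y_train, and y_test (and the imports) from the script above are still in scope; with this data, training error typically keeps shrinking as the degree grows while test error eventually rises again.

python:
# Assumes X_train, X_test, y_train, y_test and the imports from the script above
for degree in range(1, 13):
    poly = PolynomialFeatures(degree=degree)
    X_train_poly = poly.fit_transform(X_train)
    X_test_poly = poly.transform(X_test)

    model = LinearRegression().fit(X_train_poly, y_train)

    train_mse = mean_squared_error(y_train, model.predict(X_train_poly))
    test_mse = mean_squared_error(y_test, model.predict(X_test_poly))
    print(f'Degree {degree:2d}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}')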

5. Conclusion

The bias-variance tradeoff is a fundamental concept in machine learning that helps in selecting the right model complexity. By understanding and managing this tradeoff, one can build models that generalize well to new data, providing accurate and reliable predictions.
