Machine Learning (Chapter 5): Statistical Decision Theory - Regression

Introduction

Statistical Decision Theory is a framework used to make decisions under uncertainty. In the context of machine learning, it helps in choosing the best model or hypothesis given the data and associated risks or costs. Regression, a fundamental concept in machine learning, deals with predicting a continuous output based on input features. In this article, we explore regression within the framework of Statistical Decision Theory, focusing on the mathematical foundations and practical implementation in Python.

1. Basics of Statistical Decision Theory

Statistical Decision Theory involves selecting a decision function $\delta(X)$ that minimizes a loss function $L(Y, \delta(X))$, where $Y$ is the true output and $X$ is the input feature vector. The objective is to minimize the expected loss, also known as the risk, given by:

$$R(\delta) = \mathbb{E}[L(Y, \delta(X))]$$

In regression, the most common loss function is the squared loss:

$$L(Y, \delta(X)) = (Y - \delta(X))^2$$
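Since the true joint distribution of $(X, Y)$ is rarely known, the risk is typically estimated by averaging the loss over samples. Below is a minimal sketch of this idea, assuming a made-up data-generating process and a hand-picked decision function:

python:

import numpy as np

rng = np.random.default_rng(0)

# Assumed data-generating process: Y = 4 + 3X + unit-variance Gaussian noise
X = 2 * rng.random(10_000)
Y = 4 + 3 * X + rng.normal(size=X.shape)

def delta(x):
    # A candidate decision function (here, the true regression line)
    return 4 + 3 * x

# Empirical risk: the average squared loss over the sample.
# With unit-variance noise it should come out close to 1.0.
empirical_risk = np.mean((Y - delta(X)) ** 2)
print("Estimated risk R(delta):", empirical_risk)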

2. Regression and Statistical Decision Theory

In regression, our goal is to predict a continuous variable $Y$ based on the input variables $X$. According to Statistical Decision Theory, the optimal decision function $\delta(X)$ that minimizes the expected squared loss (risk) is the conditional expectation of $Y$ given $X$:

$$\delta(X) = \mathbb{E}[Y \mid X]$$

For example, a linear regression model approximates this conditional expectation by a linear combination of the input features:

$$\delta(X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$$

where $\beta_0, \beta_1, \dots, \beta_p$ are the coefficients of the model.
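To see the optimality of the conditional mean in action, here is a short simulation, again under an assumed linear data-generating process: it compares the empirical risk of $\mathbb{E}[Y \mid X]$ with that of a deliberately mis-specified predictor, and the conditional mean attains the lower squared loss:

python:

import numpy as np

rng = np.random.default_rng(42)

# Assumed data-generating process: E[Y|X] = 4 + 3X
X = 2 * rng.random(100_000)
Y = 4 + 3 * X + rng.normal(size=X.shape)

cond_mean = 4 + 3 * X          # the optimal predictor E[Y|X]
misspecified = 4.5 + 2.5 * X   # an arbitrary competing predictor

# The conditional mean's risk is close to the noise variance (1.0);
# any other predictor adds its squared bias on top of that.
print("Risk of E[Y|X]:             ", np.mean((Y - cond_mean) ** 2))
print("Risk of competing predictor:", np.mean((Y - misspecified) ** 2))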

3. Mathematical Derivation

Given a set of data points $\{(X_i, Y_i)\}_{i=1}^{n}$, we aim to estimate the coefficients $\beta$ by minimizing the empirical risk, which for squared loss is the sum of squared errors:

$$\hat{\beta} = \underset{\beta}{\operatorname{argmin}} \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_{i1} - \dots - \beta_p X_{ip} \right)^2$$

Provided $X^T X$ is invertible, this has a closed-form solution given by the normal equation:

$$\hat{\beta} = (X^T X)^{-1} X^T Y$$

where $X$ is the $n \times (p+1)$ design matrix of input features (with a leading column of ones for the intercept term) and $Y$ is the vector of outputs.
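For completeness, the normal equation follows from writing the sum of squared errors in matrix form and setting its gradient with respect to $\beta$ to zero:

$$\text{RSS}(\beta) = (Y - X\beta)^T (Y - X\beta), \qquad \nabla_\beta \text{RSS}(\beta) = -2 X^T (Y - X\beta) = 0$$

which rearranges to $X^T X \hat{\beta} = X^T Y$, and hence $\hat{\beta} = (X^T X)^{-1} X^T Y$.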

4. Python Implementation

Let's implement a simple linear regression model in Python, first computing the coefficients manually with the normal equation in NumPy, and then using Scikit-Learn's LinearRegression for comparison.

python:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Generating some synthetic data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Adding bias term (X_0 = 1)
X_b = np.c_[np.ones((100, 1)), X]

# Manual computation of coefficients using the normal equation
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
print("Calculated coefficients:", theta_best)

# Making predictions
X_new = np.array([[0], [2]])
X_new_b = np.c_[np.ones((2, 1)), X_new]
y_predict = X_new_b.dot(theta_best)

# Plotting the results
plt.plot(X_new, y_predict, "r-", label="Predictions")
plt.plot(X, y, "b.")
plt.xlabel("$x_1$", fontsize=18)
plt.ylabel("$y$", rotation=0, fontsize=18)
plt.legend()
plt.show()

# Using Scikit-Learn
lin_reg = LinearRegression()
lin_reg.fit(X, y)
print("Scikit-Learn coefficients:", lin_reg.intercept_, lin_reg.coef_)

# Scikit-Learn prediction
y_predict_sklearn = lin_reg.predict(X_new)
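One caveat about the approach above: explicitly inverting $X^T X$ can be numerically unstable when features are strongly correlated. Continuing with X_b and y from the block above, here is a sketch of two more robust alternatives, np.linalg.lstsq and the Moore-Penrose pseudoinverse, which solve the same least-squares problem:

python:

# Solve the least-squares problem directly rather than forming the inverse;
# lstsq is more stable when X_b.T @ X_b is ill-conditioned.
theta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X_b, y, rcond=None)
print("lstsq coefficients:", theta_lstsq)

# Equivalent result via the Moore-Penrose pseudoinverse
theta_pinv = np.linalg.pinv(X_b).dot(y)
print("pinv coefficients:", theta_pinv)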

5. Conclusion

Statistical Decision Theory provides a solid foundation for understanding and implementing regression models. By framing regression as a decision problem, we can systematically approach the task of predicting continuous outputs. Under squared loss, the optimal decision rule is the conditional mean $\mathbb{E}[Y \mid X]$, and linear regression is a simple, widely used model for approximating it. The Python implementation illustrates how these concepts can be applied in practice, providing a bridge between theory and real-world applications.

By understanding the theoretical underpinnings and practical applications, one can make informed decisions when selecting and implementing regression models in machine learning tasks.
