Machine Learning (Chapter 13): Partial Least Squares

Partial Least Squares (PLS) is a powerful statistical technique used in machine learning and data analysis for modeling complex relationships between variables. It is particularly useful when dealing with high-dimensional datasets where traditional methods may struggle. This article provides an overview of PLS, including the mathematics behind it, and demonstrates its implementation with Python code.

Understanding Partial Least Squares (PLS)

PLS is a regression technique that models the relationship between two matrices: one containing the predictors (X) and the other containing the responses (Y). Unlike ordinary least squares (OLS) regression, which becomes unstable or undefined when predictors are highly correlated or outnumber the observations, PLS handles multicollinearity and high-dimensional data gracefully.

Key Concepts

  1. Latent Variables: PLS identifies latent (hidden) variables: linear combinations of the original predictors chosen to covary as strongly as possible with the responses. A small number of these latent variables can often summarize the predictive information in a much larger set of correlated predictors.

  2. Orthogonal Transformation: PLS constructs a new set of orthogonal score vectors as a basis, so that successive components capture distinct, non-overlapping parts of the predictor-response relationship. A small numerical sketch of the first latent variable follows this list.
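
To make the first concept concrete, below is a minimal NumPy sketch of the first latent variable for a single response (the variable names are illustrative, not from any library). For mean-centered data, the first weight vector is proportional to X^T y, and the score t1 = X w1 is the linear combination of predictors that covaries most strongly with y.

python:
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 5))       # 100 samples, 5 predictors
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + 0.1 * rng.standard_normal(100)

# Mean-center both X and y (PLS assumes centered data)
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# First PLS weight vector: the direction in predictor space
# with maximal covariance with the response
w1 = Xc.T @ yc
w1 /= np.linalg.norm(w1)

# First latent variable (score vector): a linear combination of the predictors
t1 = Xc @ w1
print("Covariance of t1 with y:", np.cov(t1, yc)[0, 1])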

Mathematical Formulation

The PLS algorithm aims to find linear combinations of the predictors (X) and responses (Y) that maximize the covariance between them. The model can be summarized as follows:

  1. Model Definition:

    \mathbf{X} = \mathbf{T} \mathbf{P}^{T} + \mathbf{E}, \qquad \mathbf{Y} = \mathbf{U} \mathbf{Q}^{T} + \mathbf{F}

    where:

    • \mathbf{X} is the matrix of predictors.
    • \mathbf{Y} is the matrix of responses.
    • \mathbf{T} and \mathbf{U} are matrices of latent variables (scores).
    • \mathbf{P} and \mathbf{Q} are matrices of loadings.
    • \mathbf{E} and \mathbf{F} are matrices of residuals.
  2. Objective: Maximize the covariance between the score matrices \mathbf{T} and \mathbf{U}:

    \max \operatorname{Cov}(\mathbf{T}, \mathbf{U})
  3. Algorithm (a minimal sketch follows this list):

    • Compute the latent variables \mathbf{T} and \mathbf{U} by finding the directions along which \mathbf{X} and \mathbf{Y} vary together.
    • Extract the loadings \mathbf{P} and \mathbf{Q} by regressing \mathbf{X} and \mathbf{Y} onto the scores.
    • Iterate until the scores converge, then deflate \mathbf{X} and \mathbf{Y} and repeat for the next component.
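
This iteration is the classic NIPALS algorithm. Below is a minimal didactic sketch of extracting one component, assuming mean-centered X and Y; the function and variable names are illustrative, and this is not scikit-learn's internal implementation.

python:
import numpy as np

def nipals_component(X, Y, n_iter=500, tol=1e-10):
    """Extract one PLS component from mean-centered X (n x p) and Y (n x m)."""
    u = Y[:, [0]]                      # initialize the Y-score with a column of Y
    for _ in range(n_iter):
        w = X.T @ u                    # X weights: direction of maximal covariance
        w /= np.linalg.norm(w)
        t = X @ w                      # X scores (a column of T)
        q = Y.T @ t / (t.T @ t)        # Y loadings
        u_new = Y @ q / (q.T @ q)      # Y scores (a column of U)
        if np.linalg.norm(u_new - u) < tol:
            u = u_new
            break
        u = u_new
    p = X.T @ t / (t.T @ t)            # X loadings
    # Deflate: remove this component's contribution before extracting the next
    X_next = X - t @ p.T
    Y_next = Y - t @ q.T
    return t, u, p, q, X_next, Y_next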

Python Implementation

We can use the scikit-learn library to implement PLS regression. Below is an example demonstrating how to perform PLS regression using Python.

Example Code

python:
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load dataset
data = load_diabetes()
X = data.data
y = data.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Initialize and fit PLS model
pls = PLSRegression(n_components=2)
pls.fit(X_train, y_train)

# Make predictions
y_pred = pls.predict(X_test)

# Evaluate model
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")

# Display the coefficients
print("PLS Coefficients:")
print(pls.coef_)

Explanation of the Code

  1. Data Preparation: We load the diabetes dataset via load_diabetes from sklearn.datasets and split it into training and testing sets.

  2. Model Initialization: We create an instance of PLSRegression with 2 components. The number of components is a hyperparameter that controls model complexity; a sketch of choosing it by cross-validation follows this list.

  3. Model Fitting: We fit the PLS model to the training data.

  4. Prediction and Evaluation: We use the model to make predictions on the test set and evaluate the performance using mean squared error (MSE).

  5. Coefficients: We print the coefficients of the PLS model to understand the influence of each predictor on the response.
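
Because the number of components is the main hyperparameter, in practice it is usually tuned rather than fixed at 2. A minimal sketch using scikit-learn's GridSearchCV, reusing X_train and y_train from the example above:

python:
from sklearn.model_selection import GridSearchCV

# Search over the number of PLS components with 5-fold cross-validation
param_grid = {"n_components": list(range(1, X_train.shape[1] + 1))}
search = GridSearchCV(PLSRegression(), param_grid,
                      scoring="neg_mean_squared_error", cv=5)
search.fit(X_train, y_train)

print("Best n_components:", search.best_params_["n_components"])
print("Cross-validated MSE:", -search.best_score_)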

Conclusion

Partial Least Squares (PLS) is a versatile tool for regression and dimensionality reduction, especially useful in scenarios with high-dimensional data or multicollinearity. By transforming predictors and responses into latent variables, PLS finds the directions of maximum covariance and provides a robust approach to modeling complex relationships. The provided Python example demonstrates how to implement PLS regression using scikit-learn and highlights its application in real-world data analysis.
