Machine Learning (Chapter 16): Linear Discriminant Analysis (LDA)

Linear Discriminant Analysis (LDA) is a powerful technique in machine learning used for both classification and dimensionality reduction. It is especially useful for multi-class datasets where you want to reduce the number of features while preserving as much of the class-discriminatory information as possible.

Mathematical Formulation

The core idea behind LDA is to find a linear combination of features that best separates the classes. The goal is to project the data onto a lower-dimensional space where the classes are as distinct as possible.

Let's break down the mathematics behind LDA:

  1. Compute the Within-Class Scatter Matrix $S_W$: For each class $k$, compute the scatter matrix $S_{W_k}$:

    $$S_{W_k} = \sum_{x \in D_k} (x - \mu_k)(x - \mu_k)^T$$

    where $D_k$ is the set of data points in class $k$, and $\mu_k$ is the mean vector of class $k$.

    The total within-class scatter matrix is:

    $$S_W = \sum_{k=1}^{K} S_{W_k}$$

    where $K$ is the number of classes.

  2. Compute the Between-Class Scatter Matrix $S_B$: Compute the scatter matrix between classes as:

    $$S_B = \sum_{k=1}^{K} n_k (\mu_k - \mu)(\mu_k - \mu)^T$$

    where $n_k$ is the number of samples in class $k$, $\mu_k$ is the mean vector of class $k$, and $\mu$ is the overall mean vector of all the data.

  3. Solve the Generalized Eigenvalue Problem: To find the linear discriminants, solve the eigenvalue problem:

    $$S_W^{-1} S_B \mathbf{w} = \lambda \mathbf{w}$$

    where $\mathbf{w}$ represents the eigenvectors (discriminant vectors) and $\lambda$ the corresponding eigenvalues.

  4. Project the Data: Use the top $d$ eigenvectors, ranked by eigenvalue, to project the data onto a $d$-dimensional space. Because $S_B$ has rank at most $K - 1$, at most $K - 1$ useful discriminants exist, so $d \le K - 1$:

    $$Y = X\mathbf{W}$$

    where $X$ is the matrix of input features and $\mathbf{W}$ is the matrix whose columns are the top $d$ eigenvectors. A from-scratch NumPy sketch of these four steps follows this list.
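
To make these four steps concrete, here is a minimal from-scratch sketch in NumPy. It is an illustration under simple assumptions rather than a production implementation: the data matrix X has one sample per row, y holds integer class labels, and $S_W$ is assumed to be invertible (in practice a regularized or pseudo-inverse solve is safer). The function and variable names mirror the notation above and are chosen for this example.

python:
import numpy as np

def lda_projection(X, y, d):
    """Project X onto the top-d linear discriminants (d <= K - 1)."""
    classes = np.unique(y)
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)

    # Steps 1 and 2: accumulate the within-class (S_W) and
    # between-class (S_B) scatter matrices
    S_W = np.zeros((n_features, n_features))
    S_B = np.zeros((n_features, n_features))
    for k in classes:
        X_k = X[y == k]
        mu_k = X_k.mean(axis=0)
        S_W += (X_k - mu_k).T @ (X_k - mu_k)
        diff = (mu_k - overall_mean).reshape(-1, 1)
        S_B += X_k.shape[0] * (diff @ diff.T)

    # Step 3: eigen-decompose S_W^{-1} S_B and keep the top-d eigenvectors
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1]   # sort by decreasing eigenvalue
    W = eigvecs[:, order[:d]].real           # top-d discriminant directions

    # Step 4: project the data onto the d discriminant directions
    return X @ W

For example, calling lda_projection(X, y, d=2) on the Iris data (150 samples, four features, three classes) returns a 150 x 2 matrix of discriminant scores.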

Example in Python

Let's apply LDA to a simple dataset using Python. We will use the Iris dataset for demonstration purposes.

python:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the Iris dataset
data = load_iris()
X = data.data
y = data.target

# Standardize features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.3, random_state=42)

# Apply LDA
lda = LinearDiscriminantAnalysis(n_components=2)
X_train_lda = lda.fit_transform(X_train, y_train)
X_test_lda = lda.transform(X_test)

# Train a classifier on the reduced features
clf = LogisticRegression()
clf.fit(X_train_lda, y_train)

# Predict and evaluate
y_pred = clf.predict(X_test_lda)
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')

# Plot the results
plt.figure(figsize=(8, 6))
colors = ['navy', 'turquoise', 'darkorange']
for color, i, target_name in zip(colors, [0, 1, 2], data.target_names):
    plt.scatter(X_train_lda[y_train == i, 0], X_train_lda[y_train == i, 1],
                color=color, alpha=.8, label=target_name)
plt.xlabel('LD1')
plt.ylabel('LD2')
plt.title('LDA: Iris dataset')
plt.legend(loc='best', shadow=False, scatterpoints=1)
plt.show()

Explanation of the Code

  1. Data Loading and Preprocessing:

    • We load the Iris dataset and standardize the features to have zero mean and unit variance.
  2. Splitting the Data:

    • The dataset is split into training and testing sets.
  3. Applying LDA:

    • We apply LDA to reduce the dimensionality to 2 components.
  4. Training a Classifier:

    • A logistic regression classifier is trained on the reduced feature set.
  5. Evaluation:

    • We evaluate the accuracy of the classifier on the test set.
  6. Plotting:

    • We visualize the results to see how well LDA separates the classes in the reduced dimensional space.

By applying LDA, we can achieve a lower-dimensional representation of the data while maintaining class separability, which is useful for both visualization and improving the performance of machine learning models.
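
As a quick sanity check on the scikit-learn example above, the fitted LinearDiscriminantAnalysis object exposes an explained_variance_ratio_ attribute (available with the default 'svd' solver and the 'eigen' solver) that reports the fraction of between-class variance captured by each discriminant. Note also that n_components can be at most min(n_classes - 1, n_features), which is why only two discriminants are available for the three-class Iris data. A minimal sketch, assuming the lda object from the code above has already been fitted:

python:
# Assumes `lda` was fitted as in the example above
print(lda.explained_variance_ratio_)
# Fraction of between-class variance captured by LD1 and LD2;
# for Iris the first discriminant typically dominates.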
