Machine Learning (Chapter 18): SVM - Formulation
Introduction to Support Vector Machines (SVM)
Support Vector Machines (SVMs) are powerful supervised learning models used for classification and regression tasks. The core idea of an SVM is to find the hyperplane that separates the data into classes with the largest possible margin. In this chapter, we'll explore the mathematical formulation of SVMs, discuss how they work, and implement a simple example in Python.
Mathematical Formulation
1. The SVM Problem
Consider a dataset of data points $(x_i, y_i)$, $i = 1, \dots, n$, where $x_i \in \mathbb{R}^d$ is the feature vector and $y_i \in \{-1, +1\}$ is the class label. The goal of SVM is to find a hyperplane that maximizes the margin between the two classes.
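To make the notation concrete, a tiny linearly separable dataset with labels in $\{-1, +1\}$ might look like this in NumPy (the values here are purely illustrative):

import numpy as np

# Four 2-D feature vectors x_i and their class labels y_i in {-1, +1}
X = np.array([[2.0, 2.0],
              [3.0, 3.0],
              [-2.0, -1.0],
              [-3.0, -2.0]])
y = np.array([1, 1, -1, -1])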
2. Hyperplane Equation
A hyperplane in $d$-dimensional space can be represented as:

$$w^T x + b = 0$$

where $w \in \mathbb{R}^d$ is the weight vector and $b$ is the bias term.
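A point is classified by the sign of $w^T x + b$. Here is a minimal sketch of that decision rule (the values of w and b are illustrative, not learned):

import numpy as np

w = np.array([1.0, -1.0])  # weight vector (illustrative values)
b = 0.5                    # bias term

def classify(x):
    # Predict +1 if x lies on the positive side of the hyperplane, else -1
    return 1 if np.dot(w, x) + b >= 0 else -1

print(classify(np.array([2.0, 0.0])))   # +1, since w.x + b = 2.5 > 0
print(classify(np.array([-2.0, 0.0])))  # -1, since w.x + b = -1.5 < 0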
3. Margin and Support Vectors
The margin is defined as the distance between the hyperplane and the closest data points from either class; these closest points are called the support vectors. The width of the margin is $2 / \|w\|$, so maximizing the margin is equivalent to minimizing $\|w\|$, or more conveniently $\frac{1}{2}\|w\|^2$. We therefore solve the following optimization problem:

$$\min_{w,\, b} \; \frac{1}{2} \|w\|^2$$

subject to the constraints:

$$y_i \left( w^T x_i + b \right) \geq 1, \quad i = 1, \dots, n.$$
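As a quick numerical check (a sketch with made-up values of w and b), the distance from a point $x$ to the hyperplane is $|w^T x + b| / \|w\|$, and a support vector, which satisfies $y (w^T x + b) = 1$, sits at distance exactly $1 / \|w\|$:

import numpy as np

w = np.array([2.0, 0.0])  # illustrative weight vector
b = -1.0                  # illustrative bias

def geometric_margin(x, y_label):
    # Signed distance from x to the hyperplane; positive when correctly classified
    return y_label * (np.dot(w, x) + b) / np.linalg.norm(w)

x_sv = np.array([1.0, 0.0])           # satisfies y (w.x + b) = 1
print(geometric_margin(x_sv, 1))      # 0.5, which equals 1 / ||w||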
4. Optimization Problem
The above problem is a convex optimization problem and can be solved with Lagrange multipliers. The Lagrangian function is:

$$L(w, b, \alpha) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{n} \alpha_i \left[ y_i \left( w^T x_i + b \right) - 1 \right]$$

where $\alpha_i \geq 0$ are the Lagrange multipliers. Setting the derivatives of $L$ with respect to $w$ and $b$ to zero and substituting back yields the dual problem:

$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^T x_j$$

subject to:

$$\alpha_i \geq 0, \quad i = 1, \dots, n, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0.$$
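To see the dual in action, here is a minimal sketch that solves it numerically with scipy.optimize.minimize on a toy dataset. The dataset and variable names are illustrative, and a production implementation would use a dedicated quadratic programming solver instead:

import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data (illustrative)
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
n = len(y)

# Matrix of the dual objective: Q[i, j] = y_i y_j x_i^T x_j
Q = (y[:, None] * X) @ (y[:, None] * X).T

def neg_dual(alpha):
    # Negate because scipy minimizes; the dual maximizes sum(alpha) - 0.5 alpha^T Q alpha
    return 0.5 * alpha @ Q @ alpha - alpha.sum()

constraints = {"type": "eq", "fun": lambda a: a @ y}  # sum_i alpha_i y_i = 0
bounds = [(0, None)] * n                              # alpha_i >= 0

res = minimize(neg_dual, np.zeros(n), bounds=bounds, constraints=constraints)
alpha = res.x

# Recover the primal solution: w = sum_i alpha_i y_i x_i
w = (alpha * y) @ X
sv = np.argmax(alpha)            # index of a support vector (largest multiplier)
b = y[sv] - np.dot(w, X[sv])     # from y_sv (w.x_sv + b) = 1 with y_sv in {-1, +1}
print("alpha:", alpha.round(3), "w:", w.round(3), "b:", round(b, 3))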
Python Implementation
Let's implement a simple example using Python's Scikit-learn library to apply SVM to a classification problem.
Example: SVM on the Iris Dataset
import numpy as np
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data[:, :2]  # use only the first two features so the boundary can be plotted in 2-D
y = iris.target
# We will use only two classes for simplicity
X = X[y != 2]
y = y[y != 2]
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Create and train the SVM model
clf = svm.SVC(kernel='linear', C=1.0)
clf.fit(X_train, y_train)
# Make predictions
y_pred = clf.predict(X_test)
# Print the classification report and accuracy
print("Classification Report:\n", classification_report(y_test, y_pred))
print("Accuracy Score:", accuracy_score(y_test, y_pred))
# Plotting decision boundary
def plot_decision_boundary(X, y, model):
    h = 0.02  # step size in the mesh
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    # Classify every mesh point to colour the two decision regions
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.8)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor='k', marker='o')
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.title('SVM Decision Boundary')
    plt.show()
# Plot decision boundary for the trained model
plot_decision_boundary(X_test, y_test, clf)
Explanation of the Code
- Loading and Preparing Data: The Iris dataset is loaded; only the first two features and two of the three classes are kept, so the problem is binary and easy to plot.
- Splitting Data: The dataset is split into training and test sets.
- Standardization: Features are standardized to have mean 0 and variance 1.
- Training the Model: An SVM model with a linear kernel is trained on the data.
- Evaluation: The model's performance is evaluated with an accuracy score and a classification report.
- Visualization: The decision boundary of the SVM is plotted to visualize the separation between classes.
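Because the Lagrange multipliers $\alpha_i$ are nonzero only for the support vectors, it is instructive to inspect them after fitting. A short sketch, assuming the clf, X_train, and y_train objects from the example above:

# Support vectors are the training points with nonzero Lagrange multipliers
print("Support vectors per class:", clf.n_support_)
print("Support vectors:\n", clf.support_vectors_)

# dual_coef_ holds y_i * alpha_i for each support vector
print("y_i * alpha_i:\n", clf.dual_coef_)

# Highlight the support vectors on top of the training data
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, edgecolor='k')
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
            s=120, facecolors='none', edgecolors='r', label='support vectors')
plt.legend()
plt.show()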
Conclusion
Support Vector Machines are a robust and effective tool for classification problems. By understanding the mathematical formulation and implementing SVM in Python, you can leverage its power to solve various machine learning tasks. The example provided demonstrates a practical application of SVM and helps visualize how it separates data into different classes.
