Machine Learning (Chapter 9): Multivariate Regression
Introduction
Multivariate regression is an extension of simple linear regression that deals with multiple independent variables. It aims to model the relationship between two or more features and a dependent variable. This chapter explores the concept, mathematical formulation, and practical implementation of multivariate regression.
Mathematical Formulation
In multivariate regression, we predict the dependent variable y using multiple independent variables x₁, x₂, …, xₙ. The relationship can be described using the following linear equation:

y = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ + ε

where:
- y is the dependent variable.
- β₀ is the intercept.
- β₁, β₂, …, βₙ are the coefficients of the independent variables x₁, x₂, …, xₙ.
- ε is the error term.
The goal is to estimate the coefficients β₀, β₁, …, βₙ such that the sum of squared errors (or residuals) between the predicted values and the actual values is minimized.
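This minimization has a direct numerical solution as well. The sketch below, using made-up toy data (not from the chapter), shows how NumPy's least-squares solver finds coefficients that minimize the sum of squared residuals:

```python
import numpy as np

# Toy data: two features, four samples (illustrative values only)
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Prepend a column of ones so the intercept beta_0 is estimated
# alongside the feature coefficients
X_design = np.hstack([np.ones((X.shape[0], 1)), X])

# lstsq returns the coefficients minimizing the sum of squared residuals
beta, residuals, rank, sv = np.linalg.lstsq(X_design, y, rcond=None)
print(beta)  # [beta_0, beta_1, beta_2]
```

Because this toy data is perfectly linear, the fitted coefficients reproduce y exactly; on noisy data the residuals would be nonzero but minimal in the squared-error sense.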
Cost Function
The cost function for multivariate regression, based on the Mean Squared Error (MSE), is given by:

J(β) = (1 / 2m) Σᵢ₌₁ᵐ (h_β(x⁽ⁱ⁾) − y⁽ⁱ⁾)²

where:
- m is the number of training examples.
- h_β(x⁽ⁱ⁾) is the hypothesis function (i.e., the predicted value) for the i-th training example.
- y⁽ⁱ⁾ is the actual value for the i-th training example.

(The factor of ½ is a common convention; it cancels when the cost is differentiated.)

The hypothesis function in multivariate regression is:

h_β(x) = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ
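With a leading 1 attached to each feature vector, the hypothesis is just a dot product, and the cost is an average of squared errors. A minimal sketch with hypothetical parameter values (chosen for illustration, not from the chapter):

```python
import numpy as np

# Hypothetical parameters and one augmented example: the leading 1
# pairs with the intercept beta_0
beta = np.array([1.0, 2.0, 0.0])   # [beta_0, beta_1, beta_2]
x = np.array([1.0, 3.0, 4.0])      # [1, x_1, x_2]

# Hypothesis h(x) = beta_0 + beta_1*x_1 + beta_2*x_2 as a dot product
h = beta @ x
print(h)  # 1*1 + 2*3 + 0*4 = 7.0

# Cost J over a small batch of predictions vs. actual values
y_pred = np.array([3.0, 5.0, 7.0])
y_true = np.array([3.1, 4.8, 7.2])
cost = np.mean((y_pred - y_true) ** 2) / 2   # includes the 1/2 factor
```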
Gradient Descent
To minimize the cost function, we use gradient descent. The update rule for the coefficients is:

βⱼ := βⱼ − α · ∂J(β)/∂βⱼ

where α is the learning rate, and ∂J(β)/∂βⱼ is the partial derivative of the cost function with respect to βⱼ:

∂J(β)/∂βⱼ = (1 / m) Σᵢ₌₁ᵐ (h_β(x⁽ⁱ⁾) − y⁽ⁱ⁾) xⱼ⁽ⁱ⁾

All coefficients are updated simultaneously on each iteration.
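The update rule can be sketched as batch gradient descent in a few lines of NumPy. The data, learning rate, and iteration count below are illustrative choices, not values from the chapter:

```python
import numpy as np

# Toy data (illustrative): two features, four samples
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

m, n = X.shape
Xd = np.hstack([np.ones((m, 1)), X])   # design matrix with intercept column
beta = np.zeros(n + 1)                 # initialize all coefficients at zero
alpha = 0.05                           # learning rate (assumed value)

for _ in range(10000):
    errors = Xd @ beta - y             # h(x^(i)) - y^(i) for every example
    gradient = (Xd.T @ errors) / m     # vector of partial derivatives of J
    beta -= alpha * gradient           # simultaneous update of all beta_j

print(beta)  # converges toward a least-squares solution
```

Vectorizing the update this way applies the partial-derivative formula to every coefficient at once, which is both faster and closer to the "simultaneous update" the algorithm requires.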
Python Implementation
Below is a Python implementation of multivariate regression using scikit-learn and numpy.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
# Example data
# Features: [x1, x2]
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
# Target variable
y = np.array([3, 5, 7, 9])
# Create and fit the model
model = LinearRegression()
model.fit(X, y)
# Predictions
y_pred = model.predict(X)
# Coefficients
intercept = model.intercept_
coefficients = model.coef_
print(f'Intercept: {intercept}')
print(f'Coefficients: {coefficients}')
# Model performance
mse = mean_squared_error(y, y_pred)
print(f'Mean Squared Error: {mse}')
Explanation
- Data Preparation: We define our feature matrix X and target vector y.
- Model Training: We create an instance of LinearRegression and fit it to our data.
- Predictions: We use the model to predict the target values for our input features.
- Evaluation: We calculate the Mean Squared Error to assess the model’s performance.
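The fitted model can also predict targets for inputs it has not seen. The new feature pairs below are hypothetical, chosen to continue the same pattern as the training data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Refit the model on the chapter's data
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([3, 5, 7, 9])
model = LinearRegression().fit(X, y)

# Predict for unseen inputs (hypothetical values following the same pattern)
X_new = np.array([[5, 6], [6, 7]])
print(model.predict(X_new))  # continues the linear trend: [11. 13.]
```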
Conclusion
Multivariate regression allows us to model complex relationships between multiple features and a target variable. By understanding the mathematical foundation and applying it through practical implementation, we can make accurate predictions and gain insights from our data.