Machine Learning (Chapter 23): ANN I - Early Models
Introduction to Artificial Neural Networks (ANNs)
Artificial Neural Networks (ANNs) are computational models inspired by the structure and function of the human brain. They consist of interconnected layers of nodes (neurons) that process information in a manner loosely analogous to biological neurons. This chapter covers the early models of ANNs, focusing on their mathematical foundations, their structure, and simple implementations in Python.
Early Models of ANNs
The early models of ANNs, such as the Perceptron, Adaline (Adaptive Linear Neuron), and Multilayer Perceptron (MLP), laid the groundwork for modern neural networks.
1. The Perceptron Model
The Perceptron is one of the earliest and simplest forms of a neural network. It consists of a single layer of neurons, where each neuron receives multiple inputs, applies a linear combination, and passes the result through an activation function to produce an output.
Mathematical Formulation
Consider a Perceptron with inputs $x_1, x_2, \ldots, x_n$ and corresponding weights $w_1, w_2, \ldots, w_n$. The output $y$ is given by:

$$y = \phi\left(\sum_{i=1}^{n} w_i x_i + b\right)$$

Where:
- $b$ is the bias term.
- $\phi$ is the Heaviside step function, so the output $y$ is binary: $\phi(z) = 1$ if $z \geq 0$, and $\phi(z) = 0$ otherwise.
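Training uses the classic Perceptron learning rule, which the implementation below follows: after each training sample, the weights are nudged in proportion to the prediction error,

$$w_i \leftarrow w_i + \eta\,(y - \hat{y})\,x_i, \qquad b \leftarrow b + \eta\,(y - \hat{y})$$

where $\eta$ is the learning rate and $\hat{y}$ is the current prediction. When the prediction is correct, $y - \hat{y} = 0$ and the weights are left unchanged.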
Python Implementation
Below is an example implementation of a simple Perceptron using Python:
python:
import numpy as np

class Perceptron:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.activation_func = self._unit_step_func
        self.weights = None
        self.bias = None

    def _unit_step_func(self, x):
        # Heaviside step: 1 if x >= 0, else 0
        return np.where(x >= 0, 1, 0)

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0

        for _ in range(self.n_iters):
            for idx, x_i in enumerate(X):
                linear_output = np.dot(x_i, self.weights) + self.bias
                y_predicted = self.activation_func(linear_output)
                # Perceptron learning rule: update only when the prediction is wrong
                update = self.lr * (y[idx] - y_predicted)
                self.weights += update * x_i
                self.bias += update

    def predict(self, X):
        linear_output = np.dot(X, self.weights) + self.bias
        y_predicted = self.activation_func(linear_output)
        return y_predicted

# Example usage:
if __name__ == "__main__":
    # AND logic gate inputs and outputs
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 0, 0, 1])
    perceptron = Perceptron(learning_rate=0.1, n_iters=10)
    perceptron.fit(X, y)
    predictions = perceptron.predict(X)
    print("Predictions:", predictions)
2. The Adaline Model
Adaline, or Adaptive Linear Neuron, is another early ANN model. Unlike the Perceptron, Adaline uses a linear activation function and employs a mean squared error (MSE) cost function for training.
Mathematical Formulation
Given inputs $x_1, \ldots, x_n$ and weights $w_1, \ldots, w_n$, the Adaline model produces an output $\hat{y}$ as:

$$\hat{y} = \sum_{i=1}^{n} w_i x_i + b$$

The cost function (MSE) is defined as:

$$J(w, b) = \frac{1}{2} \sum_{j} \left(y^{(j)} - \hat{y}^{(j)}\right)^2$$

Where $\hat{y}^{(j)}$ is the predicted output and $y^{(j)}$ is the actual target value for the $j$-th training sample.
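Minimizing $J$ by gradient descent yields the Widrow-Hoff (delta) rule, which the implementation below applies in batch form over the whole training set:

$$w_i \leftarrow w_i + \eta \sum_{j} \left(y^{(j)} - \hat{y}^{(j)}\right) x_i^{(j)}, \qquad b \leftarrow b + \eta \sum_{j} \left(y^{(j)} - \hat{y}^{(j)}\right)$$

Unlike the Perceptron, the error here is measured on the continuous linear output rather than the thresholded prediction, so the weights keep adjusting even for samples that are already classified correctly.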
Python Implementation
Below is an example implementation of Adaline using Python:
python:
import numpy as np

class Adaline:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0
        self.cost_ = []

        for _ in range(self.n_iters):
            # Linear activation: the net input itself is the output
            linear_output = np.dot(X, self.weights) + self.bias
            errors = y - linear_output
            # Batch gradient descent step on the MSE cost
            self.weights += self.lr * np.dot(X.T, errors)
            self.bias += self.lr * errors.sum()
            cost = (errors**2).sum() / 2.0
            self.cost_.append(cost)

    def predict(self, X):
        linear_output = np.dot(X, self.weights) + self.bias
        return linear_output

# Example usage:
if __name__ == "__main__":
    # Simple dataset
    X = np.array([[1, 1], [2, 2], [3, 3]])
    y = np.array([1, 2, 3])
    adaline = Adaline(learning_rate=0.01, n_iters=10)
    adaline.fit(X, y)
    predictions = adaline.predict(X)
    print("Predictions:", predictions)
3. Multilayer Perceptron (MLP)
The Multilayer Perceptron (MLP) extends the concept of the Perceptron by introducing multiple layers of neurons, including hidden layers. Each neuron in a layer is connected to every neuron in the next layer, enabling the network to learn complex, non-linear relationships.
Mathematical Formulation
For an MLP with one hidden layer:
- Let $\mathbf{x}$ be the input vector.
- $W^{(1)}$ and $W^{(2)}$ are weight matrices for the first (input to hidden) and second (hidden to output) layers, respectively.
- $\mathbf{b}^{(1)}$ and $\mathbf{b}^{(2)}$ are bias vectors for the hidden and output layers, respectively.

The output $\hat{y}$ is calculated as:

$$\hat{y} = \phi\left(W^{(2)}\,\phi\left(W^{(1)}\mathbf{x} + \mathbf{b}^{(1)}\right) + \mathbf{b}^{(2)}\right)$$

Where $\phi$ is the activation function, often a non-linear function such as ReLU or sigmoid.
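Before turning to scikit-learn, here is a minimal NumPy sketch of this forward pass, assuming a sigmoid activation; the names sigmoid, mlp_forward, W1, b1, W2, and b2 are chosen for this illustration, and the weights are random, untrained values:

python:
import numpy as np

def sigmoid(z):
    # Logistic activation: maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W1, b1, W2, b2):
    # One-hidden-layer forward pass: phi(W2 @ phi(W1 @ x + b1) + b2)
    hidden = sigmoid(W1 @ x + b1)      # hidden-layer activations
    return sigmoid(W2 @ hidden + b2)   # output-layer activation

# Example: 2 inputs, 3 hidden units, 1 output
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))
b1 = np.zeros(3)
W2 = rng.normal(size=(1, 3))
b2 = np.zeros(1)

x = np.array([0.5, -0.2])
print(mlp_forward(x, W1, b1, W2, b2))

Training such a network requires backpropagation to push the error gradient through the hidden layer, which is what libraries like scikit-learn handle internally.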
Python Implementation
Below is an example implementation of a simple MLP using Python and the scikit-learn library:
python:
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Create dataset
X, y = make_moons(n_samples=100, noise=0.2, random_state=42)
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Define the MLP model
mlp = MLPClassifier(hidden_layer_sizes=(5,), max_iter=1000, learning_rate_init=0.01)
# Train the model
mlp.fit(X_train, y_train)
# Make predictions
y_pred = mlp.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
Conclusion
Early ANN models such as the Perceptron, Adaline, and the MLP played a crucial role in the development of modern deep learning. Understanding these foundational models makes the complexities of advanced neural network architectures easier to grasp. The examples above illustrate their basic concepts and implementations, paving the way for further exploration of more sophisticated neural networks.
