Machine Learning (Chapter 23): ANN I - Early Models

Introduction to Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) are computational models inspired by the human brain's structure and function. They consist of interconnected layers of nodes (neurons) that process information in a manner akin to biological neurons. This chapter delves into the early models of ANNs, particularly focusing on their mathematical foundations, structures, and simple implementations using Python.

Early Models of ANNs

The early models of ANNs, such as the Perceptron, Adaline (Adaptive Linear Neuron), and Multilayer Perceptron (MLP), laid the groundwork for modern neural networks.

1. The Perceptron Model

The Perceptron is one of the earliest and simplest forms of a neural network. It consists of a single layer of neurons, where each neuron receives multiple inputs, applies a linear combination, and passes the result through an activation function to produce an output.

Mathematical Formulation

Consider a Perceptron with $n$ inputs $x_1, x_2, \ldots, x_n$ and corresponding weights $w_1, w_2, \ldots, w_n$. The output $y$ is given by:

$$y = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} w_i x_i + b > 0 \\ 0 & \text{otherwise} \end{cases}$$

Where:

  • $b$ is the bias term.
  • The output is binary, based on the threshold function (Heaviside step function).
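
For example, with one possible choice of parameters, weights $w_1 = w_2 = 1$ and bias $b = -1.5$, the Perceptron implements the logical AND: for input $(1, 1)$ the weighted sum is $1 + 1 - 1.5 = 0.5 > 0$, so $y = 1$, while for $(1, 0)$ it is $1 - 1.5 = -0.5 < 0$, so $y = 0$.
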
Python Implementation

Below is an example implementation of a simple Perceptron using Python:

python:

import numpy as np


class Perceptron:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.activation_func = self._unit_step_func
        self.weights = None
        self.bias = None

    def _unit_step_func(self, x):
        # Heaviside step function: 1 if x >= 0, else 0
        return np.where(x >= 0, 1, 0)

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0

        # Perceptron learning rule: adjust weights and bias whenever a sample is misclassified
        for _ in range(self.n_iters):
            for idx, x_i in enumerate(X):
                linear_output = np.dot(x_i, self.weights) + self.bias
                y_predicted = self.activation_func(linear_output)
                update = self.lr * (y[idx] - y_predicted)
                self.weights += update * x_i
                self.bias += update

    def predict(self, X):
        linear_output = np.dot(X, self.weights) + self.bias
        y_predicted = self.activation_func(linear_output)
        return y_predicted


# Example usage:
if __name__ == "__main__":
    # AND logic gate inputs and outputs
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 0, 0, 1])

    perceptron = Perceptron(learning_rate=0.1, n_iters=10)
    perceptron.fit(X, y)
    predictions = perceptron.predict(X)
    print("Predictions:", predictions)
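
With these settings the Perceptron should converge well within the 10 training epochs and print Predictions: [0 0 0 1], reproducing the AND truth table; convergence is expected here because the AND function is linearly separable.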

2. The Adaline Model

Adaline, or Adaptive Linear Neuron, is another early ANN model. Unlike the Perceptron, which updates its weights from the thresholded (binary) output, Adaline computes its error on the raw linear output and is trained by minimizing a mean squared error (MSE) cost function via gradient descent.

Mathematical Formulation

Given inputs $x_1, x_2, \ldots, x_n$ and weights $w_1, w_2, \ldots, w_n$, the Adaline model produces an output $y$ as:

$$y = \sum_{i=1}^{n} w_i x_i + b$$

The cost function (MSE) is defined as:

$$J(w) = \frac{1}{2} \sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right)^2$$

Where $\hat{y}^{(i)}$ is the predicted output and $y^{(i)}$ is the actual target value for the $i$-th of the $m$ training samples.
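
Minimizing $J(w)$ with batch gradient descent (learning rate $\eta$) gives the update rules used in the implementation below:

$$w := w + \eta \sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right) x^{(i)}, \qquad b := b + \eta \sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right)$$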

Python Implementation

Below is an example implementation of Adaline using Python:

python:

import numpy as np


class Adaline:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0
        self.cost_ = []

        # Batch gradient descent on the MSE cost function
        for _ in range(self.n_iters):
            linear_output = np.dot(X, self.weights) + self.bias
            errors = y - linear_output
            self.weights += self.lr * np.dot(X.T, errors)
            self.bias += self.lr * errors.sum()
            cost = (errors**2).sum() / 2.0
            self.cost_.append(cost)

    def predict(self, X):
        # Adaline's prediction is the raw linear output (identity activation)
        linear_output = np.dot(X, self.weights) + self.bias
        return linear_output


# Example usage:
if __name__ == "__main__":
    # Simple dataset
    X = np.array([[1, 1], [2, 2], [3, 3]])
    y = np.array([1, 2, 3])

    adaline = Adaline(learning_rate=0.01, n_iters=10)
    adaline.fit(X, y)
    predictions = adaline.predict(X)
    print("Predictions:", predictions)
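
The cost_ list records the cost after every epoch, so printing or plotting it is a quick way to check that the chosen learning rate makes the cost decrease rather than oscillate or diverge.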

3. Multilayer Perceptron (MLP)

The Multilayer Perceptron (MLP) extends the concept of the Perceptron by introducing multiple layers of neurons, including hidden layers. Each neuron in a layer is connected to every neuron in the next layer, enabling the network to learn complex, non-linear relationships.

Mathematical Formulation

For an MLP with one hidden layer:

  • Let $x$ be the input vector.
  • $W^{(1)}$ and $W^{(2)}$ are weight matrices for the first (input-to-hidden) and second (hidden-to-output) layers, respectively.
  • $b^{(1)}$ and $b^{(2)}$ are bias vectors for the hidden and output layers, respectively.

The output $y$ is calculated as:

$$\text{hidden\_layer} = \sigma\left(W^{(1)} x + b^{(1)}\right)$$

$$y = \sigma\left(W^{(2)} \cdot \text{hidden\_layer} + b^{(2)}\right)$$

Where $\sigma$ is the activation function, often a non-linear function like ReLU or sigmoid.

Python Implementation

Below is an example implementation of a simple MLP using Python and the scikit-learn library:

python:

from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Create dataset
X, y = make_moons(n_samples=100, noise=0.2, random_state=42)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define the MLP model
mlp = MLPClassifier(hidden_layer_sizes=(5,), max_iter=1000, learning_rate_init=0.01)

# Train the model
mlp.fit(X_train, y_train)

# Make predictions
y_pred = mlp.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
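
As a complement to the scikit-learn example, the following minimal NumPy sketch traces the one-hidden-layer forward pass from the formulation above. The weights and biases (W1, b1, W2, b2) and the input dimensions are chosen purely for illustration, i.e. this is an untrained network, with the sigmoid used as the activation $\sigma$:

python:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Illustrative dimensions: 2 inputs, 5 hidden units, 1 output (random, untrained parameters)
W1 = rng.normal(size=(5, 2))   # W^(1): input-to-hidden weights
b1 = rng.normal(size=5)        # b^(1): hidden-layer biases
W2 = rng.normal(size=(1, 5))   # W^(2): hidden-to-output weights
b2 = rng.normal(size=1)        # b^(2): output-layer bias

x = np.array([0.5, -1.0])      # an example input vector

# Forward pass: hidden_layer = sigma(W1 x + b1), y = sigma(W2 hidden_layer + b2)
hidden_layer = sigmoid(W1 @ x + b1)
y = sigmoid(W2 @ hidden_layer + b2)
print("Output:", y)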

Conclusion

Early models of ANNs like the Perceptron, Adaline, and MLP have played a crucial role in the development of modern deep learning techniques. Understanding these foundational models helps in grasping the complexities of advanced neural network architectures. Through the examples provided, the basic concepts and implementations of these models have been illustrated, paving the way for further exploration into more sophisticated neural networks.
