Machine Learning (Chapter 23): ANN I - Early Models

Introduction to Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) are computational models inspired by the human brain's structure and function. They consist of interconnected layers of nodes (neurons) that process information in a manner akin to biological neurons. This chapter delves into the early models of ANNs, particularly focusing on their mathematical foundations, structures, and simple implementations using Python.

Early Models of ANNs

The early models of ANNs, such as the Perceptron, Adaline (Adaptive Linear Neuron), and Multilayer Perceptron (MLP), laid the groundwork for modern neural networks.

1. The Perceptron Model

The Perceptron is one of the earliest and simplest forms of a neural network. It consists of a single layer of neurons, where each neuron receives multiple inputs, applies a linear combination, and passes the result through an activation function to produce an output.

Mathematical Formulation

Consider a Perceptron with $n$ inputs $x_1, x_2, \ldots, x_n$ and corresponding weights $w_1, w_2, \ldots, w_n$. The output $y$ is given by:

$$y = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} w_i x_i + b > 0 \\ 0 & \text{otherwise} \end{cases}$$

Where:

  • $b$ is the bias term.
  • The output is binary, based on the threshold function (Heaviside step function).
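
For example, with one possible choice of parameters, weights $w_1 = w_2 = 1$ and bias $b = -1.5$, the Perceptron implements the logical AND: for input $(1, 1)$ the weighted sum is $1 + 1 - 1.5 = 0.5 > 0$, so $y = 1$, while for $(1, 0)$ it is $1 - 1.5 = -0.5 < 0$, so $y = 0$.
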
Python Implementation

Below is an example implementation of a simple Perceptron using Python:

python:

import numpy as np


class Perceptron:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.activation_func = self._unit_step_func
        self.weights = None
        self.bias = None

    def _unit_step_func(self, x):
        # Heaviside step function: 1 if x >= 0, else 0
        return np.where(x >= 0, 1, 0)

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0

        # Perceptron learning rule: adjust weights and bias whenever a sample is misclassified
        for _ in range(self.n_iters):
            for idx, x_i in enumerate(X):
                linear_output = np.dot(x_i, self.weights) + self.bias
                y_predicted = self.activation_func(linear_output)
                update = self.lr * (y[idx] - y_predicted)
                self.weights += update * x_i
                self.bias += update

    def predict(self, X):
        linear_output = np.dot(X, self.weights) + self.bias
        y_predicted = self.activation_func(linear_output)
        return y_predicted


# Example usage:
if __name__ == "__main__":
    # AND logic gate inputs and outputs
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 0, 0, 1])

    perceptron = Perceptron(learning_rate=0.1, n_iters=10)
    perceptron.fit(X, y)
    predictions = perceptron.predict(X)
    print("Predictions:", predictions)
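
With these settings the Perceptron should converge well within the 10 training epochs and print Predictions: [0 0 0 1], reproducing the AND truth table; convergence is expected here because the AND function is linearly separable.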

2. The Adaline Model

Adaline, or Adaptive Linear Neuron, is another early ANN model. Unlike the Perceptron, which updates its weights from the thresholded (binary) output, Adaline computes its error on the raw linear output and is trained by minimizing a mean squared error (MSE) cost function via gradient descent.

Mathematical Formulation

Given inputs $x_1, x_2, \ldots, x_n$ and weights $w_1, w_2, \ldots, w_n$, the Adaline model produces an output $y$ as:

$$y = \sum_{i=1}^{n} w_i x_i + b$$

The cost function (MSE) is defined as:

$$J(w) = \frac{1}{2} \sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right)^2$$

Where $\hat{y}^{(i)}$ is the predicted output and $y^{(i)}$ is the actual target value for the $i$-th of the $m$ training samples.
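
Minimizing $J(w)$ with batch gradient descent (learning rate $\eta$) gives the update rules used in the implementation below:

$$w := w + \eta \sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right) x^{(i)}, \qquad b := b + \eta \sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right)$$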

Python Implementation

Below is an example implementation of Adaline using Python:

python:

import numpy as np


class Adaline:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0
        self.cost_ = []

        # Batch gradient descent on the MSE cost function
        for _ in range(self.n_iters):
            linear_output = np.dot(X, self.weights) + self.bias
            errors = y - linear_output
            self.weights += self.lr * np.dot(X.T, errors)
            self.bias += self.lr * errors.sum()
            cost = (errors**2).sum() / 2.0
            self.cost_.append(cost)

    def predict(self, X):
        # Adaline's prediction is the raw linear output (identity activation)
        linear_output = np.dot(X, self.weights) + self.bias
        return linear_output


# Example usage:
if __name__ == "__main__":
    # Simple dataset
    X = np.array([[1, 1], [2, 2], [3, 3]])
    y = np.array([1, 2, 3])

    adaline = Adaline(learning_rate=0.01, n_iters=10)
    adaline.fit(X, y)
    predictions = adaline.predict(X)
    print("Predictions:", predictions)
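
The cost_ list records the cost after every epoch, so printing or plotting it is a quick way to check that the chosen learning rate makes the cost decrease rather than oscillate or diverge.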

3. Multilayer Perceptron (MLP)

The Multilayer Perceptron (MLP) extends the concept of the Perceptron by introducing multiple layers of neurons, including hidden layers. Each neuron in a layer is connected to every neuron in the next layer, enabling the network to learn complex, non-linear relationships.

Mathematical Formulation

For an MLP with one hidden layer:

  • Let $x$ be the input vector.
  • $W^{(1)}$ and $W^{(2)}$ are weight matrices for the first (input-to-hidden) and second (hidden-to-output) layers, respectively.
  • $b^{(1)}$ and $b^{(2)}$ are bias vectors for the hidden and output layers, respectively.

The output $y$ is calculated as:

$$\text{hidden\_layer} = \sigma\left(W^{(1)} x + b^{(1)}\right)$$

$$y = \sigma\left(W^{(2)} \cdot \text{hidden\_layer} + b^{(2)}\right)$$

Where $\sigma$ is the activation function, often a non-linear function like ReLU or sigmoid.

Python Implementation

Below is an example implementation of a simple MLP using Python and the scikit-learn library:

python:

from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Create dataset
X, y = make_moons(n_samples=100, noise=0.2, random_state=42)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define the MLP model
mlp = MLPClassifier(hidden_layer_sizes=(5,), max_iter=1000, learning_rate_init=0.01)

# Train the model
mlp.fit(X_train, y_train)

# Make predictions
y_pred = mlp.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
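
As a complement to the scikit-learn example, the following minimal NumPy sketch traces the one-hidden-layer forward pass from the formulation above. The weights and biases (W1, b1, W2, b2) and the input dimensions are chosen purely for illustration, i.e. this is an untrained network, with the sigmoid used as the activation $\sigma$:

python:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Illustrative dimensions: 2 inputs, 5 hidden units, 1 output (random, untrained parameters)
W1 = rng.normal(size=(5, 2))   # W^(1): input-to-hidden weights
b1 = rng.normal(size=5)        # b^(1): hidden-layer biases
W2 = rng.normal(size=(1, 5))   # W^(2): hidden-to-output weights
b2 = rng.normal(size=1)        # b^(2): output-layer bias

x = np.array([0.5, -1.0])      # an example input vector

# Forward pass: hidden_layer = sigma(W1 x + b1), y = sigma(W2 hidden_layer + b2)
hidden_layer = sigmoid(W1 @ x + b1)
y = sigmoid(W2 @ hidden_layer + b2)
print("Output:", y)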

Conclusion

Early models of ANNs like the Perceptron, Adaline, and MLP have played a crucial role in the development of modern deep learning techniques. Understanding these foundational models helps in grasping the complexities of advanced neural network architectures. Through the examples provided, the basic concepts and implementations of these models have been illustrated, paving the way for further exploration into more sophisticated neural networks.
