Machine Learning (Chapter 41): The ROC Curve


Introduction to ROC Curve

The ROC (Receiver Operating Characteristic) curve is a graphical tool for evaluating a binary classifier. It plots the True Positive Rate (TPR) against the False Positive Rate (FPR) as the decision threshold varies. Because FPR equals 1 minus specificity, the curve shows the trade-off between sensitivity (recall) and specificity at every threshold.

A perfect classifier produces a curve that passes through the top-left corner of the plot, the point (0, 1), whereas a classifier that guesses at random produces the diagonal line from (0, 0) to (1, 1).

Key Definitions

Before diving into the mathematics, let's define key metrics:

  1. True Positive (TP): Correctly predicted positives.
  2. False Positive (FP): Incorrectly predicted positives.
  3. True Negative (TN): Correctly predicted negatives.
  4. False Negative (FN): Incorrectly predicted negatives.

From these, we calculate:

  • True Positive Rate (TPR), also known as Recall or Sensitivity:

$$TPR = \frac{TP}{TP + FN}$$

  • False Positive Rate (FPR):

$$FPR = \frac{FP}{FP + TN}$$

  • Threshold: The probability cut-off point at which a sample is classified as positive or negative.
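
To make these definitions concrete, here is a minimal sketch that counts TP, FP, FN, and TN at a single threshold and derives both rates. The function name `rates_at_threshold` and the toy labels and scores are made up for illustration, and the sketch assumes both classes are present so that neither denominator is zero.

```python
import numpy as np

def rates_at_threshold(y_true, y_score, threshold):
    """Return (TPR, FPR) when scores >= threshold are labelled positive.

    Hypothetical helper for illustration; assumes labels are 0/1 and that
    both classes occur at least once.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_score) >= threshold

    tp = np.sum(y_pred & (y_true == 1))    # positives correctly flagged
    fp = np.sum(y_pred & (y_true == 0))    # negatives wrongly flagged
    fn = np.sum(~y_pred & (y_true == 1))   # positives missed
    tn = np.sum(~y_pred & (y_true == 0))   # negatives correctly rejected

    return float(tp / (tp + fn)), float(fp / (fp + tn))

# Toy data, made up for illustration
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

tpr, fpr = rates_at_threshold(y_true, y_score, threshold=0.5)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Lowering the threshold can only turn more predictions positive, so both TPR and FPR either stay the same or increase.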

ROC Curve Construction

For a classifier that outputs probabilities, the ROC curve is constructed as follows:

  1. Vary the threshold from 0 to 1.
  2. For each threshold, calculate the TPR and FPR.
  3. Plot TPR vs. FPR at each threshold.
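
The three steps above can be sketched by hand with NumPy. This is an illustrative version, not how any particular library implements it: the function name `roc_points` and the toy data are assumptions. Instead of a fine grid between 0 and 1, it uses each distinct score as a threshold, since the predictions only change when the threshold crosses a score.

```python
import numpy as np

def roc_points(y_true, y_score):
    """Sweep thresholds from high to low and collect (FPR, TPR) pairs.

    Hypothetical helper; assumes 0/1 labels with both classes present.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)

    # Distinct scores in descending order; +inf adds the (0, 0) starting point
    thresholds = np.concatenate(([np.inf], np.unique(y_score)[::-1]))

    fpr, tpr = [], []
    for t in thresholds:
        y_pred = y_score >= t
        tpr.append(np.sum(y_pred & (y_true == 1)) / n_pos)
        fpr.append(np.sum(y_pred & (y_true == 0)) / n_neg)
    return np.array(fpr), np.array(tpr), thresholds

# Toy data, made up for illustration
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_points(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:>5}: FPR={f:.2f}, TPR={t:.2f}")
```

Plotting `tpr` against `fpr` gives the ROC curve for this toy data, starting at (0, 0) and ending at (1, 1).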

The Area Under the ROC Curve (AUC-ROC) is a single scalar value that summarizes the overall performance of the classifier. A perfect classifier has an AUC of 1, while a random classifier has an AUC of 0.5.
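
As a rough sketch, the AUC can be computed in two equivalent ways: by summing trapezoids under the ROC points, or by measuring how often a positive sample receives a higher score than a negative one (ties counted as one half). The function names and the toy values below are assumptions chosen for illustration.

```python
import numpy as np

def auc_trapezoid(fpr, tpr):
    """Area under the piecewise-linear curve through the (FPR, TPR) points."""
    fpr = np.asarray(fpr, dtype=float)
    tpr = np.asarray(tpr, dtype=float)
    # Width of each step times the average height of its two endpoints
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def auc_pairwise(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]   # every positive against every negative
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

# Toy data, made up for illustration, with its ROC points listed by hand
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
fpr = [0.0, 0.0, 0.5, 0.5, 1.0]
tpr = [0.0, 0.5, 0.5, 1.0, 1.0]

area = auc_trapezoid(fpr, tpr)
rank = auc_pairwise(y_true, y_score)
print(f"trapezoid AUC = {area:.2f}, pairwise AUC = {rank:.2f}")
```

The pairwise view explains the two reference values: a perfect classifier ranks every positive above every negative (AUC of 1), while random scores get the ordering right about half the time (AUC of 0.5).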

Python Code Example

Here’s how you can plot an ROC curve using Python with the sklearn library:

    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_curve, roc_auc_score

    # Generate synthetic data
    X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)

    # Split into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )

    # Train a logistic regression classifier
    model = LogisticRegression()
    model.fit(X_train, y_train)

    # Predict probabilities of the positive class for the test set
    y_probs = model.predict_proba(X_test)[:, 1]

    # Calculate ROC curve
    fpr, tpr, thresholds = roc_curve(y_test, y_probs)

    # Calculate AUC score
    auc = roc_auc_score(y_test, y_probs)

    # Plot ROC curve with the random-classifier diagonal for reference
    plt.figure(figsize=(8, 6))
    plt.plot(fpr, tpr, color='blue', label=f'AUC = {auc:.2f}')
    plt.plot([0, 1], [0, 1], color='gray', linestyle='--')
    plt.xlabel('False Positive Rate (FPR)')
    plt.ylabel('True Positive Rate (TPR)')
    plt.title('ROC Curve')
    plt.legend(loc='lower right')
    plt.grid()
    plt.show()

Explanation of the Code

  1. Data Generation: A synthetic binary classification dataset is created using make_classification.
  2. Train/Test Split: train_test_split holds out 30% of the samples as a test set.
  3. Model Training: A logistic regression model is trained on the training set.
  4. Probability Prediction: The probabilities of the positive class are predicted for the test set using predict_proba().
  5. ROC Calculation: The roc_curve() function calculates the FPR, TPR, and thresholds.
  6. AUC Calculation: The roc_auc_score() function computes the AUC value.
  7. Plotting: The ROC curve is plotted, along with a diagonal line representing random classification.
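
The thresholds returned by roc_curve() are not used in the plot, but they are what you need when picking an operating point. One common heuristic, not part of the example above, is Youden's J statistic, which selects the threshold that maximizes TPR minus FPR. The sketch below uses small hand-written arrays in the same shape that roc_curve() returns; the values and the function name `best_threshold_youden` are made up for illustration.

```python
import numpy as np

def best_threshold_youden(fpr, tpr, thresholds):
    """Return (threshold, J) at the point where J = TPR - FPR is largest.

    Hypothetical helper; Youden's J is one heuristic among several.
    """
    j = np.asarray(tpr, dtype=float) - np.asarray(fpr, dtype=float)
    idx = int(np.argmax(j))   # first index of the maximum if there are ties
    return float(thresholds[idx]), float(j[idx])

# Illustrative arrays, made up to mimic the output of roc_curve()
fpr = np.array([0.0, 0.0, 0.25, 0.5, 1.0])
tpr = np.array([0.0, 0.5, 1.0, 1.0, 1.0])
thresholds = np.array([np.inf, 0.9, 0.6, 0.4, 0.1])

best_t, best_j = best_threshold_youden(fpr, tpr, thresholds)
print(f"best threshold = {best_t}, J = {best_j}")
```

Youden's J weights false positives and false negatives equally; when one kind of error is more costly, a different operating point on the same curve may be preferable.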

Java Code Example

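In Java, you can use libraries like Weka or Smile for machine learning. The example below uses Weka to train a logistic regression classifier and report its AUC under 10-fold cross-validation. It does not draw the curve itself; plotting is left to an external tool.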

    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.Logistic;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    import java.util.Random;

    public class ROCExample {
        public static void main(String[] args) throws Exception {
            // Load dataset (placeholder path; point it at your own ARFF file)
            DataSource source = new DataSource("data/your-dataset.arff");
            Instances data = source.getDataSet();
            data.setClassIndex(data.numAttributes() - 1); // Assuming class is the last attribute

            // Train Logistic Regression classifier on the full dataset
            Logistic logistic = new Logistic();
            logistic.buildClassifier(data);

            // Evaluate with 10-fold cross-validation
            // (each fold trains its own copy of the classifier)
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(logistic, data, 10, new Random(1));

            // Output AUC for the class value at index 1 (assumed to be the positive class)
            System.out.println("AUC: " + eval.areaUnderROC(1));

            // Print summary statistics; plotting the curve requires
            // a third-party library or exporting the data
            System.out.println(eval.toSummaryString("\nResults\n======\n", false));
        }
    }

Explanation of the Java Code

  1. Data Loading: The dataset is loaded from an ARFF file using DataSource.
  2. Model Training: A logistic regression classifier is trained.
  3. Evaluation: The model is evaluated using 10-fold cross-validation, and the AUC is calculated.
  4. ROC Plotting: Weka can output evaluation metrics, including AUC. For plotting, you may export the ROC data and visualize it externally.

Mathematical Solution Example

Suppose we have the following confusion matrix at a certain threshold:

                     Predicted Positive    Predicted Negative
    Actual Positive          50                    10
    Actual Negative          20                    70

We calculate TPR and FPR as follows:

  • True Positive Rate (TPR):

$$TPR = \frac{TP}{TP + FN} = \frac{50}{50 + 10} = \frac{50}{60} \approx 0.83$$

  • False Positive Rate (FPR):

$$FPR = \frac{FP}{FP + TN} = \frac{20}{20 + 70} = \frac{20}{90} \approx 0.22$$

For multiple thresholds, these values are computed and plotted on the ROC curve.
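
The arithmetic above can be checked with a few lines of Python. This is only a sanity check on the worked example; the variable names are chosen for illustration. It also prints the specificity to show that it equals 1 minus the FPR.

```python
# Counts taken from the confusion matrix in the worked example
tp, fn = 50, 10   # actual positives: predicted positive / predicted negative
fp, tn = 20, 70   # actual negatives: predicted positive / predicted negative

tpr = tp / (tp + fn)           # sensitivity (recall)
fpr = fp / (fp + tn)           # false positive rate
specificity = tn / (tn + fp)   # true negative rate

print(f"TPR = {tpr:.2f}")                  # TPR = 0.83
print(f"FPR = {fpr:.2f}")                  # FPR = 0.22
print(f"Specificity = {specificity:.2f}")  # Specificity = 0.78
```

This confusion matrix therefore contributes the single point (0.22, 0.83) to the ROC curve.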


Conclusion

The ROC curve is an essential tool for evaluating binary classifiers. It shows the trade-off between sensitivity and specificity across thresholds, and the AUC condenses that curve into a single number. Both are widely used in machine learning to compare models and to tune classification thresholds, and both are straightforward to compute in Python or Java.

