{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "b1739f6e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch            : 1.10.1\n",
"pytorch_lightning: 1.6.0.dev0\n",
"torchmetrics     : 0.6.2\n",
"matplotlib       : 3.3.4\n",
"coral_pytorch    : 1.2.0\n",
"\n"
]
}
],
"source": [
"%load_ext watermark\n",
"%watermark -p torch,pytorch_lightning,torchmetrics,matplotlib,coral_pytorch"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "00079242",
"metadata": {},
"outputs": [],
"source": [
"%load_ext pycodestyle_magic\n",
"%flake8_on --ignore W291,W293,E703"
]
},
{
"cell_type": "markdown",
"id": "d198b6a3",
"metadata": {},
"source": [
"# Squared-error reformulation for ordinal regression and deep learning -- cement strength dataset"
]
},
{
"cell_type": "markdown",
"id": "e0ffe6ee",
"metadata": {},
"source": [
"Implementation of the squared-error reformulation method for ordinal regression by Beckham and Pal (2016).\n",
"\n",
"**Paper reference:**\n",
"\n",
"- Beckham, Christopher, and Christopher Pal. \"[A simple squared-error reformulation for ordinal classification](https://arxiv.org/abs/1612.00775).\" arXiv preprint arXiv:1612.00775 (2016)."
]
},
{
"cell_type": "markdown",
"id": "654568ec",
"metadata": {},
"source": [
"**Note:**\n",
" \n",
"To keep the notation lean and minimal, this notebook only contains \"Squared-error reformulation\"-specific comments. For more comments on the PyTorch Lightning use, please see the cross-entropy baseline notebook [baseline-light_cement.ipynb](./baseline-light_cement.ipynb)."
]
},
{
"cell_type": "markdown",
"id": "13449160",
"metadata": {},
"source": [
"## General settings and hyperparameters"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "1acc2499",
"metadata": {},
"outputs": [],
"source": [
"BATCH_SIZE = 128\n",
"NUM_EPOCHS = 200\n",
"LEARNING_RATE = 0.005\n",
"NUM_WORKERS = 0\n",
"\n",
"DATA_BASEPATH = \".\""
]
},
{
"cell_type": "markdown",
"id": "1b7e4b44",
"metadata": {
"tags": []
},
"source": [
"## Converting a regular classifier into a *Reformulated Squared Error* ordinal regression model"
]
},
{
"cell_type": "markdown",
"id": "4af15d92",
"metadata": {},
"source": [
"Changing a classifier into a Reformulated Squared Error model for ordinal regression is simple and only requires a few changes:\n",
"\n",
"**1)**\n",
"\n",
"We add an additional parameter layer `a`:\n",
"\n",
"```python\n",
"self.a = torch.nn.Parameter(torch.zeros(\n",
"    num_classes).float().normal_(0.0, 0.1).view(-1, 1))\n",
"```\n",
"\n",
"\n",
"**2)**\n",
"\n",
"We convert the logits (unnormalized outputs of the neural network) to \"predictions\" using the following function:\n",
"\n",
"\n",
"```python\n",
"def beckham_logits_to_predictions(logits, model, num_classes):\n",
"    probas = torch.softmax(logits, dim=1)\n",
"    predictions = ((num_classes-1)\n",
"                   * torch.sigmoid(probas.mm(model.a).view(-1)))\n",
"    return predictions\n",
"```\n",
"\n",
"**3)** \n",
"\n",
"We swap PyTorch's cross-entropy loss,\n",
"\n",
"```python\n",
"torch.nn.functional.cross_entropy(logits, true_labels)\n",
"```\n",
"\n",
"with the following squared-error loss:\n",
"\n",
"```python\n",
"def squared_error(targets, predictions):\n",
"    return torch.mean((targets.float() - predictions)**2)\n",
"```\n",
"\n",
"**4)**\n",
"\n",
"In a regular classifier, we usually obtain the predicted class labels as follows:\n",
"\n",
"```python\n",
"predicted_labels = torch.argmax(logits, dim=1)\n",
"```\n",
"\n",
"In this method, we replace it with the following code, which converts the logits into predicted labels:\n",
"\n",
"```python\n",
"def beckham_logits_to_labels(logits, model, num_classes):\n",
"    predictions = beckham_logits_to_predictions(logits, model, num_classes)\n",
"    return torch.round(predictions).float()\n",
"\n",
"predicted_labels = beckham_logits_to_labels(logits, model, model.num_classes)\n",
"```"
]
},
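{
"cell_type": "markdown",
"id": "4b9c0f21",
"metadata": {},
"source": [
"To make these steps concrete, here is a small self-contained sketch (toy logits and a fixed seed, chosen purely for illustration) that traces a single example through steps 1), 2), and 4):\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"num_classes = 5\n",
"torch.manual_seed(1)\n",
"\n",
"# Step 1): the extra parameter layer `a`, shape (num_classes, 1)\n",
"a = torch.zeros(num_classes).float().normal_(0.0, 0.1).view(-1, 1)\n",
"\n",
"# Step 2): logits -> a single scalar prediction in (0, num_classes - 1)\n",
"logits = torch.tensor([[2.0, 0.5, -1.0, 0.1, 0.3]])\n",
"probas = torch.softmax(logits, dim=1)\n",
"prediction = (num_classes - 1) * torch.sigmoid(probas.mm(a).view(-1))\n",
"\n",
"# Step 4): round the scalar prediction to the nearest ordinal label\n",
"predicted_label = torch.round(prediction)\n",
"```\n",
"\n",
"Because the sigmoid output lies strictly between 0 and 1, the prediction always falls inside the label range `[0, num_classes - 1]`."
]
},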
{
"cell_type": "markdown",
"id": "debfdfd1",
"metadata": {},
"source": [
"## Implementing a `MultiLayerPerceptron` using PyTorch Lightning's `LightningModule`"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "1922e731",
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"\n",
"\n",
"class MultiLayerPerceptron(torch.nn.Module):\n",
"    def __init__(self, input_size, hidden_units, num_classes):\n",
"        super().__init__()\n",
"\n",
"        self.num_classes = num_classes\n",
"\n",
"        all_layers = []\n",
"        for hidden_unit in hidden_units:\n",
"            layer = torch.nn.Linear(input_size, hidden_unit)\n",
"            all_layers.append(layer)\n",
"            all_layers.append(torch.nn.ReLU())\n",
"            input_size = hidden_unit\n",
"\n",
"        output_layer = torch.nn.Linear(hidden_units[-1], num_classes)\n",
"\n",
"        all_layers.append(output_layer)\n",
"        self.model = torch.nn.Sequential(*all_layers)\n",
"\n",
"        # -----------------------------------------------------\n",
"        # Beckham 2016-specific parameter layer\n",
"        self.a = torch.nn.Parameter(torch.zeros(\n",
"            num_classes).float().normal_(0.0, 0.1).view(-1, 1))\n",
"        # -----------------------------------------------------\n",
"\n",
"    def forward(self, x):\n",
"        x = self.model(x)\n",
"        return x"
]
},
{
"cell_type": "markdown",
"id": "ce770cc2",
"metadata": {},
"source": [
"Now, let's define the following Squared Error Reformulation-specific functions:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "05434da4",
"metadata": {},
"outputs": [],
"source": [
"def squared_error(targets, predictions):\n",
"    return torch.mean((targets.float() - predictions)**2)\n",
"\n",
"\n",
"def beckham_logits_to_predictions(logits, model, num_classes):\n",
"    probas = torch.softmax(logits, dim=1)\n",
"    predictions = ((num_classes-1)\n",
"                   * torch.sigmoid(probas.mm(model.a).view(-1)))\n",
"    return predictions\n",
"\n",
"\n",
"def beckham_logits_to_labels(logits, model, num_classes):\n",
"    predictions = beckham_logits_to_predictions(logits, model, num_classes)\n",
"    return torch.round(predictions).float()"
]
},
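{
"cell_type": "markdown",
"id": "9e3d5a77",
"metadata": {},
"source": [
"As a quick sanity check of how these pieces combine during training, the following self-contained sketch (random toy tensors; it mirrors the functions above inline rather than calling them) runs one forward and backward pass and confirms that the extra parameter `a` receives a gradient:\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"torch.manual_seed(1)\n",
"num_classes, batch_size = 5, 8\n",
"\n",
"# the extra parameter layer, as in the MultiLayerPerceptron above\n",
"a = torch.nn.Parameter(torch.zeros(\n",
"    num_classes).float().normal_(0.0, 0.1).view(-1, 1))\n",
"\n",
"logits = torch.randn(batch_size, num_classes, requires_grad=True)\n",
"targets = torch.randint(0, num_classes, (batch_size,))\n",
"\n",
"probas = torch.softmax(logits, dim=1)\n",
"predictions = (num_classes - 1) * torch.sigmoid(probas.mm(a).view(-1))\n",
"loss = torch.mean((targets.float() - predictions)**2)\n",
"loss.backward()  # gradients flow into both the logits and `a`\n",
"```"
]
},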
{
"cell_type": "markdown",
"id": "6562578a",
"metadata": {},
"source": [
"We then use them in the `LightningModule` below; they appear only in the `_shared_step` method, as indicated:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1381f777",
"metadata": {},
"outputs": [],
"source": [
"import pytorch_lightning as pl\n",
"import torchmetrics\n",
"\n",
"\n",
"class LightningMLP(pl.LightningModule):\n",
"    def __init__(self, model, learning_rate):\n",
"        super().__init__()\n",
"\n",
"        self.learning_rate = learning_rate\n",
"        self.model = model\n",
"\n",
"        self.save_hyperparameters(ignore=['model'])\n",
"\n",
"        self.train_mae = torchmetrics.MeanAbsoluteError()\n",
"        self.valid_mae = torchmetrics.MeanAbsoluteError()\n",
"        self.test_mae = torchmetrics.MeanAbsoluteError()\n",
"\n",
"    def forward(self, x):\n",
"        return self.model(x)\n",
"\n",
"    def _shared_step(self, batch):\n",
"        features, true_labels = batch\n",
"        logits = self(features)\n",
"\n",
"        # Beckham 2016-specific functions -----------------------------###\n",
"        predictions = beckham_logits_to_predictions(\n",
"            logits, self.model, self.model.num_classes)\n",
"\n",
"        loss = squared_error(true_labels, predictions)\n",
"\n",
"        predicted_labels = beckham_logits_to_labels(\n",
"            logits, self.model, self.model.num_classes)\n",
"        # -------------------------------------------------------------###\n",
"\n",
"        return loss, true_labels, predicted_labels\n",
"\n",
"    def training_step(self, batch, batch_idx):\n",
"        loss, true_labels, predicted_labels = self._shared_step(batch)\n",
"        self.log(\"train_loss\", loss)\n",
"        self.train_mae(predicted_labels, true_labels)\n",
"        self.log(\"train_mae\", self.train_mae, on_epoch=True, on_step=False)\n",
"        return loss\n",
"\n",
"    def validation_step(self, batch, batch_idx):\n",
"        loss, true_labels, predicted_labels = self._shared_step(batch)\n",
"        self.log(\"valid_loss\", loss)\n",
"        self.valid_mae(predicted_labels, true_labels)\n",
"        self.log(\"valid_mae\", self.valid_mae,\n",
"                 on_epoch=True, on_step=False, prog_bar=True)\n",
"\n",
"    def test_step(self, batch, batch_idx):\n",
"        loss, true_labels, predicted_labels = self._shared_step(batch)\n",
"        self.test_mae(predicted_labels, true_labels)\n",
"        self.log(\"test_mae\", self.test_mae, on_epoch=True, on_step=False)\n",
"\n",
"    def configure_optimizers(self):\n",
"        optimizer = torch.optim.Adam(self.parameters(), lr=self.learning_rate)\n",
"        return optimizer"
]
},
{
"cell_type": "markdown",
"id": "0ae78a8e",
"metadata": {},
"source": [
"---\n",
"\n",
"# Note: There Are No Changes Compared To The Baseline Below\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "39ac693d",
"metadata": {},
"source": [
"## Setting up the dataset"
]
},
{
"cell_type": "markdown",
"id": "b070e95b",
"metadata": {},
"source": [
"### Inspecting the dataset"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "a26a6486",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"   response     V1     V2   V3     V4   V5      V6     V7   V8\n",
"0         4  540.0    0.0  0.0  162.0  2.5  1040.0  676.0   28\n",
"1         4  540.0    0.0  0.0  162.0  2.5  1055.0  676.0   28\n",
"2         2  332.5  142.5  0.0  228.0  0.0   932.0  594.0  270\n",
"3         2  332.5  142.5  0.0  228.0  0.0   932.0  594.0  365\n",
"4         2  198.6  132.4  0.0  192.0  0.0   978.4  825.5  360"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
"┃        Test metric        ┃       DataLoader 0        ┃\n",
"┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
"│         test_mae          │    0.3400000035762787     │\n",
"└───────────────────────────┴───────────────────────────┘\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"[{'test_mae': 0.3400000035762787}]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"trainer.test(model=lightning_model, datamodule=data_module, ckpt_path='best')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}