Alternative definition of a logistic regression model as a multi-input, multi-output model in Keras

In [1]:
%matplotlib inline
In [2]:
import numpy as np

import matplotlib.pyplot as plt

import keras.backend as K

from keras.layers import Input, Dense
from keras.models import Sequential, Model

from sklearn.model_selection import train_test_split
from sklearn.datasets import make_blobs
Using TensorFlow backend.
In [3]:
'TensorFlow version: ' + K.tf.__version__
Out[3]:
'TensorFlow version: 1.4.0'

Constants

In [4]:
n_samples = 100
n_features = 2
n_classes = 2
seed = 42
rng = np.random.RandomState(seed)

Toy Dataset ("Gaussian blobs")

In [5]:
x_test, y_test = make_blobs(n_samples=n_samples, centers=n_classes, random_state=rng)
In [6]:
# class labels are balanced
np.sum(y_test)
Out[6]:
50
In [7]:
fig, ax = plt.subplots(figsize=(7, 5))

cb = ax.scatter(*x_test.T, c=y_test, cmap='coolwarm')
fig.colorbar(cb, ax=ax)

ax.set_xlabel('$x_1$')
ax.set_ylabel('$x_2$')

plt.show()

Typical Model Specification for Logistic Regression

In [8]:
classifier = Sequential([
    Dense(16, input_dim=n_features, activation='relu'),
    Dense(32, activation='relu'),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])
classifier.compile(optimizer='rmsprop', loss='binary_crossentropy')
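Strictly speaking, plain logistic regression corresponds to a single sigmoid Dense layer; the hidden ReLU layers above are incidental to the comparison of losses that follows. A minimal single-layer sketch (not used in the rest of this notebook) would look like this:

logreg = Sequential([
    Dense(1, input_dim=n_features, activation='sigmoid')
])
logreg.compile(optimizer='rmsprop', loss='binary_crossentropy')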
In [9]:
loss = classifier.evaluate(x_test, y_test)
loss
100/100 [==============================] - 0s 1ms/step
Out[9]:
0.66595284819602962

Alternative Specification

An alternative specification separates the positive and negative samples and defines one input per class. The loss that Keras optimizes is then the sum of the binary cross-entropy losses computed on each output. Since the targets for each input are fixed to all ones or all zeros respectively, each of the two losses reduces to one of the complementary terms of the binary cross-entropy, and their average recovers the usual binary cross-entropy over all samples with their true labels. (Note that the classifier is never trained in this notebook; we only compare the loss values the two specifications compute with the same, randomly initialized weights.)

This trick is crucial for many model specifications in keras-adversarial.
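
Spelling the argument out (a brief derivation; the only assumption is balanced classes, which In [6] confirms): with labels $y_i \in \{0, 1\}$ and predicted probabilities $p_i$, the binary cross-entropy over all $N$ samples is

$$\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \big[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \big] = -\frac{1}{N} \Big[ \sum_{i:\, y_i = 1} \log p_i + \sum_{i:\, y_i = 0} \log(1 - p_i) \Big].$$

Evaluating the positive samples against all-ones targets yields $\mathcal{L}_+ = -\frac{1}{N_+} \sum_{i:\, y_i = 1} \log p_i$, and the negative samples against all-zeros targets yields $\mathcal{L}_- = -\frac{1}{N_-} \sum_{i:\, y_i = 0} \log(1 - p_i)$. With $N_+ = N_- = N/2$, the average $\tfrac{1}{2}(\mathcal{L}_+ + \mathcal{L}_-)$ equals $\mathcal{L}$, while the sum that Keras actually optimizes is simply $2\mathcal{L}$.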

In [10]:
pos = Input(shape=(n_features,))
neg = Input(shape=(n_features,))

# make use of the classifier defined earlier
y_pred_pos = classifier(pos)
y_pred_neg = classifier(neg)

# define a multi-input, multi-output model
classifier_alt = Model([pos, neg], [y_pred_pos, y_pred_neg])
classifier_alt.compile(optimizer='rmsprop', loss='binary_crossentropy')
In [11]:
losses = classifier_alt.evaluate(
    [
        x_test[y_test == 1], 
        x_test[y_test == 0]
    ], 
    [
        np.ones(n_samples // 2), 
        np.zeros(n_samples // 2)
    ]
)
losses
50/50 [==============================] - 0s 1ms/step
Out[11]:
[1.3319057083129884, 1.0514429426193237, 0.28046274423599243]

The loss that actually gets optimized is the first value above, which is the sum of the two per-output losses that follow it. Their mean (equivalently, half the total) is the usual binary cross-entropy loss.

In [12]:
.5 * losses[0]
Out[12]:
0.66595285415649419
In [13]:
# alternatively
np.mean(losses[1:])
Out[13]:
0.66595284342765804
In [14]:
np.allclose(loss, np.mean(losses[1:]))
Out[14]:
True
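
As a variation, one could weight each output's loss with the loss_weights argument of compile, so that the total loss Keras optimizes is itself the mean of the two per-class losses rather than their sum. A minimal sketch (classifier_alt_weighted below is hypothetical and was not evaluated above):

# Weight each output's loss by 0.5 so the total loss equals the mean of the
# two per-class losses, i.e. the usual binary cross-entropy on all samples.
classifier_alt_weighted = Model([pos, neg], [y_pred_pos, y_pred_neg])
classifier_alt_weighted.compile(optimizer='rmsprop',
                                loss='binary_crossentropy',
                                loss_weights=[0.5, 0.5])

losses_weighted = classifier_alt_weighted.evaluate(
    [x_test[y_test == 1], x_test[y_test == 0]],
    [np.ones(n_samples // 2), np.zeros(n_samples // 2)],
    verbose=0
)
# losses_weighted[0] should now agree with `loss` from the single-output model.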
