Example 1:Hinge and Square Loss

This example try to show how to use our PLQ Composite Decomposition tool to decompose a given loss function and use ReHLine to do classification.

Given a loss function \(L(z)=\max(\max(1-z,0)),\frac{1}{2} (1-z)^2 )\)

[1]:
import numpy as np

z = np.linspace(-2, 2, 100)
L1 = np.maximum(1 - z, 0)
L2 = 0.5 * (1 - z) ** 2
[2]:
import matplotlib.pyplot as plt

plt.figure(figsize=(4, 3))
plt.plot(z, L1, marker='o', label='Hinge Loss')
plt.plot(z, L2, marker='s', label='Square Loss')
plt.plot(z, np.maximum(L1, L2), marker='^', label='PLQ Loss')
plt.legend()
plt.xlabel('z')
plt.ylabel('L(z)')
plt.title('Hinge Loss and Square Loss')
plt.show()
../_images/notebooks_ex1_hinge_square_2_0.png

1.Data Generation

[3]:
from plqcom import PLQLoss, plq_to_rehloss, affine_transformation
from rehline import ReHLine
from rehline._base import _relu, _rehu

Generate a random classification dataset, where \(\mathbf{X} \in \mathbb{R}^{1000 \times 3}\) and \(\mathbf{y} \in \mathbb\{ -1, 1\}^{1000}\)

[4]:
# Generate a random classification dataset
n, d, C = 1000, 3, 0.5
np.random.seed(1024)
X = np.random.randn(n, d)
beta = np.random.randn(d)
y = np.sign(X.dot(beta) + np.random.randn(n))

First 10 samples of the dataset

[5]:
X[:10], y[:10]
[5]:
(array([[ 2.12444863,  0.25264613,  1.45417876],
        [ 0.56923979,  0.45822365, -0.80933344],
        [ 0.86407349,  0.20170137, -1.87529904],
        [-0.56850693, -0.06510141,  0.80681666],
        [-0.5778176 ,  0.57306064, -0.33667496],
        [ 0.29700734, -0.37480416,  0.15510474],
        [ 0.70485719,  0.8452178 , -0.65818079],
        [ 0.56810558,  0.51538125, -0.61564998],
        [ 0.92611427, -1.28591285,  1.43014026],
        [-0.4254975 , -0.40257712,  0.60410409]]),
 array([ 1., -1., -1., -1., -1.,  1., -1., -1.,  1.,  1.]))
[6]:
X[:10].dot(beta)
[6]:
array([ 4.63095617, -1.53086335, -3.77183917,  1.49927261, -1.1677152 ,
        0.56448038, -1.12647624, -1.09186115,  3.93541798,  1.1463427 ])

2. Create and Decompose the PLQ Loss

Create the PLQ Loss object with the given quad_coef only (max form).

[7]:
plqloss_1 = PLQLoss(
    quad_coef={'a': np.array([0., 0., 0.5]), 'b': np.array([0., -1., -1.]), 'c': np.array([0., 1., 0.5])},
    form='max')

We can also create the PLQ Loss object with the given quad_coef and cutpoints.(plq form)

[8]:
plqloss_2 = PLQLoss(
    quad_coef={'a': np.array([0.5, 0., 0.5]), 'b': np.array([-1., -1., -1.]), 'c': np.array([0.5, 1., 0.5])},
    cutpoints=np.array([-1., 1.]),
    form='plq')

Convert the PLQ Loss to ReHLoss by calling its method plq_to_rehloss)

[9]:
rehloss_1 = plq_to_rehloss(plqloss_1)
rehloss_2 = plq_to_rehloss(plqloss_2)
[10]:
rehloss_1.relu_coef, rehloss_1.relu_intercept, rehloss_1.rehu_cut, rehloss_1.rehu_coef, rehloss_1.rehu_intercept
rehloss_2.relu_coef, rehloss_2.relu_intercept, rehloss_2.rehu_cut, rehloss_2.rehu_coef, rehloss_2.rehu_intercept
[10]:
(array([[-1.],
        [-1.]]),
 array([[ 1.],
        [-1.]]),
 array([[inf],
        [inf]]),
 array([[-1.],
        [ 1.]]),
 array([[-1.],
        [-1.]]))
[11]:
rehloss_1.rehu_coef == rehloss_2.rehu_coef
[11]:
array([[ True],
       [ True]])
[12]:
plt.figure(figsize=(8, 3))
plt.subplot(1, 2, 1)
Z = np.linspace(-2, 2, 1000)
L = plqloss_1(Z)
plt.plot(Z, L)
plt.title('Original PLQ Loss')

plt.subplot(1, 2, 2)
relu_coef, relu_intercept = rehloss_1.relu_coef, rehloss_1.relu_intercept
rehu_cut, rehu_coef, rehu_intercept = rehloss_1.rehu_cut, rehloss_1.rehu_coef, rehloss_1.rehu_intercept
Reh = np.sum(_rehu(rehu_coef * Z + rehu_intercept, rehu_cut), axis=0) + np.sum(_relu(relu_coef * Z + relu_intercept),
                                                                              axis=0)
plt.plot(Z, Reh)
plt.title('ReLU-ReHU Composition')
plt.show()
../_images/notebooks_ex1_hinge_square_19_0.png

3.Broadcast to all samples

For this classication problem, \(L(z)= \max(\max(1-z,0)),\frac{1}{2} (1-z)^2 )\) \(L_i(z_i)=\frac{1}{2} \max(\max(1-y_i z_i,0)),\frac{1}{2} (1-y_i z_i)^2 )\) Compare \(L_i(z_i)\) with \(L(z)\), we can find that \(L_i(z_i)=\frac{1}{2}L(y_i z_i)\) which is a affine transformation of \(L(z)\) with \(p=y_i, q=0,c=\frac{1}{2}\).

[13]:
rehloss = affine_transformation(rehloss_1, n=X.shape[0], c=C, p=y, q=0)
[14]:
print(rehloss.relu_coef, rehloss.relu_intercept, rehloss.rehu_cut, rehloss.rehu_coef, rehloss.rehu_intercept)
[[-0.5  0.5  0.5 ...  0.5  0.5 -0.5]
 [-0.5  0.5  0.5 ...  0.5  0.5 -0.5]] [[ 0.5  0.5  0.5 ...  0.5  0.5  0.5]
 [-0.5 -0.5 -0.5 ... -0.5 -0.5 -0.5]] [[inf inf inf ... inf inf inf]
 [inf inf inf ... inf inf inf]] [[-0.70710678  0.70710678  0.70710678 ...  0.70710678  0.70710678
  -0.70710678]
 [ 0.70710678 -0.70710678 -0.70710678 ... -0.70710678 -0.70710678
   0.70710678]] [[-0.70710678 -0.70710678 -0.70710678 ... -0.70710678 -0.70710678
  -0.70710678]
 [-0.70710678 -0.70710678 -0.70710678 ... -0.70710678 -0.70710678
  -0.70710678]]

4. Use ReHLine to solve the problem

[15]:
clf = ReHLine()
clf.U, clf.V, clf.Tau, clf.S, clf.T, clf.C = rehloss.relu_coef, rehloss.relu_intercept, rehloss.rehu_cut, rehloss.rehu_coef, rehloss.rehu_intercept, C
clf.fit(X=X)
print('sol privided by rehline: %s' % clf.coef_)
print('decision_function: %s' % clf.decision_function([[.1, .2, .3]]))
print('precision: %s' % (np.sum(np.sign(X.dot(clf.coef_)) == y) / n))
sol privided by rehline: [ 0.20339926 -0.00843538  0.69527244]
decision_function: [0.22723458]
precision: 0.868