Example 3: Ridge Regression

This notebook fits ridge / MSE regression via plqcom decomposition and shows both ReHLine calling styles in step 4.

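The per-sample loss is the squared error \(L_i(z_i)=(y_i-z_i)^2\), where \(z_i=x_i^\top\beta\) is the linear prediction. It is implemented as the base loss \(L(z)=z^2\) evaluated at the affine argument \(pz_i+q\) with \(p=-1\) and \(q=y_i\), that is, \(L(y_i-z_i)\). The ReHLine ERM weight is set to C = 1.0.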

1. Data Generation

Synthetic regression data with \(n{=}1000\) samples and \(d{=}5\) features.

[1]:

from plqcom import PLQLoss, plq_to_rehloss, affine_transformation
import numpy as np
from rehline import ReHLine

n_samples, n_features = 1000, 5
C = 1.0  # ReHLine ERM weight (ridge / MSE strength; see step 4)
rng = np.random.RandomState(0)
X = rng.randn(n_samples, n_features)
beta = rng.randn(n_features)
y = np.dot(X, beta) + rng.normal(scale=0.1, size=n_samples)

2. Create and Decompose the PLQ Loss

[2]:

# Base loss L(z) = a*z^2 + b*z + c = z^2: a single quadratic piece, so no cutpoints
plqloss = PLQLoss(quad_coef={'a': np.array([1.]), 'b': np.array([0.]), 'c': np.array([0.])},
                  cutpoints=np.array([]))

[3]:

rehloss = plq_to_rehloss(plqloss)

[4]:

rehloss.rehu_cut, rehloss.rehu_coef, rehloss.rehu_intercept

[4]:

(array([[inf],
        [inf]]),
 array([[-1.41421356],
        [ 1.41421356]]),
 array([[-0.],
        [ 0.]]))
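
The decomposition above can be checked by hand. The sketch below assumes the usual ReHU definition (0 for \(z\le 0\), \(z^2/2\) for \(0<z\le\tau\), and \(\tau(z-\tau/2)\) beyond) and re-enters the printed values manually; the helper `rehu` is a hypothetical name written for this check, not part of plqcom or rehline. With \(\tau=\infty\) and coefficients \(\pm\sqrt{2}\), the two terms should add up to \(z^2\).

```python
import numpy as np

def rehu(z, tau):
    """ReHU_tau(z): 0 for z <= 0, z^2/2 for 0 < z <= tau, tau*(z - tau/2) beyond."""
    z = np.asarray(z, dtype=float)
    quad = 0.5 * np.square(z)
    if np.isinf(tau):
        # An infinite cut point means the quadratic branch never ends.
        return np.where(z > 0, quad, 0.0)
    lin = tau * (z - 0.5 * tau)
    return np.where(z <= 0, 0.0, np.where(z <= tau, quad, lin))

# Values re-entered from the printed output above.
rehu_cut = np.array([[np.inf], [np.inf]])
rehu_coef = np.array([[-np.sqrt(2.0)], [np.sqrt(2.0)]])
rehu_intercept = np.array([[0.0], [0.0]])

z = np.linspace(-3.0, 3.0, 13)
# Sum the two ReHU terms evaluated at s*z + t.
recon = sum(
    rehu(s * z + t, tau)
    for tau, s, t in zip(rehu_cut[:, 0], rehu_coef[:, 0], rehu_intercept[:, 0])
)
print(np.allclose(recon, z ** 2))
```

Each term covers one half-line: the \(-\sqrt{2}\) term is active for \(z<0\) and the \(+\sqrt{2}\) term for \(z>0\), and in both cases \((\sqrt{2}z)^2/2=z^2\).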

3. Broadcast to All Samples

Each sample uses the regression form \(L_i(z_i)=L(y_i-z_i)\), selected with form='regression' or, equivalently, with p=-1 and q=y. The sample weight is left at c=1; the ERM strength is set through ReHLine(C=C) in step 4.

[5]:

# c=1: uniform weights; ERM strength via ReHLine(C=C) in step 4
rehloss = affine_transformation(rehloss, n=X.shape[0], c=1, p=-1, q=y)
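
To see what the broadcast does to the ReHU parameters, the algebra can be written out directly: a term \(\mathrm{ReHU}_\tau(s\,u+t)\) evaluated at \(u=pz_i+q_i\) becomes \(\mathrm{ReHU}_\tau\big((sp)\,z_i+(sq_i+t)\big)\). The sketch below applies this with NumPy on a few made-up targets. It illustrates the algebra only and is not plqcom's implementation; the names `y_demo`, `z_demo`, `S`, `T` and the (terms × samples) array layout are assumptions made for this example.

```python
import numpy as np

rng = np.random.RandomState(0)
y_demo = rng.randn(4)   # hypothetical targets standing in for y
z_demo = rng.randn(4)   # hypothetical linear scores x_i^T beta

# Base decomposition of L(u) = u^2: two ReHU terms with tau = inf.
s = np.array([-np.sqrt(2.0), np.sqrt(2.0)])
t = np.zeros(2)
p = -1.0                # regression form: u = p*z + q with q = y

# Per-sample parameters, one row per ReHU term and one column per sample.
S = np.tile((s * p)[:, None], (1, y_demo.size))   # slope s*p
T = s[:, None] * y_demo[None, :] + t[:, None]     # intercept s*q_i + t

# With tau = inf, ReHU(a) = max(a, 0)^2 / 2; sum over the two terms.
arg = S * z_demo + T
per_sample = 0.5 * np.sum(np.maximum(arg, 0.0) ** 2, axis=0)
print(np.allclose(per_sample, (y_demo - z_demo) ** 2))
```

The targets end up entirely in the intercepts `T`, which is why the solver later needs only `X`.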

4. Solve with ReHLine

rehline \(\geq\) 0.1.0 supports two calling styles:

4a. Low-level API — after plqcom decomposition, pass rehloss coefficients via _Tau, _S, _T, etc.
4b. Scikit-learn style — for built-in MSE / ridge, use plq_Ridge_Regressor with fit(X, y) directly.

4a. Low-level API (plqcom Decomposition)

[6]:

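clf = ReHLine(C=C)
# Squared error needs only ReHU terms: pass the broadcast cut points, slopes and intercepts
clf._Tau, clf._S, clf._T = rehloss.rehu_cut, rehloss.rehu_coef, rehloss.rehu_intercept
# Only X is passed: the targets y are already encoded in the intercepts (q=y in step 3)
clf.fit(X=X)
print('sol provided by rehline: %s' % clf.coef_)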

sol provided by rehline: [ 0.31029299 -0.73855141 -1.53883908 -0.56076169 -1.6047202 ]

4b. Scikit-learn Style (Built-in MSE Loss)

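This style skips the plqcom decomposition and broadcasting (steps 2–3); only the data from step 1 is needed. Set fit_intercept=False to match the low-level setup in 4a, which fits no intercept.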

[ ]:

from rehline import plq_Ridge_Regressor

# fit_intercept=False matches the intercept-free low-level setup in 4a
clf_sk = plq_Ridge_Regressor(loss={'name': 'MSE'}, C=C, fit_intercept=False)
clf_sk.fit(X, y)
print('sol provided by plq_Ridge_Regressor: %s' % clf_sk.coef_)

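Compare with the solution provided by sklearn. With C = 1, the ReHLine objective is \(\sum_i (y_i-x_i^\top\beta)^2+\tfrac12\|\beta\|^2\), while sklearn's Ridge minimizes \(\|y-X\beta\|^2+\alpha\|\beta\|^2\), so the matching setting is alpha = 1/(2C) = 0.5. Small differences in the coefficients can come from Ridge fitting an intercept by default and from the ReHLine solver tolerance.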

[7]:

from sklearn.linear_model import Ridge
clf1 = Ridge(alpha=0.5)
clf1.fit(X, y)
print('sol provided by sklearn: %s' % clf1.coef_)

sol provided by sklearn: [ 0.31017419 -0.73849146 -1.53893293 -0.56084128 -1.60476756]
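
As a further cross-check that depends on neither solver, the intercept-free ridge problem has a closed form. Assuming the objective \(C\sum_i (y_i-x_i^\top\beta)^2+\tfrac12\|\beta\|^2\), setting the gradient to zero gives \((X^\top X+\tfrac{1}{2C}I)\,\beta=X^\top y\). The sketch below regenerates the data from step 1 and solves this system with NumPy; `alpha`, `A` and `beta_closed` are names introduced here for illustration.

```python
import numpy as np

# Same synthetic data as in step 1.
n_samples, n_features = 1000, 5
C = 1.0
rng = np.random.RandomState(0)
X = rng.randn(n_samples, n_features)
beta = rng.randn(n_features)
y = np.dot(X, beta) + rng.normal(scale=0.1, size=n_samples)

# Normal equations of C * sum (y_i - x_i^T b)^2 + 0.5 * ||b||^2 (no intercept).
alpha = 1.0 / (2.0 * C)
A = X.T @ X + alpha * np.eye(n_features)
beta_closed = np.linalg.solve(A, X.T @ y)
print('closed-form sol: %s' % beta_closed)
```

Because this solution has no intercept, it should sit closest to the 4a result, with the sklearn fit differing slightly when its default intercept is enabled.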