A distance metric learning algorithm that learns a metric by minimizing the KL divergence to the maximally collapsing distribution.
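Concretely, following Globerson and Roweis (2006): a positive semidefinite matrix $A$ defines the Mahalanobis distance $d_A(x, y) = (x - y)^\top A (x - y)$, under which each point $x_i$ induces a conditional distribution over the remaining points,

$$p^A(j \mid i) = \frac{\exp\left(-d_A(x_i, x_j)\right)}{\sum_{k \neq i} \exp\left(-d_A(x_i, x_k)\right)}, \qquad j \neq i,$$

and MCML minimizes $\sum_i \mathrm{KL}\!\left[\, p_0(\cdot \mid i) \,\middle\|\, p^A(\cdot \mid i) \,\right]$ over PSD matrices $A$, where the maximally collapsing distribution $p_0(j \mid i) \propto \mathbb{1}[y_i = y_j]$ places equal mass on same-class points and zero mass elsewhere.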
MCML(num_dims = NULL, learning_rate = "adaptive", eta0 = 0.01, initial_metric = NULL, max_iter = 20, prec = 0.01, tol = 0.01, descent_method = "SDP", eta_thres = 1e-14, learn_inc = 1.01, learn_dec = 0.5)
Argument | Description
---|---
num_dims | Number of dimensions for dimensionality reduction. Not supported yet.
learning_rate | Type of learning-rate update for gradient descent. Possible values are: - 'adaptive' : the learning rate increases after each successful gradient step and decreases otherwise. - 'constant' : the learning rate is kept constant across all gradient steps.
eta0 | The initial value for the learning rate. Float.
initial_metric | If a matrix, it must be a d x d positive semidefinite matrix (where d is the number of features) used as the starting metric for gradient descent. If NULL, the Euclidean distance will be used. If a string, the following values are allowed: - 'euclidean' : the Euclidean distance. - 'scale' : a diagonal matrix that normalizes each attribute according to its range.
max_iter | Maximum number of iterations of gradient descent. Integer.
prec | Precision stopping criterion (gradient norm). Float.
tol | Tolerance stopping criterion (difference between two consecutive iterations). Float.
descent_method | The descent method to use. The only allowed value is: - 'SDP' : semidefinite programming, consisting of gradient descent with projections onto the PSD cone.
eta_thres | A learning-rate threshold stopping criterion.
learn_inc | Increase factor for the learning rate. Ignored if learning_rate is not 'adaptive'.
learn_dec | Decrease factor for the learning rate. Ignored if learning_rate is not 'adaptive'.
The MCML transformer, structured as a named list.
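For orientation, a minimal usage sketch follows. It assumes the package documented here is attached, that MCML() is exported as in the usage line above, and that the returned named list exposes fit and transform closures; those field names are an assumption about the transformer interface, not confirmed API.

```r
# Minimal sketch: learn an MCML metric on iris and project the data.
# The fit()/transform() fields of the returned named list are assumed
# names for the transformer interface, not documented API.
data(iris)
X <- as.matrix(iris[, 1:4])  # feature matrix, d = 4
y <- iris$Species            # class labels

mcml <- MCML(learning_rate = "adaptive", eta0 = 0.01, max_iter = 20)

mcml$fit(X, y)               # gradient descent with PSD projections ('SDP')
Xt <- mcml$transform(X)      # data mapped under the learned Mahalanobis metric
```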
Amir Globerson and Sam T. Roweis. "Metric learning by collapsing classes". In: Advances in Neural Information Processing Systems. 2006, pp. 451-458.