Large Margin Nearest Neighbor (LMNN) is a distance metric learning algorithm that learns a metric under which each sample's target neighbors are as near as possible while impostors are kept as far as possible.
LMNN(num_dims = NULL, learning_rate = "adaptive", eta0 = 0.3, initial_metric = NULL, max_iter = 100, prec = 1e-08, tol = 1e-08, k = 3, mu = 0.5, soft_comp_interval = 1, learn_inc = 1.01, learn_dec = 0.5, eta_thres = 1e-14, solver = "SDP")
num_dims | Desired output dimension for dimensionality reduction. Ignored if solver is 'SDP'. If NULL, all features will be kept. Integer. |
---|---|
learning_rate | Type of learning rate update for gradient descent. Possible values are: - 'adaptive' : the learning rate will increase if the gradient step is successful, else it will decrease. - 'constant' : the learning rate will be constant during all the gradient steps. |
eta0 | The initial value for learning rate. |
initial_metric | If array or matrix, and solver is 'SDP', it must be a positive semidefinite matrix with the starting metric (d x d) for gradient descent, where d is the number of features. If NULL, the Euclidean distance will be used. If a string, the following values are allowed: - 'euclidean' : the Euclidean distance. - 'scale' : a diagonal matrix that normalizes each attribute according to its range will be used. If solver is 'SGD', then the array or matrix will represent the starting linear map (d' x d), where d' is the dimension provided in num_dims. |
max_iter | Maximum number of iterations of gradient descent. Integer. |
prec | Precision stop criterion (gradient norm). Float. |
tol | Tolerance stop criterion (difference between two iterations). Float. |
k | Number of target neighbors to take. If this algorithm is used for nearest neighbors classification, a good choice is to set k equal to the number of neighbors used by the classifier. Integer. |
mu | The weight of the push error in the minimization algorithm. The objective function is composed of a push error, given by the impostors, with weight mu, and a pull error, given by the target neighbors, with weight (1-mu). It must be between 0.0 and 1.0. |
soft_comp_interval | Length of the soft computation interval. Soft computation relaxes the gradient descent conditions, making the algorithm more efficient. After soft_comp_interval soft iterations of gradient descent, a complete gradient step is performed. Integer. |
learn_inc | Increase factor for learning rate. Ignored if learning_rate is not 'adaptive'. Float. |
learn_dec | Decrease factor for learning rate. Ignored if learning_rate is not 'adaptive'. Float. |
eta_thres | A learning rate threshold stop criterion. Float. |
solver | The algorithm used for minimization. Allowed values are: - 'SDP' : semidefinite programming, consisting of gradient descent with projections onto the positive semidefinite cone. It learns a metric. - 'SGD' : stochastic gradient descent. It learns a linear transformer. |
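To make the mu weighting and the 'SDP' solver's projection step concrete, here is a minimal Python sketch of the LMNN loss and the projection onto the positive semidefinite cone. The function names and the data layout (pair/triplet lists) are illustrative assumptions, not the package's API:

```python
import numpy as np

def psd_project(M):
    """Project a symmetric matrix onto the positive semidefinite cone
    by clipping negative eigenvalues at zero. This is the projection
    the 'SDP' solver applies after each gradient step."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * np.clip(vals, 0.0, None)) @ vecs.T

def lmnn_objective(M, X, targets, impostors, mu):
    """mu-weighted LMNN loss: (1 - mu) * pull error over target-neighbor
    pairs plus mu * push error over impostor triplets (hinge, margin 1).
    `targets` holds (i, j) pairs; `impostors` holds (i, j, l) triplets
    where l is an impostor for the pair (i, j). Illustrative sketch only."""
    def d2(a, b):
        # squared Mahalanobis distance under metric M
        diff = a - b
        return diff @ M @ diff
    pull = sum(d2(X[i], X[j]) for i, j in targets)
    push = sum(max(0.0, 1.0 + d2(X[i], X[j]) - d2(X[i], X[l]))
               for i, j, l in impostors)
    return (1.0 - mu) * pull + mu * push
```

With mu = 0.5 both error terms are weighted equally; mu closer to 1.0 prioritizes pushing impostors away over pulling target neighbors closer.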
The LMNN transformer, structured as a named list.
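The two solvers' outputs are related: the 'SDP' solver learns a metric M, the 'SGD' solver a linear map L, and any PSD metric factors as M = LᵀL, so distances under the metric equal Euclidean distances after applying the map. A hedged sketch of that conversion (illustrative, not the package's own conversion routine):

```python
import numpy as np

def metric_to_transform(M):
    """Factor a PSD metric M as L.T @ L via eigendecomposition, so the
    Mahalanobis distance under M equals the Euclidean distance after
    applying L. Negative eigenvalues from round-off are clipped to zero."""
    vals, vecs = np.linalg.eigh(M)
    return np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

# Usage: distances agree under both views.
M = np.array([[2.0, 0.5], [0.5, 1.0]])   # an example PSD metric
L = metric_to_transform(M)
x, y = np.array([1.0, 2.0]), np.array([0.0, -1.0])
d_metric = np.sqrt((x - y) @ M @ (x - y))  # distance under the metric M
d_map = np.linalg.norm(L @ x - L @ y)      # distance after the linear map L
```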
Kilian Q. Weinberger and Lawrence K. Saul. "Distance metric learning for large margin nearest neighbor classification". In: Journal of Machine Learning Research 10.Feb (2009), pages 207-244.