Bibtek download is not availble in the pre-proceeding
Mathieu Blondel, Olivier Teboul, Quentin Berthet, Josip Djolonga
Sorting is an elementary building block of modern software. In machine learning and statistics, it is commonly used in robust statistics, order statistics and ranking metrics. However, sorting is a piecewise linear function and as a result includes many kinks at which it is non-differentiable. More problematic, the ranking operator is a piecewise constant function, meaning that its derivatives are null or undefined. While numerous works have proposed differentiable proxies to sorting and ranking, they do not achieve the $O(n \log n)$ time complexity one could expect from a sorting or ranking operation. In this paper, we propose the first differentiable sorting and ranking operators with $O(n \log n)$ time and $O(n)$ space complexity. Our proposal in addition enjoys exact computation and differentiation. We achieve this feat by casting differentiable sorting and ranking as projections onto a permutahedron, the convex hull of permutations, and using a reduction to isotonic optimization. Empirically, we confirm that our approach is an order of magnitude faster than existing approaches. We also showcase two novel applications: differentiable Spearman's rank coefficient and differentiable least trimmed squares.