SoftSort: A Differentiable Continuous Relaxation of the argsort Operator

Part of the Proceedings of the International Conference on Machine Learning (ICML 2020)

Authors

Sebastian Prillo, Julian Eisenschlos

Abstract

Sorting is an important procedure in computer science. However, the argsort operator (which takes as input a vector and returns its sorting permutation) has a discrete image and thus zero gradients almost everywhere. This prohibits end-to-end, gradient-based learning of models that rely on the argsort operator. A natural way to overcome this problem is to replace the argsort operator with a continuous relaxation. Recent work has shown a number of ways to do this, but the relaxations proposed so far are computationally complex. In this work we propose a simple continuous relaxation for the argsort operator. Unlike previous works, our relaxation is straightforward: it can be implemented in three lines of code, achieves state-of-the-art performance, is easy to reason about mathematically (substantially simplifying proofs), and is up to six times faster than competing approaches. We open-source the code to reproduce all of the experiments.
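To make the "three lines of code" claim concrete, the PyTorch snippet below sketches a relaxation consistent with the abstract's description: a row-wise softmax over negative pairwise distances between the sorted and unsorted entries of the input. The function name `soft_sort`, the temperature parameter `tau`, and the choice of absolute-difference distance are illustrative assumptions; the authors' open-sourced code remains the definitive reference.

```python
import torch

def soft_sort(s: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Sketch of a SoftSort-style relaxation (assumed formulation, L1 distance).

    s:   tensor of shape (batch, n) holding the vectors to be sorted.
    tau: temperature; smaller values give a sharper (more permutation-like) output.
    Returns a row-stochastic tensor of shape (batch, n, n) approximating the
    argsort permutation matrix (descending order).
    """
    s = s.unsqueeze(-1)                                           # (batch, n, 1)
    s_sorted = s.sort(descending=True, dim=1).values              # (batch, n, 1)
    pairwise = (s.transpose(1, 2) - s_sorted).abs().neg() / tau   # (batch, n, n)
    return pairwise.softmax(dim=-1)                               # rows sum to 1
```

As `tau` goes to zero, each row of the output approaches the corresponding one-hot row of the true argsort permutation matrix, while for any `tau > 0` the operator is differentiable, so gradients can flow through it end-to-end:

```python
scores = torch.tensor([[2.0, 5.0, 1.0]], requires_grad=True)
P_hat = soft_sort(scores, tau=0.1)            # rows close to one-hot argsort rows
loss = (P_hat @ scores.unsqueeze(-1)).sum()   # soft-sorted values, reduced to a scalar
loss.backward()                               # gradients reach `scores`
```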