Tianyi Zhou, Shengjie Wang, Jeff Bilmes
Semi-supervised learning (SSL) leverages unlabeled data when training a model with insufficient labeled data. A common strategy for SSL is to enforce the consistency of model outputs between similar samples, e.g., neighbors or data augmentations of the same sample. However, model outputs can vary dramatically on unlabeled data over different training stages, e.g., when using large learning rates. This can introduce harmful noises and inconsistent objectives over time that may lead to concept drift and catastrophic forgetting. In this paper, we study the dynamics of neural net outputs in SSL and show that selecting and using first the unlabeled samples with more consistent outputs over the course of training (i.e., "time-consistency") can improve the final test accuracy and save computation. Under the time-consistent data selection, we design an SSL objective composed of two self-supervised losses, i.e., a consistency loss between a sample and its augmentation, and a contrastive loss encouraging different samples to have different outputs. Our approach achieves SOTA on several SSL benchmarks with much fewer computations.