A Chance-Constrained Generative Framework for Sequence Optimization

Part of Proceedings of the International Conference on Machine Learning 1 pre-proceedings (ICML 2020)

Bibtex »Metadata »Paper »Supplemental »

Bibtek download is not availble in the pre-proceeding


Xianggen Liu, Qiang Liu, Sen Song , Jian Peng


<p>Deep generative modeling has achieved many successes for continuous data generation, such as producing realistic images and controlling their properties (e.g., styles). However, the development of generative modeling techniques for optimizing discrete data, such as sequences or strings, still lags behind largely due to the challenges in modeling complex and long-range constraints, including both syntax and semantics, in discrete structures. For example, to generate a string representing a molecule structure or a mathematical expression with a desired quantitative property, we need to both ensure the validity of the generated string subject to a grammar and model the string representation so that it is predictive of the property. In this paper, we formulate the sequence optimization task as a chance-constrained sampling problem. The key idea is to enforce a high probability of generating valid sequences and also optimizes the property of interest. We propose a novel minmax algorithm based a tightening of the chance constraint, by jointly tightening a bound of the valid chance and optimizing the expected property. Extensive experimental results in three domains, including arithmetic expressions, Python programs, and SMILES strings for molecules, demonstrate the superiority of our approach over the existing sequence optimization methods. In particular, it is able to achieve the state-of-the-art performance in the molecule optimization task where the current best methods are graph-based.</p>