Hot Papers 2020-07-06

1. A (Slightly) Improved Approximation Algorithm for Metric TSP

Anna R. Karlin, Nathan Klein, Shayan Oveis Gharan

For some ϵ > 10^{-36} we give a (3/2 - ϵ)-approximation algorithm for metric TSP.

2. In Search of Lost Domain Generalization

Ishaan Gulrajani, David Lopez-Paz

The goal of domain generalization algorithms is to predict well on distributions different from those seen during training. While a myriad of domain generalization algorithms exist, inconsistencies in experimental conditions — datasets, architectures, and model selection criteria — render fair and realistic comparisons difficult. In this paper, we are interested in understanding how useful domain generalization algorithms are in realistic settings. As a first step, we realize that model selection is non-trivial for domain generalization tasks. Contrary to prior work, we argue that domain generalization algorithms without a model selection strategy should be regarded as incomplete. Next, we implement DomainBed, a testbed for domain generalization including seven multi-domain datasets, nine baseline algorithms, and three model selection criteria. We conduct extensive experiments using DomainBed and find that, when carefully implemented, empirical risk minimization shows state-of-the-art performance across all datasets. Looking forward, we hope that the release of DomainBed, along with contributions from fellow researchers, will streamline reproducible and rigorous research in domain generalization.
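The ERM baseline that DomainBed finds so competitive is conceptually simple: pool the examples from every training domain and minimize average loss, ignoring domain labels entirely. The sketch below illustrates that idea on a toy problem with a shared labeling rule and domain-specific input shifts; all names here (`make_domain`, `train_erm`) are illustrative stand-ins, not DomainBed's actual API, and the model is a plain logistic regression rather than the paper's deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(shift, n=200):
    """Toy binary-classification domain: same labeling rule (x0 + x1 > 0),
    but inputs shifted by a domain-specific amount."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    return X, y

def train_erm(domains, lr=0.1, steps=1000):
    """ERM: logistic regression on the pooled data from all training
    domains, with no use of domain identity."""
    X = np.vstack([d[0] for d in domains])
    y = np.concatenate([d[1] for d in domains])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid probabilities
        grad_w = X.T @ (p - y) / len(y)         # mean log-loss gradient
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Train on three shifted domains, evaluate on an unseen fourth.
train_domains = [make_domain(s) for s in (-0.5, 0.0, 0.5)]
w, b = train_erm(train_domains)
X_test, y_test = make_domain(1.5)
acc = np.mean(((X_test @ w + b) > 0) == y_test)
print(f"held-out-domain accuracy: {acc:.2f}")
```

Because the labeling rule is shared across domains, pooling the data recovers a boundary that transfers to the unseen shift, mirroring (in miniature) why a carefully tuned ERM baseline is hard to beat.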

3. Online learning in MDPs with linear function approximation and bandit feedback

Gergely Neu, Julia Olkhovskaya

We consider an online learning problem where the learner interacts with a Markov decision process in a sequence of episodes, where the reward function is allowed to change between episodes in an adversarial manner and the learner only gets to observe the rewards associated with its actions. We allow the state space to be arbitrarily large, but we assume that all action-value functions can be represented as linear functions in terms of a known low-dimensional feature map, and that the learner has access to a simulator of the environment that allows generating trajectories from the true MDP dynamics. Our main contribution is a computationally efficient algorithm that we call MDP-LinExp3, whose regret we prove is bounded by Õ(H^2 T^{2/3} (dK)^{1/3}), where T is the number of episodes, H is the number of steps in each episode, K is the number of actions, and d is the dimension of the feature map. We also show that the regret can be improved to Õ(H^2 √(TdK)) under much stronger assumptions on the MDP dynamics. To our knowledge, MDP-LinExp3 is the first provably efficient algorithm for this problem setting.
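The "Exp3" in MDP-LinExp3 refers to the classic exponential-weights algorithm for adversarial bandits, which handles the bandit-feedback aspect: only the chosen action's reward is observed, so the update uses importance-weighted reward estimates. The sketch below shows that core for a plain K-armed bandit; the paper's actual algorithm extends it with the linear feature map and simulator-based value estimates, which are omitted here.

```python
import numpy as np

rng = np.random.default_rng(1)

def exp3(reward_fn, K, T, eta=0.1, gamma=0.1):
    """Exp3: exponential weights over K actions under bandit feedback.
    gamma mixes in uniform exploration; eta is the learning rate."""
    log_w = np.zeros(K)
    total = 0.0
    for t in range(T):
        # Softmax of the weights (stabilized), mixed with uniform exploration.
        p = np.exp(log_w - log_w.max())
        p /= p.sum()
        p = (1.0 - gamma) * p + gamma / K
        a = rng.choice(K, p=p)
        r = reward_fn(t, a)        # bandit feedback: one reward observed
        total += r
        r_hat = r / p[a]           # importance-weighted estimate of r(t, a)
        log_w[a] += eta * r_hat
    return total

# Toy reward sequence in [0, 1]: arm 2 is consistently best.
def reward_fn(t, a):
    return 0.9 if a == 2 else 0.1

total = exp3(reward_fn, K=3, T=2000)
print(f"cumulative reward over 2000 rounds: {total:.0f}")
```

Importance weighting keeps the reward estimates unbiased even though unchosen arms are never observed, which is what makes sublinear regret possible against adversarial reward sequences.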

4. Self-Supervised GAN Compression

Chong Yu, Jeff Pool

Deep learning’s success has led to larger and larger models to handle more and more complex tasks; trained models can contain millions of parameters. These large models are compute- and memory-intensive, which makes it a challenge to deploy them with low latency, high throughput, and small storage footprints. Some model compression methods have been successfully applied to image classification and detection or language models, but there has been very little work compressing generative adversarial networks (GANs) performing complex tasks. In this paper, we show that a standard model compression technique, weight pruning, cannot be applied to GANs using existing methods. We then develop a self-supervised compression technique which uses the trained discriminator to supervise the training of a compressed generator. We show that this framework maintains compelling performance at high degrees of sparsity, can be easily applied to new tasks and models, and enables meaningful comparisons between different pruning granularities.
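The self-supervision idea can be sketched as a two-term objective: the pruned generator is trained to reproduce the dense generator's outputs while the frozen, trained discriminator keeps its outputs on the realistic side of the decision boundary. The toy below uses linear maps as stand-ins for the networks and a fixed binary mask as the pruning pattern; the loss form, weighting, and learning rate are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a "dense generator" G (latent 4 -> sample 8) and a frozen
# "discriminator" scoring function d_vec @ x. Both stay fixed during compression.
G = rng.normal(size=(8, 4))
d_vec = rng.normal(size=8)

# Pruning mask zeroes ~half the weights; the compressed generator Gc starts
# from the masked dense weights and only unmasked entries are ever updated.
mask = (rng.random(G.shape) > 0.5).astype(float)
Gc = G * mask

z = rng.normal(size=4)   # a fixed latent code, for determinism
lam = 0.01               # weight on the adversarial term (assumed value)

def loss(Gc):
    x_dense = G @ z
    x_comp = Gc @ z
    recon = np.mean((x_comp - x_dense) ** 2)  # match the dense outputs
    adv = -float(d_vec @ x_comp)              # keep the discriminator score high
    return recon + lam * adv

initial = loss(Gc)
lr = 0.05
for _ in range(100):
    x_comp = Gc @ z
    # Analytic gradient of recon + lam * adv with respect to Gc.
    grad = (2.0 / x_comp.size) * np.outer(x_comp - G @ z, z) \
           - lam * np.outer(d_vec, z)
    Gc = Gc - lr * grad * mask   # pruned entries stay exactly zero
final = loss(Gc)
print(f"compression loss: {initial:.3f} -> {final:.3f}")
```

The key design point mirrored here is that the discriminator is never retrained: having already learned what realistic outputs look like, it acts as a free, task-specific teacher for the compressed generator.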