1. SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel
Model-free deep reinforcement learning (RL) has been successful in a range of challenging domains. However, there are some remaining issues, such as stabilizing the optimization of nonlinear function approximators, preventing error propagation due to the Bellman backup in Q-learning, and efficient exploration. To mitigate these issues, we present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy RL algorithms. SUNRISE integrates three key ingredients: (a) bootstrap with random initialization, which improves the stability of the learning process by training a diverse ensemble of agents, (b) weighted Bellman backups, which prevent error propagation in Q-learning by reweighting sample transitions based on uncertainty estimates from the ensembles, and (c) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration. Our experiments show that SUNRISE significantly improves the performance of existing off-policy RL algorithms, such as Soft Actor-Critic and Rainbow DQN, for both continuous and discrete control tasks on both low-dimensional and high-dimensional environments. Our training code is available at https://github.com/pokaxpoka/sunrise.
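To make ingredients (b) and (c) concrete, here is a minimal, self-contained sketch (not the authors' code) of how an ensemble of Q-functions can drive upper-confidence-bound action selection and an uncertainty-weighted Bellman target for discrete actions. The toy linear Q-functions, the sigmoid-based weight, and the hyperparameter values are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of the two ensemble signals in SUNRISE:
# UCB-based action selection and an uncertainty-weighted Bellman target.
import numpy as np

rng = np.random.default_rng(0)
N_ENSEMBLE, STATE_DIM, N_ACTIONS = 5, 4, 3
GAMMA, UCB_LAMBDA, WEIGHT_TEMP = 0.99, 1.0, 10.0

# Toy ensemble: each member is a random linear Q-function Q_i(s) -> R^{N_ACTIONS}.
ensemble = [rng.normal(size=(STATE_DIM, N_ACTIONS)) for _ in range(N_ENSEMBLE)]

def q_values(state):
    """Stack Q_i(state, .) over ensemble members: shape (N_ENSEMBLE, N_ACTIONS)."""
    return np.stack([state @ W for W in ensemble])

def ucb_action(state):
    """Pick the action with the highest upper-confidence bound: mean + lambda * std."""
    q = q_values(state)
    return int(np.argmax(q.mean(0) + UCB_LAMBDA * q.std(0)))

def weighted_target(reward, next_state, done):
    """Bellman target, down-weighted when the ensemble disagrees on the next Q-value."""
    q_next = q_values(next_state)
    a_next = ucb_action(next_state)  # illustrative; each member could also act greedily
    std = q_next.std(0)[a_next]
    # Weight in (0.5, 1.0]: close to 1 when members agree, near 0.5 when they disagree.
    weight = 1.0 / (1.0 + np.exp(std * WEIGHT_TEMP)) + 0.5
    target = reward + GAMMA * (1.0 - done) * q_next.mean(0)[a_next]
    return weight, target

s, s_next = rng.normal(size=STATE_DIM), rng.normal(size=STATE_DIM)
print("action:", ucb_action(s))
print("weight, target:", weighted_target(reward=1.0, next_state=s_next, done=0.0))
```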
Can ensembles improve off-policy RL by handling various issues?
— Kimin (@kimin_le2) July 10, 2020
Yes! We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy RL algorithms. New work with @MishaLaskin @AravSrinivas and @pabbeel
Paper: https://t.co/RMIGNVcqGU
1/N
SUNRISE can easily be applied to off-policy methods, like SAC and Rainbow DQN, improving their performance on OpenAI Gym, DM Control, and Atari.
— Pieter Abbeel (@pabbeel) July 10, 2020
Paper: https://t.co/LXGi6nAzTZ
Code: https://t.co/p4kVU5CkW1
w/@kimin_le2 @MishaLaskin @AravSrinivas
2. Reformulation of the No-Free-Lunch Theorem for Entangled Data Sets
Kunal Sharma, M. Cerezo, Zoë Holmes, Lukasz Cincio, Andrew Sornborger, Patrick J. Coles
The No-Free-Lunch (NFL) theorem is a celebrated result in learning theory that limits one’s ability to learn a function with a training data set. With the recent rise of quantum machine learning, it is natural to ask whether there is a quantum analog of the NFL theorem, which would restrict a quantum computer’s ability to learn a unitary process (the quantum analog of a function) with quantum training data. However, in the quantum setting, the training data can possess entanglement, a strong correlation with no classical analog. In this work, we show that entangled data sets lead to an apparent violation of the (classical) NFL theorem. This motivates a reformulation that accounts for the degree of entanglement in the training set. As our main result, we prove a quantum NFL theorem whereby the fundamental limit on the learnability of a unitary is reduced by entanglement. We employ Rigetti’s quantum computer to test both the classical and quantum NFL theorems. Our work establishes that entanglement is a commodity in quantum machine learning.
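The quantity that matters in the reformulated bound is how entangled each training state is with a reference system. As a quick, self-contained illustration (not from the paper), the Schmidt rank of a bipartite training state can be read off from an SVD of the reshaped state vector:

```python
# Small illustration (not from the paper): the Schmidt rank of a bipartite training
# state, i.e. the degree of entanglement with a reference system.
import numpy as np

def schmidt_rank(state, dim_a, dim_b, tol=1e-10):
    """Number of nonzero Schmidt coefficients of a pure state on H_A (x) H_B."""
    coeffs = np.linalg.svd(state.reshape(dim_a, dim_b), compute_uv=False)
    return int(np.sum(coeffs > tol))

# |0>|0>: a product state, Schmidt rank 1 (no entanglement with the reference).
product = np.kron([1, 0], [1, 0]).astype(complex)

# (|00> + |11>)/sqrt(2): maximally entangled, Schmidt rank 2.
bell = np.zeros(4, dtype=complex)
bell[0] = bell[3] = 1 / np.sqrt(2)

print(schmidt_rank(product, 2, 2))  # -> 1
print(schmidt_rank(bell, 2, 2))     # -> 2
```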
New work: Reformulating the Quantum No-Free-Lunch Theorem for Entangled Data Sets https://t.co/EF4j5LMqiv
— Marco Cerezo (@MvsCerezo) July 10, 2020
In collaboration w/ @kunal_phy, Z. Holmes, L. Cincio, @sornborg, and @PatrickColes314
Our results were verified w/ @rigetti's device!
👇See thread for details 👇 pic.twitter.com/tB0cPrfTwb
Our new paper is best summarized by the cartoon below.
— Patrick Coles (@PatrickColes314) July 10, 2020
Joint work with @kunal_phy, @MvsCerezo, Z. Holmes, L. Cincio, @sornborg.
Scirate link: https://t.co/Jlibl14cfj pic.twitter.com/gyDu9JAypP
check out our new results:
— Kunal Sharma (@kunal_phy) July 10, 2020
"Reformulating the Quantum No-Free-Lunch Theorem for Entangled Data Sets" https://t.co/NwAjQWI8Js
in collaboration with @MvsCerezo, Zoe, Lukasz, @sornborg, and @PatrickColes314
see @MvsCerezo's thread for a short summary of our results. 👇 https://t.co/ovAOiKnItx
3. Learning Graph Structure With A Finite-State Automaton Layer
Daniel D. Johnson, Hugo Larochelle, Daniel Tarlow
Graph-based neural network models are producing strong results in a number of domains, in part because graphs provide flexibility to encode domain knowledge in the form of relational structure (edges) between nodes in the graph. In practice, edges are used both to represent intrinsic structure (e.g., abstract syntax trees of programs) and more abstract relations that aid reasoning for a downstream task (e.g., results of relevant program analyses). In this work, we study the problem of learning to derive abstract relations from the intrinsic graph structure. Motivated by their power in program analyses, we consider relations defined by paths on the base graph accepted by a finite-state automaton. We show how to learn these relations end-to-end by relaxing the problem into learning finite-state automata policies on a graph-based POMDP and then training these policies using implicit differentiation. The result is a differentiable Graph Finite-State Automaton (GFSA) layer that adds a new edge type (expressed as a weighted adjacency matrix) to a base graph. We demonstrate that this layer can find shortcuts in grid-world graphs and reproduce simple static analyses on Python programs. Additionally, we combine the GFSA layer with a larger graph-based model trained end-to-end on the variable misuse program understanding task, and find that using the GFSA layer leads to better performance than using hand-engineered semantic edges or other baseline methods for adding learned edge types.
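As a rough, simplified sketch of the idea (not the GFSA implementation itself), one can run a differentiable automaton-controlled random walk over the base graph and collect, for every start node, the probability of halting in an accept state at each other node; that halting distribution is the new weighted adjacency matrix. The automaton parameterization below is a deliberately stripped-down assumption.

```python
# Rough sketch (not the GFSA implementation): a differentiable finite-state automaton
# walks over a base graph; its halting distribution defines a new weighted edge type.
import numpy as np

def gfsa_like_edges(adj, n_fsa_states, move_logits, accept_logits, n_steps=8):
    """adj: (n, n) 0/1 base adjacency.
    move_logits: (n_fsa_states, n_fsa_states) preference for switching FSA state.
    accept_logits: (n_fsa_states,) preference for halting and emitting an edge.
    Returns an (n, n) matrix of halting probabilities, one row per start node."""
    n = adj.shape[0]
    walk = adj / np.maximum(adj.sum(1, keepdims=True), 1)        # uniform walk on edges
    fsa = np.exp(move_logits); fsa /= fsa.sum(1, keepdims=True)  # softmax FSA transitions
    p_accept = 1 / (1 + np.exp(-accept_logits))                  # per-FSA-state halt prob

    # dist[s, m, v]: prob. a walk started at s is at node v in FSA state m, not yet halted.
    dist = np.zeros((n, n_fsa_states, n))
    dist[:, 0, :] = np.eye(n)                                    # all walks start in state 0
    new_edges = np.zeros((n, n))
    for _ in range(n_steps):
        new_edges += np.einsum("smv,m->sv", dist, p_accept)      # absorb halting mass
        dist = np.einsum("smv,m->smv", dist, 1 - p_accept)       # keep the rest walking
        dist = np.einsum("smv,mk,vw->skw", dist, fsa, walk)      # FSA step + graph step
    return new_edges

adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)   # tiny path graph
rng = np.random.default_rng(0)
print(gfsa_like_edges(adj, 2, rng.normal(size=(2, 2)), rng.normal(size=2)).round(2))
```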
We are excited to present the Graph Finite-State Automaton (GFSA) layer, which learns to add long-distance edges to graphs end-to-end based on a downstream objective! https://t.co/o81LidfMRD
— Daniel Johnson (@hexahedria) July 10, 2020
(With @numbercrunching and @hugo_larochelle. 1/9) pic.twitter.com/5IC4SLbwZd
4. Words as Art Materials: Generating Paintings with Sequential GANs
Azmi Can Özgen, Hazım Kemal Ekenel
Converting text descriptions into images using Generative Adversarial Networks has become a popular research area. Visually appealing images have been generated successfully in recent years. Inspired by these studies, we investigated the generation of artistic images on a large-variance dataset. This dataset includes images with variations, for example, in shape, color, and content. These variations provide originality, which is an important factor for artistic essence. One major characteristic of our work is that we used keywords as image descriptions, instead of sentences. As the network architecture, we proposed a sequential Generative Adversarial Network model. The first stage of this sequential model processes the word vectors and creates a base image, whereas the next stages focus on creating high-resolution artistic-style images without working on word vectors. To deal with the unstable nature of GANs, we employed a mixture of techniques such as Wasserstein loss, spectral normalization, and minibatch discrimination. Ultimately, we were able to generate painting images with a variety of styles. We evaluated our results using the Fréchet Inception Distance score and conducted a user study with 186 participants.
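As an illustration of the kind of stabilization the authors mention (not the paper's actual architecture), here is a minimal PyTorch critic that combines spectral normalization with a Wasserstein loss; layer sizes and input shapes are arbitrary placeholders.

```python
# Illustrative sketch: a spectrally normalized critic trained with a Wasserstein loss,
# two of the GAN stabilization techniques listed in the abstract.
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class Critic(nn.Module):
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            spectral_norm(nn.Conv2d(channels, 64, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            spectral_norm(nn.Conv2d(64, 128, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            nn.Flatten(),
            spectral_norm(nn.Linear(128 * 16 * 16, 1)),  # assumes 64x64 inputs
        )

    def forward(self, x):
        return self.net(x)

critic = Critic()
real = torch.randn(8, 3, 64, 64)
fake = torch.randn(8, 3, 64, 64)  # stand-in for generator output
# Wasserstein critic loss: push real scores up and fake scores down.
critic_loss = critic(fake).mean() - critic(real).mean()
critic_loss.backward()
```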
Words as Art Materials: Generating Paintings with Sequential GANs
— AK (@ak92501) July 10, 2020
pdf: https://t.co/pcAoNebBUY
abs: https://t.co/H7bW3MmhXe
github: https://t.co/6nsj0uiNWq
demo: https://t.co/iAyp4glCn9 pic.twitter.com/WElYGl1EUB
5. ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation
Chuang Gan, Jeremy Schwartz, Seth Alter, Martin Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Damian Mrowca, Michael Lingelbach, Aidan Curtis, Kevin Feigelis, Daniel M. Bear, Dan Gutfreund, David Cox, James J. DiCarlo, Josh McDermott, Joshua B. Tenenbaum, Daniel L.K. Yamins
We introduce ThreeDWorld (TDW), a platform for interactive multi-modal physical simulation. With TDW, users can simulate high-fidelity sensory data and physical interactions between mobile agents and objects in a wide variety of rich 3D environments. TDW has several unique properties: 1) real-time near photo-realistic image rendering quality; 2) a library of objects and environments with materials for high-quality rendering, and routines enabling user customization of the asset library; 3) generative procedures for efficiently building classes of new environments; 4) high-fidelity audio rendering; 5) believable and realistic physical interactions for a wide variety of material types, including cloths, liquids, and deformable objects; 6) a range of “avatar” types that serve as embodiments of AI agents, with the option for user avatar customization; and 7) support for human interactions with VR devices. TDW also provides a rich API enabling multiple agents to interact within a simulation and return a range of sensor and physics data representing the state of the world. We present initial experiments enabled by the platform around emerging research directions in computer vision, machine learning, and cognitive science, including multi-modal physical scene understanding, multi-agent interactions, models that “learn like a child”, and attention studies in humans and neural networks. The simulation platform will be made publicly available.
ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation https://t.co/A5M4AUBdKJ pic.twitter.com/iZ6I23YpY8
— sim2real (@sim2realAIorg) July 10, 2020
Physics + Sound + Interaction! Check out our TDW platform, ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation. Project page: https://t.co/yuryI8YQvS Paper link: https://t.co/UTboxDYGLu
— Chuang Gan (@gan_chuang) July 10, 2020
6. Finite mixture models are typically inconsistent for the number of components
Diana Cai, Trevor Campbell, Tamara Broderick
Scientists and engineers are often interested in learning the number of subpopulations (or components) present in a data set. Practitioners commonly use a Dirichlet process mixture model (DPMM) for this purpose; in particular, they count the number of clusters---i.e. components containing at least one data point---in the DPMM posterior. But Miller and Harrison (2013) warn that the DPMM cluster-count posterior is severely inconsistent for the number of latent components when the data are truly generated from a finite mixture; that is, the cluster-count posterior probability on the true generating number of components goes to zero in the limit of infinite data. A potential alternative is to use a finite mixture model (FMM) with a prior on the number of components. Past work has shown the resulting FMM component-count posterior is consistent. But existing results crucially depend on the assumption that the component likelihoods are perfectly specified. In practice, this assumption is unrealistic, and empirical evidence (Miller and Dunson, 2019) suggests that the FMM posterior on the number of components is sensitive to the likelihood choice. In this paper, we add rigor to data-analysis folk wisdom by proving that under even the slightest model misspecification, the FMM posterior on the number of components is ultraseverely inconsistent: for any finite k, the posterior probability that the number of components is k converges to 0 in the limit of infinite data. We illustrate practical consequences of our theory on simulated and real data sets.
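Written out, the headline result stated in the abstract says that under misspecification the FMM component-count posterior concentrates on no finite value:

```latex
% K is the number of components under the FMM prior, X_{1:n} the observed data,
% generated from a distribution outside the assumed finite-mixture family.
\[
  \Pr\left(K = k \mid X_{1:n}\right) \xrightarrow[n \to \infty]{} 0
  \qquad \text{for every finite } k .
\]
```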
New preprint with Trevor Campbell and Tamara Broderick: https://t.co/1yiCsx0OZf
— Diana Cai (@dianarycai) July 10, 2020
Finite mixture models are typically inconsistent for the posterior # of components -- we say "typically" because misspecification of the component distributions is almost unavoidable in practice
7. Improving Style-Content Disentanglement in Image-to-Image Translation
Aviv Gabbay, Yedid Hoshen
Unsupervised image-to-image translation methods have achieved tremendous success in recent years. However, their learned representations often contain significant entanglement, which hurts translation performance. In this work, we propose a principled approach for improving style-content disentanglement in image-to-image translation. By considering the information flow into each of the representations, we introduce an additional loss term that serves as a content bottleneck. We show that the results of our method are significantly more disentangled than those produced by current methods, while further improving the visual quality and translation diversity.
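As a generic sketch of what a content bottleneck can look like in practice (the paper's actual loss term, derived from its information-flow analysis, may differ), one can limit the capacity of the content code, for example by injecting noise and penalizing its norm, so that style information is pushed into the style representation.

```python
# Generic sketch of a content-bottleneck term (not necessarily the authors' exact loss):
# a noisy, norm-penalized content code acts as a lossy channel, pushing style
# information into the style code. Encoders and decoder are toy placeholders.
import torch
import torch.nn as nn

content_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
style_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 8))
decoder = nn.Linear(128 + 8, 3 * 64 * 64)

x = torch.randn(4, 3, 64, 64)                                # a toy batch of images
content = content_enc(x)
style = style_enc(x)
content_noisy = content + 0.1 * torch.randn_like(content)   # noise = lossy channel
recon = decoder(torch.cat([content_noisy, style], dim=1)).view_as(x)

recon_loss = (recon - x).pow(2).mean()
bottleneck_loss = content.pow(2).mean()                      # shrink content-code capacity
loss = recon_loss + 0.01 * bottleneck_loss
loss.backward()
```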
Improving Style-Content Disentanglement in Image-to-Image Translation
— AK (@ak92501) July 10, 2020
pdf: https://t.co/86UflGKakj
abs: https://t.co/rBR4t1PdgT
project page: https://t.co/f2O8lOxF9q pic.twitter.com/OcghxoOIZz