All Articles

Hot Papers 2020-07-09

1. NVAE: A Deep Hierarchical Variational Autoencoder

Arash Vahdat, Jan Kautz

Normalizing flows, autoregressive models, variational autoencoders (VAEs), and deep energy-based models are among competing likelihood-based frameworks for deep generative learning. Among them, VAEs have the advantage of fast and tractable sampling and easy-to-access encoding networks. However, they are currently outperformed by other models such as normalizing flows and autoregressive models. While the majority of the research in VAEs is focused on the statistical challenges, we explore the orthogonal direction of carefully designing neural architectures for hierarchical VAEs. We propose Nouveau VAE (NVAE), a deep hierarchical VAE built for image generation using depth-wise separable convolutions and batch normalization. NVAE is equipped with a residual parameterization of Normal distributions and its training is stabilized by spectral regularization. We show that NVAE achieves state-of-the-art results among non-autoregressive likelihood-based models on the MNIST, CIFAR-10, and CelebA HQ datasets and it provides a strong baseline on FFHQ. For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ as shown in Fig. 1. To the best of our knowledge, NVAE is the first successful VAE applied to natural images as large as 256×256 pixels.
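
The "residual parameterization of Normal distributions" admits a compact illustration: each level's approximate posterior is expressed as a delta on top of the prior's parameters, which keeps the per-level KL term well behaved in a deep hierarchy. Below is a minimal PyTorch-style sketch under that reading; shapes and names are illustrative, not the released NVAE code.

```python
# Minimal sketch (not the official NVAE code) of the residual Normal
# parameterization: the approximate posterior at each hierarchy level is
# defined *relative* to the prior, q = N(mu_p + d_mu, (sigma_p * exp(d_log_sig))^2).
import torch

def residual_normal_kl(mu_p, log_sig_p, d_mu, d_log_sig):
    """KL( q || p ) where q's parameters are residuals on top of the prior's."""
    # Closed-form KL between two diagonal Gaussians; with the residual
    # parameterization it depends only on the deltas and the prior scale.
    var_ratio = torch.exp(2.0 * d_log_sig)             # sigma_q^2 / sigma_p^2
    mean_term = (d_mu / torch.exp(log_sig_p)) ** 2      # (mu_q - mu_p)^2 / sigma_p^2
    kl = 0.5 * (var_ratio + mean_term - 1.0) - d_log_sig
    return kl.sum(dim=tuple(range(1, kl.dim())))         # sum over latent dims

# Example with hypothetical latent shapes for one hierarchy level.
mu_p, log_sig_p = torch.zeros(8, 20, 4, 4), torch.zeros(8, 20, 4, 4)
d_mu, d_log_sig = 0.1 * torch.randn(8, 20, 4, 4), 0.05 * torch.randn(8, 20, 4, 4)
print(residual_normal_kl(mu_p, log_sig_p, d_mu, d_log_sig).shape)  # torch.Size([8])
```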

2. Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence

Shakir Mohamed, Marie-Therese Png, William Isaac

This paper explores the important role of critical science, and in particular of post-colonial and decolonial theories, in understanding and shaping the ongoing advances in artificial intelligence. Artificial Intelligence (AI) is viewed as amongst the technological advances that will reshape modern societies and their relations. Whilst the design and deployment of systems that continually adapt holds the promise of far-reaching positive change, they simultaneously pose significant risks, especially to already vulnerable peoples. Values and power are central to this discussion. Decolonial theories use historical hindsight to explain patterns of power that shape our intellectual, political, economic, and social world. By embedding a decolonial critical approach within its technical practice, AI communities can develop foresight and tactics that can better align research and technology development with established ethical principles, centring vulnerable peoples who continue to bear the brunt of negative impacts of innovation and scientific progress. We highlight problematic applications that are instances of coloniality, and using a decolonial lens, submit three tactics that can form a decolonial field of artificial intelligence: creating a critical technical practice of AI, seeking reverse tutelage and reverse pedagogies, and the renewal of affective and political communities. The years ahead will usher in a wave of new scientific breakthroughs and technologies driven by AI research, making it incumbent upon AI communities to strengthen the social contract through ethical foresight and the multiplicity of intellectual perspectives available to us; ultimately supporting future technologies that enable greater well-being, with the goal of beneficence and justice for all.

3. Pitfalls to Avoid when Interpreting Machine Learning Models

Christoph Molnar, Gunnar König, Julia Herbinger, Timo Freiesleben, Susanne Dandl, Christian A. Scholbeck, Giuseppe Casalicchio, Moritz Grosse-Wentrup, Bernd Bischl

Modern requirements for machine learning (ML) models include both high predictive performance and model interpretability. A growing number of techniques provide model interpretations, but they can lead to wrong conclusions if applied incorrectly. We illustrate pitfalls of ML model interpretation such as bad model generalization, dependent features, feature interactions, or unjustified causal interpretations. Our paper addresses ML practitioners by raising awareness of pitfalls and pointing out solutions for correct model interpretation, as well as ML researchers by discussing open issues for further research.
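
The "dependent features" pitfall in particular is easy to demonstrate: when two features are strongly correlated, permutation-based importance both splits credit between them and evaluates the model on unrealistic, off-manifold points. A hedged scikit-learn sketch with synthetic data (not taken from the paper):

```python
# Illustrative sketch (synthetic data, not from the paper): permutation
# importance can split credit between two highly correlated features, and
# permuting one of them creates unrealistic data points, so per-feature
# scores should be read with the dependence structure in mind.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)      # x2 is almost a copy of x1
x3 = rng.normal(size=n)                   # independent, irrelevant feature
y = 2.0 * x1 + rng.normal(scale=0.1, size=n)

X = np.column_stack([x1, x2, x3])
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, score in zip(["x1", "x2 (corr. with x1)", "x3"], result.importances_mean):
    print(f"{name:>20}: {score:.3f}")
# Typically x1 and x2 share importance even though only x1 generates y.
```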

4. Self-Supervised Policy Adaptation during Deployment

Nicklas Hansen, Yu Sun, Pieter Abbeel, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang

In most real-world scenarios, a policy trained by reinforcement learning in one environment needs to be deployed in another, potentially quite different environment. However, generalization across different environments is known to be hard. A natural solution would be to keep training after deployment in the new environment, but this cannot be done if the new environment offers no reward signal. Our work explores the use of self-supervision to allow the policy to continue training after deployment without using any rewards. While previous methods explicitly anticipate changes in the new environment, we assume no prior knowledge of those changes yet still obtain significant improvements. Empirical evaluations are performed on diverse environments from the DeepMind Control Suite and ViZDoom. Our method improves generalization in 25 out of 30 environments across various tasks, and outperforms domain randomization on a majority of environments.
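
The core mechanic, updating only the policy's shared encoder from a self-supervised loss at deployment time while the control head stays frozen, can be sketched roughly as below. The rotation-prediction auxiliary task and all module names are placeholders chosen for illustration, not necessarily the authors' exact setup.

```python
# Hedged sketch of deployment-time self-supervised adaptation: the encoder
# (and auxiliary head) are updated from a self-supervised loss, here rotation
# prediction as one possible task, while the policy head stays frozen because
# no reward signal exists in the new environment. Assumes square observations.
import torch
import torch.nn.functional as F

def rotate_batch(obs):
    """Rotate each image by a random multiple of 90 degrees; return the rotations as labels."""
    k = torch.randint(0, 4, (obs.shape[0],))
    rotated = torch.stack([torch.rot90(o, int(r), dims=(-2, -1)) for o, r in zip(obs, k)])
    return rotated, k

def adapt_step(encoder, ssl_head, policy_head, obs, optimizer):
    """One gradient step on the self-supervised loss, then act with the frozen policy head."""
    rotated, labels = rotate_batch(obs)
    logits = ssl_head(encoder(rotated))           # predict which rotation was applied
    loss = F.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()                                # gradients reach encoder and ssl_head only
    optimizer.step()
    with torch.no_grad():
        action = policy_head(encoder(obs))         # policy head parameters are never updated
    return action, loss.item()

# The optimizer would cover only the adapted modules, e.g.:
# optimizer = torch.optim.Adam(
#     list(encoder.parameters()) + list(ssl_head.parameters()), lr=1e-4)
```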

5. The Scattering Compositional Learner: Discovering Objects, Attributes, Relationships in Analogical Reasoning

Yuhuai Wu, Honghua Dong, Roger Grosse, Jimmy Ba

In this work, we focus on an analogical reasoning task that contains rich compositional structures, Raven’s Progressive Matrices (RPM). To discover compositional structures of the data, we propose the Scattering Compositional Learner (SCL), an architecture that composes neural networks in a sequence. Our SCL achieves state-of-the-art performance on two RPM datasets, with a 48.7% relative improvement on Balanced-RAVEN and 26.4% on PGM over the previous state-of-the-art. We additionally show that our model discovers compositional representations of objects’ attributes (e.g., shape, color, size) and their relationships (e.g., progression, union). We also find that the compositional representation makes the SCL significantly more robust to test-time domain shifts and greatly improves zero-shot generalization to previously unseen analogies.
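
The phrase "composes neural networks in a sequence" can be read as: small shared sub-networks for objects, attributes, and relationships are applied one after another with heavy parameter sharing. The sketch below captures only that flavor; dimensions, pooling, and module names are hypothetical and are not the authors' architecture.

```python
# Schematic, hedged sketch of composing shared sub-networks in a sequence:
# a per-panel object network, a per-attribute network, and a relation network
# are chained, with the same weights reused across panels and attribute slots.
import torch
import torch.nn as nn

class ComposedReasoner(nn.Module):
    def __init__(self, obj_dim=64, attr_dim=32, n_attrs=8):
        super().__init__()
        self.object_net = nn.Sequential(nn.Linear(obj_dim, attr_dim * n_attrs), nn.ReLU())
        self.attribute_net = nn.Sequential(nn.Linear(attr_dim, attr_dim), nn.ReLU())
        self.relation_net = nn.Sequential(nn.Linear(attr_dim, attr_dim), nn.ReLU(),
                                          nn.Linear(attr_dim, 1))
        self.n_attrs, self.attr_dim = n_attrs, attr_dim

    def forward(self, panels):                        # panels: (batch, n_panels, obj_dim)
        b, p, _ = panels.shape
        attrs = self.object_net(panels).view(b, p, self.n_attrs, self.attr_dim)
        attrs = self.attribute_net(attrs)              # same small net for every attribute slot
        pooled = attrs.sum(dim=1)                      # aggregate panels per attribute
        scores = self.relation_net(pooled).squeeze(-1) # one relation score per attribute
        return scores.sum(dim=-1)                      # overall compatibility score

x = torch.randn(4, 9, 64)                              # e.g., 8 context panels + 1 candidate
print(ComposedReasoner()(x).shape)                      # torch.Size([4])
```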

6. Language Modeling with Reduced Densities

Tai-Danae Bradley, Yiannis Vlassopoulos

We present a framework for modeling words, phrases, and longer expressions in a natural language using reduced density operators. We show these operators capture something of the meaning of these expressions and, under the Loewner order on positive semidefinite operators, preserve both a simple form of entailment and the relevant statistics therein. Pulling back the curtain, the assignment is shown to be a functor between categories enriched over probabilities.
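
The abstract leans on two standard constructions: reduced density operators obtained by a partial trace, and the Loewner order on positive semidefinite operators. As a quick reminder (generic definitions only; the paper's particular assignment of an operator to each word or phrase is not reproduced here), with {|j⟩} an orthonormal basis of the traced-out factor:

```latex
% Generic definitions (standard linear algebra, not the paper's specific
% word-to-operator construction): partial trace and the Loewner order.
\[
  \rho_A \;=\; \operatorname{Tr}_B\!\left(\lvert\psi\rangle\langle\psi\rvert\right)
  \;=\; \sum_j \left(\mathbf{1}_A \otimes \langle j\rvert\right)
        \lvert\psi\rangle\langle\psi\rvert
        \left(\mathbf{1}_A \otimes \lvert j\rangle\right),
  \qquad \lvert\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B,
\]
\[
  A \preceq B
  \quad\Longleftrightarrow\quad
  B - A \ \text{is positive semidefinite}.
\]
```

In the paper's reading, the operators assigned to expressions are compared under this order, which is what "preserve a simple form of entailment" refers to.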

7. A Free Viewpoint Portrait Generator with Dynamic Styling

Anpei Chen, Ruiyang Liu, Ling Xie, Jingyi Yu

  • retweets: 14, favorites: 70 (07/10/2020 07:46:48)
  • cs.CV | cs.GR

Generating portrait images from a single latent space faces the problem of entangled attributes, making it difficult to explicitly adjust the generation with respect to specific attributes, e.g., contour and viewpoint control or dynamic styling. We therefore propose to decompose the generation space into two subspaces: a geometric space and a texture space. We first encode portrait scans with a semantic occupancy field (SOF), which represents semantically embedded geometric structure and outputs free-viewpoint semantic segmentation maps. We then design a semantic instance-wise (SIW) StyleGAN to regionally style the segmentation map. We capture 664 3D portrait scans for SOF training and use real photographs (FFHQ and CelebA-HQ) for SIW StyleGAN training. Extensive experiments show that our representation enables appearance-consistent control over shape, pose, and regional styles, achieves state-of-the-art results, and generalizes well to various application scenarios.
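
The regional styling step can be pictured as style vectors gated by the semantic segmentation map: each semantic class carries its own style, which modulates features only inside its region. The sketch below is an illustrative reading with made-up shapes and names, not the paper's SIW StyleGAN implementation.

```python
# Hedged sketch (not the authors' code) of region-wise styling: a per-class
# style vector modulates features only where the segmentation map selects
# that class, so each region of the portrait can be restyled independently.
import torch
import torch.nn as nn

class RegionStyleModulation(nn.Module):
    def __init__(self, n_classes=19, style_dim=512, channels=256):
        super().__init__()
        # one affine "style -> (scale, shift)" mapping, shared across classes
        self.to_scale = nn.Linear(style_dim, channels)
        self.to_shift = nn.Linear(style_dim, channels)

    def forward(self, feat, seg, styles):
        # feat:   (B, C, H, W) feature map
        # seg:    (B, K, H, W) one-hot / soft semantic segmentation
        # styles: (B, K, style_dim), one style vector per semantic class
        scale = self.to_scale(styles)                          # (B, K, C)
        shift = self.to_shift(styles)                          # (B, K, C)
        # mix per-class scale/shift into per-pixel maps weighted by the masks
        scale_map = torch.einsum("bkhw,bkc->bchw", seg, scale)
        shift_map = torch.einsum("bkhw,bkc->bchw", seg, shift)
        return feat * (1.0 + scale_map) + shift_map

mod = RegionStyleModulation()
feat = torch.randn(2, 256, 64, 64)
seg = torch.softmax(torch.randn(2, 19, 64, 64), dim=1)         # soft masks summing to 1
styles = torch.randn(2, 19, 512)
print(mod(feat, seg, styles).shape)                             # torch.Size([2, 256, 64, 64])
```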