All Articles

Hot Papers 2021-01-11

1. A Tale of Fairness Revisited: Beyond Adversarial Learning for Deep Neural Network Fairness

Becky Mashaido, Winston Moh Tangongho

  • retweets: 1155, favorites: 15 (01/12/2021 13:59:29)
  • links: abs | pdf
  • cs.LG | cs.AI

Motivated by the need for fair algorithmic decision making in the age of automation and artificially intelligent technology, this technical report provides theoretical insight into adversarial training for fairness in deep learning. We build upon previous work in adversarial fairness, show the persistent tradeoff between fair predictions and model performance, and explore further mechanisms that help offset this tradeoff.
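
One common instantiation of adversarial fairness training (a minimal, hypothetical sketch below; the report itself is a theoretical treatment and may differ) pairs a task predictor with an adversary that tries to recover the protected attribute from the shared representation, using gradient reversal so the encoder learns to hide it:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

encoder = nn.Sequential(nn.Linear(10, 32), nn.ReLU())
task_head = nn.Linear(32, 1)   # main task prediction
adv_head = nn.Linear(32, 1)    # tries to recover the protected attribute

opt = torch.optim.Adam(list(encoder.parameters()) + list(task_head.parameters())
                       + list(adv_head.parameters()), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

x = torch.randn(64, 10)                     # toy features
y = torch.randint(0, 2, (64, 1)).float()    # task labels
a = torch.randint(0, 2, (64, 1)).float()    # protected attribute

h = encoder(x)
task_loss = bce(task_head(h), y)
# Reversed gradients push the encoder to hide the protected attribute,
# which is exactly where the fairness/accuracy tradeoff shows up.
adv_loss = bce(adv_head(GradReverse.apply(h, 1.0)), a)
opt.zero_grad()
(task_loss + adv_loss).backward()
opt.step()
```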

2. SE(3)-Equivariant Graph Neural Networks for Data-Efficient and Accurate Interatomic Potentials

Simon Batzner, Tess E. Smidt, Lixin Sun, Jonathan P. Mailoa, Mordechai Kornbluth, Nicola Molinari, Boris Kozinsky

This work presents Neural Equivariant Interatomic Potentials (NequIP), an SE(3)-equivariant neural network approach for learning interatomic potentials from ab-initio calculations for molecular dynamics simulations. While most contemporary symmetry-aware models use invariant convolutions and act only on scalars, NequIP employs SE(3)-equivariant convolutions for interactions of geometric tensors, resulting in a more information-rich and faithful representation of atomic environments. The method achieves state-of-the-art accuracy on a challenging set of diverse molecules and materials while exhibiting remarkable data efficiency. NequIP outperforms existing models with up to three orders of magnitude less training data, challenging the widely held belief that deep neural networks require massive training sets. The high data efficiency of the method allows for the construction of accurate potentials using a high-order quantum chemical level of theory as reference and enables high-fidelity molecular dynamics simulations over long time scales.
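
The central ingredient is equivariance: rotating the input atomic positions must rotate vector-valued features accordingly, whereas invariant models produce only scalars that ignore orientation. A toy numpy check of this property (a hand-rolled distance-weighted vector feature, not the NequIP architecture):

```python
import numpy as np

def vector_feature(pos):
    """Per-atom sum of distance-weighted relative position vectors (equivariant)."""
    diff = pos[None, :, :] - pos[:, None, :]       # r_ij = x_j - x_i
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)                 # exclude i == j
    radial = np.exp(-dist)                         # toy radial filter, rotation-invariant
    return (radial[..., None] * diff).sum(axis=1)  # shape (N, 3)

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 3))                      # five toy "atoms"

Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))       # random orthogonal matrix
Q *= np.sign(np.linalg.det(Q))                     # force a proper rotation (det = +1)

# Equivariance: rotating the inputs rotates the outputs, f(x R^T) = f(x) R^T.
print(np.allclose(vector_feature(pos @ Q.T), vector_feature(pos) @ Q.T))  # True
```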

3. VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency

Ruohan Gao, Kristen Grauman

We introduce a new approach for audio-visual speech separation. Given a video, the goal is to extract the speech associated with a face in spite of simultaneous background sounds and/or other human speakers. Whereas existing methods focus on learning the alignment between the speaker’s lip movements and the sounds they generate, we propose to leverage the speaker’s face appearance as an additional prior to isolate the corresponding vocal qualities they are likely to produce. Our approach jointly learns audio-visual speech separation and cross-modal speaker embeddings from unlabeled video. It yields state-of-the-art results on five benchmark datasets for audio-visual speech separation and enhancement, and generalizes well to challenging real-world videos of diverse scenarios. Our video results and code: http://vision.cs.utexas.edu/projects/VisualVoice/.
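
One plausible way to encourage cross-modal consistency (an illustrative sketch with placeholder embeddings, not necessarily the paper's exact loss) is a triplet objective that pulls a face embedding toward the voice embedding of the matching speaker and away from a different speaker's:

```python
import torch
import torch.nn.functional as F

def cross_modal_triplet(face, voice_pos, voice_neg, margin=0.5):
    """Pull matching face/voice embeddings together, push mismatched ones apart."""
    d_pos = 1 - F.cosine_similarity(face, voice_pos)   # distance to matching voice
    d_neg = 1 - F.cosine_similarity(face, voice_neg)   # distance to another speaker
    return F.relu(d_pos - d_neg + margin).mean()

face = F.normalize(torch.randn(8, 128), dim=1)                     # toy face embeddings
voice_same = F.normalize(face + 0.1 * torch.randn(8, 128), dim=1)  # matching voices
voice_other = F.normalize(torch.randn(8, 128), dim=1)              # other speakers
print(cross_modal_triplet(face, voice_same, voice_other))
```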

4. The Distracting Control Suite — A Challenging Benchmark for Reinforcement Learning from Pixels

Austin Stone, Oscar Ramirez, Kurt Konolige, Rico Jonschkowski

Robots have to face challenging perceptual settings, including changes in viewpoint, lighting, and background. Current simulated reinforcement learning (RL) benchmarks such as DM Control provide visual input without such complexity, which limits the transfer of well-performing methods to the real world. In this paper, we extend DM Control with three kinds of visual distractions (variations in background, color, and camera pose) to produce a new, challenging benchmark for vision-based control, and we analyze state-of-the-art RL algorithms in these settings. Our experiments show that current RL methods for vision-based control perform poorly under distractions and that their performance decreases with increasing distraction complexity, showing that new methods are needed to cope with the visual complexities of the real world. We also find that combinations of multiple distraction types are more difficult than a mere combination of their individual effects.
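
Purely to illustrate the distraction types (the released benchmark ships its own DM Control wrappers with a different API), a toy function can swap the background for noise and jitter colors on a pixel observation; camera-pose variation requires simulator access and cannot be mimicked in pixel space:

```python
import numpy as np

def distract(frame, fg_mask, rng):
    """Replace the background with noise and apply a random color tint."""
    noise_bg = rng.integers(0, 256, size=frame.shape, dtype=np.uint8)
    out = np.where(fg_mask[..., None], frame, noise_bg)  # background distraction
    tint = rng.uniform(0.7, 1.3, size=3)                 # color distraction
    return np.clip(out * tint, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(84, 84, 3), dtype=np.uint8)  # fake observation
fg_mask = np.zeros((84, 84), dtype=bool)
fg_mask[20:60, 20:60] = True                                    # toy "robot" region
print(distract(frame, fg_mask, rng).shape)                      # (84, 84, 3)
```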

5. InMoDeGAN: Interpretable Motion Decomposition Generative Adversarial Network for Video Generation

Yaohui Wang, Francois Bremond, Antitza Dantcheva

  • retweets: 74, favorites: 48 (01/12/2021 13:59:30)
  • links: abs | pdf
  • cs.CV

In this work, we introduce an unconditional video generative model, InMoDeGAN, targeted to (a) generate high-quality videos, as well as to (b) allow for interpretation of the latent space. For the latter, we place emphasis on interpreting and manipulating motion. Towards this, we decompose motion into semantic sub-spaces, which allow for control of generated samples. We design the architecture of the InMoDeGAN generator in accordance with the proposed Linear Motion Decomposition, which carries the assumption that motion can be represented by a dictionary whose vectors form an orthogonal basis in the latent space. Each vector in the basis represents a semantic sub-space. In addition, a Temporal Pyramid Discriminator analyzes videos at different temporal resolutions. Extensive quantitative and qualitative analysis shows that our model systematically and significantly outperforms state-of-the-art methods on the VoxCeleb2-mini and BAIR-robot datasets with respect to video quality, addressing (a). Towards (b), we present experimental results confirming that the decomposed sub-spaces are interpretable and, moreover, that generated motion is controllable.
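
The Linear Motion Decomposition assumption is easy to state concretely: with an orthonormal dictionary, editing one coefficient changes the motion code only along that basis direction. A small numpy sketch with a random (not learned) dictionary:

```python
import numpy as np

rng = np.random.default_rng(0)
D, _ = np.linalg.qr(rng.normal(size=(64, 64)))  # orthonormal dictionary, columns d_i
T = 16                                          # number of frames
A = rng.normal(size=(T, 64))                    # per-frame coefficients a_{t,i}

motion = A @ D.T                                # motion code m_t = sum_i a_{t,i} d_i
A_edit = A.copy()
A_edit[:, 0] *= 3.0                             # amplify a single sub-space

# Orthogonality confines the edit to the d_0 direction alone.
print(np.allclose(A_edit @ D.T - motion, 2.0 * A[:, :1] * D[:, 0]))  # True
```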

6. One-Class Classification: A Survey

Pramuditha Perera, Poojan Oza, Vishal M. Patel

  • retweets: 25, favorites: 37 (01/12/2021 13:59:30)
  • links: abs | pdf
  • cs.CV | cs.LG

One-Class Classification (OCC) is a special case of multi-class classification, where the data observed during training come from a single positive class. The goal of OCC is to learn a representation and/or a classifier that enables recognition of positively labeled queries during inference. This topic has received a considerable amount of interest in the computer vision, machine learning, and biometrics communities in recent years. In this article, we provide a survey of classical statistical and recent deep learning-based OCC methods for visual recognition. We discuss the merits and drawbacks of existing OCC approaches and identify promising avenues for research in this field. In addition, we present a discussion of commonly used datasets and evaluation metrics for OCC.
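
As a concrete example of the classical statistical methods such a survey covers, a One-Class SVM can be fit on positive-class data alone and then used to flag queries as inliers or outliers (scikit-learn shown; hyperparameters are illustrative):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # single positive class

clf = OneClassSVM(kernel="rbf", gamma=0.1, nu=0.05).fit(train)

queries = np.array([[0.0, 0.0],    # near the training distribution
                    [6.0, 6.0]])   # far from it
print(clf.predict(queries))        # +1 = inlier, -1 = outlier
```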

7. A Novel Regression Loss for Non-Parametric Uncertainty Optimization

Joachim Sicking, Maram Akila, Maximilian Pintz, Tim Wirtz, Asja Fischer, Stefan Wrobel

Quantification of uncertainty is one of the most promising approaches to establish safe machine learning. Despite its importance, it is far from being generally solved, especially for neural networks. One of the most commonly used approaches so far is Monte Carlo dropout, which is computationally cheap and easy to apply in practice. However, it can underestimate the uncertainty. We propose a new objective, referred to as second-moment loss (SML), to address this issue. While the full network is encouraged to model the mean, the dropout networks are explicitly used to optimize the model variance. We intensively study the performance of the new objective on various UCI regression datasets. Compared to the state-of-the-art deep ensembles, SML leads to comparable prediction accuracies and uncertainty estimates while requiring only a single model. Under distribution shift, we observe moderate improvements. As a side result, we introduce an intuitive Wasserstein distance-based uncertainty measure that is non-saturating and thus makes it possible to resolve quality differences between any two uncertainty estimates.
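
As a rough, hypothetical sketch of the stated idea (the full network models the mean while dropout passes are fit to the spread; this is not necessarily the paper's exact formulation), on a toy 1-D regression task:

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 1))

x = torch.linspace(-2, 2, 128).unsqueeze(1)
y = x.pow(3) + 0.3 * torch.randn_like(x)       # toy regression data

net.eval()                                     # dropout off: the "full network" mean
mean = net(x)
net.train()                                    # dropout on: sampled predictions
samples = torch.stack([net(x) for _ in range(8)])

mse = (mean - y).pow(2).mean()                 # fit the mean to the targets
# Fit the spread of the dropout passes to the residual magnitude (detached
# mean, so this term shapes the variance rather than shifting the mean).
spread = (samples - mean.detach()).pow(2).mean(0).sqrt()
sml = mse + (spread - (y - mean.detach()).abs()).pow(2).mean()
sml.backward()
```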