1. A Convenient Generalization of Schlick’s Bias and Gain Functions
Jonathan T. Barron
We present a generalization of Schlick’s bias and gain functions — simple parametric curve-shaped functions for inputs in [0, 1]. Our single function includes both bias and gain as special cases, and is able to describe other smooth and monotonic curves with variable degrees of asymmetry.
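The abstract names the function family but not the formula, so here is a hedged sketch: Schlick's original bias function (standard in graphics), together with a two-branch rational curve of the kind the micropaper describes. The exact parameterization below (`s` for the slope parameter, `t` for the asymmetry pivot, and the `eps` stabilizer) is a reconstruction for illustration, not a quote from the paper.

```python
# Hedged sketch: Schlick's bias, plus a two-branch generalized curve.
# The generalized form is a reconstruction (parameter names s, t are
# assumptions); consult the micropaper for the authoritative version.

def schlick_bias(x, a):
    """Schlick's bias on [0,1]; a = 0.5 gives the identity."""
    return x / ((1.0 / a - 2.0) * (1.0 - x) + 1.0)

def generalized_curve(x, s, t, eps=1e-7):
    """Monotonic [0,1] -> [0,1] curve with pivot t and slope control s.

    s = 1 gives the identity; moving t away from 0.5 yields the
    asymmetric easing/annealing curves the abstract mentions.
    """
    if x < t:
        return t * x / (x + s * (t - x) + eps)
    return (1.0 - t) * (x - 1.0) / (1.0 - x - s * (t - x) + eps) + 1.0
```

With `s = 1` both branches reduce to `f(x) = x`, and the endpoints `f(0) = 0`, `f(1) = 1` hold for any valid `s`, `t`, which is what makes the family convenient for easing.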
My new favorite pet function: a tiny generalization of Schlick's "bias" and "gain" functions that gives you a handy way to craft nice curves for easing or annealing on [0,1]->[0,1]. I wrote up a micropaper here: https://t.co/g2LlQpiFEK pic.twitter.com/ZpIb5E1bx5
— Jon Barron (@jon_barron) October 21, 2020
2. SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving
Ming Zhou, Jun Luo, Julian Villela, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, Mohsen Rohani, Nicolas Perez Nieves, Yihan Ni, Seyedershad Banijamali, Alexander Cowen Rivers, Zheng Tian, Daniel Palenicek, Haitham bou Ammar, Hongbo Zhang, Wulong Liu, Jianye Hao, Jun Wang
- retweets: 482, favorites: 108 (10/22/2020 10:00:17)
- links: abs | pdf
- cs.MA | cs.AI | cs.GT | cs.LG | eess.SY
Multi-agent interaction is a fundamental aspect of autonomous driving in the real world. Despite more than a decade of research and development, the problem of how to competently interact with diverse road users in diverse scenarios remains largely unsolved. Learning methods have much to offer towards solving this problem. But they require a realistic multi-agent simulator that generates diverse and competent driving interactions. To meet this need, we develop a dedicated simulation platform called SMARTS (Scalable Multi-Agent RL Training School). SMARTS supports the training, accumulation, and use of diverse behavior models of road users. These are in turn used to create increasingly realistic and diverse interactions that enable deeper and broader research on multi-agent interaction. In this paper, we describe the design goals of SMARTS, explain its basic architecture and its key features, and illustrate its use through concrete multi-agent experiments on interactive scenarios. We open-source the SMARTS platform and the associated benchmark tasks and evaluation metrics to encourage and empower research on multi-agent learning for autonomous driving.
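To make the "multi-agent simulator" idea concrete, here is a hypothetical gym-style interaction loop with a toy stand-in environment. `MockTrafficEnv`, its agent ids, and the dict-of-agents convention are illustrative assumptions, not the actual SMARTS API.

```python
# Hypothetical sketch of the kind of multi-agent loop a platform like
# SMARTS enables; MockTrafficEnv is a stand-in, not the SMARTS API.
import random

class MockTrafficEnv:
    """Toy stand-in for a multi-agent driving environment."""
    def __init__(self, agent_ids):
        self.agent_ids = list(agent_ids)
        self.t = 0

    def reset(self):
        self.t = 0
        return {a: [0.0, 0.0] for a in self.agent_ids}  # per-agent observations

    def step(self, actions):
        self.t += 1
        obs = {a: [self.t, actions[a]] for a in self.agent_ids}
        rewards = {a: 1.0 for a in self.agent_ids}      # dummy reward signal
        done = self.t >= 5                              # fixed-length episode
        return obs, rewards, done

env = MockTrafficEnv(["car_0", "car_1"])
obs = env.reset()
total = {a: 0.0 for a in env.agent_ids}
done = False
while not done:
    # In a real setup, per-agent policies (the "behavior models" above)
    # would map obs[a] to an action; here we act randomly.
    actions = {a: random.choice([0, 1]) for a in env.agent_ids}
    obs, rewards, done = env.step(actions)
    for a, r in rewards.items():
        total[a] += r
```

The key structural point is that observations, actions, and rewards are all keyed per agent, so heterogeneous road-user policies can be trained and swapped independently.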
SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving
— AK (@ak92501) October 21, 2020
pdf: https://t.co/9ktURWb0Ib
abs: https://t.co/36tLfS3geA
github: https://t.co/S9deDOAba5 pic.twitter.com/gxStWtpc4M
Happy to share our latest work on a new realistic autonomous driving simulator "SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving". @hbouammar https://t.co/gbit96UuLa pic.twitter.com/W8nlKBOwDq
— Alexander (Ali) Imani Cowen-Rivers. (@ImanisMind) October 21, 2020
3. Complete Multilingual Neural Machine Translation
Markus Freitag, Orhan Firat
Multilingual Neural Machine Translation (MNMT) models are commonly trained on a joint set of bilingual corpora which is acutely English-centric (i.e., English is either the source or the target language). While direct data between two non-English languages is explicitly available at times, its use is not common. In this paper, we first take a step back and look at the commonly used bilingual corpora (WMT), and surface the existence and importance of an implicit structure within them: multi-way alignment across examples (the same sentence in more than two languages). We set out to study the use of multi-way aligned examples to enrich the original English-centric parallel corpora. We reintroduce this direct parallel data from multi-way aligned corpora between all source and target languages. By doing so, the English-centric graph expands into a complete graph, with every language pair connected. We call MNMT with such a connectivity pattern complete Multilingual Neural Machine Translation (cMNMT) and demonstrate its utility and efficacy with a series of experiments and analyses. In combination with a novel training data sampling strategy that is conditioned on the target language only, cMNMT yields competitive translation quality for all language pairs. We further study the size effect of multi-way aligned data, its transfer learning capabilities, and how it eases adding a new language in MNMT. Finally, we stress test cMNMT at scale and demonstrate that we can train a cMNMT model with up to 111*112=12,432 language pairs that provides competitive translation quality for all language pairs.
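The "complete graph" claim is simple counting over ordered language pairs; the sketch below makes the arithmetic explicit (the placeholder language names are illustrative, only the counts matter).

```python
# Counting translation directions in English-centric vs complete graphs.

def directed_pairs(langs):
    """All ordered (source, target) pairs in a complete translation graph."""
    return [(s, t) for s in langs for t in langs if s != t]

def english_centric_pairs(langs, pivot="en"):
    """Only the pairs where English is the source or the target."""
    return [(s, t) for (s, t) in directed_pairs(langs) if pivot in (s, t)]

langs = ["en"] + ["l%d" % i for i in range(1, 112)]  # 112 languages total
complete = directed_pairs(langs)        # 112 * 111 = 12,432 directions
centric = english_centric_pairs(langs)  # 2 * 111   = 222 directions
```

This is why completing the graph is such a large jump: the number of directions the model must serve grows from linear (222) to quadratic (12,432) in the number of languages.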
Happy to announce our WMT20 paper “Complete Multilingual Machine Translation” (cMNMT). We show how to train a single NMT model that handles 12,432 language pairs without using English as an intermediary, aka direct translation. With @orf_bnw @GoogleAI https://t.co/GNa0EH2gFN 1/n pic.twitter.com/evGSYJL5rV
— Markus Freitag (@markuseful) October 21, 2020
4. Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu
We employ a combination of recent developments in semi-supervised learning for automatic speech recognition to obtain state-of-the-art results on LibriSpeech utilizing the unlabeled audio of the Libri-Light dataset. More precisely, we carry out noisy student training with SpecAugment using giant Conformer models pre-trained with wav2vec 2.0. By doing so, we achieve word-error-rates (WERs) of 1.4%/2.6% on the LibriSpeech test/test-other sets, against the current state-of-the-art WERs of 1.7%/3.3%.
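The recipe named in the abstract (pseudo-label with a teacher, augment, train a student) can be shown schematically. Everything below is a toy stand-in: `teacher` and `train_student` are placeholder callables, and `spec_augment` just zeroes random entries as a stand-in for real time/frequency masking; none of this is the paper's code.

```python
# Schematic sketch of the noisy-student recipe, with toy stand-ins.
import random

def spec_augment(features, drop_prob=0.3, seed=0):
    """Toy masking: zero random entries (stand-in for time/freq masks)."""
    rng = random.Random(seed)
    return [0.0 if rng.random() < drop_prob else f for f in features]

def pseudo_label(teacher, unlabeled):
    """Step 1: the teacher transcribes the unlabeled audio."""
    return [(x, teacher(x)) for x in unlabeled]

def noisy_student_round(teacher, unlabeled, train_student):
    """One round: pseudo-label, add input noise, train a new student."""
    labeled = pseudo_label(teacher, unlabeled)
    noisy = [(spec_augment(x), y) for x, y in labeled]
    return train_student(noisy)
```

In the paper's pipeline the teacher itself comes from a wav2vec 2.0 pre-trained Conformer fine-tuned on LibriSpeech, and the round can be iterated with the new student as the next teacher.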
Pretty amazing progress on speech recognition thanks to pre-training and self-training with unlabeled data.
— Quoc Le (@quocleix) October 21, 2020
Key ingredients: Large Conformer architecture + wav2vec 2.0 pretraining + Noisy Student Training
Link: https://t.co/X9Fj9Ltv1V pic.twitter.com/vBIlQWx2Us
5. SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images
Chen-Hsuan Lin, Chaoyang Wang, Simon Lucey
Dense 3D object reconstruction from a single image has recently witnessed remarkable advances, but supervising neural networks with ground-truth 3D shapes is impractical due to the laborious process of creating paired image-shape datasets. Recent efforts have turned to learning 3D reconstruction without 3D supervision from RGB images with annotated 2D silhouettes, dramatically reducing the cost and effort of annotation. These techniques, however, remain impractical as they still require multi-view annotations of the same object instance during training. As a result, most experimental efforts to date have been limited to synthetic datasets. In this paper, we address this issue and propose SDF-SRN, an approach that requires only a single view of objects at training time, offering greater utility for real-world scenarios. SDF-SRN learns implicit 3D shape representations to handle arbitrary shape topologies that may exist in the datasets. To this end, we derive a novel differentiable rendering formulation for learning signed distance functions (SDF) from 2D silhouettes. Our method outperforms the state of the art under challenging single-view supervision settings on both synthetic and real-world datasets.
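For readers unfamiliar with the core object in the abstract, here is a minimal illustration of a signed distance function (SDF) and a differentiable soft occupancy that silhouette losses are commonly built on. The sphere SDF and the sigmoid relaxation are generic textbook choices, not the paper's specific differentiable rendering formulation.

```python
# A signed distance function and a differentiable inside/outside relaxation.
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    d = math.sqrt(sum((pi - ci) ** 2 for pi, ci in zip(p, center)))
    return d - radius

def soft_occupancy(sdf_value, tau=0.05):
    """Sigmoid relaxation of the hard inside/outside indicator.

    Differentiable in the SDF value, so a 2D silhouette loss can
    backpropagate into the implicit shape; tau controls the sharpness.
    """
    return 1.0 / (1.0 + math.exp(sdf_value / tau))
```

A rendered soft silhouette is then obtained by evaluating this occupancy along camera rays and comparing against the annotated 2D mask.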
SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images
— AK (@ak92501) October 21, 2020
pdf: https://t.co/NjQ7RURoVB
abs: https://t.co/TIjKBzMxjV
project page: https://t.co/oImIR4U4EN pic.twitter.com/VjfALvaNY1
6. Real-time Localized Photorealistic Video Style Transfer
Xide Xia, Tianfan Xue, Wei-sheng Lai, Zheng Sun, Abby Chang, Brian Kulis, Jiawen Chen
We present a novel algorithm for transferring artistic styles of semantically meaningful local regions of an image onto local regions of a target video while preserving its photorealism. Local regions may be selected either fully automatically from an image using video segmentation algorithms, or from casual user guidance such as scribbles. Our method, based on a deep neural network architecture inspired by recent work in photorealistic style transfer, is real-time and works on arbitrary inputs without runtime optimization once trained on a diverse dataset of artistic styles. By augmenting our video dataset with noisy semantic labels and jointly optimizing over style, content, mask, and temporal losses, our method can cope with a variety of imperfections in the input and produce temporally coherent videos without visual artifacts. We demonstrate our method on a variety of style images and target videos, including the ability to transfer different styles onto multiple objects simultaneously, and smoothly transition between styles in time.
Real-time Localized Photorealistic Video Style Transfer
— AK (@ak92501) October 21, 2020
pdf: https://t.co/JrZzOJy3gU
abs: https://t.co/6o3z0cHQZ1 pic.twitter.com/El4NTyYSJD
7. ABC-Di: Approximate Bayesian Computation for Discrete Data
Ilze Amanda Auzina, Jakub M. Tomczak
Many real-life problems are represented as a black-box, i.e., the internal workings are inaccessible or a closed-form mathematical expression of the likelihood function cannot be defined. For continuous random variables, likelihood-free inference problems can be solved by a group of methods known as Approximate Bayesian Computation (ABC). However, a similar approach for discrete random variables has yet to be formulated. Here, we aim to fill this research gap. We propose to use a population-based MCMC ABC framework. Further, we present a valid Markov kernel, and propose a new kernel inspired by Differential Evolution. We assess the proposed approach on a problem with a known likelihood function, namely, discovering the underlying diseases based on a QMR-DT Network, and on three likelihood-free inference problems: (i) the QMR-DT Network with an unknown likelihood function, (ii) learning a binary neural network, and (iii) Neural Architecture Search. The obtained results indicate the high potential of the proposed framework and the superiority of the new Markov kernel.
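As a generic illustration of likelihood-free inference on discrete data, here is a minimal ABC rejection sampler for a Bernoulli parameter, treating the simulator as a black box. This is deliberately the simplest ABC variant; the paper's method is a population-based MCMC with specialized Markov kernels, which this sketch does not reproduce.

```python
# Minimal ABC rejection sampling on discrete (count) data.
import random

def simulate(theta, n, rng):
    """Black-box simulator: n Bernoulli(theta) draws, summarized by a count."""
    return sum(1 for _ in range(n) if rng.random() < theta)

def abc_rejection(observed_count, n, n_samples, tol, seed=0):
    """Keep prior draws whose simulated summary lands within tol of the data.

    No likelihood is ever evaluated -- only forward simulations compared
    to the observed discrete summary statistic.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        theta = rng.random()  # Uniform(0, 1) prior
        if abs(simulate(theta, n, rng) - observed_count) <= tol:
            accepted.append(theta)
    return accepted

# Observing 80 successes out of 100 should concentrate samples near 0.8.
post = abc_rejection(observed_count=80, n=100, n_samples=2000, tol=2)
```

Rejection ABC becomes hopelessly inefficient as the tolerance shrinks or the data dimension grows, which is precisely the motivation for the MCMC-based kernels the paper develops.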
I am extremely proud to share a work on Approximate Bayesian Computation for discrete data (w/ Ilze A. Auzina). Applications: learning QMR-DT Network, (toyish) binary neural nets, and Neural Architecture Search.
— Jakub Tomczak (@jmtomczak) October 21, 2020
Paper: https://t.co/pER9hZuV4C
Code: https://t.co/26FHxTEig4 pic.twitter.com/s3t81lG8fX
8. FLAP — A Federated Learning Framework for Attribute-based Access Control Policies
Amani Abu Jabal, Elisa Bertino, Jorge Lobo, Dinesh Verma, Seraphin Calo, Alessandra Russo
Technology advances in areas such as sensors, IoT, and robotics enable new collaborative applications (e.g., autonomous devices). A primary requirement for such collaborations is a secure system that enables information sharing and information flow protection. A policy-based management system is a key mechanism for secure selective sharing of protected resources. However, policies in each party of such a collaborative environment cannot be static, as they have to adapt to different contexts and situations. One advantage of collaborative applications is that each party in the collaboration can take advantage of knowledge of the other parties to learn or enhance its own policies. We refer to this learning mechanism as policy transfer. The design of a policy transfer framework has challenges, including policy conflicts and privacy issues. Policy conflicts typically arise from differences in the obligations of the parties, whereas privacy issues result from data sharing constraints on sensitive data. Hence, the policy transfer framework should tackle such challenges by minimizing data sharing and supporting policy adaptation to resolve conflicts. In this paper, we propose a framework that aims to address such challenges. We introduce a formal definition of the policy transfer problem for attribute-based policies. We then introduce a transfer methodology that consists of three sequential steps. Finally, we report experimental results.
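To ground "attribute-based policies", here is a toy attribute-based access control (ABAC) check: a policy is a set of attribute constraints that a request must satisfy. The policy and attribute names are invented examples of the policy class the paper studies, not the FLAP framework itself.

```python
# Toy ABAC: a policy is a dict of required attribute values.

def satisfies(request_attrs, policy):
    """Grant access iff every attribute constraint in the policy is met."""
    return all(request_attrs.get(k) == v for k, v in policy.items())

# Hypothetical hospital-style policy and requests.
policy = {"role": "nurse", "department": "cardiology", "shift": "day"}
allowed = satisfies(
    {"role": "nurse", "department": "cardiology", "shift": "day"}, policy)
denied = satisfies(
    {"role": "nurse", "department": "oncology", "shift": "day"}, policy)
```

Transferring such policies between parties is non-trivial exactly because two organizations may attach conflicting constraints to the same attributes, and the attribute values themselves may be sensitive.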
FLAP -- A Federated Learning Framework for Attribute-based Access Control Policies. #DataScience #BigData #Analytics #RStats #Python #Java #JavaScript #ReactJS #Serverless #IoT #Linux #Coding #100DaysOfCode #Programming #AI #DeepLearning #MachineLearning https://t.co/TOnMg8M414 pic.twitter.com/RajMTk22zQ
— Marcus Borba (@marcusborba) October 22, 2020
9. Semi-parametric γ-ray modeling with Gaussian processes and variational inference
Siddharth Mishra-Sharma, Kyle Cranmer
- retweets: 156, favorites: 59 (10/22/2020 10:00:18)
- links: abs | pdf
- astro-ph.HE | astro-ph.CO | astro-ph.IM | hep-ph | stat.ML
Mismodeling the uncertain, diffuse emission of Galactic origin can seriously bias the characterization of astrophysical gamma-ray data, particularly in the region of the Inner Milky Way where such emission can make up over 80% of the photon counts observed at ~GeV energies. We introduce a novel class of methods that use Gaussian processes and variational inference to build flexible background and signal models for gamma-ray analyses with the goal of enabling a more robust interpretation of the make-up of the gamma-ray sky, particularly focusing on characterizing potential signals of dark matter in the Galactic Center with data from the Fermi telescope.
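The flexible background models in the abstract are built on Gaussian processes; as a self-contained illustration, here is GP regression in one dimension reduced to its simplest form (RBF kernel, posterior mean only), in pure Python. The variational inference machinery and the Poisson likelihood the paper actually uses are well beyond this sketch.

```python
# Minimal 1D Gaussian-process regression: RBF kernel, posterior mean.
import math

def rbf(x1, x2, length=1.0):
    """Squared-exponential (RBF) covariance between two inputs."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def gp_posterior_mean(xs, ys, x_star, noise=1e-6):
    """Posterior mean at x_star given noisy observations (xs, ys)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    alpha = solve(K, ys)                      # alpha = K^{-1} y
    return sum(rbf(x_star, xi) * ai for xi, ai in zip(xs, alpha))
```

The appeal for background modeling is that the GP interpolates the data smoothly without committing to a rigid parametric template, so template mismodeling is less likely to masquerade as a dark-matter signal.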
New work led by Siddharth Mishra-Sharma @kdqg1!
— Kyle Cranmer (@KyleCranmer) October 21, 2020
There is an excess of gamma-rays coming from the galactic center, which may be a signal for dark matter. We aim to improve the modeling using Gaussian Processes and variational inference. https://t.co/Lp1e37hm4v pic.twitter.com/6pB7rV1TAr
10. LT-GAN: Self-Supervised GAN with Latent Transformation Detection
Parth Patel, Nupur Kumari, Mayank Singh, Balaji Krishnamurthy
Generative Adversarial Networks (GANs) coupled with self-supervised tasks have shown promising results in unconditional and semi-supervised image generation. We propose a self-supervised approach (LT-GAN) to improve the generation quality and diversity of images by estimating the GAN-induced transformation (i.e., the transformation induced in the generated images by perturbing the latent space of the generator). Specifically, given two pairs of images, where each pair comprises a generated image and its transformed version, the self-supervision task aims to identify whether the latent transformation applied to one pair is the same as that applied to the other. Hence, this auxiliary loss encourages the generator to produce images that are distinguishable by the auxiliary network, which in turn promotes the synthesis of semantically consistent images with respect to latent transformations. We show the efficacy of this pretext task by improving the image generation quality in terms of FID on state-of-the-art models for both conditional and unconditional settings on the CIFAR-10, CelebA-HQ, and ImageNet datasets. Moreover, we empirically show that LT-GAN helps improve controlled image editing for CelebA-HQ and ImageNet over baseline models. We experimentally demonstrate that our proposed LT self-supervision task can be effectively combined with other state-of-the-art training techniques for added benefits. Consequently, we show that our approach achieves a new state-of-the-art FID score of 9.8 on conditional CIFAR-10 image generation.
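The pretext task is easiest to see in data-construction form: build two (image, transformed image) pairs and label whether the same latent perturbation produced both. The scalar linear "generator" and the label construction below are toy stand-ins chosen so the effect of a shared perturbation is visible by inspection.

```python
# Sketch of the LT-GAN pretext task's training examples, with a toy generator.
import random

def generator(z, W=2.0):
    """Toy generator: a scalar linear map instead of a deep network."""
    return [W * zi for zi in z]

def make_pretext_example(z1, z2, same, eps_pool, rng):
    """Build ((G(z1), G(z1+e1)), (G(z2), G(z2+e2))) with label `same`.

    If same is True both pairs use the same latent perturbation e1;
    otherwise the second pair uses a different e2 from the pool.
    """
    e1 = rng.choice(eps_pool)
    e2 = e1 if same else rng.choice([e for e in eps_pool if e != e1])
    pair1 = (generator(z1), generator([zi + e1 for zi in z1]))
    pair2 = (generator(z2), generator([zi + e2 for zi in z2]))
    return pair1, pair2, int(same)
```

An auxiliary classifier trained on such examples can only succeed if equal latent perturbations produce recognizably consistent image changes, which is the pressure LT-GAN puts on the generator.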
LT-GAN: Self-Supervised GAN with Latent Transformation Detection
— AK (@ak92501) October 21, 2020
pdf: https://t.co/lrq8BfqYNA
abs: https://t.co/mNJZpYgPjv pic.twitter.com/RUFe0CIFUI
11. CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne, Hiroshi Noji, Pierre Zweigenbaum, Junichi Tsujii
Due to the compelling improvements brought by BERT, many recent representation models adopted the Transformer architecture as their main building block, consequently inheriting the wordpiece tokenization system despite it not being intrinsically linked to the notion of Transformers. While this system is thought to achieve a good balance between the flexibility of characters and the efficiency of full words, using predefined wordpiece vocabularies from the general domain is not always suitable, especially when building models for specialized domains (e.g., the medical domain). Moreover, adopting a wordpiece tokenization shifts the focus from the word level to the subword level, making the models conceptually more complex and arguably less convenient in practice. For these reasons, we propose CharacterBERT, a new variant of BERT that drops the wordpiece system altogether and uses a Character-CNN module instead to represent entire words by consulting their characters. We show that this new model improves the performance of BERT on a variety of medical domain tasks while at the same time producing robust, word-level and open-vocabulary representations.
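The Character-CNN idea (embed characters, convolve, max-pool into a single word vector) can be sketched end to end in plain Python. All dimensions, the hash-seeded character embeddings, and the random filter bank below are toy choices for illustration, not CharacterBERT's actual configuration.

```python
# Toy Character-CNN word encoder: chars -> embeddings -> conv -> max-pool.
import random

def char_cnn_word_vector(word, emb_dim=4, kernel=3, out_dim=5, seed=0):
    rng = random.Random(seed)

    def char_emb(ch):
        # Deterministic per-character embedding (hash-seeded for the sketch).
        r = random.Random(ord(ch))
        return [r.uniform(-1, 1) for _ in range(emb_dim)]

    # A fixed random filter bank: out_dim filters over kernel-sized windows.
    filters = [[rng.uniform(-1, 1) for _ in range(kernel * emb_dim)]
               for _ in range(out_dim)]

    chars = [char_emb(c) for c in word]
    while len(chars) < kernel:           # pad short words to one full window
        chars.append([0.0] * emb_dim)

    pooled = [float("-inf")] * out_dim
    for i in range(len(chars) - kernel + 1):
        window = [v for c in chars[i:i + kernel] for v in c]
        for f in range(out_dim):
            act = sum(w * v for w, v in zip(filters[f], window))
            pooled[f] = max(pooled[f], act)  # max-pool over positions
    return pooled
```

Because the vector is computed from characters rather than looked up in a wordpiece vocabulary, any string, including misspellings and unseen medical terms, maps to a well-defined word-level representation.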