All Articles

Hot Papers 2020-11-04

1. Learning Deformable Tetrahedral Meshes for 3D Reconstruction

Jun Gao, Wenzheng Chen, Tommy Xiang, Alec Jacobson, Morgan McGuire, Sanja Fidler

  • retweets: 1625, favorites: 310 (11/05/2020 11:13:31)
  • links: abs | pdf
  • cs.CV

3D shape representations that accommodate learning-based 3D reconstruction are an open problem in machine learning and computer graphics. Previous work on neural 3D reconstruction demonstrated benefits, but also limitations, of point cloud, voxel, surface mesh, and implicit function representations. We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes volumetric tetrahedral meshes for the reconstruction problem. Unlike existing volumetric approaches, DefTet optimizes for both vertex placement and occupancy, and is differentiable with respect to standard 3D reconstruction loss functions. It is thus simultaneously high-precision, volumetric, and amenable to learning-based neural architectures. We show that it can represent arbitrary, complex topology, is efficient in both memory and computation, and can produce high-fidelity reconstructions with a significantly smaller grid size than alternative volumetric approaches. The predicted surfaces are also inherently defined as tetrahedral meshes and thus require no post-processing. We demonstrate that DefTet matches or exceeds both the quality of the previous best approaches and the performance of the fastest ones. Our approach obtains high-quality tetrahedral meshes computed directly from noisy point clouds, and is the first to showcase high-quality 3D tet-mesh results using only a single image as input.
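
To make the core idea concrete, here is a minimal, hypothetical sketch (not the authors' code) of a DefTet-style prediction head in PyTorch: a network regresses per-vertex offsets that deform a fixed tetrahedral grid plus a per-tetrahedron occupancy, both of which could be trained end-to-end against reconstruction losses. The names `base_vertices` and the MLP sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class DefTetHead(nn.Module):
    """Sketch of a head predicting vertex offsets and per-tet occupancy."""
    def __init__(self, feat_dim, num_vertices, num_tets):
        super().__init__()
        self.offset_mlp = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, num_vertices * 3)
        )
        self.occ_mlp = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, num_tets)
        )

    def forward(self, feat, base_vertices):
        # feat: (B, feat_dim) global feature from an image or point-cloud encoder
        # base_vertices: (num_vertices, 3) vertices of the regular tetrahedral grid
        offsets = self.offset_mlp(feat).view(-1, base_vertices.shape[0], 3)
        vertices = base_vertices.unsqueeze(0) + offsets     # deformed grid
        occupancy = torch.sigmoid(self.occ_mlp(feat))        # per-tet occupancy in [0, 1]
        return vertices, occupancy
```

Because both outputs are plain tensors, any differentiable surface or volumetric loss can backpropagate into vertex positions and occupancies alike, which is the property the abstract highlights.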

2. Sample-efficient reinforcement learning using deep Gaussian processes

Charles Gadd, Markus Heinonen, Harri Lähdesmäki, Samuel Kaski

Reinforcement learning provides a framework for learning, through trial and error, which actions to take to complete a task. In many applications observing interactions is costly, necessitating sample-efficient learning. In model-based reinforcement learning, efficiency is improved by learning to simulate the world dynamics. The challenge is that model inaccuracies rapidly accumulate over planned trajectories. We introduce deep Gaussian processes, in which the depth of the composition adds model complexity while the incorporation of prior knowledge about the dynamics brings smoothness and structure. Our approach is able to sample a Bayesian posterior over trajectories. We demonstrate greatly improved early sample efficiency over competing methods. This is shown across a number of continuous control tasks, including the half-cheetah, whose contact dynamics have previously posed an insurmountable problem for earlier sample-efficient Gaussian process based models.
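
As a rough illustration of the model-based loop the abstract describes, the sketch below rolls out trajectories by repeatedly sampling a learned probabilistic dynamics model; `dynamics.sample_next` is a hypothetical stand-in for a draw from the paper's deep GP posterior over next states, not its actual API.

```python
import numpy as np

def sample_trajectories(dynamics, policy, s0, horizon, n_samples, rng):
    """Roll out n_samples trajectories by sampling the learned dynamics model."""
    trajs = np.zeros((n_samples, horizon + 1, s0.shape[-1]))
    trajs[:, 0] = s0
    for i in range(n_samples):
        s = s0.copy()
        for t in range(horizon):
            a = policy(s)
            # One posterior draw of the next state; uncertainty propagates
            # through the rollout instead of collapsing to a point estimate.
            s = dynamics.sample_next(s, a, rng)
            trajs[i, t + 1] = s
    return trajs
```

Planning against a distribution of sampled rollouts, rather than a single deterministic one, is what keeps accumulated model error from being mistaken for certainty.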

3. StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization

Ahmed Mustafa, Nicola Pia, Guillaume Fuchs

In recent years, neural vocoders have surpassed classical speech generation approaches in naturalness and perceptual quality of the synthesized speech. Computationally heavy models like WaveNet and WaveGlow achieve the best results, while lightweight GAN models, e.g. MelGAN and Parallel WaveGAN, remain inferior in terms of perceptual quality. We therefore propose StyleMelGAN, a lightweight neural vocoder allowing synthesis of high-fidelity speech with low computational complexity. StyleMelGAN employs temporal adaptive normalization to style a low-dimensional noise vector with the acoustic features of the target speech. For efficient training, multiple random-window discriminators adversarially evaluate the speech signal analyzed by a filter bank, with regularization provided by a multi-scale spectral reconstruction loss. The highly parallelizable speech generation is several times faster than real-time on CPUs and GPUs. MUSHRA and P.800 listening tests show that StyleMelGAN outperforms prior neural vocoders in copy-synthesis and Text-to-Speech scenarios.
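
The temporal adaptive normalization idea can be sketched roughly as follows (an assumption-laden illustration, not the paper's architecture): a hidden activation is normalized and then re-styled with scale and shift signals predicted from the mel-spectrogram conditioning, upsampled to the activation's time resolution. The kernel sizes and the choice of instance normalization here are guesses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalAdaptiveNorm(nn.Module):
    """Sketch: normalize, then modulate with conditioning-derived scale/shift."""
    def __init__(self, channels, mel_channels):
        super().__init__()
        self.norm = nn.InstanceNorm1d(channels, affine=False)
        self.to_gamma = nn.Conv1d(mel_channels, channels, kernel_size=3, padding=1)
        self.to_beta = nn.Conv1d(mel_channels, channels, kernel_size=3, padding=1)

    def forward(self, x, mel):
        # x: (B, channels, T) hidden activation; mel: (B, mel_channels, T_mel)
        mel = F.interpolate(mel, size=x.shape[-1], mode="nearest")  # match time resolution
        return self.norm(x) * self.to_gamma(mel) + self.to_beta(mel)
```

The design lets the acoustic features steer the noise-driven generator at every layer rather than only at the input, which is what "styling" the noise vector refers to.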

4. Recommendations for Bayesian hierarchical model specifications for case-control studies in mental health

Vincent Valton, Toby Wise, Oliver J. Robinson

Hierarchical model fitting has become commonplace for case-control studies of cognition and behaviour in mental health. However, these techniques require us to formalise assumptions about the data-generating process at the group level, which may not be known. Specifically, researchers typically must choose whether to assume all subjects are drawn from a common population, or to model them as deriving from separate populations. These assumptions have profound implications for computational psychiatry, as they affect the resulting inference (latent parameter recovery) and may conflate or mask true group-level differences. To test these assumptions, we ran systematic simulations on synthetic multi-group behavioural data from a commonly used multi-armed bandit task (reinforcement learning task). We then examined recovery of group differences in latent parameter space under the two commonly used generative modelling assumptions: (1) modelling groups under a common, shared group-level prior (assuming all participants are generated from a common distribution and are likely to share common characteristics); (2) modelling separate groups based on symptomatology or diagnostic labels, resulting in separate group-level priors. We evaluated the robustness of these approaches to variations in data quality and prior specifications on a variety of metrics. We found that fitting groups separately (assumption 2) provided the most accurate and robust inference across all conditions. Our results suggest that when dealing with data from multiple clinical groups, researchers should analyse patient and control groups separately, as this provides the most accurate and robust recovery of the parameters of interest.
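
For readers unfamiliar with the two specifications, here is a deliberately simplified PyMC3 sketch (not the authors' models): it treats per-subject parameter estimates as observations rather than fitting the full reinforcement-learning likelihood, and contrasts a shared group-level prior (assumption 1) with separate patient/control priors (assumption 2).

```python
import pymc3 as pm

def shared_prior_model(alpha_patients, alpha_controls):
    # Assumption 1: all participants drawn from one common population.
    with pm.Model() as model:
        mu = pm.Normal("mu", mu=0.0, sigma=1.0)
        sigma = pm.HalfNormal("sigma", sigma=1.0)
        pm.Normal("patients", mu=mu, sigma=sigma, observed=alpha_patients)
        pm.Normal("controls", mu=mu, sigma=sigma, observed=alpha_controls)
    return model

def separate_prior_model(alpha_patients, alpha_controls):
    # Assumption 2: patients and controls each get their own group-level prior.
    with pm.Model() as model:
        mu_p = pm.Normal("mu_p", mu=0.0, sigma=1.0)
        mu_c = pm.Normal("mu_c", mu=0.0, sigma=1.0)
        sigma_p = pm.HalfNormal("sigma_p", sigma=1.0)
        sigma_c = pm.HalfNormal("sigma_c", sigma=1.0)
        pm.Normal("patients", mu=mu_p, sigma=sigma_p, observed=alpha_patients)
        pm.Normal("controls", mu=mu_c, sigma=sigma_c, observed=alpha_controls)
    return model
```

In the shared-prior model, shrinkage pulls both groups toward a single population mean, which is exactly the mechanism that can mask true group differences; fitting groups separately avoids that pooling.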

5. The Complexity of Gradient Descent: CLS = PPAD ∩ PLS

John Fearnley, Paul W. Goldberg, Alexandros Hollender, Rahul Savani

We study search problems that can be solved by performing Gradient Descent on a bounded convex polytopal domain and show that this class is equal to the intersection of two well-known classes: PPAD and PLS. As our main underlying technical contribution, we show that computing a Karush-Kuhn-Tucker (KKT) point of a continuously differentiable function over the domain [0,1]^2 is PPAD ∩ PLS-complete. This is the first natural problem to be shown complete for this class. Our results also imply that the class CLS (Continuous Local Search) - which was defined by Daskalakis and Papadimitriou as a more “natural” counterpart to PPAD ∩ PLS and contains many interesting problems - is itself equal to PPAD ∩ PLS.
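
For reference, what it means for a point of the box-constrained minimization problem to be a KKT point can be stated as follows (a standard restatement, not quoted from the paper): informally, projected gradient descent can make no local progress at such a point.

```latex
% x* in [0,1]^2 is a KKT point of a continuously differentiable f iff,
% for each coordinate i in {1, 2}, one of the following holds:
\[
\frac{\partial f}{\partial x_i}(x^{*}) = 0,
\qquad\text{or}\qquad
x^{*}_i = 0 \ \text{and}\ \frac{\partial f}{\partial x_i}(x^{*}) \ge 0,
\qquad\text{or}\qquad
x^{*}_i = 1 \ \text{and}\ \frac{\partial f}{\partial x_i}(x^{*}) \le 0 .
\]
```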

6. CharBERT: Character-aware Pre-trained Language Model

Wentao Ma, Yiming Cui, Chenglei Si, Ting Liu, Shijin Wang, Guoping Hu

  • retweets: 90, favorites: 50 (11/05/2020 11:13:33)
  • links: abs | pdf
  • cs.CL

Most pre-trained language models (PLMs) construct word representations at the subword level with Byte-Pair Encoding (BPE) or its variations, with which OOV (out-of-vocabulary) words are almost entirely avoided. However, those methods split a word into subword units, making the representation incomplete and fragile. In this paper, we propose a character-aware pre-trained language model named CharBERT, which improves on previous methods (such as BERT and RoBERTa) to tackle these problems. We first construct the contextual word embedding for each token from the sequential character representations, then fuse the character representations and the subword representations via a novel heterogeneous interaction module. We also propose a new pre-training task named NLM (Noisy LM) for unsupervised character representation learning. We evaluate our method on question answering, sequence labeling, and text classification tasks, both on the original datasets and adversarial misspelling test sets. The experimental results show that our method can significantly improve the performance and robustness of PLMs simultaneously. Pretrained models, evaluation sets, and code are available at https://github.com/wtma/CharBERT
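
The fusion step can be pictured with the speculative sketch below; the real heterogeneous interaction module differs in detail (see the repository linked above), and the projection/concatenation scheme here is only an assumption about how token-aligned character and subword representations might be combined.

```python
import torch
import torch.nn as nn

class CharSubwordFusion(nn.Module):
    """Sketch: fuse token-aligned character-derived and subword representations."""
    def __init__(self, hidden_size):
        super().__init__()
        self.char_proj = nn.Linear(hidden_size, hidden_size)
        self.subword_proj = nn.Linear(hidden_size, hidden_size)
        self.fuse = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, char_repr, subword_repr):
        # char_repr, subword_repr: (B, seq_len, hidden_size), aligned per token
        h_char = torch.tanh(self.char_proj(char_repr))
        h_sub = torch.tanh(self.subword_proj(subword_repr))
        return self.fuse(torch.cat([h_char, h_sub], dim=-1))
```

Keeping a character channel alongside the subword channel is what gives the model something to fall back on when misspellings break the BPE segmentation.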

7. Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

Beliz Gunel, Jingfei Du, Alexis Conneau, Ves Stoyanov

  • retweets: 62, favorites: 40 (11/05/2020 11:13:33)
  • links: abs | pdf
  • cs.CL | cs.LG

State-of-the-art natural language understanding classification models follow two stages: pre-training a large language model on an auxiliary task, and then fine-tuning the model on a task-specific labeled dataset using cross-entropy loss. Cross-entropy loss has several shortcomings that can lead to sub-optimal generalization and instability. Driven by the intuition that good generalization requires capturing the similarity between examples in one class and contrasting them with examples in other classes, we propose a supervised contrastive learning (SCL) objective for the fine-tuning stage. Combined with cross-entropy, the SCL loss we propose obtains improvements over a strong RoBERTa-Large baseline on multiple datasets of the GLUE benchmark in both the high-data and low-data regimes, and it does not require any specialized architecture, data augmentation of any kind, memory banks, or additional unsupervised data. We also demonstrate that the new objective leads to models that are more robust to different levels of noise in the training data, and can generalize better to related tasks with limited labeled task data.
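
A minimal PyTorch sketch of a supervised contrastive term combined with cross-entropy is shown below; it follows the standard SCL formulation, and the temperature and mixing weight are illustrative defaults, not the paper's tuned values.

```python
import torch
import torch.nn.functional as F

def scl_loss(features, labels, temperature=0.3):
    # features: (B, d) encoder outputs; labels: (B,) integer class labels
    z = F.normalize(features, dim=-1)
    sim = z @ z.t() / temperature                        # pairwise similarities (B, B)
    mask_self = torch.eye(len(labels), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask_self, float("-inf"))      # drop i == a from the denominator
    log_prob = sim - torch.logsumexp(sim, dim=-1, keepdim=True)
    positives = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~mask_self
    n_pos = positives.sum(dim=-1).clamp(min=1)
    # Average log-probability of same-class pairs, averaged over the batch.
    return -(log_prob.masked_fill(~positives, 0.0).sum(dim=-1) / n_pos).mean()

def total_loss(logits, features, labels, lam=0.1):
    # Weighted combination of cross-entropy and the SCL term (lam is illustrative).
    return (1 - lam) * F.cross_entropy(logits, labels) + lam * scl_loss(features, labels)
```

Because the contrastive term only needs the in-batch representations and labels, it adds no memory bank, augmentation pipeline, or architectural change, which is the practical point the abstract emphasizes.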

8. Transforming Gaussian Processes With Normalizing Flows

Juan Maroñas, Oliver Hamelijnck, Jeremias Knoblauch, Theodoros Damoulas

  • retweets: 64, favorites: 35 (11/05/2020 11:13:33)
  • links: abs | pdf
  • cs.LG

Gaussian Processes (GPs) can be used as flexible, non-parametric function priors. Inspired by the growing body of work on Normalizing Flows, we enlarge this class of priors through a parametric invertible transformation that can be made input-dependent. Doing so also allows us to encode interpretable prior knowledge (e.g., boundedness constraints). We derive a variational approximation to the resulting Bayesian inference problem, which is as fast as stochastic variational GP regression (Hensman et al., 2013; Dezfouli and Bonilla, 2015). This makes the model a computationally efficient alternative to other hierarchical extensions of GP priors (Lazaro-Gredilla, 2012; Damianou and Lawrence, 2013). The resulting algorithm’s computational and inferential performance is excellent, and we demonstrate this on a range of data sets. For example, even with only 5 inducing points and an input-dependent flow, our method is consistently competitive with a standard sparse GP fitted using 100 inducing points.
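
A toy illustration of the prior construction (not the paper's model or inference scheme): draw a sample from a GP prior and warp it marginally with an invertible, input-dependent transformation; here a softplus warp enforces positivity, standing in for the kind of boundedness constraint the abstract mentions.

```python
import numpy as np

def rbf_kernel(x, lengthscale=0.3, variance=1.0):
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
K = rbf_kernel(x) + 1e-8 * np.eye(len(x))                # jitter for stability
f = rng.multivariate_normal(np.zeros(len(x)), K)          # GP prior draw

# Input-dependent invertible warp: parameters vary with x, and the softplus
# output keeps the transformed prior sample positive (a boundedness constraint).
shift, scale = 0.5 * x, 1.0 + x
g = np.log1p(np.exp(scale * f + shift))
```

The paper's contribution is doing variational inference under such transformed priors at the same cost as a standard sparse GP; the snippet only shows what the transformed prior samples look like.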

9. Revisiting Adaptive Convolutions for Video Frame Interpolation

Simon Niklaus, Long Mai, Oliver Wang

  • retweets: 56, favorites: 28 (11/05/2020 11:13:33)
  • links: abs | pdf
  • cs.CV

Video frame interpolation, the synthesis of novel views in time, is an increasingly popular research direction with many new papers further advancing the state of the art. But as each new method comes with a host of variables that affect the interpolation quality, it can be hard to tell what is actually important for this task. In this work, we show, somewhat surprisingly, that it is possible to achieve near state-of-the-art results with an older, simpler approach, namely adaptive separable convolutions, through a subtle set of low-level improvements. In doing so, we propose a number of intuitive but effective techniques to improve frame interpolation quality, which also hold potential for other applications of adaptive convolutions such as burst image denoising, joint image filtering, or video prediction.
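
For context, the underlying adaptive separable convolution can be sketched as below (a simplified illustration, not the authors' implementation): per-pixel vertical and horizontal 1D kernels, predicted by a network that is omitted here, are applied to local patches of each input frame via an outer product.

```python
import torch
import torch.nn.functional as F

def sepconv(frame, k_v, k_h):
    # frame: (B, C, H, W); k_v, k_h: (B, K, H, W) per-pixel 1D kernels, K odd
    b, c, h, w = frame.shape
    k = k_v.size(1)
    patches = F.unfold(frame, kernel_size=k, padding=k // 2)       # (B, C*K*K, H*W)
    patches = patches.view(b, c, k, k, h, w)
    # Outer product of the vertical and horizontal kernels gives a K x K kernel per pixel.
    weights = k_v.view(b, 1, k, 1, h, w) * k_h.view(b, 1, 1, k, h, w)
    return (patches * weights).sum(dim=(2, 3))                     # (B, C, H, W)

# The interpolated frame is then the sum of the two adaptively filtered inputs, e.g.
# out = sepconv(frame0, kv0, kh0) + sepconv(frame1, kv1, kh1)
```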

10. Levels of Coupling in Dyadic Interaction: An Analysis of Neural and Behavioral Complexity

Georgina Montserrat Reséndiz-Benhumea, Ekaterina Sangati, Tom Froese

  • retweets: 30, favorites: 20 (11/05/2020 11:13:33)
  • links: abs | pdf
  • cs.MA

From an enactive approach, previous studies have demonstrated that social interaction plays a fundamental role in the dynamics of neural and behavioral complexity of embodied agents. In particular, it has been shown that agents with a limited internal structure (2-neuron brains) that evolve in interaction can overcome this limitation and exhibit chaotic neural activity, typically associated with more complex dynamical systems (at least 3-dimensional). In the present paper we make two contributions to this line of work. First, we propose a conceptual distinction in levels of coupling between agents that could have an effect on neural and behavioral complexity. Second, we test the generalizability of previous results using agents with a richer internal structure, evolved in a richer, yet non-social, environment. We demonstrate that such agents can achieve levels of complexity comparable to agents that evolve in interactive settings. We discuss the significance of this result for the study of interaction.