All Articles

Hot Papers 2020-12-14

1. Discriminating Between Similar Nordic Languages

René Haas, Leon Derczynski

  • retweets: 342, favorites: 83 (12/15/2020 10:16:52)
  • links: abs | pdf
  • cs.CL

Automatic language identification is a challenging problem. Discriminating between closely related languages is especially difficult. This paper presents a machine learning approach to automatic language identification for the Nordic languages, which often suffer miscategorisation by existing state-of-the-art tools. Concretely, we focus on discriminating between six Nordic languages: Danish, Swedish, Norwegian (Nynorsk), Norwegian (Bokmål), Faroese and Icelandic.
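
Though the paper's own pipeline may differ, the core task is well illustrated by a character n-gram baseline. The sketch below assumes a labelled corpus of (sentence, language-code) pairs; the toy training data and language codes are placeholders:

```python
# Minimal character n-gram baseline for Nordic language ID (not the paper's
# exact model). Assumes a labelled corpus; the two samples below are toys.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

LANGS = ["da", "sv", "nn", "nb", "fo", "is"]  # the six target languages

train_texts = ["jeg hedder Anna", "jag heter Anna"]  # placeholder corpus
train_labels = ["da", "sv"]                          # matching language codes

# Character n-grams capture the orthographic cues (å/ø/æ, ð/þ, ...) that
# separate closely related languages better than word features do.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_texts, train_labels)
print(clf.predict(["Hvor er nærmeste busstopp?"]))  # toy prediction only
```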

2. Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

Malik Boudiaf, Hoel Kervadec, Ziko Imtiaz Masud, Pablo Piantanida, Ismail Ben Ayed, Jose Dolz

  • retweets: 102, favorites: 54 (12/15/2020 10:16:53)
  • links: abs | pdf
  • cs.CV

Few-shot segmentation has recently attracted substantial interest, with the popular meta-learning paradigm widely dominating the literature. We show that the way inference is performed for a given few-shot segmentation task has a substantial effect on performance, an aspect that has been overlooked in the literature. We introduce a transductive inference that leverages the statistics of the unlabeled pixels of a task by optimizing a new loss containing three complementary terms: (i) a standard cross-entropy on the labeled pixels; (ii) the entropy of posteriors on the unlabeled query pixels; and (iii) a global KL-divergence regularizer based on the proportion of the predicted foreground region. Our inference uses a simple linear classifier on the extracted features, has a computational load comparable to inductive inference, and can be used on top of any base training. Using standard cross-entropy training on the base classes, our inference yields highly competitive performance on well-known few-shot segmentation benchmarks. On PASCAL-5i, it brings roughly a 5% improvement over the best-performing state-of-the-art method in the 5-shot scenario, while being on par in the 1-shot setting. Even more surprisingly, this gap widens as the number of support samples increases, reaching up to 6% in the 10-shot scenario. Furthermore, we introduce a more realistic setting with domain shift, where the base and novel classes are drawn from different datasets. In this setting, our method achieves the best performance.
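
The three loss terms are simple to express. Below is a minimal PyTorch sketch of such a transductive objective, assuming per-pixel binary (foreground/background) logits; function and argument names are illustrative, and the paper's term weighting is omitted:

```python
# Sketch of a three-term transductive loss; simplified from the paper
# (no term weights, fixed KL direction, binary foreground/background).
import torch
import torch.nn.functional as F

def transductive_loss(support_logits, support_labels, query_logits, fg_prior):
    """support_logits: (Ns, 2); support_labels: (Ns,) long;
    query_logits: (Nq, 2); fg_prior: scalar foreground-proportion estimate."""
    # (i) standard cross-entropy on the labelled support pixels
    ce = F.cross_entropy(support_logits, support_labels)

    # (ii) Shannon entropy of posteriors on the unlabelled query pixels
    q_probs = query_logits.softmax(dim=-1)
    ent = -(q_probs * torch.log(q_probs + 1e-10)).sum(dim=-1).mean()

    # (iii) KL divergence between predicted and prior foreground proportion
    pred_fg = q_probs[:, 1].mean()
    pred = torch.stack([1 - pred_fg, pred_fg])
    prior = torch.tensor([1 - fg_prior, fg_prior])
    kl = (pred * (torch.log(pred + 1e-10) - torch.log(prior))).sum()

    return ce + ent + kl
```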

3. Detailed 3D Human Body Reconstruction from Multi-view Images Combining Voxel Super-Resolution and Learned Implicit Representation

Zhongguo Li, Magnus Oskarsson, Anders Heyden

  • retweets: 90, favorites: 54 (12/15/2020 10:16:53)
  • links: abs | pdf
  • cs.CV

Reconstructing detailed 3D human body models from images is an interesting but challenging computer vision task due to the high degrees of freedom of the human body. To tackle the problem, we propose a coarse-to-fine method that reconstructs a detailed 3D human body from multi-view images by combining voxel super-resolution with a learned implicit representation. First, coarse 3D models are estimated by learning an implicit representation from multi-scale features, which are extracted by multi-stage hourglass networks from the multi-view images. Then, taking the low-resolution voxel grids generated from the coarse 3D models as input, voxel super-resolution based on an implicit representation is learned through a multi-stage 3D convolutional neural network. Finally, refined detailed 3D human body models are produced by the voxel super-resolution, which preserves detail and reduces the false reconstructions of the coarse 3D models. Thanks to the implicit representation, training is memory efficient, and the detailed 3D human body produced from multi-view images is represented as a continuous decision boundary with high-resolution geometry. In addition, the coarse-to-fine pipeline based on voxel super-resolution simultaneously removes false reconstructions and preserves appearance details in the final reconstruction. In our experiments, the method achieves quantitatively and qualitatively competitive 3D human body reconstructions from images with various poses and shapes on both real and synthetic datasets.
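
For readers unfamiliar with implicit representations, the following minimal PyTorch sketch shows the core idea of querying a continuous occupancy field; feature extraction and the voxel super-resolution stage are omitted, and names and sizes are illustrative:

```python
# Sketch of the implicit-representation idea: an MLP predicts the occupancy
# of a continuous 3D query point from image features, so the body surface is
# a decision boundary rather than a fixed-resolution grid.
import torch
import torch.nn as nn

class ImplicitDecoder(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),  # inside/outside logit
        )

    def forward(self, feats, points):
        # feats: (B, feat_dim) per-point image features; points: (B, 3) xyz
        return self.mlp(torch.cat([feats, points], dim=-1))

# A mesh is then extracted from the occupancy field, e.g. via marching cubes
# over a dense grid of query points.
occ = ImplicitDecoder()(torch.rand(8, 256), torch.rand(8, 3))
```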

4. Quantum-accelerated multilevel Monte Carlo methods for stochastic differential equations in mathematical finance

Dong An, Noah Linden, Jin-Peng Liu, Ashley Montanaro, Changpeng Shao, Jiasu Wang

Inspired by recent progress in quantum algorithms for ordinary and partial differential equations, we study quantum algorithms for stochastic differential equations (SDEs). First, we provide a quantum algorithm that gives a quadratic speed-up for multilevel Monte Carlo methods in a general setting. As applications, we apply it to compute expectation values determined by classical solutions of SDEs, with improved dependence on precision. We demonstrate the use of this algorithm in a variety of applications arising in mathematical finance, such as the Black-Scholes and Local Volatility models, and Greeks. We also provide a quantum algorithm based on sublinear binomial sampling for the binomial option pricing model with the same improvement.
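
For context, here is a sketch of the classical multilevel Monte Carlo estimator that the quantum algorithm quadratically accelerates, pricing a European call under Black-Scholes with Euler-Maruyama discretisation; parameters and path counts are illustrative:

```python
# Classical MLMC baseline (not the paper's quantum algorithm): telescoping
# estimate E[P_L] = E[P_0] + sum_l E[P_l - P_{l-1}] with coupled path pairs.
import numpy as np

rng = np.random.default_rng(0)
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0  # toy call-option setup

def payoff_pair(level, n_paths, M=4):
    """Discounted payoff at `level` plus the coupled coarse-level payoff."""
    n_fine = M ** level
    dt = T / n_fine
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_fine))
    S = S0 * np.ones(n_paths)
    for i in range(n_fine):                       # Euler-Maruyama, fine path
        S += r * S * dt + sigma * S * dW[:, i]
    P_fine = np.exp(-r * T) * np.maximum(S - K, 0.0)
    if level == 0:
        return P_fine, np.zeros(n_paths)
    # coarse path reuses the same Brownian increments, summed in blocks of M
    dW_c = dW.reshape(n_paths, n_fine // M, M).sum(axis=2)
    S = S0 * np.ones(n_paths)
    for i in range(n_fine // M):
        S += r * S * (M * dt) + sigma * S * dW_c[:, i]
    return P_fine, np.exp(-r * T) * np.maximum(S - K, 0.0)

estimate = sum(np.mean(f - c) for f, c in
               (payoff_pair(l, 20000 // (2 ** l) + 1000) for l in range(4)))
print(f"MLMC price estimate: {estimate:.3f}")
```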

5. Sublinear classical and quantum algorithms for general matrix games

Tongyang Li, Chunhao Wang, Shouvanik Chakrabarti, Xiaodi Wu

We investigate sublinear classical and quantum algorithms for matrix games, a fundamental problem in optimization and machine learning, with provable guarantees. Given a matrix $A\in\mathbb{R}^{n\times d}$, sublinear algorithms for the matrix game $\min_{x\in\mathcal{X}}\max_{y\in\mathcal{Y}} y^{\top} Ax$ were previously known only for two special cases: (1) $\mathcal{Y}$ being the $\ell_{1}$-norm unit ball, and (2) $\mathcal{X}$ being either the $\ell_{1}$- or the $\ell_{2}$-norm unit ball. We give a sublinear classical algorithm that can interpolate smoothly between these two cases: for any fixed $q\in(1,2]$, we solve the matrix game where $\mathcal{X}$ is an $\ell_{q}$-norm unit ball within additive error $\epsilon$ in time $\tilde{O}((n+d)/\epsilon^{2})$. We also provide a corresponding sublinear quantum algorithm that solves the same task in time $\tilde{O}((\sqrt{n}+\sqrt{d})\,\mathrm{poly}(1/\epsilon))$, a quadratic improvement in both $n$ and $d$. Both our classical and quantum algorithms are optimal in the dimension parameters $n$ and $d$ up to poly-logarithmic factors. Finally, we propose sublinear classical and quantum algorithms for the approximate Carathéodory problem and the $\ell_{q}$-margin support vector machines as applications.
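
As a reference point, here is a sketch of multiplicative-weights updates for the fully observed simplex ($\ell_1 \times \ell_1$) special case; the paper's sublinear algorithms instead sample entries of $A$ (and, in the quantum case, use quantum subroutines) rather than reading the whole matrix each iteration:

```python
# Multiplicative-weights sketch for the fully observed l1 x l1 matrix game
# min_x max_y y^T A x over simplices; each step costs O(nd), so this is the
# non-sublinear baseline, not the paper's sampled algorithm.
import numpy as np

def mwu_matrix_game(A, eps=0.05, eta=0.1):
    n, d = A.shape
    x, y = np.ones(d) / d, np.ones(n) / n
    x_avg, y_avg = np.zeros(d), np.zeros(n)
    T = int(np.ceil(np.log(max(n, d)) / eps ** 2))
    for _ in range(T):
        x *= np.exp(-eta * (A.T @ y))   # x-player minimises
        y *= np.exp(eta * (A @ x))      # y-player maximises
        x /= x.sum(); y /= y.sum()
        x_avg += x / T; y_avg += y / T  # averaged iterates approach equilibrium
    return x_avg, y_avg

A = np.random.default_rng(1).uniform(-1, 1, size=(50, 30))
x, y = mwu_matrix_game(A)
print("duality gap:", (A @ x).max() - (A.T @ y).min())
```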

6. Characterizing Twitter users behaviour during the Spanish Covid-19 first wave

Bernat Esquirol, Luce Prignano, Albert Díaz-Guilera, Emanuele Cozzo

  • retweets: 90, favorites: 14 (12/15/2020 10:16:53)
  • links: abs | pdf
  • cs.SI

People use Online Social Media to make sense of crisis events. A pandemic crisis like the Covid-19 outbreak is a complex event involving many aspects of social life on many temporal scales. Focusing on the Spanish Twittersphere, we characterize users' activity behaviour across the different phases of the Covid-19 first wave. By performing stepwise segmented regression analysis and Bayesian switchpoint analysis on a sample of Spanish Twittersphere users' timelines, we observe that generic Spanish Twitter users and journalists experience an abrupt relative increase in their tweeting activity between March 9th and 14th, coinciding with the announcement of control measures by regional and State-level authorities. Politicians, on the contrary, follow a completely endogenous dynamic determined by the institutional agenda.
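
As a toy analogue of the switchpoint analysis, the sketch below brute-forces a single breakpoint in a synthetic daily-activity series by fitting a line to each side and minimising squared error; the paper's stepwise segmented regression and Bayesian switchpoint models are more elaborate:

```python
# Single-breakpoint segmented regression on a synthetic activity series.
import numpy as np

def best_switchpoint(t, y):
    """Try every breakpoint; fit a line to each side; keep the min-SSE split."""
    best = (None, np.inf)
    for tau in range(2, len(t) - 2):          # >= 2 points per segment
        sse = 0.0
        for seg in (slice(0, tau), slice(tau, None)):
            coef = np.polyfit(t[seg], y[seg], 1)
            resid = y[seg] - np.polyval(coef, t[seg])
            sse += (resid ** 2).sum()
        if sse < best[1]:
            best = (tau, sse)
    return best[0]

# toy series: flat activity, then an abrupt jump with a steeper trend
t = np.arange(30, dtype=float)
y = np.where(t < 14, 100 + t, 60 + 8 * t)
y += np.random.default_rng(0).normal(0, 5, 30)
print("estimated switchpoint index:", best_switchpoint(t, y))
```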

7. Relighting Images in the Wild with a Self-Supervised Siamese Auto-Encoder

Yang Liu, Alexandros Neophytou, Sunando Sengupta, Eric Sommerlade

  • retweets: 38, favorites: 35 (12/15/2020 10:16:53)
  • links: abs | pdf
  • cs.CV

We propose a self-supervised method for relighting single-view images in the wild. The method is based on an auto-encoder that deconstructs an image into two separate encodings relating to the scene illumination and content, respectively. To disentangle this embedding information without supervision, we exploit the assumption that some augmentation operations do not affect the image content and only affect the direction of the light. A novel loss function, called the spherical harmonic loss, is introduced to force the illumination embedding into the form of a spherical harmonic vector. We train our model on large-scale datasets such as YouTube-8M and CelebA. Our experiments show that our method can correctly estimate scene illumination and realistically re-light input images, without any supervision or prior shape model. Compared to supervised methods, our approach achieves similar performance while avoiding common lighting artifacts.
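
A heavily simplified sketch of the siamese training step follows; the encoder/decoder stubs are toys, the horizontal flip merely stands in for a lighting-only augmentation, and the spherical harmonic loss itself is omitted:

```python
# Siamese disentanglement sketch: an augmentation assumed to change only the
# lighting should leave the content code unchanged. All modules are toy
# stand-ins, not the paper's architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy: maps a 3x32x32 image to (illumination, content) codes."""
    def __init__(self, sh_dim=9, content_dim=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(),
                                      nn.Linear(3 * 32 * 32, 128), nn.ReLU())
        self.to_light = nn.Linear(128, sh_dim)
        self.to_content = nn.Linear(128, content_dim)

    def forward(self, img):
        h = self.backbone(img)
        return self.to_light(h), self.to_content(h)

class Decoder(nn.Module):
    """Toy: reconstructs the image from both codes."""
    def __init__(self, sh_dim=9, content_dim=64):
        super().__init__()
        self.net = nn.Linear(sh_dim + content_dim, 3 * 32 * 32)

    def forward(self, light, content):
        return self.net(torch.cat([light, content], dim=-1)).view(-1, 3, 32, 32)

def siamese_step(encoder, decoder, img):
    img_aug = torch.flip(img, dims=[-1])  # stand-in lighting-only augmentation
    light_a, content_a = encoder(img)
    light_b, content_b = encoder(img_aug)
    loss_content = F.mse_loss(content_a, content_b)          # content invariance
    loss_rec = F.mse_loss(decoder(light_a, content_a), img) \
             + F.mse_loss(decoder(light_b, content_b), img_aug)
    # the paper's spherical harmonic loss on light_a / light_b is omitted
    return loss_rec + loss_content

loss = siamese_step(Encoder(), Decoder(), torch.rand(4, 3, 32, 32))
```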

8. Towards Neural Programming Interfaces

Zachary C. Brown, Nathaniel Robinson, David Wingate, Nancy Fulda

  • retweets: 36, favorites: 35 (12/15/2020 10:16:53)
  • links: abs | pdf
  • cs.CL | cs.AI

It is notoriously difficult to control the behavior of artificial neural networks such as generative neural language models. We recast the problem of controlling natural language generation as that of learning to interface with a pretrained language model, just as Application Programming Interfaces (APIs) control the behavior of programs by altering hyperparameters. In this new paradigm, a specialized neural network (called a Neural Programming Interface or NPI) learns to interface with a pretrained language model by manipulating the hidden activations of the pretrained model to produce desired outputs. Importantly, no permanent changes are made to the weights of the original model, allowing us to re-purpose pretrained models for new tasks without overwriting any aspect of the language model. We also contribute a new data set construction algorithm and GAN-inspired loss function that allows us to train NPI models to control outputs of autoregressive transformers. In experiments against other state-of-the-art approaches, we demonstrate the efficacy of our methods using OpenAI’s GPT-2 model, successfully controlling noun selection, topic aversion, offensive speech filtering, and other aspects of language while largely maintaining the controlled model’s fluency under deterministic settings.
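
The underlying mechanism, manipulating the hidden activations of a frozen model, can be sketched with a PyTorch forward hook on Hugging Face's GPT-2. In the paper the offset would come from a trained NPI network; here it is a fixed random placeholder:

```python
# Activation-steering sketch: perturb one transformer block's hidden states
# without touching any weights. The offset is a placeholder, not a trained NPI.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def make_hook(offset):
    def hook(module, inputs, output):
        hidden = output[0] + offset           # shift hidden states in place
        return (hidden,) + output[1:]
    return hook

layer = model.transformer.h[6]                # intervene at one block
offset = 0.1 * torch.randn(model.config.n_embd)  # placeholder for NPI output
handle = layer.register_forward_hook(make_hook(offset))

ids = tok("The weather today is", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20, do_sample=False)
print(tok.decode(out[0]))
handle.remove()                               # restore the unmodified model
```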

9. Mapping the Space of Chemical Reactions Using Attention-Based Neural Networks

Philippe Schwaller, Daniel Probst, Alain C. Vaucher, Vishnu H. Nair, David Kreutter, Teodoro Laino, Jean-Louis Reymond

Organic reactions are usually assigned to classes containing reactions with similar reagents and mechanisms. Reaction classes facilitate the communication of complex concepts and efficient navigation through chemical reaction space. However, the classification process is a tedious task. It requires the identification of the corresponding reaction class template via annotation of the number of molecules in the reactions, the reaction center, and the distinction between reactants and reagents. This work shows that transformer-based models can infer reaction classes from non-annotated, simple text-based representations of chemical reactions. Our best model reaches a classification accuracy of 98.2%. We also show that the learned representations can be used as reaction fingerprints that capture fine-grained differences between reaction classes better than traditional reaction fingerprints. The insights into chemical reaction space enabled by our learned fingerprints are illustrated by an interactive reaction atlas providing visual clustering and similarity searching.
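
The reaction-fingerprint idea can be sketched by pooling a transformer encoder's [CLS] embedding over a reaction SMILES string; `bert-base-uncased` below is only a placeholder checkpoint, not the paper's chemistry-pretrained model or tokenizer:

```python
# Reaction-fingerprint sketch: use a transformer's pooled embedding to compare
# reactions. Checkpoint is a generic placeholder; the paper trains its own.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def reaction_fingerprint(rxn_smiles: str) -> torch.Tensor:
    batch = tok(rxn_smiles, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    return out.last_hidden_state[:, 0]        # [CLS] embedding as fingerprint

fp1 = reaction_fingerprint("CCO.CC(=O)O>>CCOC(C)=O")   # esterification
fp2 = reaction_fingerprint("CCN.CC(=O)Cl>>CCNC(C)=O")  # amide coupling
print(torch.cosine_similarity(fp1, fp2).item())
```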

10. The Three Ghosts of Medical AI: Can the Black-Box Present Deliver?

Thomas P. Quinn, Stephan Jacobs, Manisha Senadeera, Vuong Le, Simon Coghlan

  • retweets: 42, favorites: 14 (12/15/2020 10:16:53)
  • links: abs | pdf
  • cs.AI

Our title alludes to the three Christmas ghosts encountered by Ebenezer Scrooge in *A Christmas Carol*, who guide Ebenezer through the past, present, and future of Christmas holiday events. Similarly, our article will take readers through a journey of the past, present, and future of medical AI. In doing so, we focus on the crux of modern machine learning: the reliance on powerful but intrinsically opaque models. When applied to the healthcare domain, these models fail to meet the needs for transparency that their clinician and patient end-users require. We review the implications of this failure, and argue that opaque models (1) lack quality assurance, (2) fail to elicit trust, and (3) restrict physician-patient dialogue. We then discuss how upholding transparency in all aspects of model design and model validation can help ensure the reliability of medical AI.