Hot Papers 2020-12-23

1. YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS)

Haotian Liu, Rafael A. Rivera Soto, Fanyi Xiao, Yong Jae Lee

We propose YolactEdge, the first competitive instance segmentation approach that runs on small edge devices at real-time speeds. Specifically, YolactEdge runs at up to 30.8 FPS on a Jetson AGX Xavier (and 172.7 FPS on an RTX 2080 Ti) with a ResNet-101 backbone on 550x550 resolution images. To achieve this, we make two improvements to the state-of-the-art image-based real-time method YOLACT: (1) TensorRT optimization while carefully trading off speed and accuracy, and (2) a novel feature warping module to exploit temporal redundancy in videos. Experiments on the YouTube VIS and MS COCO datasets demonstrate that YolactEdge produces a 3-5x speed-up over existing real-time methods while achieving competitive mask and box detection accuracy. We also conduct ablation studies to dissect our design choices and modules. Code and models are available at https://github.com/haotian-liu/yolact_edge.
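
The abstract only names the feature warping module; as a rough illustration of the idea, here is a hedged PyTorch sketch of warping keyframe backbone features to the current frame with an estimated flow field. The function name and shapes are assumptions, and YolactEdge's actual module (including its flow estimator and partial feature transforms) differs in detail.

```python
import torch
import torch.nn.functional as F

def warp_features(key_feats: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp keyframe features (N, C, H, W) to the current frame using an
    estimated flow field (N, 2, H, W) given in pixel displacements.
    This lets non-key frames reuse expensive backbone computation."""
    n, _, h, w = key_feats.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=torch.float32, device=key_feats.device),
        torch.arange(w, dtype=torch.float32, device=key_feats.device),
        indexing="ij",
    )
    base = torch.stack((xs, ys), dim=0).unsqueeze(0)  # (1, 2, H, W) pixel grid
    coords = base + flow                              # displaced sample points
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    grid = torch.stack(
        (2.0 * coords[:, 0] / (w - 1) - 1.0,
         2.0 * coords[:, 1] / (h - 1) - 1.0),
        dim=-1,
    )                                                 # (N, H, W, 2)
    return F.grid_sample(key_feats, grid, align_corners=True)
```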

2. RealFormer: Transformer Likes Residual Attention

Ruining He, Anirudh Ravula, Bhargav Kanagal, Joshua Ainslie

  • retweets: 780, favorites: 195 (12/24/2020 09:49:07)
  • links: abs | pdf
  • cs.LG

Transformer is the backbone of modern NLP models. In this paper, we propose RealFormer, a simple Residual Attention Layer Transformer architecture that significantly outperforms canonical Transformers on a spectrum of tasks including Masked Language Modeling, GLUE, and SQuAD. Qualitatively, RealFormer is easy to implement and requires minimal hyper-parameter tuning. It also stabilizes training and leads to models with sparser attentions. Code will be open-sourced upon paper acceptance.
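
The "residual attention" idea is concrete enough to sketch: each layer adds the previous layer's raw attention scores to its own logits before the softmax. A minimal single-head PyTorch sketch follows (shapes and naming are mine, not the paper's code):

```python
import math
import torch

def residual_attention(q, k, v, prev_scores=None):
    """Single-head attention with a residual edge over attention logits.
    q, k, v: (batch, seq, d); prev_scores: (batch, seq, seq) or None."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if prev_scores is not None:
        scores = scores + prev_scores  # RealFormer: reuse last layer's logits
    out = torch.softmax(scores, dim=-1) @ v
    return out, scores  # feed `scores` to the next layer as prev_scores
```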

3. Time-Travel Rephotography

Xuan Luo, Xuaner Zhang, Paul Yoo, Ricardo Martin-Brualla, Jason Lawrence, Steven M. Seitz

  • retweets: 625, favorites: 158 (12/24/2020 09:49:07)
  • links: abs | pdf
  • cs.CV

Many historical people are captured only in old, faded, black-and-white photos that have been distorted by the limitations of early cameras and the passage of time. This paper simulates traveling back in time with a modern camera to rephotograph famous subjects. Unlike conventional image restoration filters which apply independent operations like denoising, colorization, and super-resolution, we leverage the StyleGAN2 framework to project old photos into the space of modern high-resolution photos, achieving all of these effects in a unified framework. A unique challenge with this approach is capturing the identity and pose of the photo’s subject and not the many artifacts in low-quality antique photos. Our comparisons to current state-of-the-art restoration filters show significant improvements and compelling results for a variety of important historical people.
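
At its core, the method is a guided GAN inversion. The hedged sketch below shows one way such a projection loop could look, assuming a pretrained StyleGAN2 `generator`, a differentiable antique-degradation model `degrade`, and a `perceptual_loss` such as LPIPS; all three names are placeholders, and the paper's full pipeline adds further components.

```python
import torch

def rephotograph(old_photo, generator, degrade, perceptual_loss,
                 steps=500, lr=0.05):
    """Optimize a StyleGAN2 latent so that a simulated old-camera
    degradation of the generated modern portrait matches the input."""
    w = generator.mean_latent().detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        modern = generator(w)        # high-resolution color portrait
        antique = degrade(modern)    # grayscale / blur / film response
        loss = perceptual_loss(antique, old_photo)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return generator(w)              # the "rephotographed" result
```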

4. Knowledge Graphs Evolution and Preservation — A Technical Report from ISWS 2019

Nacira Abbas, Kholoud Alghamdi, Mortaza Alinam, Francesca Alloatti, Glenda Amaral, Claudia d’Amato, Luigi Asprino, Martin Beno, Felix Bensmann, Russa Biswas, Ling Cai, Riley Capshaw, Valentina Anita Carriero, Irene Celino, Amine Dadoun, Stefano De Giorgis, Harm Delva, John Domingue, Michel Dumontier, Vincent Emonet, Marieke van Erp, Paola Espinoza Arias, Omaima Fallatah, Sebastián Ferrada, Marc Gallofré Ocaña, Michalis Georgiou, Genet Asefa Gesese, Frances Gillis-Webber, Francesca Giovannetti, Marìa Granados Buey, Ismail Harrando, Ivan Heibi, Vitor Horta, Laurine Huber, Federico Igne, Mohamad Yaser Jaradeh, Neha Keshan, Aneta Koleva, Bilal Koteich, Kabul Kurniawan, Mengya Liu, Chuangtao Ma, Lientje Maas, Martin Mansfield, Fabio Mariani, Eleonora Marzi, Sepideh Mesbah

  • retweets: 364, favorites: 69 (12/24/2020 09:49:08)
  • links: abs | pdf
  • cs.AI

One of the grand challenges discussed during the Dagstuhl Seminar “Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web” and described in its report is that of a “Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. […] This grand challenge extends this further by asking if we can create a knowledge graph of “everything” ranging from common sense concepts to location based entities. This knowledge graph should be “open to the public” in a FAIR manner democratizing this mass amount of knowledge.” Although linked open data (LOD) is one knowledge graph, it is the closest realisation (and probably the only one) of a public FAIR Knowledge Graph (KG) of everything. Certainly, LOD provides a unique testbed for experimenting with and evaluating research hypotheses on open and FAIR KGs. One of the most neglected FAIR issues concerning KGs is their ongoing evolution and long-term preservation. We want to investigate this problem: that is, to understand what it means to preserve and support the evolution of KGs, and how these problems can be addressed. Clearly, the problem can be approached from different perspectives and may require the development of different approaches, including new theories, ontologies, metrics, strategies, procedures, etc. This document reports a collaborative effort performed by 9 teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019). Each team provides a different perspective on the problem of knowledge graph evolution, substantiated by a set of research questions as the main subject of their investigation. In addition, they provide their working definition for KG preservation and evolution.

5. Pre-Training a Language Model Without Human Language

Cheng-Han Chiang, Hung-yi Lee

  • retweets: 166, favorites: 165 (12/24/2020 09:49:08)
  • links: abs | pdf
  • cs.CL

In this paper, we study how the intrinsic nature of pre-training data contributes to fine-tuned downstream performance. To this end, we pre-train different transformer-based masked language models on several corpora with certain features, and we fine-tune those language models on GLUE benchmarks. We find that models pre-trained on unstructured data beat those trained directly from scratch on downstream tasks. Our results also show that pre-training on structured data does not always make the model acquire abilities that transfer to natural language downstream tasks. To our great astonishment, we uncover that pre-training on certain non-human language data gives GLUE performance close to that of models pre-trained on another non-English language.
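
The experimental recipe is standard enough to sketch: pre-train a masked language model from scratch on an arbitrary corpus, then fine-tune on GLUE. Below is a hedged Hugging Face sketch with a toy stand-in corpus; the model size, tokenizer, and data are illustrative, not the authors' setup.

```python
from datasets import Dataset
from transformers import (DataCollatorForLanguageModeling, RobertaConfig,
                          RobertaForMaskedLM, RobertaTokenizerFast,
                          Trainer, TrainingArguments)

# Toy stand-in for a "non-human language" corpus (e.g. protein strings).
corpus = Dataset.from_dict({"text": ["M K T A Y I A K Q R Q I S F V K"] * 512})

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=64),
    batched=True, remove_columns=["text"],
)

# A small, randomly initialized masked LM, pre-trained from scratch.
model = RobertaForMaskedLM(RobertaConfig(
    vocab_size=tokenizer.vocab_size, num_hidden_layers=4,
    hidden_size=256, num_attention_heads=4, intermediate_size=1024))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm-scratch", num_train_epochs=1),
    data_collator=DataCollatorForLanguageModeling(tokenizer,
                                                  mlm_probability=0.15),
    train_dataset=tokenized,
)
trainer.train()  # afterwards, fine-tune the encoder on GLUE and compare
```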

6. Recognizing Emotion Cause in Conversations

Soujanya Poria, Navonil Majumder, Devamanyu Hazarika, Deepanway Ghosal, Rishabh Bhardwaj, Samson Yu Bai Jian, Romila Ghosh, Niyati Chhaya, Alexander Gelbukh, Rada Mihalcea

  • retweets: 99, favorites: 53 (12/24/2020 09:49:08)
  • links: abs | pdf
  • cs.CL

Recognizing the cause behind emotions in text is a fundamental yet under-explored area of research in NLP. Advances in this area hold the potential to improve interpretability and performance in affect-based models. Identifying emotion causes at the utterance level in conversations is particularly challenging due to the intermingling dynamic among the interlocutors. To this end, we introduce the task of recognizing emotion cause in conversations with an accompanying dataset named RECCON. Furthermore, we define different cause types based on the source of the causes and establish strong transformer-based baselines to address two different sub-tasks of RECCON: 1) Causal Span Extraction and 2) Causal Emotion Entailment. The dataset is available at https://github.com/declare-lab/RECCON.
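
One natural way to run the Causal Span Extraction sub-task is as extractive QA over the conversation. The snippet below is only a framing sketch using an off-the-shelf SQuAD model; the exact input format and the authors' transformer baselines are in the paper, not here.

```python
from transformers import pipeline

# Off-the-shelf extractive QA model as a stand-in for a RECCON baseline.
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

context = ("A: I finally got the job offer today! "
           "B: That's wonderful, congratulations!")
question = ("What caused the happiness expressed in the utterance "
            "'That's wonderful, congratulations!'?")

print(qa(question=question, context=context))  # -> causal span + score
```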

7. Few-Shot Text Generation with Pattern-Exploiting Training

Timo Schick, Hinrich Schütze

  • retweets: 62, favorites: 59 (12/24/2020 09:49:08)
  • links: abs | pdf
  • cs.CL | cs.LG

Providing pretrained language models with simple task descriptions or prompts in natural language yields impressive few-shot results for a wide range of text classification tasks when combined with gradient-based learning from examples. In this paper, we show that the underlying idea can also be applied to text generation tasks: We adapt Pattern-Exploiting Training (PET), a recently proposed few-shot approach, for finetuning generative language models on text generation tasks. On several text summarization and headline generation datasets, our proposed variant of PET gives consistent improvements over a strong baseline in few-shot settings.
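
The core move, applying a PET-style pattern to a generation task, can be sketched in a few lines: wrap each source text in a short natural-language task description before fine-tuning or decoding. The pattern wording and checkpoint below are illustrative, not the paper's exact setup (which builds on PEGASUS).

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/pegasus-xsum")
model = AutoModelForSeq2SeqLM.from_pretrained("google/pegasus-xsum")

def apply_pattern(document: str) -> str:
    # A PET-style pattern: a brief task description wrapped around the input.
    return f"Summarize the following article: {document}"

inputs = tok(apply_pattern("Scientists have observed ..."),
             return_tensors="pt", truncation=True)
ids = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(ids[0], skip_special_tokens=True))
```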

8. Latent Feature Representation via Unsupervised Learning for Pattern Discovery in Massive Electron Microscopy Image Volumes

Gary B Huang, Huei-Fang Yang, Shin-ya Takemura, Pat Rivlin, Stephen M Plaza

We propose a method to facilitate the exploration and analysis of new large data sets. In particular, we give an unsupervised deep learning approach to learning a latent representation that captures semantic similarity in the data set. The core idea is to use data augmentations that preserve semantic meaning to generate synthetic examples of elements whose feature representations should be close to one another. We demonstrate the utility of our method applied to nano-scale electron microscopy data, where even relatively small portions of animal brains can require terabytes of image data. Although supervised methods can be used to predict and identify known patterns of interest, the scale of the data makes it difficult to mine and analyze patterns that are not known a priori. We show the ability of our learned representation to enable query by example, so that if a scientist notices an interesting pattern in the data, they can be presented with other locations exhibiting matching patterns. We also demonstrate that clustering of data in the learned space correlates with biologically meaningful distinctions. Finally, we introduce a visualization tool and software ecosystem to facilitate user-friendly interactive analysis and to uncover interesting biological patterns. In short, our work opens new avenues for understanding and discovery in large data sets arising in domains such as EM analysis.
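
The abstract does not name the training objective, but "augmented views should embed nearby" is commonly realized with a contrastive loss. As one hedged instantiation, here is the normalized-temperature cross-entropy (SimCLR-style) loss over two augmentations of the same image patches:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.1):
    """SimCLR-style contrastive loss. z1, z2: (N, D) embeddings of two
    semantics-preserving augmentations of the same N EM patches."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D)
    sim = z @ z.t() / temperature                        # cosine logits
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))           # drop self-pairs
    # Positive for row i is its augmented twin: i+n (or i-n).
    targets = torch.cat([torch.arange(n, 2 * n),
                         torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```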

9. Unadversarial Examples: Designing Objects for Robust Vision

Hadi Salman, Andrew Ilyas, Logan Engstrom, Sai Vemprala, Aleksander Madry, Ashish Kapoor

  • retweets: 42, favorites: 25 (12/24/2020 09:49:09)
  • links: abs | pdf
  • cs.CV | cs.LG

We study a class of realistic computer vision settings wherein one can influence the design of the objects being recognized. We develop a framework that leverages this capability to significantly improve vision models’ performance and robustness. This framework exploits the sensitivity of modern machine learning algorithms to input perturbations in order to design “robust objects,” i.e., objects that are explicitly optimized to be confidently detected or classified. We demonstrate the efficacy of the framework on a wide variety of vision-based tasks ranging from standard benchmarks, to (in-simulation) robotics, to real-world experiments. Our code can be found at https://git.io/unadversarial.
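
Conceptually, an unadversarial patch is an adversarial attack run in reverse: gradient descent makes the true class *more* confident. A hedged sketch (the paste-in-the-corner placement is a simplification; the paper also optimizes full 3D object textures in simulation):

```python
import torch
import torch.nn.functional as F

def make_unadversarial_patch(model, images, labels, patch,
                             steps=100, lr=0.01):
    """Optimize a patch of shape (C, h, w) so the (frozen) classifier
    becomes more confident in the correct labels when it is applied."""
    patch = patch.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        patched = images.clone()
        patched[:, :, :patch.size(1), :patch.size(2)] = patch  # paste patch
        loss = F.cross_entropy(model(patched), labels)  # *minimize* true-class loss
        opt.zero_grad()
        loss.backward()
        opt.step()
        patch.data.clamp_(0.0, 1.0)  # keep the patch a valid image
    return patch.detach()
```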

10. Self-Imitation Advantage Learning

Johan Ferret, Olivier Pietquin, Matthieu Geist

  • retweets: 36, favorites: 27 (12/24/2020 09:49:09)
  • links: abs | pdf
  • cs.LG

Self-imitation learning is a Reinforcement Learning (RL) method that encourages actions whose returns were higher than expected, which helps in hard exploration and sparse reward problems. It was shown to improve the performance of on-policy actor-critic methods in several discrete control tasks. Nevertheless, applying self-imitation to the mostly action-value based off-policy RL methods is not straightforward. We propose SAIL, a novel generalization of self-imitation learning for off-policy RL, based on a modification of the Bellman optimality operator that we connect to Advantage Learning. Crucially, our method mitigates the problem of stale returns by choosing the most optimistic return estimate between the observed return and the current action-value for self-imitation. We demonstrate the empirical effectiveness of SAIL on the Arcade Learning Environment, with a focus on hard exploration games.
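
Reading the abstract against the standard Advantage Learning operator suggests a target of the following shape; this is my hedged reconstruction, not the paper's exact equation (coefficients and placement may differ):

```python
import torch

def sail_target(r, gamma, q_sa, q_s_max, q_next_max, mc_return, alpha=0.9):
    """Sketch of a SAIL-style target. Advantage Learning shifts the
    Bellman target by alpha * (Q(s,a) - max_a Q(s,a)); SAIL replaces
    Q(s,a) there with the more optimistic max(Q(s,a), G), where G is the
    observed return, so good past episodes are imitated while stale
    returns are ignored once Q overtakes them."""
    bellman = r + gamma * q_next_max                 # optimality target
    optimistic = torch.maximum(q_sa, mc_return)      # self-imitation term
    return bellman + alpha * (optimistic - q_s_max)  # AL-style action gap
```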