1. YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS)
Haotian Liu, Rafael A. Rivera Soto, Fanyi Xiao, Yong Jae Lee
We propose YolactEdge, the first competitive instance segmentation approach that runs on small edge devices at real-time speeds. Specifically, YolactEdge runs at up to 30.8 FPS on a Jetson AGX Xavier (and 172.7 FPS on an RTX 2080 Ti) with a ResNet-101 backbone on 550x550 resolution images. To achieve this, we make two improvements to the state-of-the-art image-based real-time method YOLACT: (1) TensorRT optimization while carefully trading off speed and accuracy, and (2) a novel feature warping module to exploit temporal redundancy in videos. Experiments on the YouTube VIS and MS COCO datasets demonstrate that YolactEdge yields a 3-5x speedup over existing real-time methods while producing competitive mask and box detection accuracy. We also conduct ablation studies to dissect our design choices and modules. Code and models are available at https://github.com/haotian-liu/yolact_edge.
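The second improvement, reusing backbone features across frames, amounts to warping cached key-frame features to the current frame with an estimated flow field rather than recomputing them. Below is a minimal PyTorch sketch of that warping step; the `backbone` and `flow_net` modules in the usage comment are placeholders, and this illustrates the idea only, not the authors' implementation.

```python
# Minimal sketch of flow-based feature warping, the idea behind YolactEdge's
# temporal module: on non-key frames, backbone features are approximated by
# warping cached key-frame features with an estimated flow field instead of
# recomputing them. Shapes and module names are illustrative only.
import torch
import torch.nn.functional as F

def warp_features(key_feat: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp key-frame features to the current frame.

    key_feat: (N, C, H, W) features computed on the last key frame.
    flow:     (N, 2, H, W) per-pixel displacement (dx, dy) pointing from the
              current frame back to the key frame, in pixel units.
    """
    n, _, h, w = key_feat.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=key_feat.dtype, device=key_feat.device),
        torch.arange(w, dtype=key_feat.dtype, device=key_feat.device),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + flow[:, 0]  # (N, H, W)
    grid_y = ys.unsqueeze(0) + flow[:, 1]
    # Normalize to [-1, 1] as required by grid_sample.
    grid_x = 2.0 * grid_x / max(w - 1, 1) - 1.0
    grid_y = 2.0 * grid_y / max(h - 1, 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)  # (N, H, W, 2)
    return F.grid_sample(key_feat, grid, mode="bilinear", align_corners=True)

# Usage sketch: recompute features only on key frames, warp otherwise.
# if is_key_frame:
#     cached_feat = backbone(frame)          # full (expensive) pass
# else:
#     flow = flow_net(prev_frame, frame)     # cheap flow estimate
#     feat = warp_features(cached_feat, flow)
```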
YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS)
— AK (@ak92501) December 23, 2020
pdf: https://t.co/QOLDDnd1hl
abs: https://t.co/elA7UiMks0
github: https://t.co/kp8DIc8t2G pic.twitter.com/TqQX0iU8re
2. Informer: Transformer Likes Informed Attention
Ruining He, Anirudh Ravula, Bhargav Kanagal, Joshua Ainslie
Transformer is the backbone of modern NLP models. In this paper, we propose Informer, a simple architecture that significantly outperforms canonical Transformers on a spectrum of tasks including Masked Language Modeling, GLUE, and SQuAD. Qualitatively, Informer is easy to implement and requires minimal hyper-parameter tuning. It also stabilizes training and leads to models with sparser attentions. Code will be open-sourced upon paper acceptance.
Informer: Transformer Likes Informed Attention
— AK (@ak92501) December 23, 2020
pdf: https://t.co/oyo8Gaz3Tw
abs: https://t.co/A3HPsPkaiC pic.twitter.com/BggzpSYwEQ
3. Time-Travel Rephotography
Xuan Luo, Xuaner Zhang, Paul Yoo, Ricardo Martin-Brualla, Jason Lawrence, Steven M. Seitz
Many historical people are captured only in old, faded, black-and-white photos that have been distorted by the limitations of early cameras and the passage of time. This paper simulates traveling back in time with a modern camera to rephotograph famous subjects. Unlike conventional image restoration filters which apply independent operations like denoising, colorization, and super-resolution, we leverage the StyleGAN2 framework to project old photos into the space of modern high-resolution photos, achieving all of these effects in a unified framework. A unique challenge with this approach is capturing the identity and pose of the photo’s subject and not the many artifacts in low-quality antique photos. Our comparisons to current state-of-the-art restoration filters show significant improvements and compelling results for a variety of important historical people.
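The core mechanism, projecting an old photo into the latent space of a generator trained on modern photos, is the standard StyleGAN2 inversion loop sketched below. The generator interface (`G.mapping` / `G.synthesis`) and the `lpips_loss` perceptual loss are assumed to be available; the authors' full pipeline additionally models the degradations of antique photos, which this sketch omits.

```python
# Minimal sketch of projecting a photo into a StyleGAN2 latent space by
# optimizing an extended latent code to reconstruct the input. This is the
# generic inversion idea the paper builds on, not the authors' full pipeline;
# `G` (a pretrained generator exposing .mapping/.synthesis) and `lpips_loss`
# are assumed to be provided.
import torch

def project(G, target, lpips_loss, num_steps=500, lr=0.05, device="cuda"):
    """target: (1, 3, H, W) image in [-1, 1] at the generator's resolution."""
    G = G.to(device).eval()
    target = target.to(device)

    # Initialize at the average latent and optimize one code per layer (W+).
    with torch.no_grad():
        z = torch.randn(10_000, G.z_dim, device=device)
        w_avg = G.mapping(z, None).mean(dim=0, keepdim=True)  # (1, L, w_dim)
    w = w_avg.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)

    for _ in range(num_steps):
        synth = G.synthesis(w)                          # (1, 3, H, W)
        loss = lpips_loss(synth, target).mean()         # perceptual reconstruction
        loss = loss + 1e-3 * ((w - w_avg) ** 2).mean()  # keep w near the prior
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```

Because the optimized code lives in the space of modern high-resolution photos, colorization, denoising, and super-resolution fall out of the projection rather than being applied as separate filters.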
Time-Travel Rephotography
— AK (@ak92501) December 23, 2020
pdf: https://t.co/rLSJuYcewp
abs: https://t.co/uC71rwdwRQ
project page: https://t.co/tKDXPFOsmn pic.twitter.com/TXaMOineZL
4. Knowledge Graphs Evolution and Preservation — A Technical Report from ISWS 2019
Nacira Abbas, Kholoud Alghamdi, Mortaza Alinam, Francesca Alloatti, Glenda Amaral, Claudia d’Amato, Luigi Asprino, Martin Beno, Felix Bensmann, Russa Biswas, Ling Cai, Riley Capshaw, Valentina Anita Carriero, Irene Celino, Amine Dadoun, Stefano De Giorgis, Harm Delva, John Domingue, Michel Dumontier, Vincent Emonet, Marieke van Erp, Paola Espinoza Arias, Omaima Fallatah, Sebastián Ferrada, Marc Gallofré Ocaña, Michalis Georgiou, Genet Asefa Gesese, Frances Gillis-Webber, Francesca Giovannetti, Marìa Granados Buey, Ismail Harrando, Ivan Heibi, Vitor Horta, Laurine Huber, Federico Igne, Mohamad Yaser Jaradeh, Neha Keshan, Aneta Koleva, Bilal Koteich, Kabul Kurniawan, Mengya Liu, Chuangtao Ma, Lientje Maas, Martin Mansfield, Fabio Mariani, Eleonora Marzi, Sepideh Mesbah
One of the grand challenges discussed during the Dagstuhl Seminar “Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web” and described in its report is that of a “Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. […] This grand challenge extends this further by asking if we can create a knowledge graph of “everything” ranging from common sense concepts to location-based entities. This knowledge graph should be “open to the public” in a FAIR manner democratizing this mass amount of knowledge.” Although linked open data (LOD) is just one knowledge graph, it is the closest realisation (and probably the only one) of a public FAIR Knowledge Graph (KG) of everything. Certainly, LOD provides a unique testbed for experimenting and evaluating research hypotheses on open and FAIR KG. One of the most neglected FAIR issues about KGs is their ongoing evolution and long-term preservation. We want to investigate this problem, that is, to understand what preserving and supporting the evolution of KGs means and how these problems can be addressed. Clearly, the problem can be approached from different perspectives and may require the development of different approaches, including new theories, ontologies, metrics, strategies, procedures, etc. This document reports a collaborative effort performed by 9 teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019). Each team provides a different perspective on the problem of knowledge graph evolution substantiated by a set of research questions as the main subject of their investigation. In addition, they provide their working definition for KG preservation and evolution.
Finally, we are happy to announce the publication of our @isws_semweb ISWS2019 report on #KnowledgeGraphs Evolution and Preservation with contributions of all the #ISWS2019 students. Happy holidays & Thanks to everybody for making this happen! https://t.co/D2PsyFIUAG pic.twitter.com/8xRaOi0vYQ
— Harald Sack (@lysander07) December 23, 2020
5. Pre-Training a Language Model Without Human Language
Cheng-Han Chiang, Hung-yi Lee
In this paper, we study how the intrinsic nature of pre-training data contributes to the fine-tuned downstream performance. To this end, we pre-train different transformer-based masked language models on several corpora with certain features, and we fine-tune those language models on GLUE benchmarks. We find that models pre-trained on unstructured data beat those trained directly from scratch on downstream tasks. Our results also show that pre-training on structured data does not always make the model acquire abilities that transfer to natural language downstream tasks. To our great astonishment, we uncover that pre-training on certain non-human language data gives GLUE performance close to that of models pre-trained on another non-English language.
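The experimental recipe is straightforward to reproduce in outline: pre-train a masked language model from scratch on whatever corpus you like (amino-acid sequences, source code, shuffled text), then fine-tune it on GLUE. A hedged sketch with Hugging Face `transformers` follows; `corpus.txt`, the tokenizer choice, and all hyper-parameters are placeholders rather than the authors' settings.

```python
# Minimal sketch of the paper's recipe: pre-train a masked language model on an
# arbitrary (possibly non-human-language) corpus, then fine-tune it on GLUE.
# Uses Hugging Face `transformers`/`datasets`; names and hyper-parameters are
# illustrative, not the authors' exact configuration.
from datasets import load_dataset
from transformers import (RobertaConfig, RobertaForMaskedLM, RobertaTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

# Any line-by-line text file works here: amino-acid sequences, JavaScript, etc.
raw = load_dataset("text", data_files={"train": "corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

# Randomly initialized model: pre-training starts from scratch on the new corpus.
model = RobertaForMaskedLM(RobertaConfig(vocab_size=tokenizer.vocab_size))
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments("mlm-pretrain", per_device_train_batch_size=32,
                           num_train_epochs=1),
    train_dataset=train,
    data_collator=collator,
)
trainer.train()
model.save_pretrained("mlm-pretrain")
# GLUE fine-tuning then follows the standard sequence-classification recipe,
# e.g. RobertaForSequenceClassification.from_pretrained("mlm-pretrain").
```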
They pretrained a Roberta on amino acids and Javascript and fine-tuned on GLUE 😂https://t.co/puYwj6mN2s pic.twitter.com/bVset6FoWP
— Aran Komatsuzaki (@arankomatsuzaki) December 23, 2020
Pre-Training a Language Model Without Human Language
— AK (@ak92501) December 23, 2020
pdf: https://t.co/WuLKaYMN2D
abs: https://t.co/nN29pJS23b pic.twitter.com/YbnDuyhWhr
6. Recognizing Emotion Cause in Conversations
Soujanya Poria, Navonil Majumder, Devamanyu Hazarika, Deepanway Ghosal, Rishabh Bhardwaj, Samson Yu Bai Jian, Romila Ghosh, Niyati Chhaya, Alexander Gelbukh, Rada Mihalcea
Recognizing the cause behind emotions in text is a fundamental yet under-explored area of research in NLP. Advances in this area hold the potential to improve interpretability and performance in affect-based models. Identifying emotion causes at the utterance level in conversations is particularly challenging due to the intermingled dynamics among the interlocutors. To this end, we introduce the task of recognizing emotion cause in conversations with an accompanying dataset named RECCON. Furthermore, we define different cause types based on the source of the causes and establish strong transformer-based baselines to address two different sub-tasks of RECCON: 1) Causal Span Extraction and 2) Causal Emotion Entailment. The dataset is available at https://github.com/declare-lab/RECCON.
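The first sub-task, Causal Span Extraction, can be framed as SQuAD-style extractive QA: the emotional utterance plays the role of the question and the conversation history is the context from which a causal span is extracted. The sketch below shows that framing with a generic Hugging Face QA model; the exact input formatting and example dialogue are our assumptions, not necessarily RECCON's baseline setup.

```python
# Minimal sketch of Causal Span Extraction framed as extractive QA: given the
# conversational context and a target emotional utterance, predict the span
# that caused the emotion. The input formatting here is an assumption.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForQuestionAnswering.from_pretrained("roberta-base")

# "Question" = the emotional utterance (plus its emotion label),
# "context"  = the conversation history in which the cause span lies.
question = "The target utterance is: 'I finally got the job!' (emotion: joy)"
context = ("A: How did the interview go yesterday? "
           "B: They called me this morning. I finally got the job!")

inputs = tokenizer(question, context, return_tensors="pt", truncation=True)
with torch.no_grad():
    out = model(**inputs)
start = int(out.start_logits.argmax())
end = int(out.end_logits.argmax())
span = tokenizer.decode(inputs["input_ids"][0, start:end + 1])
print(span)  # predicted causal span (meaningless until the QA head is fine-tuned)
```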
Introducing RECCON -- a new dataset for the task of recognizing emotion cause in conversations. We also discuss different emotion cause types.
— Soujanya Poria (@soujanyaporia) December 23, 2020
Preprint: https://t.co/59HMXxfvtH
Download the dataset and codes for transformer-based baselines: https://t.co/hRG47tv5Xp#NLProc https://t.co/49S0KDUyA8
7. Few-Shot Text Generation with Pattern-Exploiting Training
Timo Schick, Hinrich Schütze
Providing pretrained language models with simple task descriptions or prompts in natural language yields impressive few-shot results for a wide range of text classification tasks when combined with gradient-based learning from examples. In this paper, we show that the underlying idea can also be applied to text generation tasks: We adapt Pattern-Exploiting Training (PET), a recently proposed few-shot approach, for finetuning generative language models on text generation tasks. On several text summarization and headline generation datasets, our proposed variant of PET gives consistent improvements over a strong baseline in few-shot settings.
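The core idea transfers directly to code: wrap each input in a short natural-language pattern (the task description) and fine-tune a generative model on the handful of labeled examples. The sketch below uses a generic seq2seq model, a made-up pattern, and a toy training loop; the paper's actual patterns, model, and training details differ.

```python
# Minimal sketch of pattern-based few-shot fine-tuning for generation: each
# document is wrapped in a short task description before being fed to a
# generative model, which is then fine-tuned on a few labeled examples.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def apply_pattern(document: str) -> str:
    # A PET-style pattern: a brief instruction telling the model what to generate.
    return f"Write a short summary of the following text: {document}"

few_shot = [  # a handful of (document, summary) pairs
    ("The city council approved the new bike-lane plan after a long debate.",
     "Council approves bike-lane plan."),
]

optim = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
for epoch in range(5):
    for doc, summary in few_shot:
        enc = tokenizer(apply_pattern(doc), return_tensors="pt", truncation=True)
        labels = tokenizer(summary, return_tensors="pt", truncation=True).input_ids
        loss = model(**enc, labels=labels).loss
        loss.backward()
        optim.step()
        optim.zero_grad()

# Inference: the same pattern is applied to unseen documents.
model.eval()
test_enc = tokenizer(apply_pattern("A new arXiv digest was published today."),
                     return_tensors="pt")
print(tokenizer.decode(model.generate(**test_enc, max_length=32)[0],
                       skip_special_tokens=True))
```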
Few-Shot Text Generation with Pattern-Exploiting Training
— AK (@ak92501) December 23, 2020
pdf: https://t.co/rXl3koKS2i
abs: https://t.co/p1g0TJEAjU pic.twitter.com/bchPhJ5y3w
🎄 New paper 🎄 Looking for something to do over the Xmas holidays? Check out our preprint (w/@HinrichSchuetze) on text generation with PET. We show that providing short task descriptions greatly improves text generation perf. in few-shot settings: https://t.co/ThXtWXsntF #NLProc pic.twitter.com/zuEoKM353K
— Timo Schick (@timo_schick) December 23, 2020
8. Latent Feature Representation via Unsupervised Learning for Pattern Discovery in Massive Electron Microscopy Image Volumes
Gary B Huang, Huei-Fang Yang, Shin-ya Takemura, Pat Rivlin, Stephen M Plaza
We propose a method to facilitate exploration and analysis of new large data sets. In particular, we give an unsupervised deep learning approach to learning a latent representation that captures semantic similarity in the data set. The core idea is to use data augmentations that preserve semantic meaning to generate synthetic examples of elements whose feature representations should be close to one another. We demonstrate the utility of our method applied to nano-scale electron microscopy data, where even relatively small portions of animal brains can require terabytes of image data. Although supervised methods can be used to predict and identify known patterns of interest, the scale of the data makes it difficult to mine and analyze patterns that are not known a priori. We show the ability of our learned representation to enable query by example, so that if a scientist notices an interesting pattern in the data, they can be presented with other locations with matching patterns. We also demonstrate that clustering of data in the learned space correlates with biologically meaningful distinctions. Finally, we introduce a visualization tool and software ecosystem to facilitate user-friendly interactive analysis and uncover interesting biological patterns. In short, our work opens up possible new avenues for understanding and discovery in large data sets arising in domains such as EM analysis.
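The training signal described here, semantics-preserving augmentations whose views should map to nearby representations, can be sketched as a contrastive objective over augmented EM patches. Treating it as a SimCLR-style in-batch loss is an assumption on our part; the abstract only commits to pulling augmented views together.

```python
# Minimal sketch: apply semantics-preserving augmentations to each patch and
# pull the two views together in feature space with an in-batch contrastive
# loss. The specific loss and encoder are assumptions, not the paper's setup.
import torch
import torch.nn.functional as F
from torchvision import models, transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(128, scale=(0.8, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
])

encoder = models.resnet18(num_classes=128)  # final fc doubles as a projection head

def contrastive_loss(z1, z2, temperature=0.1):
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature               # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)          # matched views are positives

def train_step(patches, optimizer):
    """patches: (B, 1, H, W) grayscale EM patches in [0, 1]."""
    v1 = torch.stack([augment(p) for p in patches]).repeat(1, 3, 1, 1)
    v2 = torch.stack([augment(p) for p in patches]).repeat(1, 3, 1, 1)
    loss = contrastive_loss(encoder(v1), encoder(v2))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Query-by-example then reduces to nearest-neighbour search over stored embeddings.
```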
9. Unadversarial Examples: Designing Objects for Robust Vision
Hadi Salman, Andrew Ilyas, Logan Engstrom, Sai Vemprala, Aleksander Madry, Ashish Kapoor
We study a class of realistic computer vision settings wherein one can influence the design of the objects being recognized. We develop a framework that leverages this capability to significantly improve vision models’ performance and robustness. This framework exploits the sensitivity of modern machine learning algorithms to input perturbations in order to design “robust objects,” i.e., objects that are explicitly optimized to be confidently detected or classified. We demonstrate the efficacy of the framework on a wide variety of vision-based tasks ranging from standard benchmarks, to (in-simulation) robotics, to real-world experiments. Our code can be found at https://git.io/unadversarial.
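The optimization is essentially an adversarial attack run in reverse: a patch or texture on the object is updated by gradient descent to minimize, rather than maximize, the classifier's loss on the true label. A minimal sketch follows, with a toy paste-in-the-corner renderer standing in for the paper's object/texture design setup.

```python
# Minimal sketch of the "unadversarial" idea: optimize a patch placed on the
# object so the true class is predicted *more* confidently. The paste-in-the-
# corner rendering and the frozen ImageNet classifier are stand-ins only.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
for p in model.parameters():
    p.requires_grad_(False)

patch = torch.zeros(1, 3, 64, 64, requires_grad=True)  # learnable "unadversarial" patch
opt = torch.optim.Adam([patch], lr=0.05)

def apply_patch(images, patch):
    # images: (B, 3, H, W) in [0, 1]; ImageNet normalization omitted for brevity.
    out = images.clone()
    out[:, :, :64, :64] = torch.sigmoid(patch)          # keep patch pixels in [0, 1]
    return out

def unadversarial_step(images, true_labels):
    """Gradient step that makes the true class *easier* to recognize."""
    logits = model(apply_patch(images, patch))
    loss = F.cross_entropy(logits, true_labels)          # minimize, not maximize
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# In practice the patch is trained over many augmented views (poses, lighting,
# backgrounds) of the object so the confidence boost transfers to the real world.
```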
Unadversarial Examples: Designing Objects for Robust Vision
— AK (@ak92501) December 23, 2020
pdf: https://t.co/AvurvaxQx0
abs: https://t.co/YSC770nF55
github: https://t.co/pCE7nU5zIZ pic.twitter.com/33RHb7kwyb
10. Self-Imitation Advantage Learning
Johan Ferret, Olivier Pietquin, Matthieu Geist
Self-imitation learning is a Reinforcement Learning (RL) method that encourages actions whose returns were higher than expected, which helps in hard exploration and sparse reward problems. It was shown to improve the performance of on-policy actor-critic methods in several discrete control tasks. Nevertheless, applying self-imitation to off-policy RL methods, which are mostly action-value based, is not straightforward. We propose SAIL, a novel generalization of self-imitation learning for off-policy RL, based on a modification of the Bellman optimality operator that we connect to Advantage Learning. Crucially, our method mitigates the problem of stale returns by choosing the most optimistic return estimate between the observed return and the current action-value for self-imitation. We demonstrate the empirical effectiveness of SAIL on the Arcade Learning Environment, with a focus on hard exploration games.
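From the abstract, the SAIL target is an Advantage-Learning-style correction in which the current action-value is replaced by the more optimistic of Q(s, a) and the observed return. The exact placement of the correction in the sketch below is our reading of that description, not a verbatim transcription of the paper's operator.

```python
# Minimal sketch of a SAIL-style target: a Bellman backup plus an
# Advantage-Learning gap, computed with max(Q(s, a), observed return) so that
# better-than-expected episodes are self-imitated. Coefficients are illustrative.
import torch

def sail_target(q_s, q_next, a, r, mc_return, done, gamma=0.99, alpha=0.9):
    """
    q_s:       (B, num_actions) online Q-values at s
    q_next:    (B, num_actions) target-network Q-values at s'
    a:         (B,) actions taken
    r:         (B,) immediate rewards
    mc_return: (B,) observed discounted return from s in the replayed episode
    done:      (B,) episode-termination flags (0.0 or 1.0)
    """
    q_sa = q_s.gather(1, a.unsqueeze(1)).squeeze(1)            # Q(s, a)
    v_s = q_s.max(dim=1).values                                 # max_a Q(s, a)
    bellman = r + gamma * (1.0 - done) * q_next.max(dim=1).values
    optimistic = torch.maximum(q_sa, mc_return)                 # self-imitation term
    return bellman + alpha * (optimistic - v_s)

# Training then regresses Q(s, a) toward sail_target(...).detach() as usual.
```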
Happy to announce that our paper, "Self-Imitation Advantage Learning", was accepted at AAMAS 2021!
— Johan Ferret (@johanferret) December 23, 2020
arxiv link: https://t.co/PoVD8fL3Ym
joint work w/ O. Pietquin & M. Geist
We propose SAIL, an extension of Self-Imitation Learning for off-policy RL ⛵️