1. Deep Single Image Manipulation
Yael Vinker, Eliahu Horwitz, Nir Zabari, Yedid Hoshen
Image manipulation has attracted much research over the years due to the popularity and commercial importance of the task. In recent years, deep neural network methods have been proposed for many image manipulation tasks. A major issue with deep methods is the need to train on large amounts of data from the same distribution as the target image, whereas collecting datasets encompassing the entire long-tail of images is impossible. In this paper, we demonstrate that simply training a conditional adversarial generator on the single target image is sufficient for performing complex image manipulations. We find that the key to enabling single image training is extensive augmentation of the input image, and we provide a novel augmentation method. Our network learns to map from a primitive representation of the image (e.g. edges) to the image itself. At manipulation time, our generator allows for making general image changes by modifying the primitive input representation and mapping it through the network. We extensively evaluate our method and find that it provides remarkable performance.
Deep Single Image Manipulation
— AK (@ak92501) July 3, 2020
pdf: https://t.co/v5hYKYjTw5
abs: https://t.co/5Lk9UIWzk4
project page: https://t.co/etX9ikl0h6 pic.twitter.com/xIQhdpjEje
2. ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network
Dongyoon Han, Sangdoo Yun, Byeongho Heo, YoungJoon Yoo
This paper addresses the representational bottleneck in a network and proposes a set of design principles that improve model performance significantly. We argue that a representational bottleneck may arise in a network built with a conventional design and degrade model performance. To investigate the representational bottleneck, we study the matrix rank of the features generated by ten thousand random networks. We further study the channel configuration of all layers with the aim of designing more accurate network architectures. Based on this investigation, we propose simple yet effective design principles to mitigate the representational bottleneck. Slight changes to baseline networks that follow these principles lead to remarkable performance improvements on ImageNet classification. Additionally, COCO object detection results and transfer learning results on several datasets further support the link between diminishing the representational bottleneck of a network and improving performance. Code and pretrained models are available at https://github.com/clovaai/rexnet.
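To make the matrix-rank idea concrete, here is a minimal NumPy sketch. It is not the authors' experimental setup: the layer sizes, the single random linear layer, and the helper names `feature_rank` and `random_layer_features` are all assumptions chosen for illustration. It shows only the basic fact that features produced by a purely linear expansion from a narrow input cannot have rank above the input width, and it lets you compare that with the rank measured after a ReLU.

```python
import numpy as np


def feature_rank(features: np.ndarray) -> int:
    """Numerical rank of a (num_samples, num_channels) feature matrix."""
    return int(np.linalg.matrix_rank(features))


def random_layer_features(inputs: np.ndarray, out_dim: int,
                          rng: np.random.Generator,
                          use_relu: bool) -> np.ndarray:
    """Features of one randomly initialised linear layer (hypothetical toy model)."""
    weights = rng.standard_normal((inputs.shape[1], out_dim))
    outputs = inputs @ weights
    # Optionally apply a ReLU nonlinearity after the linear map.
    return np.maximum(outputs, 0.0) if use_relu else outputs


rng = np.random.default_rng(0)
in_dim, out_dim = 8, 64                      # illustrative sizes, not from the paper
inputs = rng.standard_normal((256, in_dim))  # 256 samples with 8 channels each

linear_rank = feature_rank(random_layer_features(inputs, out_dim, rng, use_relu=False))
relu_rank = feature_rank(random_layer_features(inputs, out_dim, rng, use_relu=True))

# The linear features are a product with an 8-column matrix, so rank <= 8.
print("rank after linear expansion:", linear_rank)
print("rank after linear expansion + ReLU:", relu_rank)
```

Repeating this kind of measurement over many random configurations is one plausible way to read "the matrix rank of the features generated by ten thousand random networks", though the paper's actual protocol may differ.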
Happy to announce that #rexnet, a new lightweight image backbone of #ClovaAI, was released with the official pytorch code and some pretrained models. Please check it out!
— Jung-Woo Ha (@JungWooHa2) July 3, 2020
Paper: https://t.co/E5gUhtnjBo
Github: https://t.co/PcdFXOVeSl
3. Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset
Brian Roark, Lawrence Wolf-Sonkin, Christo Kirov, Sabrina J. Mielke, Cibu Johny, Isin Demirsahin, Keith Hall
This paper describes the Dakshina dataset, a new resource consisting of text in both the Latin and native scripts for 12 South Asian languages. The dataset includes, for each language: 1) native script Wikipedia text; 2) a romanization lexicon; and 3) full sentence parallel data in both a native script of the language and the basic Latin alphabet. We document the methods used for preparation and selection of the Wikipedia text in each language; collection of attested romanizations for sampled lexicons; and manual romanization of held-out sentences from the native script collections. We additionally provide baseline results on several tasks made possible by the dataset, including single word transliteration, full sentence transliteration, and language modeling of native script and romanized text. Keywords: romanization, transliteration, South Asian languages
Proud to have worked on this last summer @GoogleAI:
— Sabrina J. Mielke (@sjmielke) July 3, 2020
"Dakshina:" 2GB of transliterated sentences & transliteration lexica (preserving task-inherent ambiguity, giving you frequencies for variants!) in 12 South Asian languages!
➡️ https://t.co/iPdCwZtFt5
➡️ https://t.co/SgnyvfhcXX https://t.co/tJKxYrkwl8
4. Can We Achieve More with Less? Exploring Data Augmentation for Toxic Comment Classification
Chetanya Rastogi, Nikka Mofid, Fang-I Hsiao
This paper tackles one of the greatest limitations in Machine Learning: Data Scarcity. Specifically, we explore whether high accuracy classifiers can be built from small datasets, utilizing a combination of data augmentation techniques and machine learning algorithms. In this paper, we experiment with Easy Data Augmentation (EDA) and Backtranslation, as well as with three popular learning algorithms, Logistic Regression, Support Vector Machine (SVM), and Bidirectional Long Short-Term Memory Network (Bi-LSTM). For our experimentation, we utilize the Wikipedia Toxic Comments dataset so that in the process of exploring the benefits of data augmentation, we can develop a model to detect and classify toxic speech in comments to help fight back against cyberbullying and online harassment. Ultimately, we found that data augmentation techniques can be used to significantly boost the performance of classifiers and are an excellent strategy to combat lack of data in NLP problems.
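As a rough illustration of what EDA-style augmentation does to a training sentence, the sketch below implements two of its operations, random swap and random deletion, using only the Python standard library. EDA as originally proposed also includes synonym replacement and random insertion, which need a thesaurus and are left out here. The function names, the probabilities, and the alternating schedule in `augment` are assumptions for illustration, not the configuration used in the paper.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]


def augment(sentence, n_aug=4, p_delete=0.1, seed=0):
    """Return n_aug noisy variants of a sentence (hypothetical helper)."""
    rng = random.Random(seed)
    words = sentence.split()
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            variants.append(" ".join(random_swap(words, 1, rng)))
        else:
            variants.append(" ".join(random_deletion(words, p_delete, rng)))
    return variants


for variant in augment("this comment is perfectly polite and friendly"):
    print(variant)
```

Each variant keeps the original label, so a small labelled set can be expanded several times over before training a classifier such as Logistic Regression, an SVM, or a Bi-LSTM.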
A study examining the effect of data augmentation on document classification tasks. It experiments with combinations of algorithms (LR, SVM, Bi-LSTM) × augmentation methods.
— u++ (@upura0) July 3, 2020
Can We Achieve More with Less? Exploring Data Augmentation for Toxic Comment Classification https://t.co/VJMsUoD0Ac
5. Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
Gautier Izacard, Edouard Grave
Generative models for open domain question answering have proven to be competitive, without resorting to external knowledge. While promising, this approach requires using models with billions of parameters, which are expensive to train and query. In this paper, we investigate how much these models can benefit from retrieving text passages, potentially containing evidence. We obtain state-of-the-art results on the Natural Questions and TriviaQA open benchmarks. Interestingly, we observe that the performance of this method significantly improves when increasing the number of retrieved passages. This is evidence that generative models are good at aggregating and combining evidence from multiple passages.
New work w/ @gizacard (Gautier Izacard): how much do generative models for open domain QA benefit from retrieval? A lot! Retrieving 100 passages, we get 51.4 EM on NaturalQuestions, 67.6 EM on TriviaQA. 1/3
— Edouard Grave (@EXGRV) July 3, 2020
Paper: https://t.co/ftRwYSry1R pic.twitter.com/vyr7cBqoUs
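The EM figures quoted in the tweet are exact match scores: the percentage of questions for which the generated answer matches one of the reference answers after normalization. The sketch below shows one common way to compute it, using SQuAD-style normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace). Whether the paper uses exactly this normalization is an assumption, and the function names and sample strings are made up for illustration.

```python
import re
import string

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, references):
    """True if the prediction equals any reference after normalization."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def em_score(predictions, references_per_question):
    """Exact match as a percentage over a list of questions."""
    hits = sum(
        exact_match(pred, refs)
        for pred, refs in zip(predictions, references_per_question)
    )
    return 100.0 * hits / len(predictions)


# Made-up toy data: one correct answer out of two questions.
predictions = ["The Eiffel Tower!", "london"]
references = [["eiffel tower"], ["Berlin"]]
print(em_score(predictions, references))  # 50.0
```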