Hot Papers 2021-03-16

1. Revisiting ResNets: Improved Training and Scaling Strategies

Irwan Bello, William Fedus, Xianzhi Du, Ekin D. Cubuk, Aravind Srinivas, Tsung-Yi Lin, Jonathon Shlens, Barret Zoph

retweets: 4111, favorites: 51 (03/17/2021 09:09:48)
links: abs | pdf
cs.CV

Novel computer vision architectures monopolize the spotlight, but the impact of the model architecture is often conflated with simultaneous changes to training methodology and scaling strategies. Our work revisits the canonical ResNet (He et al., 2015) and studies these three aspects in an effort to disentangle them. Perhaps surprisingly, we find that training and scaling strategies may matter more than architectural changes, and further, that the resulting ResNets match recent state-of-the-art models. We show that the best performing scaling strategy depends on the training regime and offer two new scaling strategies: (1) scale model depth in regimes where overfitting can occur (width scaling is preferable otherwise); (2) increase image resolution more slowly than previously recommended (Tan & Le, 2019). Using improved training and scaling strategies, we design a family of ResNet architectures, ResNet-RS, which are 1.7x - 2.7x faster than EfficientNets on TPUs, while achieving similar accuracies on ImageNet. In a large-scale semi-supervised learning setup, ResNet-RS achieves 86.2% top-1 ImageNet accuracy, while being 4.7x faster than EfficientNet NoisyStudent. The training techniques improve transfer performance on a suite of downstream tasks (rivaling state-of-the-art self-supervised algorithms) and extend to video classification on Kinetics-400. We recommend practitioners use these simple revised ResNets as baselines for future research.

You don't need EfficientNets. Simple tricks make ResNets better and faster than EfficientNets

Revisiting ResNets: Improved Training and Scaling Strategies

🤙https://t.co/poXZtzH4Bh pic.twitter.com/YSqzKkCfRd
— Artsiom Sanakoyeu (@artsiom_s) March 16, 2021

2. Approximating How Single Head Attention Learns

Charlie Snell, Ruiqi Zhong, Dan Klein, Jacob Steinhardt

retweets: 1480, favorites: 173 (03/17/2021 09:09:48)
links: abs | pdf
cs.CL | cs.AI

Why do models often attend to salient words, and how does this evolve throughout training? We approximate model training as a two-stage process: early on in training when the attention weights are uniform, the model learns to translate individual input word i to o if they co-occur frequently. Later, the model learns to attend to i while the correct output is o because it knows i translates to o. To formalize, we define a model property, Knowledge to Translate Individual Words (KTIW) (e.g. knowing that i translates to o), and claim that it drives the learning of the attention. This claim is supported by the fact that before the attention mechanism is learned, KTIW can be learned from word co-occurrence statistics, but not the other way around. In particular, when we construct a training distribution that makes KTIW hard to learn, the learning of the attention fails, and the model cannot even learn the simple task of copying the input words to the output. Our approximation explains why models sometimes attend to salient words, and inspires a toy example where a multi-head attention model can overcome the above hard training distribution by improving learning dynamics rather than expressiveness.

Approximating How Single Head Attention Learns
pdf: https://t.co/nxuHyGGiw2
abs: https://t.co/j0twraCq4P pic.twitter.com/Vv4Kjs5EJQ
— AK (@ak92501) March 16, 2021

3. Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang

retweets: 1368, favorites: 215 (03/17/2021 09:09:48)
links: abs | pdf
cs.CV

We present the Modular interactive VOS (MiVOS) framework, which decouples interaction-to-mask and mask propagation, allowing for higher generalizability and better performance. Trained separately, the interaction module converts user interactions to an object mask, which is then temporally propagated by our propagation module using a novel top-$k$ filtering strategy in reading the space-time memory. To effectively take the user’s intent into account, a novel difference-aware module is proposed to learn how to properly fuse the masks before and after each interaction, which are aligned with the target frames by employing the space-time memory. We evaluate our method both qualitatively and quantitatively with different forms of user interactions (e.g., scribbles, clicks) on DAVIS to show that our method outperforms current state-of-the-art algorithms while requiring fewer frame interactions, with the additional advantage of generalizing to different types of user interactions. We contribute a large-scale synthetic VOS dataset with pixel-accurate segmentation of 4.8M frames to accompany our source code to facilitate future research.
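The top-$k$ filtering idea mentioned in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes that the memory read scores every memory entry against a query, keeps only the k largest scores, and normalizes those with a softmax. The function name `topk_softmax` and the example scores are hypothetical.

```python
import math

def topk_softmax(scores, k):
    """Softmax restricted to the k largest scores; every other weight is 0.

    Assumption: `scores` holds one affinity value per memory entry.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    # Indices of the k highest-scoring memory entries.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores[i] for i in top)
    exps = {i: math.exp(scores[i] - m) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]

# Hypothetical affinities between one query location and four memory entries.
weights = topk_softmax([2.0, 0.1, 1.5, -3.0], k=2)
```

Entries outside the top k receive exactly zero weight, so weakly matching memory locations cannot contribute noise to the read-out.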

Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
pdf: https://t.co/hTVNplXBQY
abs: https://t.co/930WeSQ8Np
project page: https://t.co/iyklauxO6V pic.twitter.com/DX8QPVxjCr
— AK (@ak92501) March 16, 2021

4. Multi-view Subword Regularization

Xinyi Wang, Sebastian Ruder, Graham Neubig

retweets: 405, favorites: 163 (03/17/2021 09:09:49)
links: abs | pdf
cs.CL

Multilingual pretrained representations generally rely on subword segmentation algorithms to create a shared multilingual vocabulary. However, standard heuristic algorithms often lead to sub-optimal segmentation, especially for languages with limited amounts of data. In this paper, we take two major steps towards alleviating this problem. First, we demonstrate empirically that applying existing subword regularization methods (Kudo, 2018; Provilkov et al., 2020) during fine-tuning of pre-trained multilingual representations improves the effectiveness of cross-lingual transfer. Second, to take full advantage of different possible input segmentations, we propose Multi-view Subword Regularization (MVR), a method that enforces the consistency between predictions of using inputs tokenized by the standard and probabilistic segmentations. Results on the XTREME multilingual benchmark (Hu et al., 2020) show that MVR brings consistent improvements of up to 2.5 points over using standard segmentation algorithms.
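The consistency idea behind MVR can be sketched as a loss over two views of the same input. The abstract only says that predictions from the standard and the probabilistic segmentation are kept consistent; the exact form used here (averaged cross-entropy plus a symmetric KL term) and the names `multiview_loss` and `consistency_weight` are assumptions for illustration.

```python
import math

def cross_entropy(probs, label):
    """Negative log-probability assigned to the gold label."""
    return -math.log(probs[label])

def kl(p, q):
    """KL(p || q) for two discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def multiview_loss(p_standard, p_sampled, label, consistency_weight=1.0):
    """Supervised loss on both views plus a symmetric consistency penalty.

    p_standard: class probabilities from the standard segmentation.
    p_sampled:  class probabilities from a sampled (probabilistic) segmentation.
    """
    supervised = 0.5 * (cross_entropy(p_standard, label)
                        + cross_entropy(p_sampled, label))
    consistency = 0.5 * (kl(p_standard, p_sampled) + kl(p_sampled, p_standard))
    return supervised + consistency_weight * consistency

# Hypothetical predictions for one 3-class example whose gold label is 0.
loss = multiview_loss([0.7, 0.2, 0.1], [0.5, 0.3, 0.2], label=0)
```

When both views produce the same distribution the consistency term vanishes and the loss reduces to ordinary cross-entropy.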

Subword segmentation for multilingual pretrained models is suboptimal, especially for under-represented languages. Our NAACL 2021 paper (https://t.co/L9UhV2j8dT) proposes a simple fix at fine-tuning time for better cross-lingual transfer. Joint work with @seb_ruder @gneubig pic.twitter.com/26lpco4A1V
— Xinyi Wang (Cindy) (@cindyxinyiwang) March 16, 2021

Multi-view subword regularization is simple but yields consistent improvements over pre-trained multilingual models. The best thing: It only needs to be applied during fine-tuning.

Paper: https://t.co/gxTgbzVvWN
Code: https://t.co/FqUyZgEnOQ https://t.co/sTFxot6yan
— Sebastian Ruder (@seb_ruder) March 16, 2021

5. The Public Life of Data: Investigating Reactions to Visualizations on Reddit

Tobias Kauer, Arran Ridley, Marian Dörk, Benjamin Bach

retweets: 323, favorites: 53 (03/17/2021 09:09:49)
links: abs | pdf
cs.HC

This research investigates how people engage with data visualizations when commenting on the social platform Reddit. There has been considerable research on collaborative sensemaking with visualizations and the personal relation of people with data. Yet, little is known about how public audiences without specific expertise and shared incentives openly express their thoughts, feelings, and insights in response to data visualizations. Motivated by the extensive social exchange around visualizations in online communities, this research examines characteristics and motivations of people’s reactions to posts featuring visualizations. Following a Grounded Theory approach, we study 475 reactions from the /r/dataisbeautiful community, identify ten distinguishable reaction types, and consider their contribution to the discourse. A follow-up survey with 168 Reddit users clarified their intentions to react. Our results help understand the role of personal perspectives on data and inform future interfaces that integrate audience reactions into visualizations to foster a public discourse about data.

We’re excited to present our paper “The public life of data: Investigating Reactions to Visualizations on Reddit” at #CHI2021 with @nrchtct @arranarranarran @benjbach: https://t.co/VDd75JErMD pic.twitter.com/FQPzFqpM7C
— Tobias Kauer (@tobi_vierzwo) March 16, 2021

6. Wav2vec-C: A Self-supervised Model for Speech Representation Learning

Samik Sadhu, Di He, Che-Wei Huang, Sri Harish Mallidi, Minhua Wu, Ariya Rastrow, Andreas Stolcke, Jasha Droppo, Roland Maas

retweets: 272, favorites: 65 (03/17/2021 09:09:49)
links: abs | pdf
eess.AS | cs.LG | cs.SD

Wav2vec-C introduces a novel representation learning technique combining elements from wav2vec 2.0 and VQ-VAE. Our model learns to reproduce quantized representations from partially masked speech encoding using a contrastive loss in a way similar to wav2vec 2.0. However, the quantization process is regularized by an additional consistency network that learns to reconstruct the input features to the wav2vec 2.0 network from the quantized representations in a way similar to a VQ-VAE model. The proposed self-supervised model is trained on 10k hours of unlabeled data and subsequently used as the speech encoder in an RNN-T ASR model and fine-tuned with 1k hours of labeled data. This work is one of only a few studies of self-supervised learning on speech tasks with a large volume of real far-field labeled data. The Wav2vec-C encoded representations achieve, on average, twice the error reduction over baseline and a higher codebook utilization in comparison to wav2vec 2.0.

Wav2vec-C: A Self-supervised Model for Speech Representation Learning
pdf: https://t.co/Fhn3sD6XCf
abs: https://t.co/Nxru1UbZF3 pic.twitter.com/wR0W5R842Z
— AK (@ak92501) March 16, 2021

7. Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning

Jason Wei, Chengyu Huang, Soroush Vosoughi, Yu Cheng, Shiqi Xu

retweets: 196, favorites: 73 (03/17/2021 09:09:49)
links: abs | pdf
cs.CL

Few-shot text classification is a fundamental NLP task in which a model aims to classify text into a large number of categories, given only a few training examples per category. This paper explores data augmentation — a technique particularly suitable for training with limited data — for this few-shot, highly-multiclass text classification setting. On four diverse text classification tasks, we find that common data augmentation techniques can improve the performance of triplet networks by up to 3.0% on average. To further boost performance, we present a simple training strategy called curriculum data augmentation, which leverages curriculum learning by first training on only original examples and then introducing augmented data as training progresses. We explore a two-stage and a gradual schedule, and find that, compared with standard single-stage training, curriculum data augmentation trains faster, improves performance, and remains robust to high amounts of noising from augmentation.
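The two curriculum schedules named in the abstract can be sketched as functions that map a training step to an augmentation strength. The abstract gives no numbers, so everything concrete here is an assumption: "strength" stands for something like the fraction of tokens perturbed, and `switch_step`, `num_stages`, and `max_strength` are hypothetical parameters.

```python
def two_stage_strength(step, switch_step, max_strength=0.5):
    """Original examples only before switch_step, full augmentation afterwards."""
    return 0.0 if step < switch_step else max_strength

def gradual_strength(step, total_steps, num_stages=5, max_strength=0.5):
    """Raise the strength in equal increments over equal-length stages.

    The first stage uses strength 0 (original examples only) and the last
    stage uses max_strength.
    """
    if num_stages < 2:
        raise ValueError("num_stages must be at least 2")
    # Index of the current stage, clamped so the final step stays in range.
    stage = min(step * num_stages // total_steps, num_stages - 1)
    return max_strength * stage / (num_stages - 1)

# Strength at the start of each of five stages in a 100-step run.
schedule = [gradual_strength(s, total_steps=100) for s in range(0, 100, 20)]
```

Both schedules start from clean data; they differ only in whether augmented examples arrive all at once or in increments.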

Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning
pdf: https://t.co/uyzxL1tu4L
abs: https://t.co/qX90yC7nAH pic.twitter.com/Y15zxoaJo8
— AK (@ak92501) March 16, 2021

8. Binary R Packages for Linux: Past, Present and Future

Iñaki Ucar, Dirk Eddelbuettel

retweets: 156, favorites: 26 (03/17/2021 09:09:49)
links: abs | pdf
stat.CO | cs.SE

Pre-compiled binary packages provide a very convenient way of efficiently distributing software that has been adopted by most Linux package management systems. However, the heterogeneity of the Linux ecosystem, combined with the growing number of R extensions available, poses a scalability problem. As a result, efforts to bring binary R packages to Linux have been scattered, and lack a proper mechanism to fully integrate them with R’s package manager. This work reviews the past and present of binary distribution for Linux, and presents a path forward by showcasing the ‘cran2copr’ project, an RPM-based proof-of-concept implementation of an automated scalable binary distribution system with the capability of building, maintaining and distributing thousands of packages, while providing a portable and extensible bridge to the system package manager.

#arXiv [https://t.co/7PoS7wuZWd] Binary R Packages for Linux: Past, Present and Future, by @eddelbuettel and myself, https://t.co/Z22HHvPeLM. We review efforts over the last 20+ years, and present a PoC of an automated and scalable system with full #rstats integration via BSPM. pic.twitter.com/rgXWeCD6I2
— Iñaki Úcar (@Enchufa2) March 16, 2021

9. Efficient estimation of Pauli observables by derandomization

Hsin-Yuan Huang, Richard Kueng, John Preskill

retweets: 63, favorites: 95 (03/17/2021 09:09:49)
links: abs | pdf
quant-ph | cs.DS

We consider the problem of jointly estimating expectation values of many Pauli observables, a crucial subroutine in variational quantum algorithms. Starting with randomized measurements, we propose an efficient derandomization procedure that iteratively replaces random single-qubit measurements with fixed Pauli measurements; the resulting deterministic measurement procedure is guaranteed to perform at least as well as the randomized one. In particular, for estimating any $L$ low-weight Pauli observables, a deterministic measurement on only of order $\log(L)$ copies of a quantum state suffices. In some cases, for example when some of the Pauli observables have a high weight, the derandomized procedure is substantially better than the randomized one. Specifically, numerical experiments highlight the advantages of our derandomized protocol over various previous methods for estimating the ground-state energies of small molecules.

Is randomness necessary to estimate M observables from only log M quantum measurements, e.g., as in https://t.co/eZIUYy2Rc0? In https://t.co/fglWrO6EXy, we show that randomness could be removed to yield even better performance (with application to quantum chemistry).
— Hsin-Yuan (Robert) Huang (@RobertHuangHY) March 16, 2021

10. Diagrammatic Differentiation for Quantum Machine Learning

Alexis Toumi, Richie Yeung, Giovanni de Felice

retweets: 82, favorites: 75 (03/17/2021 09:09:50)
links: abs | pdf
quant-ph | cs.LG | math.CT

We introduce diagrammatic differentiation for tensor calculus by generalising the dual number construction from rigs to monoidal categories. Applying this to ZX diagrams, we show how to calculate diagrammatically the gradient of a linear map with respect to a phase parameter. For diagrams of parametrised quantum circuits, we get the well-known parameter-shift rule at the basis of many variational quantum algorithms. We then extend our method to the automatic differentiation of hybrid classical-quantum circuits, using diagrams with bubbles to encode arbitrary non-linear operators. Moreover, diagrammatic differentiation comes with an open-source implementation in DisCoPy, the Python library for monoidal categories. Diagrammatic gradients of classical-quantum circuits can then be simplified using the PyZX library and executed on quantum hardware via the tket compiler. This opens the door to many practical applications harnessing both the structure of string diagrams and the computational power of quantum machine learning.
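The parameter-shift rule that the abstract refers to can be checked numerically on a single qubit. This is a plain-Python illustration, not the DisCoPy API: it simulates the state RX(theta)|0> by hand and uses the standard form of the rule for Pauli rotations, where the gradient is half the difference of two expectation values shifted by ±π/2. The function names are hypothetical.

```python
import math

def expectation_z(theta):
    """<Z> after applying RX(theta) to |0>; analytically this is cos(theta)."""
    amp0 = complex(math.cos(theta / 2), 0.0)   # amplitude of |0>
    amp1 = complex(0.0, -math.sin(theta / 2))  # amplitude of |1>
    return abs(amp0) ** 2 - abs(amp1) ** 2

def parameter_shift_gradient(f, theta):
    """Exact gradient of f from two evaluations shifted by +/- pi/2."""
    shift = math.pi / 2
    return (f(theta + shift) - f(theta - shift)) / 2

theta = 0.3
gradient = parameter_shift_gradient(expectation_z, theta)
```

Unlike a finite-difference estimate, the two shifted evaluations give the exact derivative, here -sin(theta), which is why the rule suits gradients measured on quantum hardware.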

Our paper with @richie_yeung and @gio_defel on diagrammatic differentiation for QML is out on the arXiv! We give rules for computing the gradients of ZX diagrams, quantum circuits and their classical post processing. https://t.co/jJ2x3M1mGk pic.twitter.com/cJVLXqTYcv
— alexis.toumi (@AlexisToumi) March 16, 2021

11. Accelerating the timeline for climate action in California

Daniel M Kammen, Teenie Matlock, Manuel Pastor, David Pellow, Veerabhadran Ramanathan, Tom Steyer, Leah Stokes, Feliz Ventura

retweets: 87, favorites: 40 (03/17/2021 09:09:50)
links: abs | pdf
eess.SY

The climate emergency increasingly threatens our communities, ecosystems, food production, health, and economy. It disproportionately impacts lower income communities, communities of color, and the elderly. Assessments since the 2018 IPCC 1.5 Celsius report show that current national and sub-national commitments and actions are insufficient. Fortunately, a suite of solutions exists now to mitigate the climate crisis if we initiate and sustain actions today. California, which has a strong set of current targets in place and is home to clean energy and high technology innovation, has fallen behind in its climate ambition compared to a number of major governments. California, a catalyst for climate action globally, can and should ramp up its leadership by aligning its climate goals with the most recent science, coordinating actions to make 2030 a point of significant accomplishment. This entails dramatically accelerating its carbon neutrality and net-negative emissions goal from 2045 to 2030, including advancing clean energy and clean transportation standards, and accelerating nature-based solutions on natural and working lands. It also means changing its current greenhouse gas reduction goals both in the percentage and the timing: cutting emissions by 80 percent (instead of 40 percent) below 1990 levels much closer to 2030 than 2050. These actions will enable California to save lives, benefit underserved and frontline communities, and save trillions of dollars. This rededication takes heed of the latest science, accelerating equitable, job-creating climate policies. While there are significant challenges to achieving these goals, California can establish policy now that will unleash innovation and channel market forces, as has happened with solar, and catalyze positive upward-scaling tipping points for accelerated global climate action.

Time to up our game, California and get back in the lead on energy, climate and justice innovation! @AirResources @californiapuc @GavinNewsom @TeenieMatlock, @Prof_MPastor, @david_pellow, V.Ramanathan, @TomSteyer, @leahstokes

Open access version here: https://t.co/y29OSor2iR https://t.co/lKSNjdYEsa
— Daniel M Kammen (@dan_kammen) March 16, 2021

12. Toward a Union-Find decoder for quantum LDPC codes

Nicolas Delfosse, Vivien Londe, Michael Beverland

retweets: 49, favorites: 53 (03/17/2021 09:09:50)
links: abs | pdf
quant-ph | cs.IT

Quantum LDPC codes are a promising direction for low overhead quantum computing. In this paper, we propose a generalization of the Union-Find decoder as a decoder for quantum LDPC codes. We prove that this decoder corrects all errors with weight up to $An^{\alpha}$ for some $A, \alpha > 0$ for different classes of quantum LDPC codes such as toric codes and hyperbolic codes in any dimension $D \geq 3$ and quantum expander codes. To prove this result, we introduce a notion of covering radius which measures the spread of an error from its syndrome. We believe this notion could find application beyond the decoding problem. We also perform numerical simulations, which show that our Union-Find decoder outperforms the belief propagation decoder in the low error rate regime in the case of a quantum LDPC code with length 3600.

Quantum LDPC codes are the future and they need better decoders.

Check out our new paper on a Union-Find decoder for LDPC codes with @vivien_londe and Michael Beverland. https://t.co/0f7l3ahmtM
— Nicolas Delfosse (@nic_delfosse) March 16, 2021

13. S2AND: A Benchmark and Evaluation System for Author Name Disambiguation

Shivashankar Subramanian, Daniel King, Doug Downey, Sergey Feldman

retweets: 90, favorites: 11 (03/17/2021 09:09:50)
links: abs | pdf
cs.DL

Author Name Disambiguation (AND) is the task of resolving which author mentions in a bibliographic database refer to the same real-world person, and is a critical ingredient of digital library applications such as search and citation analysis. While many AND algorithms have been proposed, comparing them is difficult because they often employ distinct features and are evaluated on different datasets. In response to this challenge, we present S2AND, a unified benchmark dataset for AND on scholarly papers, as well as an open-source reference model implementation. Our dataset harmonizes eight disparate AND datasets into a uniform format, with a single rich feature set drawn from the Semantic Scholar S2 database. Our evaluation suite for S2AND reports performance split by facets like publication year and number of papers, allowing researchers to track both global performance and measures of fairness across facet values. Our experiments show that because previous datasets tend to cover idiosyncratic and biased slices of the literature, algorithms trained to perform well on one of them may generalize poorly to others. By contrast, we show how training on a union of datasets in S2AND results in more robust models that perform well even on datasets unseen in training. The resulting AND model also substantially improves over the production algorithm in S2, reducing error by over 50% in terms of $B^3$ F1. We release our unified dataset, model code, trained models, and evaluation suite to the research community. https://github.com/allenai/S2AND/
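The $B^3$ F1 metric named in the abstract can be sketched as follows. This uses the standard B-cubed definition (per-item precision and recall averaged over all items, then combined into F1) and is not taken from the S2AND evaluation code. The function name `b_cubed` and the toy author mentions are hypothetical, and both mappings are assumed to cover the same items.

```python
from collections import defaultdict

def b_cubed(predicted, truth):
    """B-cubed precision, recall and F1 for a clustering.

    predicted, truth: dicts mapping each item to a cluster id.
    """
    pred_clusters, true_clusters = defaultdict(set), defaultdict(set)
    for item, cluster_id in predicted.items():
        pred_clusters[cluster_id].add(item)
    for item, cluster_id in truth.items():
        true_clusters[cluster_id].add(item)

    precision = recall = 0.0
    for item in truth:
        pred = pred_clusters[predicted[item]]
        gold = true_clusters[truth[item]]
        overlap = len(pred & gold)        # always >= 1: the item itself
        precision += overlap / len(pred)  # purity of the predicted cluster
        recall += overlap / len(gold)     # coverage of the true cluster
    n = len(truth)
    precision, recall = precision / n, recall / n
    return precision, recall, 2 * precision * recall / (precision + recall)

# Three hypothetical mentions: a and b are the same person, c is someone else.
truth = {"a": "person1", "b": "person1", "c": "person2"}
predicted = {"a": 0, "b": 1, "c": 1}
precision, recall, f1 = b_cubed(predicted, truth)
```

Because every item contributes its own precision and recall, the score penalizes both wrongly merged authors and wrongly split ones.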

14. Automated Fact-Checking for Assisting Human Fact-Checkers

Preslav Nakov, David Corney, Maram Hasanain, Firoj Alam, Tamer Elsayed, Alberto Barrón-Cedeño, Paolo Papotti, Shaden Shaar, Giovanni Da San Martino

retweets: 26, favorites: 52 (03/17/2021 09:09:50)
links: abs | pdf
cs.AI | cs.CL | cs.CR | cs.IR | cs.LG

The reporting and analysis of current events around the globe has expanded from professional, editor-led journalism all the way to citizen journalism. Politicians and other key players enjoy direct access to their audiences through social media, bypassing the filters of official cables or traditional media. However, the multiple advantages of free speech and direct communication are dimmed by the misuse of the media to spread inaccurate or misleading claims. These phenomena have led to the modern incarnation of the fact-checker, a professional whose main aim is to examine claims using available evidence to assess their veracity. As in other text forensics tasks, the amount of information available makes the work of the fact-checker more difficult. With this in mind, starting from the perspective of the professional fact-checker, we survey the available intelligent technologies that can support the human expert in the different steps of her fact-checking endeavor. These include identifying claims worth fact-checking; detecting relevant previously fact-checked claims; retrieving relevant evidence to fact-check a claim; and actually verifying a claim. In each case, we pay attention to the challenges in future work and the potential impact on real-world fact-checking.

"Automated Fact-Checking for Assisting Human Fact-Checkers" https://t.co/XwqYDeOs1u -- We survey AI to can support the human expert in identifying claims worth fact-checking, detecting previously fact-checked claims, retrieving evidence, and verifying a claim #fakenews pic.twitter.com/XG9FSUEPhR
— Preslav Nakov (@preslav_nakov) March 16, 2021

15. Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging

Álvaro Parra, Shin-Fang Chng, Tat-Jun Chin, Anders Eriksson, Ian Reid

retweets: 20, favorites: 58 (03/17/2021 09:09:50)
links: abs | pdf
cs.CV

Under mild conditions on the noise level of the measurements, rotation averaging satisfies strong duality, which enables global solutions to be obtained via semidefinite programming (SDP) relaxation. However, generic solvers for SDP are rather slow in practice, even on rotation averaging instances of moderate size, thus developing specialised algorithms is vital. In this paper, we present a fast algorithm that achieves global optimality called rotation coordinate descent (RCD). Unlike block coordinate descent (BCD) which solves SDP by updating the semidefinite matrix in a row-by-row fashion, RCD directly maintains and updates all valid rotations throughout the iterations. This obviates the need to store a large dense semidefinite matrix. We mathematically prove the convergence of our algorithm and empirically show its superior efficiency over state-of-the-art global methods on a variety of problem configurations. Maintaining valid rotations also facilitates incorporating local optimisation routines for further speed-ups. Moreover, our algorithm is simple to implement; see supplementary material for a demonstration program.

Our paper on rotation averaging accepted as oral at #CVPR21!

RCD is able to efficiently find the optimal camera orientations on large problems. https://t.co/aSVjEIPyaj pic.twitter.com/KWDgvzS1B3
— Álvaro Parra (@DrAlvaroParra) March 16, 2021

16. PhotoApp: Photorealistic Appearance Editing of Head Portraits

Mallikarjun B R, Ayush Tewari, Abdallah Dib, Tim Weyrich, Bernd Bickel, Hans-Peter Seidel, Hanspeter Pfister, Wojciech Matusik, Louis Chevallier, Mohamed Elgharib, Christian Theobalt

retweets: 36, favorites: 36 (03/17/2021 09:09:50)
links: abs | pdf
cs.CV | cs.GR | cs.LG

Photorealistic editing of portraits is a challenging task as humans are very sensitive to inconsistencies in faces. We present an approach for high-quality intuitive editing of the camera viewpoint and scene illumination in a portrait image. This requires our method to capture and control the full reflectance field of the person in the image. Most editing approaches rely on supervised learning using training data captured with setups such as light and camera stages. Such datasets are expensive to acquire, not readily available and do not capture all the rich variations of in-the-wild portrait images. In addition, most supervised approaches only focus on relighting, and do not allow camera viewpoint editing. Thus, they only capture and control a subset of the reflectance field. Recently, portrait editing has been demonstrated by operating in the generative model space of StyleGAN. While such approaches do not require direct supervision, there is a significant loss of quality when compared to the supervised approaches. In this paper, we present a method which learns from limited supervised training data. The training images only include people in a fixed neutral expression with eyes closed, without much hair or background variations. Each person is captured under 150 one-light-at-a-time conditions and under 8 camera poses. Instead of training directly in the image space, we design a supervised problem which learns transformations in the latent space of StyleGAN. This combines the best of supervised learning and generative adversarial modeling. We show that the StyleGAN prior allows for generalisation to different expressions, hairstyles and backgrounds. This produces high-quality photorealistic results for in-the-wild images and significantly outperforms existing methods. Our approach can edit the illumination and pose simultaneously, and runs at interactive rates.

PhotoApp: Photorealistic Appearance Editing of Head Portraits
pdf: https://t.co/2PudYMHkk0
abs: https://t.co/rkFtRgqmBW
project page: https://t.co/wLRQp70brf pic.twitter.com/86Q98riFdF
— AK (@ak92501) March 16, 2021

17. Unsupervised Image Transformation Learning via Generative Adversarial Networks

Kaiwen Zha, Yujun Shen, Bolei Zhou

retweets: 16, favorites: 49 (03/17/2021 09:09:50)
links: abs | pdf
cs.CV

In this work, we study the image transformation problem by learning the underlying transformations from a collection of images using Generative Adversarial Networks (GANs). Specifically, we propose an unsupervised learning framework, termed TrGAN, to project images onto a transformation space that is shared by the generator and the discriminator. Any two points in this projected space define a transformation that can guide the image generation process, leading to continuous semantic change. By projecting a pair of images onto the transformation space, we are able to adequately extract the semantic variation between them and further apply the extracted semantics to facilitate image editing, including not only transferring image styles (e.g., changing day to night) but also manipulating image contents (e.g., adding clouds in the sky). Code and models are available at https://genforce.github.io/trgan.

Unsupervised Image Transformation Learning via Generative Adversarial Networks
pdf: https://t.co/pFe5XPXrYb
abs: https://t.co/Mwcg3bKv16 pic.twitter.com/wMhWf0C0nd
— AK (@ak92501) March 16, 2021

18. Learning One Representation to Optimize All Rewards

Ahmed Touati, Yann Ollivier

retweets: 16, favorites: 46 (03/17/2021 09:09:51)
links: abs | pdf
cs.LG | cs.AI | math.OC

We introduce the forward-backward (FB) representation of the dynamics of a reward-free Markov decision process. It provides explicit near-optimal policies for any reward specified a posteriori. During an unsupervised phase, we use reward-free interactions with the environment to learn two representations via off-the-shelf deep learning methods and temporal difference (TD) learning. In the test phase, a reward representation is estimated either from observations or an explicit reward description (e.g., a target state). The optimal policy for that reward is directly obtained from these representations, with no planning. The unsupervised FB loss is well-principled: if training is perfect, the policies obtained are provably optimal for any reward function. With imperfect training, the sub-optimality is proportional to the unsupervised approximation error. The FB representation learns long-range relationships between states and actions, via a predictive occupancy map, without having to synthesize states as in model-based approaches. This is a step towards learning controllable agents in arbitrary black-box stochastic environments. This approach compares well to goal-oriented RL algorithms on discrete and continuous mazes, pixel-based MsPacman, and the FetchReach virtual robot arm. We also illustrate how the agent can immediately adapt to new tasks beyond goal-oriented RL.

Learning One Representation to Optimize All Rewards
pdf: https://t.co/o2xjtwyWej
abs: https://t.co/yNXG5LvBQP
github: https://t.co/U3gJyL68uC pic.twitter.com/AGGCgB5JR3
— AK (@ak92501) March 16, 2021

Published 17 Mar 2021

Tatsuya Shirakawa, ML Lead at Beatrust (https://beatrust.com)