1. From Motor Control to Team Play in Simulated Humanoid Football
Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess
Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals defined on much longer timescales, and in terms of relations that extend far beyond the body itself, ultimately involving coordination with other agents. Recent research in artificial intelligence has shown the promise of learning-based approaches to the respective problems of complex movement, longer-term planning and multi-agent coordination. However, there is limited research aimed at their integration. We study this problem by training teams of physically simulated humanoid avatars to play football in a realistic virtual environment. We develop a method that combines imitation learning, single- and multi-agent reinforcement learning and population-based training, and makes use of transferable representations of behaviour for decision making at different levels of abstraction. In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds. We investigate the emergence of behaviours at different levels of abstraction, as well as the representations that underlie these behaviours using several analysis techniques, including statistics from real-world sports analytics. Our work constitutes a complete demonstration of integrated decision-making at multiple scales in a physically embodied multi-agent setting. See project video at https://youtu.be/KHMwq9pv7mg.
From Motor Control to Team Play in Simulated Humanoid Football
— AK (@ak92501) May 27, 2021
pdf: https://t.co/rCLmDoicRd
abs: https://t.co/bOccFcOhf9
video: https://t.co/pebXTv7f7T pic.twitter.com/Iq3RqeUTbj
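As a rough illustration of the population-based self-play ingredient named in the abstract, here is a toy, runnable sketch in which a small population of agents improves by playing matches against one another. The game (matching pennies) and the naive update rule are stand-ins for the paper's humanoid football environment and RL machinery, not the authors' method.

```python
import random

# Toy illustration of self-play within a population: agents repeatedly play
# matches against sampled peers and nudge their policies toward rewarded
# actions. Matching pennies replaces the paper's football environment.

def payoff(a, b):
    # Player A (the matcher) wins (+1) if choices agree, else loses (-1).
    return 1.0 if a == b else -1.0

class Agent:
    def __init__(self):
        self.p_heads = random.random()  # probability of playing heads

    def act(self):
        return random.random() < self.p_heads

    def update(self, reward, action, lr=0.05):
        # Move the policy toward actions that earned positive reward.
        direction = 1.0 if action else -1.0
        self.p_heads = min(1.0, max(0.0, self.p_heads + lr * reward * direction))

population = [Agent() for _ in range(8)]
for step in range(10_000):
    a, b = random.sample(population, 2)  # pair agents within the population
    act_a, act_b = a.act(), b.act()
    r = payoff(act_a, act_b)
    a.update(r, act_a)    # A tries to match
    b.update(-r, act_b)   # B tries to mismatch (zero-sum)

# Policies tend to hover around the mixed equilibrium (~0.5).
print([round(ag.p_heads, 2) for ag in population])
```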
2. Aggregating Nested Transformers
Zizhao Zhang, Han Zhang, Long Zhao, Ting Chen, Tomas Pfister
Although hierarchical structures are popular in recent vision transformers, they require sophisticated designs and massive datasets to work well. In this work, we explore the idea of nesting basic local transformers on non-overlapping image blocks and aggregating them in a hierarchical manner. We find that the block aggregation function plays a critical role in enabling cross-block non-local information communication. This observation leads us to design a simplified architecture that requires minor code changes upon the original vision transformer and obtains improved performance compared to existing methods. Our empirical results show that the proposed method NesT converges faster and requires much less training data to achieve good generalization. For example, a NesT with 68M parameters trained on ImageNet for 100/300 epochs achieves 82.3%/83.8% accuracy evaluated on 224×224 image size, outperforming previous methods with up to 57% parameter reduction. Training a NesT with 6M parameters from scratch on CIFAR10 achieves 96% accuracy using a single GPU, setting a new state of the art for vision transformers. Beyond image classification, we extend the key idea to image generation and show NesT leads to a strong decoder that is 8× faster than previous transformer-based generators. Furthermore, we also propose a novel method for visually interpreting the learned model.
Aggregating Nested Transformers
— AK (@ak92501) May 27, 2021
pdf: https://t.co/qbGFpbmGE0
abs: https://t.co/Ju3GLt1l7M
68M achieves 82.3%/83.8% accuracy, NesT with 6M parameters from scratch on CIFAR10 achieves 96% accuracy using a single GPU, new SOTA, strong decoder 8× faster pic.twitter.com/P20cmHlLVM
Aggregating Nested Transformers
— Aran Komatsuzaki (@arankomatsuzaki) May 27, 2021
NesT outperforms previous methods with up to 57% parameter reduction on Imagenet and leads to a strong generative model that is 8x faster than previous transformer based generators.https://t.co/1psqq7stsY pic.twitter.com/LyI22Y0sfR
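The core idea (local attention inside non-overlapping blocks, followed by a block-aggregation step that lets information cross block boundaries) fits in a few lines. A minimal PyTorch sketch of one hierarchy level; the dimensions, residual wiring, and max-pool aggregation are illustrative choices, not NesT's exact configuration:

```python
import torch
import torch.nn as nn

# One level of the nesting idea: standard self-attention confined to
# non-overlapping blocks of the token grid, then a block aggregation
# (downsampling) step that merges neighbouring blocks.

class NestedLevel(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, block=4):
        # x: (batch, height, width, dim) token grid.
        b, h, w, d = x.shape
        # Partition into non-overlapping block x block windows of tokens.
        x = x.reshape(b, h // block, block, w // block, block, d)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, block * block, d)
        # Local transformer: attention never crosses a block boundary.
        q = self.norm(x)
        x = x + self.attn(q, q, q)[0]
        # Un-partition back to the full grid.
        x = x.reshape(b, h // block, w // block, block, block, d)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(b, h, w, d)
        # Block aggregation: spatial downsampling merges neighbouring blocks,
        # enabling cross-block communication at the next hierarchy level.
        x = x.permute(0, 3, 1, 2)                       # (b, d, h, w)
        x = nn.functional.max_pool2d(x, kernel_size=2)  # (b, d, h/2, w/2)
        return x.permute(0, 2, 3, 1)

tokens = torch.randn(2, 16, 16, 64)
level = NestedLevel(dim=64)
print(level(tokens).shape)  # torch.Size([2, 8, 8, 64])
```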
3. Neural Radiosity
Saeed Hadadan, Shuhong Chen, Matthias Zwicker
We introduce Neural Radiosity, an algorithm to solve the rendering equation by minimizing the norm of its residual, similarly to traditional radiosity techniques. Traditional basis functions used in radiosity techniques, such as piecewise polynomials or meshless basis functions, are typically limited to representing isotropic scattering from diffuse surfaces. Instead, we propose to leverage neural networks to represent the full four-dimensional radiance distribution, directly optimizing network parameters to minimize the norm of the residual. Our approach decouples solving the rendering equation from rendering (perspective) images, as in traditional radiosity techniques, and allows us to efficiently synthesize arbitrary views of a scene. In addition, we propose a network architecture using geometric learnable features that improves convergence of our solver compared to previous techniques. Our approach leads to an algorithm that is simple to implement, and we demonstrate its effectiveness on a variety of scenes with non-diffuse surfaces.
Neural Radiosity
— AK (@ak92501) May 27, 2021
pdf: https://t.co/9euDT2JXuA
abs: https://t.co/JsVLXoHRf9
leverage nns to represent the full four-dimensional radiance distribution, directly optimizing network parameters to minimize the norm of the residual pic.twitter.com/JJ9EaD688x
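In equations: the residual is the gap between the two sides of the rendering equation, with the same network L_theta substituted on both sides. A sketch of the objective in standard rendering-equation notation (notation mine, not copied from the paper):

```latex
% Residual of the rendering equation for a radiance network L_\theta:
r_\theta(x, \omega_o) = L_\theta(x, \omega_o) - E(x, \omega_o)
    - \int_{\mathcal{H}^2} f_r(x, \omega_i, \omega_o)\,
        L_\theta\big(x'(x, \omega_i), -\omega_i\big)\,
        (\omega_i \cdot n_x)\, \mathrm{d}\omega_i

% Training minimizes the squared norm of the residual over surface points
% and outgoing directions, estimated in practice by Monte Carlo sampling:
\theta^\star = \arg\min_\theta
    \int_{\mathcal{M}} \int_{\mathcal{H}^2}
        r_\theta(x, \omega_o)^2 \;\mathrm{d}\omega_o\,\mathrm{d}x
```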
4. SimNet: Learning Reactive Self-driving Simulations from Real-world Observations
Luca Bergamini, Yawei Ye, Oliver Scheel, Long Chen, Chih Hu, Luca Del Pero, Blazej Osinski, Hugo Grimmett, Peter Ondruska
In this work, we present a simple end-to-end trainable machine learning system capable of realistically simulating driving experiences. This can be used for the verification of self-driving system performance without relying on expensive and time-consuming road testing. In particular, we frame the simulation problem as a Markov Process, leveraging deep neural networks to model both the state distribution and the transition function. These are trainable directly from existing raw observations without the need for any handcrafting in the form of plant or kinematic models. All that is needed is a dataset of historical traffic episodes. Our formulation allows the system to construct never-before-seen scenes that unfold realistically, reacting to the self-driving car’s behaviour. We train our system directly from 1,000 hours of driving logs and measure both realism and reactivity, the two key properties of the simulation. At the same time, we apply the method to evaluate the performance of a recently proposed state-of-the-art ML planning system trained from human driving logs. We discover that this planning system is prone to previously unreported causal confusion issues that are difficult to test by non-reactive simulation. To the best of our knowledge, this is the first work that directly merges highly realistic data-driven simulations with a closed-loop evaluation for self-driving vehicles. We make the data, code, and pre-trained models publicly available to further stimulate simulation development.
SimNet: Learning Reactive Self-driving Simulations from Real-world Observations
— AK (@ak92501) May 27, 2021
pdf: https://t.co/ImQFTUEdsR
abs: https://t.co/VcULUHDRZV
project page: https://t.co/AGzDUEA65H
code: https://t.co/M4IDEjxjQM
colab: https://t.co/iaXgqvCv8m
video: https://t.co/kYsT9ZroHw pic.twitter.com/gClzozfhrt
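The Markov-process framing in the abstract reduces to: sample an initial scene state from a learned distribution, then repeatedly apply a learned transition function while the planner under test acts in the loop. A toy, runnable sketch with stand-in models (a Gaussian initial state and a linear transition) in place of SimNet's neural networks:

```python
import numpy as np

# Closed-loop rollout of a Markov-process simulator: the planner's action
# feeds the learned transition, so the simulated traffic reacts to the
# self-driving car's behaviour. Both "models" below are toy stand-ins.

rng = np.random.default_rng(0)
STATE_DIM = 4  # e.g., a simplified (x, y, heading, speed) scene summary

def sample_initial_state():
    # Stand-in for the learned initial state distribution p(s_0).
    return rng.normal(size=STATE_DIM)

def transition(state, ego_action):
    # Stand-in for the learned transition p(s_{t+1} | s_t, ego action).
    A = np.eye(STATE_DIM) * 0.95
    B = np.ones(STATE_DIM) * 0.1
    return A @ state + B * ego_action + rng.normal(scale=0.01, size=STATE_DIM)

def rollout(policy, horizon=50):
    state, states = sample_initial_state(), []
    for _ in range(horizon):
        action = policy(state)             # planner under test acts...
        state = transition(state, action)  # ...and the scene reacts
        states.append(state)
    return np.stack(states)

traj = rollout(policy=lambda s: -0.5 * s[0], horizon=50)
print(traj.shape)  # (50, 4)
```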
5. Provable Representation Learning for Imitation with Contrastive Fourier Features
Ofir Nachum, Mengjiao Yang
In imitation learning, it is common to learn a behavior policy to match an unknown target policy via max-likelihood training on a collected set of target demonstrations. In this work, we consider using offline experience datasets - potentially far from the target distribution - to learn low-dimensional state representations that provably improve the sample efficiency of downstream imitation learning. A central challenge in this setting is that the unknown target policy itself may not exhibit low-dimensional behavior, and so there is a potential for the representation learning objective to alias states in which the target policy acts differently. Circumventing this challenge, we derive a representation learning objective which provides an upper bound on the performance difference between the target policy and a low-dimensional policy trained with max-likelihood, and this bound is tight regardless of whether the target policy itself exhibits low-dimensional structure. Moving to the practicality of our method, we show that our objective can be implemented as contrastive learning, in which the transition dynamics are approximated by either an implicit energy-based model or, in some special cases, an implicit linear model with representations given by random Fourier features. Experiments on both tabular environments and high-dimensional Atari games provide quantitative evidence for the practical benefits of our proposed objective.
Provable Representation Learning for Imitation with Contrastive Fourier Features
— AK (@ak92501) May 27, 2021
pdf: https://t.co/BRjzpQRU9b
abs: https://t.co/diwNGPfRhD pic.twitter.com/K5vkvgd47d
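For the linear special case mentioned in the abstract, random Fourier features are the standard construction below: an explicit low-dimensional feature map whose inner products approximate an RBF kernel. This sketch shows the generic RFF construction only, not the paper's full contrastive objective:

```python
import numpy as np

# Random Fourier features: phi(x) = sqrt(2/dim) * cos(W x + b) with
# W ~ N(0, 1/bandwidth^2) and b ~ Uniform[0, 2*pi), so that
# phi(x) . phi(y) approximates exp(-||x - y||^2 / (2 * bandwidth^2)).

rng = np.random.default_rng(0)

def random_fourier_features(X, dim=256, bandwidth=1.0):
    d = X.shape[1]
    W = rng.normal(scale=1.0 / bandwidth, size=(d, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=dim)
    return np.sqrt(2.0 / dim) * np.cos(X @ W + b)

X = rng.normal(size=(5, 8))
Phi = random_fourier_features(X)
approx = Phi @ Phi.T  # explicit feature map approximates the kernel matrix
exact = np.exp(-0.5 * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
print(np.abs(approx - exact).max())  # error shrinks as dim grows
```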
6. Entropy and complexity unveil the landscape of memes evolution
Carlo Michele Valensise, Alessandra Serra, Alessandro Galeazzi, Gabriele Etta, Matteo Cinelli, Walter Quattrociocchi
- retweets: 56, favorites: 36 (05/28/2021 06:58:12)
- physics.soc-ph | cs.CY
On the Internet, information circulates fast and widely, and the form of content adapts to comply with users’ cognitive abilities. Memes are an emerging aspect of the internet system of signification, and their visual schemes evolve by adapting to a heterogeneous context. A fundamental question is whether they present culturally and temporally transcendent characteristics in their organizing principles. In this work, we study the evolution of 2 million visual memes from Reddit over ten years, from 2011 to 2020, in terms of their statistical complexity and entropy. We find support for the hypothesis that memes are part of an emerging form of internet metalanguage: on one side, we observe exponential growth with a doubling time of approximately six months; on the other side, the complexity of meme content increases, adapting to represent social trends and attitudes.
Our latest work describing the evolution of 10 years of memes on @reddit in terms of their visual complexity. Humor is layered and memes have to adapt to this stratified world. https://t.co/V585cq836c @valensic_ @Walter4C @DeveloperGale @gbrtte_ @AlessandraSerr1 pic.twitter.com/aSGbvgAf9S
— Matteo Cinelli (@matteo_cinelli) May 27, 2021
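As a flavour of the kind of measurement involved, here is a runnable sketch computing the Shannon entropy of a grayscale image's pixel histogram; the paper's actual entropy and complexity estimators for memes may differ from this simple stand-in.

```python
import numpy as np

# Shannon entropy of an 8-bit grayscale image's intensity histogram,
# in bits per pixel: 0 for a constant image, up to 8 for uniform noise.

def image_entropy(img_u8):
    counts = np.bincount(img_u8.ravel(), minlength=256)
    p = counts / counts.sum()
    p = p[p > 0]  # drop empty bins so log2 is well defined
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
flat = np.full((64, 64), 128, dtype=np.uint8)            # constant: ~0 bits
noisy = rng.integers(0, 256, (64, 64), dtype=np.uint8)   # noise: ~8 bits
print(image_entropy(flat), image_entropy(noisy))
```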
7. Networks of climate change: Connecting causes and consequences
Petter Holme, Juan C. Rocha
- retweets: 42, favorites: 36 (05/28/2021 06:58:12)
- physics.soc-ph | cs.SI | physics.ao-ph | q-bio.PE
Understanding the causes and consequences of, and devising countermeasures to, global warming is a profoundly complex problem. Even when researchers narrow down the focus to a publishable investigation, their analysis often contains enough interacting components to require a network visualization. Networks are thus both necessary and natural elements of climate science. Furthermore, networks form a mathematical foundation for a multitude of computational and analytical techniques. We are only beginning to see the benefits of this connection between the sciences of climate change and networks. In this review, we cover use-cases of networks in the climate-change literature — what they represent, how they are analyzed, and what insights they bring. We also discuss network data, tools, and problems yet to be explored.
Networks of climate change
— Petter Holme (@pholme) May 27, 2021
Finally past the scrutiny of arXiv moderators: https://t.co/owvYm2aSWH
The arXiv version doesn't have a link to the code/data: https://t.co/tNZjMoCo4O
8. Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks
Ruchir Puri, David S. Kung, Geert Janssen, Wei Zhang, Giacomo Domeniconi, Vladimir Zolotov, Julian Dolby, Jie Chen, Mihir Choudhury, Lindsey Decker, Veronika Thost, Luca Buratti, Saurabh Pujar, Ulrich Finkler
Advancements in deep learning and machine learning algorithms have enabled breakthrough progress in computer vision, speech recognition, natural language processing and beyond. In addition, over the last several decades, software has been built into the fabric of every aspect of our society. Together, these two trends have generated new interest in the fast-emerging research area of AI for Code. As software development becomes ubiquitous across all industries and the code infrastructure of enterprise legacy applications ages, it is more critical than ever to increase software development productivity and modernize legacy applications. Over the last decade, datasets like ImageNet, with its large scale and diversity, have played a pivotal role in algorithmic advancements from computer vision to language and speech understanding. In this paper, we present Project CodeNet, a first-of-its-kind, very large scale, diverse, and high-quality dataset to accelerate the algorithmic advancements in AI for Code. It consists of 14M code samples and about 500M lines of code in 55 different programming languages. Project CodeNet is unique not only in its scale, but also in the diversity of coding tasks it can help benchmark: from code similarity and classification for advances in code recommendation algorithms, and code translation between a large variety of programming languages, to advances in code performance (both runtime and memory) improvement techniques. CodeNet also provides sample input and output test sets for over 7M code samples, which can be critical for determining code equivalence in different languages. As a usability feature, we provide several preprocessing tools in Project CodeNet to transform source code into representations that can be readily used as inputs into machine learning models.
Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks
— AK (@ak92501) May 27, 2021
pdf: https://t.co/OxnMLbyngD
abs: https://t.co/rcV0RWqCjk
14M code samples and about 500M lines of code in 55 different programming languages pic.twitter.com/VqkDGi0cuD
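The per-problem sample input/output pairs mentioned in the abstract suggest a straightforward behavioural-equivalence check: run a submission on the sample input and compare its stdout to the expected output. A hedged, runnable sketch; the function name and calling convention are illustrative, not CodeNet's actual tooling or directory layout:

```python
import subprocess
import tempfile

# Run a Python submission on a sample input and compare stdout with the
# expected sample output. Requires python3 on PATH; raises
# subprocess.TimeoutExpired if the submission hangs past the timeout.

def passes_sample(source: str, sample_in: str, sample_out: str) -> bool:
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    result = subprocess.run(
        ["python3", path],
        input=sample_in, capture_output=True, text=True, timeout=5,
    )
    return result.stdout.strip() == sample_out.strip()

submission = "print(sum(int(x) for x in input().split()))"
print(passes_sample(submission, "1 2 3", "6"))  # True
```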
9. IntelliCAT: Intelligent Machine Translation Post-Editing with Quality Estimation and Translation Suggestion
Dongjun Lee, Junhyeong Ahn, Heesoo Park, Jaemin Jo
We present IntelliCAT, an interactive translation interface with neural models that streamline the post-editing process on machine translation output. We leverage two quality estimation (QE) models at different granularities: sentence-level QE, to predict the quality of each machine-translated sentence, and word-level QE, to locate the parts of the machine-translated sentence that need correction. Additionally, we introduce a novel translation suggestion model conditioned on both the left and right contexts, providing alternatives for specific words or phrases for correction. Finally, with word alignments, IntelliCAT automatically preserves the original document’s styles in the translated document. The experimental results show that post-editing based on the proposed QE and translation suggestions can significantly improve translation quality. Furthermore, a user study reveals that the three features provided in IntelliCAT significantly accelerate the post-editing task, achieving a 52.9% speedup in translation time compared to translating from scratch. The interface is publicly available at https://intellicat.beringlab.com/.
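A suggestion model "conditioned on both the left and right contexts" means candidates for a word are ranked by how well they fit between their neighbours. The toy, runnable sketch below conveys only the shape of that idea with trigram counts; IntelliCAT's actual suggestion model is neural, and the corpus and names here are made up for illustration:

```python
from collections import Counter

# Rank candidate words by how often each appears between the given left and
# right neighbours in a tiny corpus: a crude stand-in for a neural model
# conditioned on both left and right context.

corpus = ("the cat sat on the mat . the cat sat on the sofa . "
          "a cat slept on the mat .").split()

trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))

def suggest(left, right, candidates):
    # Score candidate w by the count of the trigram (left, w, right).
    return max(candidates, key=lambda w: trigrams[(left, w, right)])

print(suggest("cat", "on", candidates=["sat", "ran", "slept"]))  # "sat"
```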