1. Object-Centric Learning with Slot Attention
Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf
Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep learning approaches learn distributed representations that do not capture the compositional properties of natural scenes. In this paper, we present the Slot Attention module, an architectural component that interfaces with perceptual representations such as the output of a convolutional neural network and produces a set of task-dependent abstract representations which we call slots. These slots are exchangeable and can bind to any object in the input by specializing through a competitive procedure over multiple rounds of attention. We empirically demonstrate that Slot Attention can extract object-centric representations that enable generalization to unseen compositions when trained on unsupervised object discovery and supervised property prediction tasks.
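The core of the module is compact enough to sketch. Below is a minimal PyTorch rendition of the iterative attention loop the abstract describes; it is our illustration, with layer sizes, initialization, and normalization details simplified relative to the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlotAttention(nn.Module):
    """Minimal sketch of Slot Attention: slots compete for input features
    over a few rounds of attention. Simplified; not the authors' code."""
    def __init__(self, num_slots, dim, iters=3, eps=1e-8):
        super().__init__()
        self.num_slots, self.iters, self.eps = num_slots, iters, eps
        self.scale = dim ** -0.5
        # Learned Gaussian from which the exchangeable initial slots are sampled.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_logsigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q, self.to_k, self.to_v = (nn.Linear(dim, dim, bias=False) for _ in range(3))
        self.gru = nn.GRUCell(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.norm_in, self.norm_slots, self.norm_mlp = (nn.LayerNorm(dim) for _ in range(3))

    def forward(self, inputs):  # inputs: (batch, num_inputs, dim), e.g. CNN features
        b, n, d = inputs.shape
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_logsigma.exp() * torch.randn(
            b, self.num_slots, d, device=inputs.device)
        for _ in range(self.iters):
            slots_prev = slots
            q = self.to_q(self.norm_slots(slots))
            # Softmax over the *slot* axis: slots compete for each input location.
            attn = F.softmax(torch.einsum('bnd,bsd->bns', k, q) * self.scale, dim=-1) + self.eps
            attn = attn / attn.sum(dim=1, keepdim=True)  # weighted mean over inputs
            updates = torch.einsum('bns,bnd->bsd', attn, v)
            slots = self.gru(updates.reshape(-1, d), slots_prev.reshape(-1, d)).reshape(b, -1, d)
            slots = slots + self.mlp(self.norm_mlp(slots))  # residual refinement
        return slots  # (batch, num_slots, dim): one representation per slot
```

For example, `SlotAttention(num_slots=7, dim=64)(torch.randn(2, 1024, 64))` returns a `(2, 7, 64)` tensor of slots; because the initial slots are i.i.d. samples, the slots are exchangeable and specialize only through the attention competition.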
Excited to share our work @GoogleAI on Object-centric Learning with Slot Attention!
Slot Attention is a simple module for structure discovery and set prediction: it uses iterative attention to group perceptual inputs into a set of slots.
Paper: https://t.co/xfpjQwMuLP
[1/7] pic.twitter.com/0CzWO9B1fV
— Thomas Kipf (@thomaskipf) June 29, 2020
Super excited to share what I’ve been working on in the past months during my internship at Google Brain in Amsterdam: "Object-Centric Learning with Slot Attention" https://t.co/nkhPtVTLLJ @GoogleAI [1/7] pic.twitter.com/1aNYLm4exj
— Francesco Locatello (@FrancescoLocat8) June 29, 2020
2. Evaluation of Text Generation: A Survey
Asli Celikyilmaz, Elizabeth Clark, Jianfeng Gao
The paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years. We group NLG evaluation methods into three categories: (1) human-centric evaluation metrics, (2) automatic metrics that require no training, and (3) machine-learned metrics. For each category, we discuss the progress that has been made and the challenges still being faced, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models. We then present two case studies of automatic text summarization and long text generation, and conclude the paper by proposing future research directions.
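As a concrete instance of category (2), here is a small illustration of an untrained automatic metric (corpus-level BLEU via the sacrebleu package); this is our example, not code from the survey.

```python
import sacrebleu  # pip install sacrebleu

# Toy illustration of a training-free automatic NLG metric: corpus BLEU
# scores n-gram overlap between system outputs and human references.
hypotheses = ["the cat sat on the mat", "the dog barked at the mailman"]
references = [["the cat is on the mat", "a dog barked at the mail carrier"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")  # higher = more n-gram overlap with references
```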
Evaluation of Text Generation: A Survey: https://t.co/d666nSGT7Y
This looks like a neat survey of human, automated and learned metrics. It also has one of the longest bibliography sections I've ever seen, 16 pages…
— Denny Britz (@dennybritz) June 29, 2020
Evaluation is an important topic in NLP, particularly when it deals with models that generate text.
This is an impressive 50+ pages survey on the evaluation of text generation -- from human-centric to machine-learning centric methods. https://t.co/fTd0tBcDRv pic.twitter.com/ZIJYHZSP9N
— elvis (@omarsar0) June 29, 2020
3. A Framework for Reinforcement Learning and Planning
Thomas M. Moerland, Joost Broekens, Catholijn M. Jonker
Sequential decision making, commonly formalized as Markov Decision Process optimization, is a key challenge in artificial intelligence. Two successful approaches to MDP optimization are planning and reinforcement learning. Both research fields largely have their own research communities. However, if both research fields solve the same problem, then we should be able to disentangle the common factors in their solution approaches. Therefore, this paper presents a unifying framework for reinforcement learning and planning (FRAP), which identifies the underlying dimensions on which any planning or learning algorithm has to decide. At the end of the paper, we compare - in a single table - a variety of well-known planning, model-free and model-based RL algorithms along the dimensions of our framework, illustrating the validity of the framework. Altogether, FRAP provides deeper insight into the algorithmic space of planning and reinforcement learning, and also suggests new approaches to integration of both fields.
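To make the shared structure concrete: a dynamic-programming planning backup and a Q-learning backup estimate the same action value and differ mainly along dimensions FRAP makes explicit, such as whether the next state comes from a known model or a sampled transition. The toy NumPy contrast below is our illustration, not code from the paper.

```python
import numpy as np

n_states, n_actions, gamma, alpha = 5, 2, 0.9, 0.1
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # known transition model
R = rng.random((n_states, n_actions))                             # rewards
Q = np.zeros((n_states, n_actions))

def planning_backup(s, a):
    # Full-width backup: expectation over all next states under the model,
    # as in value iteration and the backups behind A*-style planners.
    return R[s, a] + gamma * P[s, a] @ Q.max(axis=1)

def q_learning_backup(s, a):
    # Sample backup: a single next state (here sampled from P, standing in
    # for an environment step), blended in with a learning rate.
    s_next = rng.choice(n_states, p=P[s, a])
    target = R[s, a] + gamma * Q[s_next].max()
    return Q[s, a] + alpha * (target - Q[s, a])

Q[0, 1] = planning_backup(0, 1)    # planner: uses the model directly
Q[0, 1] = q_learning_backup(0, 1)  # learner: uses a sampled transition
```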
FRAP is a systematic approach to categorize and compare planning and reinforcement learning approaches. It puts planning algorithms like A* and RL algorithms like Q-learning into one underlying framework: https://t.co/fA8MseIsEs
— Denny Britz (@dennybritz) June 29, 2020
4. Critic Regularized Regression
Ziyu Wang, Alexander Novikov, Konrad Żołna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas
Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction. It addresses challenges with regard to the cost of data collection and safety, both of which are particularly pertinent to real-world applications of RL. Unfortunately, most off-policy algorithms perform poorly when learning from a fixed dataset. In this paper, we propose a novel offline RL algorithm to learn policies from data using a form of critic-regularized regression (CRR). We find that CRR performs surprisingly well and scales to tasks with high-dimensional state and action spaces — outperforming several state-of-the-art offline RL algorithms by a significant margin on a wide range of benchmark tasks.
Check out our offline RL method: CRR. Main idea is to only train policy with supervised loss on behavior data (so that policy doesn't produce out-of-distribution actions), but weighting loss with exponent of advantage function to take rewards into account https://t.co/ZBv69p2d7u pic.twitter.com/0aKb5I2bth
— Alexander Novikov (@SashaVNovikov) June 29, 2020
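A hedged sketch of that recipe (the exponential-advantage variant; the `policy` and `critic` interfaces below are hypothetical stand-ins, since we are paraphrasing the tweet rather than the authors' code):

```python
import torch

def crr_policy_loss(policy, critic, states, actions, beta=1.0, n_samples=4):
    # Advantage = Q(s, a_data) minus a value baseline estimated by sampling
    # actions from the current policy and averaging their Q-values.
    with torch.no_grad():
        q_data = critic(states, actions)
        v = torch.stack([critic(states, policy.sample(states))
                         for _ in range(n_samples)]).mean(dim=0)
        weights = torch.exp((q_data - v) / beta).clamp(max=20.0)  # clipped for stability
    # Supervised (log-likelihood) loss on dataset actions keeps the policy
    # in-distribution; the weights tilt it toward high-advantage actions.
    return -(weights * policy.log_prob(states, actions)).mean()
```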
5. Pre-training via Paraphrasing
Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer
We introduce MARGE, a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual multi-document paraphrasing objective. MARGE provides an alternative to the dominant masked language modeling paradigm, where we self-supervise the reconstruction of target text by retrieving a set of related texts (in many languages) and conditioning on them to maximize the likelihood of generating the original. We show it is possible to jointly learn to do retrieval and reconstruction, given only a random initialization. The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks. For example, with no additional task-specific training we achieve BLEU scores of up to 35.8 for document translation. We further show that fine-tuning gives strong performance on a range of discriminative and generative tasks in many languages, making MARGE the most generally applicable pre-training method to date.
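How the objective fits together, as a hedged sketch: a learned relevance score between the target and each retrieved document gates the reconstruction, so gradients from the likelihood also train retrieval. In the paper the scores rescale cross-attention; below they weight pooled evidence encodings as a simplification, and `embed`, `encode`, and `decode` are assumed components, not the model's actual modules.

```python
import torch
import torch.nn.functional as F

def marge_style_loss(embed, encode, decode, target, evidence_docs):
    # Relevance: cosine similarity between document embeddings, normalized
    # over the retrieved set. Because these scores gate the reconstruction,
    # the likelihood gradient teaches the model what to retrieve.
    z_t = F.normalize(embed(target), dim=-1)
    z_e = torch.stack([F.normalize(embed(e), dim=-1) for e in evidence_docs])
    relevance = torch.softmax(z_e @ z_t, dim=0)              # one weight per document
    # Condition reconstruction on relevance-weighted evidence encodings.
    memory = sum(r * encode(e) for r, e in zip(relevance, evidence_docs))
    logits = decode(target, memory)                          # teacher forcing
    return F.cross_entropy(logits[:-1], target[1:])          # next-token loss
```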
6. Homotopy Theoretic and Categorical Models of Neural Information Networks
Yuri Manin, Matilde Marcolli
In this paper we develop a novel mathematical formalism for the modeling of neural information networks endowed with additional structure in the form of assignments of resources, either computational or metabolic or informational. The starting point for this construction is the notion of summing functors and of Segal’s Gamma-spaces in homotopy theory. The main results in this paper include functorial assignments of concurrent/distributed computing architectures and associated binary codes to networks and their subsystems, a categorical form of the Hopfield network dynamics, which recovers the usual Hopfield equations when applied to a suitable category of weighted codes, a functorial assignment to networks of corresponding information structures and information cohomology, and a cohomological version of integrated information.
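For context (our addition, not the paper's notation), the "usual Hopfield equations" recovered by the categorical dynamics are the classical discrete-time update for binary neurons:

```latex
% Classical Hopfield dynamics: each neuron takes the sign of its weighted
% input minus a threshold; w_{ij} are symmetric weights, \theta_i thresholds.
x_i(t+1) = \operatorname{sgn}\Big( \sum_{j} w_{ij}\, x_j(t) - \theta_i \Big)
```

The paper's contribution is a categorical version of this dynamics that reduces to it when applied to a suitable category of weighted codes.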
New preprint by Yuri Manin (!) and Matilde Marcolli https://t.co/38qwIUYgYu pic.twitter.com/5Sds7gXbFf
— Maxim Raginsky (@mraginsky) June 29, 2020
7. Can 3D Adversarial Logos Cloak Humans?
Tianlong Chen, Yi Wang, Jingyang Zhou, Sijia Liu, Shiyu Chang, Chandrajit Bajaj, Zhangyang Wang
Following the trend of adversarial attacks, researchers have attempted to fool trained object detectors in 2D scenes. Among these, an intriguing new form of attack with potential real-world usage is to append adversarial patches (e.g. logos) to images. Nevertheless, much less is known about adversarial attacks from 3D rendering views, which are essential for an attack to remain strong in the physical world. This paper presents a new 3D adversarial logo attack: we construct an arbitrary-shape logo from a 2D texture image and map it into a 3D adversarial logo via a texture mapping called logo transformation. The resulting 3D adversarial logo is then viewed as an adversarial texture enabling easy manipulation of its shape and position. This greatly extends the versatility of adversarial training for computer-graphics-synthesized imagery. Contrary to the traditional adversarial patch, this new form of attack is mapped into the 3D object world and back-propagates to the 2D image domain through differentiable rendering. In addition, and unlike existing adversarial patches, our new 3D adversarial logo is shown to fool state-of-the-art deep object detectors robustly under model rotations, taking a step closer to realistic attacks in the physical world. Our code is available at https://github.com/TAMU-VITA/3D_Adversarial_Logo.
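As a sketch of the optimization loop the abstract describes (our paraphrase; `render` and `detector` stand in for a differentiable renderer and a pretrained object detector, and their interfaces are assumptions):

```python
import torch

def attack_step(texture, mesh, render, detector, optimizer, views):
    # Optimize the 2D logo texture so that, after differentiable rendering
    # onto the 3D human mesh, the detector's person confidence drops.
    optimizer.zero_grad()
    loss = 0.0
    for view in views:  # robustness across rotations / camera poses
        image = render(mesh, texture, view)          # 3D -> 2D, differentiable
        loss = loss + detector.person_score(image)   # suppress "person" detections
    loss.backward()                                  # gradients reach the 2D texture
    optimizer.step()
    with torch.no_grad():
        texture.clamp_(0.0, 1.0)                     # keep the texture a valid image
    return loss.item()
```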
Can 3D Adversarial Logos Cloak Humans?
pdf: https://t.co/HO7MRWhWUz
abs: https://t.co/g0lgIvSWvf
github: https://t.co/s5nacxTlF6 pic.twitter.com/FFE9ioPN5b
— roadrunner01 (@ak92501) June 29, 2020
🤔 https://t.co/PngQeMY2K2 pic.twitter.com/qy79gPQ0DL
— Hiroharu Kato (@hiroharu_kato) June 29, 2020
8. SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
Angela Dai, Yawar Siddiqui, Justus Thies, Julien Valentin, Matthias Nießner
We present SPSG, a novel approach to generate high-quality, colored 3D models of scenes from RGB-D scan observations by learning to infer unobserved scene geometry and color in a self-supervised fashion. Our self-supervised approach learns to jointly inpaint geometry and color by correlating an incomplete RGB-D scan with a more complete version of that scan. Notably, rather than relying on 3D reconstruction losses to inform our 3D geometry and color reconstruction, we propose adversarial and perceptual losses operating on 2D renderings in order to achieve high-resolution, high-quality colored reconstructions of scenes. This exploits the high-resolution, self-consistent signal from individual raw RGB-D frames, in contrast to fused 3D reconstructions of the frames, which exhibit inconsistencies from view-dependent effects such as color balancing or pose inconsistencies. Thus, by informing our 3D scene generation directly through 2D signal, we produce high-quality colored reconstructions of 3D scenes, outperforming the state of the art on both synthetic and real data.
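A hedged sketch of that training signal, with `render`, `perceptual_loss`, and `discriminator` as assumed stand-ins rather than the authors' components:

```python
def spsg_style_loss(pred_scene, render, frames, perceptual_loss, discriminator):
    # Supervise the predicted geometry+color through 2D renderings compared
    # against raw RGB-D frames, instead of a 3D loss against a fused (and
    # view-inconsistent) reconstruction.
    loss = 0.0
    for frame in frames:  # each raw frame carries a camera pose and an RGB image
        rendered = render(pred_scene, frame.pose)     # differentiable 2D rendering
        loss = loss + perceptual_loss(rendered, frame.rgb)
        loss = loss - discriminator(rendered).mean()  # adversarial term (generator side)
    return loss
```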
SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
pdf: https://t.co/iM742nYqJt
abs: https://t.co/tc8uX1mgCj pic.twitter.com/H1GLo1z355
— roadrunner01 (@ak92501) June 29, 2020
9. Layerwise learning for quantum neural networks
Andrea Skolik, Jarrod R. McClean, Masoud Mohseni, Patrick van der Smagt, Martin Leib
With the increased focus on quantum circuit learning for near-term applications on quantum devices, in conjunction with unique challenges presented by cost function landscapes of parametrized quantum circuits, strategies for effective training are becoming increasingly important. In order to ameliorate some of these challenges, we investigate a layerwise learning strategy for parametrized quantum circuits. The circuit depth is incrementally grown during optimization, and only subsets of parameters are updated in each training step. We show that when considering sampling noise, this strategy can help avoid the problem of barren plateaus of the error surface due to the low depth of circuits, low number of parameters trained in one step, and larger magnitude of gradients compared to training the full circuit. These properties make our algorithm preferable for execution on noisy intermediate-scale quantum devices. We demonstrate our approach on an image-classification task on handwritten digits, and show that layerwise learning attains an 8% lower generalization error on average in comparison to standard learning schemes for training quantum circuits of the same size. Additionally, the percentage of runs that reach lower test errors is up to 40% larger compared to training the full circuit, which is susceptible to creeping onto a plateau during training.
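The strategy is easy to sketch in a library-agnostic way: grow the circuit a block of layers at a time and optimize only the newest block. The code below is our illustration; `build_circuit_loss` is a hypothetical function that evaluates the training loss of a circuit with the given parameters.

```python
import numpy as np

def numerical_grad(f, x, eps=1e-3):
    # Finite-difference gradient; on hardware one would use the parameter-shift rule.
    g = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        d = np.zeros_like(x); d[idx] = eps
        g[idx] = (f(x + d) - f(x - d)) / (2 * eps)
    return g

def layerwise_train(build_circuit_loss, layers_per_step=2, total_layers=8,
                    params_per_layer=4, epochs_per_step=50, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    params = np.empty((0, params_per_layer))
    for _ in range(total_layers // layers_per_step):
        new = rng.uniform(0, 2 * np.pi, (layers_per_step, params_per_layer))
        frozen = params                              # earlier layers stay fixed
        for _ in range(epochs_per_step):
            grad = numerical_grad(lambda p: build_circuit_loss(np.vstack([frozen, p])), new)
            new = new - lr * grad                    # update only the new block
        params = np.vstack([frozen, new])
    return params
```

Freezing earlier blocks keeps the number of simultaneously trained parameters small and the circuit shallow early on, which is what the paper credits for the larger gradient magnitudes and the avoidance of barren plateaus.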
Glad to finally see our paper on layerwise learning for parametrized circuits finished, where we show how to increase the probability of successfully training circuits on NISQ devices! https://t.co/8iKi1S20zB @JarrodMcclean @masoud_mohseni Patrick van der Smagt @LeibMartin pic.twitter.com/Ntc6dahjnS
— Andrea Skolik (@askolik8) June 29, 2020
10. Machine learning-based clinical prediction modeling — A practical guide for clinicians
Julius M. Kernbach, Victor E. Staartjes
In the emerging era of big data, larger available clinical datasets and computational advances have sparked massive interest in machine learning-based approaches. The number of manuscripts related to machine learning or artificial intelligence has increased exponentially over the past years. As analytical machine learning tools become readily available for clinicians to use, an understanding of key concepts and an awareness of analytical pitfalls are increasingly required for clinicians, investigators, reviewers and editors, who, even as experts in their clinical field, sometimes find themselves insufficiently equipped to evaluate machine learning methodologies. In the first section, we explain the general principles of machine learning, as well as the analytical steps required for successful machine learning-based predictive modelling, which is the focus of this series. In further sections, we review the importance of resampling, overfitting and model generalizability as well as feature reduction and selection (Part II), strategies for model evaluation, reporting and discussion of common caveats and other points of significance (Part III), and offer a practical guide to classification (Part IV) and regression modelling (Part V), with a complete coding pipeline. Methodological rigor and clarity, as well as an understanding of the reasoning behind the internal workings of a machine learning approach, are required; otherwise, predictive applications, despite being strong analytical tools, will not be well accepted into clinical routine. Going forward, machine learning and artificial intelligence will shape and influence modern medicine across disciplines, including the field of neurosurgery.
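As a small taste of the methodology the series covers (our illustration, not code from the papers): estimating generalization with resampling rather than training-set accuracy, and keeping preprocessing inside the cross-validation loop so the test folds stay untouched.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# A pipeline refits the scaler on each training fold only, avoiding the
# common leakage pitfall of scaling on the full dataset before splitting.
X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"5-fold AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```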
Anyone interested in #MachineLearning: Check out our series on #ML-based clinical prediction modeling „for dummies“ available as a preprint at:https://t.co/R4MuyN8Rb8@arxiv @EANSonline @EANS_yns @mnstienen @Unispital_USZ @UZH_en @SFCNSyouclin @vittoriostumpo_ @realbrainbook pic.twitter.com/C21VYjwvEy
— Victor Staartjes (@staartjesneuro) June 29, 2020