1. Large Associative Memory Problem in Neurobiology and Machine Learning
Dmitry Krotov, John Hopfield
- retweets: 163, favorites: 549 (08/22/2020 22:53:07)
- links: abs | pdf
- q-bio.NC | cond-mat.dis-nn | cs.CL | cs.LG | stat.ML
Dense Associative Memories or modern Hopfield networks permit storage and reliable retrieval of an exponentially large (in the dimension of feature space) number of memories. At the same time, their naive implementation is non-biological, since it seemingly requires the existence of many-body synaptic junctions between the neurons. We show that these models are effective descriptions of a more microscopic (written in terms of biological degrees of freedom) theory that has additional (hidden) neurons and only requires two-body interactions between them. For this reason, our proposed microscopic theory is a valid model of large associative memory with a degree of biological plausibility. The dynamics of our network and its reduced-dimensional equivalent both minimize energy (Lyapunov) functions. When certain dynamical variables (hidden neurons) are integrated out from our microscopic theory, one can recover many of the models that were previously discussed in the literature, e.g., the model presented in the “Hopfield Networks is All You Need” paper. We also provide an alternative derivation of the energy function and the update rule proposed in the aforementioned paper and clarify the relationships between various models of this class.
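As a rough illustration of the retrieval dynamics the abstract alludes to, here is a minimal NumPy sketch of the softmax-based update rule used in the “Hopfield Networks is All You Need” formulation of dense associative memory; the inverse temperature, iteration count, and toy data are illustrative assumptions, not values from the paper.

```python
import numpy as np

def hopfield_update(query, memories, beta=8.0, steps=3):
    """Softmax-based retrieval update of a modern Hopfield network.

    query    : (d,) noisy or partial state to be completed
    memories : (N, d) matrix whose rows are the stored patterns
    beta     : inverse temperature; larger beta -> sharper retrieval
    """
    state = query.copy()
    for _ in range(steps):
        # attention-like weights over the stored patterns
        logits = beta * memories @ state            # (N,)
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()
        # new state is a weighted average of the memories
        state = memories.T @ weights                # (d,)
    return state

# toy demo: retrieve a stored pattern from a noisy cue
rng = np.random.default_rng(0)
memories = rng.standard_normal((16, 64))
cue = memories[3] + 0.3 * rng.standard_normal(64)
retrieved = hopfield_update(cue, memories)
print(np.argmax(memories @ retrieved))  # expected: 3
```

The same update, read as “state = patterns-weighted softmax of pattern-state overlaps”, is what makes the connection to Transformer self-attention immediate.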
Dense Associative Memories, aka modern Hopfield networks, have a huge memory storage capacity. But are they biologically realistic? In our new paper with @DimaKrotov we argue that they can be written in terms of biological variables. https://t.co/w58EAQ54xm pic.twitter.com/dGp4npMRo2
— John Hopfield (@HopfieldJohn) August 18, 2020
New microscopic theory of Dense Associative Memory aka modern Hopfield network can be reduced to the model proposed in “Hopfield Networks is All You Need’’ paper, which is equivalent to self-attention mechanism of Transformers. Work with @HopfieldJohn https://t.co/jYAxIjvgKf pic.twitter.com/212qRx9JrQ
— Dmitry Krotov (@DimaKrotov) August 18, 2020
Large Associative Memory Problem in Neurobiology and Machine Learning
Biologically plausible explanation of “Hopfield Networks is All You Need” by Krotov and Hopfield
Paper: https://t.co/8k8as68S4E
Discussion: https://t.co/DoEgQppEA3 pic.twitter.com/tDWf4QgYpE
— hardmaru (@hardmaru) August 19, 2020
2. Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries
Benjamin Heinzerling, Kentaro Inui
Pretrained language models have been suggested as a possible alternative or complement to structured knowledge bases. However, this emerging LM-as-KB paradigm has so far only been considered in a very limited setting, which only allows handling 21k entities whose single-token name is found in common LM vocabularies. Furthermore, the main benefit of this paradigm, namely querying the KB using a variety of natural language paraphrases, is underexplored so far. Here, we formulate two basic requirements for treating LMs as KBs: (i) the ability to store a large number of facts involving a large number of entities and (ii) the ability to query stored facts. We explore three entity representations that allow LMs to represent millions of entities and present a detailed case study on paraphrased querying of world knowledge in LMs, thereby providing a proof-of-concept that language models can indeed serve as knowledge bases.
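For context, the LM-as-KB setting amounts to querying a pretrained model with cloze-style prompts and paraphrases of them. Below is a minimal sketch using the Hugging Face fill-mask pipeline; the model choice and prompts are assumptions for illustration, and this baseline setup is restricted to single-token answers, exactly the limitation the paper's entity representations are designed to lift.

```python
# Querying a pretrained LM as a knowledge base with paraphrased cloze prompts.
# Not the authors' code: model and prompts are illustrative assumptions.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")

paraphrases = [
    "Dante was born in [MASK].",
    "Dante's birthplace is [MASK].",
    "The poet Dante came from [MASK].",
]

for prompt in paraphrases:
    top = fill(prompt, top_k=1)[0]
    print(f"{prompt!r:45} -> {top['token_str']} (score {top['score']:.2f})")
```

Consistency of the answer across paraphrases is the kind of behavior the paper's case study probes.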
Can language models be used for knowledge bases?
This paper aims to provide a proof of concept with some discussions and potential requirements to achieve this. https://t.co/Tj6KonH2dB pic.twitter.com/XpYJrkMsGI
— elvis (@omarsar0) August 21, 2020
3. PRINCIPIA: a Decentralized Peer-Review Ecosystem
Andrea Mambrini, Andrea Baronchelli, Michele Starnini, Daniele Marinazzo, Manlio De Domenico
- retweets: 52, favorites: 143 (08/22/2020 22:53:08)
- links: abs | pdf
- cs.DL | nlin.AO | physics.soc-ph
Peer review is a cornerstone of modern scientific endeavor. However, there is growing consensus that several limitations of the current peer review system, from the lack of incentives for reviewers to the lack of transparency, risk undermining its benefits. Here, we introduce the PRINCIPIA (http://www.principia.network/) framework for peer review of scientific outputs (e.g., papers, grant proposals or patents). The framework allows key players of the scientific ecosystem — including existing publishing groups — to create and manage peer-reviewed journals by building a free market for reviews and publications. PRINCIPIA's referees are transparently rewarded according to their efforts and the quality of their reviews. PRINCIPIA also naturally makes it possible to recognize the prestige of users and journals, through an intrinsic reputation system that does not depend on third parties. PRINCIPIA re-balances the power between researchers and publishers, stimulates valuable assessments from referees, favors fair competition between journals, and reduces the costs of accessing research output and of publishing.
Did you ever think, at least for 1s, that there is something wrong with the current peer-review and publishing system of research? If the answer is YES, then you might want to read this paper: https://t.co/MX1xEYqCdU
Spoiler: #PRINCIPIA, a peer-review market
Thread 1/n pic.twitter.com/cLv4wOfXj8
— Manlio De Domenico (@manlius84) August 21, 2020
4. Neural Networks and Quantum Field Theory
James Halverson, Anindita Maiti, Keegan Stoner
We propose a theoretical understanding of neural networks in terms of Wilsonian effective field theory. The correspondence relies on the fact that many asymptotic neural networks are drawn from Gaussian processes, the analog of non-interacting field theories. Moving away from the asymptotic limit yields a non-Gaussian process and corresponds to turning on particle interactions, allowing for the computation of correlation functions of neural network outputs with Feynman diagrams. Minimal non-Gaussian process likelihoods are determined by the most relevant non-Gaussian terms, according to the flow in their coefficients induced by the Wilsonian renormalization group. This yields a direct connection between overparameterization and simplicity of neural network likelihoods. Whether the coefficients are constants or functions may be understood in terms of GP limit symmetries, as expected from ’t Hooft’s technical naturalness. General theoretical calculations are matched to neural network experiments in the simplest class of models allowing the correspondence. Our formalism is valid for any of the many architectures that become a GP in an asymptotic limit, a property preserved under certain types of training.
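A small numerical illustration of the GP limit the correspondence builds on: for random one-hidden-layer ReLU networks, the connected four-point function of the output shrinks as the width grows (the free-theory limit), while finite width turns the "interactions" on. The setup below (input norm, widths, sample counts) is an assumption for the sketch, not the authors' experiment.

```python
import numpy as np

# Monte-Carlo estimate of output correlators of random one-hidden-layer
# ReLU networks at a single input x.  In the infinite-width (GP) limit the
# connected four-point function G4_conn = <f^4> - 3 <f^2>^2 vanishes.
def output_samples(x_norm_sq, width, n_samples, rng):
    # pre-activations w_i . x are i.i.d. N(0, |x|^2), so sample them directly
    z = rng.normal(0.0, np.sqrt(x_norm_sq), size=(n_samples, width))
    h = np.maximum(z, 0.0)                               # ReLU features
    v = rng.standard_normal((n_samples, width)) / np.sqrt(width)
    return np.sum(v * h, axis=1)                         # network outputs f(x)

rng = np.random.default_rng(0)
x_norm_sq = 8.0                                           # |x|^2 for an 8-dim input

for width in (4, 32, 128):
    f = output_samples(x_norm_sq, width, n_samples=100_000, rng=rng)
    g2 = np.mean(f**2)                                    # two-point function
    g4_conn = np.mean(f**4) - 3.0 * g2**2                 # connected four-point
    print(f"width={width:4d}  G2={g2:6.3f}  connected G4={g4_conn:+8.3f}")
```

The connected four-point term falls off roughly as 1/width, which is the quantity the paper organizes with Feynman diagrams and renormalization-group flow.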
Neural Networks and Quantum Field Theory https://t.co/sGsiFakZfk
A paper that tries to understand neural networks as a Wilsonian effective field theory. Noting that neural networks asymptotically become Gaussian processes, it draws the correspondence Gaussian process ⇔ free field theory and neural network ⇔ effective field theory.
— (A,H,D) (@AHD21) August 21, 2020
I'm very excited about the direction of research by @jhhalverson et al, revealing a correspondence between deep neural networks and Wilsonian Effective Field Theory:https://t.co/mYM08Vyqed
— Ryan Reece (@RyanDavidReece) August 21, 2020
5. Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation
Jeevan Devaranjan, Amlan Kar, Sanja Fidler
Procedural models are widely used to synthesize scenes for graphics and gaming, and to create (labeled) synthetic datasets for ML. In order to produce realistic and diverse scenes, a number of parameters governing the procedural models have to be carefully tuned by experts. These parameters control both the structure of the scenes being generated (e.g., how many cars are in the scene) and the parameters that place objects in valid configurations. Meta-Sim aimed to automatically tune these parameters given a target collection of real images in an unsupervised way. In Meta-Sim2, we aim to learn the scene structure in addition to the parameters, which is a challenging problem due to its discrete nature. Meta-Sim2 proceeds by learning to sequentially sample rule expansions from a given probabilistic scene grammar. Due to the discrete nature of the problem, we use Reinforcement Learning to train our model, and design a feature-space divergence between our synthesized and target images that is key to successful training. Experiments on a real driving dataset show that, without any supervision, we can successfully learn to generate data that captures discrete structural statistics of objects, such as their frequency, in real images. We also show that this leads to a downstream improvement in the performance of an object detector trained on our generated dataset, as opposed to other baseline simulation methods. Project page: https://nv-tlabs.github.io/meta-sim-structure/.
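To make "sequentially sample rule expansions from a probabilistic scene grammar" concrete, here is a toy sketch; the grammar and its expansion probabilities are invented for illustration, whereas Meta-Sim2 learns these expansion decisions with reinforcement learning so that structural statistics such as car counts match real images.

```python
import random

# Toy probabilistic scene grammar for a driving scene (invented for illustration).
GRAMMAR = {
    "Scene": [(["Road"], 1.0)],
    "Road":  [(["Lane"], 0.4), (["Lane", "Lane"], 0.6)],
    "Lane":  [(["Car"], 0.5), (["Car", "Car"], 0.3), ([], 0.2)],
    "Car":   [([], 1.0)],   # terminal symbol: no further expansion
}

def sample_scene(symbol="Scene", rng=random):
    """Sequentially sample rule expansions, returning a nested scene tree."""
    expansions, weights = zip(*GRAMMAR[symbol])
    rhs = rng.choices(expansions, weights=weights, k=1)[0]
    return {symbol: [sample_scene(child, rng) for child in rhs]}

def count(tree, name):
    """Count occurrences of a symbol, e.g. how many cars a sampled scene has."""
    root, children = next(iter(tree.items()))
    return int(root == name) + sum(count(c, name) for c in children)

scenes = [sample_scene() for _ in range(1000)]
print("average cars per scene:", sum(count(s, "Car") for s in scenes) / len(scenes))
```

In the paper's setting the sampling policy (the expansion probabilities) is the trainable object, and the reward comes from a feature-space divergence between rendered and real images.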
Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation
pdf: https://t.co/TODGyy210M
abs: https://t.co/LNMZOhTcBs
project page: https://t.co/PP3yEdXr4w pic.twitter.com/78DVzbUu6p
— AK (@ak92501) August 21, 2020
6. The effect of data encoding on the expressive power of variational quantum machine learning models
Maria Schuld, Ryan Sweke, Johannes Jakob Meyer
Quantum computers can be used for supervised learning by treating parametrised quantum circuits as models that map data inputs to predictions. While a lot of work has been done to investigate practical implications of this approach, many important theoretical properties of these models remain unknown. Here we investigate how the strategy with which data is encoded into the model influences the expressive power of parametrised quantum circuits as function approximators. We show that one can naturally write a quantum model as a partial Fourier series in the data, where the accessible frequencies are determined by the nature of the data encoding gates in the circuit. By repeating simple data encoding gates multiple times, quantum models can access increasingly rich frequency spectra. We show that there exist quantum models which can realise all possible sets of Fourier coefficients, and therefore, if the accessible frequency spectrum is asymptotically rich enough, such models are universal function approximators.
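The central observation can be reproduced on a single qubit: with r repetitions of a Pauli-rotation encoding gate, the model's output is a Fourier series whose frequencies go up to r, and the trainable gates only shape the coefficients. The circuit layout below is an assumed minimal example written with plain NumPy matrices, not the authors' code.

```python
import numpy as np

# f(x) = <0| U(x)^dag Z U(x) |0> with U(x) = W_r RZ(x) ... W_1 RZ(x) W_0.
# With r repetitions of the encoding gate RZ(x), f(x) is a real Fourier
# series with frequencies {0, +-1, ..., +-r}.

def rz(x):
    return np.diag([np.exp(-0.5j * x), np.exp(0.5j * x)])

def random_unitary(rng):
    q, _ = np.linalg.qr(rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2)))
    return q

def quantum_model(x, trainables):
    """Expectation <Z> of a circuit alternating trainable unitaries and RZ(x)."""
    state = np.array([1.0 + 0j, 0.0])
    for W in trainables[:-1]:
        state = rz(x) @ (W @ state)
    state = trainables[-1] @ state
    Z = np.diag([1.0, -1.0])
    return np.real(np.conj(state) @ (Z @ state))

rng = np.random.default_rng(0)
r = 3                                              # number of encoding repetitions
trainables = [random_unitary(rng) for _ in range(r + 1)]

# Recover the Fourier spectrum of f numerically: only |frequency| <= r is populated.
xs = np.linspace(0, 2 * np.pi, 64, endpoint=False)
fx = np.array([quantum_model(x, trainables) for x in xs])
coeffs = np.fft.rfft(fx) / len(xs)
print(np.round(np.abs(coeffs[:6]), 4))             # entries beyond index r are ~0
```

Changing the encoding gate (or repeating it more often) changes which frequencies are accessible; the rest of the circuit only controls the Fourier coefficients, which is the expressivity statement of the paper.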
Very happy to share a new preprint from Maria Schuld, @rndm_wlks and me: https://t.co/u6hC4GKmXY
We show that quantum learning models will always output a Fourier series where the encoding gates specify the frequencies and the rest of the circuit specifies the weights.
See 👇 pic.twitter.com/N9msqKgjWN
— Johannes Jakob Meyer (@jj_xyz) August 21, 2020
The expressive power of variational quantum machine learning models - by @XanaduAI's Maria Schuld and @rndm_wlks, @jj_xyz! Out now! :) https://t.co/Qn7PVXomSe
— Christian Weedbrook (@_cweedbrook) August 21, 2020
7. A Survey on Text Simplification
Punardeep Sikka, Manmeet Singh, Allen Pink, Vijay Mago
Text Simplification (TS) aims to reduce the linguistic complexity of content to make it easier to understand. Research in TS has been of keen interest, especially as approaches to TS have shifted from manual, hand-crafted rules to automated simplification. This survey seeks to provide a comprehensive overview of TS, including a brief description of earlier approaches, a discussion of various aspects of simplification (lexical, semantic and syntactic), and the latest techniques being utilized in the field. We note that research in the field has clearly shifted towards utilizing deep learning techniques to perform TS, with a specific focus on developing solutions to combat the lack of data available for simplification. We also include a discussion of commonly used datasets and evaluation metrics, along with a discussion of related fields within Natural Language Processing (NLP), such as semantic similarity.
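As a concrete (and deliberately naive) example of the lexical strand of TS the survey covers, the sketch below substitutes words with more frequent synonyms. The synonym table and frequency scores are invented for illustration; the systems surveyed rely on resources such as WordNet, embeddings, and large-corpus frequency counts instead.

```python
# Toy frequency-based lexical simplification: replace a word with its most
# frequent synonym.  Synonyms and frequency scores are invented for illustration.
SYNONYMS = {
    "utilize": ["use", "employ"],
    "commence": ["start", "begin"],
    "endeavor": ["try", "attempt"],
}
FREQUENCY = {"use": 9.5, "employ": 6.2, "start": 9.0, "begin": 8.4,
             "try": 9.2, "attempt": 7.1, "utilize": 4.8, "commence": 3.9,
             "endeavor": 3.5}

def simplify(sentence):
    out = []
    for word in sentence.split():
        candidates = SYNONYMS.get(word.lower(), []) + [word.lower()]
        best = max(candidates, key=lambda w: FREQUENCY.get(w, 0.0))
        out.append(best if word.islower() else best.capitalize())
    return " ".join(out)

print(simplify("We will commence the experiment and utilize the new dataset"))
# -> "We will start the experiment and use the new dataset"
```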
A Survey on Text Simplification. #NLP #DataScience #DeepLearning #DataMining #BigData #Analytics #Python #RStats #TensorFlow #IoT #Java #JavaScript #ReactJS #GoLang #Serverless #Linux #Cloud #AI #Programmer #MachineLearning #ArtificialIntelligence #NLProchttps://t.co/GiS1xy8hck pic.twitter.com/SSmwlpFWg4
— Marcus Borba (@marcusborba) August 21, 2020
A survey paper on text simplification.
"Text Simplification (TS) aims to reduce the linguistic complexity of content to make it easier to understand."
This line of research has tremendous potential to build more accessible educational content online. https://t.co/Ls7vBOL2CP pic.twitter.com/VCIHrF6rKg
— elvis (@omarsar0) August 21, 2020