1. The Values Encoded in Machine Learning Research
Abeba Birhane, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, Michelle Bao
Machine learning (ML) currently exerts an outsized influence on the world, increasingly affecting communities and institutional practices. It is therefore critical that we question vague conceptions of the field as value-neutral or universally beneficial, and investigate what specific values the field is advancing. In this paper, we present a rigorous examination of the values of the field by quantitatively and qualitatively analyzing 100 highly cited ML papers published at premier ML conferences, ICML and NeurIPS. We annotate key features of papers which reveal their values: how they justify their choice of project, which aspects they uplift, their consideration of potential negative consequences, and their institutional affiliations and funding sources. We find that societal needs are typically very loosely connected to the choice of project, if mentioned at all, and that consideration of negative consequences is extremely rare. We identify 67 values that are uplifted in machine learning research, and, of these, we find that papers most frequently justify and assess themselves based on performance, generalization, efficiency, researcher understanding, novelty, and building on previous work. We present extensive textual evidence and analysis of how these values are operationalized. Notably, we find that each of these top values is currently being defined and applied with assumptions and implications generally supporting the centralization of power. Finally, we find increasingly close ties between these highly cited papers and tech companies and elite universities.
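The quantitative half of the analysis reduces to a simple pipeline: annotate each paper with the values it uplifts, then count how often each value appears across the corpus. A minimal sketch of that tally in Python; the records here are made-up stand-ins for the authors' actual annotations:

```python
from collections import Counter

# Hypothetical annotation records, one per paper; the value names mirror the
# paper's reported top values, but the assignments below are illustrative only.
annotations = [
    {"paper": "paper_001", "values": ["performance", "efficiency", "novelty"]},
    {"paper": "paper_002", "values": ["performance", "generalization"]},
    {"paper": "paper_003", "values": ["building on previous work", "performance"]},
]

# Tally how often each value is uplifted across the annotated corpus.
counts = Counter(v for record in annotations for v in record["values"])
for value, n in counts.most_common():
    print(f"{value}: {n}")
```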
Our (@radical_ai_, @dallascard, @willie_agnew, @DotanRavit, @TheMichelleBao & I) preprint is up on ArXiv (paper currently under review)
— Abeba Birhane (@Abebab) June 30, 2021
Paper: The Values Encoded in Machine Learning Research https://t.co/rGjFJ5C83k
Code and supplementary material: https://t.co/JJlPzuIEg8
1/ pic.twitter.com/8jkhsmdUDq
🎉New paper!📜 We annotated the values uplifted in ML papers to understand what the field as a whole is trying to achieve. We then analyzed the societal impacts of these top values, enabling understanding the impacts of even highly technical research. https://t.co/ulo6JvNhJx
— Willie Agnew (@willie_agnew) June 30, 2021
2. SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
Dara Bahri, Heinrich Jiang, Yi Tay, Donald Metzler
Self-supervised contrastive representation learning has proved incredibly successful in the vision and natural language domains, enabling state-of-the-art performance with orders of magnitude less labeled data. However, such methods are domain-specific and little has been done to leverage this technique on real-world tabular datasets. We propose SCARF, a simple, widely-applicable technique for contrastive learning, where views are formed by corrupting a random subset of features. When applied to pre-train deep neural networks on the 69 real-world, tabular classification datasets from the OpenML-CC18 benchmark, SCARF not only improves classification accuracy in the fully-supervised setting but does so also in the presence of label noise and in the semi-supervised setting where only a fraction of the available training data is labeled. We show that SCARF complements existing strategies and outperforms alternatives like autoencoders. We conduct comprehensive ablations, detailing the importance of a range of factors.
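The heart of SCARF is its view-generation step: corrupt a random subset of each example's features by resampling them from their empirical marginal distributions, then train with a standard contrastive objective between the original and corrupted views. A minimal PyTorch sketch under those assumptions; the 0.6 corruption rate is an illustrative default, not necessarily the paper's exact setting:

```python
import torch
import torch.nn.functional as F

def scarf_corrupt(x: torch.Tensor, corruption_rate: float = 0.6) -> torch.Tensor:
    """Build a corrupted view of tabular inputs x with shape (batch, features):
    for a random subset of feature cells, replace the value with one resampled
    from that feature's empirical marginal (approximated here by the batch)."""
    batch, dim = x.shape
    # Per-cell corruption mask, drawn independently for each sample and feature.
    mask = torch.rand(batch, dim, device=x.device) < corruption_rate
    # Marginal resampling: for each cell, pick the same feature from a random row.
    rows = torch.randint(0, batch, (batch, dim), device=x.device)
    marginal = x[rows, torch.arange(dim, device=x.device)]
    return torch.where(mask, marginal, x)

def info_nce(z_anchor: torch.Tensor, z_view: torch.Tensor,
             temperature: float = 1.0) -> torch.Tensor:
    """Standard InfoNCE: each anchor's positive is its own corrupted view;
    all other in-batch views serve as negatives."""
    z_anchor = F.normalize(z_anchor, dim=1)
    z_view = F.normalize(z_view, dim=1)
    logits = z_anchor @ z_view.T / temperature
    targets = torch.arange(z_anchor.size(0), device=z_anchor.device)
    return F.cross_entropy(logits, targets)
```

Pairing `scarf_corrupt` with any encoder network and `info_nce` over batches gives the kind of pre-training loop the abstract describes.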
Excited to share our new paper from @GoogleAI. "SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption" (https://t.co/nm4Wh3LIMj). This is joint work with Heinrich Jiang, @ytay017 and @metzlerd. pic.twitter.com/Ht2hnvCC0i
— Dara Bahri (@dara_bahri) June 30, 2021
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
— AK (@ak92501) June 30, 2021
pdf: https://t.co/YmfyNkLcC9
abs: https://t.co/m4kiQq0j5Q
technique for contrastive learning, where views are formed by corrupting a random subset of features pic.twitter.com/pP6phV9p3m
3. A Survey on Neural Speech Synthesis
Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural speech given text, is a hot research topic in the speech, language, and machine learning communities and has broad applications in industry. With the development of deep learning and artificial intelligence, neural network-based TTS has significantly improved the quality of synthesized speech in recent years. In this paper, we conduct a comprehensive survey on neural TTS, aiming to provide a good understanding of current research and future trends. We focus on the key components of neural TTS, including text analysis, acoustic models, and vocoders, and on several advanced topics, such as fast TTS, low-resource TTS, robust TTS, expressive TTS, and adaptive TTS. We further summarize resources related to TTS (e.g., datasets and open-source implementations) and discuss future research directions. This survey can serve both academic researchers and industry practitioners working on TTS.
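The survey organizes neural TTS around three key components: text analysis, an acoustic model, and a vocoder. The skeleton below shows how they compose into a pipeline; every function is a hypothetical placeholder for the trained models the survey actually covers:

```python
import numpy as np

def text_analysis(text: str) -> list:
    """Text analysis: normalize text and convert graphemes to phoneme tokens.
    Real systems use a rule-based or learned G2P model; this just splits chars."""
    return list(text.lower())

def acoustic_model(phonemes: list) -> np.ndarray:
    """Acoustic model: map phoneme tokens to a mel-spectrogram (frames x mel bins).
    A real model (e.g., Tacotron 2 or FastSpeech) is a trained network;
    this returns a placeholder of plausible shape."""
    return np.zeros((len(phonemes) * 5, 80))  # ~5 frames per token, 80 mel bins

def vocoder(mel: np.ndarray, hop_length: int = 256) -> np.ndarray:
    """Vocoder: synthesize a waveform from the mel-spectrogram.
    Real vocoders include WaveNet, WaveGlow, and HiFi-GAN."""
    return np.zeros(mel.shape[0] * hop_length)

# End-to-end: text -> phonemes -> mel-spectrogram -> waveform.
waveform = vocoder(acoustic_model(text_analysis("Hello world")))
```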
📘 A Survey on Neural Speech Synthesis
— elvis (@omarsar0) June 30, 2021
A survey paper summarizing current research, methods, and datasets used in neural speech synthesis. It also discusses future trends and provides a taxonomy.
A great read for ML researchers and practitioners. https://t.co/WPgGqLJQhn pic.twitter.com/R9rXTCVNnb
A Survey on Neural Speech Synthesis
— AK (@ak92501) June 30, 2021
pdf: https://t.co/kkykc53eM2
abs: https://t.co/IT0m7uOYIX pic.twitter.com/RblI1TWIRc
4. Time-Aware Language Models as Temporal Knowledge Bases
Bhuwan Dhingra, Jeremy R. Cole, Julian Martin Eisenschlos, Daniel Gillick, Jacob Eisenstein, William W. Cohen
Many facts come with an expiration date, from the name of the President to the basketball team LeBron James plays for. But language models (LMs) are trained on snapshots of data collected at a specific moment in time, and this can limit their utility, especially in the closed-book setting where the pretraining corpus must contain the facts the model should memorize. We introduce a diagnostic dataset aimed at probing LMs for factual knowledge that changes over time and highlight problems with LMs at either end of the spectrum: those trained on specific slices of temporal data, as well as those trained on a wide range of temporal data. To mitigate these problems, we propose a simple technique for jointly modeling text with its timestamp. This improves memorization of seen facts from the training time period, as well as calibration on predictions about unseen facts from future time periods. We also show that models trained with temporal context can be efficiently "refreshed" as new data arrives, without the need for retraining from scratch.
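The "simple technique" is to condition the LM on time with a textual timestamp prefix prepended to each training example (string prefixes, per the paper's presentation). A minimal sketch; the exact prefix format used in the paper may differ, so "year: YYYY" is an illustrative assumption:

```python
def add_time_prefix(text: str, year: int) -> str:
    """Prepend a timestamp prefix so the LM can condition on time.
    The "year: YYYY" format is an assumption for illustration."""
    return f"year: {year} {text}"

# During pretraining, each document is paired with its collection timestamp:
example = add_time_prefix("The president of Argentina is ...", 2019)

# At query time, the same prefix selects which temporal context to answer in,
# and new data with fresh timestamps can be used to "refresh" the model.
query = add_time_prefix("LeBron James plays for the", 2021)
```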
🚨 Can language models answer who won the FIFA World Cup, or who's the president of Argentina? How can we adapt models to factual information that changes over time? 🚨
— Julian Eisenschlos (@eisenjulian) June 30, 2021
We tackle these and related questions in our latest work https://t.co/0wdoLai0Kj
1/5
Language models encode world knowledge, but how does a static LM represent a changing world? ⏳What happens as the LM's training data recedes into the past?⏳Are we consigned to retrain a neverending series of BERTs? Answers to these questions & more: https://t.co/kkQMmSW3dk https://t.co/JEVhwPUZBn
— Jacob Eisenstein (@jacobeisenstein) June 30, 2021
Time-Aware Language Models as Temporal Knowledge Bases
— AK (@ak92501) June 30, 2021
pdf: https://t.co/1lzWYRn4bq
abs: https://t.co/dZkN7LJTHJ
propose a time-aware language model which conditions on time using string prefixes pic.twitter.com/hxJPtgyArm
5. An Image is Worth More Than a Thousand Words: Towards Disentanglement in the Wild
Aviv Gabbay, Niv Cohen, Yedid Hoshen
Unsupervised disentanglement has been shown to be theoretically impossible without inductive biases on the models and the data. As an alternative approach, recent methods rely on limited supervision to disentangle the factors of variation and allow their identifiability. While annotating the true generative factors is only required for a limited number of observations, we argue that it is infeasible to enumerate all the factors of variation that describe a real-world image distribution. To this end, we propose a method for disentangling a set of factors which are only partially labeled, as well as separating the complementary set of residual factors that are never explicitly specified. Our success in this challenging setting, demonstrated on synthetic benchmarks, gives rise to leveraging off-the-shelf image descriptors to partially annotate a subset of attributes in real image domains (e.g. of human faces) with minimal manual effort. Specifically, we use a recent language-image embedding model (CLIP) to annotate a set of attributes of interest in a zero-shot manner and demonstrate state-of-the-art disentangled image manipulation results.
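The zero-shot annotation step can be reproduced in spirit with the public CLIP package (pip install from https://github.com/openai/CLIP): score each image against natural-language attribute prompts and threshold the result. A rough sketch; the prompts, model variant, and threshold are illustrative assumptions, not the authors' exact setup:

```python
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Zero-shot "glasses" attribute for a face image, via contrastive prompts.
image = preprocess(Image.open("face.jpg")).unsqueeze(0).to(device)
prompts = ["a person with glasses", "a person without glasses"]
text = clip.tokenize(prompts).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Cosine similarities -> probabilities over the two attribute prompts.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

has_glasses = probs[0, 0].item() > 0.5  # zero-shot attribute label
```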
An Image is Worth More Than a Thousand Words: Towards Disentanglement in the Wild
— AK (@ak92501) June 30, 2021
pdf: https://t.co/vH9h1Ld50i
use CLIP to annotate a set of attributes of interest in a zero-shot manner and demonstrate sota disentangled image manipulation results pic.twitter.com/vKaZTWKKV5
6. The penumbra of open source: projects outside of centralized platforms are longer maintained, more academic and more collaborative
Milo Z. Trujillo, Laurent Hébert-Dufresne, James P. Bagrow
GitHub has become the central online platform for much of open source, hosting most open source code repositories. With this popularity, the public digital traces of GitHub are now a valuable means to study teamwork and collaboration. In many ways, however, GitHub is a convenience sample. We need to assess its representativeness, particularly how GitHub's design may alter the working patterns of its users. Here we develop a novel, nearly complete sample of public open source project repositories outside of centralized platforms like GitHub. We characterize these projects along a number of dimensions and compare them to a time-matched sample of corresponding GitHub projects. Compared to GitHub, these projects tend to have more collaborators, are maintained for longer periods, and tend to be more focused on academic and scientific problems.
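One of the paper's headline comparisons is how long projects are maintained. A crude proxy one can compute from any cloned repository, not necessarily the paper's exact measure, is the span between the first and last commit:

```python
import subprocess

def maintenance_span_days(repo_path: str) -> float:
    """Days between a repository's first and last commit, a rough proxy
    for how long the project was maintained. This is an illustrative
    operationalization, not the paper's published methodology."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%ct"],  # Unix timestamps
        capture_output=True, text=True, check=True,
    ).stdout.split()
    timestamps = sorted(int(t) for t in out)
    return (timestamps[-1] - timestamps[0]) / 86400
```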
Is GitHub representative of open source? What lies in the shadow of the giant?
— Laurent Hébert-Dufresne (@LHDnets) June 30, 2021
Project with @illegaldaydream & @bagrow:
The penumbra of open source: projects outside of centralized platforms are longer maintained, more academic and more collaborative https://t.co/Myykyro2RX
7. Scientific Credibility of Machine Translation Research: A Meta-Evaluation of 769 Papers
Benjamin Marie, Atsushi Fujita, Raphael Rubino
This paper presents the first large-scale meta-evaluation of machine translation (MT). We annotated MT evaluations conducted in 769 research papers published from 2010 to 2020. Our study shows that practices for automatic MT evaluation have changed dramatically during the past decade and follow concerning trends. An increasing number of MT evaluations rely exclusively on differences between BLEU scores to draw conclusions, without performing any kind of statistical significance testing or human evaluation, while at least 108 metrics claiming to be better than BLEU have been proposed. MT evaluations in recent papers tend to copy and compare automatic metric scores from previous work to claim the superiority of a method or algorithm, without confirming that exactly the same training, validation, and test data were used or that the metric scores are comparable. Furthermore, tools for reporting standardized metric scores are still far from widely adopted by the MT community. After showing how the accumulation of these pitfalls leads to dubious evaluation, we propose a guideline to encourage better automatic MT evaluation, along with a simple meta-evaluation scoring method to assess its credibility.
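The guideline's core recommendation, testing whether a BLEU difference is statistically significant rather than eyeballing it, is straightforward to implement. A sketch of paired bootstrap resampling (Koehn, 2004) over corpus BLEU with sacrebleu; note that recent sacrebleu releases also ship built-in significance tests:

```python
import random
import sacrebleu  # pip install sacrebleu

def paired_bootstrap(sys_a, sys_b, refs, n_samples=1000, seed=0):
    """Estimate how often system A beats system B in corpus BLEU over
    bootstrap-resampled test sets. sys_a, sys_b, refs: aligned lists of
    sentence strings. A sketch of the kind of significance testing the
    paper recommends, not its exact protocol."""
    rng = random.Random(seed)
    n = len(refs)
    wins = 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        a = [sys_a[i] for i in idx]
        b = [sys_b[i] for i in idx]
        r = [refs[i] for i in idx]
        if sacrebleu.corpus_bleu(a, [r]).score > sacrebleu.corpus_bleu(b, [r]).score:
            wins += 1
    return wins / n_samples  # fraction of resamples where A wins
```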
Spoiler: there is none. https://t.co/lDNeEZ6IVE
— Marcin Junczys-Dowmunt (Marian NMT) (@marian_nmt) June 30, 2021
8. A Mechanism for Producing Aligned Latent Spaces with Autoencoders
Saachi Jain, Adityanarayanan Radhakrishnan, Caroline Uhler
Aligned latent spaces, where meaningful semantic shifts in the input space correspond to a translation in the embedding space, play an important role in the success of downstream tasks such as unsupervised clustering and data imputation. In this work, we prove that linear and nonlinear autoencoders produce aligned latent spaces by stretching along the left singular vectors of the data. We fully characterize the amount of stretching in linear autoencoders and provide an initialization scheme to arbitrarily stretch along the top directions using these networks. We also quantify the amount of stretching in nonlinear autoencoders in a simplified setting. We use our theoretical results to align drug signatures across cell types in gene expression space and semantic shifts in word embedding spaces.
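The headline claim for linear autoencoders is easy to probe numerically: train a linear encoder and decoder on reconstruction, then check that the end-to-end map is supported on the top singular directions of the data. A toy sketch, with dimensions and training setup chosen purely for illustration:

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(1000, 20) @ torch.randn(20, 20)  # correlated 20-dim data

k = 5  # bottleneck width
enc, dec = nn.Linear(20, k, bias=False), nn.Linear(k, 20, bias=False)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    loss = ((dec(enc(X)) - X) ** 2).mean()  # plain reconstruction objective
    loss.backward()
    opt.step()

# Arrange the data as features x samples so "left singular vectors" live in
# feature space, matching the paper's phrasing.
D = X.numpy().T
U, S, Vt = np.linalg.svd(D, full_matrices=False)
W = (dec.weight @ enc.weight).detach().numpy()  # end-to-end linear map
P = U[:, :k] @ U[:, :k].T                        # projector onto top-k directions

# If the claim holds, W is (approximately) supported on that subspace,
# so this relative residual should be small after training.
print(np.linalg.norm(W - P @ W @ P) / np.linalg.norm(W))
```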
Autoencoders can produce aligned latent spaces by stretching along the top singular vectors of the data. This can be helpful for aligning meaningful directions in word embeddings and gene expression data. (Joint work with Adit Radha and Caroline Uhler). https://t.co/1x0n1kawPK pic.twitter.com/dRPuaCuXVp
— Saachi Jain (@saach_jain) June 30, 2021