All Articles

Hot Papers 2020-09-23

1. Ethical Machine Learning in Health

Irene Y. Chen, Emma Pierson, Sherri Rose, Shalmali Joshi, Kadija Ferryman, Marzyeh Ghassemi

The use of machine learning (ML) in health care raises numerous ethical concerns, especially as models can amplify existing health inequities. Here, we outline ethical considerations for equitable ML in the advancement of health care. Specifically, we frame ethics of ML in health care through the lens of social justice. We describe ongoing efforts and outline challenges in a proposed pipeline of ethical ML in health, ranging from problem selection to post-deployment considerations. We close by summarizing recommendations to address these challenges.

2. SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness

Nathan Ng, Kyunghyun Cho, Marzyeh Ghassemi

Models that perform well on a training domain often fail to generalize to out-of-domain (OOD) examples. Data augmentation is a common method used to prevent overfitting and improve OOD generalization. However, in natural language, it is difficult to generate new examples that stay on the underlying data manifold. We introduce SSMBA, a data augmentation method for generating synthetic training examples by using a pair of corruption and reconstruction functions to move randomly on a data manifold. We investigate the use of SSMBA in the natural language domain, leveraging the manifold assumption to reconstruct corrupted text with masked language models. In experiments on robustness benchmarks across 3 tasks and 9 datasets, SSMBA consistently outperforms existing data augmentation methods and baseline models on both in-domain and OOD data, achieving gains of 0.8% accuracy on OOD Amazon reviews, 1.8% accuracy on OOD MNLI, and 1.4 BLEU on in-domain IWSLT14 German-English.

3. A survey on Kornia: an Open Source Differentiable Computer Vision Library for PyTorch

E. Riba, D. Mishkin, J. Shi, D. Ponsa, F. Moreno-Noguer, G. Bradski

  • retweets: 585, favorites: 130 (09/24/2020 09:22:02)
  • links: abs | pdf
  • cs.CV

This work presents Kornia, an open source computer vision library built upon a set of differentiable routines and modules that aims to solve generic computer vision problems. The package uses PyTorch as its main backend, not only for efficiency but also to take advantage of the reverse auto-differentiation engine to define and compute the gradient of complex functions. Inspired by OpenCV, Kornia is composed of a set of modules containing operators that can be integrated into neural networks to train models to perform a wide range of operations including image transformations,camera calibration, epipolar geometry, and low level image processing techniques, such as filtering and edge detection that operate directly on high dimensional tensor representations on graphical processing units, generating faster systems. Examples of classical vision problems implemented using our framework are provided including a benchmark comparing to existing vision libraries.

4. A narrowing of AI research?

Joel Klinger, Juan Mateos-Garcia, Konstantinos Stathoulopoulos

  • retweets: 160, favorites: 29 (09/24/2020 09:22:02)
  • links: abs | pdf
  • cs.CY

Artificial Intelligence (AI) is being hailed as the latest example of a General Purpose Technology that could transform productivity and help tackle important societal challenges. This outcome is however not guaranteed: a myopic focus on short-term benefits could lock AI into technologies that turn out to be sub-optimal in the longer-run. Recent controversies about the dominance of deep learning methods and private labs in AI research suggest that the field may be getting narrower, but the evidence base is lacking. We seek to address this gap with an analysis of the thematic diversity of AI research in arXiv, a widely used pre-prints site. Having identified 110,000 AI papers in this corpus, we use hierarchical topic modelling to estimate the thematic composition of AI research, and this composition to calculate various metrics of research diversity. Our analysis suggests that diversity in AI research has stagnated in recent years, and that AI research involving private sector organisations tends to be less diverse than research in academia. This appears to be driven by a small number of prolific and narrowly-focused technology companies. Diversity in academia is bolstered by smaller institutions and research groups that may have less incentives to `race’ and lower levels of collaboration with the private sector. We also find that private sector AI researchers tend to specialise in data and computationally intensive deep learning methods at the expense of research involving other (symbolic and statistical) AI methods, and of research that considers the societal and ethical implications of AI or applies it in domains like health. Our results suggest that there may be a rationale for policy action to prevent a premature narrowing of AI research that could reduce its societal benefits, but we note the incentive, information and scale hurdles standing in the way of such interventions.

5. On the Theory of Modern Quantum Algorithms

Jacob Biamonte

This dissertation unites variational computation with results and techniques appearing in the theory of ground state computation. It should be readable by graduate students. The topics covered include: Ising model reductions, stochastic versus quantum processes on graphs, quantum gates and circuits as tensor networks, variational quantum algorithms and Hamiltonian gadgets.

6. Proposal of a Novel Bug Bounty Implementation Using Gamification

Jamie O’Hare, Lynsay A. Shepherd

Despite significant popularity, the bug bounty process has remained broadly unchanged since its inception, with limited implementation of gamification aspects. Existing literature recognises that current methods generate intensive resource demands, and can encounter issues impacting program effectiveness. This paper proposes a novel bug bounty process aiming to alleviate resource demands and mitigate inherent issues. Through the additional crowdsourcing of report verification where fellow hackers perform vulnerability verification and reproduction, the client organisation can reduce overheads at the cost of rewarding more participants. The incorporation of gamification elements provides a substitute for monetary rewards, as well as presenting possible mitigation of bug bounty program effectiveness issues. Collectively, traits of the proposed process appear appropriate for resource and budget-constrained organisations - such Higher Education institutions.

7. ALICE: Active Learning with Contrastive Natural Language Explanations

Weixin Liang, James Zou, Zhou Yu

Training a supervised neural network classifier typically requires many annotated training samples. Collecting and annotating a large number of data points are costly and sometimes even infeasible. Traditional annotation process uses a low-bandwidth human-machine communication interface: classification labels, each of which only provides several bits of information. We propose Active Learning with Contrastive Explanations (ALICE), an expert-in-the-loop training framework that utilizes contrastive natural language explanations to improve data efficiency in learning. ALICE learns to first use active learning to select the most informative pairs of label classes to elicit contrastive natural language explanations from experts. Then it extracts knowledge from these explanations using a semantic parser. Finally, it incorporates the extracted knowledge through dynamically changing the learning model’s structure. We applied ALICE in two visual recognition tasks, bird species classification and social relationship classification. We found by incorporating contrastive explanations, our models outperform baseline models that are trained with 40-100% more training data. We found that adding 1 explanation leads to similar performance gain as adding 13-30 labeled training data points.