May 3 · Issue #41
Monthly analysis of AI technology, geopolitics, research, and startups.
Nathan here from Air Street Capital. Welcome back to regular readers and hello to everyone who is new! Enclosed you’ll find Your Guide to AI: April 2020. I’ll cover key developments in AI tech, geopolitics, health/bio, startups, research and blogs. Drop me a line if you enjoyed the issue or want to share a critique :-)
Two announcements before we kick off:
- Our annual Research and Applied AI Summit on 26 June 2020 is now an online live stream. You can sign up here. We’ve got an awesome lineup featuring SiFive, Cerebras, Lyft Level 5, DeepMind, University of Vermont, Oxford Internet Institute, OpenMined, Moorfields Eye Hospital, Zama and more.
- Our London.AI meetup is also an online live stream. You can sign up here, join the Facebook group and watch previous editions on YouTube with Onfido, Wayve, PolyAI, Graphcore, Cera Care, and ZOE.
🏥 Healthcare and life science
Longitudinal, non-invasive, (quasi) real-time and cost-effective health monitoring is hugely important if we’re to improve public health globally. Naturally, connected devices and AI techniques will help scale up the analytical components of such endeavors. In an exciting study
published in Nature Biomedical Engineering, researchers from Stanford, South Korea and Leiden built a proof-of-concept mountable toilet system for personalized health monitoring via the analysis of excreta. This “smart” toilet system does a few nifty things: 1) it analyses a user’s urine by monitoring the color of urinalysis strips, 2) it uses computer vision as a uroflowmeter to measure flow rate and volume of urine, and 3) it classifies stool using deep learning, all with performance that is comparable to that of trained medical personnel. The system is personalised because each toilet identifies its user through a visual fingerprint derived from distinctive features of their anoderm. To start, this smart toilet can help establish a baseline of human health, which can then be followed up with biochemical assays to inform health states.
How do we protect workers in manufacturing and critical infrastructure facilities during COVID? While most large enterprises are using CCTV cameras for manual review of security and health and safety, they can now utilise computer vision software
to operationalise 24/7 health and safety during the COVID-19 pandemic. This includes thermal monitoring and social-distancing checks.
A number of Chinese clinical centers have been training and testing deep learning-based image analysis systems for COVID patients. A review
of these X-Ray and CT-based studies shows how these techniques could be useful for COVID identification, classification, and quantification.
You can track all relevant scientific literature and news emerging around COVID-19 through Primer.ai’s portal here
It was only in mid-February that the London-based AI-first drug discovery company Benevolent.ai used its knowledge graph approach to identify baricitinib as a potential therapeutic against COVID-19. Now, Eli Lilly, which developed the drug for other indications, has launched
a randomised trial to investigate the efficacy and safety of baricitinib as a potential treatment for patients with serious COVID-19 infections. This is hugely exciting, not only for global public health, but also as real-world validation that AI-first approaches can indeed rapidly accelerate the discovery of much needed therapeutics. On the topic of knowledge graphs, you can find an open-source COVID-19 knowledge graph here
Recursion Pharmaceuticals, the leading US-based AI-first drug discovery company, pointed its computer vision-based high-content phenotypic screening platform at 1,670 approved and reference drug compounds to discover which might rescue the COVID-19 phenotype in human kidney cells. The research
showed that only remdesivir had strong anti-viral phenotype efficacy. They found that neither chloroquine nor hydroxychloroquine had any beneficial effect in this human cell model. You can find the data open sourced here
🔮The politics of AI
Modern deep learning model performance is largely constrained by access to compute. A group of universities led by Stanford launched a call to action for a government-sponsored National Research Cloud
to boost American AI productivity. A few questions come to mind here: Will putting this into practice mean that the government issues an RFP in which the only likely winners are public AI hardware and cloud vendors? Or would new vendors like Graphcore and Cerebras stand to gain? At the end of the day, however, would such a facility be that different from the government handing out large cheques to research groups to spend on the same public clouds?
The US government continued its pressure
on Huawei after blacklisting the company last year. A proposed rule change would mean that foreign companies using US chip making equipment would be required to obtain a U.S. license before supplying certain chips to Huawei. In response, Huawei is shoring up
chip production away from Taiwan’s TSMC (which is directly affected by the aforementioned ruling) towards mainland China’s Semiconductor Manufacturing International Corp (SMIC). It’s estimated that Huawei accounts for 15% of TSMC’s revenues. Importantly, Huawei’s mobile Kirin processors that power the company’s flagship P30 and P20 smartphones can only be made by TSMC. Finally, Huawei’s revenues are turning more inward into China: the company now holds 41.4% of the domestic market
, up from 33.9% a year earlier.
🚗 Autonomous technology
As stay-at-home programmes continue around the world, the demand on logistics infrastructure continues to mount. This is perhaps the long-awaited demand-side tipping point for automated robot delivery providers including Starship, which is rapidly expanding
its contactless grocery delivery service. TuSimple and Nuro
are also expanding
their grocery fleets.
But it’s not just robots for delivery logistics that are seeing increased demand. Brain Corp, which supplies technology for a fleet of 10,000+ autonomous indoor surface cleaning robots, said that its robots are currently
doing 8,000+ hours of cleaning work a day. With increased attention to the hygiene of public spaces, expect this number to move up and to the right.
💪 The giants
DARPA selected Intel as its prime contractor for its four-year Guaranteeing AI Robustness against Deception (GARD) programme, which develops defences against adversarial attacks on ML systems.
Google worked with Copenhagen-based energy startup Tomorrow to develop a carbon-intelligent computing platform
that runs compute-intensive jobs at times when grid energy is most likely to come from low-carbon sources.
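The scheduling idea is straightforward: given an hourly forecast of grid carbon intensity, shift a flexible batch job into the cleanest window. A minimal sketch of that core idea, with an illustrative forecast (the function name and values are my own, not Google's API):

```python
def schedule_flexible_job(forecast, duration):
    """Pick the start hour whose `duration`-hour window has the lowest
    total forecast carbon intensity (gCO2eq/kWh)."""
    windows = range(len(forecast) - duration + 1)
    return min(windows, key=lambda s: sum(forecast[s:s + duration]))

# Hypothetical hourly forecast: overnight wind makes hours 2-3 cleanest.
forecast = [300, 250, 100, 120, 400]
start = schedule_flexible_job(forecast, duration=2)
```

Here a 2-hour job would be queued at hour 2, where the windowed intensity (100 + 120) is lowest; the production system additionally has to respect job deadlines and datacenter capacity.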
🇨🇳 AI in China
There’s no doubt that deep fake R&D is advancing rapidly. Jeff Ding shared a translation
that recounts how a Chinese web drama TV show Love of Thousand Years
went ahead and face swapped an actress out of the show in post-production. Why? The actress in question was put on an artist blacklist after public transport authorities said she’d been travelling with more flammable compressed gas cans than allowed. Putting her actions aside for a second, this face-swapping episode should catalyse standards around IP ownership
of one’s identity for use in media.
Fermented electronic components? Zymergen announced its newest product called Hyaline
, in collaboration with Japan’s Sumitomo Corporation. The material is produced in bacteria and can be used to create thinner films that are foldable, flexible, and more durable. This has relevance in the production of full-screen touch sensors. To learn more about Zymergen, check out Aaron Kimball’s talk at RAAIS 2019 here
Google is rumored
to be bringing its mobile device chips in-house, e.g. those for the Pixel smartphone that are currently supplied by Qualcomm. Google is said to have collaborated with Samsung on a 5-nanometer chip that would replace Qualcomm’s in a year’s time. This will give Google more full-stack control over its devices and help the company improve its bill-of-materials margins.
Here’s a selection of impactful work that caught my eye, grouped in categories:
Longformer: The Long-Document Transformer, Allen Institute for Artificial Intelligence
. BERT’s self-attention mechanism means that its memory requirement scales quadratically with input sequence length. This makes it intractable for use on long documents. In this paper, the authors describe a drop-in replacement for the default self-attention that instead scales linearly with sequence length. They show how this Longformer
has improved accuracy on long-document Q&A tasks over RoBERTa.
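To see why a sliding window helps, count the attended query–key pairs: full self-attention touches n² pairs, while a window of radius w touches only O(n·w). A toy illustration of the masking idea (not the paper's implementation, which also gives a few tokens task-specific global attention):

```python
def sliding_window_mask(n, w):
    # Longformer-style local attention: position i may attend to j iff |i - j| <= w.
    return [[abs(i - j) <= w for j in range(n)] for i in range(n)]

def attended_pairs(mask):
    # Count the True entries, i.e. the query-key pairs actually computed.
    return sum(sum(row) for row in mask)

full = attended_pairs([[True] * 8 for _ in range(8)])   # 64 pairs, O(n^2)
local = attended_pairs(sliding_window_mask(8, 1))       # 22 pairs, O(n*w)
```

For a 4,096-token document the quadratic count is ~16.8M pairs, while a radius-256 window needs ~2.1M, which is what makes long documents tractable.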
Recipes for building an open-domain chatbot
, Facebook AI Research
. In this paper, the authors open-source Blender, an impressive chatbot that’s able to hold open-domain conversations. First, they train a 9.4 billion parameter Transformer model on 1.5 billion training examples of conversations extracted from public sources. Next, they use a novel task called Blended Skill Talk
, for training and evaluating a bot’s performance across knowledge, empathy, and personality. They compare Blender with Google’s Meena bot and show that human evaluators say that Blender sounds more human than Meena in 67% of conversations.
Improving 3D Object Detection through Progressive Population Based Augmentation
. This paper introduces a new automated data augmentation algorithm termed Progressive Population Based Augmentation
(PPBA). The authors use PPBA to optimize augmentation strategies of 3D point cloud data by narrowing down the search space and adopting the best parameters discovered in previous iterations. They show that PPBA may be up to 10x more data efficient than baseline 3D detection models without augmentation.
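As a rough intuition for the search procedure, here is a toy population-based loop over a single augmentation parameter. PPBA itself searches over many 3D point-cloud augmentation operations and narrows the search space progressively; this sketch only gestures at the "reuse the best parameters from previous iterations" idea, and all names and values are illustrative:

```python
import random

def toy_ppba_search(score_fn, generations=10, pop_size=4, seed=0):
    # Population-based search over one augmentation magnitude in [0, 1]:
    # each generation keeps the best member found so far (elitism) and
    # mutates it, rather than sampling the whole space from scratch.
    rng = random.Random(seed)
    population = [rng.uniform(0.0, 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        best = max(population, key=score_fn)
        population = [best] + [
            min(1.0, max(0.0, best + rng.gauss(0.0, 0.1)))
            for _ in range(pop_size - 1)
        ]
    return max(population, key=score_fn)
```

In the real algorithm `score_fn` would be validation performance of a detector trained with that augmentation schedule, which is exactly why cutting the number of evaluations matters.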
OpenAI Microscope, OpenAI. This is a neat online collection of visualizations of every significant layer and neuron of eight important vision models (e.g. AlexNet, the Inception family, ResNet, and VGG). You can select an object (e.g. clock, dog, flower) and visualise how features are represented per neuron in each layer.
An empirical investigation of the challenges of real-world reinforcement learning
, Google Research and DeepMind
. RL has proved its worth in simulation environments. Recently, DeepMind shared Agent57
, a deep RL agent capable of achieving a score above human performance on all 57 Atari 2600 games. Progress in RL for games is due in part to the fact that simulations can be infinitely sampled, data points don’t cost much, there’s almost no system noise or latency, the state/action spaces are usually small, etc. In this work, the authors summarise a set of 9 challenges that preclude the advancement of RL systems from simulation to real-world physical systems (below). For each of these challenges, they provide a formal definition in the context of a Markov Decision Process and analyze the effects of the challenge on state-of-the-art learning algorithms. This is open sourced here
Learning Agile Robotic Locomotion Skills by Imitating Animals
, UC Berkeley
. This paper shows how reinforcement learning, imitation learning and real-world adaptation can enable legged robots to learn from motion capture footage of dogs and artist-created animations and perform a wide range of agile skills. They use a single learning-based approach to automatically synthesize controllers for a diverse repertoire of behaviors for legged robots. Adaptive policies are learned in simulation and can be quickly adapted for real-world deployment. Consider how this applies to Boston Dynamics-inspired quadruped robots, which are getting much cheaper nowadays ($10k in China).
Chip Placement with Deep Reinforcement Learning
. This paper presents a learning-based approach to placing components onto a chipboard. In particular, the RL agent learns to place a netlist graph of macros (e.g., SRAMs) and standard cells (logic gates, such as NAND, NOR, and XOR) onto a chip canvas, such that power, performance, and area (PPA) are optimized, while adhering to constraints on placement density and routing congestion. They show that in under 6 hours of training, the RL agent generates super-human placement decisions that would take humans weeks to attain.
First return then explore, Uber AI and OpenAI
. This paper presents a family of algorithms that address two key issues of RL methods: a) an agent forgetting how to reach previously visited states and b) failing to first return to a state before exploring from it. To overcome this, Go-Explore
builds an archive (memory) of the states it has visited in an environment. This way, the agent can reliably return to promising states before exploring further from them, and once exploration is done it can use the successful trajectories to train a robust policy by learning from these demonstrations. The authors show that Go-Explore
solves unsolved and hard-exploration Atari games and tackles hard-exploration robotics tasks.
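The "first return, then explore" loop is easy to sketch on a toy problem: keep an archive of visited "cells", deterministically return to an archived cell (here by replaying its stored action trajectory), then explore randomly from it. This 1-D chain is of course far simpler than Atari, and everything below is illustrative rather than the paper's implementation:

```python
import random

def go_explore_toy(goal=20, iterations=200, seed=0):
    # Toy Go-Explore loop on a 1-D chain of states 0..goal.
    # The archive maps each discovered state ("cell") to the action
    # trajectory that reached it. Each iteration *first returns* to an
    # archived state by restoring its trajectory, then *explores* from it.
    rng = random.Random(seed)
    archive = {0: []}
    for _ in range(iterations):
        state = rng.choice(sorted(archive))     # pick a cell to return to
        trajectory = list(archive[state])       # restore how we got there
        for _ in range(5):                      # short random exploration
            action = rng.choice([-1, 1])
            state = max(0, min(goal, state + action))
            trajectory.append(action)
            if state not in archive:            # remember newly found cells
                archive[state] = list(trajectory)
        if goal in archive:                     # stop once the goal appears
            break
    return archive
```

The key invariant is that every archived trajectory replays exactly to its cell, which is what lets the agent return without re-solving exploration each time.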
Systems and methods
Explainable machine learning in deployment, Partnership on AI, CMU, Cambridge, et al.
The authors study how 50 people within 30 organisations understand and use explainable ML in production. They find that the majority use feature importance as the explainability technique and use explainability for the purposes of model debugging by ML engineers, not for decision explainability by the end recipient of the prediction.
Evolving Normalization-Activation Layers, Google and DeepMind.
In this paper, the authors revisit the co-design of normalization and activation layers by formulating them as a single building block using AutoML. This is interesting because neural network layers can take years of research to design: e.g. BatchNorm-ReLU. The layer search approach led to the discovery of EvoNorms, a set of new normalization-activation layers that go beyond existing design patterns. EvoNorms compare favorably against BatchNorm and GroupNorm layers in image recognition models. For more on AutoML, check out a paper from March called AutoML-Zero
, which demonstrates that evolutionary search can discover complete ML algorithms from just basic mathematical building blocks.
FALCON: Honest-Majority Maliciously Secure Framework for Private Deep Learning
, Princeton, Algorand, Microsoft.
This paper proposes FALCON, an end-to-end 3-party protocol for fast and secure computation of deep learning algorithms on large networks. They also show that for multi-party machine learning computations over large networks and datasets, compute operations dominate the overall latency, as opposed to the communication.
NBDT: Neural-Backed Decision Trees
, UC Berkeley and Boston University.
This paper explores how to make SOTA CV models more interpretable. Their NBDT solution does not require special architectures insofar as any image classification neural network can be transformed into an NBDT by fine-tuning with a custom loss. They show that NBDTs achieve competitive performance on image recognition datasets and are substantially more accurate than comparable decision tree models.
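The interpretability win is that classification becomes a sequence of readable decisions down a class hierarchy rather than one flat softmax. A toy version of the inference pass only (the hierarchy and scores below are made up; the real NBDT derives both from a trained network's weights and a tree-supervision loss):

```python
def nbdt_infer(hierarchy, node_scores):
    # Descend the class hierarchy from the root, at each step taking the
    # child with the highest score, until reaching a leaf label. Each hop
    # is an inspectable intermediate decision.
    node, path = hierarchy, []
    while isinstance(node, dict):
        choice = max(node, key=lambda child: node_scores[child])
        path.append(choice)
        node = node[choice]
    return node, path

# Hypothetical two-level hierarchy and per-node scores.
hierarchy = {"animal": {"cat": "cat", "dog": "dog"},
             "vehicle": {"car": "car", "plane": "plane"}}
scores = {"animal": 0.8, "vehicle": 0.2,
          "cat": 0.3, "dog": 0.7, "car": 0.5, "plane": 0.5}
label, path = nbdt_infer(hierarchy, scores)
```

The returned path ("animal", then "dog") is exactly the kind of intermediate explanation a flat classifier cannot provide.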
Science (bio, health, etc.)
Unveiling the predictive power of static structure in glassy systems, DeepMind and Google Brain
. Following last edition’s feature of graph neural networks simulating complex physics, this paper shows how related models can extract predictive information solely from the static structure of a glassy system. These graph neural networks can use the initial particle positions to predict how the glass will evolve over time across a wide range of temperatures, pressures and densities. In doing so, the authors show that machine learning can not only make quantitative predictions but can also learn a qualitative understanding of a complex physical system.
Please, show me something that’s not deep learning
A Framework for Interdomain and Multi-output Gaussian Processes (GPs), PROWLER.io and Imperial College London.
Gaussian processes (GPs) are probability distributions over functions with many properties which make them convenient for use in Bayesian models.
Making GPs work within deep learning requires bespoke derivations and implementations for small variations in the model or inference. To overcome this limitation, the authors propose a framework for interdomain GPs. This is a mathematically elegant and versatile way to define features in a domain different from the original input domain of the problem.
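For readers newer to GPs: a GP prior says that the function values at any finite set of inputs are jointly Gaussian, with a covariance matrix given by a kernel. A minimal squared-exponential kernel as background illustration only (this is not the paper's interdomain framework):

```python
import math

def rbf_kernel(xs, lengthscale=1.0):
    # Squared-exponential covariance: nearby inputs get strongly
    # correlated function values; distant inputs become nearly independent.
    return [[math.exp(-0.5 * ((a - b) / lengthscale) ** 2) for b in xs]
            for a in xs]

K = rbf_kernel([0.0, 1.0, 2.0])  # 3x3 symmetric matrix with unit diagonal
```

Interdomain GPs keep this Gaussian machinery but define inducing features through linear functionals of the function (e.g. integral projections) rather than point evaluations, which is what makes them live in a different domain from the inputs.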
Blogs and reports
An essay arguing why speech to text has not reached its “ImageNet moment” yet: a mix of reasons, ranging from challenges around dataset annotation for speech and building models on private data to complicated frameworks and more.
A review of how the vanilla Transformer can be improved for longer-term attention span, lower memory and computation consumption, and solving RL tasks.
Hugging Face published a great video review of recent advances in NLP R&D here
Datasets and benchmarks
A new release of the world’s most diverse publicly available dataset for lifelong place recognition. This has significant utility for augmented reality and the large-scale 3D reconstruction tasks that power autonomous mobile robots. The dataset has >1.6 million images from 30 cities, all tagged with sequence information and geo-located with GPS and compass angles. Importantly, the images cover all seasons over a nine-year period at different times of day and in different structural settings.
Open source tools
The ML Code Completeness Checklist here
assesses a code repository based on the scripts and artefacts that have been provided within it. This effort is really important to ensure reproducibility and accountability in publishing. The authors show that NeurIPS 2019 GitHub repos with the most stars correlate with having high scores on the Checklist.
Here’s a highlight of the most intriguing financing rounds:
The Brussels and NYC-based leader in enterprise data cataloguing raised
$112.5M at a $2.3B post-money valuation led by ICONIQ and Index. The company has 450 customers that use its software products for data privacy and protection, compliance and risk mitigation, operational efficiency and cost reduction.
The US-based provider of RPA software for healthcare raised
$51M in a round led by General Catalyst. The company says its software automates healthcare’s “most repetitive, high-volume administrative processes, across departments such as Revenue Cycle, Information Technology, Supply Chain, Clinical Administration, Human Resources and more, to deliver improved efficiency, reduced costs, and increased employee capacity.”
Octopus Energy, the UK-based direct-to-consumer energy challenger, raised
£300m at a £1.5B post-money valuation. Octopus had previously acquired
the team from Usio to build its ML capabilities, which drive its agile tariffs.
An SF-based spinout from Uber’s core ML platform team raised
a $20M Series A led by Sequoia and a16z. The company focuses on helping organizations build production-level machine learning systems, put them in production and operate them correctly. In particular, they focus on feature engineering pipelines, feature storage, and feature serving, which should help cross-team collaboration on ML models.
A US-based self-driving software-only company raised
a $13M Seed round to use unsupervised learning to drive cars.
A Prague-based startup offering protection from adversarial ML and advanced fraud raised
a $2.75M Seed round led by Index and Credo Ventures. The founders had previously sold
an AI-first cybersecurity startup to Cisco in 2013.
A Paris-based privacy-tech company powering innovation on data raised
a €2M Seed led by Serena.
NB: The exit market was expectedly quiet in April 2020 due to COVID-19.
Nathan Benaich, 3 May 2020
Air Street Capital is a venture capital firm that invests in AI-first technology and life science companies. We’re a team of experienced investors, engineering leaders, entrepreneurs and AI researchers from the world’s most innovative technology companies and research institutions.