🌱Your guide to AI: February 2021

Nathan Benaich, Air Street Capital
Dear readers,
Welcome to Your guide to AI. Here you’ll find an analytical (and opinionated) narrative covering key developments in AI over February 2021. 
If you’d like to chat about something you’re working on, share some news before it happens, or have feedback on the issue, just hit reply! I’m looking to add 5 seed stage AI-first companies to the Air Street Capital portfolio this year - thanks for last time’s awesome referrals 💎
If you enjoyed the read, I’d appreciate you hitting forward to a couple of friends 🙏

🆕 AI in Industry
🏥 Life (and) science
Continuing on the theme of reproducible and relevant clinical AI systems, the DECIDE-AI Steering Group proposed a new step that puts AI models through their paces before entering large-scale clinical trials. This is similar to how drugs must complete a phase 1/2 trial or surgical innovations a stage 2a/2b trial. Specifically, this intermediate testing phase should evaluate how humans make decisions as they interact with an AI model in the wild, which isn’t always how the developer intended. Human factors might come into play, such as needing additional variables to make sense of algorithmic recommendations or perhaps tweaks to how the model integrates into the workflow. Around the same time, the FDA published an action plan motivating the regulation of AI-based software as a medical device. 
Indeed, rigorous real-world testing is even more important because there is evidence that human doctors are susceptible to taking bad advice, whether it’s from a human or from an AI. This was shown in an experiment that gave 250+ radiologists 8 classically difficult clinical cases to diagnose with 6 correct and 2 incorrect suggestions (from an AI system or a radiologist). The over-reliance effect was more pronounced for doctors with less training, which highlights that AI in healthcare is really not as simple as deploying a model with the best ROC curves. 
Another recurring topic in healthcare is data. There is a lot of discussion today about sharing data using new privacy-preserving technologies that promise to unlock data silos. Progress appears to be slow: is this because of bad data infrastructure, privacy, or inertia amongst key players? Perhaps some answers could come from the Human Genome Project. This herculean multi-center project to sequence and assemble the first human genome was committed to doing so in the open from the start. In 1996, researchers laid out the Bermuda Principles, in which all parties agreed to publish all human genome sequences in public databases within 24 hours without delay or exception. Twenty years later, however, the situation is less rosy: “Researchers tell tales of spending months or years tracking down data sets, only to find dead ends or unusable files. And journal editors and funding agencies struggle to monitor whether scientists are sticking to their agreements.” It’s telling that the problem of data standards, interoperability, and sharing still persists despite the project being open by design. In this project, data is stored in more than one place, researchers tend to deposit the bare minimum to meet compliance requirements, and getting data out is hard. There is no universal policy on format, database, or sharing. Food for thought.
Next, more news in clinical trial land. Following the first trial of an AI-designed therapeutic agent thanks to Exscientia, we now have (I believe?) the second trial of an AI-selected drug, this time thanks to BenevolentAI. BEN-2293, a novel topical multi-target drug, is designed to treat atopic dermatitis - a rather nasty and chronic inflammatory skin condition (a few of my PhD lab mates studied it). This drug’s mechanism of action means that it can treat the inflammatory symptom as well as the itch. Fingers crossed it works well!
ML is also growing in relevance for energy and climate change. In the US, the Solar Energy Technologies Office funding program announced $7.3M for projects that focus on ML solutions that improve the affordability, reliability, and value of solar technologies on the US grid. Congrats to team Camus!
🌎 (geo)politics of AI
In last month’s newsletter, I wrote about the UK’s new AI Roadmap - a call to action and set of recommendations to make the UK a (the most?) compelling place to do AI. The country is building on strong foundations in two areas that matter a ton: talent and research. While the Roadmap suggests doubling down on these two vectors, I was surprised to stumble across a report from the non-profit Civitas entitled Inadvertently Arming China? The Chinese military complex and its potential exploitation of scientific research at UK universities. A key finding is that “over half of the 24 Russell Group universities and many other UK academic bodies have or have had productive research relationships with Chinese military-linked manufacturers and universities. Much of the research at the university centres and laboratories is also being sponsored by the UK taxpayer through research councils, Innovate UK, and the Royal Society.” Furthermore, it turns out that almost 20% of the high-impact research in STEM published from the UK is in collaboration with Chinese researchers. This exposes two problems:
First, it tells the story that many of the best UK universities are selling their AI research to the highest international bidder. The report pulls out examples of labs or individuals, and sometimes an institute or two, that have accepted financial support from the Chinese government/military complex. While financial figures aren’t quoted, I can’t imagine that we’re talking about colossal sums that couldn’t otherwise be filled by domestic UK companies or government budgets. For example, the UK Ministry of Defence received a $22B budget boost last year. As a side note, selling to the highest bidder happens across the stack in the UK, from research through to public companies. 
Second, it highlights the need for a holistic AI strategy that gets all key actors on the same page. UK universities are publicly-funded institutions, so the government has a key role to play in ensuring that they are properly funded and not made reliant on dubious, controversial, or ultimately sanctioned funding sources that could represent more risks than gains to the country in the long term. There is of course a lot of precedent here that the Civitas report reopens. Adding more color to China’s side of this debate is a report by CNAS on the myths and realities of China’s military-civil fusion strategy. Perhaps it is these challenges that motivate China to seek research in the UK… They write: “Over the past 30 years, China’s defense sector has been primarily dominated by sclerotic state-owned enterprises that remain walled off from the country’s dynamic commercial economy. At its core, MCF is intended as a remedy to this problem….Still, only a small proportion of private companies have participated in defense projects, and enterprises that are developing technologies relevant to the military have found cutting through the red tape involved in procurement to be cumbersome, not unlike the frustrations of their American counterparts.”
The UK also announced a new Advanced Research and Invention Agency with the goal of funding high-risk, high-reward scientific research to the tune of £800M. At the moment, the agency is recruiting a leadership team - so its ability to deliver will depend heavily on who ends up at the helm. Watch this space. 
Meanwhile, Huawei is contesting its ban in the United States, stating that the FCC’s ruling is “arbitrary, capricious…and not supported by substantial evidence.” Here is a timeline of the ban.
More news on facial recognition: Virginia approved state-wide limits on police use of facial recognition after a number of wrongful arrests finally pressured authorities to rethink their reliance on frail technology. Next, a new service called Exposing.AI was launched to show consumers how facial recognition technologies had been trained using millions of personal photographs from Flickr and SmugMug. For example, the MegaFace dataset was created in 2015 by University of Washington researchers without the knowledge or the consent of the people whose images they used. You can check whether your own Flickr photos have been included in one of 6 image datasets here. I gave it a test with my old Flickr account, but thankfully my photos weren’t interesting enough to make the cut. 
🍪 Hardware
In the last issue, we discussed the shortage of semiconductors felt by the automotive industry, in addition to big-ticket plans for investing to create a European domestic semiconductor market. News emerged that the EU project could involve TSMC and Samsung (not sure how anything can come about in a serious way without them?). Indeed, Europe deprioritised semiconductor manufacturing in the last 20 years and now insiders feel that new initiatives are too little, too late: “If you think that you can actually replicate [a well-oiled global supply chain] within a very short time, it’s simply not possible,” says ASML’s CEO. If Europe is to achieve technological sovereignty, especially in deep technology, it must absolutely pull out the stops to build advanced foundries in Europe as soon as possible. I believe the financing is available and willing, but the know-how is severely lacking. 
Meanwhile, in Taiwan, TSMC is hard at work building their latest fab facility to launch the 3-nanometer platform by H2 2022. The company is paying a 2x bonus to workers if they continue to work during the Lunar New Year because TSMC is adamant not to lose any time. Moreover, President Biden reaffirmed the US’ “rock-solid” commitment to “assisting Taiwan in maintaining a sufficient self-defense capability” while the US continues to decouple from China. 
Relatedly, NVIDIA’s Arm acquisition is attracting even more heat. This time, Qualcomm is said to have told regulators including the US FTC, the European Commission, the UK’s CMA and China’s SAMR that they oppose the deal. Google, Microsoft and Graphcore also protest the deal. As a reminder, Ian and I predicted in the State of AI Report 2020 that this deal would not be consummated. Meanwhile, NVIDIA’s VP of Applied Deep Learning Research - Bryan Catanzaro - said that it’s entirely possible that “in five years, a company could invest one billion dollars in compute time to train a single language model.”
Industrial robotics are finally having their moment to shine. Stats in the US show that companies ordered 64% more robots in Q4 2020 than 12 months prior, lifting the annual total up by 3.5%. Of note, it wasn’t the auto industry that generated this demand: Robot orders from food and consumer goods, life sciences, and rubber and plastics industries rose 50% YoY. 
In autonomy land, Aurora entered into a long-term partnership with Toyota and Denso to develop and test their Aurora Driver by the end of 2021. Oxbotica completed an autonomous vehicle trial over 180km on a bp refinery through day and night, fog, rain, and sunshine, while operating around machinery on the refinery. Note there are no road signs or road markings here :-)
🏭 Big tech
Baidu became the sixth company to receive a fully autonomous testing permit from the California DMV (after Cruise, Waymo, Nuro, Zoox, and AutoX). The company received the first license to test fully autonomous vehicles on public roads in China last December. A historical anecdote from Cade Metz’s “Genius Makers” (an end-to-end insider’s account of 60 years of modern deep learning): Baidu’s AV project was championed by Qi Lu, who joined the company as COO in 2017 after leaving Microsoft. Qi had wanted Microsoft to build an AV as a means of forcibly exploring state-of-the-art technology in domains outside of Microsoft’s comfort zone so that it could compete in the deep learning race against Google. His project wasn’t approved back then. After arriving at Baidu, he was convinced that China would get AVs onto public roads and into consumers’ hands far quicker than the US would due to the willingness of cities to retrofit their infrastructure to suit AVs.
Text predictions, which Microsoft has tested in beta with Outlook and Word users since September last year, are slated for general release in March this year. This is a huge step for NLP in productivity. Office 365 is the canonical business software in use by over 1M businesses with over 200M monthly active users. Microsoft Bing also announced Speller100: a zero-shot transformer-based spelling correction service that scales to 100+ languages thanks to carefully designed large-scale pre-training tasks. They found that users clicked on spelling suggestions (15% of queries have spelling mistakes) 67% of the time. 
It is now well known (in our State of AI Report) that big tech is absorbing huge numbers of talented professors and students in AI. New work came out by Nesta and Aalborg University called “The privatization of AI Research(-ers): Causes and Potential Consequences”. It finds that the academia to industry transition of researchers is outpacing industry to academia. The former transition accounts for 25% of the transitions completed by Top 5 universities but only 10% for Top 500 universities. Google and Microsoft are by far the most popular destinations, except for Princeton, which feeds Siemens (not sure why?). 
Twitter ran an analyst day (transcript here) in which their CTO, Parag, described the company’s use of machine learning. Of note, deep learning adoption has gone from 15% to 60% of all ML models over the last two years. A large fraction of the 3x mDAU growth in the last 3 years is driven by ML model improvements to content relevance. The company said that 50% of their rule enforcement against abuse or harm is done proactively including through ML-based automation. Abuse reports are down 40% thanks to changes to the home timeline using ML modeling. Twitter will up its investment in ML technology as well as research including recommender systems, NLP, and graph ML.
🔬Research
Here’s a selection of impactful work that caught my eye, grouped into categories:
Multi-task reinforcement learning in humans, Harvard and Max Planck. This paper designed experiments to test whether humans do multi-task RL and if so, what is the most likely algorithm they use. Their results suggest that humans learn a task by mapping previously learned policies onto novel scenarios. 
Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning, Google Research. In the last 2-3 years, there has been a set of exciting papers on model-based reinforcement learning (MBRL) in which an agent learns a model of its environment (a “world model”) that it uses to predict both what the future scene looks like and the outcomes of potential actions. This paper shows that predicting future images, despite being computationally and representationally costly, is key to the success of MBRL. In fact, DreamerV2 - the first world model-based RL agent to achieve top-level performance on the Atari benchmark - is now out. 
First return, then explore, UberAI (and now OpenAI). Published in Nature, this paper investigates the problem that RL agents have in dealing with sparse reward feedback. They introduce Go-Explore, a family of algorithms that make agents explicitly remember promising environment states and return to those states before intentionally exploring. These agents solve all previously unsolved Atari games and set new SOTA on all hard exploration games.
A Deep Learning Approach for Characterizing Major Galaxy Mergers, DeepMind and many collaborators. Galaxy formation and evolution are a key part of cosmology. A simplified theory says that galaxies cluster and merge to accumulate mass - the way they do this influences their shape and structure. Doing empirical experiments to test ideas is impossible, so the field compares theories to observations. But even so, observations of how mergers affect galaxy morphology are elusive. To make strides in this direction, it’s important to determine galaxy merger status from observations. This paper shows for the first time how deep learning can be used to predict the merger stage from a single image after learning from simulated merger events. Well done, Trev and team! 
Interpretable discovery of new semiconductors with machine learning, Toronto. The paper describes an evolutionary algorithm-powered search that uses machine-learned surrogate models trained on high-throughput hybrid functional density functional theory data that is also benchmarked experimentally. They show how efficient search through materials space can generate UV emission candidates with target properties that are validated empirically. 
Extraction of protein dynamics information from cryo-EM maps using deep learning, Kyoto and Riken. Cryo-EM is a powerful technique that is used to solve 3D protein structures by freezing proteins and shooting electrons at them to illuminate the source and visualizing the surface through a microscope. This process captures a protein in one particular 3D state, of which many exist. The procedure is too expensive to keep repeating to generate lots of possible conformations. This paper uses deep learning to recreate this dynamic 3D profile from cryo-EM maps. Super neat. 
“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI, Google Research. More and more companies are expressing interest in data quality monitoring and remediation, not only in machine learning. This paper empirically examines data quality practices in domains where AI-generated mistakes are not tolerated through interviews with 53 AI practitioners. They find that 92% of participants encountered data cascades, “the problem of compounding events causing negative downstream effects from data issues, that result in technical debt over time.” 
Uncovering Unknown Unknowns in Machine Learning, Google Research. This is a cool initiative that hopefully reflects how dataset creators have become aware that there is a gap between benchmarks and the real world that needs fixing. The paper explains a new data challenge to crowdsource adverse test sets for machine learning - these are examples that are confusing or otherwise problematic for models to process. 
The GEM Benchmark: Natural Language Generation (NLG), its Evaluation and Metrics, lots of collaborators. This paper introduces a new living benchmark for NLG in which models can be easily applied to a wide range of testing corpora and evaluation strategies. As such, it overcomes the issue with static benchmarks, which express performance as a single number whose meaning isn’t always clear. All of the model outputs on GEM will be open-sourced to support evaluation research and the integration of new metrics. 
Transformer in Transformer, Huawei and State Key Lab of CS. Transformers have come to computer vision. However, the typical approach is to model images as a sequence of patches, which ignores the intrinsic structure information inside each image patch. This paper describes a way of modeling both patch-level and pixel-level representations by stacking Transformer-iN-Transformer blocks. They show some improvement on ImageNet.
Calibrate Before Use: Improving Few-Shot Performance of Language Models, Berkeley, Maryland and UC Irvine. This paper shows GPT-3’s few-shot learning capabilities are unstable, meaning that the prompt format, training examples, and even the order of the training examples can cause accuracy to vary from near chance to near SOTA. The authors propose first estimating the model’s bias for each answer and then fitting calibration parameters onto it. 
High-Performance Large-Scale Image Recognition Without Normalization, DeepMind. This paper designs an improved class of computer vision models (ResNet) that is smaller and faster to train while achieving a new SOTA top-1 accuracy on ImageNet. Of note, this new system does away with normalization layers that are currently a key component of most image classification systems. This means no more fiddling with batch size, and removal of a computationally expensive primitive that introduces discrepancy between the behavior of the model during training and at inference time. 
Convolution-Free Medical Image Segmentation using Transformers, Harvard Medical School. This paper is another in the trend of throwing away convolutions and replacing them with self-attention between neighboring image patches to achieve competitive or better segmentation results.
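To make the Calibrate Before Use idea above a bit more concrete, here is a minimal sketch of how I understand it. You ask the model for its label probabilities on a content-free input (something like “N/A”) to estimate its built-in bias, then rescale later predictions so that the content-free input would come out uniform. The function names and all of the numbers are hypothetical, and the sketch simply renormalizes the rescaled probabilities, which may differ from the paper’s exact procedure.

```python
def normalize(probs):
    """Rescale non-negative scores so they sum to 1."""
    total = sum(probs)
    return [p / total for p in probs]


def fit_calibration(content_free_probs):
    """Estimate per-label weights from the model's output on a content-free input.

    Assumes every probability is strictly positive. The weights are the
    inverse of the bias, so a label the model favors by default is scaled down.
    """
    bias = normalize(content_free_probs)
    return [1.0 / p for p in bias]


def calibrate(probs, weights):
    """Apply the per-label weights and renormalize."""
    scaled = [w * p for w, p in zip(weights, normalize(probs))]
    return normalize(scaled)


# Hypothetical numbers: the model leans towards label 0 even with no real input.
content_free = [0.7, 0.3]
weights = fit_calibration(content_free)

# A raw prediction that only favors label 0 about as much as the bias does.
raw = [0.6, 0.4]
calibrated = calibrate(raw, weights)
print(calibrated)
```

In this toy case the raw prediction favors label 0, but less strongly than the model’s default bias does, so the calibrated prediction flips to label 1.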
📑 Resources
Papers with Code announced two new products: Datasets and Libraries. The first indexes 3,000+ research datasets from machine learning and lets you search by task and modality, compare usage over time, and browse benchmarks. The second lets you visualize architectures and compare results and hyperparameters. 
Papers With Video is a new Chrome plugin that checks whether an arXiv paper also appears as a video talk and gives you a link to watch it. It covers 3,874 papers today.  
Artificial intelligence in longevity medicine - a short commentary on how AI for drug discovery is making its way into decoding aging too. 
TracIn - a simple method to estimate training data influence, i.e. the degree to which a specific training example affects a model’s performance (especially in deep learning). 
Generating design systems using deep learning - a post from Tony at Uizard on their latest research-led product. 
Facebook released a new benchmark and model for continual learning, in which a model applies knowledge from prior tasks to solve new ones (rather than retraining from scratch every time). 
Apple released a paper detailing its on-device federated learning system. To me, the most obvious place in the stack for FL to make it in prime time is the operating system. 
Google released Model Search, a platform that helps researchers develop the best ML models, efficiently and automatically.
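For the curious, here is a minimal sketch of the TracIn idea from the list above, as I understand it: the influence of a training example on a test example is approximated by summing, over saved checkpoints, the learning rate times the dot product of the two loss gradients. The toy linear model with squared loss, the checkpoints, and the learning rates are all hypothetical and chosen only to keep the example self-contained.

```python
def grad_squared_loss(weights, features, target):
    """Gradient of 0.5 * (w . x - y)^2 with respect to w for a linear model."""
    residual = sum(w * x for w, x in zip(weights, features)) - target
    return [residual * x for x in features]


def tracin_influence(checkpoints, learning_rates, train_example, test_example):
    """Sum over checkpoints of lr * <grad loss(train), grad loss(test)>.

    Positive values suggest the training example pushed the model in a
    direction that lowers the test loss; negative values suggest the opposite.
    """
    total = 0.0
    for weights, lr in zip(checkpoints, learning_rates):
        g_train = grad_squared_loss(weights, *train_example)
        g_test = grad_squared_loss(weights, *test_example)
        total += lr * sum(a * b for a, b in zip(g_train, g_test))
    return total


# Hypothetical checkpoints saved during training, with their learning rates.
checkpoints = [[0.0, 0.0], [0.5, 0.5]]
learning_rates = [0.1, 0.1]

test_example = ([1.0, 0.0], 1.0)
helpful_train = ([1.0, 0.0], 1.0)    # agrees with the test example
opposing_train = ([1.0, 0.0], -1.0)  # same input, contradictory label

helpful = tracin_influence(checkpoints, learning_rates, helpful_train, test_example)
opposing = tracin_influence(checkpoints, learning_rates, opposing_train, test_example)
print(helpful, opposing)
```

The training example that agrees with the test example gets a positive score, while the one with the contradictory label gets a negative score.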
💰Startups
Funding highlight reel
Databricks, the analytics and AI tools platform company, raised a $1B Series G led by Franklin Templeton at a $28B valuation. Joining the round were all the cloud providers: Microsoft, Salesforce, Google, and Amazon. Databricks has passed $425M in ARR, growing 75% YoY. The company is one of the fastest to reach $100M ARR after turning on monetization from its open-source and it’s rapidly owning more real estate in the machine learning world with its popular MLflow product. Very keen to see how it performs in the public market against Snowflake…Shouldn’t be long now. 
Didi’s self-driving division raised $300M led by IDG Capital after it raised $500M to formally spin-off in May 2020.  
Matillion, a UK-based ETL and data integration company, raised a $100M Series D led by Lightspeed. The business is 10 years old and saw a huge opportunity to pivot from building and maintaining data warehouses to ETL where it now leads alongside companies like Fivetran. Matillion has over 1,000 customers.
OutSystems, the low-code application builder from Lisbon, raised $150M led by Abdiel Capital and Tiger Global at a $9.5B valuation. The business, which started as a consulting company a decade or so ago, has cemented itself as the category-leading low code platform. They also have an AI team that is mining all the anonymized application recipes generated by customers to learn the most optimal recipes to suggest out of the box. Hence their inclusion here ;-)
Standard Cognition, the Amazon Go-style autonomous checkout company, raised a $150M Series C led by SoftBank Vision Fund 2 to open over 50,000 stores in the next 5 years.
Locus Robotics, an autonomous warehouse robotics company, raised a $150M Series E led by Tiger Global. The company says it has 4,000 robots deployed across 80 sites, of which 80% are in the US and 20% are in Europe. Watch this space for a potential SPAC following Berkshire Grey (below). 
Weights & Biases, an ML tooling provider focused on experiments, raised a $45M Series B led by Insight Partners. W&B counts over 70,000 users and 200 enterprises. The company was founded by the team behind Figure Eight (fka CrowdFlower), which kickstarted the data annotation tooling industry. Lukas, CEO, wrote a blog post in 2019 about his decision to start a new MLOps company: “Ten years ago training data was the biggest problem holding back real-world machine learning. Today, the biggest pain is a lack of basic software and best practices to manage a completely new style of coding.”
Recogni, which builds a custom ASIC for real-time object recognition by autonomous vehicles, raised a $49M Series B led by Mayfield Fund. 
ELSA (English Language Speech Assistant), an app that uses speech recognition to correct pronunciation and teach a new language, raised a $15M Series B led by VI Group and SIG. The company focuses on SE Asia and will grow into LatAm. It counts over 13M users. 
Uisee, a Chinese AV startup focused on robotaxis and buses in and around transport hubs, raised $150M from the National Manufacturing Transformation and Upgrade Fund, a $21B state-backed fund.
Quantifind, which automates financial crimes risk investigations, raised $22M from In-Q-Tel, S&P Global and Snowflake Ventures. 
Rescale, a hybrid cloud HPC automation company, raised a $50M Series C.
Monte Carlo, the data observability company, raised a $25M Series B led by GGV and Redpoint just a few months after announcing its $15M Series A. This is a huge acceleration for such a nascent field with many competitors. 
Deepgram, which builds speech recognition software, raised a $25M Series A led by Tiger Global. 
Otter.ai, a call transcription service, raised a $50M Series B led by Spectrum Equity. 
Cellarity, a generative biology company set up by Flagship Pioneering, raised a $123M Series B led by BlackRock. 
BigHat Biosciences, the AI-first therapeutic antibody engineering company, raised a $19M Series A led by a16z. 
Reverie Labs, an AI-first small molecule discovery company, raised a $25M Series A led by Ridgeback Capital. 
Labelbox, the AI training data creation and management company, raised a $40M Series C led by B Capital. 
Rhino Health, which uses federated learning to connect hospitals and AI developers, raised a $5M Seed led by LionBird Ventures.
Census, a data integration solution that takes models and insights from the data warehouse and validates and deploys them for downstream use, raised $16M led by Sequoia and a16z.
Ozette, an AI-powered immune monitoring platform, raised a $6M Seed round led by Madrona. 
Exits
Berkshire Grey, a robotics for warehouse operations company, announced that it is going public via a SPAC at a $2.2B valuation. Their investor deck is here. This is a neat business founded by robotics veterans (iRobot) that challenges Ocado Robotics, Amazon Robotics, and 6 River Systems (Shopify) as a pure-play hardware and software company. 
DataFleets, a federated learning company, was acquired by data connectivity platform LiveRamp for $68M. This marks one of the first acquisitions in privacy-preserving data sharing technology. What’s notable here is that DataFleets originally set out to tackle healthcare data, not marketing/advertising/finance, which LiveRamp sells to. Perhaps this indicates that healthcare is not ripe for privacy-preserving methods just yet. 
AEye is another LiDAR startup to agree on a SPAC that values the business at $1.9B. You can find their company overview for investors here. It forecasts $4M of revenue this year and $617M in 2026 with 55% EBITDA margin (quite a ramp!)….
OURS Technology, makers of a lidar-on-a-chip solution, was acquired by Aurora.
Signing off, 
Nathan Benaich, 7 March 2021
Air Street Capital is a venture capital firm investing in AI-first technology and life science companies. We’re an experienced team of investors and founders based in Europe and the US with a shared passion for working with entrepreneurs from the very beginning of their company-building journey.
p.s. Here’s an easter egg 🐣