🏡Your guide to AI: Q1 2020, part 1

Your guide to AI

March 29 · Issue #39
A market intelligence newsletter covering AI in the technology industry, research lab and venture capital market.

Dear readers,
I trust and hope that you and your families are staying healthy and safe as the global pandemic unfolds around us. COVID-19 is top of mind in the media today, and rightly so. It is imperative that we each play our part in stemming the spread of COVID-19, caring for those in need and keeping the economy alive.
With Q1 2020 wrapping up, here is a review of what you need to know in AI. Writing from 🏡 in London, I will cover key developments in AI tech, geopolitics, open sources and blogs in this edition. Next Sunday, I will share part 2 focusing on AI research and startup activity.
Three announcements before we kick off:
  • Researchers at King’s College London, NHS and ZOE launched a COVID-19 symptom tracking app to help the health service understand patterns of the disease spread. The app can be downloaded here. It has been used by >1.5M people in the UK in 3 days and will launch in the US shortly.
  • Our RAAIS Foundation has launched a call for COVID-19 Relief Fellowships to fund researchers, healthcare practitioners and other initiatives that help accelerate the fight against COVID.
  • London.AI #19 will be run as a live stream on Thursday 2nd April 2020 at 6pm GMT. Tune in to hear from Graphcore, PolyAI, and KCL COVID-19 research work.
Referred by a friend? Sign up here and share the newsletter on Twitter.

🆕 Technology news, trends and opinions
🏥 Healthcare and life science
While the software investing world appears increasingly bullish on startup opportunities in biology, the life science investing world is voicing its bearish opinion even louder. Technology investors are funding AI-first life science companies at a clip, largely in the drug discovery and synthetic biology markets. In the case of drug discovery, my view is that software can help instrument the process with new data while bringing scale and rigor to its analysis. This means discovering new biology and drug assets. I’ve written about this thesis here. Chris Gibson from Recursion paints a great picture here.
Life science investors principally value drug discovery companies on the basis of their drug asset pipeline: how many drugs they are developing, what stages those drugs are at and how likely they are to work in the clinic. In general, these investors take the view that AI-first drug discovery companies have technology platforms that are overvalued because they have yet to produce a consistent pipeline of valuable assets. Some also think that AI approaches to biology are reductionist and do not help us discover new biology. Pharma has burned its fingers on many failed attempts in the past. This piece explores more. Nature Reviews Drug Discovery also featured a survey of expert opinions on both sides of the aisle here.
The FDA authorised two new AI-first diagnostic imaging products: one from Ultromics (an Oxford-based startup) that uses echocardiography to diagnose heart disease, and one from Hologic (a US medical technology company) for 3D breast mammography diagnostics.
Babylon Health signed a 10-year deal to provide a city of 300k residents with an integrated health app. It will provide remote consultation, diagnosis, and live monitoring of patients with chronic conditions.
So far, there have been only 5 randomised clinical trials that use AI-based diagnostic systems. Four are in gastroenterology and one is in ophthalmology. All five were conducted in China.
A group of clinicians in London and Birmingham published a paper to help clinicians critically appraise ML studies in healthcare. Work like this is important to establish common grounds when it comes to appraising the quality of new work. For example, the paper asks clinicians to consider how data is labeled (e.g. from physician notes or by labelers), if these labels are actually clinically meaningful to avoid spurious predictions, if datasets were split correctly, if models were evaluated in the appropriate clinical pathway setting, etc. 
Benevolent.ai searched their knowledge graph of structured medical data to find approved drugs that could help treat COVID-19. They focused on drugs that might block the viral infection process in the lungs and identified baricitinib. The drug is an inhibitor of AAK1, which promotes endocytosis into AT2 epithelial cells expressing ACE2. It is hypothesized that the COVID-19 virus binds ACE2 to endocytose into lung cells. A few weeks later, baricitinib was fast-tracked into a clinical study of symptomatic patients infected by COVID-19 in Prato, Italy. We’ll have to wait several weeks to see the results. 
🔮The politics of AI
A study between MIT and Stanford put forward a method for mathematically expressing and regulating undesirable behavior in an ML system such that it can be controlled during training. This is accomplished by having the creator of the ML model define what an algorithm should do in a way that allows the user to directly place probabilistic constraints on the solution that the algorithm returns. The paper is available here.
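To make the idea of a probabilistic constraint concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It assumes the undesirable behavior is encoded as a bounded score g (positive means bad) and uses a standard Hoeffding bound as the confidence test. The function names, the score range and the "return nothing if unsure" convention are all illustrative assumptions.

```python
import math
from typing import Optional, Sequence


def hoeffding_upper_bound(samples: Sequence[float], delta: float,
                          low: float, high: float) -> float:
    """Upper bound on the true mean that holds with probability >= 1 - delta,
    assuming i.i.d. samples bounded in [low, high]."""
    n = len(samples)
    mean = sum(samples) / n
    return mean + (high - low) * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def safety_test(candidate: str, g_samples: Sequence[float], delta: float,
                low: float = -1.0, high: float = 1.0) -> Optional[str]:
    """Return the candidate model only if we are (1 - delta)-confident that
    E[g] <= 0, where g > 0 encodes the undesirable behavior (hypothetical)."""
    if hoeffding_upper_bound(g_samples, delta, low, high) <= 0.0:
        return candidate
    return None  # not enough evidence that the constraint holds


# Plenty of held-out evidence that g is negative: the candidate is returned.
accepted = safety_test("model-A", [-0.5] * 1000, delta=0.05)
# Same average score but only five samples: the test refuses to return it.
rejected = safety_test("model-A", [-0.5] * 5, delta=0.05)
```

The user only picks the behavior score g and the tolerance delta. The burden of showing that the constraint holds with high probability sits with the algorithm, which may decline to return a solution at all.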
The German government plans to clamp down on acquisitions of 10%+ positions (down from 25%) in its domestic companies developing “critical technology” by foreign investors and buyers. 
President Trump proposed to implement new export controls on geospatial imagery software. In particular, this relates to the sale of AI technology to countries including China. It expands an export control law that came into effect in 2018. Trump was also set to halt a large, 1,000-strong civilian drone programme because the robots were built in China. The White House also released a set of 10 principles to govern safe AI. Meanwhile, Washington upped pressure on TSMC to manufacture chips that go into military projects in the US instead of in Taiwan.
The Center for Security and Emerging Technology at Georgetown University published a report on “Keeping top AI talent in the United States”. It highlights that foreign talent is crucial to American AI research. However, international graduates who want to stay are faced with significant obstacles in the US immigration system, which are only getting worse.
Clearview AI was exposed for gleefully scraping billions of photos from public-facing pages on Twitter, Facebook and Instagram without user consent for use in its facial recognition software. This software was then sold to more than 2,200 government agencies, law enforcement bodies and companies, including Walmart and Best Buy, across 27 countries to identify people. Clearview was hit with a class action lawsuit in Illinois, where Facebook was most recently fined $550M for privacy infringements. A guy wrote up his experience of submitting a request for his file from Clearview, citing California’s GDPR equivalent (CCPA).
The European Commission put forward a proposal for the regulation of AI in Europe here. The Commission will be increasing its annual investments in AI by 70% under Horizon 2020 in order to reach €1.5B spent between 2018 and 2020. Some of their suggestions impose significant red tape, e.g. AI systems in Europe should be trained only on “European data” and companies must run their systems through a trustworthiness checkbox exercise. If not handled with care, it feels like Europe will risk regulating itself out of being globally competitive in AI.
🚗 Autonomous technology
Remember 18 months ago when Morgan Stanley came out with a forward-looking valuation note on Waymo arguing that the business could be worth $175B? Remember in Sept 2019 when MS cut their valuation projection to $105B? Nowadays, AV enthusiasm seems to be tempered because of just how challenging the technology and economics are for real-world self-driving. In a first for one of Alphabet’s separate businesses, Waymo raised its first external financing round of $2.25B, valuing the business at $30B. The consummation of this deal signifies a departure from the Alphabet founders’ stance of always financing their own bets. It’s also the latest deal in a string of $B+ AV financings into GM Cruise, Uber ATG, and Argo.ai that cements the AV industry as one where only $B+ balance sheet contenders stand a chance.
Waymo also released its newest generation of Jaguar I-Pace vehicles with in-house developed cameras, lidar, and radars. 
Ford is said to be acquiring properties in Austin, TX to run and service their AVs.
The California DMV published 2019 figures for AV mileage and disengagements by license holder. A total of 36 companies with 676 cars reportedly drove a total of 2.88M miles. While disengagements are reported, it’s no longer clear that the number means much. For example, Baidu ranked in last place last year with one disengagement every 206 miles, but this year self-reports one disengagement per 18k miles (i.e. an 86x improvement in 12 months), placing it at the top of the performance charts ahead of Waymo. How does that work? As a result, Aurora and others have taken to expanding their simulation efforts over real-world miles. What’s more, Waymo shared that their California miles are predominantly “engineering development”, whereas their production miles are in Phoenix. As such, the California DMV results don’t provide a good comparison of AV service quality.
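For a quick sanity check on these numbers, the arithmetic can be sketched in a few lines of Python. The inputs are the rounded figures quoted above (206 and 18k miles per disengagement, 2.88M miles, 676 cars), and the variable names are illustrative. With these rounded inputs the year-on-year ratio comes out at about 87x.

```python
# Rounded figures as quoted in the text; not the raw DMV filings.
baidu_miles_per_disengagement_2018 = 206.0
baidu_miles_per_disengagement_2019 = 18_000.0

# Year-on-year improvement in miles driven per disengagement.
improvement = (baidu_miles_per_disengagement_2019
               / baidu_miles_per_disengagement_2018)

# Fleet-wide average mileage per car across all license holders.
total_miles = 2_880_000
total_cars = 676
miles_per_car = total_miles / total_cars

print(f"Improvement: {improvement:.1f}x")
print(f"Average miles per car: {miles_per_car:,.0f}")
```

A jump of this size in a single year says as much about how each company defines and counts a disengagement as it does about the underlying driving performance.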
Sony, which dominates the global image sensor market, is investing billions to develop driverless car sensors. 
Lyft and Aptiv reported 100k paid self-driving rides since launching their service. 
Otto founder Anthony Levandowski finally agreed to plead guilty to stealing trade secrets from Google and taking them over to Uber after they acquired Otto. Levandowski faces up to 30 months in prison and a $179M payment to Google for violating employment contracts.
💪 Large enterprise
BERT has been rolled out to Google Search in 70 languages. 
McKinsey Global Survey 2019 found that “63% of respondents report revenue increases from AI adoption in the business units where their companies use AI, with respondents from high performers nearly three times likelier than those from other companies to report revenue gains of more than 10 percent.” More here.
Amazon launched a science blog that features posts on nifty technology and user behavior studies from their services. For example, how users make purchasing decisions.
Sundar Pichai of Alphabet wrote in the FT that AI must be regulated, but stopped short of saying how.
Microsoft allocated $40M of their $165M AI for Good initiative to health AI. 
Berkeley received its largest donation to date, worth $252M, to establish a Division of Computing, Data Science, and Society. This emphasizes the importance of upskilling new generations across disciplines to be literate in data science and its applications to their work.
MIT Tech Review ran an in-depth story on OpenAI, their culture, evolution since founding, switch to a capped profit company, intense operational secrecy and their belief that AGI is within reach thanks to a compute-driven strategy. 
🍪 AI hardware
Cerebras unveiled their AI computer, the CS-1, which houses a wafer-scale engine and the cooling/power system to integrate the chip within datacenter racks. 
Citadel Research conducted an extensive technical review of Graphcore’s IPU here
🇨🇳 AI in China
Thanks to Jeff Ding for sharing a paper that delves into the early history of cybernetics and AI in China from the 1950s to 1980s. The paper seeks to contextualise how the country kicked off its AI research and how “political ideologies, diplomacy, economic policies, and other social dimensions affect cybernetics and AI in China.”
Chinese researchers studied the quality and citation network of AI research papers published by domestic Chinese authors. We know that there’s been a marked inflation in paper volume since 2017. However, those papers have cited far fewer papers than in the past. They also found that papers tend to be more qualitative than quantitative, as well as more normative than empirical, thus suggesting that they contribute less original knowledge.
Tsinghua University Professor Lao Dongyan spoke out against the implementation of a widespread facial recognition system in the Beijing subway system. In an expression of distrust in the government’s action (for which she is not alone), she says: “I solemnly recommend that the National People’s Congress Standing Committee conduct a fundamental legitimacy review for the Beijing Metro’s measure to employ facial recognition for security screening. At the same time, it should consider initiating corresponding legislative procedures for a legal approach to regulating the arbitrary use of facial recognition technology.” 
Relatedly, the developer of a very popular real-time object recognition model called YOLO tweeted that he had stopped computer vision research because of his concerns over misuse of YOLO for military applications or to breach consumer privacy.
More recently, a facial recognition service in China has been retrained to detect faces occluded with face masks. The company, Hanwang, was asked by several hospital customers in January to help them detect faces of masked healthcare workers. For a long read on the history of facial recognition algorithms, which started some 60 years ago at Panoramic Research, have a look at this piece. Finally, despite a US ban on its services and the trade war, rumor has it that China’s SenseTime is reaching $750M in annual revenues.
Meanwhile, the UK transportation system has been running trials of facial recognition systems since 2016, the latest being in London. The system was built by Japan’s NEC and conducted a live screen against a database of 5,000 criminal or lost individuals. Indian law enforcement used facial recognition to identify more than 1,00 individuals who committed communal violence against the state. 
Chinese government and public officials were mandated to replace all of their foreign computer hardware and software with local alternatives within three years. 
On China’s Sputnik moment for strengthening every major technological capability, triggered by the US. 
📑 Resources
Blogs and reports
A booklet on machine learning systems design with exercises here
Insights into paper author trends at NeurIPS and ICML 2019 here. Spoiler: Google, Stanford, and MIT lead. 
Stanford released their 2019 AI Index, a valuable resource for the industry.
A review of NeurIPS 2019 papers here and self-driving focused papers here.
Batch (offline) vs. real time (online) inference: why, how and challenges of both here.
The Deep Learning Toolbox here. A model zoo diagram. 
NLP year in review - 2019 here.
Recursion’s Imran Haque published a neat “AI in drug discovery glossary”.
Here are two PhD theses on two hot topics in ML: 1) Neural Transfer Learning for Natural Language Processing and 2) Geometric Deep Learning.
6 stages of AutoML here.
How Lyft Level 5 approaches generalisation in their AV development here.
Groq released some performance figures comparing its chip to NVIDIA’s V100 here.
How transformer models are a special case of graph neural networks here
Videos/lectures
A collection of 30 video AI talks from RE-WORK conferences in 2019 here
Datasets
The ScanRefer dataset is a large-scale effort to perform object localisation via natural language expressions in 3D. It contains 46k descriptions of 10k objects from 703 ScanNet scenes.
Self-reported salary data at large tech companies here
Open source tools
Why and how to version control your datasets here.
A model-agnostic visual debugging tool for machine learning: Uber’s Manifold
Flax is a neural network library designed for JAX, a fast scientific computing and ML library with a familiar NumPy-style API.
Fast.ai released their v2, which provides abstractions that can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. 
An ML-approach to reformatting video into different size outputs here. Automatically turn a 16:9 original into a high-quality square, vertical or portrait.
Uber released Ludwig, a code-free deep learning toolbox. The user has to define the input and output data types (numerical, categorical, text, images, time series). They also released Fiber, a system for large-scale distributed AI computing. 
Bonus round for all those video conferences: Air cloak for webcams :-)

Signing off, 
Nathan Benaich, 29 March 2020