🏥 Life (and) science
These past weeks have been a positive whirlwind for the biotechnology industry and global public health. First, we had the rapid approvals of COVID-19 vaccines, both using mRNA technology (Moderna, BioNTech) and more traditional methods (Oxford). While this news wasn’t (from what I can tell) driven by AI-based designs or workflows, another news item was: Baricitinib. This drug, which was postulated in February this year by London-based Benevolent.ai as a potential treatment for patients suffering from COVID-19, was granted emergency authorization by the FDA. Patients can now receive Baricitinib in combination with Remdesivir, after patients who received this dual treatment suffered a 35% lower mortality rate than those taking Remdesivir alone. Moreover, pre-clinical experiments conducted at the University of Washington showed that computationally designed mini proteins
that mimic how an antibody would bind the COVID-19 virus can prevent its interaction with the ACE-2 receptor. This early work shows a) that understanding 3D protein structure is key to elucidating function, and b) that computationally-driven search and optimization of protein structures can rapidly generate useful drug candidates.
This naturally leads to another major headline: AlphaFold version 2. Two years after DeepMind demonstrated impressive performance on predicting the folded structure of previously unseen proteins from their amino acid sequence, the company has done it again
, but this time by eclipsing its competition. This task is measured by the Critical Assessment of protein Structure Prediction (CASP), which compares structure predictions from competition participants with ground-truth structures generated through existing methods such as X-ray crystallography and/or NMR. The paper describing AlphaFold 2 is yet to be released, but we know that the system was a full “rebuild”. Unlike its predecessor, this newer generation relies on an attention-based neural network, trained end-to-end, that interprets the structure of a folded protein as a spatial graph (residues are the nodes and edges connect residues in close proximity). The system was trained on public data consisting of 170k protein structures as well as large databases of protein sequences with unknown structures. Some called this work a “gigantic leap” and claimed that “protein folding is solved”, while others
tempered this claim. The truth is somewhere in between: in the same way that deep learning crushing all other methods on ImageNet does not mean that computer vision is solved, AlphaFold 2 crushing all other entrants in CASP to max out the competition score does not mean that protein folding is solved. Either way, this news will undoubtedly draw even more worthy attention to the applications of AI to problems in biology and health, accelerating many more impactful developments and new companies in the years to come. Exciting!
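To make the “spatial graph plus attention” idea a bit more concrete, here is a toy sketch of a single self-attention layer over per-residue embeddings, where the learned attention weights play the role of edges between residues. This is emphatically not DeepMind’s architecture (the paper isn’t out yet); every name and dimension below is illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def residue_self_attention(h, Wq, Wk, Wv):
    """One attention layer over per-residue embeddings h of shape (n_residues, d).
    Every residue attends to every other residue, so pairwise relationships
    (the "edges" of the spatial graph) are learned rather than hand-specified."""
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (n, n) pairwise affinities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v                        # mix residue features by affinity

rng = np.random.default_rng(0)
n, d = 8, 16                                  # toy protein: 8 residues, 16-dim embeddings
h = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = residue_self_attention(h, Wq, Wk, Wv)
print(out.shape)  # (8, 16): one updated embedding per residue
```

Stacking many such layers end-to-end, with a final head that decodes pairwise distances into 3D coordinates, is the general shape of the approach as described publicly.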
Switching focus to AI for clinical workflows, Vienna-based Allcyte released results from the first-ever prospective interventional study
showing that functional precision medicine can deliver clinical benefit to patients in the form of longer progression-free survival and a higher overall response rate. The study
presented at the American Society of Hematology conference involved 56 leukemia patients whose treatment was guided by computer vision-based microscopy analysis of how their cancer and non-cancer biopsied cells reacted to a range of 136 small molecule drugs in vitro
This is particularly exciting because precision medicine that relies on genomics alone (i.e. sequencing cancer DNA to find driver mutations and prescribing drugs that combat those mutations) is too reductionist and often doesn’t work. Functional tests that ask questions of living cancer cells are far more expressive because they more accurately mirror cancer biology.
🌍 The (geo)politics of AI
A major thread we pulled on in the State of AI Report 2020 was the rise of military AI, driven by rapid technology transfer of AI research into military contexts. As a dual-use/general-purpose technology, AI will inevitably be translated into such uses, but how do practitioners feel about it? Georgetown’s CSET conducted a survey
of AI professionals in the US to sample their views of the Department of Defense (DoD) and DoD-funded AI projects. They found that 39% are neutral, 38% positive, and 24% negative about working on such projects. While most highlighted “access to unique resources” and “interesting problems” as the top benefits, 60% said that they do not want to do harm. So how do we reconcile these two stances? Curious to hear your opinions.
On a related topic, the scientific journal Nature surveyed
480 researchers who work in facial recognition, computer vision, and AI to understand their stance on ethical issues related to this application area. Of note, 71% of respondents agreed that facial recognition research on vulnerable populations (e.g. refugees or minority groups) could be ethically questionable even if scientists gained informed consent. Those who disagreed tried to distinguish between a) condemning and restricting unethical applications of facial recognition and b) restricting research. This highlights the tension between research and the real world: can the translation between the two be controlled, and if so, how? Ultimately, it looks like researchers will need to consider the potential downstream applications of their research, whether they’re involved in them or not. Relatedly, the Partnership on AI launched a project called the AI Incident Database
, which is meant to document failures of AI systems.
The UK government joins Germany, Japan, and the US in tightening its controls over potentially hostile foreign takeovers with a new National Security and Investment Bill
. Under this bill, the government will take “a targeted, proportionate approach to ensure it can scrutinize, impose conditions on or, as a last resort, block a deal in any sector where there is an unacceptable risk to national security.”
This of course includes AI. While this regime applies to investors/buyers from any country, the Bill says that “this will mean that no deal which could threaten the safety of the British people goes unchecked, and will ensure vulnerable businesses are not successfully targeted by potential investors seeking to cause them harm.”
While I agree that these two situational criteria do require government inspection, what I personally find puzzling is that neither the DeepMind/Google nor the ARM/SoftBank acquisition “threatened the safety of the British people”, nor did either target a “vulnerable business” by seeking to “cause harm”. So without further clarification, I’m not sure I see how this Bill actually solves the problem at hand: domestic winners being offered huge sums to sell out and the UK Government doing nothing about it.
🚗 Autonomous everything
Waymo published two papers on its safety performance results
. The first presents more than 6.1M miles of automated driving in Phoenix, Arizona, and reports every collision and minor contact experienced during operations: there were 47 contact events across 2019 and the first three quarters of 2020. The paper notes that “nearly all events involved one or more road rule violations or other errors by a human driver or road user”.
Oxford is playing host to the UK’s first trials
of a Level 4 autonomous vehicle. The six cars are run by the city’s home-grown AV startup, Oxbotica, as part of a government-backed research project called “Project Endeavour”. The trials are due to last for a year and involve driving from the Oxford Parkway rail station into Oxford’s main train station. If you haven’t been to Oxford, this route involves quite a few roundabouts, narrow streets, and a TON of bicycles ridden by people of all ages and experience levels…
Meanwhile, Uber continues to shed (presumably) non-core assets, this time focusing on its ATG self-driving unit. Over 1.5 years ago, the company spun the unit out as it was preparing to go public, raising $1B from Toyota, DENSO, and SoftBank’s Vision Fund. Now, Uber is in talks to sell the division
to private, Amazon-funded rival Aurora, which lost its major auto partner VW to Argo AI a year or so ago. The move does make sense for Uber, which, like other loss-making public companies, cannot sustain hefty annual investments to the tune of hundreds of millions of dollars forever.
Following its $900M acquisition of Moovit, a mobility-as-a-service startup, Intel’s Mobileye announced
a partnership with soon-to-be-SPAC’d LiDAR company Luminar. The stated goal is to build a “full end-to-end ride-hailing experience”. How many providers do we need? Why will Intel succeed where others struggle?
Prior editions of the newsletter have included discussions of China’s central planning approach to shoring up capabilities in semiconductor design and fabrication to reduce its dependency on foreign suppliers. For example, Huawei (which has no prior fabrication experience) is reportedly
setting up a dedicated chip plant in Shanghai following tightened US export controls earlier this year, which have left the company without chips to put inside its smartphones. More broadly, China saw 4,600 newly-registered domestic chip-related companies in Q2 2020 alone, up 207%. In fact, three new Chinese startups set up in the last year were founded by or have hired executives and engineers
from Synopsys and Cadence Design Systems of the US, two of the largest electronic design automation toolmakers.
But much less has been said about the dark side of this race. Jeff Ding shared a fascinating translation
of an investigative report on defunct state-funded semiconductor projects in five provinces. The report, which was produced by Outlook Magazine (a state-backed publication for the Communist Party) finds that new semiconductor companies “took advantage of local governments who lack industry knowledge, and basically get governments to give them free land, factories, and massive subsidies.”
Billions of dollars worth of state investment were targeted at projects in second-, third- and even fourth-tier Chinese cities that do not have the talent or resources to sustain such projects.
Back in the US, the big news these past weeks has been Apple’s announcement of their M1 silicon and their clear strategy of vertical integration
. Amongst Amazon, Facebook, Google, and Microsoft, this move now leaves Microsoft as the only player without a clear in-house silicon initiative. Notably, Microsoft was an early partner and big investor in Graphcore, which is arguably at the head of the startup race for native AI semiconductors. Apple’s M1 makes use of a new unified memory architecture
, which means that the CPU, GPU, neural processor, and image signal processor all share one pool of high-bandwidth, low-latency memory. The company’s new MacBook Air, MacBook Pro, and Mac mini essentially share the same M1 chip.
TSMC, the world-leading semiconductor fabrication company, is also eating up more of its supply chain. The company previously left chip packaging services to a range of specialized suppliers but is now developing its own 3D stacking technology
at a chip packaging plant in Taiwan with mass production planned for 2022. This new technology enables TSMC to stack and link different kinds of chips into one package, leading to more energy-efficient and powerful compute output in a smaller chipset. And with this, TSMC extends its already large technology lead even further.
💼 Enterprise software and big tech
After a couple of quarters flashing an increasingly obvious
“Add Google Meet video conferencing” prompt on every new Google Calendar entry while Zoom’s share price soared, Google is now ramping up its efforts to compete. As a live video and voice medium, conferencing is a natural battleground for AI-driven feature fights. Here, Google is leaning on MediaPipe, its open-source framework for cross-platform, customizable ML solutions for live and streamed media
. You’re now able to blur or replace
your background. What’s next?
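Under the hood, background blur and replacement boil down to running a person-segmentation model on each frame and then compositing. A minimal sketch of the compositing step in NumPy, using a hypothetical hard-coded mask in place of a real segmentation model’s output:

```python
import numpy as np

def box_blur(img, k=7):
    """Crude box blur via shifted sums; a stand-in for a proper Gaussian blur."""
    pad = k // 2
    padded = np.pad(img.astype(float), ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def composite(frame, person_mask, background=None, k=7):
    """Keep the person sharp; blur (or replace) everything else.
    person_mask is (H, W) with values in [0, 1], as a segmentation model would emit."""
    bg = box_blur(frame, k) if background is None else background.astype(float)
    m = person_mask[..., None]                 # broadcast mask over colour channels
    return (m * frame + (1 - m) * bg).astype(np.uint8)

frame = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
mask = np.zeros((64, 64))
mask[16:48, 16:48] = 1.0                       # toy "person" region
out = composite(frame, mask)                   # person kept, surroundings blurred
```

Passing a `background` image instead of `None` gives the replacement effect; in a production pipeline the mask would also be temporally smoothed to avoid flicker between frames.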
Documents! Last month’s newsletter highlighted DocuSign’s move into helping users understand the contents of the documents they’re signing. Now Google has introduced its “Document AI platform
”, which it calls a unified console for document processing. This area of enterprise AI has so far been the turf of startups (e.g. HyperScience) that help users automatically transform documents into structured data using computer vision, whether for data entry, process automation, or as input for RPA. But no more!
Three of the hottest topics in the MLOps space, which concerns the lifecycle management of ML systems in production, are 1) feature stores, 2) data catalogs, and 3) data quality monitoring. These products were birthed by large technology companies, which were the first to really see the challenges of maintaining healthy ML systems and teams at large scale. To start, feature stores typically offer a central place to store the signals that are computed from raw data for the purposes of an ML model. Their goal is to improve team collaboration through the reuse and iterative improvement of features, as well as monitoring their contribution to ML model success/failure in production. More on feature stores here
! Next, data catalogs are similar in the sense that they also offer a centralized destination to discover all datasets that are generated and consumed within a company, including associated metadata such as who owns what and which upstream or downstream services touch a particular dataset. More on data catalogs here
! Third, data quality monitoring systems help data producers and consumers understand whether there are issues or anomalies with the data: null values, schema changes, timeliness issues, or distribution changes. All of these have an impact on the performance of ML models in production (garbage in, garbage out). Airbnb shared a two-part
series on how they built internal data quality systems.
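As a flavour of what such checks look like, here is a minimal, dependency-free sketch covering the three failure modes mentioned above. These are hypothetical helpers for illustration, not Airbnb’s system or any particular vendor’s API:

```python
import math

def null_rate(rows, column):
    """Fraction of rows where the given column is missing or null."""
    vals = [r.get(column) for r in rows]
    return sum(v is None for v in vals) / len(vals)

def schema_changed(expected_columns, rows):
    """Detect columns that appeared or disappeared versus an expected schema."""
    seen = set().union(*(r.keys() for r in rows))
    return {"added": seen - set(expected_columns),
            "removed": set(expected_columns) - seen}

def mean_shift(baseline, current):
    """Crude distribution check: shift of the mean, in baseline standard deviations."""
    mu = sum(baseline) / len(baseline)
    sd = math.sqrt(sum((x - mu) ** 2 for x in baseline) / len(baseline)) or 1.0
    return abs(sum(current) / len(current) - mu) / sd

rows = [{"id": 1, "price": 10.0},
        {"id": 2, "price": None},
        {"id": 3, "price": 12.0, "currency": "USD"}]

print(null_rate(rows, "price"))                    # one of three prices is null
print(schema_changed(["id", "price"], rows))       # a 'currency' column appeared
print(mean_shift([10, 11, 12, 13], [30, 31, 32]))  # large value flags drift
```

Production systems run checks like these on every pipeline run and alert owners (found via the data catalog) when a threshold is breached, which is exactly the loop the Airbnb series describes.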