We are taking advances in machine learning and artificial intelligence and applying them to accelerate progress in the natural sciences: biomedical research, chemistry, and materials science.
Google Accelerated Science
Our mission is to produce breakthroughs in the natural sciences by applying Google technologies, including machine learning, iterative prediction/experimentation in large combinatorial spaces, and large-scale analysis and computation. We believe these will enable more effective high-throughput research in many domains.
Using Google's unique expertise, technology and scale, we collaborate with world-class institutions on challenges with large scientific and humanitarian benefit, working closely with leading scientists who have deep domain expertise and proven experimental infrastructure.
Automatic insight from biological images
We use machine learning to interpret biological images at scale. Our approaches offer many benefits, including non-invasively imputing deep cell measures on transmission microscopy images (without stains) and discovering new phenotypes and common mechanisms of action across large cell screens.
Developing a drug takes many years and can cost upwards of $1-2B. We use computational approaches to predict molecular properties of small molecules and their binding affinity to target proteins. We are particularly focused on improving the lead optimization phase of preclinical drug discovery with the goal of translating into better clinical outcomes and reducing the time and costs of drug discovery.
Improving scientific computing with machine learning
We are developing new hybrid approaches to scientific computing that combine deep learning and classical numerical methods. For example, we used machine learning to accurately solve partial differential equations on much coarser grids than was previously possible.
The design of new materials is a complex, multi-faceted problem. We are approaching this challenge in multiple ways, including the use of machine learning to improve and speed up existing computational approaches, tight integration of semi-automated design and analysis into high-throughput experimental loops, and extracting more information from imaging of materials.
ML for accessibility
We use machine learning to help people with speech impairments due to neurological conditions such as ALS, stroke, and multiple sclerosis communicate and interact with technology more easily. As part of this research initiative, we ask volunteers with speech impairments to record voice samples (anyone over 18 can sign up at bit.ly/AudioData). We then use these voice samples to improve speech recognition.
Proteins are the major biological machines that make life possible. We are using modern ML methods to better understand and predict the function of proteins. However, the design space of proteins is much larger than what we observe in the natural world. To address this design challenge, we are integrating computational and experimental work to modify and optimize proteins for a variety of uses.
The world’s fastest supercomputers were designed for modeling physical phenomena, yet they still are not fast enough to robustly predict the impacts of climate change, to design controls for airplanes based on airflow or to accurately simulate a fusion reactor.
A popular artificial-intelligence method provides a powerful tool for surveying and classifying biological data. But for the uninitiated, the technology poses significant difficulties.
Tri Alpha Energy has a unique scheme for plasma confinement called a field-reversed configuration that’s predicted to get more stable as the energy goes up, in contrast to other methods where plasmas get harder to control as you heat them.
Using our large-scale neural network training system, we trained at a scale 18x larger than previous work with a total of 37.8M data points across more than 200 distinct biological processes.
Our MPNNs set a new state of the art for predicting all 13 chemical properties in QM9.
Some of our people
Harnessing Google's machine learning prowess to attack critical problems in infectious disease has the potential for incredible impact in the world.
Designing a meaningful experiment and deeply understanding the result is the critical thread across all of the sciences.
Some of our current and previous partners
- Bill & Melinda Gates Foundation
- New York Stem Cell Foundation Research Institute
- Jake Baum, Imperial College London
- Steve Finkbeiner, Gladstone Institutes and the University of California, San Francisco
- Lee Rubin, Harvard University
- TAE Systems
- Anatole von Lilienfeld, Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials (MARVEL), Department of Chemistry, University of Basel, Switzerland
- Vijay Pande, Stanford University
- John Gregoire, California Institute of Technology and Joint Center for Artificial Photosynthesis
- Jonathan Fan, Stanford University