Last night the CEO of Google DeepMind, Demis Hassabis, appeared on 60 Minutes, and had this (among other things) to say:
Demis Hassabis: Proteins are the basic building blocks of life. So, everything in biology, everything in your body depends on proteins. You know, your neurons firing, your muscle fibers twitching, it's all mediated by proteins.
But 3D protein structures like this are so complex, less than 1% were known. Mapping each one used to take years. DeepMind's AI model did 200 million in one year. Now Hassabis has AI blazing through solutions to drug development.
Demis Hassabis: So on average, it takes, you know, ten years and billions of dollars to design just one drug. We can maybe reduce that down from years to maybe months or maybe even weeks. Which sounds incredible today but that's also what people used to think about protein structures. And it would revolutionize human health, and I think one day maybe we can cure all disease with the help of AI.
Scott Pelley: The end of disease?
Demis Hassabis: I think that's within reach. Maybe within the next decade or so, I don't see why not.
So let me be one of the people to say Why Not. I'm not happy about doing this, or feeling as if I have to be doing this. I have written on this subject over and over for quite a few years now, and I sometimes worry that I'm getting pigeonholed as "That Dude That Says That AI Drug Discovery Isn't So Amazing". But at the risk of making that even more of a problem than ever, here we go.
If you want to see what I've written in the past about this, probably the original is 2007's "Andy Grove: Rich, Famous, Smart, and Wrong", in which I coined the phrase "The Andy Grove Fallacy" to describe the "why doesn't drug discovery get with the modern age and move as fast as software development" attitude. I expanded on this in 2010 and 2011 in commentaries ("Biology by the Numbers" and "Drugs, Airplanes, and Radios") on the famous/infamous "Can a Biologist Fix a Radio" paper that appeared in 2002. These issues came up again in 2014 with "Google's Calico Moves Into Reality" and "Peter Thiel's Uncomplimentary View of Big Pharma".
If you're just going to read one of these earlier posts (at most) - and I can definitely understand the impulse - perhaps it should be 2015's "Silicon Valley Sunglasses". In 2018 I had pieces on nascent biotechs that were making what were (to me) grandiose claims for their computational powers in "The Case of Verge Genomics" and "Rewiring Plankton: And Reality". And the computational biology radio-fixing topic appeared again that year in "Engineering Biology, For Real?"
At this point, the AI hype really began taking over the world, and in 2019 I covered three announcements that companies had indeed discovered new drugs using such technology ("Has AI Discovered a Drug Now? Guess", "Another AI Drug Announcement", and "An AI-Generated Drug?"). In recent years I've written some overviews of my thoughts on AI drug discovery in general (and tried to address what I think are some misconceptions about both AI and drug discovery) in "AI and Drug Discovery: Attacking the Right Problems" in 2021, "AI and the Hard Stuff" in 2023, and "AI Does Not Make It Easy" in 2024.
If you read through some of those, you will get about as much of my opinion on the subject as anyone can stand, and some of it you will receive several times from slightly different angles. And you'll also be well-equipped to see why Hassabis' statements above make me want to spend some time staring silently out the window, mouthing unintelligible words to myself. But let me lay down some of these thoughts again, interspersed with some new ones that I've been talking about recently.
First off, (1) the huge majority of what people are calling AI these days is in fact machine learning. Nothing wrong with that - machine learning can be great stuff, when you have a large enough and well-curated-enough data set to feed into such systems, and when the problems you're trying to work on have well-defined boundaries. But I would like to add that in my opinion (2) machine learning does not create any new knowledge. It rearranges information you have already obtained and combs through it looking for correlations and rules and patterns, and it can do a far, far better job at this than any human could. If the problem space you are working in has few enough degrees of freedom, it can use these patterns to make extremely useful analogies and predictions - the protein structure work that Hassabis references is a shining example of this. But (3) I strongly believe that all machine learning is done by looking for patterns in some sort of language, be it a native human language (or a coding language) as with chatbots, mathematical symbols, the language of protein sequence and structure as with AlphaFold et al., and so on. The analogies between letters, words, sentences, and grammar hold up quite well across such systems. And I believe that Wittgenstein was right when he said that in any language there are things that cannot be said.
For protein structure predictors like AlphaFold, examples of those things that cannot be said include all the larger, more important questions such as "Which of these proteins is more responsible for this particular disease?" and "Which of these proteins would be more likely to lead to a successful drug?" as well as "Which of these proteins should we avoid working on because such projects will have more potential pitfalls than the others?" Boy, those would be great questions to answer! But a machine learning program that knows protein structure cannot answer them, no matter how amazingly well it knows protein structure. You can extend that line of thinking, and you're going to have to extend it if you talk about "curing disease" in general - there are so, so many important questions that we don't even know how to get started on answering.
Now, at this point I have to briefly note Hassabis' protein-centricity, which ignores (poly)nucleotides, lipids, carbohydrates, small-molecule signaling ligands and all the rest of the incredible variety of biomolecules that make up a living system. His point of view comes naturally to someone who has been involved in such great advances in protein structure prediction, but no, "everything in your body depends on proteins" (or any one class of molecules!) is such a reductionist take that it really doesn't get you anywhere useful. It's like saying that everything in the Mona Lisa depends on the paint. We have systems built on top of systems, which in turn build up still other systems beyond them, and our knowledge of such things is completely inadequate to cure disease within ten years. And unfortunately, it's going to be inadequate at the end of those ten years, too - I will put that marker down, although it doesn't make me happy to do it.
That's because we don't have enough pieces on the table to solve this puzzle. We don't even have enough in most of these areas to know quite what kind of puzzle we're even working on. Nowhere near. And AI/ML can be really, really good at rearranging the pieces we do have, in the limited little areas where we have some ground-truth knowledge about the real-world effects when you do that. But it will not just start filling in all those blank spots. That's up to us humans. My most optimistic take on these technologies is that if things go really, really well they might be able to help guide us towards more productive research than we might otherwise have been doing, but we are going to have a lot of data to gather, a lot of answers to run down, and a lot of twists and turns and utter surprises to deal with along the way. We're going to need a terrifying amount of new knowledge before we can actually turn to any AI/ML systems and ask them the kinds of big questions I mention above.
I realize that many on the AI side of the business are hoping for a breakthrough in Artificial General Intelligence to cut through the sort of bleating I've been doing. But how is AGI going to suddenly reveal what is now hidden? The sum total of all the medical information in the world right now is not enough. And it's going to go on being Not Enough for quite some time to come. I hate to be like this, but I really have no other position I can take.