Listen: Apple | Spotify | YouTube | Amazon

How will artificial intelligence impact some of our most intimate — and our most vulnerable — experiences, like going to the doctor?

Mohsen Bayati, professor of operations, information & technology at Stanford Graduate School of Business, is a mathematician who is fascinated by the potential of complex equations to have impact on real-world issues like healthcare.

“If I’m picking a problem that I want to solve, I might as well pick something impactful, and healthcare seemed to be one of those areas.”

Today Bayati is deeply engaged in identifying how artificial intelligence can improve the experience of healthcare for providers and patients. “There’s a lot more problems than solutions available,” he says. “So it’s ripe for innovation.”

In Bayati’s view, integrating new technology into existing healthcare systems requires a leap of faith — and guardrails to protect patients from AI that doesn’t function as intended. But that doesn’t mean waiting for perfection, Bayati says. “[We] need to have patience with the benefits of these systems.”

If/Then is a podcast from Stanford Graduate School of Business that examines research findings that can help us navigate the complex issues we face in business, leadership, and society. Each episode features an interview with a Stanford GSB faculty member.

Full Transcript

Note: This transcript was generated by an automated system and has been lightly edited for clarity. It may contain errors or omissions.

John King: When the world thinks of bridges in the Bay Area, they think of the Golden Gate Bridge.

Kevin Cool: That’s John King. He spent decades covering architecture and urban design for the San Francisco Chronicle.

John King: It’s beautiful. It is the leading icon of the city, if not the region. The Bay Bridge, though, is the real workhorse.

Kevin Cool: The San Francisco-Oakland Bay Bridge opened in November 1936, six months before the more famous bridge on the other side of the Bay.

John King: It is longer. It is remarkably robust, and it is a structure that literally links the two halves of the region together in a much more critical way than the Golden Gate Bridge.

Archive: San Francisco, metropolis of the West, seaport of the Pacific, glories in its calm, landlocked bay.

Kevin Cool: In the 1920s, engineering marvels were emerging in cities around the world. Bay Area residents were excited about ditching the ferry ride for a bridge that would allow them to easily cross the bay.

Archive: As far back as 1856, men dreamed of linking the city of San Francisco with the east shore of the Bay. As the years passed, many proposals were made, but it remained for modern science and engineering to overcome almost insurmountable obstacles and make the dream come true.

Kevin Cool: For example: The U.S. Navy didn’t want anything hampering its ships getting in and out of the Bay.

John King: The Navy would just say, no, this is not allowed. Then Herbert Hoover became president in 1929. He was at Stanford, he very much knew the Bay Area, and he essentially told them: get this done.

Kevin Cool: Seafaring traffic wouldn’t be the only challenge: It being San Francisco, there was also the risk of earthquakes.

John King: The thing about designing for seismic stress is that structural engineers learn after each major earthquake anywhere in the world. And you kinda learn from mistakes, sadly.

Kevin Cool: Despite these obstacles, the bridge was built — but wouldn’t be truly tested until a 6.9 magnitude earthquake struck the region in 1989.

Archive: Upper deck of the Bay Bridge has collapsed on Oakland side. The upper deck of the Bay Bridge has collapsed….911 emergency.

John King: What the 1989 earthquake showed was that things that seem safe can be lethal. That's when the upper deck collapsed down onto the lower deck and, miraculously, only killed one person.

Kevin Cool: Half the bridge needed to be rebuilt.

John King: It's tricky because a bridge like the Bay Bridge is such a workhorse. So many people rely on it, hundreds of thousands of people every day, and there is no room for error.

Kevin Cool: The reconstructed section opened in 2013, but until the next earthquake tests the new safeguards, I for one am a bit uneasy when I cross the bridge. And I'm sure I'm not alone.

When a new technology is introduced, sometimes the fear of calamity overshadows the excitement of what's possible, especially when the stakes are life and death. That certainly applies to artificial intelligence, which is increasingly being used in high-stakes settings — like hospitals.

This tension between innovation and safety in health care is a research subject of Mohsen Bayati. Mohsen is a professor of operations, information, and technology at Stanford Graduate School of Business. That research is our focus on today’s episode.

This is If/Then, a podcast from Stanford Graduate School of Business. I’m Kevin Cool, senior editor at the GSB.

Kevin Cool: You started your career as a mathematician and found that unsatisfying for reasons that maybe you can describe, but how did you venture from that to health care?

Mohsen Bayati: That's a very good question. So math is amazing. I would like to spend hours and hours working on math problems. Aspects of math basically become like puzzle solving: there are problems and you would like to solve them. And then at some point, at least to me, it felt like you couldn't see how the problems you're solving could benefit anyone.

So you start considering, okay, what are the other areas I can apply my math? In fact, finance was the number one. But I was thinking, if I'm picking a problem that I want to solve, I might as well pick something that I see as the most impactful — and health care seemed to be one of those areas.

Kevin Cool: So, a fair amount of your work has involved the various applications of AI in health care. First of all, what makes health care such a compelling area for AI?

Mohsen Bayati: For health care specifically, on one hand, health care costs are just a big part of our GDP. Basically what I'm trying to argue is that there's a lot more problems than solutions available. So I would say it's a field with a lot of questions, a lot of problems. So it's ripe for innovation. I think that's an important element.

Now, if I want to fast-forward, at least talking to clinicians — and I think this is applicable in many domains — when they want to make decisions about patients, they have to go through a substantial amount of data or information. They're meeting maybe 10 patients per day. So as a patient, when I go and visit them for a 30-minute visit — let's say I've seen that same doctor maybe five times, and I saw similar doctors about the same condition; imagine I have a chronic problem — that clinician ideally should have gone through all of this information from the past visits, the whole history. And a lot of it is images, text, lab results.

I've noticed myself, when I see my doctors: they're really good, but rarely do you see a clinician who is actually fully aware — like, they don't remember what happened.

Kevin Cool: I’ve always suspected this as well —

Mohsen Bayati: So you have to always remind them, “Remember, two years ago, we had this discussion…” and they look it up.

“Oh, yeah, you’re right.”

So you have to kind of guide them. Of course, they have amazing intuition, and over the years they have learned to focus on the smallest but most important pieces of information. But if they were empowered to process a lot more, they could make much more effective decisions.

So basically, clinician time is very precious, and AI has this power to process vast amounts of information from a single patient or from a lot of similar patients and empower that clinician so that, when they want to make the decision, they have a lot more information than they currently have.

Kevin Cool: So as with any technology, and I think maybe AI in particular, there is some uncertainty around how reliable it is, when and where to use it, and so on. What’s been your experience in your work with how AI is trusted among the people who are using it?

Mohsen Bayati: That is one of the big challenges of adoption of AI in health care. If I'm a clinician and I am now trusting this system that is providing me with information about uncertain outcomes, in a way I am delegating. I'm delegating my due diligence of going through every piece of the patient profile that would matter to this AI system. I have to trust it. And how do you gain trust? It's very difficult. So there are methods by which AI can gain trust among clinicians. Of course, the best one is that you really see it: if I'm a clinician and I see numerous instances where it is making the right call, say 20 times it made the right call, I actually trust it more.

Kevin Cool: So, what would be an example?

Mohsen Bayati: So one example could be, the pharmacy application that I worked on. I’m shifting gear to pharmacy, but I’ll use that one first.

As a pharmacist or a pharmacy technician using AI, I need to fill a prescription. A medication order has arrived, and let's just say it's a blood thinner. And the doctor has a direction: the patient should take this, for example, once per day, with a certain dosage specification.

But normally what clinicians write as instructions is coded in pharmacy language, and sometimes it's very brief. AI translates this. So AI can take that coded message and turn it into patient-understandable text, which is, for example, "take one tablet by mouth once daily."
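Editor's note: To make the translation step concrete, here is a minimal sketch of the kind of expansion Bayati describes. The sig codes, the lookup table, and the naive substring replacement are illustrative assumptions; a real pharmacy system would use a trained model and a controlled vocabulary.

```python
# A minimal, illustrative sketch of "sig translation": expanding a terse,
# coded prescription instruction into patient-readable directions.
# The table below is a toy example, not a real pharmacy vocabulary.

SIG_TERMS = {
    "1 tab": "one tablet",
    "2 tabs": "two tablets",
    "po": "by mouth",      # per os
    "qd": "once daily",
    "bid": "twice daily",
}

def translate_sig(sig: str) -> str:
    """Expand a coded 'sig' using naive substring replacement."""
    text = sig.lower()
    for code, plain in SIG_TERMS.items():
        text = text.replace(code, plain)
    return "Take " + text + "."

print(translate_sig("1 tab PO QD"))  # Take one tablet by mouth once daily.
```

In a deployed system, the pharmacist would still review each expansion before it reaches the patient, which is exactly the approval step Bayati describes next.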

Kevin Cool: Right.

Mohsen Bayati: So AI can make that suggestion and then the pharmacist in this case could approve this. This makes sense. This is the right direction. So now, if an AI gives me directions and even just one of them is bizarre, like for an injectable drug, it would say take that by mouth —

Kevin Cool: So it makes mistakes.

Mohsen Bayati: It makes mistakes. But with a mistake like that, the pharmacist on the receiving end would immediately lose trust in that AI system. So the trust comes from not seeing these bizarre mistakes. We now call these "hallucinations" in many of the generative AI models, but in more traditional models, these errors are a little more nuanced to detect.

So now I can give you another example, which is a problem that I worked on a few years ago.

So this was for prostate cancer patients, after they have had, say, radiation or surgery. After this major treatment, how should clinicians plan the upcoming years in terms of follow-up treatments, or what they call "insurance policy" types of treatments? So they need to know: what's the risk of the cancer coming back?

So, can AI predict, based on what it sees in the patient profile, that this patient's cancer may come back? And here it's more nuanced, because the AI will just give me a number between, say, zero and 100%: for this patient, there's an 80% chance that the cancer will recur; for another patient, there's a 20% chance.

And to make this more actionable, sometimes there's a lot of additional benefit-impact analysis done on top: this patient is green, this patient is yellow, this patient is red. And then from there you make a decision.

So now, if I'm a clinician and I see red, but then there's nothing in the file, like, this looks like a healthy patient, there is no signal. Then I start losing trust.
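Editor's note: Here is a minimal sketch of the risk-banding step Bayati describes, where a model's predicted recurrence probability is mapped to a traffic-light category. The 20% and 60% cutoffs are invented for illustration and are not clinical thresholds.

```python
# Map a predicted recurrence probability (0.0 to 1.0) to a risk band.
# Cutoffs are illustrative assumptions, not clinical values.

def risk_band(p_recurrence: float) -> str:
    if p_recurrence < 0.20:
        return "green"   # low risk: routine follow-up
    if p_recurrence < 0.60:
        return "yellow"  # intermediate risk: closer monitoring
    return "red"         # high risk: consider follow-up treatment

for p in (0.05, 0.35, 0.80):
    print(f"{p:.0%} -> {risk_band(p)}")  # 5% -> green, 35% -> yellow, 80% -> red
```

The point of the example is that the clinician sees only the band, so when a "red" patient looks healthy in the chart, the mismatch erodes trust in the model.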

Kevin Cool: So that trust presumably gets in the way, first of all, of adoption. But then also, one of the impacts of that resistance to adoption is that the AI doesn’t improve, right?

Mohsen Bayati: Yes. There are various types of AI that improve after adoption through user feedback, and those methods will be a lot harder to apply in the health care setting if the systems are not adopted. And there's another element: in tech companies, when they deploy AI, they run experiments. When I say tech companies, think, for example, of ride-sharing apps. The user interface we see gives me a wait time —

Kevin Cool: Right, when you get an Uber, it tells you it will arrive in seven minutes.

Mohsen Bayati: This is an AI system predicting that, right? So maybe they want to come up with a new AI system that is more accurate. But the only way to know it is beneficial is to test it. They call them testing phases, meaning they go live, and assuming you do this correctly, you can test whether this new AI algorithm in fact improves upon the old one. If so, you use it. But sometimes these tests reveal that the algorithm we thought was helpful is actually not helpful.

Now in health care, this in-production testing is a lot harder and for the right reasons. You know, you don’t want to just randomly assign patients treatments.

Kevin Cool: So where are we today, if you were trying to characterize the status of AI adoption and AI use by health care workers, is it improving? What are the impacts of that?

Mohsen Bayati: There are a lot of success cases. One area where there has been huge adoption is radiology. In fact, I would say three quarters of FDA-cleared AI innovations were in radiology. The reason is that the algorithms, called deep learning methods, are really good at processing images. And the trust aspect is a lot easier because there's an image, and the AI, for example, would circle where there could be a tumor. Then immediately they can look at it, and the verification is quick. If you can verify the AI's answer very quickly, it makes adoption easier. But at the same time, the algorithms were really good.

Kevin Cool: So radiology has been a success.

Mohsen Bayati: Radiology has been mostly a success.

Another area nowadays, because of generative AI: multiple times recently when I go to the doctor, they ask me, "I'm using an AI note-taking assistant. Are you comfortable with that?"

So now AI is listening in on patient and doctor conversations, when the patient is comfortable with it, and then takes notes. It saves time because after the visit, the clinician can just review those notes and edit them.

Kevin Cool: So what sort of disclosures do you think are necessary when AI is being used in a treatment setting?

Mohsen Bayati: I think the most immediate ones are privacy aspects. The patient would be immediately concerned, will my data be used to train the system? And if their data is not going to be used, they need to be told. Or if it is going to be used, how it’s going to be used. Other than that, the patient really cares about being treated. So if AI empowers a clinician to come up with a better…

Kevin Cool: They’re probably going to say yes.

Mohsen Bayati: They're going to say yes. So I think privacy is the number one aspect. Because it's not trivial: there are systems that have been built to be private, but sometimes a patient's information inadvertently leaks.

And you sometimes have to compromise, making the AI weaker, to improve privacy. It's a compromise that organizations deploying these systems need to take into account.

Kevin Cool: Well, what you’re describing now with doctors’ notes, there have been press reports about ChatGPT transcriptions making up entire paragraphs of text. So certainly it’s not infallible yet, but as it improves and trust grows, one would think that this would be a real burden lifted from physicians. And one effect would be they would have more time. And secondly, it might affect their satisfaction, because I can’t imagine that’s their favorite part of the job.

Mohsen Bayati: Yes. That's one reason anyone whose job involves a lot of repetitive writing that is not the main part of their job really enjoys using AI for it. But it requires, I would say, huge care, because the burden of verification is still on us, the users. And the more we rely on these systems, the more careful we have to be.

Kevin Cool: So given all that we’ve talked about, the adoption of AI, the issues around trust, but the potential for really transformative change in health care, what is most important for us to understand to get this all right?

Mohsen Bayati: I think the number one takeaway is that this is just an algorithm. Because very soon after interacting with this modern, powerful AI, people start getting a sense that this system knows everything. Like it's a being that has all the knowledge.

Kevin Cool: It’s omniscient.

Mohsen Bayati: So, maybe a lot of folks already know this, but I think it's always good to have a reminder that it is just a prediction algorithm.

The other aspect that I think is a little unusual: we work with software, like our phones and our computers, and if you press a button, you always get the same answer. Each button has a definite purpose.

But AI systems are not like that. They are probabilistic, meaning that if I ask AI a question today and then ask the same question again, I may get a different answer.

Kevin Cool: Mmhmm.

Mohsen Bayati: So I think we need to get used to the probabilistic nature of AI. We may start realizing that when it makes a recommendation or gives an answer, it's not necessarily the right answer. There are a lot of potential answers it can produce, and it just sampled one of them and gave it to us. There is actually randomization there. It's not really thinking; its "thinking" is just generating a sequence of words: what's the most likely word coming after the previous one? And it turns out that by coming up with a sequence of words first and then coming up with a final answer, it can give a more accurate answer.
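Editor's note: A minimal sketch of the probabilistic behavior Bayati describes. A generative model samples its next word from a probability distribution rather than always taking the single most likely one, so the same prompt can yield different answers. The tiny vocabulary and probabilities below are invented for illustration.

```python
# Sampling from a next-token distribution: the same "question" can get
# different answers on different runs. The candidates and probabilities
# here are toy assumptions, not from any real model.

import random

NEXT_TOKEN_PROBS = {"answer A": 0.5, "answer B": 0.3, "answer C": 0.2}

def sample_answer() -> str:
    """Draw one answer at random, weighted by the model's probabilities."""
    tokens = list(NEXT_TOKEN_PROBS)
    weights = list(NEXT_TOKEN_PROBS.values())
    return random.choices(tokens, weights=weights, k=1)[0]

# Three "asks" of the same question can produce three different answers.
for _ in range(3):
    print(sample_answer())
```

Real models repeat this sampling step once per word, which is why having the model write out a longer sequence of words before its final answer, as Bayati notes, can change the answer it lands on.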

Kevin Cool: Well, presumably, this speaks to the need to keep humans involved.

Mohsen Bayati: Yes. You touched on the most important point. There is this topic called AI alignment: whether the AI actually serves the goal the user intends.

And what's very evident now is that there is always misalignment. Not intentional misalignment. The AI is trained on some data, and on that data it is literally trained to predict, given a sequence of words, the next one.

But at the end of the day, it is only as good as the data it has consumed. And one challenge is that most of these AI systems still have some low-quality or even bad data in their training data. Because where does that training data come from? It comes from a lot of books and a lot of academic papers, but at the same time, a lot of blogs, Reddit pages, et cetera.

So you can find incorrect facts, misinformation, all sorts of things there as well. The AI is taught those, and apparently they don't disappear. There is a recent paper from Anthropic which says that even the post-training phase of AI algorithms, which tries to fix some of these mistakes in the data, doesn't seem to completely remove them.

Kevin Cool: So the mistakes live on.

Mohsen Bayati: Yeah, they live on in the brain of the AI, or basically in the weights of this very large neural network. And at some unpredictable opportunity, they may surface. That unpredictable opportunity can be triggered by the input prompt: the input prompt to this AI system may bring back that incorrect information.

So being aware of that as a user makes you realize you have to always put safety guardrails in place, even when these systems get really good. Many of the folks who work in this field believe that in a year or two we're going to have AI systems that are very powerful at many tasks. It's good to know that these systems could still make big mistakes, and we need to have safety guardrails in place.
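Editor's note: A minimal sketch of the kind of safety guardrail Bayati describes, applied to the earlier pharmacy example: a simple rule-based check that runs on AI output before it reaches a human reviewer. The rules are invented examples, not a real clinical safety system.

```python
# Flag AI-drafted prescription directions that conflict with the drug's form.
# The conflict table is an illustrative assumption.

ROUTE_CONFLICTS = {
    "injectable": "by mouth",  # an injectable drug should never be oral
    "topical": "by mouth",
}

def passes_guardrail(drug_form: str, ai_instruction: str) -> bool:
    """Return False if the drafted instruction is obviously impossible."""
    forbidden = ROUTE_CONFLICTS.get(drug_form)
    return forbidden is None or forbidden not in ai_instruction.lower()

print(passes_guardrail("injectable", "Take one tablet by mouth once daily"))  # False
print(passes_guardrail("oral", "Take one tablet by mouth once daily"))        # True
```

A guardrail like this doesn't make the AI smarter; it just guarantees that the specific class of bizarre mistakes that destroys user trust gets caught before anyone sees it.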

Kevin Cool: Well, and particularly in an area like health care, where the stakes are so high. You can make a mistake in a different domain, perhaps, and no one is going to die, literally. So what are some other questions or some other research topics that you have in mind? What’s the next thing you want to explore?

Mohsen Bayati: So on this topic of alignment, I mentioned there is bad training data in these AI systems. I'm trying to see whether we can mathematically understand what really happens, because at least one argument has been that as these models get bigger, the number of parameters in them gets larger, and as we give them more training data, you would expect that these problems I mentioned may disappear.

Kevin Cool: Hmm.

Mohsen Bayati: And mathematically, I've been trying recently, with a group of very bright students, to understand: is that true? Could making these networks bigger and the training data larger make these problems disappear? And our initial finding is that that's not the case. Going back to the point I was raising, the mistakes will stay even —

Kevin Cool: They persist in the data.

Mohsen Bayati: In the data. Another topic that I'm very interested in is running experiments. When you have any AI algorithm, or any intervention inside or outside health care, there is a need to test how it performs. And sometimes running these tests is very complicated.

In fact, during the pandemic, I was thinking about this question. In many fields you run A/B test experiments where, say, 1,000 patients are recruited; half of the patients get the new medication or treatment and the other half get a placebo. Is the treated group better off or not?

But for a contagious disease like COVID, you can't do that, because the treated patients are potentially going to benefit the untreated patients, since this is a contagious disease. So we cannot actually measure the difference correctly. That became a very interesting question.
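Editor's note: A minimal simulation of the interference problem Bayati describes. When treating one patient also protects untreated patients, a standard A/B comparison understates the true effect. All rates below are invented for illustration.

```python
# Simulate a vaccine-style A/B test with spillover: everyone's infection
# risk falls as more people are treated, so the treated-vs-control gap
# understates the true effect of treating everyone. Rates are toy numbers.

import random

def infection_prob(treated_fraction: float, treated: bool) -> float:
    base = 0.30                          # baseline infection probability
    direct = 0.15 if treated else 0.0    # direct protection from treatment
    spillover = 0.10 * treated_fraction  # protection from others being treated
    return base - direct - spillover

def measured_effect(n: int = 100_000) -> float:
    random.seed(0)
    p_treated = infection_prob(0.5, treated=True)   # 0.10 in a 50/50 trial
    p_control = infection_prob(0.5, treated=False)  # 0.25 in a 50/50 trial
    t = sum(random.random() < p_treated for _ in range(n)) / n
    c = sum(random.random() < p_control for _ in range(n)) / n
    return c - t

# The experiment measures roughly 0.15, but the true "treat everyone vs.
# treat no one" effect is 0.30 - 0.05 = 0.25: spillover hides part of it.
print(f"measured effect: {measured_effect():.3f}")
```

This is why, as Bayati says, the difference between the two groups cannot simply be read off as the treatment effect when the disease is contagious.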

Kevin Cool: So, what would you want to communicate to health care leaders?

Mohsen Bayati: Sometimes the benefits may not materialize immediately, because anytime humans interface with AI, initially there's a big change in the system. There may be negative outcomes: quality of care, for example, may drop, or the efficiency of processing prescriptions may drop. Or sometimes, and this is the most common one, we think we deployed the AI system correctly when we didn't.

So basically, they need to have patience with the benefits of these AI systems. But at the same time, they need specific guardrails: if we are deploying something, how do we make sure the quality of care doesn't drop during the deployment? So it's a leap of faith as well as safety guardrails.

Kevin Cool: Faith, vigilance, patience, right?

Mohsen Bayati: Yes.

Kevin Cool: So we're going to ask you to make a prediction, Mohsen. It's 2050, so we're 25 years down the road. How much do you think we'll be relying on AI, and what will the health care landscape look like as a result?

Mohsen Bayati: It's very difficult to make that prediction. I made a prediction fifteen years ago: I thought that within five years there would be a lot more AI usage than there was at the time. And I was far off, meaning the adoption was a lot slower than I thought.

However, what I've seen in the last two, two and a half years, since the release of ChatGPT and generative AI, is unprecedented. There is huge progress.

Kevin Cool: And the velocity of improvement seems to be increasing.

Mohsen Bayati: Exactly, to the level that the progress is now assisted by AI itself; the researchers are using AI to do the research. So with that, you would lean further in the positive direction. It's hard to guess, but I'm expecting we will rely hugely on AI.

Kevin Cool: Well, thanks, Mohsen. This is really an interesting area and good luck with your work. I think there’s a lot riding on this for all of us.

Mohsen Bayati: Thank you. It’s been a pleasure.

John King: As romantic as ferries seem now, they were seen as the old technology then. It would've been the equivalent of saying, well, gee, we all love riding horses, it's so nice, so let's not build automobiles.

Kevin Cool: Our show is written and produced by Making Room and the Content and Design team at the GSB. Our managing producers are Michael McDowell and Elizabeth Wyleczuk-Stern. Executive producers are Sorel Husbands Denholtz and Jim Colgan. Sound design and additional production support from Mumble Media and Aech Ashe.

Special thanks to John King, architecture critic and author of “Portal: San Francisco’s Ferry Building and the Reinvention of American Cities.”

For more on our faculty and their research, find Stanford GSB online at gsb.stanford.edu or on social media @stanfordgsb. If you enjoyed today’s conversation, consider sharing it with a friend or colleague. And remember to subscribe to If/Then wherever you get your podcasts or leave us a review. It really helps other listeners find the show.

We’d also love to hear from you. Is there a subject you’d like us to cover? Something that sparked your curiosity? Or a story you’d like to share? Email us at if then pod at stanford dot edu. That’s i f, t h e n, p o d at stanford dot edu.

Thanks for listening. We’ll be back with another episode soon.

