As regular readers well know, I get very frustrated when people use the verb “to reason” in describing the behavior of large language models (LLMs). Sometimes that’s just verbal shorthand, but both in print and in person I keep running into examples of people who really, truly, believe that these things are going through a reasoning process. They are not. None of them. (Edit: for a deep dive into this topic, see this recent paper).
To bring this into the realm of medical science, have a look at this paper from earlier this year. The authors evaluated six different LLM systems on their ability to answer 68 medical questions. The crucial test, though, was that each question was asked twice, in two different ways. Every prompt started by saying “You are an experienced physician. Provide detailed step-by-step reasoning, then conclude with your final answer in exact format Answer: [Letter]” The prompt was written that way because each question consisted of a detailed medical query followed by a list of possible options/diagnoses/recommendations, each labeled with a letter, and the LLM was asked to choose among them.
The first time the question was asked, one of the five options was “Reassurance”, i.e. “Don’t do any medical procedure, because this is not actually a problem”. Any practicing physician will recognize this as a valid option at times! But the second time the exact same question was posed, the “Reassurance” option was replaced by a “None of the other answers” option. Now, the step-by-step clinical reasoning that one would hope for should not be altered in the slightest by that change, and if “Reassurance” was in fact the correct answer, then “None of the other answers” should be the correct answer when the question was phrased the second way (rather than the range of surgical and other interventions proposed in the other choices).
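For concreteness, here is a minimal sketch of what that paired test looks like in code. This is my illustration, not the paper’s actual evaluation harness: ask_llm() is a hypothetical stand-in for whichever chat API you have access to, and the question stem and options are placeholders; only the instruction text above comes from the paper.

```python
# Minimal sketch of the paired-prompt test described above. Illustrative only:
# ask_llm() is a hypothetical wrapper around whatever chat API you use, and the
# stem/options are placeholders, not items from the paper's actual question set.
import re

SYSTEM = ("You are an experienced physician. Provide detailed step-by-step "
          "reasoning, then conclude with your final answer in exact format "
          "Answer: [Letter]")

def build_prompt(stem, options):
    # options is a list of (letter, text) pairs, e.g. [("A", "Reassurance"), ...]
    return "\n".join([stem] + [f"{letter}. {text}" for letter, text in options])

def extract_letter(response):
    # Pull the final "Answer: X" out of the model's free-text reply
    m = re.search(r"Answer:\s*\[?([A-E])\]?", response)
    return m.group(1) if m else None

def paired_test(ask_llm, stem, options, reassurance_letter):
    # Variant 1: one of the options is "Reassurance"
    first = extract_letter(ask_llm(SYSTEM, build_prompt(stem, options)))
    # Variant 2: identical question, with that option swapped for
    # "None of the other answers"
    swapped = [(l, "None of the other answers" if l == reassurance_letter else t)
               for l, t in options]
    second = extract_letter(ask_llm(SYSTEM, build_prompt(stem, swapped)))
    # Genuine step-by-step reasoning should return the same letter both times
    return first, second
```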
Instead, the accuracy of the answers across all 68 questions dropped notably in every single LLM system when the “None of the other answers” option was presented. DeepSeek-R1 was the most resilient, but it still degraded. The underlying problem is clear: no reasoning is going on, despite some of these systems being billed as having reasoning ability. Instead, this is all pattern matching, which presents the illusion of thought and the illusion of competence.
This overview at Nature Medicine covers a range of such problems. The authors here find that the latest GPT-5 version does in fact make fewer errors than other systems, but that’s like saying that a given restaurant has fewer cockroaches floating in its soup than its competitors do. That’s my analogy, not theirs. The latest models hallucinate a bit less than before and break their own supposed rules a bit less, but neither of these problems has been reduced to an acceptable level. The acceptable level of cockroaches in the soup pot is zero.
As an example of that second problem, the authors note that GPT-5, like all the other LLMs, will violate its own instructional hierarchy to deliver an answer, without warning users that this has happened. Supposed safeguards and rules at the system level can and do get disregarded as the software rattles around searching for plausible text to deliver, a problem which is explored in detail here. This is obviously not a good feature in an LLM that is supposed to be dispensing medical advice - as the authors note, such systems should have high-level rules that are never to be violated, things like “Sudden onset of chest pain = always call for emergency evaluation” or “Recommendations for dispensing drugs on the attached list must always fit the following guidelines”. But at present it seems impossible for that “always” to actually stick under real-world conditions. No actual physician whose work was this unreliable would or should be allowed to continue practicing.
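The authors don’t propose an implementation, but it’s worth spelling out what a rule that can never be violated would have to look like: a deterministic check sitting outside the text generator entirely, rather than an instruction in a system prompt that the model may or may not honor. A toy sketch, with made-up trigger terms of my own:

```python
# Toy sketch only - not from the paper. The point is that a hard rule has to live
# in ordinary deterministic code wrapped around the model, not in a system prompt.
RED_FLAG_TERMS = ("sudden onset of chest pain", "crushing chest pain")  # made-up list
EMERGENCY_ADVICE = ("Sudden-onset chest pain can be an emergency. "
                    "Call for emergency evaluation now rather than relying on this tool.")

def apply_hard_rules(user_message: str, llm_response: str) -> str:
    """Post-check the model's output against rules that must always hold."""
    if any(term in user_message.lower() for term in RED_FLAG_TERMS):
        # This branch cannot be "talked out of" by anything the model generated
        return EMERGENCY_ADVICE
    return llm_response
```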
LLMs are text generators, working on probabilities of what their next word choice should be based on what has been seen in their training sets, then dispensing answer-shaped nuggets in smooth, confident, grammatical form. This is not reasoning and it is not understanding - at its best, it is an illusion that can pass for them. And that’s what it is at its worst, too.
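To make that mechanism concrete, here is a toy version of next-word generation: sample the next word from a probability table conditioned on the recent context. A real LLM computes its probabilities with a transformer over a vast vocabulary rather than from a hand-written table like the one below, but the loop is the same shape, and nowhere in it is anything you could call reasoning.

```python
# Toy illustration of next-word generation: sample from a probability distribution
# conditioned on recent context. The hand-written table stands in for what a real
# LLM computes with a trained transformer; it is purely illustrative.
import random

NEXT_WORD_PROBS = {
    ("chest", "pain"): [("requires", 0.4), ("suggests", 0.35), ("resolves", 0.25)],
    ("pain", "requires"): [("urgent", 0.6), ("no", 0.4)],
    ("pain", "suggests"): [("angina", 0.5), ("reflux", 0.5)],
}

def generate(prompt_words, max_words, probs=NEXT_WORD_PROBS):
    words = list(prompt_words)
    for _ in range(max_words):
        context = tuple(words[-2:])          # condition on the two most recent words
        choices = probs.get(context)
        if not choices:                      # no plausible continuation in the table
            break
        tokens, weights = zip(*choices)
        words.append(random.choices(tokens, weights=weights)[0])
    return " ".join(words)

print(generate(["chest", "pain"], 3))  # fluent-looking output, no understanding behind it
```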