Doctors: 74; Doctors+AI: 76; AI Alone: 90 — Are Humans the Problem?

Source: LinkedIn (Allie Miller)

Background:

My work over the last 10 years has focused on human thinking in the age of AI. Initially, I called it “distributed intelligence,” a framing that considers human and machine thinking together.

However, there is one big hurdle to human-machine thinking: humans, as shown so clearly in this research, published on October 28, 2024, in JAMA Network Open (a journal of the American Medical Association).

The study design is depicted visually here:

Source: JAMA Network Open

Key results:

Fifty physicians (26 attendings, 24 residents; median of 3 years in practice) participated, virtually and at one in-person site. The median diagnostic reasoning score per case was 76% for the LLM group and 74% for the conventional-resources-only group.

Comparing the LLM alone with the control group showed an absolute score difference of 16 percentage points in favor of the LLM alone (90% vs. 74%).

So, bottom line: doctors alone, 74%; doctors + AI, 76%; AI alone, 90%.

(Note: the full paper also reports time savings in the doctors+AI group, but that is beyond the scope of this spark.)

Conclusion: The Human Factor

Allie Miller, who reported this research, suggests the following: 

  1. Overconfidence: Doctors often ignore ChatGPT’s correct diagnoses when they conflict with their own. How can we get AI to explain its reasoning and influence decisions more effectively without being manipulative?
  2. Underuse: Doctors are undertrained in AI and treat it like a fancy Google search (rather than pasting in the whole patient history and “talking” to the data; a minimal sketch of that pattern follows this list).
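
To make the contrast concrete, here is a minimal sketch of that “whole history in context” pattern, assuming the OpenAI Python SDK; the model name, prompts, and patient details are illustrative placeholders, not drawn from the study:

```python
# A sketch of "talking to the data": instead of search-style keyword
# queries, paste the entire (de-identified) case into the conversation.
# Assumes the OpenAI Python SDK; model and patient fields are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

patient_history = """
58-year-old male. Chief complaint: intermittent chest pain for 2 weeks.
History: type 2 diabetes, hypertension. Meds: metformin, lisinopril.
Labs: troponin negative x2, LDL 162 mg/dL. ECG: nonspecific ST changes.
"""  # illustrative, not a real case

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any capable chat model
    messages=[
        {"role": "system",
         "content": "You are a clinical reasoning assistant. "
                    "Give a ranked differential diagnosis and explain "
                    "the reasoning behind each candidate."},
        {"role": "user", "content": patient_history},
    ],
)
print(response.choices[0].message.content)

# Because the full history is in context, follow-up turns can probe the
# model's reasoning ("why is ACS still on your list?") rather than
# restarting a fresh keyword search each time.
```

In practice, any real patient history would of course need to be de-identified before being sent to an external API.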

The more significant lesson is that driving a car is very different from riding a horse; the rules, agreements, and skills (and even the licenses) differ. By analogy, the rules, methods, and skills for working with AI must be developed before we can gain its full value.

See more: 

  1. The full current study
  2. A similar earlier study (28-Apr-23)
  3. List of Human Cognitive Biases (Source: Wikipedia)
