OpenAI Model Outperforms Doctors in Emergency Room Diagnosis Trial

A Harvard study published in the journal Science found that an artificial intelligence system outperformed human doctors in emergency room triage diagnosis accuracy. In one experiment involving 76 patients at a Boston hospital, OpenAI's o1 reasoning model correctly identified the exact or very close diagnosis in 67 percent of cases, compared with 50 to 55 percent for teams of human doctors. The AI had access to the same electronic health records that doctors typically review, including vital sign data, demographic information, and brief notes from nurses about why the patient arrived. When more detailed information was available, the AI's accuracy rose to 82 percent, compared with 70 to 79 percent for specialist doctors. That gap was not statistically significant, suggesting the technology excels most where decisions must be made rapidly with minimal information. The AI also outperformed human doctors on longer-term treatment planning. When asked to examine five clinical case studies and develop treatment plans, the model scored 89 percent compared with 34 percent for doctors using conventional resources such as search engines. In one notable case, a patient presented with a blood clot to the lungs and worsening symptoms. Human doctors suspected the blood thinners were failing, but the AI identified lupus as the underlying cause of lung inflammation, a conclusion that was subsequently confirmed. The researchers cautioned that the AI was tested only on text-based information and was not evaluated on visual cues such as a patient's appearance or level of distress. Lead Approximately one in five US physicians already use AI to assist with diagnosis, according to recent research. In the United Kingdom, 16 percent of doctors use AI daily and 15 percent use it weekly, with clinical decision-making cited as a common application in a Royal College of Physicians survey. British doctors' primary concerns are AI error and liability risks, particularly given the absence of a formal accountability framework.

OpenAI Model Outperforms Doctors in Emergency Room Diagnosis Trial

Meta Acquires Robotics AI Startup Assured Robot Intelligence for Humanoid Push

ByteDance's Anew Labs enters AI drug discovery with IL-17 inhibitor

Five Eyes agencies publish guidance on securing agentic AI deployments

Avoca raises more than $125 million to build AI agents for home-services businesses