OpenAI, Google, and Anthropic announced specialized medical AI capabilities within days of each other this month, a clustering that suggests competitive pressure rather than coincidental timing. Yet despite marketing rhetoric about transforming healthcare, none of the products has been cleared as a medical device, approved for clinical use, or made available for direct patient diagnosis.
OpenAI introduced ChatGPT Health on January 7, letting US users connect their medical records through partnerships with b.well, Apple Health, Function, and MyFitnessPal. Google released MedGemma 1.5 on January 13, expanding its open medical AI model to interpret 3D CT and MRI scans along with whole-slide histopathology images.
Anthropic followed with Claude for Healthcare on January 11, adding HIPAA-compliant connectors to the CMS coverage database, the ICD-10 coding system, and the National Provider Identifier Registry.
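As an illustration of the kind of identifier handling such connectors perform, the NPI format itself is public: a 10-digit number whose final digit is a Luhn check digit computed over the number prefixed with “80840” (the CMS-assigned issuer prefix). A minimal validation sketch follows; the function name is ours, not part of any vendor API.

```python
def npi_check_digit_ok(npi: str) -> bool:
    """Validate a 10-digit NPI via the Luhn algorithm over the '80840' prefix."""
    if len(npi) != 10 or not npi.isdigit():
        return False
    digits = [int(c) for c in "80840" + npi]
    check = digits.pop()  # last digit is the check digit
    total = 0
    # Double every second digit, starting from the rightmost payload digit
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # equivalent to summing the two digits of the product
        total += d
    return (total + check) % 10 == 0

# CMS's documented example NPI, 1234567893, passes this check.
```

A lookup service would still need to confirm the identifier exists in the NPPES registry; the checksum only catches malformed or mistyped numbers.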
All three companies target the same workflow pain points (pre-authorization review, claims processing, clinical documentation) with similar technology approaches, but with different go-to-market strategies.
Developer platform rather than diagnostic product
The architectural similarities are notable. Each system uses multimodal large language models fine-tuned on medical literature and clinical datasets. Each emphasizes privacy protections and regulatory disclaimers. Each is positioned to support rather than replace clinical judgment.

The difference lies in the deployment and access models. OpenAI’s ChatGPT Health operates as a consumer service with a waiting list for ChatGPT Free, Plus, and Pro subscribers outside the EEA, Switzerland, and the United Kingdom. Google’s MedGemma 1.5 was released as an open model through the Health AI Developer Foundations program and can be downloaded through Hugging Face or deployed through Google Cloud’s Vertex AI.
Anthropic’s Claude for Healthcare integrates into existing enterprise workflows through Claude for Enterprise and targets institutional buyers rather than individual consumers. The regulatory position is consistent across all three.
OpenAI clearly states that Health is “not intended for diagnostic or therapeutic purposes.” Google positions MedGemma as a “starting point for developers to evaluate and adapt to medical use cases.” Anthropic emphasizes that the output is “not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or other direct clinical practice applications.”

Benchmark performance and clinical validation
Although medical AI benchmark results improved significantly across all three releases, the gap between test performance and clinical deployment remains wide. Google reports that MedGemma 1.5 achieved 92.3% accuracy on MedAgentBench, Stanford University’s medical agent task-completion benchmark, compared to 69.6% for the earlier Claude 3.5 Sonnet baseline.
The model also improved MRI disease classification by 14 percentage points and CT findings classification by 3 percentage points on internal evaluations. Anthropic’s Claude Opus 4.5 scored 61.3% on MedCalc, a medical calculation accuracy test, and 92.3% on MedAgentBench with Python code execution enabled.
The company also claims improved faithfulness scores relating to factual hallucinations, although it did not disclose specific metrics.
OpenAI has not published benchmark comparisons for ChatGPT Health, instead noting that “more than 230 million people worldwide ask health and wellness-related questions on ChatGPT every week,” based on anonymized analysis of existing usage patterns.
These benchmarks measure performance on curated test datasets, not real-world clinical outcomes. Medical errors can have life-threatening consequences, and translating benchmark accuracy into clinical utility is more complex here than in other AI application domains.
Regulatory pathways remain unclear
The regulatory framework for these medical AI tools remains ambiguous. In the United States, FDA oversight depends on the intended use. Software that “supports or makes recommendations to health care professionals regarding the prevention, diagnosis, or treatment of disease” may require premarket review as a medical device. None of the tools announced have been cleared by the FDA.
The liability question is similarly unresolved. Mike Reagin, Banner Health’s CTO, said the health system was “attracted to Anthropic’s focus on AI safety,” a comment that speaks to technology selection criteria rather than liability frameworks.
Existing case law offers only limited guidance on how liability would be apportioned if a clinician relied on Claude’s prior-authorization analysis and a patient were harmed by a resulting delay in treatment.
Regulatory approaches vary widely by market. While the FDA and European medical device regulations provide well-established frameworks for software as a medical device, many APAC regulators have yet to issue specific guidance on generative AI diagnostic tools.
This ambiguity affects deployment timelines: in markets where healthcare infrastructure gaps may accelerate adoption, it creates tension between clinical need and regulatory caution.
Administrative workflow rather than clinical decision making
Actual deployments remain deliberately cautious. Louise Lindskopf, Novo Nordisk’s director of content digitization, said Claude is used for “document and content automation in pharmaceutical development,” with a focus on regulatory submissions rather than patient diagnoses.
Taiwan’s National Health Insurance Bureau applied MedGemma to extract data from 30,000 pathology reports for policy analysis rather than treatment decisions.
This pattern suggests that healthcare adoption is concentrating on administrative workflows (billing, documentation, protocol writing), where errors are not immediately dangerous, rather than on direct clinical decision support, where medical AI could most dramatically affect patient outcomes.
Medical AI capabilities are advancing faster than the institutions implementing them can navigate the complexities of regulation, liability, and workflow integration. The technology exists. A $20/month subscription gives you access to advanced medical reasoning tools.
Whether it will lead to a transformation in healthcare delivery depends on the questions these coordinated announcements leave unanswered.

