Google's AI mammography studies show real promise for NHS screening workflows

The UK’s NHS Breast Screening Programme is in a bind. It relies on double-reading — two human radiologists review each mammogram, with arbitration if they disagree — and that’s been effective. But there’s a 30% shortfall of clinical radiologists right now, and it’s projected to hit 40% by 2028. That’s not sustainable.

Google Research has been poking at this problem for a while. This month, they published two companion studies in Nature Cancer (shared earlier in March 2026) that look at how their AI-based breast cancer detection system might fit into existing workflows. The work was done in partnership with several NHS organizations as part of the AIMS study.

The first study is split into two phases. Phase 1 is a retrospective evaluation of the AI system’s standalone performance across 125,000 women (115,973 after filtering) screened at five NHS services. These services used three different clinical workflows — varying whether the second reader was blinded to the first and how arbitration cases were selected. The AI operating points were tuned per service to account for local population and workflow differences.
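To make "operating point" concrete: it is the score cutoff above which the AI flags a case. The post's source material doesn’t spell out how the tuning was done, so the following is only a sketch of one common approach, which picks the cutoff per service so that a target share of cancer-free cases falls at or below it. The function name, service names, scores, and the 0.75 target are all hypothetical.

```python
import math

def threshold_for_specificity(negative_scores, target_specificity):
    """Return the lowest observed score t such that at least
    target_specificity of cancer-free cases score <= t.
    Cases scoring above t would be flagged for recall."""
    if not negative_scores:
        raise ValueError("need at least one cancer-free case")
    ordered = sorted(negative_scores)
    # Index of the first score that covers the target share of negatives.
    k = max(math.ceil(target_specificity * len(ordered)) - 1, 0)
    return ordered[k]

# Hypothetical AI scores for cancer-free cases at two made-up services.
services = {
    "service_a": [0.05, 0.10, 0.20, 0.60],
    "service_b": [0.15, 0.30, 0.40, 0.90],
}
thresholds = {
    name: threshold_for_specificity(scores, 0.75)
    for name, scores in services.items()
}
print(thresholds)
```

The two services end up with different cutoffs for the same target, which is the point of tuning per site: the score distribution depends on the local population and workflow.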

The primary endpoints measured the AI’s sensitivity and specificity against the historical first reader. The ground truth here is solid: a 39-month follow-up window, long enough to capture interval cancers and cancers detected at the next screening round. They also looked at lesion-level localization (making sure the AI wasn’t just guessing but actually pointing to the right spot in the breast) and ran fairness analyses. This phase was purely retrospective, so it had no effect on patient care.
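For anyone who wants the two endpoints spelled out, here is a minimal sketch of how sensitivity and specificity fall out of confusion-matrix counts. The function name and the counts are made up for illustration and are not figures from the papers.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one cancer case and one cancer-free case")
    sensitivity = tp / (tp + fn)  # share of cancers that were flagged
    specificity = tn / (tn + fp)  # share of cancer-free cases left unflagged
    return sensitivity, specificity

# Hypothetical counts, NOT taken from the studies.
sens, spec = sensitivity_specificity(tp=80, fn=20, tn=9000, fp=900)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

The length of the follow-up window matters here because it decides which cases count as cancers in the first place: a cancer that surfaces after the window closes would be counted as cancer-free.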

Phase 2 was a prospective, non-interventional deployment study to see how the live system would actually integrate into real clinical workflows. This is the sort of thing that often gets skipped in AI research, so I’m glad they did it. The results suggest the system can be slotted in without major disruption, but the paper acknowledges there’s more work needed before this is ready for prime time.

The second study is an end-to-end reader study comparing the original double-read-and-arbitration process to one where the AI system serves as the second reader. This is the more interesting one from a workflow perspective. If AI can replace one of the human readers without sacrificing accuracy, that’s a direct solution to the staffing shortage. The results are promising, but again, the authors are careful not to overclaim — they say additional work is needed to prove effectiveness in prospective clinical practice.
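The workflow being compared can be sketched in a few lines. This is a deliberately simplified model that assumes arbitration happens only when the two reads disagree (the services in the study varied on this), and every name, score, and threshold below is hypothetical.

```python
from typing import Callable

def double_read(first_recall: bool, second_recall: bool,
                arbitrate: Callable[[], bool]) -> bool:
    """Return True if the case is recalled.
    Agreement between the two reads stands; disagreement goes to arbitration."""
    if first_recall == second_recall:
        return first_recall
    return arbitrate()

def ai_second_read(ai_score: float, threshold: float) -> bool:
    """Treat the AI as the second reader: flag when the score exceeds the cutoff."""
    return ai_score > threshold

# Hypothetical case: the human reader says no recall, the AI flags it,
# and the arbitrator (stubbed here as a lambda) decides to recall.
decision = double_read(
    first_recall=False,
    second_recall=ai_second_read(ai_score=0.82, threshold=0.5),
    arbitrate=lambda: True,
)
print(decision)
```

Swapping the AI in changes only where `second_recall` comes from; the human first read and the arbitration step stay as they are.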

What I like about this is the scale and the real-world focus. A lot of AI-in-medicine papers are small lab studies that don’t translate. These studies actually went into multiple NHS sites with different workflows and tested the system against a rigorous ground truth. The lesion-level localization analysis is also a nice touch — it addresses the “black box” criticism that AI might be finding spurious correlations rather than actual pathology.

That said, I have two gripes. First, the retrospective phase is still retrospective. Retrospective studies can overestimate AI performance because the ground truth is known and the data is curated. The prospective deployment phase helps, but it was non-interventional — meaning the AI wasn’t actually changing clinical decisions. We need an interventional trial where the AI’s recommendations are acted upon before we can really trust it.

Second, the fairness analyses are mentioned but not detailed in the summary. Given the well-documented biases in medical AI — especially across different demographic groups — I’d want to see the numbers. Is the AI equally accurate across different ethnicities, breast densities, and age groups? If not, we’re just swapping one problem for another.
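The kind of breakdown I have in mind is easy to sketch: compute sensitivity separately for each subgroup and compare. The group labels and records below are invented for illustration, and this is not a description of how the authors ran their fairness analyses.

```python
from collections import defaultdict

def sensitivity_by_group(records):
    """records: iterable of (group, has_cancer, ai_flagged) tuples.
    Returns {group: sensitivity} for groups with at least one cancer case."""
    flagged_cancers = defaultdict(int)
    cancers = defaultdict(int)
    for group, has_cancer, ai_flagged in records:
        if has_cancer:
            cancers[group] += 1
            if ai_flagged:
                flagged_cancers[group] += 1
    return {g: flagged_cancers[g] / cancers[g] for g in cancers}

# Hypothetical records, NOT study data.
records = [
    ("dense", True, True),
    ("dense", True, False),
    ("non_dense", True, True),
    ("non_dense", True, True),
    ("non_dense", False, False),
]
by_group = sensitivity_by_group(records)
print(by_group)
```

A real analysis would also need confidence intervals, since small subgroups can show large gaps by chance alone.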

Still, this is solid work. The NHS needs solutions, and AI-assisted double-reading could be a big part of that. The next step is a proper prospective interventional study. I’ll be watching for that one.
