
Landmark dataset to accelerate research on anterior segment eye disease

  • INSIGHT communications team
  • 8 hours ago
  • 4 min read

Diseases affecting the anterior segment, the front part of the eye, are among the leading causes of visual impairment and blindness globally, yet few large-scale datasets have been available for these conditions. In response, clinical researchers at Moorfields Eye Hospital NHS Foundation Trust and University College London (UCL) Institute of Ophthalmology have developed the world’s most comprehensive anterior segment dataset, CADMUS, comprising more than 900,000 images linked to clinical records.

 

Cataract and loss of corneal transparency are two of the leading causes of blindness in adults, while in children, loss of corneal transparency is the third most common cause of blindness globally [1]. Other anterior segment conditions, such as keratoconus, disproportionately affect young people and have no cure, so early detection and close monitoring are needed to prevent permanent vision loss. However, the health data research tools needed to investigate such conditions have lagged significantly behind.

 

Fewer than 10% of publicly available ophthalmic imaging datasets address the anterior segment of the eye, according to a 2021 review in The Lancet Digital Health [2]. More recently, a 2024 review identified 26 anterior segment datasets globally [3]. Of these, 80% originated in the US, China, or Europe, more than half consisted predominantly of images of normal healthy eyes, and most were incompletely described. The median dataset size across the broader ophthalmic imaging literature was just 50 patients.

 

In comparison, the CADMUS dataset comprises 945,243 images from 22,482 unique patients, collected during routine clinical care at Moorfields Eye Hospital between December 2019 and September 2024. Approximately 96% of patients have multiple follow-up visits, facilitating the study of disease progression and long-term outcomes. The data reflects the demographic diversity of the Moorfields patient population, with broad representation across ethnicities and age groups typical of London.

 

Data collection and curation were enabled by INSIGHT, the Eye and Oculomics Health Data Research Hub at Moorfields, which is the world’s largest bioresource of ophthalmic imaging data linked to clinical records. Through INSIGHT, researchers will be able to apply for access to CADMUS, which has been published as a datasheet in Ophthalmology Science.

 

CADMUS captures a broad spectrum of anterior segment pathology, including keratoconus, Fuchs' corneal dystrophy, corneal scarring and cataracts. The dataset details more than 40,000 surgical records from more than 12,000 patients, spanning cataract operations, corneal cross-linking, corneal transplant, keratectomy and other corneal procedures. Systemic co-morbidities are also captured, including hypertension and type 2 diabetes.

 

The dataset integrates three complementary data types:

  • Raw DICOM images from the MS-39 anterior segment OCT tomographer, including high-resolution radial OCT cross-sections, Placido-disc corneal topography maps, and external eye photographs

  • Derived quantitative indices, including keratometry values, pachymetry, wavefront aberrometry, and AI-generated classifier scores for keratoconus and related conditions

  • Linked electronic health record data, covering demographics, diagnoses, surgical procedures, visual acuity, and refraction measurements

Representative MS-39 imaging output showing Placido-disc–based anterior corneal topography, radial swept-source OCT cross-section, and external eye photograph. (Reproduced with author permission under Creative Commons Attribution 4.0 International Licence.)

 

CADMUS is intended to support research into early disease detection, surgical outcome prediction, health equity analysis, and the development of AI tools that can be deployed in real-world ophthalmic practice. The need for resources of this kind is widely recognised. The European Society of Cataract and Refractive Surgeons (ESCRS), which supported development of CADMUS, has committed to annually updating a catalogue of available anterior segment datasets through 2030, acknowledging that the field has, until now, been severely under-resourced.

 


Dr Shafi Balal

Datasheet lead author Dr Shafi Balal said: “Early research using CADMUS data has already produced promising results. We have used the dataset to establish precision limits for keratoconus progression measurement, providing a scientifically grounded basis for defining disease progression. We have also trained deep learning models on CADMUS. One model can predict patient age and biological sex from anterior segment scans, demonstrating that routine clinical images carry rich biological signals invisible to the human eye.” Dr Balal is an ophthalmic surgeon at Moorfields Eye Hospital and NIHR doctoral fellow at UCL.

 

Senior author Professor Bruce Allan said: "CADMUS exemplifies what is possible when world-class clinical care, robust data infrastructure, and a commitment to open science come together. INSIGHT at Moorfields is helping to translate the NHS's extraordinary patient data into research breakthroughs, and CADMUS is a prime example. No comparable publicly accessible anterior segment dataset offers longitudinal depth at this scale." Professor Allan is consultant ophthalmic surgeon at Moorfields Eye Hospital and honorary professor at UCL.

 

Extraction, anonymisation, and structuring of clinical data in CADMUS were enabled through INSIGHT's secure cloud-based environment. All patient data is irreversibly anonymised using cryptographic methods, and the dataset operates within an ethical framework approved by the West of Scotland Research Ethics Service. The NHS National Data Opt-Out programme is fully implemented, ensuring patient rights are protected throughout.

 

Researchers wishing to access CADMUS can do so through INSIGHT's established Data Use Application process, which includes oversight from an independent patient and public advisory board, and applies the internationally recognised "Five Safes" framework evaluating safe projects, safe people, safe data, safe settings, and safe outputs.

 

Find the CADMUS datasheet in Ophthalmology Science


Development of CADMUS has been supported by the NIHR, ESCRS, and T.F.C. Frost Charitable Trust. NIHR and Moorfields Eye Charity are supporting Dr Balal in development of an AI foundation model for the anterior segment trained on CADMUS data.



Contact the INSIGHT team to enquire about access to CADMUS: enquiries@insight.hdrhub.org


 
 