
First AI Model Developed for Diagnosis of Hip Dysplasia From Plain Radiographs

Key findings

  • In this study, two deep learning models were developed to diagnose hip dysplasia from plain radiographs and classify its severity, the first application of artificial intelligence to that task
  • Both models were tested on the same 103 radiographs from adults who had undergone total hip arthroplasty, with a center edge angle ≤20° considered dysplastic and an angle of 20° to 25° considered borderline
  • Both models struggled to distinguish between normal and borderline hips, but they were 83% to 92% accurate in classifying hips as dysplastic or non-dysplastic
  • The models were also successful in determining the severity of hip dysplasia according to the Crowe or Hartofilakidis classifications; most misclassifications were ±1 class
  • The models could be used by professionals who lack experience in recognizing hip dysplasia, as well as for automated diagnosis and classification in large institutions

Hip dysplasia is a risk factor for hip osteoarthritis, but a recent study published in Acta Orthopaedica showed that general radiologists overlooked the condition in 93% of cases. Furthermore, several studies have shown pronounced disagreement among orthopedic specialists asked to diagnose hip dysplasia from the same radiographs.

Deep learning, a subcategory of artificial intelligence concerned with image analysis and pattern recognition, has been used to build models for the diagnosis of several orthopedic disorders. Now, researchers at Massachusetts General Hospital have developed the first deep learning models that diagnose hip dysplasia from plain radiographs and classify its severity.

Martin Magnéli, MD, PhD, of the Harris Orthopaedics Laboratory in the Mass General Department of Orthopaedic Surgery and of the Karolinska Institutet, Kartik M. Varadarajan, PhD, formerly an assistant professor in the Department of Orthopaedic Surgery at Harvard Medical School, Orhun K. Muratoglu, PhD, director of the Harris Orthopaedics Laboratory, and colleagues report the findings in BMC Musculoskeletal Disorders.

Methods

The researchers obtained 1,022 anteroposterior pelvis radiographs of adults who had undergone total hip arthroplasty and divided them into training, validation, and test subsets (n=816, 103 and 103 radiographs, respectively). The hips were assessed for dysplasia by measuring the center edge (CE) angle. An angle ≤20° was considered dysplastic and an angle of 20° to 25° was considered borderline.
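The thresholds and subset sizes above can be written down as a minimal Python sketch. It is an illustration only, not the researchers' code: the names `HipStatus`, `classify_ce_angle`, and `SPLIT_SIZES` are hypothetical, and treating any angle above 25° as normal is an assumption, since the article does not state the normal range explicitly. Because an angle of exactly 20° falls in both stated ranges, the sketch assigns it to the dysplastic group, following the "≤20°" rule.

```python
from enum import Enum


class HipStatus(Enum):
    DYSPLASTIC = "dysplastic"
    BORDERLINE = "borderline"
    NORMAL = "normal"


def classify_ce_angle(ce_angle_deg: float) -> HipStatus:
    """Map a center edge (CE) angle in degrees to a hip status."""
    if ce_angle_deg <= 20.0:
        # Stated in the article: an angle of 20 degrees or less is dysplastic.
        return HipStatus.DYSPLASTIC
    if ce_angle_deg <= 25.0:
        # Stated in the article: 20 to 25 degrees is borderline.
        return HipStatus.BORDERLINE
    # Assumption: anything above 25 degrees is treated as normal.
    return HipStatus.NORMAL


# Subset sizes reported in the article (number of radiographs).
SPLIT_SIZES = {"training": 816, "validation": 103, "test": 103}

if __name__ == "__main__":
    # The three subsets account for all 1,022 radiographs.
    print(sum(SPLIT_SIZES.values()))
    print(classify_ce_angle(18.0).value)
    print(classify_ce_angle(23.5).value)
```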

To evaluate the severity of dysplasia, the team applied the Crowe classification (four classes based on the degree of subluxation) and the Hartofilakidis classification (three classes based on the extent of deformation of the acetabulum in addition to the degree of subluxation).

Two deep learning models were developed and used to categorize all 103 test radiographs into:

  • Model 1: Normal, Borderline, and Crowe 1 to 4 categories
  • Model 2: Normal, Borderline, and Hartofilakidis 1 to 3 categories
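The two label sets can be represented as ordered lists, together with a helper that collapses any label into the dysplastic versus non-dysplastic grouping used later in the article. This is a hedged sketch: the constant names and the `to_binary` helper are hypothetical and are not taken from the study.

```python
# Output categories of each model, as listed in the article.
MODEL_1_LABELS = ["Normal", "Borderline", "Crowe 1", "Crowe 2", "Crowe 3", "Crowe 4"]
MODEL_2_LABELS = [
    "Normal",
    "Borderline",
    "Hartofilakidis 1",
    "Hartofilakidis 2",
    "Hartofilakidis 3",
]

# Normal and borderline hips form the non-dysplastic group.
NON_DYSPLASTIC = {"Normal", "Borderline"}
KNOWN_LABELS = set(MODEL_1_LABELS) | set(MODEL_2_LABELS)


def to_binary(label: str) -> str:
    """Collapse a model output label into 'dysplastic' or 'non-dysplastic'."""
    if label not in KNOWN_LABELS:
        raise ValueError(f"Unknown label: {label!r}")
    return "non-dysplastic" if label in NON_DYSPLASTIC else "dysplastic"


if __name__ == "__main__":
    for label in MODEL_1_LABELS:
        print(label, "->", to_binary(label))
```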

Model 1 Performance

Model 1 achieved 68% accuracy overall. It diagnosed normal hips with high accuracy but struggled to distinguish between normal and borderline hips, classifying 13 of 15 borderline hips as normal.

On the other hand, Model 1 achieved 92% accuracy in distinguishing between dysplastic hips (Crowe 1–4) and normal or borderline hips. Regarding the Crowe classification, most errors involved incorrect assignment to a neighboring class.
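The gap between overall accuracy and dysplastic versus non-dysplastic accuracy is easier to see with the three metrics written out. The sketch below assumes the six Model 1 categories are coded as ordered integers (0 = Normal, 1 = Borderline, 2 to 5 = Crowe 1 to 4), which is an illustrative encoding and not one described in the article. The function names are hypothetical, and the sample labels are invented for demonstration; they are not study data.

```python
from typing import Sequence

# Assumed encoding: 0 = Normal, 1 = Borderline, 2..5 = Crowe 1..4.
FIRST_DYSPLASTIC_INDEX = 2


def _check(y_true: Sequence[int], y_pred: Sequence[int]) -> None:
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("Label sequences must be non-empty and of equal length")


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of radiographs assigned to exactly the right category."""
    _check(y_true, y_pred)
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share assigned to the right side of the dysplastic / non-dysplastic divide."""
    _check(y_true, y_pred)
    hits = sum(
        (t >= FIRST_DYSPLASTIC_INDEX) == (p >= FIRST_DYSPLASTIC_INDEX)
        for t, p in zip(y_true, y_pred)
    )
    return hits / len(y_true)


def adjacent_error_share(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Among misclassified cases, the share that landed in a neighboring class."""
    _check(y_true, y_pred)
    errors = [abs(t - p) for t, p in zip(y_true, y_pred) if t != p]
    return sum(d == 1 for d in errors) / len(errors) if errors else 0.0


if __name__ == "__main__":
    # Invented toy labels, NOT study data: borderline hips are predicted as
    # normal and some Crowe classes slip to the neighboring class.
    y_true = [0, 0, 1, 1, 2, 3, 4, 5]
    y_pred = [0, 0, 0, 0, 2, 2, 4, 4]
    print(overall_accuracy(y_true, y_pred))
    print(binary_accuracy(y_true, y_pred))
    print(adjacent_error_share(y_true, y_pred))
```

In this toy case every error stays on the correct side of the dysplastic divide, so the binary accuracy is higher than the overall accuracy, the same pattern the article reports for Model 1.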

Model 2 Performance

Model 2 achieved 74% accuracy overall. Like Model 1, it diagnosed normal hips with high accuracy but struggled to distinguish between normal and borderline hips.

Model 2 was 83% accurate in distinguishing between dysplastic hips (Hartofilakidis 1–3) and normal or borderline hips. It also made errors in distinguishing between neighboring Hartofilakidis classes.

Future Applications

Timely diagnosis of hip dysplasia and subsequent conservative treatment can delay the development of osteoarthritis and the need for more aggressive treatment. These deep learning models could be used by professionals who lack experience in recognizing hip dysplasia, as well as for automated diagnosis and classification in large institutions.

68% to 74% overall accuracy of deep learning models in diagnosing hip dysplasia

83% to 92% accuracy of deep learning models in distinguishing between dysplastic hips and normal or borderline hips
