Using Chest X-Rays, Artificial Intelligence Can Predict Risk of Lung Cancer

In This Article

  • Lung cancer is the leading cause of cancer death in the U.S.
  • The guidelines used to determine who is eligible for lung cancer screening do not accurately capture who should get screened
  • A recently reported deep learning algorithm more accurately predicts which patients will be diagnosed with lung cancer, and thus can be used to flag high-risk patients for a discussion about screening with their physician

In September, researchers Michael T. Lu, MD, MPH, Vineet Raghu, PhD, and Udo Hoffmann, MD, MPH, of the Massachusetts General Hospital Cardiovascular Imaging Research Center (CIRC), and colleagues reported in Annals of Internal Medicine a deep learning approach that uses chest X-rays to identify high-risk smokers who should undergo lung cancer screening with chest CT.

The model was developed in 41,856 people, then tested in over 11,000 participants from the Prostate, Lung, Colorectal and Ovarian cancer screening trial (PLCO) and the National Lung Screening Trial (NLST).

Drs. Lu and Hoffmann recently discussed the motivations for the study, how a deep learning algorithm can better predict lung cancer in heavy smokers and where they hope to go from here.

Q: Why is the deep learning algorithm necessary?

A: Lung cancer is the leading cause of cancer death among men and women in the U.S. In fact, it causes more cancer deaths than the next three most fatal cancers combined. Lung cancer screening with chest CT can prevent lung cancer death. Despite these dire statistics, relatively few people go for screening: fewer than 5% of those who are eligible. This is much lower than breast and colorectal cancer screening rates, which are about 66%.

There are reasons heavy smokers might not get screened. Some smokers may be afraid to get the results—they do not want to know whether they might get cancer. Others might shy away from screening because of the stigma still attached to lung cancer and to smoking generally. In our recent paper, we focus on one particular reason: the guidelines currently used to determine who is eligible for screening can be too complex and miss many of the people who go on to develop lung cancer.

For the current guidelines set forth by Medicare, you need to know the patient's smoking history: how many packs a day they smoked, for how many years and when they quit. This is more information than a PCP will get from a standard visit, and it is typically not available in an electronic medical record. By way of comparison, determining eligibility for breast or colorectal cancer screening is relatively straightforward: if you know the patient's age and sex, you know whether they should get screened or not.
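To make the data requirement concrete, here is a minimal sketch of a smoking-history eligibility check. Pack-years are packs smoked per day multiplied by years smoked. The thresholds (age window, minimum pack-years, maximum years since quitting) are parameters, and the defaults shown are illustrative assumptions rather than a statement of current Medicare policy. The class, function and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SmokingHistory:
    age: int
    packs_per_day: float
    years_smoked: float
    years_since_quit: Optional[float]  # None means the patient still smokes


def pack_years(history: SmokingHistory) -> float:
    """Pack-years = packs smoked per day x years smoked."""
    return history.packs_per_day * history.years_smoked


def eligible_for_screening(
    history: SmokingHistory,
    min_age: int = 55,                 # assumed threshold, illustration only
    max_age: int = 77,                 # assumed threshold, illustration only
    min_pack_years: float = 30,        # assumed threshold, illustration only
    max_years_since_quit: float = 15,  # assumed threshold, illustration only
) -> bool:
    """Return True only if every smoking-history criterion is met."""
    in_age_window = min_age <= history.age <= max_age
    heavy_enough = pack_years(history) >= min_pack_years
    recent_enough = (
        history.years_since_quit is None
        or history.years_since_quit <= max_years_since_quit
    )
    return in_age_window and heavy_enough and recent_enough
```

Every input except age has to be collected from the patient, which is why a rule of this shape is hard to evaluate automatically from a typical medical record.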

Q: What did you learn about the efficacy of your approach, especially with respect to the currently used screening guidelines?

A: We developed a convolutional neural network to predict long-term incident lung cancer using data commonly found in the electronic medical record: a chest X-ray (radiograph) image, the patient's age and sex and whether the patient is currently a smoker. Our ultimate goal is to be able to use the algorithm to automatically identify patients at high risk so that an alert can be sent to their doctors to discuss screening.

In our study, we compared the efficacy of our deep learning approach with that of the current Medicare guidelines. Screening guidelines generally identify 'high risk' groups, groups of people most likely to be diagnosed with the disease based on one or more criteria, but most cancers actually appear outside of these groups. About half of lung cancers today occur in people who would not qualify for screening under the Medicare guidelines.

Our study demonstrated that the deep learning algorithm more accurately predicted which patients would later be diagnosed with lung cancer. Assuming we screened the same number of people with both approaches, the deep learning approach would "miss" 31% fewer people who would later get lung cancer than the current screening guidelines.
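The equal-screening comparison can be sketched as follows: screen as many people by model risk score as the guideline would screen, then count the cancers each approach misses. The cohort below is made-up toy data, not study data, and the function names are hypothetical. The numbers it produces illustrate the calculation only and say nothing about the reported 31% figure.

```python
from typing import List, Tuple

# Each person: (model_risk_score, guideline_eligible, developed_cancer)
Person = Tuple[float, bool, bool]


def missed_by_guideline(cohort: List[Person]) -> int:
    """Cancers among people the guideline would not screen."""
    return sum(1 for _, eligible, cancer in cohort if cancer and not eligible)


def missed_by_model(cohort: List[Person]) -> int:
    """Cancers missed when the model screens the same number of people.

    The model screens the k highest-risk people, where k is the number
    of guideline-eligible people, so both approaches use equal capacity.
    """
    k = sum(1 for _, eligible, _ in cohort if eligible)
    ranked = sorted(cohort, key=lambda p: p[0], reverse=True)
    unscreened = ranked[k:]
    return sum(1 for _, _, cancer in unscreened if cancer)


def relative_reduction(cohort: List[Person]) -> float:
    """Fraction fewer cancers missed by the model than by the guideline."""
    g = missed_by_guideline(cohort)
    m = missed_by_model(cohort)
    return (g - m) / g if g else 0.0


# Toy cohort, fabricated for illustration only.
toy = [
    (0.90, True, True),
    (0.80, False, True),
    (0.70, True, False),
    (0.60, False, True),
    (0.50, True, True),
    (0.40, False, False),
    (0.30, True, False),
    (0.20, False, True),
]
```

Fixing the number of people screened keeps the comparison fair: any difference in missed cancers then reflects who was selected, not how many.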

Another, equally important issue we addressed was health care delivery—that is, the delivery of screening to the people who should have it. As mentioned, approximately 5% of eligible individuals currently undergo lung cancer screening CT. Because chest X-rays are such a common test, we hope that this provides an avenue to flag high-risk people, so they can have a discussion with their physician about screening.

Q: What's the next step in your work with the algorithm?

A: Our future plans involve both clinical and scientific goals. Clinically, we would like to perform a small pilot study here at Mass General, followed by a multicenter prospective trial comparing our algorithm with the accepted guidelines, to see what percentage of patients decide to get screened with and without the algorithm-based intervention. Ultimately, we want to implement it clinically at the hospital (thus far, we have used it only in study populations) to get more people who need it into the screening pipeline, with the goal of earlier diagnosis and a cure.

Scientifically, we want to see how we can further improve the algorithm, either by adding more information or by incorporating repeated chest X-rays. For example, if a patient is recommended for screening today and has not developed cancer five years later, is it worth using another chest X-ray at that point to improve accuracy as the patient ages? These are the kinds of things we would like to learn.
