AI Lung Cancer Detection

When a close friend's grandfather died of lung cancer, my friend was devastated. The mass was caught too late, a common scenario because lung cancer often shows no symptoms until its late stages. I asked my dad, a doctor, about the diagnosis process and learned that screening eligibility is based only on a simple age and smoking pack-year cutoff; I thought there must be a more intelligent way to identify high-risk patients. After extensive research into correlations between patient variables and lung cancer, I built a Gradient Boosted Machine, a type of Artificial Intelligence algorithm, to better predict which patients are at high risk. Next, using multiple publicly available lung cancer CT scan datasets, I attempted to improve detection of cancerous lung nodules in CT scans with a Deep Convolutional Neural Network.

This research took many nights at my computer researching, coding, and debugging. I taught myself most of the algorithms I implemented, which was a difficult (but fun) process with my high-school level understanding of Calculus-based Statistics and Computer Science. Luckily, whenever I got seriously stuck, I could contact my mentor, Dr. Drew Clausen, who would help me work through possible bugs or problems. Still, the process was challenging: some obstacles seemed insurmountable, and I often had to rest for a few days and then approach the problem afresh with a plan to make progress.

Eventually, the project was complete. Using patient features such as age, sex, BMI, blood pressure, and prescriptions, the Gradient Boosted Machine better identified patients at high risk for lung cancer, who should be screened. Once a CT scan is conducted on a high-risk patient, the neural network detects malignant lung nodules by estimating nodule calcification, sphericity, and spiculation.

While difficult, the research process was very rewarding. Every bit of progress was satisfying, and even more gratifying was that the research was mostly independent. I gained the confidence and skills necessary to research, brainstorm, and implement solutions to challenging problems. These experiences have illuminated a field about which I am passionate: the intersection of technology and medicine. I also learned the value of creative logic. Complex, real-world problems lack formulas or instructions, and comprehending the nuances of each problem is often the most demanding and crucial part of solving it. This research earned 2nd place in the computational systems category of the California State Science Fair, as well as 1st place in the LA County and PV Science Fairs.


Figure: Area under the Receiver Operating Characteristic curve for Stage 1 (high-risk patient identification) and Stage 2 (malignant lung nodule detection)