Project Description
Our research group applies machine learning and deep learning models to build predictive analytics for real-world applications in health science. Last summer, we carried out preliminary data exploration and model synthesis on chest X-ray images to predict lung disease. In medical diagnostics, chest radiography is a powerful method for identifying lung diseases such as edema and pneumonia. However, manual examination of chest X-rays demands expert time, which is expensive and taxing on stakeholders. Promising developments in computer vision suggest that artificial intelligence (AI) applications can at least conduct preliminary screening of large numbers of X-rays quickly, to the benefit of patients, medical practitioners, and providers. Several groups are currently developing AI platforms to predict lung diseases from chest radiography. The Stanford machine learning group recently published a large chest X-ray dataset in which each image is labeled for multiple possible conditions. One of our goals is to optimize deep Convolutional Neural Networks (CNNs) based on ResNet (Residual Network) models, which have been shown to be accurate for image detection and segmentation, to make multi-class predictions on the chest X-ray images.
In the first step, the student will evaluate how ResNet models perform on this multi-class prediction task across sub-datasets of different sizes, in order to estimate the minimum number of images required to build the deep learning model. In the next stage, the student will explore the impact of label uncertainty. One of the challenges with this multi-class prediction is that some of the labels in the dataset are marked as uncertain. The student will first train the model without the uncertain labels and then with them included, to understand their impact on the predictions. Finally, the student will compare several ResNet models to improve efficiency. All of this work will be benchmarked on the GPU resources at the Rutgers Campus Cluster and the Pittsburgh Supercomputing Center (PSC). So far, we have built the software stack and tested a simple CNN model on the Bridges-2 cluster at PSC.
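As an illustration of the label-uncertainty step, the minimal Python sketch below shows one way the comparison could be set up. It rests on assumptions that are not stated in this description: uncertain labels are encoded as -1 (with 1 for positive and 0 for negative), and the helper names `apply_policy` and `masked_bce` and the policy names "ignore", "zeros", and "ones" are hypothetical choices made for this example. Under "ignore", uncertain entries are masked out of the loss, which corresponds to training without uncertainty; under "zeros" or "ones", they are mapped to a fixed label and included.

```python
import math

UNCERTAIN = -1  # assumed encoding of an uncertain label (not confirmed by the source)


def apply_policy(labels, policy):
    """Map one image's label vector to (targets, mask) under a given policy.

    labels: list of 1 (positive), 0 (negative), or UNCERTAIN, one per condition.
    policy: "ignore" masks uncertain entries out of the loss;
            "zeros" treats them as negative; "ones" treats them as positive.
    """
    fill = {"ignore": (0.0, 0.0), "zeros": (0.0, 1.0), "ones": (1.0, 1.0)}
    if policy not in fill:
        raise ValueError(f"unknown policy: {policy}")
    targets, mask = [], []
    for y in labels:
        if y == UNCERTAIN:
            t, m = fill[policy]
        else:
            t, m = float(y), 1.0
        targets.append(t)
        mask.append(m)
    return targets, mask


def masked_bce(probs, targets, mask, eps=1e-7):
    """Mean binary cross-entropy over the entries whose mask is 1."""
    total, count = 0.0, 0.0
    for p, t, m in zip(probs, targets, mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += m * -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
        count += m
    return total / count if count else 0.0


# Example: one image with three conditions, the third marked uncertain.
labels = [1, 0, UNCERTAIN]
probs = [0.5, 0.5, 0.9]  # hypothetical per-condition model outputs
for policy in ("ignore", "zeros", "ones"):
    targets, mask = apply_policy(labels, policy)
    print(policy, round(masked_bce(probs, targets, mask), 4))
```

Training the same model once per policy and comparing validation metrics would give a direct measure of how much the uncertain labels affect the predictions. In a real training loop, the same masking idea would be applied to the per-element loss of the deep learning framework in use.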