Big Data-Driven Approach for Lung Cancer Identification via Advanced Deep Transfer Learning Models

Authors

  • Rajiv Chalasani Sacred Heart University
  • Venkataswamy Naidu Gangineni University of Madras, Chennai
  • Sriram Pabbineedi University of Central Missouri
  • Mitra Penmetsa University of Illinois at Springfield
  • Jayakeshav Reddy Bhumireddy University of Houston
  • Mukund Sai Vikram Tyagadurgam University of Illinois at Springfield

DOI:

https://doi.org/10.47672/ejt.2730

Keywords:

Lung Cancer, Deep Learning (DL) Techniques, ResNet-50, LIDC-IDRI Dataset

Abstract

Purpose: To develop and evaluate a highly accurate computer-aided diagnosis (CAD) model based on ResNet-50 for the early identification of lung cancer-related pulmonary nodules using the publicly accessible LIDC-IDRI CT image dataset.

Materials and Methods: This study uses the LIDC-IDRI dataset, which comprises CT scans of pulmonary nodules, for lung cancer detection. The preprocessing pipeline converts all CT images to grayscale, resizes them to a consistent dimension, and applies data augmentation such as rotations and flips to improve the model's robustness. A refined ResNet-50 convolutional neural network extracts deep features and classifies nodules as benign or malignant. Two baseline models, a Feed Forward Back Propagation Neural Network and a Support Vector Machine (SVM), are used as comparisons to assess the efficacy of this strategy.
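The preprocessing steps named above (grayscale conversion, resizing, and rotation/flip augmentation) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' pipeline: the function names, the 224-pixel target size, the nearest-neighbour resize, and the specific set of rotations and flips are all assumptions, since the abstract does not give these details.

```python
import numpy as np


def to_grayscale(image: np.ndarray) -> np.ndarray:
    """Collapse an H x W x 3 image to H x W using BT.601 luma weights.

    Images that are already 2-D are returned as float32 unchanged.
    """
    if image.ndim == 2:
        return image.astype(np.float32)
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return image[..., :3].astype(np.float32) @ weights


def resize_nearest(image: np.ndarray, size: int) -> np.ndarray:
    """Resize a 2-D image to size x size by nearest-neighbour sampling."""
    height, width = image.shape
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows[:, None], cols]


def augment(image: np.ndarray) -> list:
    """Return the image, its 90/180/270 degree rotations, and both flips."""
    return [
        image,
        np.rot90(image, 1),
        np.rot90(image, 2),
        np.rot90(image, 3),
        np.fliplr(image),
        np.flipud(image),
    ]


def preprocess(image: np.ndarray, size: int = 224) -> list:
    """Grayscale, resize, then augment one image (size=224 is an assumption)."""
    return augment(resize_nearest(to_grayscale(image), size))


if __name__ == "__main__":
    # Synthetic stand-in for a CT slice; real LIDC-IDRI data is not loaded here.
    rng = np.random.default_rng(0)
    fake_slice = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)
    variants = preprocess(fake_slice)
    print(len(variants), variants[0].shape)
```

In practice augmentation is usually applied only to the training split, so that test images remain unmodified.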

Findings: The ResNet-50 model outperformed both baseline models, the SVM and the Feed Forward Back Propagation Neural Network, on every evaluation metric, achieving an accuracy of 99.38%, an F1-score of 99.37%, a precision of 99.91%, and a recall of 98.76%. These results indicate a high capacity to reliably detect pulmonary nodules from CT images.
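For reference, the four reported metrics are conventionally derived from a binary confusion matrix. The sketch below uses the standard definitions; the function names and the toy labels are illustrative assumptions and do not reproduce the study's data or results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 (0.0 when undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy labels only: 1 = malignant, 0 = benign (hypothetical encoding).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))
```

Here precision measures how many nodules flagged as malignant truly are, while recall measures how many malignant nodules were caught.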

Unique Contribution to Theory, Practice and Policy: Based on the findings, it is recommended that the ResNet-50–based CAD model be integrated into clinical radiology workflows to facilitate the early diagnosis of lung cancer. For broader applicability, further validation should be conducted using multi-center and prospective datasets to ensure the model’s generalizability. Additionally, incorporating real-time preprocessing and inference mechanisms within existing PACS (Picture Archiving and Communication System) platforms could streamline diagnostic processes and improve radiologist efficiency.

Published

2025-07-09

How to Cite

Chalasani, R., Gangineni, V. N., Pabbineedi, S., Penmetsa, M., Bhumireddy, J. R., & Tyagadurgam, M. S. V. (2025). Big Data-Driven Approach for Lung Cancer Identification via Advanced Deep Transfer Learning Models. European Journal of Technology, 9(1), 51–67. https://doi.org/10.47672/ejt.2730

Section

Articles