Framework based on Machine Learning for Lung Cancer Prognosis with Big Data-Driven
DOI: https://doi.org/10.47672/ejt.2787
Abstract
Purpose: Lung cancer is an often fatal disease characterized by unregulated cell proliferation in lung tissue. It can be identified on CT scans and X-rays, which reveal tumors or aberrant masses. Medical imaging is therefore essential to early diagnosis, which improves prognosis and helps determine an effective treatment approach.
Materials and Methods: The study presents a Big Data-driven machine learning framework for lung cancer prediction and prognosis, built on the IQ-OTH/NCCD dataset of 1,097 expert-labeled CT scan images divided into benign, malignant, and normal classes. The proposed system applies preprocessing steps such as image shuffling, lung contour cropping, and resizing to prepare the inputs for model training. Classification is performed with EfficientNet-B1, a deep learning model that achieves high accuracy and computational efficiency through compound scaling of network depth, width, and input resolution.
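The preprocessing steps named above (shuffling, cropping, resizing) can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the threshold-based bounding-box crop is a simple stand-in for lung contour cropping, the function names are hypothetical, and the 240x240 target is an assumption based on the default input resolution of EfficientNet-B1 rather than a value stated in the abstract.

```python
import numpy as np

def crop_to_foreground(image, threshold=0.1):
    """Crop to the bounding box of pixels above `threshold`.

    Assumption: a crude stand-in for lung contour cropping.
    """
    mask = image > threshold
    if not mask.any():
        return image  # nothing above threshold, leave the image unchanged
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def resize_nearest(image, size=(240, 240)):
    """Nearest-neighbour resize of a 2D image using index arithmetic only."""
    h, w = image.shape
    row_idx = np.arange(size[0]) * h // size[0]
    col_idx = np.arange(size[1]) * w // size[1]
    return image[row_idx][:, col_idx]

def preprocess(images, labels, size=(240, 240), seed=0):
    """Shuffle, crop, and resize a list of 2D grayscale images."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))  # image shuffling
    batch = np.stack(
        [resize_nearest(crop_to_foreground(images[i]), size) for i in order]
    )
    return batch, np.asarray(labels)[order]
```

In practice a library resize with interpolation and a proper lung segmentation step would likely replace the two helper functions; the sketch only shows how the three steps fit together before the images reach the classifier.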
Findings: The model was assessed on accuracy, precision, recall, and F1-score. It achieved 99.10% accuracy, 99.22% precision, 97.22% recall, and a 98.16% F1-score, a substantial improvement over baseline models such as SVM and CNN. The system provides efficient, scalable, automated detection of lung cancer, supporting smart healthcare through early detection and better patient outcomes.
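The four metrics reported above can be computed from a multi-class confusion matrix, as in the following minimal sketch. The abstract does not say how the per-class scores were averaged, so macro averaging is an assumption here, as is the integer encoding of the classes (0 = benign, 1 = malignant, 2 = normal); the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def macro_metrics(cm):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # per-class count of predictions
    actual = cm.sum(axis=1)     # per-class count of true samples
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "precision": precision.mean(),
        "recall": recall.mean(),
        "f1": f1.mean(),
    }
```

Under macro averaging, the F1-score is the mean of the per-class F1 values, so it generally does not equal the harmonic mean of the macro precision and macro recall.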
Unique Contribution to Theory, Practice and Policy: Future work should use a larger, multimodal dataset that incorporates clinical and genetic data. Explainable AI methods should be explored to enhance generalizability. Real-world testing in smart healthcare settings and across multiple institutions is crucial to developing a practical, AI-driven tool for early lung cancer detection.
License
Copyright (c) 2025 Raghuvaran Kendyala, Jagan Kurma, Jaya Vardhani Mamidala, Sunil Jacob Enokkaren, Avinash Attipalli, Varun Bitkuri

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution (CC-BY) 4.0 License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.