Construction of a decision tree model based on laboratory test results for screening lung cancer cells in pleural fluid
Abstract
Parameters from laboratory analysis have been studied for screening cancer cells in pleural fluid. However, these parameters have limitations in sensitivity, specificity, and appropriate cut-off values for routine laboratory use, and the standard cytology methods used for diagnosis have a long turnaround time. Machine learning is now widely used in data science to build disease prediction and screening tools. Therefore, our research team aimed to create a decision tree model based on laboratory analysis results for screening cancer in pleural fluid. Retrospective data from 357 samples, both cancer-positive and cancer-negative, were collected, and a decision tree model was developed using the J48 algorithm in the WEKA program. The parameters that differed significantly (p < 0.05) were protein, adenosine deaminase (ADA), carcinoembryonic antigen (CEA), the count of high fluorescence-body fluid cells (HFBF#), and the percentage of high fluorescence-body fluid cells (HFBF%). For decision tree modeling, 90% of the data were divided into a training dataset and a testing dataset, while the remaining 10% served as a blind dataset. The most effective model used the parameters CEA, ADA, protein, and HFBF%, achieving a sensitivity of 94.10% and a specificity of 72.60%. On the blind dataset, the model showed a sensitivity of 92.31% and a specificity of 82.60%, with positive and negative predictive values of 75.00% and 95.00%, respectively. This suggests that the model could be beneficial for screening cancer cells in pleural fluid.
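To show how the four blind-dataset figures relate to one another, the following Python sketch computes sensitivity, specificity, and the predictive values from confusion-matrix counts. The counts are not reported in the abstract and are an assumption: they are one set of whole numbers consistent with a blind dataset of about 36 samples (10% of 357) and with the reported percentages. The names `ConfusionCounts` and `screening_metrics` are hypothetical helpers introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # cancer-positive samples the model flags as positive
    fn: int  # cancer-positive samples the model misses
    tn: int  # cancer-negative samples the model flags as negative
    fp: int  # cancer-negative samples the model flags as positive


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the standard screening metrics as percentages."""
    return {
        "sensitivity": 100 * c.tp / (c.tp + c.fn),
        "specificity": 100 * c.tn / (c.tn + c.fp),
        "ppv": 100 * c.tp / (c.tp + c.fp),
        "npv": 100 * c.tn / (c.tn + c.fn),
    }


# ASSUMPTION: hypothetical counts inferred from the reported percentages,
# not taken from the article. They sum to 36 samples.
blind = ConfusionCounts(tp=12, fn=1, tn=19, fp=4)

metrics = screening_metrics(blind)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

Under these assumed counts, the sensitivity (12/13), positive predictive value (12/16), and negative predictive value (19/20) round to the reported 92.31%, 75.00%, and 95.00%. The specificity (19/23) rounds to 82.61%, which differs from the reported 82.60% only in the last digit.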
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.