Data Management and Completeness of Dental Data in the 43-File Standard Dataset for Longitudinal Tooth Loss Study Using R Program


Nutnicha Jirachaiprasit
Janpim Hintao
Wattana Pithpornchaiyakul

Abstract

Objective: This study aimed to 1) describe the data management methods and the results of data processing, and 2) analyze the completeness of dental data in the 43-file standard dataset for use in survival analysis of tooth loss after dental examination.


Materials and Methods: The sample comprised individuals covered by the Universal Health Coverage Scheme who received dental check-ups at healthcare units under the Ministry of Public Health in a southern province during 2019-2021. Dental examination data and dental service information were obtained from the dental health status files and the outpatient procedure files. These files were linked through the general data files, and the data were managed and analyzed descriptively using the R program.


Results: Data management comprised six steps: 1) importing the data, 2) handling missing values, 3) filtering the sample by inclusion and exclusion criteria, 4) screening data quality by checking data accuracy, 5) removing duplicate records using unique fields, and 6) linking files via hashed national ID numbers. In the dental health status files, 6,820 of the 33,553 imported records (20.33%) were complete for individuals who had undergone dental examinations. Of the 4,051,350 records imported from the outpatient procedure files, 4,177 (0.10%) had complete information for individuals who had undergone a dental examination and received either tooth extraction or other dental services, providing data for survival analysis of tooth loss after dental check-ups.
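
The six steps listed in the Results can be illustrated with a minimal sketch on toy data. The study itself used the R program; the Python below is only an illustration, not the authors' code. The field names (cid, date_serv, teeth, proc), the 0-32 plausibility range for tooth counts, the rule that a procedure must fall on or after the examination date, and the use of SHA-256 for hashing are all assumptions made for this example.

```python
import hashlib

def hash_id(national_id):
    """Hash an ID so files can be linked without exposing the raw number."""
    return hashlib.sha256(national_id.encode("utf-8")).hexdigest()

# 1) Import: toy rows standing in for a dental health status file (hypothetical fields).
dental = [
    {"cid": "1111", "date_serv": "2019-05-01", "teeth": "28"},
    {"cid": "1111", "date_serv": "2019-05-01", "teeth": "28"},  # duplicate record
    {"cid": "2222", "date_serv": "2020-02-10", "teeth": ""},    # missing value
    {"cid": "3333", "date_serv": "2018-12-31", "teeth": "30"},  # outside 2019-2021
    {"cid": "4444", "date_serv": "2021-07-15", "teeth": "40"},  # implausible count
    {"cid": "5555", "date_serv": "2021-03-03", "teeth": "25"},
]
# Toy rows standing in for an outpatient procedure file (hypothetical fields).
procedures = [
    {"cid": "1111", "date_serv": "2020-01-20", "proc": "extraction"},
    {"cid": "5555", "date_serv": "2021-01-01", "proc": "scaling"},  # before the exam
    {"cid": "9999", "date_serv": "2020-06-06", "proc": "filling"},  # no exam on file
]

def clean_dental(rows):
    """Apply steps 2-5 and return complete examination records with hashed IDs."""
    seen, kept = set(), []
    for row in rows:
        # 2) Missing values: treat empty strings as missing and drop the record.
        rec = {k: (v.strip() or None) for k, v in row.items()}
        if any(v is None for v in rec.values()):
            continue
        # 3) Inclusion criterion: examination within the study period (ISO dates).
        if not ("2019-01-01" <= rec["date_serv"] <= "2021-12-31"):
            continue
        # 4) Quality screening: tooth count must be a number from 0 to 32.
        if not rec["teeth"].isdigit() or int(rec["teeth"]) > 32:
            continue
        # 5) Deduplication on the fields assumed to identify a record uniquely.
        key = (rec["cid"], rec["date_serv"])
        if key in seen:
            continue
        seen.add(key)
        kept.append({"pid": hash_id(rec["cid"]),
                     "exam_date": rec["date_serv"],
                     "teeth": int(rec["teeth"])})
    return kept

def link(exams, procs):
    """6) Link examinations to later procedures via the hashed ID."""
    by_pid = {}
    for p in procs:
        by_pid.setdefault(hash_id(p["cid"]), []).append(p)
    linked = []
    for e in exams:
        for p in by_pid.get(e["pid"], []):
            if p["date_serv"] >= e["exam_date"]:
                linked.append({**e, "proc": p["proc"], "proc_date": p["date_serv"]})
    return linked

def completeness(complete, imported):
    """Percentage of imported records that are complete, to two decimals."""
    return round(100 * complete / imported, 2)

exams = clean_dental(dental)
linked = link(exams, procedures)
print(len(exams), len(linked), completeness(len(exams), len(dental)))
```

Applied to the counts reported above, completeness(6820, 33553) gives 20.33 and completeness(4177, 4051350) gives 0.1, matching the stated percentages.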


Conclusion: After all files were managed and linked, the sample group with complete dental data, accounting for 24.56%, was sufficient for long-term studies on tooth loss. The R program is an effective tool for managing electronic dental health records to create a dataset suitable for longitudinal studies on tooth loss.

Article Details

How to Cite
Jirachaiprasit N, Hintao J, Pithpornchaiyakul W. Data Management and Completeness of Dental Data in the 43-File Standard Dataset for Longitudinal Tooth Loss Study Using R Program. Khon Kaen Dent J [Internet]. 2025 Mar 27 [cited 2025 Apr 4];28(1):35-44. Available from: https://he01.tci-thaijo.org/index.php/KDJ/article/view/272327
Section: Articles
