The management of continuous variables for data analysis in medical and public health research
Keywords:
Data management, Data and variable, Category variables, Continuous variables
Abstract
How continuous variables are handled before analysis directly affects statistical conclusions and research results. Inappropriate grouping of continuous variables reduces the ability to interpret the meaning of the data. It also introduces error into the estimation of the statistical model's parameters, which lowers the model's predictive efficiency and increases the type I error. Therefore, if researchers do not have sufficient empirical evidence to support categorization, continuous variables should be analyzed in their continuous form. If the researcher wants to categorize continuous variables in order to describe them by group, or to test a hypothesis whose result will be applied in practice, the categorization needs to be planned when the research methodology is designed in the research protocol, and it should rest on theoretical or clinical justification. The likelihood of bias should be considered, and the results of the statistical analysis should be interpreted carefully. This article presents an example of how the management of continuous variables affects statistical values and the logistic model, the pros and cons of managing continuous variables as categorical variables, and conclusions and recommendations. It should help researchers select an appropriate approach to managing continuous variables for data analysis.
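The information lost by categorization can be illustrated with a small simulation. The sketch below is not taken from the article: it assumes a standard normal predictor, a linear relationship with a continuous outcome (a deliberate simplification of the logistic setting the article discusses), and a median split as the categorization rule. The function names, sample size, slope, and seed are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, slope=0.5, seed=1):
    """Compare the predictor-outcome correlation before and after a median split.

    Assumed data-generating model (illustrative only):
    x ~ Normal(0, 1) and y = slope * x + Normal(0, 1) noise.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [slope * xi + rng.gauss(0, 1) for xi in x]

    # Dichotomize the continuous predictor at its sample median.
    cut = statistics.median(x)
    x_binary = [1.0 if xi > cut else 0.0 for xi in x]

    return pearson(x, y), pearson(x_binary, y)


r_continuous, r_dichotomized = simulate()
ratio = r_dichotomized / r_continuous

print(f"correlation with continuous predictor:   {r_continuous:.3f}")
print(f"correlation with dichotomized predictor: {r_dichotomized:.3f}")
print(f"retained fraction of the correlation:    {ratio:.3f}")
```

Under these assumptions the dichotomized predictor retains only about 80% of the original correlation (the theoretical value for a median split of a normal variable is sqrt(2/pi)), so a larger sample is needed to detect the same association. Other cutpoints, skewed predictors, or a binary outcome would give different figures.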
License
Copyright (c) 2022 Journal of Public Health Naresuan University
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The published article is copyrighted by the Journal of Public Health and Health Sciences Research.
The statements in each article in this academic and research journal are the personal opinions of the respective authors and are not related to Naresuan University or other faculty members of the university. Each author is responsible for their own article.