Posted: May 23, 2012
Missing data is one of the most persistent and challenging problems in statistical analysis and machine learning, arising across domains as diverse as healthcare, the social sciences, and business analytics. The prevalence of incomplete datasets necessitates imputation techniques that can generate plausible values for missing observations. Although numerous imputation methods have been proposed in the literature, their comparative evaluation has traditionally focused on predictive accuracy metrics, often neglecting the equally important question of parameter validity. This research addresses that gap by developing a comprehensive evaluation framework that assesses both predictive accuracy and statistical validity across multiple imputation approaches.

The fundamental challenge in missing data imputation lies in the tension between generating values that preserve the statistical properties of the original data distribution and generating values that enable accurate predictions in downstream modeling tasks. Conventional evaluation paradigms have predominantly emphasized the latter, potentially favoring imputation methods that produce biased parameter estimates or distorted covariance structures. This oversight has profound implications for inferential statistics, where valid parameter estimates are essential for drawing meaningful conclusions from data.

Our research makes several distinctive contributions to the field of missing data analysis. First, we introduce a novel evaluation framework that systematically examines the impact of imputation techniques on both predictive performance and parameter validity across different missing data mechanisms and missingness proportions. Second, we investigate the often-overlooked relationship between imputation
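To make the dual criterion concrete, the sketch below is an illustrative example, not the paper's implementation: it contrasts two common imputers on simulated data, scoring each on the accuracy of the imputed values (RMSE) and on the bias of a downstream regression coefficient. The choice of imputers, the MCAR mechanism, and the 20% missingness rate are all assumptions made for illustration.

```python
# Minimal sketch of a two-criterion imputation evaluation:
# (a) predictive accuracy of imputed values, (b) validity of a downstream
# parameter estimate. Illustrative only; not the authors' framework.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, IterativeImputer

rng = np.random.default_rng(0)
n = 2000

# Simulate complete data with a known structure: y = 2*x1 + noise.
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(scale=0.8, size=n)
y = 2.0 * x1 + rng.normal(size=n)
X_full = np.column_stack([x1, x2, y])

# Introduce 20% missingness in x1 completely at random (MCAR).
X_miss = X_full.copy()
mask = rng.random(n) < 0.2
X_miss[mask, 0] = np.nan

true_beta = 2.0  # data-generating coefficient of x1 in the model for y

for name, imputer in [("mean", SimpleImputer(strategy="mean")),
                      ("iterative", IterativeImputer(random_state=0))]:
    X_imp = imputer.fit_transform(X_miss)

    # (a) Predictive accuracy: RMSE of imputed x1 against the true values.
    rmse = np.sqrt(np.mean((X_imp[mask, 0] - X_full[mask, 0]) ** 2))

    # (b) Parameter validity: bias of the OLS slope of y on (x1, x2)
    # relative to the known data-generating coefficient.
    A = np.column_stack([np.ones(n), X_imp[:, 0], X_imp[:, 1]])
    beta = np.linalg.lstsq(A, X_imp[:, 2], rcond=None)[0]
    print(f"{name:10s}  RMSE={rmse:.3f}  slope bias={beta[1] - true_beta:+.3f}")
```

A fuller study in this spirit would repeat the comparison across missing data mechanisms (MCAR, MAR, MNAR), missingness proportions, and a wider set of imputers, tracking both criteria rather than predictive accuracy alone.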