Recommended Solutions To Missing Data

Topic: Business Consulting
Published: October 12, 2008

Two methods for dealing with missing data have become available in mainstream statistical software in the last few years. Both are vast improvements over the traditional approaches described in Limitations to Common Approaches to Missing Data. This article outlines the two methods.

Both of the methods discussed here require that the missing data mechanism be ignorable, that is, that whether a value is missing is not related to the missing value itself (see Missing Data Mechanisms). If the mechanism is ignorable, the resulting estimates (i.e., regression parameters and standard errors) will be unbiased with no loss of power.

The first method is Multiple Imputation (MI). Just like the imputation methods discussed in Limitations to Common Approaches to Missing Data, Multiple Imputation fills in estimates for the missing data. However, to capture the uncertainty in those estimates, MI imputes the values multiple times. Because it uses an imputation method with error built in, the multiple estimates should be similar, but not identical. The result is multiple data sets with identical values for all of the non-missing values and slightly different values for the imputed values in each data set. The statistical analysis of interest, such as ANOVA or logistic regression, is performed separately on each data set, and the results are then combined. Because of the variation in the imputed values, there should also be variation in the parameter estimates, leading to appropriate estimates of standard errors and appropriate p-values.
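The impute, analyze, and combine steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm used by any particular package: one variable (y) has missing values, a second variable (x) is fully observed, imputations come from a linear regression of y on x with random error added, and the regression is refit on a bootstrap sample of the complete cases so that the imputation model itself varies between data sets. The analysis of interest here is simply the mean of y. The function names (fit_line, impute_once, pool, multiply_impute_mean) are hypothetical. The combining step uses Rubin's rules: the pooled estimate is the average of the separate estimates, and the pooled variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares intercept, slope, and residual SD of y on x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, (sse / (len(xs) - 2)) ** 0.5


def impute_once(xs, ys, rng):
    """Return a copy of ys with each None replaced by prediction plus noise."""
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    # Assumption: bootstrap the complete cases so the fitted line differs
    # from one imputed data set to the next.
    boot = [rng.choice(complete) for _ in complete]
    b0, b1, sd = fit_line([x for x, _ in boot], [y for _, y in boot])
    return [y if y is not None else b0 + b1 * x + rng.gauss(0, sd)
            for x, y in zip(xs, ys)]


def pool(estimates, variances):
    """Combine m estimates and their squared standard errors (Rubin's rules)."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    return q_bar, (within + (1 + 1 / m) * between) ** 0.5


def multiply_impute_mean(xs, ys, m=5, seed=0):
    """Impute m times, estimate the mean of y each time, then pool."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        filled = impute_once(xs, ys, rng)
        estimates.append(statistics.fmean(filled))
        variances.append(statistics.variance(filled) / len(filled))
    return pool(estimates, variances)


# Made-up illustration data: every fourth y value is set to missing.
data_rng = random.Random(1)
xs = [float(i) for i in range(40)]
ys = [2.0 + 0.5 * x + data_rng.gauss(0, 1) for x in xs]
ys_missing = [None if i % 4 == 0 else y for i, y in enumerate(ys)]

estimate, std_error = multiply_impute_mean(xs, ys_missing)
print(f"pooled mean of y: {estimate:.2f} (SE {std_error:.2f})")
```

The observed values are identical in every imputed data set; only the filled-in values differ, and that difference is what the between-imputation term in pool carries into the standard error.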

Multiple Imputation is available in SAS, S-Plus, and Solas. In SAS, PROC MI creates the multiple data sets, which can then be analyzed separately using standard statistical procedures. PROC MIANALYZE then combines the results from these separate analyses. Joe Schafer at Penn State has developed four S-Plus libraries for multiply imputing normal, categorical, mixed, and panel data. He has made the library for normal data available as a free stand-alone package called NORM. Multiple Imputation is also available in Solas, but its algorithms have been questioned as inappropriate, and we cannot recommend its use at this time.

The second method is to analyze the full, incomplete data set using maximum likelihood estimation. This method does not impute any data, but rather uses all data observed for each case to compute maximum likelihood estimates. The maximum likelihood estimate of a parameter is the value of the parameter that is most likely to have resulted in the observed data. When data are missing, we can factor the likelihood function. The likelihood is computed separately for those cases with complete data on some variables and those with complete data on all variables. These two likelihoods are then maximized together to find the estimates. Like multiple imputation, this method gives unbiased parameter estimates and standard errors. One advantage is that it does not require the careful selection of variables used to impute values that Multiple Imputation requires. It is, however, limited to linear models.
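To make the idea of factoring the likelihood concrete, here is a minimal sketch for one special case where the maximum likelihood estimates have a closed form. The assumptions are ours, not the article's: two variables that are jointly normal, x observed for every case, and y missing for some cases. The likelihood then splits into a piece for x, to which every case contributes, and a piece for y given x, to which only the complete cases contribute. Each piece is maximized on its own and the results are put back together. The function name factored_ml is hypothetical.

```python
import statistics


def factored_ml(xs, ys):
    """ML estimates for a bivariate normal when some ys are None.

    Assumes x is observed for every case (a monotone missing pattern).
    """
    n = len(xs)
    # Piece 1: the marginal distribution of x uses every case.
    mu_x = statistics.fmean(xs)
    var_x = sum((x - mu_x) ** 2 for x in xs) / n   # ML divisor is n

    # Piece 2: the regression of y on x uses only the complete cases.
    cc = [(x, y) for x, y in zip(xs, ys) if y is not None]
    cx = statistics.fmean([x for x, _ in cc])
    cy = statistics.fmean([y for _, y in cc])
    sxx = sum((x - cx) ** 2 for x, _ in cc)
    sxy = sum((x - cx) * (y - cy) for x, y in cc)
    slope = sxy / sxx
    intercept = cy - slope * cx
    resid_var = sum((y - intercept - slope * x) ** 2 for x, y in cc) / len(cc)

    # Recombine the two pieces into estimates for y.
    return {
        "mu_x": mu_x,
        "mu_y": intercept + slope * mu_x,
        "var_x": var_x,
        "var_y": resid_var + slope ** 2 * var_x,
        "cov_xy": slope * var_x,
    }


# Made-up illustration data: y is missing for the three largest x values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.0, 6.0, None, None, None]
est = factored_ml(xs, ys)
print(est["mu_y"])   # 7.0, versus a complete-case mean of 4.0
```

No values are imputed. The incomplete cases still contribute through x, which is why the estimated mean of y moves away from the complete-case mean when the cases with missing y have systematically different x values.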

Analysis of the full, incomplete data set using maximum likelihood estimation is available in AMOS. AMOS is a structural equation modeling package, but it can run multiple linear regression models. AMOS is easy to use and is now integrated into SPSS, but it will not produce residual plots, influence statistics, and other typical output from regression packages. The missing value analysis package in SPSS will do some very limited maximum likelihood estimates for means and correlations only.

References:
Schafer, J. Software for Multiple Imputation.
Hox, J.J. (1999). A Review of Current Software for Handling Missing Data. Kwantitatieve Methoden, 62, 123-138.
Allison, P. (2000). Multiple Imputation for Missing Data: A Cautionary Tale. Sociological Methods and Research, 28, 301-309.

About the Author

Copyright © 2008, Karen Grace-Martin

Karen Grace-Martin, founder of The Analysis Factor, has helped social science researchers practice statistics for 9 years, as a statistical consultant at Cornell University and in her own business. She knows the kinds of resources and support that researchers need to practice statistics confidently, accurately, and efficiently, no matter what their statistical background. To answer your questions, receive advice, and view a list of resources to help you learn and apply appropriate statistics to your data, visit www.analysisfactor.com.
