Is Predicted Data a Viable Alternative to Real Data?

It is costly to collect the household- and individual-level data that underlies official estimates of poverty and health. For this reason, developing countries often do not have the budget to update their estimates of poverty and health regularly, even though these estimates are most needed there. O...

Bibliographic Details
Main Author: Fujii, Tomoki
Other Authors: van der Weide, Roy
Format: eBook
Language: English
Published: Washington, D.C.: The World Bank, 2016
Series: World Bank E-Library Archive
Online Access: http://elibrary.worldbank.org/doi/book/10.1596/1813-9450-7841
Collection: World Bank E-Library Archive - Collection details see MPG.ReNa
LEADER 02029nmm a2200241 u 4500
001 EB002104989
003 EBX01000000000000001245079
005 00000000000000.0
007 cr|||||||||||||||||||||
008 221013 ||| eng
100 1 |a Fujii, Tomoki 
245 0 0 |a Is Predicted Data a Viable Alternative to Real Data?  |h Elektronische Ressource  |c Tomoki Fujii 
260 |a Washington, D.C  |b The World Bank  |c 2016 
300 |a 45 p 
700 1 |a van der Weide, Roy 
700 1 |a Fujii, Tomoki 
041 0 7 |a eng  |2 ISO 639-2 
989 |b WOBA  |a World Bank E-Library Archive 
490 0 |a World Bank E-Library Archive 
028 5 0 |a 10.1596/1813-9450-7841 
856 4 0 |u http://elibrary.worldbank.org/doi/book/10.1596/1813-9450-7841  |x Verlag  |3 Volltext 
082 0 |a 330 
520 |a It is costly to collect the household- and individual-level data that underlies official estimates of poverty and health. For this reason, developing countries often do not have the budget to update their estimates of poverty and health regularly, even though these estimates are most needed there. One way to reduce the financial burden is to substitute some of the real data with predicted data. An approach referred to as double sampling collects the expensive outcome variable for a sub-sample only while collecting the covariates used for prediction for the full sample. The objective of this study is to determine if this would indeed allow for realizing meaningful reductions in financial costs while preserving statistical precision. The study does this using analytical calculations that allow for considering a wide range of parameter values that are plausible in real applications. The benefits of using double sampling are found to be modest. There are circumstances for which the gains can be more substantial, but the study conjectures that these denote the exceptions rather than the rule. The recommendation is to rely on real data whenever there is a need for new data, and use the prediction estimator to leverage existing data.