Why survey data should be here to stay

Douglas Johnson 25 June 2025

©IDinsight

With the cancellation of USAID’s Demographic and Health Surveys (DHS) program, countries are scrambling to fill the gap in critical data left by this program. The DHS program funded standardized, rigorous household surveys in over 90 countries on a broad range of public health topics. Prior to its cancellation, DHS surveys were the primary source of data on hundreds of key indicators. 

In discussions we’ve had on how to respond to the cancellation of the DHS program, one question comes up over and over again: “Can we generate these indicators through other data sources?” In particular, we have heard many people propose using data from countries’ health management information systems (HMISs), routine health information systems (RHISs), civil registration and vital statistics (CRVS) programs, or other administrative data to estimate key indicators previously generated using DHS data. The urge to use data from existing administrative systems rather than conduct costly and tedious surveys is very understandable. A national household survey can cost several million dollars, while data from existing administrative systems is free. 

In most cases, admin data can’t replace survey data

Unfortunately, in most cases, the answer to this question is “no” – administrative data is not a reliable substitute for data from household surveys. Dozens of studies have assessed the reliability of administrative data for generating key health indicators in low- and middle-income countries (LMICs), and, in most cases, the verdict was not positive. A systematic review of studies on routine health information systems by Hoxha et al (2022) found that “in many LMICs, RHISs remain fragmented and disorganised, and concerns regarding the quality, accuracy, timeliness, completeness, and representativeness of RHIS data are widespread.”1 Similarly, a systematic review by Lundin et al (2022) of studies assessing the quality of newborn data from health information systems in LMICs found that a large share of studies showed high rates of incompleteness in the data and low internal consistency.2 A systematic review by Wetherill et al (2017) focusing on immunization data in particular found that immunization data are generally incomplete in low-income countries.3 A separate analysis by Dolan and MacNeil (2017) found that administrative data on immunization coverage were inflated by 26-30% compared to data from household surveys across several countries.4 Lastly, a systematic data quality assessment by Okwaraji et al (2024) focusing on data on low birthweight and preterm births found that 85% of LMICs had less than 90% completeness for low birthweight data, and many reported implausible year-on-year jumps.5

This is not for lack of trying. Funders and national governments have invested huge sums of money into improving these systems. These efforts appear to have improved the quality of these data, yet progress has been slow. In most countries, the day when these data can replace the need for household surveys is still a long way off.

There are several underlying reasons why administrative data are often unreliable. First, officials may have an incentive to mis- or under-report certain indicators. For example, in many countries, there is a huge spike in the number of babies with a recorded birthweight of exactly 2500 grams, just above the threshold for low birthweight (and thus the threshold below which the facility is required to administer additional interventions) (Blanc and Wardlaw, 2005).6 Second, even where officials have no incentive to misreport figures, they often have little incentive to take the job of reporting these data seriously, since it likely comes on top of many other reporting tasks (Wetherill et al, 2017). Third, administrative data must often be combined with census data to arrive at an indicator. For example, estimating mortality requires not just data on deaths but also data on the number of people of the relevant age group alive at the time period in question.
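The birthweight spike described above is the kind of pattern a simple check can surface. The sketch below is illustrative only: the function name, the 250-gram window, and the sample weights are assumptions made for this example, not a method taken from the studies cited. It computes the share of records near the cutoff that sit exactly on 2500 grams; with genuinely measured weights, that share should be small.

```python
def share_exactly_at(weights_g, target=2500, window=250):
    """Share of records within `window` grams of `target` that equal `target`.

    A very high share suggests rounding or deliberate reporting at the
    cutoff rather than measured weights. The window size is an
    illustrative assumption.
    """
    nearby = [w for w in weights_g if abs(w - target) <= window]
    if not nearby:
        raise ValueError("no records within the window around the target")
    at_target = sum(1 for w in nearby if w == target)
    return at_target / len(nearby)


# Hypothetical facility records (grams), made up for illustration.
recorded = [2400, 2450, 2500, 2500, 2500, 2500, 2550, 2600, 3000, 3200]
heaping_share = share_exactly_at(recorded)
print(f"Share of near-cutoff records at exactly 2500 g: {heaping_share:.0%}")
```

A check like this only flags a suspicious pattern. It cannot say whether the cause is rounding, imprecise scales, or an incentive to avoid recording a low birthweight.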

Skeptics may point out that survey data are hardly perfect. Response rates to household surveys are declining in many countries, and surveys rely on the honesty of the respondent.7 Yet, for the most part, data from rigorous household surveys in LMICs are reliable. Rigorous sampling methods ensure that sampled households are representative of all households.8 Despite recent declines in response rates in high-income countries, response rates to national household surveys in low-income countries are typically above 90% and don’t appear to be declining.9 Moreover, most indicators do not rely on sensitive questions. Even when they do, careful sequencing of questions, enumerator training, or even tricks like list randomization can help ensure reliable responses and data.
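For readers unfamiliar with list randomization (also known as a list experiment or item count technique), here is a minimal sketch of the basic estimator. Respondents are randomly split into two groups. The control group sees a list of non-sensitive statements, and the treatment group sees the same list plus the sensitive statement. Each respondent reports only how many statements apply to them, never which ones. The difference in mean counts between the groups estimates the share of people for whom the sensitive statement is true. The function name and the response data below are hypothetical.

```python
from statistics import mean


def list_experiment_estimate(control_counts, treatment_counts):
    """Difference-in-means estimate of the prevalence of the sensitive item.

    control_counts: item counts from respondents shown only the
        non-sensitive list.
    treatment_counts: item counts from respondents shown the same list
        plus the sensitive item.
    """
    return mean(treatment_counts) - mean(control_counts)


# Hypothetical responses, made up for illustration.
control = [1, 2, 2, 3]    # mean 2.0
treatment = [2, 2, 3, 3]  # mean 2.5
estimate = list_experiment_estimate(control, treatment)
print(f"Estimated prevalence of the sensitive item: {estimate:.0%}")
```

The estimate is only valid if assignment to the two lists is random, and it is noisier than a direct question, so the technique usually needs a larger sample.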

For cheaper, faster household surveys, invest in survey infrastructure

If we can’t replace household surveys with administrative data, how can we cut the costs of generating these data? In our opinion, funders and governments should jointly invest in a common survey infrastructure that allows countries to conduct household surveys cheaply and quickly. In particular, smart investments in the following areas would significantly reduce the cost of household surveys: 

  1. Modular standardized questionnaires: The DHS model questionnaires were the gold standard of household questionnaires. Each question was meticulously crafted and tested to ensure valid responses. Yet these questionnaires were designed to be adopted in full. A more modular, standardized questionnaire with different options to choose from, and metadata on how much time each option takes, would reduce the amount of effort required to develop questionnaires and help ensure indicators are somewhat comparable across countries.
  2. Technology for data collection: Mobile data collection tools like Open Data Kit, KoBoToolbox, and SurveyCTO have greatly reduced the cost of surveys. While these tools are fantastic, there is still room for improvement in the software used to collect data and manage survey operations. As an example, here at IDinsight, we recently created SurveyStream to better manage core aspects of survey operations, such as how enumerators are assigned to respondents. 
  3. Better sampling methods: The standard approach to sampling households for the DHS and most other high-quality household surveys is rigorous but tedious and costly. We believe that investments in geospatial sampling methods could yield a sampling approach that is just as rigorous but significantly less expensive. 
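To make the first item more concrete, the sketch below shows one way a modular questionnaire catalog with timing metadata might be represented. Everything in it is an assumption for illustration: the module names, the indicators, and the minutes per module are invented and do not come from the DHS questionnaires or any existing tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Module:
    """One questionnaire module and the metadata a survey planner needs."""
    name: str
    indicators: Tuple[str, ...]  # indicators this module can produce
    minutes: float               # assumed average time to administer


def select_modules(catalog: List[Module], wanted: List[str]) -> List[Module]:
    """Return every module that supplies at least one wanted indicator."""
    wanted_set = set(wanted)
    return [m for m in catalog if wanted_set & set(m.indicators)]


def interview_minutes(modules: List[Module]) -> float:
    """Total expected interview time for the chosen modules."""
    return sum(m.minutes for m in modules)


# Hypothetical catalog; names and durations are invented.
CATALOG = [
    Module("household_roster", ("household_size",), 5.0),
    Module("child_immunization", ("dtp3_coverage",), 8.0),
    Module("birth_history", ("under5_mortality", "low_birthweight"), 15.0),
]

chosen = select_modules(CATALOG, ["dtp3_coverage", "low_birthweight"])
print([m.name for m in chosen], interview_minutes(chosen), "minutes")
```

With timing attached to each module, a country team could see the interview-length cost of adding an indicator before committing to it.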

Even with these investments, countries may still be forced to reduce the frequency, duration, and sample size of their public health surveys. But, with these common infrastructure elements, we believe countries could still generate the same critical indicators once supplied by DHS surveys. 

  1. Hoxha, Klesta, et al. “Understanding the challenges associated with the use of data from routine health information systems in low-and middle-income countries: a systematic review.” Health Information Management Journal 51.3 (2022): 135-148.
  2. Lundin, Rebecca, et al. “Quality of routine health facility data used for newborn indicators in low-and middle-income countries: A systematic review.” Journal of Global Health 12 (2022): 04019.
  3. Wetherill, Olivia, Chung-won Lee, and Vance Dietz. “Root causes of poor immunisation data quality and proven interventions: a systematic literature review.” Annals of Infectious Disease and Epidemiology 2.1 (2017): 1.
  4. Dolan, Samantha B., and Adam MacNeil. “Comparison of inflation of third dose diphtheria tetanus pertussis (DTP3) administrative coverage to other vaccine antigens.” Vaccine 35.27 (2017): 3441-3445.
  5. Okwaraji, Yemisrach B., et al. “National routine data for low birthweight and preterm births: Systematic data quality assessment for United Nations member states (2000–2020).” BJOG: An International Journal of Obstetrics & Gynaecology 131.7 (2024): 917-928.
  6. Blanc, Ann K., and Tessa Wardlaw. “Monitoring low birth weight: an evaluation of international estimates and an updated estimation procedure.” Bulletin of the World Health Organization 83.3 (2005): 178-185d.
  7. Jabkowski, Piotr, and Piotr Cichocki. “Survey response rates in European comparative surveys: a 20-year decline irrespective of sampling frames or survey modes.” Quality & Quantity (2024): 1-21.
  8. Vaessen, Martin, Mamadou Thiam, and Thanh Lê. “Chapter XXII The demographic and health surveys.” United Nations Statistical Division, United Nations Department of Economic and Social Affairs 26 (2005).
  9. For example, Vaessen, Thiam, and Lê (2005) find that the average response rate for DHS surveys conducted between 1990 and 2000 was 97.5%. Response rates to more recent DHS surveys, such as the Malawi 2022, Nepal 2022, and Ethiopia 2019 surveys, all appear to be higher than 98%.