Project

Strengthening administrative data quality and use to improve lives in India’s aspirational districts

IDinsight’s DataDelta team is partnering with the Indian government to improve the quality of administrative data used for a landmark social policy.

 

 

IDinsight Field Manager Lead Syed Maqbool (second from right) during data collection in Andhra Pradesh, India ©IDinsight

Decision-maker’s challenge

In 2018, the Indian government launched a new program designed to address lagging socio-economic indicators in the country’s poorest districts. In contrast to previous social policy, the Aspirational Districts Programme (ADP), launched by the government policy think tank NITI Aayog, shifted the focus from inputs to outcomes such as maternal and child health indicators. This shift was designed to ensure that India’s economic growth translated into tangible improvements in the lives of people living in extreme poverty.

Since the program is based on district performance, its success depends on decision-makers using high-quality, reliable data to evaluate that performance. Historically, however, the administrative data collected by the districts has not been reliable. In 2018, NITI Aayog, the Gates Foundation, and IDinsight partnered to produce reliable data to support the program and help ensure its success.

In the first phase of our collaboration with NITI Aayog on the ADP, IDinsight conducted multiple rounds of household surveys to track socioeconomic outcomes at the district level, informing the ADP’s district-wide rankings. The focus was on five critical sectors: Health & Nutrition, Education, Agriculture & Water Resources, Financial Inclusion & Skill Development, and Infrastructure.

However, conducting household surveys at the scale and frequency the ADP requires (i.e., monthly, encompassing over 70 indicators across 112 districts in 27 states) was too costly and time-consuming, and not all administrators trusted the use of third-party data for the rankings. Given these challenges, the DataDelta team worked with NITI Aayog to find another solution to their data-use challenges. The new approach entailed regularly assessing and enhancing the quality of administrative data, with a focus on two key outcomes:

  1. Institutionalizing data quality and usage at the district level.
  2. Creating scalable solutions to support systems that verify large data sets, streamline data flows from various government departments, and improve efficiency at the district level.

Impact opportunity

The data quality improvement strategy has the potential to serve as a model for improving the reliability of administrative data across a range of government initiatives beyond the ADP.

Our approach

DataDelta developed tools and designed a verification strategy to ensure that the monthly data used for district rankings credibly reflects on-the-ground performance. This approach involved first understanding all potential sources of incorrect data entries. 

IDinsight verified over 30 ADP indicators across the five sectors. The process began by mapping data flows for each indicator from the unit level to the aggregated level submitted to the Champions of Change dashboard. Using a robust sampling strategy with multiple sampling frames, we checked a sample of this data at each record-keeping level, as well as at the unit level, where it was verified directly with the service recipients. This extensive effort involved hiring and managing 400+ surveyors across 38 districts in 8 states. Verification tools were tailored to capture the unique attributes of each verification unit.
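To make the idea of checking a sample at each record-keeping level concrete, here is a minimal sketch of how values for the same sampled record could be compared between adjacent levels to see where discrepancies enter the data flow. The level names (`recipient`, `register`, `digitized`), the record layout, and the `mismatch_rates` helper are all hypothetical and chosen for illustration; they are not taken from the DataDelta toolkits.

```python
from collections import defaultdict

# Hypothetical record-keeping levels, ordered from source to digitized system.
LEVELS = ["recipient", "register", "digitized"]

def mismatch_rates(records):
    """Share of sampled records whose value changes between adjacent levels."""
    counts = defaultdict(lambda: [0, 0])  # stage -> [mismatches, comparisons]
    for rec in records:
        for upstream, downstream in zip(LEVELS, LEVELS[1:]):
            # Only compare when the record was observed at both levels.
            if upstream in rec and downstream in rec:
                stage = f"{upstream}->{downstream}"
                counts[stage][1] += 1
                if rec[upstream] != rec[downstream]:
                    counts[stage][0] += 1
    return {stage: bad / total for stage, (bad, total) in counts.items()}

# Illustrative sample: 1 = service recorded as delivered, 0 = not delivered.
sample = [
    {"recipient": 1, "register": 1, "digitized": 1},
    {"recipient": 0, "register": 1, "digitized": 1},  # differs at register entry
    {"recipient": 1, "register": 1, "digitized": 0},  # differs at digitization
    {"recipient": 1, "register": 1},                  # not yet digitized
]
rates = mismatch_rates(sample)
```

A higher rate at one stage than another would suggest where supervision or training effort might be most useful, though a real analysis would also need to account for the sampling design.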

This led to the development of adaptable data quality toolkits that can be used by any sector to conduct regular, scientific, field-based, and desk-based checks while also supporting integration with existing supervision and verification structures.

The results

The findings highlighted critical points in the data flow where inaccuracies frequently occur, including during data entry in registers, during the digitization process, and throughout aggregations across registers or sub-district levels. They also identified indicators that are particularly susceptible to misreporting. Based on this analysis, DataDelta proposed short- and long-term strategies to address the issues, including:

  1. Revisiting indicator definitions to ensure clarity and reduce exclusions
  2. Conducting regular admin-led scientific sample-based back checks, and third-party-led sample-based back checks at the unit and facility level
  3. Improving the existing record-keeping system and developing robust record keeping and reporting practices
  4. Improving dissemination of guidelines and conducting regular trainings across levels to ensure clear understanding of process and indicators
  5. Reviewing the workload of officials, filling vacancies, and providing incentives to reduce burden and unintentional errors
  6. Strengthening the culture of data use and quality in districts

Outcomes 

Two significant outcomes of the project are tech-based solutions for measuring and improving administrative data quality:

  1. Admin-Led Verification Toolkit: Piloted in Sonbhadra, the tool assesses data quality by triangulating data recorded at key junctures of the data flow and reconciling the results in action-oriented data quality reports. This is particularly useful for monitoring data quality before the data is digitized into the system. The toolkit can be integrated into existing supervision visits conducted by various programs across departments.
  2. Analytics-Led Verification Tool: This tool provides data quality diagnostics through a series of basic and machine learning-based checks on already digitized data. These checks cover nearly all data points across all districts, offering coverage at a scale that in-person back checks cannot achieve. They also serve as a valuable layer in the overall verification strategy, helping to distinguish between data entry errors and intentional performance inflation.
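As an illustration of what a basic check on already digitized data might look like, the sketch below flags values that fall outside a plausible range and values that deviate sharply from a district's own reporting history, using a median-based (modified z-score) rule. The function names, the 3.5 threshold, and the sample figures are assumptions made for this example; the source does not describe the specific checks the tool runs.

```python
from statistics import median

def out_of_range(values, low, high):
    """Return indices of values outside the plausible range [low, high]."""
    return [i for i, v in enumerate(values) if not low <= v <= high]

def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])  # median absolute deviation
    if mad == 0:
        # At least half the values are identical: flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical monthly values (percent) reported by one district for one indicator.
monthly = [62, 64, 63, 65, 61, 98, 64]
suspicious = flag_outliers(monthly)        # the sixth month stands out
impossible = out_of_range([62, 104], 0, 100)
```

A flag from a check like this is a prompt for follow-up rather than proof of an error; a sudden jump could reflect a real change on the ground, a data entry slip, or inflated reporting.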

Together, these tools offer a comprehensive approach to understanding and addressing the overarching issues that impact data quality and the use of administrative data.