IDinsight is innovating to provide high-quality data to decision-makers efficiently and cost-effectively.
Survey of an Auxiliary Nurse Midwife (ANM) in Bahraich, Uttar Pradesh, India. ©Prabhat Sharma/IDinsight
I will never forget the day that I woke up to a flurry of missed calls and WhatsApp messages that all contained the same message: “it’s not working!”
It was the first day of data collection for our first COVID-19 phone survey. My teammates and I had worked diligently to design, prepare, and launch a mobile phone survey to understand the impacts of COVID-19 on rural households in India. Policymakers needed to make informed decisions quickly about how to manage the pandemic and asked IDinsight for support. Given this urgency, we had only twelve days to field a survey with a representative sample of over 5,000 households across eight states in India.
I frantically opened my laptop to the Google Sheet tracker and saw my defeat: “importrange internal error.” The Google Sheet, for which we had spent hours defining formulas to manage assignments, productivity, and data quality, had crashed under the influx of data we received. This meant that surveyors did not know who to survey, and as a team, we were unable to monitor data quality and survey progress. Luckily, as a temporary solution, we were able to fix the error by splitting our Google Sheet into smaller ones that could better process our data. Although we completed data collection in time, resolving the issue cost us considerable time and effort.
Thanks to IDinsight’s continued investment in data systems, I have not had this experience since.
Data on Demand (DoD) aims to accelerate the do-learn-improve cycle for policy changes and programmatic interventions by generating affordable, high-quality primary data more quickly. By facilitating access to data “on demand,” DoD hopes to transform the way the social sector innovates, learns, and improves. In this way, DoD has the potential to affect policies that could improve millions of lives.
One of our main activities on DoD is to conduct data collection at a large scale. The data collection process involves many moving pieces, and it is inefficient to start from square one for each survey. Furthermore, it requires a tremendous amount of effort to manage a remote surveyor workforce using purely manual systems, especially when management processes need to be informed by real-time data coming in from the field. To tackle these challenges, the DoD team has been collaborating with IDinsight’s Data Science, Engineering and Monitoring Systems (DSEM) team to design and create a platform called SurveyStream, which consists of automated systems that can be used to better manage all stages of the data collection process.
The chart below highlights the various tasks involved in managing primary data collection activities:
As you can imagine, executing these tasks on short timelines, at scale, and across different regions can be quite challenging, especially if each task is done manually.
Let’s take a deeper look into one of these task categories: assignments. The goal of this task is to ensure that surveyors know who to survey. While this may seem simple, there are a few considerations that complicate the task:
If all tasks are executed individually and manually, some surveyors may well not receive their assignments on time. For example, a surveyor may not get their assignment sheet, or may notice that they are visiting households too far away. A surveyor may also drop out, and if we forget to give that surveyor’s assignments to someone else, we miss out on surveying certain respondents.
When breaking down each of the other tasks involved in surveyor and data collection management, a parallel set of challenges emerges.
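To make the dropout case concrete, here is a minimal Python sketch of one way to hand a departed surveyor’s respondents to the remaining team. It is an illustration only, not how SurveyStream works: the function name `reassign_dropouts`, the household IDs, and the surveyor names are all hypothetical, and the rule used (give each orphaned respondent to the active surveyor with the lightest load) is just one simple choice.

```python
def reassign_dropouts(assignments, active_surveyors):
    """Return a copy of `assignments` in which every respondent held by an
    inactive (or missing) surveyor is moved to the least-loaded active one.

    `assignments` maps a respondent ID to a surveyor name, or to None when
    nobody has been assigned yet. All names here are hypothetical.
    """
    if not active_surveyors:
        raise ValueError("at least one active surveyor is required")

    # Current workload of each active surveyor.
    load = {surveyor: 0 for surveyor in active_surveyors}
    for surveyor in assignments.values():
        if surveyor in load:
            load[surveyor] += 1

    updated = dict(assignments)
    # Walk respondents in ID order so the result is deterministic.
    for respondent in sorted(assignments):
        if assignments[respondent] not in load:
            # Least-loaded active surveyor; ties are broken alphabetically.
            target = min(load, key=lambda s: (load[s], s))
            updated[respondent] = target
            load[target] += 1
    return updated


# Made-up example: "ravi" has dropped out and hh_004 was never assigned.
assignments = {
    "hh_001": "asha",
    "hh_002": "ravi",
    "hh_003": "ravi",
    "hh_004": None,
    "hh_005": "meena",
}
updated = reassign_dropouts(assignments, ["asha", "meena"])
unassigned = [hh for hh, s in updated.items() if s not in ("asha", "meena")]
```

After the call, `unassigned` is empty, which is the check a manual process tends to forget: every respondent still has an active surveyor.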
To improve our ability to efficiently execute multiple data collection exercises while also innovating to make data collection faster, cheaper, and of higher quality, we have been collaborating with the DSEM team for the past two years to create automated systems, which form a platform we call SurveyStream. These systems are meant to standardize and automate the surveyor management and data collection tasks described above. Systems can be designed to fit each project’s needs: they can be as simple as a Google Sheet or as complex as a customized web application.1 On the DoD team, we have found that system requirements depend on the scope of the data collection exercise. For smaller pilots, we have successfully managed processes using Google Sheets, Google Data Studio, and Google Forms, but for large-scale surveys, we have needed to invest in more technologically advanced systems to handle each task and manage large volumes of data.
The table below summarizes a select few of the SurveyStream systems our teams have created to improve the efficiency and ease of primary data collection tasks:
All raw survey data fed into the above systems is extracted from the survey platform we use, SurveyCTO, using an API.2 3 Our data is stored on a secure cloud-based server, which is more secure than project teammates keeping files locally on their computers. These various systems are integrated through a shared database and data pipeline back end, called SurveyStream. In the future, we hope to integrate different SurveyStream features, like the data quality and assignment systems, into the web application so that they can be configured and monitored on a more user-friendly platform, rather than through more rudimentary interfaces like Google Sheets.
Let’s return to the example above on the tasks involved in sharing assignments with surveyors such that they know who to survey. With the Productivity Tracker, Web App, and Email Assignments systems described above, many of the challenges are resolved.
We have found that the systems built for assignments reduce the time spent on assignment-related tasks by ~50%. As a result, IDinsight teammates can direct their time towards other aspects of the project or other innovation-related work.
The beauty of each automated solution lies in its replicability, flexibility, accuracy, security, and scalability.
IDinsight is continuing to support the work of the DoD and DSEM teams in building and improving SurveyStream. Now that the key datasets and features used for survey management are being stored and managed through the core SurveyStream platform, our opportunities for building more sophisticated features and doing more advanced analysis of survey data will grow. We are currently exploring building features for optimizing assignments based on GPS locations, tools to visualize sampling frames, estimating population sizes from satellite imagery, ranking surveyors in terms of suitability for a survey, predicting poor data quality, photo analysis, and more!
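As a rough illustration of what assigning by GPS location could involve, the Python sketch below picks the closest surveyor for a household using the haversine great-circle distance. This is a simple greedy heuristic written for this post, not the feature the team is building: the function names, surveyor labels, and coordinates are all made up, and a real system would also need to balance workloads and account for travel routes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points
    given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_surveyor(household, surveyors):
    """Return the name of the surveyor whose base location is closest to
    `household`, a (lat, lon) tuple. `surveyors` maps name -> (lat, lon)."""
    return min(
        surveyors,
        key=lambda name: haversine_km(*household, *surveyors[name]),
    )


# Hypothetical locations, for illustration only.
surveyors = {
    "surveyor_a": (27.57, 81.60),
    "surveyor_b": (26.85, 80.95),
}
household = (27.60, 81.62)
nearest = nearest_surveyor(household, surveyors)
```

Here `nearest` is `"surveyor_a"`, whose base is only a few kilometres from the household, while `"surveyor_b"` is roughly a hundred kilometres away.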
The efficiency benefits gained from each system are substantial – surveys can be run and managed more rapidly and with fewer errors, freeing up time for teammates to work on other productive activities, and reducing overall survey costs. We strongly believe that a continued investment in designing and developing these systems will enable the DoD team to take on more projects in which high-quality data is collected at a higher frequency and lower cost.