
How to ensure productivity and data quality for a phone survey at scale

Part two in our series of lessons learned, based on a 6,000-person survey in India and a 600-person survey in Kenya.

During COVID-19, rapid and accurate data collection on economic and physical health is vital to ensuring the best policy response. In our last post, we described the hiring and training processes we introduced to run a ~6,000-person phone survey in India and a ~600-person phone survey in Kenya. In this post, we share our experience with the daily management of data collection: keeping the survey on pace, encouraging high levels of data quality, managing feedback loops effectively, and communicating with surveyors. Many of these practices are also useful for large-scale in-person surveys that are managed remotely.

Productivity & Data Quality

Since our survey ran on a short timeline, we needed to prepare productivity and data quality measures ahead of time so that any issues they surfaced could be addressed immediately. We created a dashboard to receive live updates on productivity, high-frequency checks, and audio audit scores at both the enumerator and district level.

Productivity

We used a SurveyCTO dataset to connect to our trackers (see more in our post on reaching respondents), which told surveyors when to call particular respondents based on previous attempts. In these trackers, we calculated surveyor-level statistics on the number of surveys that were completed, half-completed, not reached, and refused. We compiled these productivity numbers so that District Coordinators could compare district-wide performance as well as the performance of individual surveyors. If a district seemed to be underperforming, we could try to uncover why and take steps to boost productivity. For example, after seeing a specific district underperform, we learned that one of its surveyors had become ill. We were able to reassign a high-performing surveyor from another district in the same language group to help complete surveys for the underperforming district.
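
To make this concrete, here is a minimal sketch of the kind of surveyor-level tally we kept. The column names and outcome codes are illustrative assumptions (our actual trackers lived in SurveyCTO server datasets and Google Sheets, not Python), but the aggregation logic is the same.

```python
import pandas as pd

# Hypothetical tracker export: one row per call attempt, with the surveyor,
# their district, and the outcome recorded for that attempt.
tracker = pd.DataFrame({
    "surveyor_id": ["S01", "S01", "S02", "S02", "S03"],
    "district":    ["A",   "A",   "A",   "B",   "B"],
    "outcome":     ["completed", "not_reached", "completed",
                    "refused", "half_completed"],
})

# Surveyor-level counts of each outcome, grouped under their district so
# coordinators can compare individuals against the district as a whole.
by_surveyor = (
    tracker.groupby(["district", "surveyor_id", "outcome"])
           .size()
           .unstack(fill_value=0)
)

# District-wide totals of the same outcomes.
by_district = tracker.groupby(["district", "outcome"]).size().unstack(fill_value=0)

print(by_surveyor)
print(by_district)
```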

High-Frequency Checks

High-frequency checks allow us to look for suspicious patterns in the data and ensure quality. After looking through the questionnaire, we noted a few places where the data we collected could be inaccurate and flagged them. These included flags for outliers, where responses are possible but rare (for example, a respondent stating their house has 12 rooms); logic inconsistencies (for example, a respondent claiming they did not have a bank account, but later stating they had received direct benefit transfers through a bank); and counts of the number of times a surveyor had marked that a respondent answered “don’t know” or “refuse to respond.” We created a server dataset in SurveyCTO that output the variables we were interested in to a Google Sheet, where we tallied the number of times each surveyor had entered a flagged value. We presented this information to District Coordinators in a dashboard so that they could see which questions received the most flags overall and which surveyors had the most flags at the question level.
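
As an illustration, the sketch below implements the three kinds of flags described above on a toy extract. The column names and codes (num_rooms, has_bank_acct, dbt_via_bank, dont_know, refuse) are hypothetical stand-ins; in practice we built these checks with a SurveyCTO server dataset and a Google Sheet rather than Python.

```python
import pandas as pd

# Hypothetical survey export; column names and codes are illustrative only.
df = pd.DataFrame({
    "surveyor_id":   ["S01", "S02", "S02", "S03"],
    "num_rooms":     [3, 12, 2, 4],
    "has_bank_acct": ["no", "yes", "no", "yes"],
    "dbt_via_bank":  ["yes", "yes", "no", "no"],
    "q_income":      ["50000", "dont_know", "refuse", "20000"],
})

flags = pd.DataFrame(index=df.index)
# Outlier flag: a value that is possible but rare (e.g. a 12-room house).
flags["outlier_rooms"] = df["num_rooms"] > 10
# Logic inconsistency: no bank account, yet transfers received through a bank.
flags["bank_inconsistency"] = (df["has_bank_acct"] == "no") & (df["dbt_via_bank"] == "yes")
# Don't-know / refusal count on a key question.
flags["dk_refuse"] = df["q_income"].isin(["dont_know", "refuse"])

# Flag totals per surveyor, mirroring the tallies in our Google Sheet.
per_surveyor = flags.groupby(df["surveyor_id"]).sum()
print(per_surveyor)
```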

Audio Audits

We asked respondents for consent to record phone calls (which we recorded by linking SurveyCTO to the API of a call-recording service). After the first set of phone surveys was submitted, we set up a Google Sheet that was automatically populated with links to the audio recordings of the surveys. Our monitors were tasked with listening to these recordings while filling out an audio audit form. In this form, we copied a random assortment of questions from the main form, and the monitor answered them as if they were the surveyor filling out the initial survey. This allowed us to calculate mismatches between what respondents said and what the surveyors entered in the form. We also asked the monitor some qualitative questions to rate the surveyor’s speed, adherence to protocols, level of engagement, etc. Since we did not want to call respondents back to ask essentially the same survey again, this was the main method by which we monitored data quality. In the survey we recently completed, monitors listened to 40 per cent of the consented recordings concurrently with data collection.
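
The mismatch calculation itself is simple; the sketch below shows one way it could be done, assuming the surveyor’s answers and the monitor’s audit-form answers are reshaped into long tables keyed by submission and question. All names here are hypothetical, not the forms we actually used.

```python
import pandas as pd

# Hypothetical long-format tables: one row per audited question.
surveyor = pd.DataFrame({
    "submission_id": ["a1", "a1", "a2", "a2"],
    "question":      ["q_rooms", "q_bank", "q_rooms", "q_bank"],
    "answer":        ["3", "yes", "2", "no"],
})
monitor = pd.DataFrame({
    "submission_id": ["a1", "a1", "a2", "a2"],
    "question":      ["q_rooms", "q_bank", "q_rooms", "q_bank"],
    "answer":        ["3", "no", "2", "no"],
})

# Join on the shared questions and mark disagreements.
merged = surveyor.merge(monitor, on=["submission_id", "question"],
                        suffixes=("_surveyor", "_monitor"))
merged["mismatch"] = merged["answer_surveyor"] != merged["answer_monitor"]

# Share of audited questions that disagree, per submitted survey.
mismatch_rate = merged.groupby("submission_id")["mismatch"].mean()
print(mismatch_rate)
```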

Dashboard

In India, we collected all of the productivity and data quality indicators in a dashboard for our team. With this dashboard, we were able to monitor district performance and data quality at a high level. Our State and District Coordinators also had access to the dashboard and referenced it during debriefs. Having a regularly updated dashboard allowed us to take immediate action if we found that productivity or quality was not up to standard, which was vital given the short timeframe of the survey.
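
A minimal sketch of how the pieces could be joined for such a dashboard, assuming per-surveyor summary tables like the ones sketched in the earlier sections (all column names are illustrative assumptions):

```python
import pandas as pd

# Hypothetical per-surveyor summaries from the productivity, high-frequency
# check, and audio-audit steps.
productivity = pd.DataFrame({"surveyor_id": ["S01", "S02"], "district": ["A", "A"],
                             "completed": [25, 18]})
hfc_flags = pd.DataFrame({"surveyor_id": ["S01", "S02"], "total_flags": [2, 7]})
audit_scores = pd.DataFrame({"surveyor_id": ["S01", "S02"], "mismatch_rate": [0.05, 0.12]})

# One merged table per surveyor, then a district roll-up for the dashboard view.
dashboard = (productivity.merge(hfc_flags, on="surveyor_id")
                         .merge(audit_scores, on="surveyor_id"))
district_view = dashboard.groupby("district").agg(
    completed=("completed", "sum"),
    total_flags=("total_flags", "sum"),
    mismatch_rate=("mismatch_rate", "mean"),
)
print(dashboard)
print(district_view)
```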

Snapshot of the Data on Demand (DoD) dashboard used to track data quality in real time; geographic identifiers anonymized for confidentiality
Lessons Learned & Tips
  1. We have found that there is an important line to be drawn between over-including and under-including quality checks. It is very easy to think that every question is worthy of a don’t know/refusal count, a high-frequency check, or an audio audit check. If there are too many checks, it becomes difficult to prioritize which are most important. While looking for questions to flag, think about which questions deal with the most important indicators, which might be most difficult for a surveyor or respondent to understand, which might be most tedious for a surveyor to ask, and which lead to skip patterns that could save a surveyor a lot of time if they enter “don’t know,” “refuse to answer,” or “no.”
  2. Similarly, having too many statistics and tables on a dashboard is overwhelming. For our next survey, we hope to incorporate feedback from our District Coordinators who use the dashboard to ensure that only the most useful metrics are emphasized.
  3. In the first round of our survey with the Data on Demand team in India, we received feedback that it was difficult for District Coordinators to manage their trackers, the dashboard, and productivity sheets. In our second round, we combined all of these sheets into one dashboard. In this way, District Coordinators could manage productivity, data quality, and daily management in one place.