
How behavioral “nudges” can improve COVID-19 self-reporting

Lessons from a low-cost digital RCT in South Africa.


Randomized controlled trials (RCTs) are known as the gold standard in research. Yet conventional RCTs often come with large budgets and extended timelines. In the context of the COVID-19 pandemic, where data is needed quickly to inform decisions, innovative methods can allow for faster, cheaper, and still well-powered RCTs. Through our work with Praekelt.org, the National Department of Health (NDOH), and HigherHealth in South Africa, we found that conducting an RCT – in terms of both implementation and data collection – entirely through a widely used digital platform was feasible and led to meaningful results. The RCT relied on back-end data from HealthCheck, a self-assessment tool for COVID-19 risk. It had a sample size of nearly 20,000 users and was completed within one month. Our results have implications for behavioral message nudges that aim to increase honest reporting of COVID-19 symptoms.

Background

Following the outbreak of the first cases of COVID-19, the South African government responded swiftly to mitigate the spread of the disease through a set of public health measures and restrictions. In addition, the NDOH, in partnership with Praekelt.org, launched HealthCheck, a tool that helps users assess their COVID-19 risk via WhatsApp or USSD. The platform takes users through a COVID symptom checker and provides appropriate health behavior recommendations in return. HealthCheck was adopted by HigherHealth, the national agency in South Africa that supports the health and wellbeing of universities and colleges across the country. HigherHealth mandated that university-goers (students, staff, and lecturers) complete the COVID-risk self-assessment via HealthCheck on a daily or near-daily basis and produce a “low-risk” result to gain entry to college campuses across South Africa.

We hypothesized that, over time, responses on HealthCheck were likely to become less truthful on average, with a higher proportion of “low-risk” users relative to the true prevalence of low-risk individuals in the population. We thought this could happen for two primary reasons: 1) HealthCheck users who need to visit university campuses may downplay symptoms to achieve a “low-risk” result (and thus access to campus); and 2) over time, filling out the symptom tracker becomes more mechanical and less deliberate, so users are likely to fill out the same set of responses automatically each time, even if their symptoms have changed.

Study Overview

IDinsight designed several behaviorally informed messages to increase users’ honest assessment of their COVID-19 symptoms and conducted an RCT from March 23, 2022 to April 24, 2022 among 19,689 unique HealthCheck users to evaluate their impact. Specifically, we tested whether adding an honesty prompt before users begin the symptom check, framed in one of three ways (appealing to a user’s sense of pro-sociality, making the consequences of dishonesty salient, or appealing to a user’s sense of morality1), could lead users to report symptoms more truthfully and to avoid visiting public university spaces when symptomatic, as compared to the status-quo checker. Users were randomized into one of three treatment arms or the status-quo control arm, where a standard question asking whether responses were accurate was placed after the symptom questions. Our research questions were therefore: What is the effect of asking individuals to pre-commit to truthful reporting of COVID symptoms? And does framing the honesty pre-commitment message as a pro-social appeal, a moral appeal, or a reminder of consequences have a differential effect on truthful reporting of symptoms compared to the status-quo message’s neutral framing?
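For intuition, user-level assignment across four arms can be sketched in a few lines. This is a hypothetical illustration (made-up arm labels, a fixed seed, and an assumed equal 1/4 split), not the platform’s actual assignment code:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # fixed seed so the draw is reproducible

# Hypothetical: one row per unique HealthCheck user.
users = pd.DataFrame({"user_id": range(19_689)})
arms = ["control", "T1_prosocial", "T2_salience", "T3_moral"]

# Assumption: equal assignment probabilities across the four arms.
users["arm"] = rng.choice(arms, size=len(users))

print(users["arm"].value_counts())
```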

Measuring “honesty” using back-end data

The main goal of this study was to understand the effects of different types of messages on improving the honest reporting of COVID-19 symptoms, which in turn should lead more potentially contagious people to rightfully avoid university campuses and decrease the spread of COVID-19. Since this was a digital RCT with back-end data only, we had to construct primary outcomes that capture the consequences of “honest” reporting, while making some reasonable assumptions.

University-goers who were at risk of having COVID-19 could honestly reflect on their COVID-19 risk status in one of two ways: (1) by recognizing they were at risk of COVID-19, not fully completing the HealthCheck symptom tracker, and avoiding campus of their own accord; or (2) by recognizing they were at risk of COVID-19, fully completing the HealthCheck symptom tracker, and avoiding campus based on the “high” or “moderate” risk determination. We define two primary outcomes to analyze the effects of our treatment arms on honest symptom reporting under both scenarios.

The first primary outcome variable was “Percent of HealthChecks Completed.” We hypothesized that a lower proportion of completed HealthChecks among users in any of the treatment arms would indicate that users who received treatment messages knew they would receive a “high” or “moderate” risk status and therefore avoided campus on their own by simply not filling out a complete symptom check. Indeed, if users did not complete a HealthCheck symptom tracker, they would not be allowed on campus, since they would not have a “low-risk” status (or any status) to share. The “Percent of HealthChecks Completed” outcome was constructed via a binary variable (complete or incomplete), with the HealthCheck as the unit of analysis. We also checked whether users filled out multiple checks (e.g. obtaining a “high” or “moderate” risk result, re-doing the check with different answers, and arriving at “low-risk”). We analyzed 115,498 initiated HealthChecks in total.
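As a rough illustration of how such a completion flag can be built from back-end logs, consider the following sketch. The column names and values are hypothetical, not HealthCheck’s actual schema:

```python
import pandas as pd

# Hypothetical back-end extract: one row per initiated HealthCheck.
checks = pd.DataFrame({
    "user_id": [101, 101, 102, 103, 103],
    "started_at": pd.to_datetime([
        "2022-03-23 07:40", "2022-03-23 07:55", "2022-03-23 08:10",
        "2022-03-24 07:30", "2022-03-25 07:32",
    ]),
    # A risk status is assigned only when every symptom question is answered.
    "risk_status": ["high", "low", None, "moderate", "low"],
})

# Primary outcome 1: binary completion flag, with the HealthCheck
# as the unit of analysis.
checks["completed"] = checks["risk_status"].notna().astype(int)

# Flag same-day repeat checks (e.g. "high" then "low"), which we can
# inspect separately for answer-switching.
checks["check_date"] = checks["started_at"].dt.date
checks["n_same_day"] = (
    checks.groupby(["user_id", "check_date"])["started_at"].transform("count")
)
```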

Our second primary outcome variable was “Days avoided campus per week.” We proxied that a user “avoided campus” on each day a symptom check was initiated if they either did not complete the HealthCheck symptom tracker, or if they completed it and the risk status assigned to that HealthCheck instance was “moderate” or “high.” Constructing the outcome this way allowed us to use two pieces of information to proxy for appropriately avoiding campus when symptomatic. Because we hypothesized that some university-goers at risk of COVID-19 might downplay their symptoms while filling out the symptom tracker in the status-quo (control) group, any significant increase in the number of days per week users avoided campus signifies an increase in honest self-assessment of symptoms.
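Continuing the hypothetical sketch above, this proxy combines the two signals (an incomplete check, or a “moderate”/“high” result) and aggregates them to days per week. The study’s actual aggregation rules may differ in detail:

```python
# A day counts as "avoided" if the user left the check incomplete or
# completed it with a "moderate" or "high" result.
checks["avoided"] = (
    checks["risk_status"].isna()
    | checks["risk_status"].isin(["moderate", "high"])
).astype(int)

# Collapse to user-day level; here we let the user's final check of the
# day determine the flag (one possible reading of the rule).
user_day = (
    checks.sort_values("started_at")
    .groupby(["user_id", "check_date"])["avoided"]
    .last()
    .reset_index()
)

# Aggregate to user-week level: days avoided per week, primary outcome 2.
user_day["week"] = pd.to_datetime(user_day["check_date"]).dt.isocalendar().week
days_per_week = user_day.groupby(["user_id", "week"])["avoided"].sum()
```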

Key Findings

1. Being in any treatment group significantly decreased the proportion of completed HealthChecks (p-value<0.01 for all three treatments). The largest decrease was seen among users in T3 (moral appeal), who completed 2.1 percentage points (pp) fewer HealthChecks relative to the control group. We hypothesize that the lower completion rate arose because users knew that if they completed the HealthCheck they would get a moderate- or high-risk result, barring them from campus entry. Thus, we can assume that by not completing the check (required to access campus), they decided to avoid campus on their own.

2. Being in any treatment group also increased the number of days per week users avoided campus. On average, users in the control group avoided visiting campus roughly 0.28 days per week, which translates to roughly 3 days avoided per semester.

Users in treatment 1 (T1 – pro-social appeal) avoided2 campus an additional 0.03 days per week (p-value<0.01). Among 1000 users, this would add up to an additional 360 person-days per semester that campus is avoided.

Users in treatment 2 (T2 – salience of consequences) avoided campus an additional 0.02 days per week (p-value<0.05). Among 1000 users, this would add up to an additional 240 person-days per semester that campus is avoided.

Users in treatment 3 (T3 – moral appeal) avoided campus an additional 0.09 days per week (p-value<0.01). For 1000 users, this translates to 1,080 additional person-days campus is avoided. 

In other words, the effect of receiving the moral appeal message translates to the average university-goer avoiding campus one additional day per 12-week semester.
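The back-of-envelope arithmetic behind these person-day figures is simply the per-week effect scaled by a 12-week semester and the number of users:

```python
WEEKS_PER_SEMESTER = 12  # implied by 0.28 days/week ≈ 3 days/semester above
USERS = 1000

effects = {"T1 pro-social": 0.03, "T2 salience": 0.02, "T3 moral": 0.09}
for arm, extra_days_per_week in effects.items():
    person_days = extra_days_per_week * WEEKS_PER_SEMESTER * USERS
    print(f"{arm}: {person_days:.0f} extra person-days avoided per semester")
# -> 360, 240, and 1080, matching the figures above.
```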

Translating these effects into an estimate of COVID risk reduction requires additional analyses (and assumptions about transmissibility and the proportion of symptomatic cases that are COVID vs. flu/cold), but scaled up to the population of HealthCheck users, these represent meaningful changes. At scale, the impacts observed in the Honesty Study will be magnified, with no (or negligible) cost implications, as the messages only add a sentence of framing language.

Secondary outcomes analysis revealed that users exposed to treatment three (the moral appeal, where we see the largest increase in our primary outcomes described above) reported 0.08 more symptoms on average per completed HealthCheck compared to the control group (where an average of 0.04 symptoms were reported per completed HealthCheck submission). This was largely driven by an increase in the reporting of coughs, fevers, and sore throats.

Lastly, we analyzed whether treatment affected patterns of usage within the day. A user who completed HealthCheck and was deemed “at risk” faced nothing stopping them from filling out HealthCheck a second time, with different answers, to receive a “low-risk” status and be cleared to visit campus. We do not find any statistically significant effects of any treatment group on these usage patterns.
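One way to screen for this kind of re-submission, reusing the earlier hypothetical sketch, is to measure how often a user’s same-day results flip from a non-low status to “low” (the metric here is our illustrative construction, not necessarily the study’s exact specification):

```python
# Share of user-days where a "moderate"/"high" result is followed by a
# "low" result the same day (a possible signature of answer-switching).
completed = checks.dropna(subset=["risk_status"]).sort_values("started_at")

def flipped_to_low(statuses: pd.Series) -> bool:
    """True if the day starts moderate/high but ends with a 'low' result."""
    return statuses.iloc[0] in ("moderate", "high") and statuses.iloc[-1] == "low"

flips = (
    completed.groupby(["user_id", "check_date"])["risk_status"]
    .apply(flipped_to_low)
)
print(f"Share of user-days with a flip to low: {flips.mean():.3f}")
# Comparing a measure like this across arms tests whether treatment
# changed within-day usage patterns.
```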

Lessons

1. Small but mighty – minor impacts via digital behavioral nudges make a difference when scaled 

Our results showed that appealing to users’ sense of morality had implications for the number of days per semester students avoided campus, as well as the honest reporting of symptoms. While ~1 extra day of avoiding campus per semester per user may not be a monumental result, when considered on a population scale the impacts add up. A major benefit of this behavior-nudge approach is that the additional cost is negligible or nonexistent. The only change made was in the framing of the text, not in the number or frequency of messages.

2. Digital RCTs allow for harnessing the power of back-end data  

Creatively using back-end data to construct primary outcomes can lead to large sample sizes and a well-powered experiment. For our Honesty experiment, we were able to include almost 20,000 unique users in our sample. Collecting primary data from this many users would impose a huge cost. Instead, we designed our primary and secondary outcomes around back-end data the app was already collecting. This large sample size allowed us to detect small but meaningful effect sizes. In other interventions, tiny effect sizes may not be policy-relevant, as the costs of running a program would not be worth a small-magnitude effect. In this context, however, re-framing a standard HealthCheck message costs little to nothing. As such, small effect sizes, on the order of one more day per semester in which a person with COVID symptoms avoids campus, can justify the low cost of adopting the intervention at scale.

3. Honesty matters

Given their limited ability to continuously test for COVID cases, governments have relied on self-screening and self-reporting of testing data and symptoms, often through internet or mobile applications.3 Digital self-assessments rely on the honesty of the people reporting their symptoms, as they weigh the tradeoffs between gaining access to desired services and risks to themselves and their community. Our findings – that appealing to an individual’s sense of morality improved honest reporting – have implications for those designing behavioral nudges for COVID-19 screening and beyond.

  1. The initial design included an arm testing the effect of simply moving the accuracy prompt from the end of the symptom question list to the beginning, without any additional framing. We felt this was worth testing given the mixed evidence from Shu et al. 2012 and, later, Kristal et al. 2020. We ultimately removed this arm, given that the 2012 paper was retracted over reports of fraudulent data, and focused the experiment on testing message framing.
  2. Being categorized as high or moderate risk was used as a proxy for users avoiding campus.
  3. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7584449/