
Applying AI to improve your program? Look out for its risks and unintended consequences

©️IDinsight

AI is rapidly transforming service delivery in the development sector. From health diagnostics to education platforms to agricultural advice, AI applications are scaling faster than most technology interventions we have seen before. This warrants close examination of the benefits and harms of an AI-powered development program.

While we foresee unprecedented upsides to AI applications, we are cautious of their unintended consequences. Here, we are not referring to existential AI risks and implications at a societal scale, like automation-induced job loss, environmental impacts, or theorized apocalyptic outcomes, although these are important and should remain on our radar. 

Our focus is on the micro and programmatic level, asking: What types of unintended consequences might arise from applying generative AI to a given development problem? Below, we have shared three examples. Each follows a common structure:

  • Spotlighting the status quo: How services are currently delivered and what gaps or inequities exist.
  • Outlining the opportunity: What AI could plausibly improve.
  • Surfacing the risk: What unintended consequences might emerge from AI adoption at scale.

1. Creative constraint and over-standardization:
An example in education

The status quo: Teaching quality varies wildly. The best teachers adapt on the fly: extending activities that work, skipping material students already know, and adjusting based on what they see in real time. Weaker teachers rely on rote instruction. In one seven-country study across Sub-Saharan Africa, only 7% of teachers met minimum subject knowledge standards and just 10% reached basic pedagogical thresholds. Nearly 40% taught unplanned lessons.

The opportunity: AI can provide structured, evidence-based lesson plans and real-time coaching to under-resourced teachers. These systems offer standardized curricula incorporating pedagogical best practices, scripts for effective questioning, and content customized to student performance. Research shows scripted approaches work—one analysis found programs using structured guides achieved learning gains averaging 6.1 correct words per minute in oral reading fluency.

The risk: Here’s the catch. That same research reveals a counterintuitive pattern. While scripted materials improve outcomes overall, there’s actually a negative relationship between scripting levels and impact. For every additional 10 percentage points of scripting, learning gains decreased by 1.4 correct words per minute.

Why? Because how teachers deviate from scripts matters enormously. When teachers made thoughtful structural modifications—changing activities to boost engagement—learning improved. When they simply skipped content (which happened 99% of the time), learning suffered.

AI systems optimized for consistency might inadvertently kill the adaptation that separates great teaching from mediocre teaching. If AI tools are too rigid, we risk “mean reversion”—constraining good teachers while failing to help struggling teachers develop their judgment. The result: flatter, more consistent teaching, but potentially fewer bright spots of outstanding teaching.

2. Reduction in trust and loss of personalization:
An example in health

The status quo: Frontline health workers are community institutions. During a week shadowing health workers in the Philippines, we watched decades-old relationships shape how workers advised, persuaded, and supported clients. They serve as reliable information delivery systems, grounded in a deep understanding of family dynamics, cultural barriers, and individual circumstances.

The opportunity: Many frontline workers are overwhelmed and sometimes under-informed, leading to missed diagnoses and inconsistent counseling. AI tools can support and extend their capacity—offering diagnostic support, evidence-based treatment recommendations, and motivational interviewing scripts for vaccine-hesitant groups. AI can also directly educate populations through chatbots and call centers.

The risk: Two potential problems lurk beneath the efficiency gains.

First, trust might erode. A large-scale RCT in West Bengal found that SMS messages from credible experts increased preventive health behaviors even when identical information was widely available. In our Philippines surveys, “barangay health worker” consistently ranks as the most trusted health information source, while “the internet” ranks dead last.

Imagine a pregnant woman texting her health worker at lunch: “Is it safe to get a COVID vaccine at 35 weeks?” Now imagine directing her to a chatbot instead. Same information, different source: potentially, a different outcome.

Second, contextual knowledge gets lost. Human relationships capture nuances that matter for health behaviors. Rural health workers providing nutrition counseling in Bangladesh learned they needed to target different family members, such as husbands and mothers-in-law, with sophisticated messaging strategies based on household power dynamics. They knew when family support systems broke down, when stigma prevented care-seeking, and when spiritual beliefs affected treatment adherence.

AI lowers the cost of delivering accurate information, whether through chatbots for beneficiaries or AI assistants for health workers. But information alone rarely drives behavior. Key mediators like trust and the surrounding social context depend on sustained relationships. As AI integrates into programs, we must understand how it might disrupt these factors and design systems that protect the relational drivers of effective interventions.

3. Selection bias and unequal benefits:
An example in agriculture

The status quo: Agricultural extension operates through a patchwork of government agencies, NGOs, and community systems delivering knowledge to farmers. Field officers provide training and tailored advice based on direct observation of soil, weather, and financial constraints. But coverage is thin: across Sub-Saharan Africa, ratios range from one extension worker per 1,800 farmers to one per 10,000, far thinner than the FAO guideline of 1:800. Chronic understaffing means agents visit infrequently and often focus on better-connected, prosperous farmers.

The opportunity: AI tools can bridge the gap. Chatbots like FarmerAI deliver real-time, region-specific agricultural insights directly to farmers’ phones. Smartphone apps enable soil monitoring and early detection of diseases, pests, and weeds. These tools can improve efficiency and profitability for smallholder farmers at scale.

The risk: Self-directed learning replaces human supervision, creating selection bias problems. Instead of being invited to an extension worker training, farmers interact with chatbots independently. This shifts the burden from extension agents motivating farmers to farmers motivating themselves.

Who thrives in this system? Motivated, technologically literate farmers, who are also likely economically better off. Who gets left behind? Everyone else. This isn’t just about digital divides. In markets with competitive dynamics, tech-savvy farmers could capture larger market shares, leaving less connected farmers worse off than before the intervention.

Beyond motivation, the accountability that comes from human relationships—farmers not wanting to disappoint extension agents, extension agents feeling responsible for farmer success—gets lost when we substitute people with apps.

What this means for implementation and evaluation

These aren’t inevitable outcomes; they’re risks to watch for and design around. The key to identifying these risks is evaluation that goes beyond measuring average impacts to capturing heterogeneous effects and unintended consequences. We propose a few suggestions below:

Measure baseline attributes that matter. If limiting productive creativity is a risk, measure teacher ability at baseline. If behavior change might depend on trust and human interaction, assess whether those relationships currently exist. If selection bias may be a concern, collect data on motivation, technology access, and equity dimensions. These variables enable analysis beyond averages to understand which groups benefit and which get left behind.
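To make "analysis beyond averages" concrete, here is a minimal Python sketch that compares the overall treatment effect with the effect inside each baseline subgroup. The records, the `smartphone` attribute, and the function names are hypothetical and made up purely for illustration; a real analysis would use regression with interaction terms and proper standard errors.

```python
from statistics import mean


def average_effect(records):
    """Treated mean minus control mean across all records."""
    treated = [r["outcome"] for r in records if r["treated"]]
    control = [r["outcome"] for r in records if not r["treated"]]
    return mean(treated) - mean(control)


def effect_by_subgroup(records, subgroup_key):
    """Return {subgroup value: treated mean - control mean} per baseline subgroup."""
    groups = {}
    for r in records:
        groups.setdefault(r[subgroup_key], []).append(r)
    return {value: average_effect(rows) for value, rows in groups.items()}


# Made-up illustrative records (NOT real data): an outcome score per farmer,
# with smartphone ownership measured at baseline.
records = [
    {"treated": True, "smartphone": True, "outcome": 12.0},
    {"treated": True, "smartphone": True, "outcome": 14.0},
    {"treated": False, "smartphone": True, "outcome": 9.0},
    {"treated": False, "smartphone": True, "outcome": 9.0},
    {"treated": True, "smartphone": False, "outcome": 8.0},
    {"treated": True, "smartphone": False, "outcome": 8.0},
    {"treated": False, "smartphone": False, "outcome": 8.0},
    {"treated": False, "smartphone": False, "outcome": 8.0},
]

overall = average_effect(records)
by_group = effect_by_subgroup(records, "smartphone")
print(overall, by_group)
```

In this toy data the average effect looks positive, yet all of it comes from farmers who owned a smartphone at baseline. That pattern is only visible because the attribute was measured before the program began.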

Design for heterogeneity. If subgroup effects could make or break your impact case, power your study to detect them. Analyze differences not just in final outcomes but in early take-up and engagement patterns.
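As a rough illustration of what powering a study for subgroup effects can imply, the sketch below uses the standard normal-approximation sample size formula for a two-arm comparison of means. The assumptions are ours, not drawn from any specific study: individual-level randomization with no clustering, equal variances, effects expressed in standard deviation units, and two equal-sized subgroups. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.8):
    """Sample size per arm to detect an average effect (in SD units)
    with a two-sided test, using the normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def n_per_arm_for_subgroup_gap(gap, alpha=0.05, power=0.8):
    """Sample size per arm to detect a difference in effects (in SD units)
    between two equal-sized subgroups. The estimated gap has four times the
    variance of the average effect, so the requirement is four times larger."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(8 * (z_alpha + z_power) ** 2 / gap ** 2)


main_n = n_per_arm(0.2)
gap_n = n_per_arm_for_subgroup_gap(0.2)
print(main_n, gap_n)
```

Under these assumptions, detecting a gap between subgroups takes roughly four times the sample needed to detect an average effect of the same size. Clustered designs, unequal subgroup sizes, or smaller expected gaps would push the requirement higher, which is why subgroup questions need to be settled at the design stage.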

Build in participatory processes. Involve affected stakeholders before system design locks in, not after problems emerge. This helps align AI systems with user values and real-world conditions. 

The promise of AI in development is real. But a “move fast and break things” approach works poorly when the things you break are health systems, education outcomes, and farmer livelihoods. Well-timed evaluations that account for both intended and unintended consequences are critical to translating AI’s potential for development into meaningful reality rather than hype.