Household sample surveys are valuable inputs into policy decisions. Making data collection cheaper and faster may expand the use of such surveys. For most household sample surveys, researchers either conduct comprehensive household listings in sampled areas, which can be slow and costly, or rely on field-based household selection methods, which may lead to non-representative samples. In India, we investigate the use of publicly available voter rolls as an alternative to household listings or field-based sampling methods. Using voter rolls for sampling can save the majority of the cost of constructing a sampling frame relative to a household listing, but there is limited evidence on their accuracy and completeness. To assess the suitability of voter rolls for generating household sampling frames, we conducted a household listing in 9 rural and 4 urban polling stations comprising 7,769 voting-age adults across four states. We compared the listing results to the voter rolls for these polling stations and found that, overall, voter rolls include 91% of the households found in the ground-truth household listing. Coverage is significantly higher in rural areas (96%) than in urban areas (78%). Exclusion from voter rolls does not appear to vary by a household’s religion or socioeconomic status, though there is some evidence that wealthier, higher-caste households in urban areas are slightly more likely to be excluded. We conducted simulations to show that sampling from voter rolls can produce estimates of household-level economic variables with little bias, especially in rural areas. These results, albeit not representative of all Indian states, suggest that voter rolls are suitable for constructing household sampling frames in rural areas.
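The logic of such a simulation can be illustrated with a minimal sketch. This is not the code used in the study: the function name `simulate_frame_bias`, its parameters, and the toy values are hypothetical. It assumes a household-level variable observed for every listed household and a flag marking which households also appear in the voter roll, then repeatedly draws simple random samples from the roll and compares the average estimate with the true mean from the full listing.

```python
import random
import statistics


def simulate_frame_bias(population, in_frame, sample_size, n_draws, seed=0):
    """Average sample mean drawn from the frame, minus the true population mean.

    population: household-level values from the full listing (hypothetical data).
    in_frame:   booleans, True if the household also appears in the voter roll.
    """
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)

    # Only households covered by the frame can ever be sampled.
    frame_values = [value for value, keep in zip(population, in_frame) if keep]

    # Repeated simple random samples (without replacement) from the frame.
    estimates = [
        statistics.fmean(rng.sample(frame_values, sample_size))
        for _ in range(n_draws)
    ]
    return statistics.fmean(estimates) - true_mean


if __name__ == "__main__":
    # Toy illustration: four households, the highest-valued one missing from the roll.
    values = [1.0, 2.0, 3.0, 4.0]
    covered = [True, True, True, False]
    print(simulate_frame_bias(values, covered, sample_size=3, n_draws=10))
```

When exclusion is unrelated to the variable of interest, the returned difference should be close to zero; when excluded households differ systematically, as in the toy example, the difference shows the direction and size of the resulting bias.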
In this paper, we set out to assess the suitability of voter rolls for constructing household sampling frames. Two considerations motivated this work: first, sampling households from voter rolls is much cheaper than conducting a household listing, and second, researchers across different fields already use voter rolls for sampling, despite limited evidence on their accuracy and completeness. Based on our findings of low household exclusion and low bias in sampling estimates, especially in rural areas, we recommend that researchers use voter rolls for household sampling in rural areas. In contrast, we caution against using them for sampling households in urban areas, or for sampling individuals in general.
Additionally, given little evidence of systematic exclusion of marginalised subgroups, voter rolls are also a promising household sampling frame for research examining program or service delivery to marginalised groups. For researchers interested in using voter rolls for sampling in India, we have assembled a guide with practical advice for finding and downloading voter rolls, processing and extracting relevant information, and selecting a household sample.
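As a rough illustration of the final step, selecting a household sample, the sketch below groups voter-roll entries into households and draws a simple random sample of households. It is a minimal sketch under stated assumptions, not the procedure from the guide: the field names `house_no` and `name`, and the function `sample_households`, are hypothetical, and it assumes each entry has already been extracted into a dictionary and that a shared house number identifies a household.

```python
import random
from collections import defaultdict


def sample_households(voters, n_households, seed=0):
    """Group voter entries by house number, then sample households at random.

    voters: list of dicts with hypothetical keys "house_no" and "name".
    Returns a dict mapping each sampled house number to its listed voters.
    """
    households = defaultdict(list)
    for voter in voters:
        households[voter["house_no"]].append(voter["name"])

    # Sort the house numbers so the draw is reproducible for a given seed.
    rng = random.Random(seed)
    chosen = rng.sample(sorted(households), n_households)
    return {house: households[house] for house in chosen}


if __name__ == "__main__":
    roll = [
        {"house_no": "12", "name": "Voter A"},
        {"house_no": "12", "name": "Voter B"},
        {"house_no": "15", "name": "Voter C"},
        {"house_no": "21", "name": "Voter D"},
    ]
    print(sample_households(roll, n_households=2))
```

Sampling from the list of distinct households, rather than from individual voters, gives each household the same selection probability regardless of how many registered adults it contains.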
Further research would be useful to assess the accuracy and completeness of voter rolls in other Indian states. Given the high variance in exclusion rates across urban polling stations, it would also be useful to examine the mechanisms that explain this variation, so that researchers can more easily predict the quality of voter rolls in new areas. Finally, we look forward to further innovations in alternative sampling methods that can reduce the cost of sample surveys, such as using satellite imagery to construct household sampling frames.
2 February 2021
30 March 2020
3 September 2020
15 May 2019