Statistical validation of a large-scale web survey during the COVID-19 pandemic in India
Abstract
Background: The COVID-19 pandemic created an overwhelming demand for data to guide the response to economic and health emergencies. This pushed remote modes of data collection, such as mobile and web surveys, to the forefront in many low- and middle-income countries, including India, where they had not previously been prominent. The primary concerns with remote-mode surveys are undercoverage of the target population and self-selection of survey respondents, both of which result in biased estimates.
Methods: Using unit-level data for India from the COVID-19 Trends and Impact Survey (CTIS), the largest public health web survey, we examine the bias in estimates of vaccine uptake, a population measure that changed rapidly over time, particularly right after the vaccine roll-out in India on 16 January 2021. In the absence of an independently verified ‘ground truth’ or ‘gold standard’ for assessing bias in surveys, we discuss the need for statistical representativeness of web surveys and methods of achieving it.
Results: Bias in CTIS estimates of vaccine uptake is not constant over time; it increases up to a certain point and then decreases. This pattern is explained by the fact that the variability of the outcome of interest in the population first increases with time and then declines once more than 50% of the population is vaccinated. Validation of the CTIS vaccine uptake estimates was possible because this is one of the rare situations in which reliable gold standard measures were available. For another key CTIS indicator, COVID-like illness (CLI), which is constructed from self-reported symptoms, assessing the bias is not straightforward because the quality of the gold standard is questionable.
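The rise-and-fall pattern described in the Results can be illustrated with the variability of a binary vaccinated/not-vaccinated indicator. For a population in which a share p is vaccinated, the standard deviation of that indicator is sqrt(p(1 - p)), which peaks at p = 0.5. The sketch below is only an illustration of that arithmetic: the function name and the coverage levels are hypothetical and are not CTIS estimates, and it does not reproduce the authors' bias calculation.

```python
def bernoulli_sd(p: float) -> float:
    """Population standard deviation of a 0/1 indicator whose mean is p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return (p * (1.0 - p)) ** 0.5


# Hypothetical vaccination coverage levels over time (not CTIS data).
coverage_levels = [0.05, 0.25, 0.50, 0.75, 0.95]
sds = [bernoulli_sd(p) for p in coverage_levels]

for p, sd in zip(coverage_levels, sds):
    print(f"coverage={p:.2f}  sd={sd:.3f}")
```

Under the assumption that bias scales with the variability of the outcome, the largest bias would be expected near 50% coverage, with smaller bias early and late in the vaccination drive.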
Conclusion: Since the absence of an independently verified ‘ground truth’ or ‘gold standard’ for assessing bias in surveys is well acknowledged, it is crucial to validate the statistical representativeness of web surveys with respect to key demographic characteristics of respondents, which are often correlated with many outcome variables.
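Post-stratification is one common way to align a web sample with known demographic shares of the population. The sketch below shows the basic idea under stated assumptions: the age groups, counts, population shares, and group means are hypothetical, the function names are illustrative, and this is not necessarily the weighting procedure used for CTIS.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    n = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"no respondents in group {group!r}")
        weights[group] = population_shares[group] / (count / n)
    return weights


def weighted_mean(group_means, sample_counts, weights):
    """Mean of the outcome after applying one weight per respondent."""
    num = sum(weights[g] * sample_counts[g] * group_means[g] for g in sample_counts)
    den = sum(weights[g] * sample_counts[g] for g in sample_counts)
    return num / den


# Hypothetical example: younger respondents are over-represented online.
sample_counts = {"18-44": 700, "45+": 300}
population_shares = {"18-44": 0.5, "45+": 0.5}
group_means = {"18-44": 0.2, "45+": 0.6}  # share vaccinated in each group

weights = poststratification_weights(sample_counts, population_shares)
unweighted = weighted_mean(group_means, sample_counts, {g: 1.0 for g in sample_counts})
adjusted = weighted_mean(group_means, sample_counts, weights)
print(f"unweighted={unweighted:.3f}  post-stratified={adjusted:.3f}")
```

The adjustment removes bias only to the extent that the outcome differs across the weighting groups but not between respondents and non-respondents within a group.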
The Medical Research Archives grants authors the right to publish and reproduce the unrevised contribution in whole or in part at any time and in any form for any scholarly non-commercial purpose with the condition that all publications of the contribution include a full citation to the journal as published by the Medical Research Archives.