Abstract:Initiation, monitoring, and evaluation of development programmes can involve field-based data collection about project activities. This data collection through digital devices may not always be feasible though, for reasons such as unaffordability of smartphones and tablets by field-based cadre, or shortfalls in their training and capacity building. Paper-based data collection has been argued to be more appropriate in several contexts, with automated digitization of the paper forms through OCR (Optical Character Recognition) and OMR (Optical Mark Recognition) techniques. We contribute with providing a large dataset of handwritten digits, and deep learning based models and methods built using this data, that are effective in real-world environments. We demonstrate the deployment of these tools in the context of a maternal and child health and nutrition awareness project, which uses IVR (Interactive Voice Response) systems to provide awareness information to rural women SHG (Self Help Group) members in north India. Paper forms were used to collect phone numbers of the SHG members at scale, which were digitized using the OCR tools developed by us, and used to push almost 4 million phone calls. The data, model, and code have been released in the open-source domain.
Abstract:Sexual education aims to foster a healthy lifestyle in terms of emotional, mental and social well-being. In countries like India, where adolescents form the largest demographic group, they face significant vulnerabilities concerning sexual health. Unfortunately, sexual education is often stigmatized, creating barriers to providing essential counseling and information to this at-risk population. Consequently, issues such as early pregnancy, unsafe abortions, sexually transmitted infections, and sexual violence become prevalent. Our current proposal aims to provide a safe and trustworthy platform for sexual education to the vulnerable rural Indian population, thereby fostering the healthy and overall growth of the nation. In this regard, we strive towards designing SUKHSANDESH, a multi-staged AI-based Question Answering platform for sexual education tailored to rural India, adhering to safety guardrails and regional language support. By utilizing information retrieval techniques and large language models, SUKHSANDESH will deliver effective responses to user queries. We also propose to anonymise the dataset to mitigate safety measures and set AI guardrails against any harmful or unwanted response generation. Moreover, an innovative feature of our proposal involves integrating ``avatar therapy'' with SUKHSANDESH. This feature will convert AI-generated responses into real-time audio delivered by an animated avatar speaking regional Indian languages. This approach aims to foster empathy and connection, which is particularly beneficial for individuals with limited literacy skills. Partnering with Gram Vaani, an industry leader, we will deploy SUKHSANDESH to address sexual education needs in rural India.
Abstract:Understanding what factors bring about socio-economic development may often suffer from the streetlight effect, of analyzing the effect of only those variables that have been measured and are therefore available for analysis. How do we check whether all worthwhile variables have been instrumented and considered when building an econometric development model? We attempt to address this question by building unsupervised learning methods to identify and rank news articles about diverse events occurring in different districts of India, that can provide insights about what may have transpired in the districts. This can help determine whether variables related to these events are indeed available or not to model the development of these districts. We also describe several other applications that emerge from this approach, such as to use news articles to understand why pairs of districts that may have had similar socio-economic indicators approximately ten years back ended up at different levels of development currently, and another application that generates a newsfeed of unusual news articles that do not conform to news articles about typical districts with a similar socio-economic profile. These applications outline the need for qualitative data to augment models based on quantitative data, and are meant to open up research on new ways to mine information from unstructured qualitative data to understand development.
Abstract:We attempt to make two arguments in this essay. First, through a case study of a mobile phone based voice-media service we have been running in rural central India for more than six years, we describe several implementation complexities we had to navigate towards realizing our intended vision of bringing social development through technology. Most of these complexities arose in the interface of our technology with society, and we argue that even other technology providers can create similar processes to manage this socio-technological interface and ensure intended outcomes from their technology use. We then build our second argument about how to ensure that the organizations behind both market driven technologies and those technologies that are adopted by the state, pay due attention towards responsibly managing the socio-technological interface of their innovations. We advocate for the technology engineers and researchers who work within these organizations, to take up the responsibility and ensure that their labour leads to making the world a better place especially for the poor and marginalized. We outline possible governance structures that can give more voice to the technology developers to push their organizations towards ensuring that responsible outcomes emerge from their technology. We note that the examples we use to build our arguments are limited to contemporary information and communication technology (ICT) platforms used directly by end-users to share content with one another, and hence our argument may not generalize to other ICTs in a straightforward manner.