

Abstract: Social media platforms provide an opportunity to gain valuable insights into user behaviour. Users express their internal feelings and emotions in a disinhibited fashion using natural language. Techniques in Natural Language Processing (NLP) have helped researchers decipher standard documents and draw inferences from massive amounts of data. A representative corpus is a prerequisite for NLP, and one of the challenges we face today is the non-standard and noisy language that exists on the internet. Our work focuses on building a corpus from social media for detecting mental illness. We use depression as a case study and demonstrate the effectiveness of such a corpus in helping practitioners detect such cases. Our results show a high correlation between our Social Media Corpus and the standard corpus for depression.


Abstract: Recent advances in Big Data have prompted healthcare practitioners to utilize the data available on social media to discern sentiment and emotional expression. Health Informatics and Clinical Analytics depend heavily on information gathered from diverse sources. Traditionally, a healthcare practitioner asks a patient to fill out a questionnaire that forms the basis for diagnosing the medical condition. However, medical practitioners have access to many sources of data, including the patient's writings on various media. Natural Language Processing (NLP) allows researchers to gather such data and analyze it to glean the underlying meaning of such writings. The field of sentiment analysis, which is applied in many other domains, depends heavily on NLP techniques. This work looks into various prevalent theories underlying the NLP field and how they can be leveraged to gather users' sentiments on social media. Such sentiments can be collected over a period of time, thus minimizing the errors introduced by data input and other stressors. Furthermore, we look at some applications of sentiment analysis and of NLP to mental health. The reader will also learn about the Natural Language Toolkit (NLTK), which implements various NLP theories, and how it can make the data scavenging process a lot easier.