Pushpak Bhattacharyya

"My life is miserable, have to sign 500 autographs everyday": Exposing Humblebragging, the Brags in Disguise

Dec 28, 2024

Sharath Naganna, Saprativa Bhattacharjee, Pushpak Bhattacharyya, Biplab Banerjee

Abstract: Humblebragging is a phenomenon where individuals present self-promotional statements under the guise of modesty or complaints. For example, a statement like "Ugh, I can't believe I got promoted to lead the entire team. So stressful!" subtly highlights an achievement while pretending to complain. Detecting humblebragging is important for machines to better understand the nuances of human language, especially in tasks like sentiment analysis and intent recognition. However, this topic has not yet been studied in computational linguistics. For the first time, we introduce the task of automatically detecting humblebragging in text. We formalize the task by proposing a 4-tuple definition of humblebragging and evaluate machine learning, deep learning, and large language models (LLMs) on this task, comparing their performance with humans. We also create and release a dataset called HB24, containing 3,340 humblebrags generated using GPT-4o. Our experiments show that detecting humblebragging is non-trivial, even for humans. Our best model achieves an F1-score of 0.88. This work lays the foundation for further exploration of this nuanced linguistic phenomenon and its integration into broader natural language understanding systems.

* Under review at ARR

"Did my figure do justice to the answer?" : Towards Multimodal Short Answer Grading with Feedback (MMSAF)

Dec 27, 2024

Pritam Sil, Bhaskaran Raman, Pushpak Bhattacharyya

Abstract: Personalized feedback plays a vital role in a student's learning process. While existing systems are adept at providing feedback over MCQ-based evaluation, this work focuses on subjective and open-ended questions, a setting similar to the problem of Automatic Short Answer Grading (ASAG) with feedback. Additionally, we introduce the Multimodal Short Answer Grading with Feedback (MMSAF) problem, extending the traditional ASAG feedback problem to address the scenario where the student answer and reference answer might contain images. Moreover, we introduce the MMSAF dataset with 2197 data points along with an automated framework for generating such datasets. Our evaluations of existing LLMs on this dataset show an overall accuracy of 55% on Level of Correctness labels, 75% on Image Relevance labels, and a score of 4.27 out of 5 for the correctness of LLM-generated feedback as rated by experts. According to the experts, Pixtral achieved a rating above 4 on all metrics, indicating that it is more aligned with human judgement and that it is the best solution for assisting students.

Reconsidering SMT Over NMT for Closely Related Languages: A Case Study of Persian-Hindi Pair

Dec 22, 2024

Waisullah Yousofi, Pushpak Bhattacharyya

Abstract: This paper demonstrates that Phrase-Based Statistical Machine Translation (PBSMT) can outperform Transformer-based Neural Machine Translation (NMT) in moderate-resource scenarios, specifically for structurally similar languages, like the Persian-Hindi pair. Despite the Transformer architecture's typical preference for large parallel corpora, our results show that PBSMT achieves a BLEU score of 66.32, significantly exceeding the Transformer-NMT score of 53.7 on the same dataset. Additionally, we explore variations of the SMT architecture, including training on Romanized text and modifying the word order of Persian sentences to match the left-to-right (LTR) structure of Hindi. Our findings highlight the importance of choosing the right architecture based on language pair characteristics and advocate for SMT as a high-performing alternative, even in contexts commonly dominated by NMT.

* 5 pages, 2 figures

RoundTripOCR: A Data Generation Technique for Enhancing Post-OCR Error Correction in Low-Resource Devanagari Languages

Dec 14, 2024

Harshvivek Kashid, Pushpak Bhattacharyya

Abstract: Optical Character Recognition (OCR) technology has revolutionized the digitization of printed text, enabling efficient data extraction and analysis across various domains. Just like Machine Translation systems, OCR systems are prone to errors. In this work, we address the challenge of data generation and post-OCR error correction, specifically for low-resource languages. We propose an approach for synthetic data generation for Devanagari languages, RoundTripOCR, that tackles the scarcity of post-OCR error correction datasets for low-resource languages. We release post-OCR text correction datasets for Hindi, Marathi, Bodo, Nepali, Konkani and Sanskrit. We also present a novel approach for OCR error correction by leveraging techniques from machine translation. Our method involves translating erroneous OCR output into a corrected form by treating the OCR errors as mistranslations in a parallel text corpus, employing pre-trained transformer models to learn the mapping from erroneous to correct text pairs, effectively correcting OCR errors.

Are Language Models Agnostic to Linguistically Grounded Perturbations? A Case Study of Indic Languages

Dec 14, 2024

Poulami Ghosh, Raj Dabre, Pushpak Bhattacharyya

Abstract: Pre-trained language models (PLMs) are known to be susceptible to perturbations to the input text, but existing work does not explicitly focus on linguistically grounded attacks, which are subtle and more prevalent in nature. In this paper, we study whether PLMs are agnostic to linguistically grounded attacks. To this end, we offer the first study addressing this question, investigating different Indic languages and various downstream tasks. Our findings reveal that although PLMs are susceptible to linguistic perturbations, they exhibit a slightly lower susceptibility to linguistic attacks than to non-linguistic attacks. This highlights that even constrained attacks are effective. Moreover, we investigate the implications of these outcomes across a range of languages, encompassing diverse language families and different scripts.

* Work in Progress

Timing Matters: Enhancing User Experience through Temporal Prediction in Smart Homes

Nov 27, 2024

Shrey Ganatra, Spandan Anaokar, Pushpak Bhattacharyya

Abstract: Have you ever considered the sheer volume of actions we perform using IoT (Internet of Things) devices within our homes, offices, and daily environments? From the mundane act of flicking a light switch to the precise adjustment of room temperatures, we are surrounded by a wealth of data, each representing a glimpse into user behaviour. While existing research has sought to decipher user behaviours from these interactions and their timestamps, a critical dimension still needs to be explored: the timing of these actions. Despite extensive efforts to understand and forecast user behaviours, the temporal dimension of these interactions has received scant attention. However, the timing of actions holds profound implications for user experience, efficiency, and overall satisfaction with intelligent systems. In our paper, we venture into the less-explored realm of human-centric AI by endeavoring to predict user actions and their timing. To achieve this, we contribute a meticulously synthesized dataset comprising 11k sequences of actions paired with their respective date and time stamps. Building upon this dataset, we propose our model, which employs advanced machine learning techniques for k-class classification over time intervals within a day. To the best of our knowledge, this is the first attempt at time prediction for smart homes. We achieve a 40% (96-class) accuracy across all datasets and an 80% (8-class) accuracy on the dataset containing exact timestamps, showcasing the efficacy of our approach in predicting the temporal dynamics of user actions within smart environments.

* 7 pages + 1 reference, 5 figures, 5 tables
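
The abstract above frames time prediction as k-class classification over intervals within a day. A minimal sketch of one way such labels could be built is shown here, assuming equal-width bins (so 96 classes would correspond to 15-minute slots and 8 classes to 3-hour slots). The equal-width assumption and the function names `time_to_class` and `class_to_start` are illustrative and are not taken from the paper.

```python
from datetime import time

MINUTES_PER_DAY = 24 * 60


def time_to_class(t: time, k: int) -> int:
    """Map a time of day to one of k equal-width interval classes (0 .. k-1).

    Assumption: the day is split into k bins of identical width.
    """
    if k <= 0 or MINUTES_PER_DAY % k != 0:
        raise ValueError("k must be a positive divisor of 1440 (minutes per day)")
    width = MINUTES_PER_DAY // k
    minutes_since_midnight = t.hour * 60 + t.minute
    return minutes_since_midnight // width


def class_to_start(c: int, k: int) -> time:
    """Return the start time of interval class c under the same binning."""
    if k <= 0 or MINUTES_PER_DAY % k != 0:
        raise ValueError("k must be a positive divisor of 1440 (minutes per day)")
    if not 0 <= c < k:
        raise ValueError("class index out of range")
    start = c * (MINUTES_PER_DAY // k)
    return time(start // 60, start % 60)


if __name__ == "__main__":
    event = time(7, 50)
    print(time_to_class(event, 96), class_to_start(time_to_class(event, 96), 96))
    print(time_to_class(event, 8), class_to_start(time_to_class(event, 8), 8))
```

Under this binning, a coarser k makes the task easier because each class covers a wider window, which is consistent with the higher accuracy reported for the 8-class setting than for the 96-class setting.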

Together We Can: Multilingual Automatic Post-Editing for Low-Resource Languages

Oct 23, 2024

Sourabh Deoghare, Diptesh Kanojia, Pushpak Bhattacharyya

Abstract: This exploratory study investigates the potential of multilingual Automatic Post-Editing (APE) systems to enhance the quality of machine translations for low-resource Indo-Aryan languages. Focusing on two closely related language pairs, English-Marathi and English-Hindi, we exploit the linguistic similarities to develop a robust multilingual APE model. To facilitate cross-linguistic transfer, we generate synthetic Hindi-Marathi and Marathi-Hindi APE triplets. Additionally, we incorporate a Quality Estimation (QE)-APE multi-task learning framework. While the experimental results underline the complementary nature of APE and QE, we also observe that QE-APE multitask learning facilitates effective domain adaptation. Our experiments demonstrate that the multilingual APE models outperform their corresponding English-Hindi and English-Marathi single-pair models by 2.5 and 2.39 TER points, respectively, with further notable improvements over the multilingual APE model observed through multi-task learning (+1.29 and +1.44 TER points), data augmentation (+0.53 and +0.45 TER points) and domain adaptation (+0.35 and +0.45 TER points). We release the synthetic data, code, and models accrued during this study publicly at https://github.com/cfiltnlp/Multilingual-APE.

* Accepted at Findings of EMNLP 2024
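
The gains above are reported in TER (Translation Edit Rate) points, where a lower score means fewer edits are needed to turn the system output into the reference. The sketch below is a simplified illustration only: it counts word-level insertions, deletions, and substitutions and divides by the reference length, but it omits the block-shift operation that full TER also allows, so it can overestimate the true score. The function names are hypothetical and this is not the evaluation code used in the paper.

```python
def word_edit_distance(hyp: list, ref: list) -> int:
    """Word-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        curr = [i]
        for j, r in enumerate(ref, 1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_ter(hypothesis: str, reference: str) -> float:
    """Edits per reference word, as a percentage (no shift operations)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 100.0 * word_edit_distance(hyp, ref) / len(ref)


if __name__ == "__main__":
    mt_output = "the cat sat on mat"
    post_edit = "the cat sat on the mat"
    print(round(simplified_ter(mt_output, post_edit), 2))
```

For real evaluations, an established TER implementation that handles shifts and tokenization should be preferred over this sketch.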

ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries

Oct 22, 2024

Kishan Maharaj, Vitobha Munigala, Srikanth G. Tamilselvam, Prince Kumar, Sayandeep Sen, Palani Kodeswaran, Abhijit Mishra, Pushpak Bhattacharyya

Abstract: Recent advancements in large language models (LLMs) have significantly enhanced their ability to understand both natural language and code, driving their use in tasks like natural language-to-code (NL2Code) and code summarization. However, LLMs are prone to hallucination, i.e., outputs that stray from intended meanings. Detecting hallucinations in code summarization is especially difficult due to the complex interplay between programming and natural languages. We introduce a first-of-its-kind dataset with ~10K samples, curated specifically for hallucination detection in code summarization. We further propose a novel Entity Tracing Framework (ETF) that a) utilizes static program analysis to identify code entities from the program and b) uses LLMs to map and verify these entities and their intents within generated code summaries. Our experimental analysis demonstrates the effectiveness of the framework, leading to a 0.73 F1 score. This approach provides an interpretable method for detecting hallucinations by grounding entities, allowing us to evaluate summary accuracy.

* 11 pages, 6 Figures, 5 Tables

Beyond Aesthetics: Cultural Competence in Text-to-Image Models

Jul 11, 2024

Nithish Kannen, Arif Ahmad, Marco Andreetto, Vinodkumar Prabhakaran, Utsav Prabhu, Adji Bousso Dieng, Pushpak Bhattacharyya, Shachi Dave

Abstract: Text-to-Image (T2I) models are being increasingly adopted in diverse global communities where they create visual representations of their unique cultures. Current T2I benchmarks primarily focus on faithfulness, aesthetics, and realism of generated images, overlooking the critical dimension of cultural competence. In this work, we introduce a framework to evaluate cultural competence of T2I models along two crucial dimensions: cultural awareness and cultural diversity, and present a scalable approach using a combination of structured knowledge bases and large language models to build a large dataset of cultural artifacts to enable this evaluation. In particular, we apply this approach to build CUBE (CUltural BEnchmark for Text-to-Image models), a first-of-its-kind benchmark to evaluate cultural competence of T2I models. CUBE covers cultural artifacts associated with 8 countries across different geo-cultural regions and along 3 concepts: cuisine, landmarks, and art. CUBE consists of 1) CUBE-1K, a set of high-quality prompts that enable the evaluation of cultural awareness, and 2) CUBE-CSpace, a larger dataset of cultural artifacts that serves as grounding to evaluate cultural diversity. We also introduce cultural diversity as a novel T2I evaluation component, leveraging quality-weighted Vendi score. Our evaluations reveal significant gaps in the cultural awareness of existing models across countries and provide valuable insights into the cultural diversity of T2I outputs for under-specified prompts. Our methodology is extendable to other cultural regions and concepts, and can facilitate the development of T2I models that better cater to the global population.

* 30 pages, 10 figures, preprint

Looks can be Deceptive: Distinguishing Repetition Disfluency from Reduplication

Jul 11, 2024

Arif Ahmad, Mothika Gayathri Khyathi, Pushpak Bhattacharyya

Abstract: Reduplication and repetition, though similar in form, serve distinct linguistic purposes. Reduplication is a deliberate morphological process used to express grammatical, semantic, or pragmatic nuances, while repetition is often unintentional and indicative of disfluency. This paper presents the first large-scale study of reduplication and repetition in speech using computational linguistics. We introduce IndicRedRep, a new publicly available dataset containing Hindi, Telugu, and Marathi text annotated with reduplication and repetition at the word level. We evaluate transformer-based models for multi-class reduplication and repetition token classification, utilizing the Reparandum-Interregnum-Repair structure to distinguish between the two phenomena. Our models achieve macro F1 scores of up to 85.62% in Hindi, 83.95% in Telugu, and 84.82% in Marathi for reduplication-repetition classification.
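
Macro F1, the metric reported above, averages the per-class F1 scores with equal weight, so a rare class counts as much as a frequent one. The following is a minimal sketch of that computation for token-level labels. The label names `O`, `RED`, and `REP` are hypothetical placeholders and are not the tag set of IndicRedRep.

```python
def macro_f1(gold: list, pred: list) -> float:
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("at least one token is required")
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class has no hits.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical tags: O = other, RED = reduplication, REP = repetition.
    gold = ["O", "RED", "REP", "O"]
    pred = ["O", "RED", "O", "O"]
    print(macro_f1(gold, pred))
```

In this toy example the missed `REP` token pulls the macro score down sharply even though three of four tokens are correct, which is why macro averaging is a common choice when the classes of interest are rare.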
