Yaman Kumar

CiteCaseLAW: Citation Worthiness Detection in Caselaw for Legal Assistive Writing

May 03, 2023
Mann Khatri, Pritish Wadhwa, Gitansh Satija, Reshma Sheik, Yaman Kumar, Rajiv Ratn Shah, Ponnurangam Kumaraguru

In legal document writing, one of the key elements is properly citing case law and other sources to substantiate claims and arguments. Understanding the legal domain and identifying appropriate citation contexts or cite-worthy sentences are challenging tasks that demand expensive manual annotation. Jargon, intricate semantics, and high domain specificity make legal language complex and make any associated legal task hard to automate. The current work focuses on the problem of citation-worthiness identification, designed as the initial step in today's citation recommendation systems to lighten the burden of extracting an adequate set of citation contexts. To accomplish this, we introduce a labeled dataset of 178M sentences for citation-worthiness detection in the legal domain, drawn from the Caselaw Access Project (CAP). We examine the performance of various deep learning models on this novel dataset. The domain-specific pre-trained model outperforms the others, achieving an 88% F1-score on the citation-worthiness detection task.

* A dataset for the legal domain
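
The task above is essentially binary sentence classification with a domain-specific pre-trained encoder. Below is a minimal sketch of that setup using the Hugging Face Trainer API; the legal-domain checkpoint, toy sentences, and hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: fine-tune a legal-domain transformer for binary
# citation-worthiness classification. Checkpoint, data, and hyperparameters
# are illustrative assumptions, not the paper's exact setup.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "nlpaueb/legal-bert-base-uncased"  # a publicly available legal-domain checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

# Toy data: label 1 = sentence needs a citation, 0 = it does not.
data = Dataset.from_dict({
    "text": ["The court held that the statute was unconstitutional.",
             "The hearing was scheduled for Monday morning."],
    "label": [1, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="cite_worthiness",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=data,
)
trainer.train()
```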

H-AES: Towards Automated Essay Scoring for Hindi

Feb 28, 2023
Shubhankar Singh, Anirudh Pupneja, Shivaansh Mital, Cheril Shah, Manish Bawkar, Lakshman Prasad Gupta, Ajit Kumar, Yaman Kumar, Rushali Gupta, Rajiv Ratn Shah

The use of Natural Language Processing (NLP) for Automated Essay Scoring (AES) has been well explored in the English language, with benchmark models exhibiting performance comparable to human scorers. However, AES in Hindi and other low-resource languages remains unexplored. In this study, we reproduce and compare state-of-the-art methods for AES in the Hindi domain. We employ classical feature-based Machine Learning (ML) and advanced end-to-end models, including LSTM networks and fine-tuned transformer architectures, and derive results comparable to those in the English-language domain. Hindi, being a low-resource language, lacks a dedicated essay-scoring corpus. We train and evaluate our models using translated English essays and empirically measure their performance on our own small-scale, real-world Hindi corpus. We follow this up with an in-depth analysis discussing the prompt-specific behavior of the different language models implemented.

* 9 pages, 3 tables; to be published as part of the Proceedings of the 37th AAAI Conference on Artificial Intelligence
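
Essay-scoring work in this line is conventionally evaluated with Quadratic Weighted Kappa. The snippet below sketches that computation with scikit-learn on toy score vectors; QWK as the metric and the integer score range are assumptions, since the abstract does not spell out the evaluation details.

```python
# Quadratic Weighted Kappa (QWK): agreement between human and predicted
# essay scores, penalizing disagreements by their squared distance.
from sklearn.metrics import cohen_kappa_score

human_scores = [4, 2, 3, 5, 1, 3, 4]   # toy gold scores
model_scores = [4, 3, 3, 5, 2, 3, 4]   # toy predictions

qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"QWK = {qwk:.3f}")
```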

Get It Scored Using AutoSAS -- An Automated System for Scoring Short Answers

Dec 21, 2020
Yaman Kumar, Swati Aggarwal, Debanjan Mahata, Rajiv Ratn Shah, Ponnurangam Kumaraguru, Roger Zimmermann

In the era of MOOCs, online exams are taken by millions of candidates, and scoring their short answers is an integral part of the process. Evaluating these responses with human graders alone is intractable, so a generic automated system capable of grading them should be designed and deployed. In this paper, we present a fast, scalable, and accurate approach to automated Short Answer Scoring (SAS). We propose and explain the design and development of a system for SAS, namely AutoSAS. Given a question along with its graded samples, AutoSAS can learn to grade that prompt successfully. This paper further describes the features, such as lexical diversity, Word2Vec embeddings, and prompt and content overlap, that play a pivotal role in building our proposed model. We also present a methodology for indicating the factors responsible for scoring an answer. The trained model is evaluated on an extensively used public dataset, the Automated Student Assessment Prize Short Answer Scoring (ASAP-SAS) dataset. AutoSAS shows state-of-the-art performance, improving results by over 8% on some question prompts as measured by Quadratic Weighted Kappa (QWK), and performs comparably to humans.
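
As a rough illustration of the feature-based pipeline described above, the sketch below computes two of the named feature families, lexical diversity and prompt/content overlap, and fits an off-the-shelf regressor on toy graded samples. The actual AutoSAS feature set and learner are richer than this; treat it only as the shape of the approach.

```python
# Sketch of two feature families named above: lexical diversity
# (type-token ratio) and prompt/content overlap, fed to an off-the-shelf
# regressor. Illustrative only; not the AutoSAS feature set or model.
from sklearn.ensemble import RandomForestRegressor

def lexical_diversity(text):
    tokens = text.lower().split()
    return len(set(tokens)) / max(len(tokens), 1)

def content_overlap(answer, prompt):
    a, p = set(answer.lower().split()), set(prompt.lower().split())
    return len(a & p) / max(len(a), 1)

prompt = "Explain how photosynthesis converts light into chemical energy."
answers = ["Photosynthesis converts light energy into chemical energy in plants.",
           "I like plants a lot."]
scores = [3.0, 0.0]  # toy graded samples

X = [[lexical_diversity(a), content_overlap(a, prompt)] for a in answers]
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, scores)
print(model.predict(X))
```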

LIFI: Towards Linguistically Informed Frame Interpolation

Nov 11, 2020
Aradhya Neeraj Mathur, Devansh Batra, Yaman Kumar, Rajiv Ratn Shah, Roger Zimmermann, Amanda Stent

In this work, we explore the new problem of frame interpolation for speech videos, which today constitute a major form of online communication. We approach this problem by using several deep learning video generation algorithms to generate the missing frames. We also provide examples where computer vision models, despite showing high performance on conventional non-linguistic metrics, fail to produce faithful interpolations of speech. With this motivation, we provide a new set of linguistically-informed metrics specifically targeted at the problem of speech video interpolation. We also release several datasets to test the speech understanding of computer vision video generation models.

* 9 pages, 7 tables, 4 figures 
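
One plausible form such a linguistically-informed metric could take is a viseme error rate: the normalized edit distance between the viseme sequences recovered from the reference video and from the interpolated video. The sketch below is an assumption about the metric's general shape, not the paper's exact definition.

```python
# Assumed sketch of a viseme error rate: normalized edit distance between
# reference and hypothesis viseme sequences.
def edit_distance(a, b):
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

def viseme_error_rate(ref_visemes, hyp_visemes):
    return edit_distance(ref_visemes, hyp_visemes) / max(len(ref_visemes), 1)

ref = ["p", "ah", "f", "iy"]   # toy viseme labels from the reference video
hyp = ["p", "ah", "v", "iy"]   # toy viseme labels from the generated video
print(viseme_error_rate(ref, hyp))  # 0.25
```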

Calling Out Bluff: Attacking the Robustness of Automatic Scoring Systems with Simple Adversarial Testing

Jul 14, 2020
Yaman Kumar, Mehar Bhatia, Anubha Kabra, Jessy Junyi Li, Di Jin, Rajiv Ratn Shah

Significant progress has been made in deep-learning-based Automatic Essay Scoring (AES) systems in the past two decades. Performance, as commonly measured by standard metrics such as Quadratic Weighted Kappa (QWK) and accuracy, points to the same. However, testing these AES systems on common-sense adversarial examples reveals their lack of natural language understanding capability. Inspired by common student behaviour during examinations, we propose a task-agnostic adversarial evaluation scheme for AES systems to test their natural language understanding capabilities and overall robustness.
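
In the spirit of the evaluation scheme described above, the sketch below applies two simple, hypothetical perturbations, appending an off-topic sentence and repeating the final sentence, and reports how much a scorer's output moves. The paper's actual adversarial suite is broader than this.

```python
# Illustrative adversarial probes for an essay scorer: a robust AES model
# should not change its score much under these meaning-preserving or
# meaning-degrading edits.
def add_irrelevant_sentence(essay):
    filler = "The quick brown fox jumps over the lazy dog."
    return essay + " " + filler

def repeat_final_sentence(essay, times=3):
    last = essay.rstrip(".").split(".")[-1].strip() + "."
    return essay + " " + " ".join([last] * times)

def probe(scorer, essay):
    """Report score deltas under each perturbation for a given scorer()."""
    base = scorer(essay)
    for name, attack in [("irrelevant", add_irrelevant_sentence),
                         ("repetition", repeat_final_sentence)]:
        delta = scorer(attack(essay)) - base
        print(f"{name:<11} score change: {delta:+.2f}")

# Toy scorer that rewards length -- exactly the behaviour such tests expose.
probe(lambda e: len(e.split()) / 10, "Essays should be judged on content.")
```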

"Notic My Speech" -- Blending Speech Patterns With Multimedia

Jun 12, 2020
Dhruva Sahrawat, Yaman Kumar, Shashwat Aggarwal, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann

Speech as a natural signal is composed of three parts: visemes (the visual part of speech), phonemes (the spoken part of speech), and language (the imposed structure). However, video, as a medium for the delivery of speech and as a multimedia construct, has mostly ignored the cognitive aspects of speech delivery. For example, video applications like transcoding and compression have so far ignored how speech is delivered and heard. To close the gap between speech understanding and multimedia video applications, in this paper we present initial experiments that model the perception of visual speech and show its use case in video compression. On the other hand, in the visual speech recognition domain, existing studies have mostly modeled the task as a classification problem, ignoring the correlations between views, phonemes, visemes, and speech perception. This results in solutions that are further away from how human perception works. To bridge this gap, we propose a view-temporal attention mechanism to model both the view dependence and the visemic importance in speech recognition and understanding. We conduct experiments on three public visual speech recognition datasets. The experimental results show that our proposed method outperforms existing work by 4.99% in terms of viseme error rate. Moreover, we show that there is a strong correlation between our model's understanding of multi-view speech and human perception. This characteristic benefits downstream applications such as video compression and streaming, where a significant number of less important frames can be compressed or eliminated while maximally preserving human speech understanding and a good user experience.

* Under Review 
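
A minimal sketch of how a view-temporal attention block could be arranged, first weighting camera views within each frame and then weighting frames over time, is shown below in PyTorch. The tensor layout and dimensions are assumptions; this is not the paper's exact architecture.

```python
# Minimal sketch of view-temporal attention: soft weights over camera
# views, then over time steps. One possible arrangement only.
import torch
import torch.nn as nn

class ViewTemporalAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.view_score = nn.Linear(dim, 1)   # scores each view per frame
        self.time_score = nn.Linear(dim, 1)   # scores each fused frame

    def forward(self, x):                                 # x: (batch, time, views, dim)
        view_w = torch.softmax(self.view_score(x), dim=2)      # attention over views
        fused = (view_w * x).sum(dim=2)                        # (batch, time, dim)
        time_w = torch.softmax(self.time_score(fused), dim=1)  # attention over time
        return (time_w * fused).sum(dim=1)                     # (batch, dim)

feats = torch.randn(2, 25, 3, 128)   # 2 clips, 25 frames, 3 views, 128-d features
print(ViewTemporalAttention(128)(feats).shape)  # torch.Size([2, 128])
```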

audino: A Modern Annotation Tool for Audio and Speech

Jun 09, 2020
Manraj Singh Grover, Pakhi Bamdev, Yaman Kumar, Mika Hama, Rajiv Ratn Shah

In this paper, we introduce audino, a collaborative and modern annotation tool for audio and speech. The tool allows annotators to define and describe temporal segments in audio files. These segments can be labelled and transcribed easily using a dynamically generated form. An admin can centrally control user roles and project assignment through the admin dashboard, which also enables describing labels and their values. The annotations can easily be exported in JSON format for further processing. The tool allows audio data to be uploaded and assigned to a user through a key-based API. The flexibility of the annotation tool enables annotation for Speech Scoring, Voice Activity Detection (VAD), Speaker Diarisation, Speaker Identification, Speech Recognition, and Emotion Recognition tasks, among others. The MIT open-source license allows it to be used for academic and commercial projects.

* Submitted to the 28th ACM International Conference on Multimedia
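
Purely as a hypothetical illustration of the kind of segment-level record such a JSON export might contain (the abstract does not show the tool's actual schema), a sketch:

```python
# Hypothetical example of a segment-level annotation record; field names
# and structure are assumptions, not audino's real export schema.
import json

annotation = {
    "filename": "interview_001.wav",
    "segments": [
        {"start": 0.00, "end": 2.35,
         "label": {"speaker": "A", "emotion": "neutral"},
         "transcription": "Good morning, thanks for joining."},
        {"start": 2.35, "end": 4.10,
         "label": {"speaker": "B", "emotion": "happy"},
         "transcription": "Happy to be here."},
    ],
}
print(json.dumps(annotation, indent=2))
```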

Multi-modal Automated Speech Scoring using Attention Fusion

May 17, 2020
Manraj Singh Grover, Yaman Kumar, Sumit Sarin, Payman Vafaee, Mika Hama, Rajiv Ratn Shah

In this study, we propose a novel multi-modal end-to-end neural approach for automated assessment of non-native English speakers' spontaneous speech using attention fusion. The pipeline employs Bi-directional Recurrent Convolutional Neural Networks and Bi-directional Long Short-Term Memory Networks to encode acoustic and lexical cues from spectrograms and transcriptions, respectively. Attention fusion is performed on these learned predictive features to capture complex interactions between the modalities before final scoring. We compare our model with strong baselines and find that combined attention to both lexical and acoustic cues significantly improves the overall performance of the system. Further, we present a qualitative and quantitative analysis of our model.

* Submitted to INTERSPEECH 2020 
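
A minimal sketch of the attention-fusion idea, learning a soft weight per modality over stand-in acoustic and lexical encodings before a final scoring head, is shown below in PyTorch. The encoder outputs, dimensions, and head are assumptions, not the paper's exact model.

```python
# Sketch of attention fusion over acoustic and lexical encodings: learn a
# soft weight per modality before scoring. Encoder outputs are stand-ins
# for the BiRCNN/BLSTM features described above.
import torch
import torch.nn as nn

class AttentionFusionScorer(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.attn = nn.Linear(dim, 1)     # scores each modality vector
        self.head = nn.Linear(dim, 1)     # final proficiency score

    def forward(self, acoustic, lexical):                       # each: (batch, dim)
        modalities = torch.stack([acoustic, lexical], dim=1)    # (batch, 2, dim)
        weights = torch.softmax(self.attn(modalities), dim=1)   # (batch, 2, 1)
        fused = (weights * modalities).sum(dim=1)               # (batch, dim)
        return self.head(fused).squeeze(-1)                     # (batch,)

acoustic = torch.randn(4, 256)   # e.g. pooled spectrogram-encoder output
lexical = torch.randn(4, 256)    # e.g. pooled transcription-encoder output
print(AttentionFusionScorer()(acoustic, lexical).shape)  # torch.Size([4])
```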

Learning based Methods for Code Runtime Complexity Prediction

Nov 04, 2019
Jagriti Sikka, Kushal Satya, Yaman Kumar, Shagun Uppal, Rajiv Ratn Shah, Roger Zimmermann

Predicting the runtime complexity of a program is an arduous task. In fact, even for humans, it requires subtle analysis and comprehensive knowledge of algorithms to predict the time complexity of arbitrary code with high fidelity. As Turing's halting problem shows, determining code complexity exactly is undecidable in general. Nevertheless, an approximate solution to this task can help developers get real-time feedback on the efficiency of their code. In this work, we model the problem as a machine learning task and check its feasibility with thorough analysis. Due to the lack of any open-source dataset for this task, we introduce our own annotated dataset, CoRCoD (Code Runtime Complexity Dataset), extracted from online judges. We establish baselines using two different approaches, feature engineering and code embeddings, achieve state-of-the-art results, and compare their performance. Such solutions can be widely useful in potential applications like automatically grading coding assignments, IDE-integrated tools for static code analysis, and others.

* 14 pages, 2 figures, 8 tables 
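
As an illustration of the feature-engineering baseline style, the sketch below extracts two hand-crafted features, loop count and maximum loop-nesting depth, from source code via Python's ast module. CoRCoD itself contains programs from online judges (not necessarily Python), so this is only an assumed, simplified stand-in for the features such a baseline could use.

```python
# Illustrative hand-engineered features for complexity-class prediction:
# loop count and maximum loop-nesting depth, extracted from Python source.
import ast

def loop_features(source):
    tree = ast.parse(source)
    loops, max_depth = 0, 0

    def walk(node, depth):
        nonlocal loops, max_depth
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.For, ast.While)):
                loops += 1
                max_depth = max(max_depth, depth + 1)
                walk(child, depth + 1)
            else:
                walk(child, depth)

    walk(tree, 0)
    return {"num_loops": loops, "max_nesting": max_depth}

snippet = """
for i in range(n):
    for j in range(n):
        total += a[i][j]
"""
print(loop_features(snippet))  # {'num_loops': 2, 'max_nesting': 2}
```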