Abstract:Accurately assessing candidate seniority from resumes is a critical yet challenging task, complicated by the prevalence of overstated experience and ambiguous self-presentation. In this study, we investigate the effectiveness of large language models (LLMs), including fine-tuned BERT architectures, for automating seniority classification in resumes. To rigorously evaluate model performance, we introduce a hybrid dataset comprising both real-world resumes and synthetically generated hard examples designed to simulate exaggerated qualifications and understated seniority. Using the dataset, we evaluate the performance of Large Language Models in detecting subtle linguistic cues associated with seniority inflation and implicit expertise. Our findings highlight promising directions for enhancing AI-driven candidate evaluation systems and mitigating bias introduced by self-promotional language. The dataset is available for the research community at https://bit.ly/4mcTovt
Abstract:Understanding how information is shared across documents, regardless of the format in which it is expressed, is critical for tasks such as information retrieval, summarization, and content alignment. In this work, we introduce a novel framework for modelling Semantic Coverage Relations (SCR), which classifies document pairs based on how their informational content aligns. We define three core relation types: equivalence, where both texts convey the same information using different textual forms or styles; inclusion, where one document fully contains the information of another and adds more; and semantic overlap, where each document presents partially overlapping content. To capture these relations, we adopt a question answering (QA)-based approach, using the answerability of shared questions across documents as an indicator of semantic coverage. We construct a synthetic dataset derived from the SQuAD corpus by paraphrasing source passages and selectively omitting information, enabling precise control over content overlap. This dataset allows us to benchmark generative language models and train transformer-based classifiers for SCR prediction. Our findings demonstrate that discriminative models significantly outperform generative approaches, with the RoBERTa-base model achieving the highest accuracy of 61.4% and the Random Forest-based model showing the best balance with a macro-F1 score of 52.9%. The results show that QA provides an effective lens for assessing semantic relations across stylistically diverse texts, offering insights into the capacity of current models to reason about information beyond surface similarity. The dataset and code developed in this study are publicly available to support reproducibility.
Abstract:Real-time image classification on resource-constrained platforms demands inference methods that balance accuracy with strict latency and power budgets. Early-exit strategies address this need by attaching auxiliary classifiers to intermediate layers of convolutional neural networks (CNNs), allowing "easy" samples to terminate inference early. However, conventional training of early exits introduces a covariance shift: downstream branches are trained on full datasets, while at inference they process only the harder, non-exited samples. This mismatch limits efficiency--accuracy trade-offs in practice. We introduce the Boosted Training Scheme for Early Exits (BTS-EE), a sequential training approach that aligns branch training with inference-time data distributions. Each branch is trained and calibrated before the next, ensuring robustness under selective inference conditions. To further support embedded deployment, we propose a lightweight branch architecture based on 1D convolutions and a Class Precision Margin (CPM) calibration method that enables per-class threshold tuning for reliable exit decisions. Experiments on the CINIC-10 dataset with a ResNet18 backbone demonstrate that BTS-EE consistently outperforms non-boosted training across 64 configurations, achieving up to 45 percent reduction in computation with only 2 percent accuracy degradation. These results expand the design space for deploying CNNs in real-time image processing systems, offering practical efficiency gains for applications such as industrial inspection, embedded vision, and UAV-based monitoring.
Abstract:In the era of conversational AI, generating accurate and contextually appropriate service responses remains a critical challenge. A central question remains: Is explicit intent recognition a prerequisite for generating high-quality service responses, or can models bypass this step and produce effective replies directly? This paper conducts a rigorous comparative study to address this fundamental design dilemma. Leveraging two publicly available service interaction datasets, we benchmark several state-of-the-art language models, including a fine-tuned T5 variant, across both paradigms: Intent-First Response Generation and Direct Response Generation. Evaluation metrics encompass both linguistic quality and task success rates, revealing surprising insights into the necessity or redundancy of explicit intent modelling. Our findings challenge conventional assumptions in conversational AI pipelines, offering actionable guidelines for designing more efficient and effective response generation systems.
Abstract:Automating the decision of whether a code change requires manual review is vital for maintaining software quality in modern development workflows. However, the emergence of new programming languages and frameworks creates a critical bottleneck: while large volumes of unlabelled code are readily available, there is an insufficient amount of labelled data to train supervised models for review classification. We address this challenge by leveraging Large Language Models (LLMs) to translate code changes from well-resourced languages into equivalent changes in underrepresented or emerging languages, generating synthetic training data where labelled examples are scarce. We assume that although LLMs have learned the syntax and semantics of new languages from available unlabelled code, they have yet to fully grasp which code changes are considered significant or review-worthy within these emerging ecosystems. To overcome this, we use LLMs to generate synthetic change examples and train supervised classifiers on them. We systematically compare the performance of these classifiers against models trained on real labelled data. Our experiments across multiple GitHub repositories and language pairs demonstrate that LLM-generated synthetic data can effectively bootstrap review recommendation systems, narrowing the performance gap even in low-resource settings. This approach provides a scalable pathway to extend automated code review capabilities to rapidly evolving technology stacks, even in the absence of annotated data.
Abstract:First responders widely adopt body-worn cameras to document incident scenes and support post-event analysis. However, reviewing lengthy video footage is impractical in time-critical situations. Effective situational awareness demands a concise visual summary that can be quickly interpreted. This work presents a computer vision pipeline that transforms body-camera footage into informative panoramic images summarizing the incident scene. Our method leverages monocular Simultaneous Localization and Mapping (SLAM) to estimate camera trajectories and reconstruct the spatial layout of the environment. Key viewpoints are identified by clustering camera poses along the trajectory, and representative frames from each cluster are selected. These frames are fused into spatially coherent panoramic images using multi-frame stitching techniques. The resulting summaries enable rapid understanding of complex environments and facilitate efficient decision-making and incident review.