Madiha Kazi

Pose Matters: Evaluating Vision Transformers and CNNs for Human Action Recognition on Small COCO Subsets

Jun 13, 2025

MingZe Tang, Madiha Kazi

Abstract: This study explores human action recognition using a three-class subset of the COCO image corpus, benchmarking models from simple fully connected networks to transformer architectures. The binary Vision Transformer (ViT) achieved 90% mean test accuracy, significantly exceeding multiclass classifiers such as convolutional networks (approximately 35%) and CLIP-based models (approximately 62-64%). A one-way ANOVA (F = 61.37, p < 0.001) confirmed these differences are statistically significant. Qualitative analysis with SHAP explainer and LeGrad heatmaps indicated that the ViT localizes pose-specific regions (e.g., lower limbs for walking or running), while simpler feed-forward models often focus on background textures, explaining their errors. These findings emphasize the data efficiency of transformer representations and the importance of explainability techniques in diagnosing class-specific failures.

* 7 pages, 9 figures
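
The one-way ANOVA mentioned in the abstract compares variance between model groups with variance within each group. The sketch below shows how such an F statistic could be computed by hand; the function name `one_way_anova_f` and the per-run accuracies are hypothetical placeholders chosen only for illustration and are not the paper's data or code, so the result will not match the reported F = 61.37.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual runs around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up test accuracies per training run (NOT taken from the paper).
runs = {
    "vit_binary": [0.91, 0.89, 0.90],
    "clip": [0.62, 0.64, 0.63],
    "cnn": [0.35, 0.36, 0.34],
}

f_stat, df_b, df_w = one_way_anova_f(list(runs.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value means the gaps between group means are large relative to run-to-run noise. In practice a library routine such as `scipy.stats.f_oneway` would also return the p-value.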

DateLogicQA: Benchmarking Temporal Biases in Large Language Models

Dec 17, 2024

Gagan Bhatia, MingZe Tang, Cristina Mahanta, Madiha Kazi

Abstract: This paper introduces DateLogicQA, a benchmark with 190 questions covering diverse date formats, temporal contexts, and reasoning types. We propose the Semantic Integrity Metric to assess tokenization quality and analyse two biases: Representation-Level Bias, affecting embeddings, and Logical-Level Bias, influencing reasoning outputs. Our findings provide a comprehensive evaluation of LLMs' capabilities and limitations in temporal reasoning, highlighting key challenges in handling temporal data accurately. The GitHub repository for our work is available at https://github.com/gagan3012/EAIS-Temporal-Bias
