Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Topic:Topic

Notes on Kernel Methods in Machine Learning

Nov 18, 2025

Diego Armando Pérez-Rosero, Danna Valentina Salazar-Dubois, Juan Camilo Lugo-Rojas, Andrés Marino Álvarez-Meza, Germán Castellanos-Dominguez

Abstract:These notes provide a self-contained introduction to kernel methods and their geometric foundations in machine learning. Starting from the construction of Hilbert spaces, we develop the theory of positive definite kernels, reproducing kernel Hilbert spaces (RKHS), and Hilbert-Schmidt operators, emphasizing their role in statistical estimation and representation of probability measures. Classical concepts such as covariance, regression, and information measures are revisited through the lens of Hilbert space geometry. We also introduce kernel density estimation, kernel embeddings of distributions, and the Maximum Mean Discrepancy (MMD). The exposition is designed to serve as a foundation for more advanced topics, including Gaussian processes, kernel Bayesian inference, and functional analytic approaches to modern machine learning.

Via

Access Paper or Ask Questions

BEDLAM2.0: Synthetic Humans and Cameras in Motion

Nov 18, 2025

Joachim Tesch, Giorgio Becherini, Prerana Achar, Anastasios Yiannakidis, Muhammed Kocabas, Priyanka Patel, Michael J. Black

Abstract:Inferring 3D human motion from video remains a challenging problem with many applications. While traditional methods estimate the human in image coordinates, many applications require human motion to be estimated in world coordinates. This is particularly challenging when there is both human and camera motion. Progress on this topic has been limited by the lack of rich video data with ground truth human and camera movement. We address this with BEDLAM2.0, a new dataset that goes beyond the popular BEDLAM dataset in important ways. In addition to introducing more diverse and realistic cameras and camera motions, BEDLAM2.0 increases diversity and realism of body shape, motions, clothing, hair, and 3D environments. Additionally, it adds shoes, which were missing in BEDLAM. BEDLAM has become a key resource for training 3D human pose and motion regressors today and we show that BEDLAM2.0 is significantly better, particularly for training methods that estimate humans in world coordinates. We compare state-of-the art methods trained on BEDLAM and BEDLAM2.0, and find that BEDLAM2.0 significantly improves accuracy over BEDLAM. For research purposes, we provide the rendered videos, ground truth body parameters, and camera motions. We also provide the 3D assets to which we have rights and links to those from third parties.

* NeurIPS 2025 (Datasets and Benchmarks track, oral). Project website: https://bedlam2.is.tue.mpg.de

Via

Access Paper or Ask Questions

Dual-Variable Force Characterisation method for Human-Robot Interaction in Wearable Robotics

Nov 18, 2025

Felipe Ballen-Moreno, Pasquale Ferrentino, Milan Amighi, Bram Vanderborght, Tom Verstraten

Figure 1 for Dual-Variable Force Characterisation method for Human-Robot Interaction in Wearable Robotics

Figure 2 for Dual-Variable Force Characterisation method for Human-Robot Interaction in Wearable Robotics

Figure 3 for Dual-Variable Force Characterisation method for Human-Robot Interaction in Wearable Robotics

Figure 4 for Dual-Variable Force Characterisation method for Human-Robot Interaction in Wearable Robotics

Abstract:Understanding the physical interaction with wearable robots is essential to ensure safety and comfort. However, this interaction is complex in two key aspects: (1) the motion involved, and (2) the non-linear behaviour of soft tissues. Multiple approaches have been undertaken to better understand this interaction and to improve the quantitative metrics of physical interfaces or cuffs. As these two topics are closely interrelated, finite modelling and soft tissue characterisation offer valuable insights into pressure distribution and shear stress induced by the cuff. Nevertheless, current characterisation methods typically rely on a single fitting variable along one degree of freedom, which limits their applicability, given that interactions with wearable robots often involve multiple degrees of freedom. To address this limitation, this work introduces a dual-variable characterisation method, involving normal and tangential forces, aimed at identifying reliable material parameters and evaluating the impact of single-variable fitting on force and torque responses. This method demonstrates the importance of incorporating two variables into the characterisation process by analysing the normalized mean square error (NMSE) across different scenarios and material models, providing a foundation for simulation at the closest possible level, with a focus on the cuff and the human limb involved in the physical interaction between the user and the wearable robot.

* 36 pages, 10 figures, submitted and under-review in Journal of the Mechanical Behavior of Biomedical Materials

Via

Access Paper or Ask Questions

NeuCLIRBench: A Modern Evaluation Collection for Monolingual, Cross-Language, and Multilingual Information Retrieval

Nov 18, 2025

Dawn Lawrie, James Mayfield, Eugene Yang, Andrew Yates, Sean MacAvaney, Ronak Pradeep, Scott Miller, Paul McNamee, Luca Soldani

Figure 1 for NeuCLIRBench: A Modern Evaluation Collection for Monolingual, Cross-Language, and Multilingual Information Retrieval

Figure 2 for NeuCLIRBench: A Modern Evaluation Collection for Monolingual, Cross-Language, and Multilingual Information Retrieval

Figure 3 for NeuCLIRBench: A Modern Evaluation Collection for Monolingual, Cross-Language, and Multilingual Information Retrieval

Figure 4 for NeuCLIRBench: A Modern Evaluation Collection for Monolingual, Cross-Language, and Multilingual Information Retrieval

Abstract:To measure advances in retrieval, test collections with relevance judgments that can faithfully distinguish systems are required. This paper presents NeuCLIRBench, an evaluation collection for cross-language and multilingual retrieval. The collection consists of documents written natively in Chinese, Persian, and Russian, as well as those same documents machine translated into English. The collection supports several retrieval scenarios including: monolingual retrieval in English, Chinese, Persian, or Russian; cross-language retrieval with English as the query language and one of the other three languages as the document language; and multilingual retrieval, again with English as the query language and relevant documents in all three languages. NeuCLIRBench combines the TREC NeuCLIR track topics of 2022, 2023, and 2024. The 250,128 judgments across approximately 150 queries for the monolingual and cross-language tasks and 100 queries for multilingual retrieval provide strong statistical discriminatory power to distinguish retrieval approaches. A fusion baseline of strong neural retrieval systems is included with the collection so that developers of reranking algorithms are no longer reliant on BM25 as their first-stage retriever. NeuCLIRBench is publicly available.

* 14 pages, 1 figure

Via

Access Paper or Ask Questions

Harmful Traits of AI Companions

Nov 18, 2025

W. Bradley Knox, Katie Bradford, Samanta Varela Castro, Desmond C. Ong, Sean Williams, Jacob Romanow, Carly Nations, Peter Stone, Samuel Baker

Figure 1 for Harmful Traits of AI Companions

Abstract:Amid the growing prevalence of human -- AI interaction, large language models and other AI-based entities increasingly provide forms of companionship to human users. Such AI companionship -- i.e., bonded relationships between humans and AI systems that resemble the relationships people have with family members, friends, and romantic partners -- might substantially benefit humans. Yet such relationships can also do profound harm. We propose a framework for analyzing potential negative impacts of AI companionship by identifying specific harmful traits of AI companions and speculatively mapping causal pathways back from these traits to possible causes and forward to potential harmful effects. We provide detailed, structured analysis of four potentially harmful traits -- the absence of natural endpoints for relationships, vulnerability to product sunsetting, high attachment anxiety, and propensity to engender protectiveness -- and briefly discuss fourteen others. For each trait, we propose hypotheses connecting causes -- such as misaligned optimization objectives and the digital nature of AI companions -- to fundamental harms -- including reduced autonomy, diminished quality of human relationships, and deception. Each hypothesized causal connection identifies a target for potential empirical evaluation. Our analysis examines harms at three levels: to human partners directly, to their relationships with other humans, and to society broadly. We examine how existing law struggles to address these emerging harms, discuss potential benefits of AI companions, and conclude with design recommendations for mitigating risks. This analysis offers immediate suggestions for reducing risks while laying a foundation for deeper investigation of this critical but understudied topic.

Via

Access Paper or Ask Questions

Efficient Reinforcement Learning for Zero-Shot Coordination in Evolving Games

Nov 18, 2025

Bingyu Hui, Lebin Yu, Quanming Yao, Yunpeng Qu, Xudong Zhang, Jian Wang

Abstract:Zero-shot coordination(ZSC), a key challenge in multi-agent game theory, has become a hot topic in reinforcement learning (RL) research recently, especially in complex evolving games. It focuses on the generalization ability of agents, requiring them to coordinate well with collaborators from a diverse, potentially evolving, pool of partners that are not seen before without any fine-tuning. Population-based training, which approximates such an evolving partner pool, has been proven to provide good zero-shot coordination performance; nevertheless, existing methods are limited by computational resources, mainly focusing on optimizing diversity in small populations while neglecting the potential performance gains from scaling population size. To address this issue, this paper proposes the Scalable Population Training (ScaPT), an efficient RL training framework comprising two key components: a meta-agent that efficiently realizes a population by selectively sharing parameters across agents, and a mutual information regularizer that guarantees population diversity. To empirically validate the effectiveness of ScaPT, this paper evaluates it along with representational frameworks in Hanabi cooperative game and confirms its superiority.

Via

Access Paper or Ask Questions

Computational Measurement of Political Positions: A Review of Text-Based Ideal Point Estimation Algorithms

Nov 17, 2025

Patrick Parschan, Charlott Jakob

Abstract:This article presents the first systematic review of unsupervised and semi-supervised computational text-based ideal point estimation (CT-IPE) algorithms, methods designed to infer latent political positions from textual data. These algorithms are widely used in political science, communication, computational social science, and computer science to estimate ideological preferences from parliamentary speeches, party manifestos, and social media. Over the past two decades, their development has closely followed broader NLP trends -- beginning with word-frequency models and most recently turning to large language models (LLMs). While this trajectory has greatly expanded the methodological toolkit, it has also produced a fragmented field that lacks systematic comparison and clear guidance for applied use. To address this gap, we identified 25 CT-IPE algorithms through a systematic literature review and conducted a manual content analysis of their modeling assumptions and development contexts. To compare them meaningfully, we introduce a conceptual framework that distinguishes how algorithms generate, capture, and aggregate textual variance. On this basis, we identify four methodological families -- word-frequency, topic modeling, word embedding, and LLM-based approaches -- and critically assess their assumptions, interpretability, scalability, and limitations. Our review offers three contributions. First, it provides a structured synthesis of two decades of algorithm development, clarifying how diverse methods relate to one another. Second, it translates these insights into practical guidance for applied researchers, highlighting trade-offs in transparency, technical requirements, and validation strategies that shape algorithm choice. Third, it emphasizes that differences in estimation outcomes across algorithms are themselves informative, underscoring the need for systematic benchmarking.

* 46 pages, 8 figures, 2 tables, accepted for publication in Quality & Quantity

Via

Access Paper or Ask Questions

MTP: Exploring Multimodal Urban Traffic Profiling with Modality Augmentation and Spectrum Fusion

Nov 17, 2025

Haolong Xiang, Peisi Wang, Xiaolong Xu, Kun Yi, Xuyun Zhang, Quanzheng Sheng, Amin Beheshti, Wei Fan

Abstract:With rapid urbanization in the modern era, traffic signals from various sensors have been playing a significant role in monitoring the states of cities, which provides a strong foundation in ensuring safe travel, reducing traffic congestion and optimizing urban mobility. Most existing methods for traffic signal modeling often rely on the original data modality, i.e., numerical direct readings from the sensors in cities. However, this unimodal approach overlooks the semantic information existing in multimodal heterogeneous urban data in different perspectives, which hinders a comprehensive understanding of traffic signals and limits the accurate prediction of complex traffic dynamics. To address this problem, we propose a novel Multimodal framework, MTP, for urban Traffic Profiling, which learns multimodal features through numeric, visual, and textual perspectives. The three branches drive for a multimodal perspective of urban traffic signal learning in the frequency domain, while the frequency learning strategies delicately refine the information for extraction. Specifically, we first conduct the visual augmentation for the traffic signals, which transforms the original modality into frequency images and periodicity images for visual learning. Also, we augment descriptive texts for the traffic signals based on the specific topic, background information and item description for textual learning. To complement the numeric information, we utilize frequency multilayer perceptrons for learning on the original modality. We design a hierarchical contrastive learning on the three branches to fuse the spectrum of three modalities. Finally, extensive experiments on six real-world datasets demonstrate superior performance compared with the state-of-the-art approaches.

Via

Access Paper or Ask Questions

Passive Dementia Screening via Facial Temporal Micro-Dynamics Analysis of In-the-Wild Talking-Head Video

Nov 17, 2025

Filippo Cenacchi. Longbing Cao, Mitchell McEwan, Deborah Richards

Figure 1 for Passive Dementia Screening via Facial Temporal Micro-Dynamics Analysis of In-the-Wild Talking-Head Video

Figure 2 for Passive Dementia Screening via Facial Temporal Micro-Dynamics Analysis of In-the-Wild Talking-Head Video

Figure 3 for Passive Dementia Screening via Facial Temporal Micro-Dynamics Analysis of In-the-Wild Talking-Head Video

Figure 4 for Passive Dementia Screening via Facial Temporal Micro-Dynamics Analysis of In-the-Wild Talking-Head Video

Abstract:We target passive dementia screening from short camera-facing talking head video, developing a facial temporal micro dynamics analysis for language free detection of early neuro cognitive change. This enables unscripted, in the wild video analysis at scale to capture natural facial behaviors, transferrable across devices, topics, and cultures without active intervention by clinicians or researchers during recording. Most existing resources prioritize speech or scripted interviews, limiting use outside clinics and coupling predictions to language and transcription. In contrast, we identify and analyze whether temporal facial kinematics, including blink dynamics, small mouth jaw motions, gaze variability, and subtle head adjustments, are sufficient for dementia screening without speech or text. By stabilizing facial signals, we convert these micro movements into interpretable facial microdynamic time series, smooth them, and summarize short windows into compact clip level statistics for screening. Each window is encoded by its activity mix (the relative share of motion across streams), thus the predictor analyzes the distribution of motion across streams rather than its magnitude, making per channel effects transparent. We also introduce YT DemTalk, a new dataset curated from publicly available, in the wild camera facing videos. It contains 300 clips (150 with self reported dementia, 150 controls) to test our model and offer a first benchmarking of the corpus. On YT DemTalk, ablations identify gaze lability and mouth/jaw dynamics as the most informative cues, and light weighted shallow classifiers could attain a dementia prediction performance of (AUROC) 0.953, 0.961 Average Precision (AP), 0.851 F1-score, and 0.857 accuracy.

Via

Access Paper or Ask Questions

AA-Omniscience: Evaluating Cross-Domain Knowledge Reliability in Large Language Models

Nov 17, 2025

Declan Jackson, William Keating, George Cameron, Micah Hill-Smith

Figure 1 for AA-Omniscience: Evaluating Cross-Domain Knowledge Reliability in Large Language Models

Figure 2 for AA-Omniscience: Evaluating Cross-Domain Knowledge Reliability in Large Language Models

Figure 3 for AA-Omniscience: Evaluating Cross-Domain Knowledge Reliability in Large Language Models

Figure 4 for AA-Omniscience: Evaluating Cross-Domain Knowledge Reliability in Large Language Models

Abstract:Existing language model evaluations primarily measure general capabilities, yet reliable use of these models across a range of domains demands factual accuracy and recognition of knowledge gaps. We introduce AA-Omniscience, a benchmark designed to measure both factual recall and knowledge calibration across 6,000 questions. Questions are derived from authoritative academic and industry sources, and cover 42 economically relevant topics within six different domains. The evaluation measures a model's Omniscience Index, a bounded metric (-100 to 100) measuring factual recall that jointly penalizes hallucinations and rewards abstention when uncertain, with 0 equating to a model that answers questions correctly as much as it does incorrectly. Among evaluated models, Claude 4.1 Opus attains the highest score (4.8), making it one of only three models to score above zero. These results reveal persistent factuality and calibration weaknesses across frontier models. Performance also varies by domain, with the models from three different research labs leading across the six domains. This performance variability suggests models should be chosen according to the demands of the use case rather than general performance for tasks where knowledge is important.

Via

Access Paper or Ask Questions

Topic:Topic

Papers and Code