Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Wei-Ning Chiu

Doc2Query++: Topic-Coverage based Document Expansion and its Application to Dense Retrieval via Dual-Index Fusion

Oct 10, 2025

Tzu-Lin Kuo, Wei-Ning Chiu, Wei-Yun Ma, Pu-Jen Cheng

Figure 1 for Doc2Query++: Topic-Coverage based Document Expansion and its Application to Dense Retrieval via Dual-Index Fusion

Figure 2 for Doc2Query++: Topic-Coverage based Document Expansion and its Application to Dense Retrieval via Dual-Index Fusion

Figure 3 for Doc2Query++: Topic-Coverage based Document Expansion and its Application to Dense Retrieval via Dual-Index Fusion

Figure 4 for Doc2Query++: Topic-Coverage based Document Expansion and its Application to Dense Retrieval via Dual-Index Fusion

Abstract:Document expansion (DE) via query generation tackles vocabulary mismatch in sparse retrieval, yet faces limitations: uncontrolled generation producing hallucinated or redundant queries with low diversity; poor generalization from in-domain training (e.g., MS MARCO) to out-of-domain data like BEIR; and noise from concatenation harming dense retrieval. While Large Language Models (LLMs) enable cross-domain query generation, basic prompting lacks control, and taxonomy-based methods rely on domain-specific structures, limiting applicability. To address these challenges, we introduce Doc2Query++, a DE framework that structures query generation by first inferring a document's latent topics via unsupervised topic modeling for cross-domain applicability, then using hybrid keyword selection to create a diverse and relevant keyword set per document. This guides LLM not only to leverage keywords, which ensure comprehensive topic representation, but also to reduce redundancy through diverse, relevant terms. To prevent noise from query appending in dense retrieval, we propose Dual-Index Fusion strategy that isolates text and query signals, boosting performance in dense settings. Extensive experiments show Doc2Query++ significantly outperforms state-of-the-art baselines, achieving substantial gains in MAP, nDCG@10 and Recall@100 across diverse datasets on both sparse and dense retrieval.

* 11 pages, 4 figures

Via

Access Paper or Ask Questions

Uncovering Overconfident Failures in CXR Models via Augmentation-Sensitivity Risk Scoring

Oct 02, 2025

Han-Jay Shu, Wei-Ning Chiu, Shun-Ting Chang, Meng-Ping Huang, Takeshi Tohyama, Ahram Han, Po-Chih Kuo

Abstract:Deep learning models achieve strong performance in chest radiograph (CXR) interpretation, yet fairness and reliability concerns persist. Models often show uneven accuracy across patient subgroups, leading to hidden failures not reflected in aggregate metrics. Existing error detection approaches -- based on confidence calibration or out-of-distribution (OOD) detection -- struggle with subtle within-distribution errors, while image- and representation-level consistency-based methods remain underexplored in medical imaging. We propose an augmentation-sensitivity risk scoring (ASRS) framework to identify error-prone CXR cases. ASRS applies clinically plausible rotations ($\pm 15^\circ$/$\pm 30^\circ$) and measures embedding shifts with the RAD-DINO encoder. Sensitivity scores stratify samples into stability quartiles, where highly sensitive cases show substantially lower recall ($-0.2$ to $-0.3$) despite high AUROC and confidence. ASRS provides a label-free means for selective prediction and clinician review, improving fairness and safety in medical AI.

* 5 pages, 1 figures

Via

Access Paper or Ask Questions

Augment or Not? A Comparative Study of Pure and Augmented Large Language Model Recommenders

May 29, 2025

Wei-Hsiang Huang, Chen-Wei Ke, Wei-Ning Chiu, Yu-Xuan Su, Chun-Chun Yang, Chieh-Yuan Cheng, Yun-Nung Chen, Pu-Jen Cheng

Abstract:Large language models (LLMs) have introduced new paradigms for recommender systems by enabling richer semantic understanding and incorporating implicit world knowledge. In this study, we propose a systematic taxonomy that classifies existing approaches into two categories: (1) Pure LLM Recommenders, which rely solely on LLMs, and (2) Augmented LLM Recommenders, which integrate additional non-LLM techniques to enhance performance. This taxonomy provides a novel lens through which to examine the evolving landscape of LLM-based recommendation. To support fair comparison, we introduce a unified evaluation platform that benchmarks representative models under consistent experimental settings, highlighting key design choices that impact effectiveness. We conclude by discussing open challenges and outlining promising directions for future research. This work offers both a comprehensive overview and practical guidance for advancing next-generation LLM-powered recommender.

Via

Access Paper or Ask Questions

NoxTrader: LSTM-Based Stock Return Momentum Prediction

Oct 01, 2023

Hsiang-Hui Liu, Han-Jay Shu, Wei-Ning Chiu

Figure 1 for NoxTrader: LSTM-Based Stock Return Momentum Prediction

Figure 2 for NoxTrader: LSTM-Based Stock Return Momentum Prediction

Figure 3 for NoxTrader: LSTM-Based Stock Return Momentum Prediction

Figure 4 for NoxTrader: LSTM-Based Stock Return Momentum Prediction

Abstract:We introduce NoxTrader, which is designed for portfolio construction and trading execution, aims at generating profitable outcomes. The primary focus of NoxTrader is on stock market trading with an emphasis on cultivating moderate to long-term profits. The underlying learning process of NoxTrader hinges on the assimilation of insights gleaned from historical trading data, primarily hinging on time-series analysis due to the inherent nature of the employed dataset. We delineate the sequential progression encompassing data acquisition, feature engineering, predictive modeling, parameter configuration, establishment of a rigorous backtesting framework, and ultimately position NoxTrader as a testament to the prospective viability of algorithmic trading models within real-world trading scenarios.

Via

Access Paper or Ask Questions