Abstract: Large language models (LLMs) often hallucinate, and while adding citations is a common remedy, it is frequently insufficient for accountability because users struggle to verify how a cited source supports a generated claim. Existing methods are typically coarse-grained and fail to distinguish between direct quotes and complex reasoning. In this paper, we introduce Generation-time Fine-grained Provenance, a task in which models must generate fluent answers while simultaneously producing structured, sentence-level provenance triples. To enable this, we present ReFInE (Relation-aware Fine-grained Interpretability & Evidence), a dataset featuring expert-verified annotations that distinguish between Quotation, Compression, and Inference. Building on ReFInE, we propose GenProve, a framework that combines Supervised Fine-Tuning (SFT) with Group Relative Policy Optimization (GRPO). By optimizing a composite reward for answer fidelity and provenance correctness, GenProve significantly outperforms 14 strong LLMs in joint evaluation. Crucially, our analysis uncovers a reasoning gap: models excel at surface-level quotation but struggle markedly with inference-based provenance, suggesting that verifiable reasoning remains a frontier challenge distinct from surface-level citation.
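The abstract describes GenProve's composite reward only at a high level. Below is a minimal sketch, assuming a weighted combination of an answer-fidelity score and an F1-style match over predicted provenance triples; the triple format, scoring functions, and weights are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a composite reward of the kind GenProve could optimize
# with GRPO: weighted sum of answer fidelity and provenance correctness.
# Triple format, weights, and scoring are assumptions for illustration only.
from dataclasses import dataclass

@dataclass(frozen=True)
class ProvenanceTriple:
    answer_sentence: int  # index of the generated answer sentence
    source_span: str      # cited evidence span from the source document
    relation: str         # "Quotation" | "Compression" | "Inference"

def composite_reward(answer_fidelity: float,
                     pred: list[ProvenanceTriple],
                     gold: list[ProvenanceTriple],
                     w_answer: float = 0.5,
                     w_prov: float = 0.5) -> float:
    """Combine an answer-fidelity score with a triple-level provenance F1."""
    pred_set, gold_set = set(pred), set(gold)
    if not pred_set or not gold_set:
        prov_f1 = 0.0
    else:
        tp = len(pred_set & gold_set)
        precision, recall = tp / len(pred_set), tp / len(gold_set)
        prov_f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return w_answer * answer_fidelity + w_prov * prov_f1
```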

Abstract: Convolutional neural networks (CNNs) have long been the paradigm of choice for robust medical image processing (MIP). It is therefore crucial to deploy CNNs effectively and efficiently on devices with different computing capabilities to support computer-aided diagnosis. Many methods employ factorized convolutional layers to alleviate the burden of limited computational resources, at the expense of expressiveness. To this end, given the weak optimization of medical image-driven CNN models, we propose a Singular value equalization generalizer-induced Factorized Convolution (SFConv) to improve the expressive power of factorized convolutions in MIP models. We first decompose the weight matrix of convolutional filters into two low-rank matrices to achieve model reduction. We then minimize the KL divergence between the two low-rank weight matrices and the uniform distribution, thereby reducing the number of singular-value directions with significant variance. Extensive experiments on fundus and OCTA datasets demonstrate that our SFConv yields competitive expressiveness over vanilla convolutions while reducing complexity.
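As a reading aid, here is a minimal PyTorch sketch of the mechanism the abstract describes: a convolution factorized into two low-rank factors, plus a regularizer that pushes each factor's normalized singular-value spectrum toward the uniform distribution via KL divergence. The layer structure, rank, and loss weighting are assumptions for illustration, not the paper's SFConv code.

```python
# Minimal sketch (assumption, not the authors' SFConv implementation):
# factorized convolution + KL regularizer equalizing singular values.
import torch
import torch.nn as nn

class FactorizedConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int, rank: int):
        super().__init__()
        # W (out_ch x in_ch*k*k) is approximated by U (out_ch x rank) @ V (rank x in_ch*k*k),
        # realized as a kxk conv down to `rank` channels followed by a 1x1 conv.
        self.v = nn.Conv2d(in_ch, rank, kernel_size, padding=kernel_size // 2, bias=False)
        self.u = nn.Conv2d(rank, out_ch, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.u(self.v(x))

    def singular_value_kl(self) -> torch.Tensor:
        """KL(normalized singular values || uniform), summed over both factors."""
        kl = torch.tensor(0.0, device=self.u.weight.device)
        for w in (self.u.weight, self.v.weight):
            mat = w.flatten(1)                        # filter tensor as a 2-D matrix
            s = torch.linalg.svdvals(mat)
            p = s / s.sum()                           # spectrum as a distribution
            q = torch.full_like(p, 1.0 / p.numel())   # uniform target
            kl = kl + torch.sum(p * (torch.log(p + 1e-12) - torch.log(q)))
        return kl

# Hypothetical training usage: total_loss = task_loss + 0.01 * layer.singular_value_kl()
```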
Abstract: Fault diagnosis is essential in industrial processes for monitoring the condition of important machines. With the ever-increasing complexity of working conditions and the demand for safety during production and operation, different diagnosis methods are required, and, more importantly, an integrated fault diagnosis system that can cope with multiple tasks is highly desired. However, the diagnosis subtasks are often studied separately, and currently available methods still need improvement for such a generalized system. To address this issue, we propose the Generalized Out-of-distribution Fault Diagnosis (GOOFD) framework to integrate diagnosis subtasks such as fault detection, fault classification, and novel fault diagnosis. Additionally, a unified fault diagnosis method based on internal contrastive learning is put forward to underpin the proposed generalized framework. The method extracts features using internal contrastive learning and then recognizes outliers based on the Mahalanobis distance. Experiments on a simulated benchmark dataset as well as two practical process datasets show that the proposed method achieves better performance than several existing techniques, verifying the effectiveness of the framework.
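The outlier-recognition step described above can be illustrated with a short sketch: fit a Gaussian (mean and covariance) on features of normal data and flag test samples whose Mahalanobis distance exceeds a threshold. The feature dimensions, threshold rule, and data below are hypothetical; this is not the GOOFD implementation.

```python
# Minimal sketch (assumption, not the GOOFD codebase) of Mahalanobis-distance
# outlier recognition on extracted features.
import numpy as np

def fit_gaussian(features: np.ndarray):
    """Estimate mean and (regularized) inverse covariance from training features."""
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis(x: np.ndarray, mean: np.ndarray, inv_cov: np.ndarray) -> np.ndarray:
    """Mahalanobis distance of each row of x to the fitted Gaussian."""
    diff = x - mean
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))

# Hypothetical usage: samples whose distance exceeds a percentile of the
# training distances are treated as out-of-distribution / novel faults.
train_feats = np.random.randn(500, 16)   # features from an encoder (placeholder data)
test_feats = np.random.randn(10, 16)
mean, inv_cov = fit_gaussian(train_feats)
threshold = np.percentile(mahalanobis(train_feats, mean, inv_cov), 95)
is_outlier = mahalanobis(test_feats, mean, inv_cov) > threshold
```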