Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Topic:Information Extraction

What is Information Extraction? Information extraction is the process of automatically extracting structured information from unstructured text data.

Exploring the Early Universe with Deep Learning

Sep 26, 2025

Emmanuel de Salis, Massimo De Santis, Davide Piras, Sambit K. Giri, Michele Bianco, Nicolas Cerardi, Philipp Denzel, Merve Selcuk-Simsek, Kelley M. Hess, M. Carmen Toribio(+2 more)

Abstract:Hydrogen is the most abundant element in our Universe. The first generation of stars and galaxies produced photons that ionized hydrogen gas, driving a cosmological event known as the Epoch of Reionization (EoR). The upcoming Square Kilometre Array Observatory (SKAO) will map the distribution of neutral hydrogen during this era, aiding in the study of the properties of these first-generation objects. Extracting astrophysical information will be challenging, as SKAO will produce a tremendous amount of data where the hydrogen signal will be contaminated with undesired foreground contamination and instrumental systematics. To address this, we develop the latest deep learning techniques to extract information from the 2D power spectra of the hydrogen signal expected from SKAO. We apply a series of neural network models to these measurements and quantify their ability to predict the history of cosmic hydrogen reionization, which is connected to the increasing number and efficiency of early photon sources. We show that the study of the early Universe benefits from modern deep learning technology. In particular, we demonstrate that dedicated machine learning algorithms can achieve more than a $0.95$ $R^2$ score on average in recovering the reionization history. This enables accurate and precise cosmological and astrophysical inference of structure formation in the early Universe.

* Valente de Oliveira, J., Leite, J., Rodrigues, J., Dias, J., Cardoso, P. (eds) Progress in Artificial Intelligence. EPIA 2025. Lecture Notes in Computer Science(), vol 16121. Springer, Cham
* EPIA 2025 preprint version, 12 pages, 3 figures

Via

Access Paper or Ask Questions

A Tale of LLMs and Induced Small Proxies: Scalable Agents for Knowledge Mining

Oct 01, 2025

Sipeng Zhang, Longfei Yun, Zilong Wang, Jingbo Shang, Letian Peng

Abstract:At the core of Deep Research is knowledge mining, the task of extracting structured information from massive unstructured text in response to user instructions. Large language models (LLMs) excel at interpreting such instructions but are prohibitively expensive to deploy at scale, while traditional pipelines of classifiers and extractors remain efficient yet brittle and unable to generalize to new tasks. We introduce Falconer, a collaborative framework that combines the agentic reasoning of LLMs with lightweight proxy models for scalable knowledge mining. In Falconer, LLMs act as planners, decomposing user instructions into executable pipelines, and as annotators, generating supervision to train small proxies. The framework unifies classification and extraction into two atomic operations, get label and get span, enabling a single instruction-following model to replace multiple task-specific components. To evaluate the consistency between proxy models incubated by Falconer and annotations provided by humans and large models, we construct new benchmarks covering both planning and end-to-end execution. Experiments show that Falconer closely matches state-of-the-art LLMs in instruction-following accuracy while reducing inference cost by up to 90% and accelerating large-scale knowledge mining by more than 20x, offering an efficient and scalable foundation for Deep Research.

Via

Access Paper or Ask Questions

SpecGuard: Spectral Projection-based Advanced Invisible Watermarking

Oct 08, 2025

Inzamamul Alam, Md Tanvir Islam, Khan Muhammad, Simon S. Woo

Figure 1 for SpecGuard: Spectral Projection-based Advanced Invisible Watermarking

Figure 2 for SpecGuard: Spectral Projection-based Advanced Invisible Watermarking

Figure 3 for SpecGuard: Spectral Projection-based Advanced Invisible Watermarking

Figure 4 for SpecGuard: Spectral Projection-based Advanced Invisible Watermarking

Abstract:Watermarking embeds imperceptible patterns into images for authenticity verification. However, existing methods often lack robustness against various transformations primarily including distortions, image regeneration, and adversarial perturbation, creating real-world challenges. In this work, we introduce SpecGuard, a novel watermarking approach for robust and invisible image watermarking. Unlike prior approaches, we embed the message inside hidden convolution layers by converting from the spatial domain to the frequency domain using spectral projection of a higher frequency band that is decomposed by wavelet projection. Spectral projection employs Fast Fourier Transform approximation to transform spatial data into the frequency domain efficiently. In the encoding phase, a strength factor enhances resilience against diverse attacks, including adversarial, geometric, and regeneration-based distortions, ensuring the preservation of copyrighted information. Meanwhile, the decoder leverages Parseval's theorem to effectively learn and extract the watermark pattern, enabling accurate retrieval under challenging transformations. We evaluate the proposed SpecGuard based on the embedded watermark's invisibility, capacity, and robustness. Comprehensive experiments demonstrate the proposed SpecGuard outperforms the state-of-the-art models. To ensure reproducibility, the full code is released on \href{https://github.com/inzamamulDU/SpecGuard_ICCV_2025}{\textcolor{blue}{\textbf{GitHub}}}.

* ICCV 2025 Accepted Paper

Via

Access Paper or Ask Questions

Sensing-Secure ISAC: Ambiguity Function Engineering for Impairing Unauthorized Sensing

Oct 02, 2025

Kawon Han, Kaitao Meng, Christos Masouros

Abstract:The deployment of integrated sensing and communication (ISAC) brings along unprecedented vulnerabilities to authorized sensing, necessitating the development of secure solutions. Sensing parameters are embedded within the target-reflected signal leaked to unauthorized passive radar sensing eavesdroppers (Eve), implying that they can silently extract sensory information without prior knowledge of the information data. To overcome this limitation, we propose a sensing-secure ISAC framework that ensures secure target detection and estimation for the legitimate system, while obfuscating unauthorized sensing without requiring any prior knowledge of Eve. By introducing artificial imperfections into the ambiguity function (AF) of ISAC signals, we introduce artificial targets into Eve's range profile which increase its range estimation ambiguity. In contrast, the legitimate sensing receiver (Alice) can suppress these AF artifacts using mismatched filtering, albeit at the expense of signal-to-noise ratio (SNR) loss. Employing an OFDM signal, a structured subcarrier power allocation scheme is designed to shape the secure autocorrelation function (ACF), inserting periodic peaks to mislead Eve's range estimation and degrade target detection performance. To quantify the sensing security, we introduce peak sidelobe level (PSL) and integrated sidelobe level (ISL) as key performance metrics. Then, we analyze the three-way trade-offs between communication, legitimate sensing, and sensing security, highlighting the impact of the proposed sensing-secure ISAC signaling on system performance. We formulate a convex optimization problem to maximize ISAC performance while guaranteeing a certain sensing security level. Numerical results validate the effectiveness of the proposed sensing-secure ISAC signaling, demonstrating its ability to degrade Eve's target estimation while preserving Alice's performance.

* 15 pages, 12 figures, accepted to IEEE Transactions on Wireless Communications

Via

Access Paper or Ask Questions

FT-MDT: Extracting Decision Trees from Medical Texts via a Novel Low-rank Adaptation Method

Oct 06, 2025

Yuheng Li, Jiechao Gao, Wei Han, Wenwen Ouyang, Wei Zhu, Hui Yi Leong

Abstract:Knowledge of the medical decision process, which can be modeled as medical decision trees (MDTs), is critical to building clinical decision support systems. However, current MDT construction methods rely heavily on time-consuming and laborious manual annotation. To address this challenge, we propose PI-LoRA (Path-Integrated LoRA), a novel low-rank adaptation method for automatically extracting MDTs from clinical guidelines and textbooks. We integrate gradient path information to capture synergistic effects between different modules, enabling more effective and reliable rank allocation. This framework ensures that the most critical modules receive appropriate rank allocations while less important ones are pruned, resulting in a more efficient and accurate model for extracting medical decision trees from clinical texts. Extensive experiments on medical guideline datasets demonstrate that our PI-LoRA method significantly outperforms existing parameter-efficient fine-tuning approaches for the Text2MDT task, achieving better accuracy with substantially reduced model complexity. The proposed method achieves state-of-the-art results while maintaining a lightweight architecture, making it particularly suitable for clinical decision support systems where computational resources may be limited.

* Accepted by EMNLP-2025 Industrial Track

Via

Access Paper or Ask Questions

Explicit vs. Implicit Biographies: Evaluating and Adapting LLM Information Extraction on Wikidata-Derived Texts

Sep 18, 2025

Alessandra Stramiglio, Andrea Schimmenti, Valentina Pasqual, Marieke van Erp, Francesco Sovrano, Fabio Vitali

Abstract:Text Implicitness has always been challenging in Natural Language Processing (NLP), with traditional methods relying on explicit statements to identify entities and their relationships. From the sentence "Zuhdi attends church every Sunday", the relationship between Zuhdi and Christianity is evident for a human reader, but it presents a challenge when it must be inferred automatically. Large language models (LLMs) have proven effective in NLP downstream tasks such as text comprehension and information extraction (IE). This study examines how textual implicitness affects IE tasks in pre-trained LLMs: LLaMA 2.3, DeepSeekV1, and Phi1.5. We generate two synthetic datasets of 10k implicit and explicit verbalization of biographic information to measure the impact on LLM performance and analyze whether fine-tuning implicit data improves their ability to generalize in implicit reasoning tasks. This research presents an experiment on the internal reasoning processes of LLMs in IE, particularly in dealing with implicit and explicit contexts. The results demonstrate that fine-tuning LLM models with LoRA (low-rank adaptation) improves their performance in extracting information from implicit texts, contributing to better model interpretability and reliability.

Via

Access Paper or Ask Questions

SemanticControl: A Training-Free Approach for Handling Loosely Aligned Visual Conditions in ControlNet

Sep 26, 2025

Woosung Joung, Daewon Chae, Jinkyu Kim

Abstract:ControlNet has enabled detailed spatial control in text-to-image diffusion models by incorporating additional visual conditions such as depth or edge maps. However, its effectiveness heavily depends on the availability of visual conditions that are precisely aligned with the generation goal specified by text prompt-a requirement that often fails in practice, especially for uncommon or imaginative scenes. For example, generating an image of a cat cooking in a specific pose may be infeasible due to the lack of suitable visual conditions. In contrast, structurally similar cues can often be found in more common settings-for instance, poses of humans cooking are widely available and can serve as rough visual guides. Unfortunately, existing ControlNet models struggle to use such loosely aligned visual conditions, often resulting in low text fidelity or visual artifacts. To address this limitation, we propose SemanticControl, a training-free method for effectively leveraging misaligned but semantically relevant visual conditions. Our approach adaptively suppresses the influence of the visual condition where it conflicts with the prompt, while strengthening guidance from the text. The key idea is to first run an auxiliary denoising process using a surrogate prompt aligned with the visual condition (e.g., "a human playing guitar" for a human pose condition) to extract informative attention masks, and then utilize these masks during the denoising of the actual target prompt (e.g., cat playing guitar). Experimental results demonstrate that our method improves performance under loosely aligned conditions across various conditions, including depth maps, edge maps, and human skeletons, outperforming existing baselines. Our code is available at https://mung3477.github.io/semantic-control.

* BMVC 2025

Via

Access Paper or Ask Questions

Benchmark on Monocular Metric Depth Estimation in Wildlife Setting

Oct 06, 2025

Niccolò Niccoli, Lorenzo Seidenari, Ilaria Greco, Francesco Rovero

Abstract:Camera traps are widely used for wildlife monitoring, but extracting accurate distance measurements from monocular images remains challenging due to the lack of depth information. While monocular depth estimation (MDE) methods have advanced significantly, their performance in natural wildlife environments has not been systematically evaluated. This work introduces the first benchmark for monocular metric depth estimation in wildlife monitoring conditions. We evaluate four state-of-the-art MDE methods (Depth Anything V2, ML Depth Pro, ZoeDepth, and Metric3D) alongside a geometric baseline on 93 camera trap images with ground truth distances obtained using calibrated ChARUCO patterns. Our results demonstrate that Depth Anything V2 achieves the best overall performance with a mean absolute error of 0.454m and correlation of 0.962, while methods like ZoeDepth show significant degradation in outdoor natural environments (MAE: 3.087m). We find that median-based depth extraction consistently outperforms mean-based approaches across all deep learning methods. Additionally, we analyze computational efficiency, with ZoeDepth being fastest (0.17s per image) but least accurate, while Depth Anything V2 provides an optimal balance of accuracy and speed (0.22s per image). This benchmark establishes performance baselines for wildlife applications and provides practical guidance for implementing depth estimation in conservation monitoring systems.

Via

Access Paper or Ask Questions

HARP-NeXt: High-Speed and Accurate Range-Point Fusion Network for 3D LiDAR Semantic Segmentation

Oct 08, 2025

Samir Abou Haidar, Alexandre Chariot, Mehdi Darouich, Cyril Joly, Jean-Emmanuel Deschaud

Figure 1 for HARP-NeXt: High-Speed and Accurate Range-Point Fusion Network for 3D LiDAR Semantic Segmentation

Figure 2 for HARP-NeXt: High-Speed and Accurate Range-Point Fusion Network for 3D LiDAR Semantic Segmentation

Figure 3 for HARP-NeXt: High-Speed and Accurate Range-Point Fusion Network for 3D LiDAR Semantic Segmentation

Figure 4 for HARP-NeXt: High-Speed and Accurate Range-Point Fusion Network for 3D LiDAR Semantic Segmentation

Abstract:LiDAR semantic segmentation is crucial for autonomous vehicles and mobile robots, requiring high accuracy and real-time processing, especially on resource-constrained embedded systems. Previous state-of-the-art methods often face a trade-off between accuracy and speed. Point-based and sparse convolution-based methods are accurate but slow due to the complexity of neighbor searching and 3D convolutions. Projection-based methods are faster but lose critical geometric information during the 2D projection. Additionally, many recent methods rely on test-time augmentation (TTA) to improve performance, which further slows the inference. Moreover, the pre-processing phase across all methods increases execution time and is demanding on embedded platforms. Therefore, we introduce HARP-NeXt, a high-speed and accurate LiDAR semantic segmentation network. We first propose a novel pre-processing methodology that significantly reduces computational overhead. Then, we design the Conv-SE-NeXt feature extraction block to efficiently capture representations without deep layer stacking per network stage. We also employ a multi-scale range-point fusion backbone that leverages information at multiple abstraction levels to preserve essential geometric details, thereby enhancing accuracy. Experiments on the nuScenes and SemanticKITTI benchmarks show that HARP-NeXt achieves a superior speed-accuracy trade-off compared to all state-of-the-art methods, and, without relying on ensemble models or TTA, is comparable to the top-ranked PTv3, while running 24$\times$ faster. The code is available at https://github.com/SamirAbouHaidar/HARP-NeXt

* Accepted at IROS 2025 (IEEE/RSJ International Conference on Intelligent Robots and Systems)

Via

Access Paper or Ask Questions

Comprehensiveness Metrics for Automatic Evaluation of Factual Recall in Text Generation

Oct 09, 2025

Adam Dejl, James Barry, Alessandra Pascale, Javier Carnerero Cano

Figure 1 for Comprehensiveness Metrics for Automatic Evaluation of Factual Recall in Text Generation

Figure 2 for Comprehensiveness Metrics for Automatic Evaluation of Factual Recall in Text Generation

Figure 3 for Comprehensiveness Metrics for Automatic Evaluation of Factual Recall in Text Generation

Figure 4 for Comprehensiveness Metrics for Automatic Evaluation of Factual Recall in Text Generation

Abstract:Despite demonstrating remarkable performance across a wide range of tasks, large language models (LLMs) have also been found to frequently produce outputs that are incomplete or selectively omit key information. In sensitive domains, such omissions can result in significant harm comparable to that posed by factual inaccuracies, including hallucinations. In this study, we address the challenge of evaluating the comprehensiveness of LLM-generated texts, focusing on the detection of missing information or underrepresented viewpoints. We investigate three automated evaluation strategies: (1) an NLI-based method that decomposes texts into atomic statements and uses natural language inference (NLI) to identify missing links, (2) a Q&A-based approach that extracts question-answer pairs and compares responses across sources, and (3) an end-to-end method that directly identifies missing content using LLMs. Our experiments demonstrate the surprising effectiveness of the simple end-to-end approach compared to more complex methods, though at the cost of reduced robustness, interpretability and result granularity. We further assess the comprehensiveness of responses from several popular open-weight LLMs when answering user queries based on multiple sources.

Via

Access Paper or Ask Questions

Topic:Information Extraction

Papers and Code