Andreas Weiler

Institute of Computer Science, Zurich University of Applied Sciences, Winterthur, Switzerland

Query Carefully: Detecting the Unanswerables in Text-to-SQL Tasks

Dec 19, 2025

Jasmin Saxer, Isabella Maria Aigner, Luise Linzmeier, Andreas Weiler, Kurt Stockinger

Abstract: Text-to-SQL systems allow non-SQL experts to interact with relational databases using natural language. However, their tendency to generate executable SQL for ambiguous, out-of-scope, or unanswerable queries introduces a hidden risk, as outputs may be misinterpreted as correct. This risk is especially serious in biomedical contexts, where precision is critical. We therefore present Query Carefully, a pipeline that integrates LLM-based SQL generation with explicit detection and handling of unanswerable inputs. Building on the OncoMX component of ScienceBenchmark, we construct OncoMX-NAQ (No-Answer Questions), a set of 80 no-answer questions spanning 8 categories (non-SQL, out-of-schema/domain, and multiple ambiguity types). Our approach employs llama3.3:70b with schema-aware prompts, explicit No-Answer Rules (NAR), and few-shot examples drawn from both answerable and unanswerable questions. We evaluate SQL exact match, result accuracy, and unanswerable-detection accuracy. On the OncoMX dev split, few-shot prompting with answerable examples increases result accuracy, and adding unanswerable examples does not degrade performance. On OncoMX-NAQ, balanced prompting achieves the highest unanswerable-detection accuracy (0.8), with near-perfect results for structurally defined categories (non-SQL, missing columns, out-of-domain) but persistent challenges for missing-value queries (0.5) and column ambiguity (0.3). A lightweight user interface surfaces interim SQL, execution results, and abstentions, supporting transparent and reliable text-to-SQL in biomedical applications.

* Accepted to the HC@AIxIA + HYDRA 2025
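
The abstract names result accuracy and unanswerable-detection accuracy as evaluation metrics but does not say how they are computed. The following is a minimal sketch of one plausible reading, not the authors' evaluation code: result accuracy compares the rows returned by the gold and the predicted SQL, and detection accuracy is the share of no-answer questions on which the model abstains, broken down by category. The `NO_ANSWER` marker, the function names, and the record layout are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical abstention marker; the paper's actual output format is not given.
ABSTAIN = "NO_ANSWER"


def run_query(conn, sql):
    """Execute sql and return its rows in a canonical order, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort by repr so that row order and mixed column types do not matter.
    return sorted(rows, key=repr)


def result_accuracy(conn, pairs):
    """Fraction of (gold_sql, predicted_sql) pairs that return the same rows."""
    hits = 0
    for gold_sql, predicted_sql in pairs:
        gold_rows = run_query(conn, gold_sql)
        predicted_rows = run_query(conn, predicted_sql)
        if gold_rows is not None and gold_rows == predicted_rows:
            hits += 1
    return hits / len(pairs)


def detection_accuracy_by_category(records):
    """Per-category abstention rate.

    records is a list of (category, model_output) tuples in which every
    question is assumed to be unanswerable, so abstaining is always correct.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, output in records:
        totals[category] += 1
        if output.strip() == ABSTAIN:
            correct[category] += 1
    return {category: correct[category] / totals[category] for category in totals}
```

Comparing sorted result rows treats two queries as equivalent whenever they return the same data, which is more lenient than SQL exact match; whether the paper's result accuracy also ignores row order is not stated in the abstract.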

Advanced Multi-Variate Analysis Methods for New Physics Searches at the Large Hadron Collider

May 16, 2021

Anna Stakia, Tommaso Dorigo, Giovanni Banelli, Daniela Bortoletto, Alessandro Casa, Pablo de Castro, Christophe Delaere, Julien Donini, Livio Finos, Michele Gallinaro(+15 more)

Abstract: Between 2015 and 2019, members of the Horizon 2020-funded Innovative Training Network "AMVA4NewPhysics" studied the customization and application of advanced multivariate analysis methods and statistical learning tools to high-energy physics problems, and also developed entirely new ones. Many of those methods were successfully used to improve the sensitivity of data analyses performed by the ATLAS and CMS experiments at the CERN Large Hadron Collider; several others, still in the testing phase, promise to further improve the precision of measurements of fundamental physics parameters and the reach of searches for new phenomena. In this paper, the most relevant new tools among those studied and developed are presented, along with an evaluation of their performance.

* 95 pages, 21 figures, submitted to Elsevier

Improving Sample Efficiency and Multi-Agent Communication in RL-based Train Rescheduling

Apr 28, 2020

Dano Roost, Ralph Meier, Stephan Huschauer, Erik Nygren, Adrian Egli, Andreas Weiler, Thilo Stadelmann

Abstract: We present preliminary results from our sixth-placed entry to the Flatland international competition for train rescheduling, including two improvements to the training efficiency of reinforcement learning (RL), and two hypotheses about the prospects of deep RL for complex real-world control tasks: first, that current state-of-the-art policy gradient methods seem inappropriate for high-consequence environments; second, that learning explicit communication actions (an emerging machine-to-machine language, so to speak) might offer a remedy. These hypotheses need to be confirmed by future work. If confirmed, they hold promise for optimizing highly efficient logistics ecosystems such as the Swiss Federal Railways network.

* Accepted for publication at the 7th Swiss Conference on Data Science (SDS 2020)
