Haibin Wang

Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

Sep 08, 2023
Dawei Gao, Haibin Wang, Yaliang Li, Xiuyu Sun, Yichen Qian, Bolin Ding, Jingren Zhou

Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task. However, the absence of a systematic benchmark hinders the design of effective, efficient, and economical LLM-based Text-to-SQL solutions. To address this challenge, we first conduct a systematic and extensive comparison of existing prompt engineering methods, including question representation, example selection, and example organization, and use the experimental results to elaborate on their pros and cons. Based on these findings, we propose a new integrated solution, named DAIL-SQL, which sets a new record on the Spider leaderboard with 86.6% execution accuracy. To explore the potential of open-source LLMs, we investigate them in various scenarios and further enhance their performance with supervised fine-tuning. Our explorations highlight the potential of open-source LLMs in Text-to-SQL, as well as the advantages and disadvantages of supervised fine-tuning. Additionally, towards an efficient and economical LLM-based Text-to-SQL solution, we emphasize token efficiency in prompt engineering and compare prior studies under this metric. We hope that our work provides a deeper understanding of Text-to-SQL with LLMs and inspires further investigation and broad application.

* We have released code on https://github.com/BeachWang/DAIL-SQL 
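
The abstract reports results as execution accuracy. As a rough illustration of what that metric measures, the sketch below runs a predicted query and a gold query against the same SQLite database and counts a hit when both return the same multiset of rows. This is a simplified stand-in, not the official Spider evaluation script; the function names, the toy `singer` table, and the rule that a failing query counts as a miss are all assumptions made for this example.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # Assumption: a predicted query that fails to run counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare as multisets so that row order does not matter.
    return Counter(predicted_rows) == Counter(gold_rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return hits / len(pairs)


def build_toy_db():
    """Hypothetical in-memory database used only for this illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 41), (3, 'Cy', 25);
        """
    )
    return conn
```

Two differently written queries that return the same rows count as a match under this metric, which is why it is more forgiving than exact string comparison of the SQL text.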

HIE-SQL: History Information Enhanced Network for Context-Dependent Text-to-SQL Semantic Parsing

Apr 02, 2022
Yanzhao Zheng, Haibin Wang, Baohua Dong, Xingjun Wang, Changshan Li

Recently, context-dependent text-to-SQL semantic parsing, which translates natural language into SQL over the course of an interaction, has attracted a lot of attention. Previous works leverage context-dependence information either from the utterances in the interaction history or from previously predicted SQL queries, but fail to take advantage of both because of the mismatch between natural language and logic-form SQL. In this work, we propose a History Information Enhanced text-to-SQL model (HIE-SQL) that exploits context-dependence information from both the history utterances and the last predicted SQL query. In view of the mismatch, we treat natural language and SQL as two modalities and propose a bimodal pre-trained model to bridge the gap between them. In addition, we design a schema-linking graph to enhance the connections from the utterances and the SQL query to the database schema. We show that our history information enhanced methods improve the performance of HIE-SQL by a significant margin, achieving new state-of-the-art results on two context-dependent text-to-SQL benchmarks, the SParC and CoSQL datasets, at the time of writing.

* Accepted at ACL 2022 Findings 

Sequential Mechanisms for Multi-type Resource Allocation

Feb 21, 2021
Sujoy Sikdar, Xiaoxi Guo, Haibin Wang, Lirong Xia, Yongzhi Cao

Several resource allocation problems involve multiple types of resources, with a different agency being responsible for "locally" allocating the resources of each type, while a central planner wishes to provide a guarantee on the properties of the final allocation given agents' preferences. We study the relationship between the properties of the local mechanisms, each responsible for assigning all of the resources of a designated type, and the properties of a sequential mechanism composed of these local mechanisms, one for each type, applied sequentially. We do so under lexicographic preferences, a well-studied model of preferences over multiple types of resources in artificial intelligence and economics. We show that when preferences are O-legal, meaning that agents share a common importance order on the types, sequential mechanisms satisfy the desirable properties of anonymity, neutrality, non-bossiness, or Pareto-optimality if and only if every local mechanism also satisfies the same property and the local mechanisms are applied sequentially according to the order O. Our main results are that under O-legal lexicographic preferences, every mechanism satisfying strategyproofness and a combination of these properties must be a sequential composition of local mechanisms that are also strategyproof and satisfy the same combination of properties.
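
To make the idea of a sequential composition of local mechanisms concrete, the sketch below composes one local mechanism per type, applied in a fixed importance order. It assumes serial dictatorship as the local mechanism and assumes that each agent's ranking of the items of a type does not depend on what was allocated for earlier types; both are simplifications chosen for illustration, and the model in the abstract may be more general. All names are hypothetical.

```python
def serial_dictatorship(agent_order, rankings):
    """Local mechanism for one type: agents pick their best remaining item in turn.

    rankings[agent] lists that type's items from most to least preferred.
    """
    taken = set()
    assignment = {}
    for agent in agent_order:
        choice = next(item for item in rankings[agent] if item not in taken)
        assignment[agent] = choice
        taken.add(choice)
    return assignment


def sequential_mechanism(type_order, agent_order, rankings_by_type):
    """Apply one local mechanism per type, following the importance order."""
    bundles = {agent: {} for agent in agent_order}
    for resource_type in type_order:
        local = serial_dictatorship(agent_order, rankings_by_type[resource_type])
        for agent, item in local.items():
            bundles[agent][resource_type] = item
    return bundles
```

Each agent ends up with a bundle containing one item of every type, built up type by type in the shared importance order.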

Probabilistic Serial Mechanism for Multi-Type Resource Allocation

Apr 25, 2020
Xiaoxi Guo, Sujoy Sikdar, Haibin Wang, Lirong Xia, Yongzhi Cao, Hanpin Wang

In multi-type resource allocation (MTRA) problems, there are $p \ge 2$ types of items and $n$ agents, each of whom demands one unit of items of each type and has strict linear preferences over bundles consisting of one item of each type. For MTRAs with indivisible items, our first result is an impossibility theorem that is in direct contrast to the single-type ($p = 1$) setting: no mechanism whose output is always decomposable into a probability distribution over discrete assignments (where no item is split between agents) can satisfy both sd-efficiency and sd-envy-freeness. To circumvent this impossibility result, we consider the natural assumption of lexicographic preferences and provide an extension of the probabilistic serial (PS) mechanism, called lexicographic probabilistic serial (LexiPS). We prove that LexiPS satisfies sd-efficiency and sd-envy-freeness, retaining the desirable properties of PS. Moreover, LexiPS satisfies sd-weak-strategyproofness when agents are not allowed to misreport their importance orders. For MTRAs with divisible items, we show that the existing multi-type probabilistic serial (MPS) mechanism satisfies the stronger efficiency notion of lexi-efficiency, and is sd-envy-free under strict linear preferences and sd-weak-strategyproof under lexicographic preferences. We also prove that MPS can be characterized both by leximin-optimality and by item-wise ordinal fairness, and that the family of eating algorithms to which MPS belongs can be characterized by the no-generalized-cycle condition.
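
For readers unfamiliar with the PS mechanism that LexiPS and MPS extend, the sketch below implements the classic single-type simultaneous-eating procedure: all agents "eat" their most preferred remaining item at unit speed until every agent has consumed one unit in total. It is not LexiPS or MPS; it covers only the single-type case with unit supply per item, and the function name and data layout are assumptions made for this example.

```python
from fractions import Fraction


def probabilistic_serial(preferences):
    """Single-type PS via simultaneous eating.

    preferences[a] is agent a's strict ranking of the items, best first.
    Each item has unit supply. Returns shares[a][item] as exact fractions.
    """
    items = set(preferences[0])
    remaining = {item: Fraction(1) for item in items}
    shares = [{item: Fraction(0) for item in items} for _ in preferences]
    time_left = Fraction(1)  # every agent eats for one unit of time in total

    while time_left > 0:
        # Each agent targets its most preferred item that still has supply.
        targets = [
            next((item for item in ranking if remaining[item] > 0), None)
            for ranking in preferences
        ]
        eaters = {}
        for target in targets:
            if target is not None:
                eaters[target] = eaters.get(target, 0) + 1
        if not eaters:
            break  # nothing left to eat

        # Advance until some targeted item runs out or time is up.
        step = min(time_left, min(remaining[i] / k for i, k in eaters.items()))
        for agent, target in enumerate(targets):
            if target is not None:
                shares[agent][target] += step
        for item, count in eaters.items():
            remaining[item] -= step * count
        time_left -= step

    return shares
```

For two agents who both rank item `a` above item `b`, each agent ends up with half of `a` and half of `b`, which can be read as the probability of receiving each item.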

Multi-type Resource Allocation with Partial Preferences

Jun 13, 2019
Haibin Wang, Sujoy Sikdar, Xiaoxi Guo, Lirong Xia, Yongzhi Cao, Hanpin Wang

We propose multi-type probabilistic serial (MPS) and multi-type random priority (MRP) as extensions of the well-known PS and RP mechanisms to the multi-type resource allocation (MTRA) problem with partial preferences. In our setting, there are multiple types of divisible items and a group of agents who have partial order preferences over bundles consisting of one item of each type. We show that for the unrestricted domain of partial order preferences, no mechanism satisfies both sd-efficiency and sd-envy-freeness. Notwithstanding this impossibility result, our main message is positive: when agents' preferences are represented by acyclic CP-nets, MPS satisfies sd-efficiency, sd-envy-freeness, ordinal fairness, and upper invariance, while MRP satisfies ex-post-efficiency, sd-strategy-proofness, and upper invariance, recovering the properties of PS and RP.

A Neutrosophic Description Logic

Mar 14, 2008
Haibin Wang, Andre Rogatko, Florentin Smarandache, Rajshekhar Sunderraman

Description Logics (DLs) are appropriate, widely used logics for managing structured knowledge. They allow reasoning about individuals and concepts, i.e., sets of individuals with common properties. Typically, DLs are limited to dealing with crisp, well-defined concepts, that is, concepts for which the question of whether an individual is an instance is a yes/no question. More often than not, the concepts encountered in the real world do not have precisely defined criteria of membership: we may say that an individual is an instance of a concept only to a certain degree, depending on the individual's properties. The DLs that deal with such fuzzy concepts are called fuzzy DLs. In order to deal with fuzzy, incomplete, indeterminate, and inconsistent concepts, we need to extend fuzzy DLs by combining neutrosophic logic with a classical DL. In particular, concepts become neutrosophic (here neutrosophic means fuzzy, incomplete, indeterminate, and inconsistent), so that reasoning about neutrosophic concepts is supported. We define its syntax and semantics and describe its properties.

* Proceedings of 2006 IEEE International Conference on Granular Computing, edited by Yan-Qing Zhang and Tsau Young Lin, Georgia State University, Atlanta, pp. 305-308, 2006  
* 18 pages. Presented at the IEEE International Conference on Granular Computing, Georgia State University, Atlanta, USA, May 2006 