Tabular data analysis is crucial in various fields, and large language models show promise in this area. However, current research mostly focuses on rudimentary tasks like Text2SQL and TableQA, neglecting advanced analysis like forecasting and chart generation. To address this gap, we developed the Text2Analysis benchmark, incorporating advanced analysis tasks that go beyond the SQL-compatible operations and require more in-depth analysis. We also develop five innovative and effective annotation methods, harnessing the capabilities of large language models to enhance data quality and quantity. Additionally, we include unclear queries that resemble real-world user questions to test how well models can understand and tackle such challenges. Finally, we collect 2249 query-result pairs with 347 tables. We evaluate five state-of-the-art models using three different metrics and the results show that our benchmark presents introduces considerable challenge in the field of tabular data analysis, paving the way for more advanced research opportunities.
Resource allocation is of crucial importance in wireless communications. However, it is extremely challenging to design efficient resource allocation schemes for future wireless communication networks since the formulated resource allocation problems are generally non-convex and consist of various coupled variables. Moreover, the dynamic changes of practical wireless communication environment and user service requirements thirst for efficient real-time resource allocation. To tackle these issues, a novel partially observable deep multi-agent active inference (PODMAI) framework is proposed for realizing intelligent resource allocation. A belief based learning method is exploited for updating the policy by minimizing the variational free energy. A decentralized training with a decentralized execution multi-agent strategy is designed to overcome the limitations of the partially observable state information. Exploited the proposed framework, an intelligent spectrum allocation and trajectory optimization scheme is developed for a spectrum sharing unmanned aerial vehicle (UAV) network with dynamic transmission rate requirements as an example. Simulation results demonstrate that our proposed framework can significantly improve the sum transmission rate of the secondary network compared to various benchmark schemes. Moreover, the convergence speed of the proposed PODMAI is significantly improved compared with the conventional reinforcement learning framework. Overall, our proposed framework can enrich the intelligent resource allocation frameworks and pave the way for realizing real-time resource allocation.
Due to inappropriate sample selection and limited training data, a distribution shift often exists between the training and test sets. This shift can adversely affect the test performance of Graph Neural Networks (GNNs). Existing approaches mitigate this issue by either enhancing the robustness of GNNs to distribution shift or reducing the shift itself. However, both approaches necessitate retraining the model, which becomes unfeasible when the model structure and parameters are inaccessible. To address this challenge, we propose FR-GNN, a general framework for GNNs to conduct feature reconstruction. FRGNN constructs a mapping relationship between the output and input of a well-trained GNN to obtain class representative embeddings and then uses these embeddings to reconstruct the features of labeled nodes. These reconstructed features are then incorporated into the message passing mechanism of GNNs to influence the predictions of unlabeled nodes at test time. Notably, the reconstructed node features can be directly utilized for testing the well-trained model, effectively reducing the distribution shift and leading to improved test performance. This remarkable achievement is attained without any modifications to the model structure or parameters. We provide theoretical guarantees for the effectiveness of our framework. Furthermore, we conduct comprehensive experiments on various public datasets. The experimental results demonstrate the superior performance of FRGNN in comparison to mainstream methods.
Unmanned aerial vehicle (UAV) communication is of crucial importance for diverse practical applications. However, it is susceptible to the severe spectrum scarcity problem and interference since it operates in the unlicensed spectrum band. In order to tackle those issues, a dynamic spectrum sharing network is considered with the anti-jamming technique. Moreover, an intelligent spectrum allocation and trajectory optimization scheme is proposed to adapt to diverse jamming models by exploiting our designed novel online-offline multi-agent actor-critic and deep deterministic policy-gradient framework. Simulation results demonstrate the high efficiency of our proposed framework. It is also shown that our proposed scheme achieves the largest transmission rate among all benchmark schemes.
Exploring data is crucial in data analysis, as it helps users understand and interpret the data more effectively. However, performing effective data exploration requires in-depth knowledge of the dataset and expertise in data analysis techniques. Not being familiar with either can create obstacles that make the process time-consuming and overwhelming for data analysts. To address this issue, we introduce InsightPilot, an LLM (Large Language Model)-based, automated data exploration system designed to simplify the data exploration process. InsightPilot automatically selects appropriate analysis intents, such as understanding, summarizing, and explaining. Then, these analysis intents are concretized by issuing corresponding intentional queries (IQueries) to create a meaningful and coherent exploration sequence. In brief, an IQuery is an abstraction and automation of data analysis operations, which mimics the approach of data analysts and simplifies the exploration process for users. By employing an LLM to iteratively collaborate with a state-of-the-art insight engine via IQueries, InsightPilot is effective in analyzing real-world datasets, enabling users to gain valuable insights through natural language inquiries. We demonstrate the effectiveness of InsightPilot in a case study, showing how it can help users gain valuable insights from their datasets.
Non-line-of-sight (NLOS) imaging aims to reconstruct the three-dimensional hidden scenes from the data measured in the line-of-sight, which uses photon time-of-flight information encoded in light after multiple diffuse reflections. The under-sampled scanning data can facilitate fast imaging. However, the resulting reconstruction problem becomes a serious ill-posed inverse problem, the solution of which is of high possibility to be degraded due to noises and distortions. In this paper, we propose two novel NLOS reconstruction models based on curvature regularization, i.e., the object-domain curvature regularization model and the dual (i.e., signal and object)-domain curvature regularization model. Fast numerical optimization algorithms are developed relying on the alternating direction method of multipliers (ADMM) with the backtracking stepsize rule, which are further accelerated by GPU implementation. We evaluate the proposed algorithms on both synthetic and real datasets, which achieve state-of-the-art performance, especially in the compressed sensing setting. All our codes and data are available at https://github.com/Duanlab123/CurvNLOS.
In light of the growing popularity of Exploratory Data Analysis (EDA), understanding the underlying causes of the knowledge acquired by EDA is crucial, but remains under-researched. This study promotes for the first time a transparent and explicable perspective on data analysis, called eXplainable Data Analysis (XDA). XDA provides data analysis with qualitative and quantitative explanations of causal and non-causal semantics. This way, XDA will significantly improve human understanding and confidence in the outcomes of data analysis, facilitating accurate data interpretation and decision-making in the real world. For this purpose, we present XInsight, a general framework for XDA. XInsight is a three-module, end-to-end pipeline designed to extract causal graphs, translate causal primitives into XDA semantics, and quantify the quantitative contribution of each explanation to a data fact. XInsight uses a set of design concepts and optimizations to address the inherent difficulties associated with integrating causality into XDA. Experiments on synthetic and real-world datasets as well as human evaluations demonstrate the highly promising capabilities of XInsight.
In the practical applications of computed tomography imaging, the projection data may be acquired within a limited-angle range and corrupted by noises due to the limitation of scanning conditions. The noisy incomplete projection data results in the ill-posedness of the inverse problems. In this work, we theoretically verify that the low-resolution reconstruction problem has better numerical stability than the high-resolution problem. In what follows, a novel low-resolution image prior based CT reconstruction model is proposed to make use of the low-resolution image to improve the reconstruction quality. More specifically, we build up a low-resolution reconstruction problem on the down-sampled projection data, and use the reconstructed low-resolution image as prior knowledge for the original limited-angle CT problem. We solve the constrained minimization problem by the alternating direction method with all subproblems approximated by the convolutional neural networks. Numerical experiments demonstrate that our double-resolution network outperforms both the variational method and popular learning-based reconstruction methods on noisy limited-angle reconstruction problems.
Automatic modulation classification is of crucial importance in wireless communication networks. Deep learning based automatic modulation classification schemes have attracted extensive attention due to the superior accuracy. However, the data-driven method relies on a large amount of training samples and the classification accuracy is poor in the low signal-to-noise radio (SNR). In order to tackle these problems, a novel data-and-knowledge dual-driven automatic modulation classification scheme based on radio frequency machine learning is proposed by exploiting the attribute features of different modulations. The visual model is utilized to extract visual features. The attribute learning model is used to learn the attribute semantic representations. The transformation model is proposed to convert the attribute representation into the visual space. Extensive simulation results demonstrate that our proposed automatic modulation classification scheme can achieve better performance than the benchmark schemes in terms of the classification accuracy, especially in the low SNR. Moreover, the confusion among high-order modulations is reduced by using our proposed scheme compared with other traditional schemes.