Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xin Xia

University of Georgia

KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits

Feb 18, 2025

Xin Xia, Yujin Wang, Jun Zhou, Guisheng Zhong, Linning Cai, Chen Zhang

Figure 1 for KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits

Figure 2 for KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits

Figure 3 for KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits

Figure 4 for KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits

Abstract:Patent analysis highly relies on concise and interpretable document representations, referred to as patent portraits. Keyphrases, both present and absent, are ideal candidates for patent portraits due to their brevity, representativeness, and clarity. In this paper, we introduce KAPPA, an integrated framework designed to construct keyphrase-based patent portraits and enhance patent analysis. KAPPA operates in two phases: patent portrait construction and portrait-based analysis. To ensure effective portrait construction, we propose a semantic-calibrated keyphrase generation paradigm that integrates pre-trained language models with a prompt-based hierarchical decoding strategy to leverage the multi-level structural characteristics of patents. For portrait-based analysis, we develop a comprehensive framework that employs keyphrase-based patent portraits to enable efficient and accurate patent analysis. Extensive experiments on benchmark datasets of keyphrase generation, the proposed model achieves significant improvements compared to state-of-the-art baselines. Further experiments conducted on real-world patent applications demonstrate that our keyphrase-based portraits effectively capture domain-specific knowledge and enrich semantic representation for patent analysis tasks.

Via

Access Paper or Ask Questions

Can LLMs Replace Human Evaluators? An Empirical Study of LLM-as-a-Judge in Software Engineering

Feb 10, 2025

Ruiqi Wang, Jiyu Guo, Cuiyun Gao, Guodong Fan, Chun Yong Chong, Xin Xia

Abstract:Recently, large language models (LLMs) have been deployed to tackle various software engineering (SE) tasks like code generation, significantly advancing the automation of SE tasks. However, assessing the quality of these LLM-generated code and text remains challenging. The commonly used Pass@k metric necessitates extensive unit tests and configured environments, demands a high labor cost, and is not suitable for evaluating LLM-generated text. Conventional metrics like BLEU, which measure only lexical rather than semantic similarity, have also come under scrutiny. In response, a new trend has emerged to employ LLMs for automated evaluation, known as LLM-as-a-judge. These LLM-as-a-judge methods are claimed to better mimic human assessment than conventional metrics without relying on high-quality reference answers. Nevertheless, their exact human alignment in SE tasks remains unexplored. In this paper, we empirically explore LLM-as-a-judge methods for evaluating SE tasks, focusing on their alignment with human judgments. We select seven LLM-as-a-judge methods that utilize general-purpose LLMs, alongside two LLMs specifically fine-tuned for evaluation. After generating and manually scoring LLM responses on three recent SE datasets of code translation, code generation, and code summarization, we then prompt these methods to evaluate each response. Finally, we compare the scores generated by these methods with human evaluation. The results indicate that output-based methods reach the highest Pearson correlation of 81.32 and 68.51 with human scores in code translation and generation, achieving near-human evaluation, noticeably outperforming ChrF++, one of the best conventional metrics, at 34.23 and 64.92. Such output-based methods prompt LLMs to output judgments directly, and exhibit more balanced score distributions that resemble human score patterns. Finally, we provide...

* Accepted by ISSTA 2025

Via

Access Paper or Ask Questions

Diffusion Adversarial Post-Training for One-Step Video Generation

Jan 14, 2025

Shanchuan Lin, Xin Xia, Yuxi Ren, Ceyuan Yang, Xuefeng Xiao, Lu Jiang

Figure 1 for Diffusion Adversarial Post-Training for One-Step Video Generation

Figure 2 for Diffusion Adversarial Post-Training for One-Step Video Generation

Figure 3 for Diffusion Adversarial Post-Training for One-Step Video Generation

Figure 4 for Diffusion Adversarial Post-Training for One-Step Video Generation

Abstract:The diffusion models are widely used for image and video generation, but their iterative generation process is slow and expansive. While existing distillation approaches have demonstrated the potential for one-step generation in the image domain, they still suffer from significant quality degradation. In this work, we propose Adversarial Post-Training (APT) against real data following diffusion pre-training for one-step video generation. To improve the training stability and quality, we introduce several improvements to the model architecture and training procedures, along with an approximated R1 regularization objective. Empirically, our experiments show that our adversarial post-trained model, Seaweed-APT, can generate 2-second, 1280x720, 24fps videos in real time using a single forward evaluation step. Additionally, our model is capable of generating 1024px images in a single step, achieving quality comparable to state-of-the-art methods.

Via

Access Paper or Ask Questions

Verified Lifting of Deep learning Operators

Dec 30, 2024

Qi Zhan, Xing Hu, Xin Xia, Shanping Li

Figure 1 for Verified Lifting of Deep learning Operators

Figure 2 for Verified Lifting of Deep learning Operators

Figure 3 for Verified Lifting of Deep learning Operators

Figure 4 for Verified Lifting of Deep learning Operators

Abstract:Deep learning operators are fundamental components of modern deep learning frameworks. With the growing demand for customized operators, it has become increasingly common for developers to create their own. However, designing and implementing operators is complex and error-prone, due to hardware-specific optimizations and the need for numerical stability. There is a pressing need for tools that can summarize the functionality of both existing and user-defined operators. To address this gap, this work introduces a novel framework for the verified lifting of deep learning operators, which synthesizes high-level mathematical formulas from low-level implementations. Our approach combines symbolic execution, syntax-guided synthesis, and SMT-based verification to produce readable and formally verified mathematical formulas. In synthesis, we employ a combination of top-down and bottom-up strategies to explore the vast search space efficiently; In verification, we design invariant synthesis patterns and leverage SMT solvers to validate the correctness of the derived summaries; In simplification, we use egraph-based techniques with custom rules to restore complex formulas to their natural, intuitive forms. Evaluated on a dataset of deep learning operators implemented in Triton from the real world, our method demonstrates the effectiveness of synthesis and verification compared to existing techniques. This framework bridges the gap between low-level implementations and high-level abstractions, improving understanding and reliability in deep learning operator development.

Via

Access Paper or Ask Questions

AC-LIO: Towards Asymptotic and Consistent Convergence in LiDAR-Inertial Odometry

Dec 08, 2024

Tianxiang Zhang, Xuanxuan Zhang, Wenlei Fan, Xin Xia, You Li

Figure 1 for AC-LIO: Towards Asymptotic and Consistent Convergence in LiDAR-Inertial Odometry

Figure 2 for AC-LIO: Towards Asymptotic and Consistent Convergence in LiDAR-Inertial Odometry

Figure 3 for AC-LIO: Towards Asymptotic and Consistent Convergence in LiDAR-Inertial Odometry

Figure 4 for AC-LIO: Towards Asymptotic and Consistent Convergence in LiDAR-Inertial Odometry

Abstract:Existing LiDAR-Inertial Odometry (LIO) frameworks typically utilize prior state trajectories derived from IMU integration to compensate for the motion distortion within LiDAR frames, and demonstrate outstanding accuracy and stability in regular low-speed and smooth scenes. However, in high-speed or intense motion scenarios, the residual distortion may increase due to the limitation of IMU's accuracy and frequency, which will degrade the consistency between the LiDAR frame with its represented geometric environment, leading pointcloud registration to fall into local optima and consequently increasing the drift in long-time and large-scale localization. To address the issue, we propose a novel asymptotically and consistently converging LIO framework called AC-LIO. First, during the iterative state estimation, we backwards propagate the update term based on the prior state chain, and asymptotically compensate the residual distortion before next iteration. Second, considering the weak correlation between the initial error and motion distortion of current frame, we propose a convergence criteria based on pointcloud constraints to control the back propagation. The approach of guiding the asymptotic distortion compensation based on convergence criteria can promote the consistent convergence of pointcloud registration and increase the accuracy and robustness of LIO. Experiments show that our AC-LIO framework, compared to other state-of-the-art frameworks, effectively promotes consistent convergence in state estimation and further improves the accuracy of long-time and large-scale localization and mapping.

* 7 pages, 5 figures

Via

Access Paper or Ask Questions

V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction

Dec 02, 2024

Zewei Zhou, Hao Xiang, Zhaoliang Zheng, Seth Z. Zhao, Mingyue Lei, Yun Zhang, Tianhui Cai, Xinyi Liu, Johnson Liu, Maheswari Bajji(+5 more)

Figure 1 for V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction

Figure 2 for V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction

Figure 3 for V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction

Figure 4 for V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction

Abstract:Vehicle-to-everything (V2X) technologies offer a promising paradigm to mitigate the limitations of constrained observability in single-vehicle systems. Prior work primarily focuses on single-frame cooperative perception, which fuses agents' information across different spatial locations but ignores temporal cues and temporal tasks (e.g., temporal perception and prediction). In this paper, we focus on temporal perception and prediction tasks in V2X scenarios and design one-step and multi-step communication strategies (when to transmit) as well as examine their integration with three fusion strategies - early, late, and intermediate (what to transmit), providing comprehensive benchmarks with various fusion models (how to fuse). Furthermore, we propose V2XPnP, a novel intermediate fusion framework within one-step communication for end-to-end perception and prediction. Our framework employs a unified Transformer-based architecture to effectively model complex spatiotemporal relationships across temporal per-frame, spatial per-agent, and high-definition map. Moreover, we introduce the V2XPnP Sequential Dataset that supports all V2X cooperation modes and addresses the limitations of existing real-world datasets, which are restricted to single-frame or single-mode cooperation. Extensive experiments demonstrate our framework outperforms state-of-the-art methods in both perception and prediction tasks.

* Website link: https://mobility-lab.seas.ucla.edu/v2xpnp/

Via

Access Paper or Ask Questions

Balancing property optimization and constraint satisfaction for constrained multi-property molecular optimization

Nov 19, 2024

Xin Xia, Yajie Zhang, Xiangxiang Zeng, Xingyi Zhang, Chunhou Zheng, Yansen Su

Figure 1 for Balancing property optimization and constraint satisfaction for constrained multi-property molecular optimization

Figure 2 for Balancing property optimization and constraint satisfaction for constrained multi-property molecular optimization

Figure 3 for Balancing property optimization and constraint satisfaction for constrained multi-property molecular optimization

Figure 4 for Balancing property optimization and constraint satisfaction for constrained multi-property molecular optimization

Abstract:Molecular optimization, which aims to discover improved molecules from a vast chemical search space, is a critical step in chemical development. Various artificial intelligence technologies have demonstrated high effectiveness and efficiency on molecular optimization tasks. However, few of these technologies focus on balancing property optimization with constraint satisfaction, making it difficult to obtain high-quality molecules that not only possess desirable properties but also meet various constraints. To address this issue, we propose a constrained multi-property molecular optimization framework (CMOMO), which is a flexible and efficient method to simultaneously optimize multiple molecular properties while satisfying several drug-like constraints. CMOMO improves multiple properties of molecules with constraints based on dynamic cooperative optimization, which dynamically handles the constraints across various scenarios. Besides, CMOMO evaluates multiple properties within discrete chemical spaces cooperatively with the evolution of molecules within an implicit molecular space to guide the evolutionary search. Experimental results show the superior performance of the proposed CMOMO over five state-of-the-art molecular optimization methods on two benchmark tasks of simultaneously optimizing multiple non-biological activity properties while satisfying two structural constraints. Furthermore, the practical applicability of CMOMO is verified on two practical tasks, where it identified a collection of candidate ligands of $\beta$2-adrenoceptor GPCR and candidate inhibitors of glycogen synthase kinase-3$\beta$ with high properties and under drug-like constraints.

Via

Access Paper or Ask Questions

B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests

Sep 13, 2024

Mouxiang Chen, Zhongxin Liu, He Tao, Yusu Hong, David Lo, Xin Xia, Jianling Sun

Figure 1 for B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests

Figure 2 for B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests

Figure 3 for B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests

Figure 4 for B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests

Abstract:Selecting the best code solution from multiple generated ones is an essential task in code generation, which can be achieved by using some reliable validators (e.g., developer-written test cases) for assistance. Since reliable test cases are not always available and can be expensive to build in practice, researchers propose to automatically generate test cases to assess code solutions. However, when both code solutions and test cases are plausible and not reliable, selecting the best solution becomes challenging. Although some heuristic strategies have been proposed to tackle this problem, they lack a strong theoretical guarantee and it is still an open question whether an optimal selection strategy exists. Our work contributes in two ways. First, we show that within a Bayesian framework, the optimal selection strategy can be defined based on the posterior probability of the observed passing states between solutions and tests. The problem of identifying the best solution is then framed as an integer programming problem. Second, we propose an efficient approach for approximating this optimal (yet uncomputable) strategy, where the approximation error is bounded by the correctness of prior knowledge. We then incorporate effective prior knowledge to tailor code generation tasks. Both theoretical and empirical studies confirm that existing heuristics are limited in selecting the best solutions with plausible test cases. Our proposed approximated optimal strategy B4 significantly surpasses existing heuristics in selecting code solutions generated by large language models (LLMs) with LLM-generated tests, achieving a relative performance improvement by up to 50% over the strongest heuristic and 246% over the random selection in the most challenging scenarios. Our code is publicly available at https://github.com/ZJU-CTAG/B4.

* accepted by ASE' 24 (full paper)

Via

Access Paper or Ask Questions

AS-LIO: Spatial Overlap Guided Adaptive Sliding Window LiDAR-Inertial Odometry for Aggressive FOV Variation

Aug 21, 2024

Tianxiang Zhang, Xuanxuan Zhang, Zongbo Liao, Xin Xia, You Li

Figure 1 for AS-LIO: Spatial Overlap Guided Adaptive Sliding Window LiDAR-Inertial Odometry for Aggressive FOV Variation

Figure 2 for AS-LIO: Spatial Overlap Guided Adaptive Sliding Window LiDAR-Inertial Odometry for Aggressive FOV Variation

Figure 3 for AS-LIO: Spatial Overlap Guided Adaptive Sliding Window LiDAR-Inertial Odometry for Aggressive FOV Variation

Figure 4 for AS-LIO: Spatial Overlap Guided Adaptive Sliding Window LiDAR-Inertial Odometry for Aggressive FOV Variation

Abstract:LiDAR-Inertial Odometry (LIO) demonstrates outstanding accuracy and stability in general low-speed and smooth motion scenarios. However, in high-speed and intense motion scenarios, such as sharp turns, two primary challenges arise: firstly, due to the limitations of IMU frequency, the error in estimating significantly non-linear motion states escalates; secondly, drastic changes in the Field of View (FOV) may diminish the spatial overlap between LiDAR frame and pointcloud map (or between frames), leading to insufficient data association and constraint degradation. To address these issues, we propose a novel Adaptive Sliding window LIO framework (AS-LIO) guided by the Spatial Overlap Degree (SOD). Initially, we assess the SOD between the LiDAR frames and the registered map, directly evaluating the adverse impact of current FOV variation on pointcloud alignment. Subsequently, we design an adaptive sliding window to manage the continuous LiDAR stream and control state updates, dynamically adjusting the update step according to the SOD. This strategy enables our odometry to adaptively adopt higher update frequency to precisely characterize trajectory during aggressive FOV variation, thus effectively reducing the non-linear error in positioning. Meanwhile, the historical constraints within the sliding window reinforce the frame-to-map data association, ensuring the robustness of state estimation. Experiments show that our AS-LIO framework can quickly perceive and respond to challenging FOV change, outperforming other state-of-the-art LIO frameworks in terms of accuracy and robustness.

* 8 pages, 6 figures

Via

Access Paper or Ask Questions

CooPre: Cooperative Pretraining for V2X Cooperative Perception

Aug 20, 2024

Seth Z. Zhao, Hao Xiang, Chenfeng Xu, Xin Xia, Bolei Zhou, Jiaqi Ma

Figure 1 for CooPre: Cooperative Pretraining for V2X Cooperative Perception

Figure 2 for CooPre: Cooperative Pretraining for V2X Cooperative Perception

Figure 3 for CooPre: Cooperative Pretraining for V2X Cooperative Perception

Figure 4 for CooPre: Cooperative Pretraining for V2X Cooperative Perception

Abstract:Existing Vehicle-to-Everything (V2X) cooperative perception methods rely on accurate multi-agent 3D annotations. Nevertheless, it is time-consuming and expensive to collect and annotate real-world data, especially for V2X systems. In this paper, we present a self-supervised learning method for V2X cooperative perception, which utilizes the vast amount of unlabeled 3D V2X data to enhance the perception performance. Beyond simply extending the previous pre-training methods for point-cloud representation learning, we introduce a novel self-supervised Cooperative Pretraining framework (termed as CooPre) customized for a collaborative scenario. We point out that cooperative point-cloud sensing compensates for information loss among agents. This motivates us to design a novel proxy task for the 3D encoder to reconstruct LiDAR point clouds across different agents. Besides, we develop a V2X bird-eye-view (BEV) guided masking strategy which effectively allows the model to pay attention to 3D features across heterogeneous V2X agents (i.e., vehicles and infrastructure) in the BEV space. Noticeably, such a masking strategy effectively pretrains the 3D encoder and is compatible with mainstream cooperative perception backbones. Our approach, validated through extensive experiments on representative datasets (i.e., V2X-Real, V2V4Real, and OPV2V), leads to a performance boost across all V2X settings. Additionally, we demonstrate the framework's improvements in cross-domain transferability, data efficiency, and robustness under challenging scenarios. The code will be made publicly available.

Via

Access Paper or Ask Questions