Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zitong Wang

DeepForm: Reasoning Large Language Model for Communication System Formulation

Jun 11, 2025

Panlong Wu, Ting Wang, Yifei Zhong, Haoqi Zhang, Zitong Wang, Fangxin Wang

Abstract:Communication system formulation is critical for advancing 6G and future wireless technologies, yet it remains a complex, expertise-intensive task. While Large Language Models (LLMs) offer potential, existing general-purpose models often lack the specialized domain knowledge, nuanced reasoning capabilities, and access to high-quality, domain-specific training data required for adapting a general LLM into an LLM specially for communication system formulation. To bridge this gap, we introduce DeepForm, the first reasoning LLM specially for automated communication system formulation. We propose the world-first large-scale, open-source dataset meticulously curated for this domain called Communication System Formulation Reasoning Corpus (CSFRC). Our framework employs a two-stage training strategy: first, Supervised Fine-Tuning (SFT) with Chain-of-Thought (CoT) data to distill domain knowledge; second, a novel rule-based Reinforcement Learning (RL) algorithm, C-ReMax based on ReMax, to cultivate advanced modeling capabilities and elicit sophisticated reasoning patterns like self-correction and verification. Extensive experiments demonstrate that our model achieves state-of-the-art performance, significantly outperforming larger proprietary LLMs on diverse senerios. We will release related resources to foster further research in this area after the paper is accepted.

Via

Access Paper or Ask Questions

DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers

May 30, 2025

Zitong Wang, Hang Zhao, Qianyu Zhou, Xuequan Lu, Xiangtai Li, Yiren Song

Abstract:Diffusion models have recently motivated great success in many generation tasks like object removal. Nevertheless, existing image decomposition methods struggle to disentangle semi-transparent or transparent layer occlusions due to mask prior dependencies, static object assumptions, and the lack of datasets. In this paper, we delve into a novel task: Layer-Wise Decomposition of Alpha-Composited Images, aiming to recover constituent layers from single overlapped images under the condition of semi-transparent/transparent alpha layer non-linear occlusion. To address challenges in layer ambiguity, generalization, and data scarcity, we first introduce AlphaBlend, the first large-scale and high-quality dataset for transparent and semi-transparent layer decomposition, supporting six real-world subtasks (e.g., translucent flare removal, semi-transparent cell decomposition, glassware decomposition). Building on this dataset, we present DiffDecompose, a diffusion Transformer-based framework that learns the posterior over possible layer decompositions conditioned on the input image, semantic prompts, and blending type. Rather than regressing alpha mattes directly, DiffDecompose performs In-Context Decomposition, enabling the model to predict one or multiple layers without per-layer supervision, and introduces Layer Position Encoding Cloning to maintain pixel-level correspondence across layers. Extensive experiments on the proposed AlphaBlend dataset and public LOGO dataset verify the effectiveness of DiffDecompose. The code and dataset will be available upon paper acceptance. Our code will be available at: https://github.com/Wangzt1121/DiffDecompose.

Via

Access Paper or Ask Questions

DMS-Net:Dual-Modal Multi-Scale Siamese Network for Binocular Fundus Image Classification

Apr 25, 2025

Guohao Huo, Zibo Lin, Zitong Wang, Ruiting Dai, Hao Tang

Abstract:Ophthalmic diseases pose a significant global health challenge, yet traditional diagnosis methods and existing single-eye deep learning approaches often fail to account for binocular pathological correlations. To address this, we propose DMS-Net, a dual-modal multi-scale Siamese network for binocular fundus image classification. Our framework leverages weight-shared Siamese ResNet-152 backbones to extract deep semantic features from paired fundus images. To tackle challenges such as lesion boundary ambiguity and scattered pathological distributions, we introduce a Multi-Scale Context-Aware Module (MSCAM) that integrates adaptive pooling and attention mechanisms for multi-resolution feature aggregation. Additionally, a Dual-Modal Feature Fusion (DMFF) module enhances cross-modal interaction through spatial-semantic recalibration and bidirectional attention, effectively combining global context and local edge features. Evaluated on the ODIR-5K dataset, DMS-Net achieves state-of-the-art performance with 80.5% accuracy, 86.1% recall, and 83.8% Cohen's kappa, demonstrating superior capability in detecting symmetric pathologies and advancing clinical decision-making for ocular diseases.

Via

Access Paper or Ask Questions

Stochastic Trajectory Optimization for Demonstration Imitation

Aug 07, 2024

Chenlin Ming, Zitong Wang, Boxuan Zhang, Xiaoming Duan, Jianping He

Figure 1 for Stochastic Trajectory Optimization for Demonstration Imitation

Figure 2 for Stochastic Trajectory Optimization for Demonstration Imitation

Figure 3 for Stochastic Trajectory Optimization for Demonstration Imitation

Abstract:Humans often learn new skills by imitating the experts and gradually developing their proficiency. In this work, we introduce Stochastic Trajectory Optimization for Demonstration Imitation (STODI), a trajectory optimization framework for robots to imitate the shape of demonstration trajectories with improved dynamic performance. Consistent with the human learning process, demonstration imitation serves as an initial step, while trajectory optimization aims to enhance robot motion performance. By generating random noise and constructing proper cost functions, the STODI effectively explores and exploits generated noisy trajectories while preserving the demonstration shape characteristics. We employ three metrics to measure the similarity of trajectories in both the time and frequency domains to help with demonstration imitation. Theoretical analysis reveals relationships among these metrics, emphasizing the benefits of frequency-domain analysis for specific tasks. Experiments on a 7-DOF robotic arm in the PyBullet simulator validate the efficacy of the STODI framework, showcasing the improved optimization performance and stability compared to previous methods.

Via

Access Paper or Ask Questions

Imbalanced Graph-Level Anomaly Detection via Counterfactual Augmentation and Feature Learning

Jul 13, 2024

Zitong Wang, Xuexiong Luo, Enfeng Song, Qiuqing Bai, Fu Lin

Figure 1 for Imbalanced Graph-Level Anomaly Detection via Counterfactual Augmentation and Feature Learning

Figure 2 for Imbalanced Graph-Level Anomaly Detection via Counterfactual Augmentation and Feature Learning

Figure 3 for Imbalanced Graph-Level Anomaly Detection via Counterfactual Augmentation and Feature Learning

Figure 4 for Imbalanced Graph-Level Anomaly Detection via Counterfactual Augmentation and Feature Learning

Abstract:Graph-level anomaly detection (GLAD) has already gained significant importance and has become a popular field of study, attracting considerable attention across numerous downstream works. The core focus of this domain is to capture and highlight the anomalous information within given graph datasets. In most existing studies, anomalies are often the instances of few. The stark imbalance misleads current GLAD methods to focus on learning the patterns of normal graphs more, further impacting anomaly detection performance. Moreover, existing methods predominantly utilize the inherent features of nodes to identify anomalous graph patterns which is approved suboptimal according to our experiments. In this work, we propose an imbalanced GLAD method via counterfactual augmentation and feature learning. Specifically, we first construct anomalous samples based on counterfactual learning, aiming to expand and balance the datasets. Additionally, we construct a module based on Graph Neural Networks (GNNs), which allows us to utilize degree attributes to complement the inherent attribute features of nodes. Then, we design an adaptive weight learning module to integrate features tailored to different datasets effectively to avoid indiscriminately treating all features as equivalent. Furthermore, extensive baseline experiments conducted on public datasets substantiate the robustness and effectiveness. Besides, we apply the model to brain disease datasets, which can prove the generalization capability of our work. The source code of our work is available online.

* 12 pages, 4 figures, SSDBM2024

Via

Access Paper or Ask Questions

Learning from Sparse Offline Datasets via Conservative Density Estimation

Jan 16, 2024

Zhepeng Cen, Zuxin Liu, Zitong Wang, Yihang Yao, Henry Lam, Ding Zhao

Abstract:Offline reinforcement learning (RL) offers a promising direction for learning policies from pre-collected datasets without requiring further interactions with the environment. However, existing methods struggle to handle out-of-distribution (OOD) extrapolation errors, especially in sparse reward or scarce data settings. In this paper, we propose a novel training algorithm called Conservative Density Estimation (CDE), which addresses this challenge by explicitly imposing constraints on the state-action occupancy stationary distribution. CDE overcomes the limitations of existing approaches, such as the stationary distribution correction method, by addressing the support mismatch issue in marginal importance sampling. Our method achieves state-of-the-art performance on the D4RL benchmark. Notably, CDE consistently outperforms baselines in challenging tasks with sparse rewards or insufficient data, demonstrating the advantages of our approach in addressing the extrapolation error problem in offline RL.

* ICLR 2024

Via

Access Paper or Ask Questions

Resampling Stochastic Gradient Descent Cheaply for Efficient Uncertainty Quantification

Oct 17, 2023

Henry Lam, Zitong Wang

Abstract:Stochastic gradient descent (SGD) or stochastic approximation has been widely used in model training and stochastic optimization. While there is a huge literature on analyzing its convergence, inference on the obtained solutions from SGD has only been recently studied, yet is important due to the growing need for uncertainty quantification. We investigate two computationally cheap resampling-based methods to construct confidence intervals for SGD solutions. One uses multiple, but few, SGDs in parallel via resampling with replacement from the data, and another operates this in an online fashion. Our methods can be regarded as enhancements of established bootstrap schemes to substantially reduce the computation effort in terms of resampling requirements, while at the same time bypassing the intricate mixing conditions in existing batching methods. We achieve these via a recent so-called cheap bootstrap idea and Berry-Esseen-type bound for SGD.

Via

Access Paper or Ask Questions

Discriminative Graph-level Anomaly Detection via Dual-students-teacher Model

Aug 03, 2023

Fu Lin, Xuexiong Luo, Jia Wu, Jian Yang, Shan Xue, Zitong Wang, Haonan Gong

Figure 1 for Discriminative Graph-level Anomaly Detection via Dual-students-teacher Model

Figure 2 for Discriminative Graph-level Anomaly Detection via Dual-students-teacher Model

Figure 3 for Discriminative Graph-level Anomaly Detection via Dual-students-teacher Model

Figure 4 for Discriminative Graph-level Anomaly Detection via Dual-students-teacher Model

Abstract:Different from the current node-level anomaly detection task, the goal of graph-level anomaly detection is to find abnormal graphs that significantly differ from others in a graph set. Due to the scarcity of research on the work of graph-level anomaly detection, the detailed description of graph-level anomaly is insufficient. Furthermore, existing works focus on capturing anomalous graph information to learn better graph representations, but they ignore the importance of an effective anomaly score function for evaluating abnormal graphs. Thus, in this work, we first define anomalous graph information including node and graph property anomalies in a graph set and adopt node-level and graph-level information differences to identify them, respectively. Then, we introduce a discriminative graph-level anomaly detection framework with dual-students-teacher model, where the teacher model with a heuristic loss are trained to make graph representations more divergent. Then, two competing student models trained by normal and abnormal graphs respectively fit graph representations of the teacher model in terms of node-level and graph-level representation perspectives. Finally, we combine representation errors between two student models to discriminatively distinguish anomalous graphs. Extensive experiment analysis demonstrates that our method is effective for the graph-level anomaly detection task on graph datasets in the real world.

Via

Access Paper or Ask Questions

Multi-representations Space Separation based Graph-level Anomaly-aware Detection

Jul 22, 2023

Fu Lin, Haonan Gong, Mingkang Li, Zitong Wang, Yue Zhang, Xuexiong Luo

Figure 1 for Multi-representations Space Separation based Graph-level Anomaly-aware Detection

Figure 2 for Multi-representations Space Separation based Graph-level Anomaly-aware Detection

Figure 3 for Multi-representations Space Separation based Graph-level Anomaly-aware Detection

Figure 4 for Multi-representations Space Separation based Graph-level Anomaly-aware Detection

Abstract:Graph structure patterns are widely used to model different area data recently. How to detect anomalous graph information on these graph data has become a popular research problem. The objective of this research is centered on the particular issue that how to detect abnormal graphs within a graph set. The previous works have observed that abnormal graphs mainly show node-level and graph-level anomalies, but these methods equally treat two anomaly forms above in the evaluation of abnormal graphs, which is contrary to the fact that different types of abnormal graph data have different degrees in terms of node-level and graph-level anomalies. Furthermore, abnormal graphs that have subtle differences from normal graphs are easily escaped detection by the existing methods. Thus, we propose a multi-representations space separation based graph-level anomaly-aware detection framework in this paper. To consider the different importance of node-level and graph-level anomalies, we design an anomaly-aware module to learn the specific weight between them in the abnormal graph evaluation process. In addition, we learn strictly separate normal and abnormal graph representation spaces by four types of weighted graph representations against each other including anchor normal graphs, anchor abnormal graphs, training normal graphs, and training abnormal graphs. Based on the distance error between the graph representations of the test graph and both normal and abnormal graph representation spaces, we can accurately determine whether the test graph is anomalous. Our approach has been extensively evaluated against baseline methods using ten public graph datasets, and the results demonstrate its effectiveness.

* 35th International Conference on Scientific and Statistical Database Management (SSDBM 2023), July 10--12, 2023, Los Angeles, CA, USA
* 11 pages, 12 figures

Via

Access Paper or Ask Questions

A serial dual-channel library occupancy detection system based on Faster RCNN

Jun 28, 2023

Guoqiang Yang, Xiaowen Chang, Zitong Wang, Min Yang, Xin Chen

Figure 1 for A serial dual-channel library occupancy detection system based on Faster RCNN

Figure 2 for A serial dual-channel library occupancy detection system based on Faster RCNN

Figure 3 for A serial dual-channel library occupancy detection system based on Faster RCNN

Figure 4 for A serial dual-channel library occupancy detection system based on Faster RCNN

Abstract:The phenomenon of seat occupancy in university libraries is a prevalent issue. However, existing solutions, such as software-based seat reservations and sensors-based occupancy detection, have proven to be inadequate in effectively addressing this problem. In this study, we propose a novel approach: a serial dual-channel object detection model based on Faster RCNN. Furthermore, we develop a user-friendly Web interface and mobile APP to create a computer vision-based platform for library seat occupancy detection. To construct our dataset, we combine real-world data collec-tion with UE5 virtual reality. The results of our tests also demonstrate that the utilization of per-sonalized virtual dataset significantly enhances the performance of the convolutional neural net-work (CNN) in dedicated scenarios. The serial dual-channel detection model comprises three es-sential steps. Firstly, we employ Faster RCNN algorithm to determine whether a seat is occupied by an individual. Subsequently, we utilize an object classification algorithm based on transfer learning, to classify and identify images of unoccupied seats. This eliminates the need for manual judgment regarding whether a person is suspected of occupying a seat. Lastly, the Web interface and APP provide seat information to librarians and students respectively, enabling comprehensive services. By leveraging deep learning methodologies, this research effectively addresses the issue of seat occupancy in library systems. It significantly enhances the accuracy of seat occupancy recognition, reduces the computational resources required for training CNNs, and greatly improves the effi-ciency of library seat management.

Via

Access Paper or Ask Questions