Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Wei Wang

School of Physics and Astronomy, Shanghai Jiao Tong University, State Key Laboratory of Dark Matter Physics, Shanghai Jiao Tong University, Tsung-Dao Lee Institute, Shanghai Jiao Tong University

Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization

Aug 14, 2024

Yuxin Jiang, Bo Huang, Yufei Wang, Xingshan Zeng, Liangyou Li, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Wei Wang

Figure 1 for Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization

Figure 2 for Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization

Figure 3 for Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization

Figure 4 for Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization

Abstract:Direct preference optimization (DPO), a widely adopted offline preference optimization algorithm, aims to align large language models (LLMs) with human-desired behaviors using pairwise preference data. However, the winning response and the losing response within pairwise data are generated isolatedly, leading to weak correlations between them as well as suboptimal alignment performance. To address this issue, we propose an effective framework named BMC, for bridging and modeling correlations in pairwise data. Firstly, we increase the consistency and informativeness of the pairwise preference signals by targeted modifications, synthesizing a pseudo winning response through improving the losing response based on the winning response. Secondly, we identify that DPO alone is insufficient to model these correlations and capture nuanced variations. Therefore, we propose learning token-level correlations by dynamically leveraging the policy model's confidence during training. Comprehensive experiments on QA, math, and instruction-following tasks demonstrate the effectiveness of our approach, significantly surpassing competitive baselines, including DPO. Additionally, our in-depth quantitative analysis reveals the reasons behind our method's superior performance over DPO and showcases its versatility to other DPO variants.

* 18 pages, 8 figures, 8 tables, working in progress

Via

Access Paper or Ask Questions

CTISum: A New Benchmark Dataset For Cyber Threat Intelligence Summarization

Aug 13, 2024

Wei Peng, Junmei Ding, Wei Wang, Lei Cui, Wei Cai, Zhiyu Hao, Xiaochun Yun

Figure 1 for CTISum: A New Benchmark Dataset For Cyber Threat Intelligence Summarization

Figure 2 for CTISum: A New Benchmark Dataset For Cyber Threat Intelligence Summarization

Figure 3 for CTISum: A New Benchmark Dataset For Cyber Threat Intelligence Summarization

Figure 4 for CTISum: A New Benchmark Dataset For Cyber Threat Intelligence Summarization

Abstract:Cyber Threat Intelligence (CTI) summarization task requires the system to generate concise and accurate highlights from raw intelligence data, which plays an important role in providing decision-makers with crucial information to quickly detect and respond to cyber threats in the cybersecurity domain. However, efficient techniques for summarizing CTI reports, including facts, analytical insights, attack processes, etc., have largely been unexplored, primarily due to the lack of available dataset. To this end, we present CTISum, a new benchmark for CTI summarization task. Considering the importance of attack process, a novel fine-grained subtask of attack process summarization is proposed to enable defenders to assess risk, identify security gaps, vulnerabilities, and so on. Specifically, we first design a multi-stage annotation pipeline to gather and annotate the CTI data, and then benchmark the CTISum with a collection of extractive and abstractive summarization methods. Experimental results show that current state-of-the-art models exhibit limitations when applied to CTISum, underscoring the fact that automatically producing concise summaries of CTI reports remains an open research challenge.

Via

Access Paper or Ask Questions

Inverse designing metamaterials with programmable nonlinear functional responses in graph space

Aug 12, 2024

Marco Maurizi, Derek Xu, Yu-Tong Wang, Desheng Yao, David Hahn, Mourad Oudich, Anish Satpati, Mathieu Bauchy, Wei Wang, Yizhou Sun(+2 more)

Figure 1 for Inverse designing metamaterials with programmable nonlinear functional responses in graph space

Figure 2 for Inverse designing metamaterials with programmable nonlinear functional responses in graph space

Figure 3 for Inverse designing metamaterials with programmable nonlinear functional responses in graph space

Figure 4 for Inverse designing metamaterials with programmable nonlinear functional responses in graph space

Abstract:Material responses to static and dynamic stimuli, represented as nonlinear curves, are design targets for engineering functionalities like structural support, impact protection, and acoustic and photonic bandgaps. Three-dimensional metamaterials offer significant tunability due to their internal structure, yet existing methods struggle to capture their complex behavior-to-structure relationships. We present GraphMetaMat, a graph-based framework capable of designing three-dimensional metamaterials with programmable responses and arbitrary manufacturing constraints. Integrating graph networks, physics biases, reinforcement learning, and tree search, GraphMetaMat can target stress-strain curves spanning four orders of magnitude and complex behaviors, as well as viscoelastic transmission responses with varying attenuation gaps. GraphMetaMat can create cushioning materials for protective equipment and vibration-damping panels for electric vehicles, outperforming commercial materials, and enabling the automatic design of materials with on-demand functionalities.

* 19 pages, 5 figures

Via

Access Paper or Ask Questions

CTE-MLO: Continuous-time and Efficient Multi-LiDAR Odometry with Localizability-aware Point Cloud Sampling

Aug 09, 2024

Hongming Shen, Zhenyu Wu, Wei Wang, Qiyang Lyu, Huiqin Zhou, Tianchen Deng, Yeqing Zhu, Danwei Wang

Figure 1 for CTE-MLO: Continuous-time and Efficient Multi-LiDAR Odometry with Localizability-aware Point Cloud Sampling

Figure 2 for CTE-MLO: Continuous-time and Efficient Multi-LiDAR Odometry with Localizability-aware Point Cloud Sampling

Figure 3 for CTE-MLO: Continuous-time and Efficient Multi-LiDAR Odometry with Localizability-aware Point Cloud Sampling

Figure 4 for CTE-MLO: Continuous-time and Efficient Multi-LiDAR Odometry with Localizability-aware Point Cloud Sampling

Abstract:In recent years, LiDAR-based localization and mapping methods have achieved significant progress thanks to their reliable and real-time localization capability. Considering single LiDAR odometry often faces hardware failures and degradation in practical scenarios, Multi-LiDAR Odometry (MLO), as an emerging technology, is studied to enhance the performance of LiDAR-based localization and mapping systems. However, MLO can suffer from high computational complexity introduced by dense point clouds that are fused from multiple LiDARs, and the continuous-time measurement characteristic is constantly neglected by existing LiDAR odometry. This motivates us to develop a Continuous-Time and Efficient MLO, namely CTE-MLO, which can achieve accurate and real-time state estimation using multi-LiDAR measurements through a continuous-time perspective. In this paper, the Gaussian process estimation is naturally combined with the Kalman filter, which enables each LiDAR point in a point stream to query the corresponding continuous-time trajectory within its time instants. A decentralized multi-LiDAR synchronization scheme also be devised to combine points from separate LiDARs into a single point cloud without the requirement for primary LiDAR assignment. Moreover, with the aim of improving the real-time performance of MLO without sacrificing robustness, a point cloud sampling strategy is designed with the consideration of localizability. The effectiveness of the proposed method is demonstrated through various scenarios, including public datasets and real-world autonomous driving experiments. The results demonstrate that the proposed CTE-MLO can achieve high-accuracy continuous-time state estimations in real-time and is demonstratively competitive compared to other state-of-the-art methods.

Via

Access Paper or Ask Questions

MDM: Advancing Multi-Domain Distribution Matching for Automatic Modulation Recognition Dataset Synthesis

Aug 05, 2024

Dongwei Xu, Jiajun Chen, Yao Lu, Tianhao Xia, Qi Xuan, Wei Wang, Yun Lin, Xiaoniu Yang

Abstract:Recently, deep learning technology has been successfully introduced into Automatic Modulation Recognition (AMR) tasks. However, the success of deep learning is all attributed to the training on large-scale datasets. Such a large amount of data brings huge pressure on storage, transmission and model training. In order to solve the problem of large amount of data, some researchers put forward the method of data distillation, which aims to compress large training data into smaller synthetic datasets to maintain its performance. While numerous data distillation techniques have been developed within the realm of image processing, the unique characteristics of signals set them apart. Signals exhibit distinct features across various domains, necessitating specialized approaches for their analysis and processing. To this end, a novel dataset distillation method--Multi-domain Distribution Matching (MDM) is proposed. MDM employs the Discrete Fourier Transform (DFT) to translate timedomain signals into the frequency domain, and then uses a model to compute distribution matching losses between the synthetic and real datasets, considering both the time and frequency domains. Ultimately, these two losses are integrated to update the synthetic dataset. We conduct extensive experiments on three AMR datasets. Experimental results show that, compared with baseline methods, our method achieves better performance under the same compression ratio. Furthermore, we conduct crossarchitecture generalization experiments on several models, and the experimental results show that our synthetic datasets can generalize well on other unseen models.

Via

Access Paper or Ask Questions

UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

Aug 05, 2024

Zhaowei Li, Wei Wang, YiQing Cai, Xu Qi, Pengyu Wang, Dong Zhang, Hang Song, Botian Jiang, Zhida Huang, Tao Wang

Figure 1 for UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

Figure 2 for UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

Figure 3 for UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

Figure 4 for UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

Abstract:Significant advancements has recently been achieved in the field of multi-modal large language models (MLLMs), demonstrating their remarkable capabilities in understanding and reasoning across diverse tasks. However, these models are often trained for specific tasks and rely on task-specific input-output formats, limiting their applicability to a broader range of tasks. This raises a fundamental question: Can we develop a unified approach to represent and handle different multi-modal tasks to maximize the generalizability of MLLMs? In this paper, we propose UnifiedMLLM, a comprehensive model designed to represent various tasks using a unified representation. Our model exhibits strong capabilities in comprehending the implicit intent of user instructions and preforming reasoning. In addition to generating textual responses, our model also outputs task tokens and grounding tokens, serving as indicators of task types and task granularity. These outputs are subsequently routed through the task router and directed to specific expert models for task completion. To train our model, we construct a task-specific dataset and an 100k multi-task dataset encompassing complex scenarios. Employing a three-stage training strategy, we equip our model with robust reasoning and task processing capabilities while preserving its generalization capacity and knowledge reservoir. Extensive experiments showcase the impressive performance of our unified representation approach across various tasks, surpassing existing methodologies. Furthermore, our approach exhibits exceptional scalability and generality. Our code, model, and dataset will be available at \url{https://github.com/lzw-lzw/UnifiedMLLM}.

Via

Access Paper or Ask Questions

DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention

Aug 01, 2024

Wei Wang, Jixing He, Xin Wang

Figure 1 for DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention

Figure 2 for DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention

Figure 3 for DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention

Figure 4 for DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention

Abstract:It is helpful in preventing colorectal cancer to detect and treat polyps in the gastrointestinal tract early. However, there have been few studies to date on designing polyp image classification networks that balance efficiency and accuracy. This challenge is mainly attributed to the fact that polyps are similar to other pathologies and have complex features influenced by texture, color, and morphology. In this paper, we propose a novel network DFE-IANet based on both spectral transformation and feature interaction. Firstly, to extract detailed features and multi-scale features, the features are transformed by the multi-scale frequency domain feature extraction (MSFD) block to extract texture details at the fine-grained level in the frequency domain. Secondly, the multi-scale interaction attention (MSIA) block is designed to enhance the network's capability of extracting critical features. This block introduces multi-scale features into self-attention, aiming to adaptively guide the network to concentrate on vital regions. Finally, with a compact parameter of only 4M, DFE-IANet outperforms the latest and classical networks in terms of efficiency. Furthermore, DFE-IANet achieves state-of-the-art (SOTA) results on the challenging Kvasir dataset, demonstrating a remarkable Top-1 accuracy of 93.94%. This outstanding accuracy surpasses ViT by 8.94%, ResNet50 by 1.69%, and VMamba by 1.88%. Our code is publicly available at https://github.com/PURSUETHESUN/DFE-IANet.

* ICIC 2024, Tianjin, China, Poster Volume 1, pp.492-509, August 5-8, 2024
* This paper has been accepted by 2024 International Conference on Intelligent Computing (ICIC 2024). It can be accessed at http://poster-openaccess.com

Via

Access Paper or Ask Questions

Universal Approximation Theory: Foundations for Parallelism in Neural Networks

Jul 31, 2024

Wei Wang, Qing Li

Abstract:Neural networks are increasingly evolving towards training large models with big data, a method that has demonstrated superior performance across many tasks. However, this approach introduces an urgent problem: current deep learning models are predominantly serial, meaning that as the number of network layers increases, so do the training and inference times. This is unacceptable if deep learning is to continue advancing. Therefore, this paper proposes a deep learning parallelization strategy based on the Universal Approximation Theorem (UAT). From this foundation, we designed a parallel network called Para-Former to test our theory. Unlike traditional serial models, the inference time of Para-Former does not increase with the number of layers, significantly accelerating the inference speed of multi-layer networks. Experimental results validate the effectiveness of this network.

Via

Access Paper or Ask Questions

Deep Learning-Based Longitudinal Prediction of Childhood Myopia Progression Using Fundus Image Sequences and Baseline Refraction Data

Jul 31, 2024

Mengtian Kang, Yansong Hu, Shuo Gao, Yuanyuan Liu, Hongbei Meng, Xuemeng Li, Xuhang Chen, Hubin Zhao, Jing Fu, Guohua Hu(+6 more)

$Figure 1 for Deep Learning-Based Longitudinal Prediction of Childhood Myopia Progression Using Fundus Image Sequences and Baseline Refraction Data$

$Figure 2 for Deep Learning-Based Longitudinal Prediction of Childhood Myopia Progression Using Fundus Image Sequences and Baseline Refraction Data$

$Figure 3 for Deep Learning-Based Longitudinal Prediction of Childhood Myopia Progression Using Fundus Image Sequences and Baseline Refraction Data$

Abstract:Childhood myopia constitutes a significant global health concern. It exhibits an escalating prevalence and has the potential to evolve into severe, irreversible conditions that detrimentally impact familial well-being and create substantial economic costs. Contemporary research underscores the importance of precisely predicting myopia progression to enable timely and effective interventions, thereby averting severe visual impairment in children. Such predictions predominantly rely on subjective clinical assessments, which are inherently biased and resource-intensive, thus hindering their widespread application. In this study, we introduce a novel, high-accuracy method for quantitatively predicting the myopic trajectory and myopia risk in children using only fundus images and baseline refraction data. This approach was validated through a six-year longitudinal study of 3,408 children in Henan, utilizing 16,211 fundus images and corresponding refractive data. Our method based on deep learning demonstrated predictive accuracy with an error margin of 0.311D per year and AUC scores of 0.944 and 0.995 for forecasting the risks of developing myopia and high myopia, respectively. These findings confirm the utility of our model in supporting early intervention strategies and in significantly reducing healthcare costs, particularly by obviating the need for additional metadata and repeated consultations. Furthermore, our method was designed to rely only on fundus images and refractive error data, without the need for meta data or multiple inquiries from doctors, strongly reducing the associated medical costs and facilitating large-scale screening. Our model can even provide good predictions based on only a single time measurement. Consequently, the proposed method is an important means to reduce medical inequities caused by economic disparities.

Via

Access Paper or Ask Questions

Improved Bounds for Pure Private Agnostic Learning: Item-Level and User-Level Privacy

Jul 30, 2024

Bo Li, Wei Wang, Peng Ye

Figure 1 for Improved Bounds for Pure Private Agnostic Learning: Item-Level and User-Level Privacy

Abstract:Machine Learning has made remarkable progress in a wide range of fields. In many scenarios, learning is performed on datasets involving sensitive information, in which privacy protection is essential for learning algorithms. In this work, we study pure private learning in the agnostic model -- a framework reflecting the learning process in practice. We examine the number of users required under item-level (where each user contributes one example) and user-level (where each user contributes multiple examples) privacy and derive several improved upper bounds. For item-level privacy, our algorithm achieves a near optimal bound for general concept classes. We extend this to the user-level setting, rendering a tighter upper bound than the one proved by Ghazi et al. (2023). Lastly, we consider the problem of learning thresholds under user-level privacy and present an algorithm with a nearly tight user complexity.

Via

Access Paper or Ask Questions