Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Songyang Zhang

University of Louisiana at Lafayette, USA

Efficient Transmission of Radiomaps via Physics-Enhanced Semantic Communications

Jan 18, 2025

Yueling Zhou, Achintha Wijesinghe, Yue Wang, Songyang Zhang, Zhipeng Cai

Figure 1 for Efficient Transmission of Radiomaps via Physics-Enhanced Semantic Communications

Figure 2 for Efficient Transmission of Radiomaps via Physics-Enhanced Semantic Communications

Figure 3 for Efficient Transmission of Radiomaps via Physics-Enhanced Semantic Communications

Figure 4 for Efficient Transmission of Radiomaps via Physics-Enhanced Semantic Communications

Abstract:Enriching information of spectrum coverage, radiomap plays an important role in many wireless communication applications, such as resource allocation and network optimization. To enable real-time, distributed spectrum management, particularly in the scenarios with unstable and dynamic environments, the efficient transmission of spectrum coverage information for radiomaps from edge devices to the central server emerges as a critical problem. In this work, we propose an innovative physics-enhanced semantic communication framework tailored for efficient radiomap transmission based on generative learning models. Specifically, instead of bit-wise message passing, we only transmit the key "semantics" in radiomaps characterized by the radio propagation behavior and surrounding environments, where semantic compression schemes are utilized to reduce the communication overhead. Incorporating the novel concepts of Radio Depth Maps, the radiomaps are reconstructed from the delivered semantic information backboned on the conditional generative adversarial networks. Our framework is further extended to facilitate its implementation in the scenarios of multi-user edge computing, by integrating with federated learning for collaborative model training while preserving the data privacy. Experimental results show that our approach achieves high accuracy in radio coverage information recovery at ultra-high bandwidth efficiency, which has great potentials in many wireless-generated data transmission applications.

* To appear in 2025 IEEE International Conference on Communications

Via

Access Paper or Ask Questions

DualGFL: Federated Learning with a Dual-Level Coalition-Auction Game

Dec 20, 2024

Xiaobing Chen, Xiangwei Zhou, Songyang Zhang, Mingxuan Sun

Abstract:Despite some promising results in federated learning using game-theoretical methods, most existing studies mainly employ a one-level game in either a cooperative or competitive environment, failing to capture the complex dynamics among participants in practice. To address this issue, we propose DualGFL, a novel Federated Learning framework with a Dual-level Game in cooperative-competitive environments. DualGFL includes a lower-level hedonic game where clients form coalitions and an upper-level multi-attribute auction game where coalitions bid for training participation. At the lower-level DualGFL, we introduce a new auction-aware utility function and propose a Pareto-optimal partitioning algorithm to find a Pareto-optimal partition based on clients' preference profiles. At the upper-level DualGFL, we formulate a multi-attribute auction game with resource constraints and derive equilibrium bids to maximize coalitions' winning probabilities and profits. A greedy algorithm is proposed to maximize the utility of the central server. Extensive experiments on real-world datasets demonstrate DualGFL's effectiveness in improving both server utility and client utility.

* 12 pages, 6 figures. Accepted by AAAI25

Via

Access Paper or Ask Questions

LaMI-GO: Latent Mixture Integration for Goal-Oriented Communications Achieving High Spectrum Efficiency

Dec 18, 2024

Achintha Wijesinghe, Suchinthaka Wanninayaka, Weiwei Wang, Yu-Chieh Chao, Songyang Zhang, Zhi Ding

Figure 1 for LaMI-GO: Latent Mixture Integration for Goal-Oriented Communications Achieving High Spectrum Efficiency

Figure 2 for LaMI-GO: Latent Mixture Integration for Goal-Oriented Communications Achieving High Spectrum Efficiency

Figure 3 for LaMI-GO: Latent Mixture Integration for Goal-Oriented Communications Achieving High Spectrum Efficiency

Figure 4 for LaMI-GO: Latent Mixture Integration for Goal-Oriented Communications Achieving High Spectrum Efficiency

Abstract:The recent rise of semantic-style communications includes the development of goal-oriented communications (GOCOMs) remarkably efficient multimedia information transmissions. The concept of GO-COMS leverages advanced artificial intelligence (AI) tools to address the rising demand for bandwidth efficiency in applications, such as edge computing and Internet-of-Things (IoT). Unlike traditional communication systems focusing on source data accuracy, GO-COMs provide intelligent message delivery catering to the special needs critical to accomplishing downstream tasks at the receiver. In this work, we present a novel GO-COM framework, namely LaMI-GO that utilizes emerging generative AI for better quality-of-service (QoS) with ultra-high communication efficiency. Specifically, we design our LaMI-GO system backbone based on a latent diffusion model followed by a vector-quantized generative adversarial network (VQGAN) for efficient latent embedding and information representation. The system trains a common feature codebook the receiver side. Our experimental results demonstrate substantial improvement in perceptual quality, accuracy of downstream tasks, and bandwidth consumption over the state-of-the-art GOCOM systems and establish the power of our proposed LaMI-GO communication framework.

* Under review

Via

Access Paper or Ask Questions

Are Your LLMs Capable of Stable Reasoning?

Dec 17, 2024

Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, Kai Chen

Figure 1 for Are Your LLMs Capable of Stable Reasoning?

Figure 2 for Are Your LLMs Capable of Stable Reasoning?

Figure 3 for Are Your LLMs Capable of Stable Reasoning?

Figure 4 for Are Your LLMs Capable of Stable Reasoning?

Abstract:The rapid advancement of Large Language Models (LLMs) has demonstrated remarkable progress in complex reasoning tasks. However, a significant discrepancy persists between benchmark performances and real-world applications. We identify this gap as primarily stemming from current evaluation protocols and metrics, which inadequately capture the full spectrum of LLM capabilities, particularly in complex reasoning tasks where both accuracy and consistency are crucial. This work makes two key contributions. First, we introduce G-Pass@k, a novel evaluation metric that provides a continuous assessment of model performance across multiple sampling attempts, quantifying both the model's peak performance potential and its stability. Second, we present LiveMathBench, a dynamic benchmark comprising challenging, contemporary mathematical problems designed to minimize data leakage risks during evaluation. Through extensive experiments using G-Pass@k on state-of-the-art LLMs with LiveMathBench, we provide comprehensive insights into both their maximum capabilities and operational consistency. Our findings reveal substantial room for improvement in LLMs' "realistic" reasoning capabilities, highlighting the need for more robust evaluation methods. The benchmark and detailed results are available at: https://github.com/open-compass/GPassK.

* Preprint

Via

Access Paper or Ask Questions

Diff-GO$^\text{n}$: Enhancing Diffusion Models for Goal-Oriented Communications

Dec 09, 2024

Suchinthaka Wanninayaka, Achintha Wijesinghe, Weiwei Wang, Yu-Chieh Chao, Songyang Zhang, Zhi Ding

$Figure 1 for Diff-GO$^\text{n}$: Enhancing Diffusion Models for Goal-Oriented Communications$

$Figure 2 for Diff-GO$^\text{n}$: Enhancing Diffusion Models for Goal-Oriented Communications$

$Figure 3 for Diff-GO$^\text{n}$: Enhancing Diffusion Models for Goal-Oriented Communications$

$Figure 4 for Diff-GO$^\text{n}$: Enhancing Diffusion Models for Goal-Oriented Communications$

Abstract:The rapid expansion of edge devices and Internet-of-Things (IoT) continues to heighten the demand for data transport under limited spectrum resources. The goal-oriented communications (GO-COM), unlike traditional communication systems designed for bit-level accuracy, prioritizes more critical information for specific application goals at the receiver. To improve the efficiency of generative learning models for GO-COM, this work introduces a novel noise-restricted diffusion-based GO-COM (Diff-GO$^\text{n}$) framework for reducing bandwidth overhead while preserving the media quality at the receiver. Specifically, we propose an innovative Noise-Restricted Forward Diffusion (NR-FD) framework to accelerate model training and reduce the computation burden for diffusion-based GO-COMs by leveraging a pre-sampled pseudo-random noise bank (NB). Moreover, we design an early stopping criterion for improving computational efficiency and convergence speed, allowing high-quality generation in fewer training steps. Our experimental results demonstrate superior perceptual quality of data transmission at a reduced bandwidth usage and lower computation, making Diff-GO$^\text{n}$ well-suited for real-time communications and downstream applications.

Via

Access Paper or Ask Questions

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Oct 21, 2024

Maosong Cao, Alexander Lam, Haodong Duan, Hongwei Liu, Songyang Zhang, Kai Chen

Figure 1 for CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Figure 2 for CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Figure 3 for CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Figure 4 for CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Abstract:Efficient and accurate evaluation is crucial for the continuous improvement of large language models (LLMs). Among various assessment methods, subjective evaluation has garnered significant attention due to its superior alignment with real-world usage scenarios and human preferences. However, human-based evaluations are costly and lack reproducibility, making precise automated evaluators (judgers) vital in this process. In this report, we introduce \textbf{CompassJudger-1}, the first open-source \textbf{all-in-one} judge LLM. CompassJudger-1 is a general-purpose LLM that demonstrates remarkable versatility. It is capable of: 1. Performing unitary scoring and two-model comparisons as a reward model; 2. Conducting evaluations according to specified formats; 3. Generating critiques; 4. Executing diverse tasks like a general LLM. To assess the evaluation capabilities of different judge models under a unified setting, we have also established \textbf{JudgerBench}, a new benchmark that encompasses various subjective evaluation tasks and covers a wide range of topics. CompassJudger-1 offers a comprehensive solution for various evaluation tasks while maintaining the flexibility to adapt to diverse requirements. Both CompassJudger and JudgerBench are released and available to the research community athttps://github.com/open-compass/CompassJudger. We believe that by open-sourcing these tools, we can foster collaboration and accelerate progress in LLM evaluation methodologies.

* Technical Report, Code and Models: https://github.com/open-compass/CompassJudger

Via

Access Paper or Ask Questions

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Oct 16, 2024

Jingming Zhuo, Songyang Zhang, Xinyu Fang, Haodong Duan, Dahua Lin, Kai Chen

Figure 1 for ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Figure 2 for ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Figure 3 for ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Figure 4 for ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Abstract:Large language models (LLMs) have demonstrated impressive capabilities across various tasks, but their performance is highly sensitive to the prompts utilized. This variability poses challenges for accurate assessment and user satisfaction. Current research frequently overlooks instance-level prompt variations and their implications on subjective evaluations. To address these shortcomings, we introduce ProSA, a framework designed to evaluate and comprehend prompt sensitivity in LLMs. ProSA incorporates a novel sensitivity metric, PromptSensiScore, and leverages decoding confidence to elucidate underlying mechanisms. Our extensive study, spanning multiple tasks, uncovers that prompt sensitivity fluctuates across datasets and models, with larger models exhibiting enhanced robustness. We observe that few-shot examples can alleviate this sensitivity issue, and subjective evaluations are also susceptible to prompt sensitivities, particularly in complex, reasoning-oriented tasks. Furthermore, our findings indicate that higher model confidence correlates with increased prompt robustness. We believe this work will serve as a helpful tool in studying prompt sensitivity of LLMs. The project is released at: https://github.com/open-compass/ProSA .

* EMNLP 2024, Findings

Via

Access Paper or Ask Questions

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Sep 24, 2024

Haoran Que, Feiyu Duan, Liqun He, Yutao Mou, Wangchunshu Zhou, Jiaheng Liu, Wenge Rong, Zekun Moore Wang, Jian Yang, Ge Zhang(+4 more)

Figure 1 for HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Figure 2 for HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Figure 3 for HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Figure 4 for HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Abstract:In recent years, Large Language Models (LLMs) have demonstrated remarkable capabilities in various tasks (e.g., long-context understanding), and many benchmarks have been proposed. However, we observe that long text generation capabilities are not well investigated. Therefore, we introduce the Hierarchical Long Text Generation Benchmark (HelloBench), a comprehensive, in-the-wild, and open-ended benchmark to evaluate LLMs' performance in generating long text. Based on Bloom's Taxonomy, HelloBench categorizes long text generation tasks into five subtasks: open-ended QA, summarization, chat, text completion, and heuristic text generation. Besides, we propose Hierarchical Long Text Evaluation (HelloEval), a human-aligned evaluation method that significantly reduces the time and effort required for human evaluation while maintaining a high correlation with human evaluation. We have conducted extensive experiments across around 30 mainstream LLMs and observed that the current LLMs lack long text generation capabilities. Specifically, first, regardless of whether the instructions include explicit or implicit length constraints, we observe that most LLMs cannot generate text that is longer than 4000 words. Second, we observe that while some LLMs can generate longer text, many issues exist (e.g., severe repetition and quality degradation). Third, to demonstrate the effectiveness of HelloEval, we compare HelloEval with traditional metrics (e.g., ROUGE, BLEU, etc.) and LLM-as-a-Judge methods, which show that HelloEval has the highest correlation with human evaluation. We release our code in https://github.com/Quehry/HelloBench.

Via

Access Paper or Ask Questions

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios

Aug 30, 2024

Baichuan Zhou, Haote Yang, Dairong Chen, Junyan Ye, Tianyi Bai, Jinhua Yu, Songyang Zhang, Dahua Lin, Conghui He, Weijia Li

Abstract:Recent evaluations of Large Multimodal Models (LMMs) have explored their capabilities in various domains, with only few benchmarks specifically focusing on urban environments. Moreover, existing urban benchmarks have been limited to evaluating LMMs with basic region-level urban tasks under singular views, leading to incomplete evaluations of LMMs' abilities in urban environments. To address these issues, we present UrBench, a comprehensive benchmark designed for evaluating LMMs in complex multi-view urban scenarios. UrBench contains 11.6K meticulously curated questions at both region-level and role-level that cover 4 task dimensions: Geo-Localization, Scene Reasoning, Scene Understanding, and Object Understanding, totaling 14 task types. In constructing UrBench, we utilize data from existing datasets and additionally collect data from 11 cities, creating new annotations using a cross-view detection-matching method. With these images and annotations, we then integrate LMM-based, rule-based, and human-based methods to construct large-scale high-quality questions. Our evaluations on 21 LMMs show that current LMMs struggle in the urban environments in several aspects. Even the best performing GPT-4o lags behind humans in most tasks, ranging from simple tasks such as counting to complex tasks such as orientation, localization and object attribute recognition, with an average performance gap of 17.4%. Our benchmark also reveals that LMMs exhibit inconsistent behaviors with different urban views, especially with respect to understanding cross-view relations. UrBench datasets and benchmark results will be publicly available at https://opendatalab.github.io/UrBench/.

* 9 pages, 6 figures

Via

Access Paper or Ask Questions

LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models

Jul 22, 2024

Xi Chen, Songyang Zhang, Qibing Bai, Kai Chen, Satoshi Nakamura

Figure 1 for LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models

Figure 2 for LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models

Figure 3 for LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models

Figure 4 for LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models

Abstract:We introduces LLaST, a framework for building high-performance Large Language model based Speech-to-text Translation systems. We address the limitations of end-to-end speech translation(E2E ST) models by exploring model architecture design and optimization techniques tailored for LLMs. Our approach includes LLM-based speech translation architecture design, ASR-augmented training, multilingual data augmentation, and dual-LoRA optimization. Our approach demonstrates superior performance on the CoVoST-2 benchmark and showcases exceptional scaling capabilities powered by LLMs. We believe this effective method will serve as a strong baseline for speech translation and provide insights for future improvements of the LLM-based speech translation framework. We release the data, code and models in https://github.com/openaudiolab/LLaST.

Via

Access Paper or Ask Questions