Yang Luo

PAF-Net: Phase-Aligned Frequency Decoupling Network for Multi-Process Manufacturing Quality Prediction

Jul 30, 2025

Yang Luo, Haoyang Luan, Haoyun Pan, Yongquan Jia, Xiaofeng Gao, Guihai Chen

Abstract: Accurate quality prediction in multi-process manufacturing is critical for industrial efficiency but hindered by three core challenges: time-lagged process interactions, overlapping operations with mixed periodicity, and inter-process dependencies in shared frequency bands. To address these, we propose PAF-Net, a frequency decoupled time series prediction framework with three key innovations: (1) A phase-correlation alignment method guided by frequency domain energy to synchronize time-lagged quality series, resolving temporal misalignment. (2) A frequency independent patch attention mechanism paired with Discrete Cosine Transform (DCT) decomposition to capture heterogeneous operational features within individual series. (3) A frequency decoupled cross attention module that suppresses noise from irrelevant frequencies, focusing exclusively on meaningful dependencies within shared bands. Experiments on 4 real-world datasets demonstrate PAF-Net's superiority. It outperforms 10 well-acknowledged baselines, with 7.06% lower MSE and 3.88% lower MAE. Our code is available at https://github.com/StevenLuan904/PAF-Net-Official.
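To make the alignment idea in innovation (1) concrete, the following is a minimal sketch of plain phase correlation for estimating an integer time lag between two series. It is not the authors' implementation: the function names `estimate_lag` and `align` are hypothetical, the lag is assumed to be circular, and the frequency-domain energy guidance mentioned in the abstract is not modelled.

```python
import numpy as np

def estimate_lag(reference, delayed):
    """Estimate the integer circular lag of `delayed` relative to `reference`."""
    ref_spec = np.fft.fft(reference)
    del_spec = np.fft.fft(delayed)
    # Cross-power spectrum; dividing by its magnitude keeps only the phase.
    cross_power = del_spec * np.conj(ref_spec)
    cross_power /= np.abs(cross_power) + 1e-12
    # The inverse transform of a pure phase ramp peaks at the lag.
    correlation = np.fft.ifft(cross_power).real
    lag = int(np.argmax(correlation))
    n = len(reference)
    # Map peak positions in the upper half to negative lags.
    return lag if lag <= n // 2 else lag - n

def align(reference, delayed):
    """Shift `delayed` back by the estimated lag so it lines up with `reference`."""
    return np.roll(delayed, -estimate_lag(reference, delayed))
```

For example, `estimate_lag(x, np.roll(x, 7))` should recover a lag of 7 samples for a broadband signal `x`.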

* 7 pages, 5 figures

PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization

Jul 08, 2025

Dongsheng Zuo, Jiadong Zhu, Yang Luo, Yuzhe Ma

Abstract: Prefix adders are fundamental arithmetic circuits, but their design space grows exponentially with bit-width, posing significant optimization challenges. Previous works face limitations in performance, generalization, and scalability. To address these challenges, we propose PrefixAgent, a large language model (LLM)-powered framework that enables efficient prefix adder optimization. Specifically, PrefixAgent reformulates the problem into subtasks including backbone synthesis and structure refinement, which effectively reduces the search space. More importantly, this new design perspective enables us to efficiently collect enormous high-quality data and reasoning traces with E-graph, which in turn enables effective fine-tuning of the LLM. Experimental results show that PrefixAgent synthesizes prefix adders with consistently smaller areas compared to baseline methods, while maintaining scalability and generalization in commercial EDA flows.

Widely Linear Augmented Extreme Learning Machine Based Impairments Compensation for Satellite Communications

Jun 17, 2025

Yang Luo, Arunprakash Jayaprakash, Gaojie Chen, Chong Huang, Qu Luo, De Mi, Pei Xiao

Abstract: Satellite communications are crucial for the evolution beyond fifth-generation networks. However, the dynamic nature of satellite channels and their inherent impairments present significant challenges. In this paper, a novel post-compensation scheme that combines the complex-valued extreme learning machine with augmented hidden layer (CELMAH) architecture and widely linear processing (WLP) is developed to address these issues by exploiting signal impropriety in satellite communications. Although CELMAH shares structural similarities with WLP, it employs a different core algorithm and does not fully exploit the signal impropriety. By incorporating WLP principles, we derive a tailored formulation suited to the network structure and propose the CELM augmented by widely linear least squares (CELM-WLLS) for post-distortion. The proposed approach offers enhanced communication robustness and is highly effective for satellite communication scenarios characterized by dynamic channel conditions and non-linear impairments. CELM-WLLS is designed to improve signal recovery performance and outperform traditional methods such as least squares (LS) and minimum mean square error (MMSE). Compared to CELMAH, CELM-WLLS demonstrates approximately 0.8 dB gain in BER performance, and also achieves a two-thirds reduction in computational complexity, making it a more efficient solution.
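As background for the widely linear processing mentioned above, here is a minimal sketch of a generic widely linear least-squares fit, in which the regressor is augmented with its complex conjugate so that improper signals can be modelled. This is an illustration of the general technique only, not the CELM-WLLS formulation from the paper; the function names are hypothetical.

```python
import numpy as np

def fit_widely_linear(features, targets):
    """Least-squares fit of targets ~ features @ w1 + conj(features) @ w2.

    Returns the stacked weight vector [w1, w2].
    """
    # Augment the complex features with their conjugates.
    augmented = np.hstack([features, np.conj(features)])
    weights, *_ = np.linalg.lstsq(augmented, targets, rcond=None)
    return weights

def predict_widely_linear(features, weights):
    """Apply stacked widely linear weights to new complex features."""
    augmented = np.hstack([features, np.conj(features)])
    return augmented @ weights
```

A strictly linear fit would use `features` alone and cannot represent the conjugate term, which is why it loses information when the signal is improper.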

* 12 pages, accepted for publication in IEEE Transactions on Vehicular Technology

Info-Coevolution: An Efficient Framework for Data Model Coevolution

Jun 09, 2025

Ziheng Qin, Hailun Xu, Wei Chee Yew, Qi Jia, Yang Luo, Kanchan Sarkar, Danhui Guan, Kai Wang, Yang You

Abstract: Machine learning relies heavily on data, yet the continuous growth of real-world data poses challenges for efficient dataset construction and training. A fundamental yet unsolved question is: given our current model and data, does new data (a sample or batch) need annotation or learning? Conventional approaches retain all available data, leading to non-optimal data and training efficiency. Active learning aims to reduce data redundancy by selecting a subset of samples to annotate, but it increases pipeline complexity and introduces bias. In this work, we propose Info-Coevolution, a novel framework that efficiently enables models and data to coevolve through online selective annotation with no bias. Leveraging task-specific models (and open-source models), it selectively annotates and integrates online and web data to improve datasets efficiently. For real-world datasets like ImageNet-1K, Info-Coevolution reduces annotation and training costs by 32% without performance loss. It is able to automatically give the saving ratio without tuning the ratio. It can further reduce the annotation ratio to 50% with semi-supervised learning. We also explore retrieval-based dataset enhancement using unlabeled open-source data. Code is available at https://github.com/NUS-HPC-AI-Lab/Info-Coevolution/.

* ICML 2025
* V1

Enhance-A-Video: Better Generated Video for Free

Feb 11, 2025

Yang Luo, Xuanlei Zhao, Mengzhao Chen, Kaipeng Zhang, Wenqi Shao, Kai Wang, Zhangyang Wang, Yang You

Abstract: DiT-based video generation has achieved remarkable results, but research into enhancing existing models remains relatively unexplored. In this work, we introduce a training-free approach to enhance the coherence and quality of DiT-based generated videos, named Enhance-A-Video. The core idea is enhancing the cross-frame correlations based on non-diagonal temporal attention distributions. Thanks to its simple design, our approach can be easily applied to most DiT-based video generation frameworks without any retraining or fine-tuning. Across various DiT-based video generation models, our approach demonstrates promising improvements in both temporal consistency and visual quality. We hope this research can inspire future explorations in video generation enhancement.
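To illustrate what "non-diagonal temporal attention" refers to, the sketch below measures how much attention weight a frame-by-frame attention map places on other frames, by averaging its off-diagonal entries. This is an assumed, simplified reading of the abstract: the function name `cross_frame_intensity` is hypothetical, and how the paper turns such a measurement into an enhancement step is not shown here.

```python
import numpy as np

def cross_frame_intensity(temporal_attention):
    """Mean of the off-diagonal entries of a (frames, frames) attention map.

    Diagonal entries are a frame attending to itself; off-diagonal entries
    are attention paid to other frames.
    """
    frames = temporal_attention.shape[0]
    off_diagonal_mask = ~np.eye(frames, dtype=bool)
    return float(temporal_attention[off_diagonal_mask].mean())
```

An identity attention map (each frame attends only to itself) gives 0, while a uniform map over 4 frames gives 0.25.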

PICBench: Benchmarking LLMs for Photonic Integrated Circuits Design

Feb 05, 2025

Yuchao Wu, Xiaofei Yu, Hao Chen, Yang Luo, Yeyu Tong, Yuzhe Ma

Abstract: While large language models (LLMs) have shown remarkable potential in automating various tasks in digital chip design, the field of Photonic Integrated Circuits (PICs), a promising solution to advanced chip designs, remains relatively unexplored in this context. The design of PICs is time-consuming and prone to errors due to the extensive and repetitive nature of code involved in photonic chip design. In this paper, we introduce PICBench, the first benchmarking and evaluation framework specifically designed to automate PIC design generation using LLMs, where the generated output takes the form of a netlist. Our benchmark consists of dozens of meticulously crafted PIC design problems, spanning from fundamental device designs to more complex circuit-level designs. It automatically evaluates both the syntax and functionality of generated PIC designs by comparing simulation outputs with expert-written solutions, leveraging an open-source simulator. We evaluate a range of existing LLMs, while also conducting comparative tests on various prompt engineering techniques to enhance LLM performance in automated PIC design. The results reveal the challenges and potential of LLMs in the PIC design domain, offering insights into the key areas that require further research and development to optimize automation in this field. Our benchmark and evaluation code is available at https://github.com/PICDA/PICBench.

A Bilayer Segmentation-Recombination Network for Accurate Segmentation of Overlapping C. elegans

Nov 26, 2024

Mengqian Ding, Jun Liu, Yang Luo, Jinshan Tang

Abstract: Caenorhabditis elegans (C. elegans) is an excellent model organism because of its short lifespan and high degree of homology with human genes, and it has been widely used in a variety of human health and disease models. However, the segmentation of C. elegans remains challenging for the following reasons: 1) the activity trajectory of C. elegans is uncontrollable, and multiple nematodes often overlap, resulting in blurred boundaries of C. elegans. This makes it impossible to clearly study the life trajectory of a certain nematode; and 2) in the microscope images of overlapping C. elegans, the translucent tissues at the edges obscure each other, leading to inaccurate boundary segmentation. To solve these problems, a Bilayer Segmentation-Recombination Network (BR-Net) for the segmentation of C. elegans instances is proposed. The network consists of three parts: a Coarse Mask Segmentation Module (CMSM), a Bilayer Segmentation Module (BSM), and a Semantic Consistency Recombination Module (SCRM). The CMSM is used to extract the coarse mask, and we introduce a Unified Attention Module (UAM) in CMSM to make CMSM better aware of nematode instances. The BSM segments the aggregated C. elegans into overlapping and non-overlapping regions. This is followed by integration by the SCRM, where semantic consistency regularization is introduced to segment nematode instances more accurately. Finally, the effectiveness of the method is verified on the C. elegans dataset. The experimental results show that BR-Net exhibits good competitiveness and outperforms other recently proposed instance segmentation methods in processing C. elegans occlusion images.

AE-DENet: Enhancement for Deep Learning-based Channel Estimation in OFDM Systems

Nov 13, 2024

Ephrem Fola, Yang Luo, Chunbo Luo

Abstract: Deep learning (DL)-based methods have demonstrated remarkable achievements in addressing orthogonal frequency division multiplexing (OFDM) channel estimation challenges. However, existing DL-based methods mainly rely on separate real and imaginary inputs while ignoring the inherent correlation between the two streams, such as amplitude and phase information that are fundamental in communication signal processing. This paper proposes AE-DENet, a novel autoencoder (AE)-based data enhancement network to improve the performance of existing DL-based channel estimation methods. AE-DENet focuses on enriching the classic least squares (LS) estimation input commonly used in DL-based methods by employing a learning-based data enhancement method, which extracts interaction features from the real and imaginary components and fuses them with the original real/imaginary streams to generate an enhanced input for better channel inference. Experimental results in terms of mean square error (MSE) demonstrate that the proposed method enhances the performance of all state-of-the-art DL-based channel estimators with negligible added complexity. Furthermore, the proposed approach is shown to be robust to channel variations and high user mobility.
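For readers unfamiliar with the input that AE-DENet is said to enrich, the sketch below shows the classic LS channel estimate at pilot subcarriers and the usual split of the complex estimate into separate real and imaginary streams. It is a generic illustration under a per-subcarrier model Y = H·X + noise, not code from the paper; the function names are hypothetical and the autoencoder-based enhancement itself is not shown.

```python
import numpy as np

def ls_channel_estimate(received_pilots, transmitted_pilots):
    """Per-subcarrier least-squares estimate H_ls = Y / X at pilot positions."""
    return received_pilots / transmitted_pilots

def to_real_imag_streams(channel_estimate):
    """Stack real and imaginary parts as two real-valued input streams."""
    return np.stack([channel_estimate.real, channel_estimate.imag], axis=0)
```

In the noiseless case the LS estimate recovers the channel exactly; with noise, the estimation error scales with the noise-to-pilot power ratio, which is the gap that learned estimators try to close.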

* This paper is accepted for IEEE GLOBECOM 2024

FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process

Sep 11, 2024

Yang Luo, Yiheng Zhang, Zhaofan Qiu, Ting Yao, Zhineng Chen, Yu-Gang Jiang, Tao Mei

Abstract: The emergence of text-to-image generation models has led to the recognition that image enhancement, performed as post-processing, would significantly improve the visual quality of the generated images. Exploring diffusion models to enhance the generated images is nevertheless not trivial, and requires delicately enriching details while preserving the visual appearance of key content in the original image. In this paper, we propose a novel framework, namely FreeEnhance, for content-consistent image enhancement using off-the-shelf image diffusion models. Technically, FreeEnhance is a two-stage process that first adds random noise to the input image and then capitalizes on a pre-trained image diffusion model (i.e., Latent Diffusion Models) to denoise and enhance the image details. In the noising stage, FreeEnhance is devised to add lighter noise to regions with higher frequency to preserve the high-frequency patterns (e.g., edge, corner) in the original image. In the denoising stage, we present three target properties as constraints to regularize the predicted noise, enhancing images with high acutance and high visual quality. Extensive experiments conducted on the HPDv2 dataset demonstrate that our FreeEnhance outperforms the state-of-the-art image enhancement models in terms of quantitative metrics and human preference. More remarkably, FreeEnhance also shows higher human preference compared to the commercial image enhancement solution of Magnific AI.
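The noising stage described above (lighter noise where the image has more high-frequency content) can be sketched roughly as follows. This is a toy illustration only: it assumes a 2-D grayscale image, uses local gradient magnitude as a stand-in for "frequency", and the function name and the `max_sigma` / `min_sigma` parameters are hypothetical rather than taken from the paper.

```python
import numpy as np

def frequency_aware_noise(image, max_sigma=0.5, min_sigma=0.1, seed=0):
    """Add Gaussian noise whose strength decreases in high-detail regions.

    Returns the noisy image and the per-pixel noise scale that was used.
    """
    # Gradient magnitude as a simple proxy for local high-frequency content.
    grad_rows, grad_cols = np.gradient(image)
    detail = np.hypot(grad_rows, grad_cols)
    detail = detail / (detail.max() + 1e-12)  # normalise to [0, 1]
    # Flat regions get max_sigma; the strongest edges get min_sigma.
    sigma = max_sigma - (max_sigma - min_sigma) * detail
    rng = np.random.default_rng(seed)
    noisy = image + sigma * rng.standard_normal(image.shape)
    return noisy, sigma
```

On an image with a single vertical step edge, the returned `sigma` map is lowest along the edge and highest in the flat areas on either side.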

* ACM Multimedia 2024

Robust Domain Generalization for Multi-modal Object Recognition

Aug 11, 2024

Yuxin Qiao, Keqin Li, Junhong Lin, Rong Wei, Chufeng Jiang, Yang Luo, Haoyu Yang

Abstract: In multi-label classification, machine learning encounters the challenge of domain generalization when handling tasks with distributions differing from the training data. Existing approaches primarily focus on vision object recognition and neglect the integration of natural language. Recent advancements in vision-language pre-training leverage supervision from extensive visual-language pairs, enabling learning across diverse domains and enhancing recognition in multi-modal scenarios. However, these approaches face limitations in loss function utilization, generality across backbones, and class-aware visual fusion. This paper proposes solutions to these limitations by inferring the actual loss, broadening evaluations to larger vision-language backbones, and introducing Mixup-CLIPood, which incorporates a novel mix-up loss for enhanced class-aware visual fusion. Our method demonstrates superior performance in domain generalization across multiple datasets.
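As context for the mix-up loss mentioned above, here is a minimal sketch of standard mixup: inputs are blended with a shuffled copy of the batch, and the loss is the same convex combination of the losses against both label sets. This shows only the generic technique with single-label cross-entropy; the novel loss in Mixup-CLIPood is not reproduced, and all function names are hypothetical.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def mixup_batch(inputs, lam, permutation):
    """Blend each input with its partner chosen by `permutation`."""
    return lam * inputs + (1.0 - lam) * inputs[permutation]

def mixup_loss(logits, labels, permutation, lam):
    """Convex combination of the losses for the original and partner labels."""
    return (lam * cross_entropy(logits, labels)
            + (1.0 - lam) * cross_entropy(logits, labels[permutation]))
```

In typical use, `lam` is drawn from a Beta distribution per batch and `logits` are the model outputs for `mixup_batch(inputs, lam, permutation)`.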

* 6 pages, 2 figures. This is a preprint version of the article. The final version will be published in the proceedings of the IEEE conference
