Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Wei Yu

University of Toronto

Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

Apr 06, 2024

Juncheng Yang, Zuchao Li, Shuai Xie, Wei Yu, Shijun Li, Bo Du

Figure 1 for Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

Figure 2 for Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

Figure 3 for Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

Figure 4 for Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

Abstract:The chain-of-thought technique has been received well in multi-modal tasks. It is a step-by-step linear reasoning process that adjusts the length of the chain to improve the performance of generated prompts. However, human thought processes are predominantly non-linear, as they encompass multiple aspects simultaneously and employ dynamic adjustment and updating mechanisms. Therefore, we propose a novel Aggregation-Graph-of-Thought (AGoT) mechanism for soft-prompt tuning in multi-modal representation learning. The proposed AGoT models the human thought process not only as a chain but also models each step as a reasoning aggregation graph to cope with the overlooked multiple aspects of thinking in single-step reasoning. This turns the entire reasoning process into prompt aggregation and prompt flow operations. Experiments show that our multi-modal model enhanced with AGoT soft-prompting achieves good results in several tasks such as text-image retrieval, visual question answering, and image recognition. In addition, we demonstrate that it has good domain generalization performance due to better reasoning.

* This paper is accepted to LREC-COLING 2024

Via

Access Paper or Ask Questions

Theoretical Modeling and Bio-inspired Trajectory Optimization of A Multiple-locomotion Origami Robot

Mar 19, 2024

Keqi Zhu, Haotian Guo, Wei Yu, Hassen Nigatu, Tong Li, Huixu Dong

Figure 1 for Theoretical Modeling and Bio-inspired Trajectory Optimization of A Multiple-locomotion Origami Robot

Figure 2 for Theoretical Modeling and Bio-inspired Trajectory Optimization of A Multiple-locomotion Origami Robot

Figure 3 for Theoretical Modeling and Bio-inspired Trajectory Optimization of A Multiple-locomotion Origami Robot

Figure 4 for Theoretical Modeling and Bio-inspired Trajectory Optimization of A Multiple-locomotion Origami Robot

Abstract:Recent research on mobile robots has focused on increasing their adaptability to unpredictable and unstructured environments using soft materials and structures. However, the determination of key design parameters and control over these compliant robots are predominantly iterated through experiments, lacking a solid theoretical foundation. To improve their efficiency, this paper aims to provide mathematics modeling over two locomotion, crawling and swimming. Specifically, a dynamic model is first devised to reveal the influence of the contact surfaces' frictional coefficients on displacements in different motion phases. Besides, a swimming kinematics model is provided using coordinate transformation, based on which, we further develop an algorithm that systematically plans human-like swimming gaits, with maximum thrust obtained. The proposed algorithm is highly generalizable and has the potential to be applied in other soft robots with multiple joints. Simulation experiments have been conducted to illustrate the effectiveness of the proposed modeling.

* 8 pages

Via

Access Paper or Ask Questions

Meta-Learning-Based Fronthaul Compression for Cloud Radio Access Networks

Mar 13, 2024

Ruihua Qiao, Tao Jiang, Wei Yu

Figure 1 for Meta-Learning-Based Fronthaul Compression for Cloud Radio Access Networks

Figure 2 for Meta-Learning-Based Fronthaul Compression for Cloud Radio Access Networks

Figure 3 for Meta-Learning-Based Fronthaul Compression for Cloud Radio Access Networks

Figure 4 for Meta-Learning-Based Fronthaul Compression for Cloud Radio Access Networks

Abstract:This paper investigates the fronthaul compression problem in a user-centric cloud radio access network, in which single-antenna users are served by a central processor (CP) cooperatively via a cluster of remote radio heads (RRHs). To satisfy the fronthaul capacity constraint, this paper proposes a transform-compress-forward scheme, which consists of well-designed transformation matrices and uniform quantizers. The transformation matrices perform dimension reduction in the uplink and dimension expansion in the downlink. To reduce the communication overhead for designing the transformation matrices, this paper further proposes a deep learning framework to first learn a suboptimal transformation matrix at each RRH based on the local channel state information (CSI), and then to refine it iteratively. To facilitate the refinement process, we propose an efficient signaling scheme that only requires the transmission of low-dimensional effective CSI and its gradient between the CP and RRH, and further, a meta-learning based gated recurrent unit network to reduce the number of signaling transmission rounds. For the sum-rate maximization problem, simulation results show that the proposed two-stage neural network can perform close to the fully cooperative global CSI based benchmark with significantly reduced communication overhead for both the uplink and the downlink. Moreover, using the first stage alone can already outperform the existing local CSI based benchmark.

* 15 Pages, 13 Figures; accepted in IEEE Transactions on Wireless Communications

Via

Access Paper or Ask Questions

Active Sensing for Reciprocal MIMO Channels

Feb 29, 2024

Tao Jiang, Wei Yu

Abstract:This paper addresses the design of transmit precoder and receive combiner matrices to support $N_{\rm s}$ independent data streams over a time-division duplex (TDD) point-to-point massive multiple-input multiple-output (MIMO) channel with either a fully digital or a hybrid structure. The optimal precoder and combiner design amounts to finding the top-$N_{\rm s}$ singular vectors of the channel matrix, but the explicit estimation of the entire high-dimensional channel would require significant pilot overhead. Alternatively, prior works seek to find the precoding and combining matrices directly by exploiting channel reciprocity and by using the power iteration method, but its performance degrades in the low SNR regime. To tackle this challenging problem, this paper proposes a learning-based active sensing framework, where the transmitter and the receiver send pilots alternately using sensing beamformers that are actively designed as functions of previously received pilots. This is accomplished by using recurrent neural networks to summarize information from the historical observations into hidden state vectors, then using fully connected neural networks to learn the appropriate sensing beamformers in the next pilot stage and finally the transmit precoding and receive combiner matrices for data communications. Simulations demonstrate that the learning-based method outperforms existing approaches significantly and maintains superior performance even in low SNR regimes both in fully digital and hybrid MIMO scenarios.

Via

Access Paper or Ask Questions

Dual-Context Aggregation for Universal Image Matting

Feb 28, 2024

Qinglin Liu, Xiaoqian Lv, Wei Yu, Changyong Guo, Shengping Zhang

Abstract:Natural image matting aims to estimate the alpha matte of the foreground from a given image. Various approaches have been explored to address this problem, such as interactive matting methods that use guidance such as click or trimap, and automatic matting methods tailored to specific objects. However, existing matting methods are designed for specific objects or guidance, neglecting the common requirement of aggregating global and local contexts in image matting. As a result, these methods often encounter challenges in accurately identifying the foreground and generating precise boundaries, which limits their effectiveness in unforeseen scenarios. In this paper, we propose a simple and universal matting framework, named Dual-Context Aggregation Matting (DCAM), which enables robust image matting with arbitrary guidance or without guidance. Specifically, DCAM first adopts a semantic backbone network to extract low-level features and context features from the input image and guidance. Then, we introduce a dual-context aggregation network that incorporates global object aggregators and local appearance aggregators to iteratively refine the extracted context features. By performing both global contour segmentation and local boundary refinement, DCAM exhibits robustness to diverse types of guidance and objects. Finally, we adopt a matting decoder network to fuse the low-level features and the refined context features for alpha matte estimation. Experimental results on five matting datasets demonstrate that the proposed DCAM outperforms state-of-the-art matting methods in both automatic matting and interactive matting tasks, which highlights the strong universality and high performance of DCAM. The source code is available at \url{https://github.com/Windaway/DCAM}.

* Multimed Tools Appl (2023)

Via

Access Paper or Ask Questions

Hybrid Online-Offline Learning for Task Offloading in Mobile Edge Computing Systems

Feb 27, 2024

Muhammad Sohaib, Sang-Woon Jeon, Wei Yu

Figure 1 for Hybrid Online-Offline Learning for Task Offloading in Mobile Edge Computing Systems

Figure 2 for Hybrid Online-Offline Learning for Task Offloading in Mobile Edge Computing Systems

Figure 3 for Hybrid Online-Offline Learning for Task Offloading in Mobile Edge Computing Systems

Figure 4 for Hybrid Online-Offline Learning for Task Offloading in Mobile Edge Computing Systems

Abstract:We consider a multi-user multi-server mobile edge computing (MEC) system, in which users arrive on a network randomly over time and generate computation tasks, which will be computed either locally on their own computing devices or be offloaded to one of the MEC servers. Under such a dynamic network environment, we propose a novel task offloading policy based on hybrid online-offline learning, which can efficiently reduce the overall computation delay and energy consumption only with information available at nearest MEC servers from each user. We provide a practical signaling and learning framework that can train deep neural networks for both online and offline learning and can adjust its offloading policy based on the queuing status of each MEC server and network dynamics. Numerical results demonstrate that the proposed scheme significantly reduces the average computation delay for a broad class of network environments compared to the conventional offloading methods. It is further shown that the proposed hybrid online-offline learning framework can be extended to a general cost function reflecting both delay and energy-dependent metrics.

* IEEE Transactions on Wireless Communications (2023)
* accepted by IEEE Transactions on Wireless Communications

Via

Access Paper or Ask Questions

Rate-Distortion-Perception Tradeoff Based on the Conditional-Distribution Perception Measure

Jan 22, 2024

Sadaf Salehkalaibar, Jun Chen, Ashish Khisti, Wei Yu

Figure 1 for Rate-Distortion-Perception Tradeoff Based on the Conditional-Distribution Perception Measure

Figure 2 for Rate-Distortion-Perception Tradeoff Based on the Conditional-Distribution Perception Measure

Abstract:We study the rate-distortion-perception (RDP) tradeoff for a memoryless source model in the asymptotic limit of large block-lengths. Our perception measure is based on a divergence between the distributions of the source and reconstruction sequences conditioned on the encoder output, which was first proposed in [1], [2]. We consider the case when there is no shared randomness between the encoder and the decoder. For the case of discrete memoryless sources we derive a single-letter characterization of the RDP function, thus settling a problem that remains open for the marginal metric introduced in Blau and Michaeli [3] (with no shared randomness). Our achievability scheme is based on lossy source coding with a posterior reference map proposed in [4]. For the case of continuous valued sources under squared error distortion measure and squared quadratic Wasserstein perception measure we also derive a single-letter characterization and show that a noise-adding mechanism at the decoder suffices to achieve the optimal representation. For the case of zero perception loss, we show that our characterization interestingly coincides with the results for the marginal metric derived in [5], [6] and again demonstrate that zero perception loss can be achieved with a $3$-dB penalty in the minimum distortion. Finally we specialize our results to the case of Gaussian sources. We derive the RDP function for vector Gaussian sources and propose a waterfilling type solution. We also partially characterize the RDP function for a mixture of vector Gaussians.

Via

Access Paper or Ask Questions

A Survey of Advances in Optimization Methods for Wireless Communication System Design

Jan 22, 2024

Ya-Feng Liu, Tsung-Hui Chang, Mingyi Hong, Zheyu Wu, Anthony Man-Cho So, Eduard A. Jorswieck, Wei Yu

Abstract:Mathematical optimization is now widely regarded as an indispensable modeling and solution tool for the design of wireless communications systems. While optimization has played a significant role in the revolutionary progress in wireless communication and networking technologies from 1G to 5G and onto the future 6G, the innovations in wireless technologies have also substantially transformed the nature of the underlying mathematical optimization problems upon which the system designs are based and have sparked significant innovations in the development of methodologies to understand, to analyze, and to solve those problems. In this paper, we provide a comprehensive survey of recent advances in mathematical optimization theory and algorithms for wireless communication system design. We begin by illustrating common features of mathematical optimization problems arising in wireless communication system design. We discuss various scenarios and use cases and their associated mathematical structures from an optimization perspective. We then provide an overview of recent advances in mathematical optimization theory and algorithms, from nonconvex optimization, global optimization, and integer programming, to distributed optimization and learning-based optimization. The key to successful solution of mathematical optimization problems is in carefully choosing and/or developing suitable optimization algorithms (or neural network architectures) that can exploit the underlying problem structure. We conclude the paper by identifying several open research challenges and outlining future research directions.

* 47 pages, 10 figures, submitted for possible publication

Via

Access Paper or Ask Questions

Localization with Reconfigurable Intelligent Surface: An Active Sensing Approach

Dec 15, 2023

Zhongze Zhang, Tao Jiang, Wei Yu

Abstract:This paper addresses an uplink localization problem in which a base station (BS) aims to locate a remote user with the help of reconfigurable intelligent surfaces (RISs). We propose a strategy in which the user transmits pilots sequentially and the BS adaptively adjusts the sensing vectors, including the BS beamforming vector and multiple RIS reflection coefficients based on the observations already made, to eventually produce an estimated user position. This is a challenging active sensing problem for which finding an optimal solution involves searching through a complicated functional space whose dimension increases with the number of measurements. We show that the long short-term memory (LSTM) network can be used to exploit the latent temporal correlation between measurements to automatically construct scalable state vectors. Subsequently, the state vector is mapped to the sensing vectors for the next time frame via a deep neural network (DNN). A final DNN is used to map the state vector to the estimated user position. Numerical result illustrates the advantage of the active sensing design as compared to non-active sensing methods. The proposed solution produces interpretable results and is generalizable in the number of sensing stages. Remarkably, we show that a network with one BS and multiple RISs can outperform a comparable setting with multiple BSs.

* Accepted in IEEE Transactions on Wireless Communications. This is an extended version of the previous arXiv paper arXiv:2310.13160

Via

Access Paper or Ask Questions

Levenshtein Distance Embedding with Poisson Regression for DNA Storage

Dec 13, 2023

Xiang Wei, Alan J. X. Guo, Sihan Sun, Mengyi Wei, Wei Yu

Figure 1 for Levenshtein Distance Embedding with Poisson Regression for DNA Storage

Figure 2 for Levenshtein Distance Embedding with Poisson Regression for DNA Storage

Figure 3 for Levenshtein Distance Embedding with Poisson Regression for DNA Storage

Figure 4 for Levenshtein Distance Embedding with Poisson Regression for DNA Storage

Abstract:Efficient computation or approximation of Levenshtein distance, a widely-used metric for evaluating sequence similarity, has attracted significant attention with the emergence of DNA storage and other biological applications. Sequence embedding, which maps Levenshtein distance to a conventional distance between embedding vectors, has emerged as a promising solution. In this paper, a novel neural network-based sequence embedding technique using Poisson regression is proposed. We first provide a theoretical analysis of the impact of embedding dimension on model performance and present a criterion for selecting an appropriate embedding dimension. Under this embedding dimension, the Poisson regression is introduced by assuming the Levenshtein distance between sequences of fixed length following a Poisson distribution, which naturally aligns with the definition of Levenshtein distance. Moreover, from the perspective of the distribution of embedding distances, Poisson regression approximates the negative log likelihood of the chi-squared distribution and offers advancements in removing the skewness. Through comprehensive experiments on real DNA storage data, we demonstrate the superior performance of the proposed method compared to state-of-the-art approaches.

Via

Access Paper or Ask Questions