Chen Qiu

Learning Versatile Humanoid Manipulation with Touch Dreaming

Apr 14, 2026

Yaru Niu, Zhenlong Fang, Binghong Chen, Shuai Zhou, Revanth Senthilkumaran, Hao Zhang, Bingqing Chen, Chen Qiu, H. Eric Tseng, Jonathan Francis(+1 more)

Abstract:Humanoid robots promise general-purpose assistance, yet real-world humanoid loco-manipulation remains challenging because it requires whole-body stability, dexterous hands, and contact-aware perception under frequent contact changes. In this work, we study dexterous, contact-rich humanoid loco-manipulation. We first develop an RL-based whole-body controller that provides stable lower-body and torso execution during complex manipulation. Built on this controller, we develop a whole-body humanoid data collection system that combines VR-based teleoperation with human-to-humanoid motion mapping, enabling efficient collection of real-world demonstrations. We then propose Humanoid Transformer with Touch Dreaming (HTD), a multimodal encoder-decoder Transformer that models touch as a core modality alongside multi-view vision and proprioception. HTD is trained in a single stage with behavioral cloning augmented by touch dreaming: in addition to predicting action chunks, the policy predicts future hand-joint forces and future tactile latents, encouraging the shared Transformer trunk to learn contact-aware representations for dexterous interaction. Across five contact-rich tasks, Insert-T, Book Organization, Towel Folding, Cat Litter Scooping, and Tea Serving, HTD achieves a 90.9% relative improvement in average success rate over the stronger baseline. Ablation results further show that latent-space tactile prediction is more effective than raw tactile prediction, yielding a 30% relative gain in success rate. These results demonstrate that combining robust whole-body execution, scalable humanoid data collection, and predictive touch-centered learning enables versatile, high-dexterity humanoid manipulation in the real world. Project webpage: humanoid-touch-dream.github.io.

Delta6: A Low-Cost, 6-DOF Force-Sensing Flexible End-Effector

Apr 07, 2026

Yue Feng, Weicheng Huang, Chen Qiu, Huixu Dong, I-Ming Chen

Abstract:This paper presents Delta6, a low-cost, six-degree-of-freedom (6-DOF) force/torque end-effector that combines antagonistic springs with magnetic encoders to deliver accurate wrench sensing while remaining as simple to assemble as flat-pack furniture. A fully 3D-printed prototype, assembled entirely from off-the-shelf parts, withstands peak forces above ±14.4 N and torques of ±0.33 N·m per axis; these limits can be further extended by leveraging the proposed parametric analytical model. Without calibration, Delta6 attains a 99th-percentile error of 7% full scale (FS). With lightweight sequence models, the error is reduced to 3.8% FS by the best-performing network. Benchmarks on multiple computing platforms confirm that the device's bandwidth is adjustable, enabling balanced trade-offs among update rate, accuracy, and cost, while durability, thermal drift, and zero-calibration tests confirm its robustness. With Delta6 mounted on a robot arm governed by a force-impedance controller, the system successfully performs two contact-rich tasks: buffing curved surfaces and tight assemblies. Experiments validate the design, showing that Delta6 is a robust, low-cost alternative to existing 6-DOF force sensing solutions. Open-source site: https://wings-robotics.github.io/delta6

* This work has been submitted to the IEEE for possible publication

VQ-VAE Based Digital Semantic Communication with Importance-Aware OFDM Transmission

Aug 12, 2025

Ming Lyu, Hao Chen, Dan Wang, Chen Qiu, Guangyin Feng, Nan Ma, Xiaodong Xu

Abstract:Semantic communication (SemCom) significantly reduces redundant data and improves transmission efficiency by extracting the latent features of information. However, most conventional deep learning-based SemCom systems focus on analog transmission and lack compatibility with practical digital communications. This paper proposes a vector quantized-variational autoencoder (VQ-VAE) based digital SemCom system that directly transmits the semantic features and incorporates importance-aware orthogonal frequency division multiplexing (OFDM) transmission to enhance SemCom performance, where the VQ-VAE generates a discrete codebook shared between the transmitter and receiver. At the transmitter, the latent semantic features are first extracted by the VQ-VAE, and the shared codebook is then used to match these features, which are subsequently converted into a discrete form suited to digital transmission. To protect the semantic information, an importance-aware OFDM transmission strategy is proposed that allocates the key features near the OFDM reference signals, where the feature importance is derived from a gradient-based method. At the receiver, the features are rematched with the shared codebook to further correct errors. Finally, experimental results demonstrate that our proposed scheme outperforms the conventional DeepSC and achieves better reconstruction performance in the low-SNR region.

* 6 pages, 5 figures, conference
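
The codebook matching described in the abstract above can be illustrated with a minimal sketch. It assumes nearest-neighbour matching under squared Euclidean distance, which is the usual VQ-VAE choice but is not confirmed by the abstract; the function names, the toy codebook, and the feature values are all hypothetical and not taken from the paper.

```python
# Hypothetical sketch: nearest-codeword matching against a shared codebook.
# Squared Euclidean distance is an assumption, not a detail from the paper.
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: List[Vector], codebook: List[Vector]) -> List[int]:
    """Map each feature vector to the index of its nearest codeword."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


def dequantize(indices: List[int], codebook: List[Vector]) -> List[Vector]:
    """Look up codewords by index, as a receiver holding the same codebook would."""
    return [codebook[i] for i in indices]


# Toy codebook shared by transmitter and receiver (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]
features = [(0.9, 1.2), (-0.1, 0.2), (-0.8, 0.7)]

indices = quantize(features, codebook)         # discrete symbols to transmit
reconstructed = dequantize(indices, codebook)  # receiver-side lookup
```

Only the integer indices need to be sent over the digital link in this sketch; applying `quantize` again to noisy received features would play the role of the receiver-side rematching step.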

Reference Signal-Based Waveform Design for Integrated Sensing and Communications System

Nov 12, 2024

Ming Lyu, Hao Chen, Dan Wang, Guangyin Feng, Chen Qiu, Xiaodong Xu

Abstract:Integrated sensing and communications (ISAC), as one of the key technologies, is capable of supporting high-speed communication and high-precision sensing for the upcoming 6G. This paper studies a waveform strategy by designing the orthogonal frequency division multiplexing (OFDM)-based reference signal (RS) for sensing and communication in an ISAC system. We derive closed-form expressions of the Cramér-Rao Bound (CRB) for the distance and velocity estimations, and obtain the communication rate under the mean square error of channel estimation. Then, a weighted sum CRB minimization problem on the distance and velocity estimations is formulated by considering the communication rate requirement and RS interval constraints, which is a mixed-integer problem due to the discrete RS interval values. To solve this problem, numerical methods are typically adopted to obtain the optimal solutions, whose computational complexity grows exponentially with the number of OFDM symbols and subcarriers. Therefore, we propose a relaxation and approximation method to transform the original discrete problem into a continuous convex one and obtain sub-optimal solutions. Finally, our proposed scheme is compared with the exhaustive search method in numerical simulations, which show a slight gap between the obtained sub-optimal and optimal solutions, and this gap further decreases with a large weight factor.

* 6 pages, 4 figures

Anomaly Detection of Tabular Data Using LLMs

Jun 24, 2024

Aodong Li, Yunhan Zhao, Chen Qiu, Marius Kloft, Padhraic Smyth, Maja Rudolph, Stephan Mandt

Abstract:Large language models (LLMs) have shown their potential in long-context understanding and mathematical reasoning. In this paper, we study the problem of using LLMs to detect tabular anomalies and show that pre-trained LLMs are zero-shot batch-level anomaly detectors. That is, without extra distribution-specific model fitting, they can discover hidden outliers in a batch of data, demonstrating their ability to identify low-density data regions. For LLMs that are not well aligned with anomaly detection and frequently output factual errors, we apply simple yet effective data-generating processes to simulate synthetic batch-level anomaly detection datasets and propose an end-to-end fine-tuning strategy to bring out the potential of LLMs in detecting real anomalies. Experiments on a large anomaly detection benchmark (ODDS) showcase i) GPT-4 has on-par performance with the state-of-the-art transductive learning-based anomaly detection methods and ii) the efficacy of our synthetic dataset and fine-tuning strategy in aligning LLMs to this task.

* accepted at the Anomaly Detection with Foundation Models workshop

Uncertainty-aware Evaluation of Auxiliary Anomalies with the Expected Anomaly Posterior

May 22, 2024

Lorenzo Perini, Maja Rudolph, Sabrina Schmedding, Chen Qiu

Abstract:Anomaly detection is the task of identifying examples that do not behave as expected. Because anomalies are rare and unexpected events, collecting real anomalous examples is often challenging in several applications. In addition, learning an anomaly detector with limited (or no) anomalies often yields poor prediction performance. One option is to employ auxiliary synthetic anomalies to improve the model training. However, synthetic anomalies may be of poor quality: anomalies that are unrealistic or indistinguishable from normal samples may deteriorate the detector's performance. Unfortunately, no existing methods quantify the quality of auxiliary anomalies. We fill in this gap and propose the expected anomaly posterior (EAP), an uncertainty-based score function that measures the quality of auxiliary anomalies by quantifying the total uncertainty of an anomaly detector. Experimentally on 40 benchmark datasets of images and tabular data, we show that EAP outperforms 12 adapted data quality estimators in the majority of cases.

Model Selection of Anomaly Detectors in the Absence of Labeled Validation Data

Oct 16, 2023

Clement Fung, Chen Qiu, Aodong Li, Maja Rudolph

Abstract:Anomaly detection requires detecting abnormal samples in large unlabeled datasets. While progress in deep learning and the advent of foundation models have produced powerful unsupervised anomaly detection methods, their deployment in practice is often hindered by the lack of labeled data -- without it, the detection accuracy of an anomaly detector cannot be evaluated reliably. In this work, we propose a general-purpose framework for evaluating image-based anomaly detectors with synthetically generated validation data. Our method assumes access to a small support set of normal images which are processed with a pre-trained diffusion model (our proposed method requires no training or fine-tuning) to produce synthetic anomalies. When mixed with normal samples from the support set, the synthetic anomalies create detection tasks that compose a validation framework for anomaly detection evaluation and model selection. In an extensive empirical study, ranging from natural images to industrial applications, we find that our synthetic validation framework selects the same models and hyper-parameters as selection with a ground-truth validation set. In addition, we find that prompts selected by our method for CLIP-based anomaly detection outperform all other prompt selection strategies and lead to the overall best detection accuracy, even on the challenging MVTec-AD dataset.

* 16 pages

Text-driven Prompt Generation for Vision-Language Models in Federated Learning

Oct 09, 2023

Chen Qiu, Xingyu Li, Chaithanya Kumar Mummadi, Madan Ravi Ganesh, Zhenzhen Li, Lu Peng, Wan-Yi Lin

Abstract:Prompt learning for vision-language models, e.g., CoOp, has shown great success in adapting CLIP to different downstream tasks, making it a promising solution for federated learning due to computational reasons. Existing prompt learning techniques replace hand-crafted text prompts with learned vectors that offer improvements on seen classes, but struggle to generalize to unseen classes. Our work addresses this challenge by proposing Federated Text-driven Prompt Generation (FedTPG), which learns a unified prompt generation network across multiple remote clients in a scalable manner. The prompt generation network is conditioned on task-related text input and is thus context-aware, making it suitable for generalizing to both seen and unseen classes. Our comprehensive empirical evaluations on nine diverse image classification datasets show that our method is superior to existing federated prompt learning methods, achieving better overall generalization on both seen and unseen classes while also generalizing to unseen datasets.

Data Curation for Image Captioning with Text-to-Image Generative Models

May 05, 2023

Wenyan Li, Jonas F. Lotz, Chen Qiu, Desmond Elliott

Abstract:Recent advances in image captioning are mainly driven by large-scale vision-language pretraining, relying heavily on computational resources and increasingly large multimodal datasets. Instead of scaling up pretraining data, we ask whether it is possible to improve performance by improving the quality of the samples in existing datasets. We pursue this question through two approaches to data curation: one that assumes that some examples should be avoided due to mismatches between the image and caption, and one that assumes that the mismatch can be addressed by replacing the image, for which we use the state-of-the-art Stable Diffusion model. These approaches are evaluated using the BLIP model on MS COCO and Flickr30K in both finetuning and few-shot learning settings. Our simple yet effective approaches consistently outperform baselines, indicating that better image captioning models can be trained by curating existing resources. Finally, we conduct a human study to understand the errors made by the Stable Diffusion model and highlight directions for future work in text-to-image generation.

Theoretical Model Construction of Deformation-Force for Soft Grippers Part I: Co-rotational Modeling and Force Control for Design Optimization

Mar 23, 2023

Huixu Dong, Haotian Guo, Sihao Yang, Chen Qiu, Jiansheng Dai, I-Ming Chen

Abstract:Compliant grippers, owing to adaptivity and safety, have attracted considerable attention for unstructured grasping in real applications, such as industrial or logistic scenarios. However, accurate construction of the mathematical model depicting the bidirectional relationship between shape deformation and contact force for such grippers, such as the Fin-Ray grippers, remains stagnant to date. To address this research gap, this article devises, presents, and experimentally validates a universal bidirectional force-displacement mathematical model for compliant grippers based on the co-rotational concept, which endows such grippers with an intrinsic force sensing capability and offers better insight into design optimization. In Part 1 of the article, we introduce the fundamental theory of the co-rotational approach, where arbitrary large deformation of beam elements can be modeled. Its intrinsic principle enables the theoretical modeling to consider various types of configurations and key design parameters with very few assumptions made. Further, a force control algorithm is proposed, providing accurate displacement estimations of the gripper under external forces with minor computational loads. The performance of the proposed method is experimentally verified through comparison with Finite Element Analysis, where the influence of four key design parameters on the gripper's performance is investigated, facilitating systematic design optimization. Part 2 of this article, which demonstrates the force sensing capabilities and the effects of representative co-rotational modeling parameters on model accuracy, is released on Google Drive.
