Hao Feng

TextSquare: Scaling up Text-Centric Visual Instruction Tuning

Apr 19, 2024
Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang

Progressive Multi-modal Conditional Prompt Tuning

Apr 18, 2024
Xiaoyu Qiu, Hao Feng, Yuechen Wang, Wengang Zhou, Houqiang Li

Integration of Self-Supervised BYOL in Semi-Supervised Medical Image Recognition

Apr 16, 2024
Hao Feng, Yuanzhe Jia, Ruijia Xu, Mukesh Prasad, Ali Anaissi, Ali Braytee

TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding

Apr 15, 2024
Bozhi Luan, Hao Feng, Hong Chen, Yonghui Wang, Wengang Zhou, Houqiang Li

Enhanced Radar Perception via Multi-Task Learning: Towards Refined Data for Sensor Fusion Applications

Apr 09, 2024
Huawei Sun, Hao Feng, Gianfranco Mauro, Julius Ott, Georg Stettinger, Lorenzo Servadei, Robert Wille

DeepEraser: Deep Iterative Context Mining for Generic Text Eraser

Feb 29, 2024
Hao Feng, Wendi Wang, Shaokai Liu, Jiajun Deng, Wengang Zhou, Houqiang Li

UCE-FID: Using Large Unlabeled, Medium Crowdsourced-Labeled, and Small Expert-Labeled Tweets for Foodborne Illness Detection

Dec 02, 2023
Ruofan Hu, Dongyu Zhang, Dandan Tao, Huayi Zhang, Hao Feng, Elke Rundensteiner

DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding

Nov 30, 2023
Hao Feng, Qi Liu, Hao Liu, Wengang Zhou, Houqiang Li, Can Huang

Scalable AI Generative Content for Vehicular Network Semantic Communication

Nov 23, 2023
Hao Feng, Yi Yang, Zhu Han

Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs

Nov 22, 2023
Yonghui Wang, Wengang Zhou, Hao Feng, Keyi Zhou, Houqiang Li
