Han Xiao

UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents

May 27, 2025

MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

May 15, 2025

Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding

May 08, 2025

WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch

May 06, 2025

Robust Full-Space Physical Layer Security for STAR-RIS-Aided Wireless Networks: Eavesdropper with Uncertain Location and Channel

Mar 15, 2025

Fluid Antenna System Empowering 5G NR

Mar 07, 2025

ReaderLM-v2: Small Language Model for HTML to Markdown and JSON

Mar 03, 2025

AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark

Dec 17, 2024

jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images

Dec 11, 2024

MPBD-LSTM: A Predictive Model for Colorectal Liver Metastases Using Time Series Multi-phase Contrast-Enhanced CT Scans

Dec 02, 2024