
Xiangyu Zhang

Music Tempo Estimation on Solo Instrumental Performance

Apr 25, 2025

Step1X-Edit: A Practical Framework for General Image Editing

Apr 24, 2025

Large Language Models for Validating Network Protocol Parsers

Apr 18, 2025

Perception-R1: Pioneering Perception Policy with Reinforcement Learning

Apr 10, 2025

Perception in Reflection

Apr 09, 2025

Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation

Apr 04, 2025

Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness

Apr 02, 2025

$μ$KE: Matryoshka Unstructured Knowledge Editing of Large Language Models

Apr 01, 2025

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Mar 31, 2025

M-DocSum: Do LVLMs Genuinely Comprehend Interleaved Image-Text in Document Summarization?

Mar 27, 2025