Yu Zhou

National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; Fanyu AI Laboratory, Zhongke Fanyu Technology Co., Ltd, Beijing, China

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025

InfinitePOD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers

Feb 07, 2025

Affine Frequency Division Multiplexing: Extending OFDM for Scenario-Flexibility and Resilience

Feb 07, 2025

Beyond Flat Text: Dual Self-inherited Guidance for Visual Text Generation

Jan 10, 2025

Char-SAM: Turning Segment Anything Model into Scene Text Segmentation Annotator with Character-level Visual Prompts

Dec 27, 2024

Less is More: Towards Green Code Large Language Models via Unified Structural Pruning

Dec 20, 2024

LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining

Dec 19, 2024

SimADFuzz: Simulation-Feedback Fuzz Testing for Autonomous Driving Systems

Dec 18, 2024

Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues

Dec 17, 2024

Arbitrary Reading Order Scene Text Spotter with Local Semantics Guidance

Dec 13, 2024