Chaoyi Zhang

Scaling Law for Quantization-Aware Training

May 20, 2025
Via arXiv

Model Merging in Pre-training of Large Language Models

May 17, 2025
Via arXiv

Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding

Mar 13, 2025
Via arXiv

RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm

Feb 18, 2025
Via arXiv

Learning to Synthesize Graphics Programs for Geometric Artworks

Oct 21, 2024
Via arXiv

DeepIcon: A Hierarchical Network for Layer-wise Icon Vectorization

Oct 21, 2024
Via arXiv

Enhancing Advanced Visual Reasoning Ability of Large Language Models

Sep 21, 2024
Via arXiv

Multimodal Causal Reasoning Benchmark: Challenging Vision Large Language Models to Infer Causal Links Between Siamese Images

Aug 15, 2024
Via arXiv

Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights

Jul 16, 2024
Via arXiv

Enhancing Robustness to Noise Corruption for Point Cloud Model via Spatial Sorting and Set-Mixing Aggregation Module

Jul 15, 2024
Via arXiv