Chaofeng Chen

Decoupled Similarity for Task-Aware Token Pruning in Large Vision-Language Models

Apr 13, 2026
Via arXiv

Beyond the Dirac Delta: Mitigating Diversity Collapse in Reinforcement Fine-Tuning for Versatile Image Generation

Jan 18, 2026
Via arXiv

GaussianMorphing: Mesh-Guided 3D Gaussians for Semantic-Aware Object Morphing

Oct 02, 2025
Via arXiv

MVQA: Mamba with Unified Sampling for Efficient Video Quality Assessment

Apr 22, 2025
Via arXiv

Text4Seg: Reimagining Image Segmentation as Text Generation

Oct 13, 2024
Via arXiv

Combining Generative and Geometry Priors for Wide-Angle Portrait Correction

Oct 13, 2024
Via arXiv

MRSE: An Efficient Multi-modality Retrieval System for Large Scale E-commerce

Aug 27, 2024
Via arXiv

ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation

Aug 09, 2024
Via arXiv

Q-Ground: Image Quality Grounding with Large Multi-modality Models

Jul 24, 2024
Via arXiv

ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference

Jul 17, 2024
Via arXiv