Haoran Wei

Unhackable Temporal Rewarding for Scalable Video MLLMs

Feb 17, 2025
Via arXiv

PerPO: Perceptual Preference Optimization via Discriminative Rewarding

Feb 05, 2025
Via arXiv

SEF-PNet: Speaker Encoder-Free Personalized Speech Enhancement with Local and Global Contexts Aggregation

Jan 20, 2025
Via arXiv

Slow Perception: Let's Perceive Geometric Figures Step-by-step

Dec 30, 2024
Via arXiv

Qwen2.5 Technical Report

Dec 19, 2024
Via arXiv

Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion

Nov 15, 2024
Via arXiv

P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs

Nov 14, 2024
Via arXiv

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Sep 03, 2024
Via arXiv

No Re-Train, More Gain: Upgrading Backbones with Diffusion Model for Few-Shot Segmentation

Jul 23, 2024
Via arXiv

Qwen2 Technical Report

Jul 16, 2024
Via arXiv