Wei Li

Tsinghua University, Beijing, China

BRIDGES: Bridging Graph Modality and Large Language Models within EDA Tasks

Apr 07, 2025

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

Mar 30, 2025

Harmonizing Visual Representations for Unified Multimodal Understanding and Generation

Mar 27, 2025

OpenHuEval: Evaluating Large Language Model on Hungarian Specifics

Mar 27, 2025

Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency

Mar 26, 2025

ACVUBench: Audio-Centric Video Understanding Benchmark

Mar 25, 2025

ISPDiffuser: Learning RAW-to-sRGB Mappings with Texture-Aware Diffusion Models and Histogram-Guided Color Consistency

Mar 25, 2025

CCMusic: An Open and Diverse Database for Chinese Music Information Retrieval Research

Mar 24, 2025

Improving LLM Video Understanding with 16 Frames Per Second

Mar 18, 2025

Boosting Semi-Supervised Medical Image Segmentation via Masked Image Consistency and Discrepancy Learning

Mar 18, 2025