Kaijun Tan

M-DocSum: Do LVLMs Genuinely Comprehend Interleaved Image-Text in Document Summarization?

Mar 27, 2025

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025

CheapNET: Improving Light-weight speech enhancement network by projected loss function

Nov 27, 2023