Yu Wu

Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking

Jun 23, 2024

MAC: A Benchmark for Multiple Attributes Compositional Zero-Shot Learning

Jun 18, 2024

LAIP: Learning Local Alignment from Image-Phrase Modeling for Text-based Person Search

Jun 16, 2024

CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation

Jun 15, 2024

From Space-Time to Space-Order: Directly Planning a Temporal Planning Graph by Redefining CBS

Apr 23, 2024

D$^3$: Scaling Up Deepfake Detection by Learning from Discrepancy

Apr 06, 2024

Improving Bird's Eye View Semantic Segmentation by Task Decomposition

Apr 02, 2024

Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning

Apr 01, 2024

Advanced Long-Content Speech Recognition With Factorized Neural Transducer

Mar 20, 2024

Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning

Feb 18, 2024