Yuhang Cao

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Jul 03, 2024
Via arXiv

V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results

Jun 17, 2024
Via arXiv

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Apr 09, 2024
Via arXiv

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models

Feb 22, 2024
Via arXiv

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

Jan 29, 2024
Via arXiv

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition

Sep 29, 2023
Via arXiv

PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System

Sep 28, 2023
Via arXiv

DiaCorrect: Error Correction Back-end For Speaker Diarization

Sep 15, 2023
Via arXiv

V3Det: Vast Vocabulary Visual Detection Dataset

Apr 07, 2023
Via arXiv

DiaCorrect: End-to-end error correction for speaker diarization

Oct 31, 2022
Via arXiv