Picture for Jiabo Ye

Jiabo Ye

mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding

Add code
Sep 05, 2024
Viaarxiv icon

mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models

Add code
Aug 09, 2024
Viaarxiv icon

mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

Add code
Mar 19, 2024
Viaarxiv icon

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

Add code
Jan 29, 2024
Viaarxiv icon

mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model

Add code
Nov 30, 2023
Viaarxiv icon

mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

Add code
Nov 09, 2023
Viaarxiv icon

UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model

Add code
Oct 08, 2023
Viaarxiv icon

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Add code
Jul 04, 2023
Viaarxiv icon

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

Add code
Jun 07, 2023
Viaarxiv icon

mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality

Add code
Apr 27, 2023
Viaarxiv icon