Zhiyin Ma

Exploring the Capabilities of Large Multimodal Models on Dense Text

May 09, 2024
Via arXiv

TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document

Mar 15, 2024
Via arXiv

Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Nov 24, 2023
Via arXiv