Shuo Zhang

Exploring the Capabilities of Large Multimodal Models on Dense Text

May 09, 2024
Via arXiv

BCFPL: Binary classification ConvNet based Fast Parking space recognition with Low resolution image

Apr 22, 2024
Via arXiv

Representation Learning of Tangled Key-Value Sequence Data for Early Classification

Apr 11, 2024
Via arXiv

InternLM2 Technical Report

Mar 26, 2024
Via arXiv

Latent CLAP Loss for Better Foley Sound Synthesis

Mar 18, 2024
Via arXiv

TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document

Mar 15, 2024
Via arXiv

InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

Feb 09, 2024
Via arXiv

Protein Language Model-Powered 3D Ligand Binding Site Prediction from Protein Sequence

Dec 05, 2023
Via arXiv

CoLLiE: Collaborative Training of Large Language Models in an Efficient Way

Dec 01, 2023
Via arXiv

Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Nov 24, 2023
Via arXiv