Zhishen Yang

Jagle: Building a Large-Scale Japanese Multimodal Post-Training Dataset for Vision-Language Models

Apr 02, 2026
Via arXiv

SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning

Jun 06, 2023
Via arXiv

Keyframe Segmentation and Positional Encoding for Video-guided Machine Translation Challenge 2020

Jun 23, 2020
Via arXiv