Hengjun Pu

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Aug 25, 2025
Via arXiv

Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy

Mar 25, 2025
Via arXiv

Diffusion Transformer Policy

Oct 21, 2024
Via arXiv

FAAC: Facial Animation Generation with Anchor Frame and Conditional Control for Superior Fidelity and Editability

Dec 20, 2023
Via arXiv

RepViT: Revisiting Mobile CNN From ViT Perspective

Jul 27, 2023
Via arXiv