Lin Ma

Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models

Jun 12, 2024
Via arXiv

Methodology and Real-World Applications of Dynamic Uncertain Causality Graph for Clinical Diagnosis with Explainability and Invariance

Jun 09, 2024
Via arXiv

AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning

Jun 01, 2024
Via arXiv

TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing

May 27, 2024
Via arXiv

Integer Scale: A Free Lunch for Faster Fine-grained Quantization of LLMs

May 23, 2024
Via arXiv

Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts

May 18, 2024
Via arXiv

Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning

May 13, 2024
Via arXiv

Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost

May 09, 2024
Via arXiv

Matten: Video Generation with Mamba-Attention

May 05, 2024
Via arXiv

LaSagnA: Language-based Segmentation Assistant for Complex Queries

Apr 12, 2024
Via arXiv