Jiahao Wang

SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs

Aug 21, 2024
Via arXiv

HBot: A Chatbot for Healthcare Applications in Traditional Chinese Medicine Based on Human Body 3D Visualization

Aug 01, 2024
Via arXiv

Power-LLaVA: Large Language and Vision Assistant for Power Transmission Line Inspection

Jul 27, 2024
Via arXiv

Fast and Continual Knowledge Graph Embedding via Incremental LoRA

Jul 08, 2024
Via arXiv

Mixture-of-Subspaces in Low-Rank Adaptation

Jun 16, 2024
Via arXiv

LOGO: Video Text Spotting with Language Collaboration and Glyph Perception Model

May 29, 2024
Via arXiv

Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast

May 23, 2024
Via arXiv

Mamba-R: Vision Mamba ALSO Needs Registers

May 23, 2024
Via arXiv

Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots

May 13, 2024
Via arXiv

OneActor: Consistent Character Generation via Cluster-Conditioned Guidance

Apr 16, 2024
Via arXiv