Bin Wang

and Other Contributors

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Dec 12, 2024
Via arXiv

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

Dec 10, 2024
Via arXiv

Chimera: Improving Generalist Model with Domain-Specific Experts

Dec 08, 2024
Via arXiv

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

Dec 03, 2024
Via arXiv

Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding

Nov 25, 2024
Via arXiv

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Oct 29, 2024
Via arXiv

HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation

Oct 28, 2024
Via arXiv

DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization

Oct 22, 2024
Via arXiv

SiamSeg: Self-Training with Contrastive Learning for Unsupervised Domain Adaptation in Remote Sensing

Oct 17, 2024
Via arXiv

Order-aware Interactive Segmentation

Oct 17, 2024
Via arXiv