Jianfeng Gao

TRA: Better Length Generalisation with Threshold Relative Attention

Apr 02, 2025
Via arXiv

Towards Understanding Graphical Perception in Large Multimodal Models

Mar 13, 2025
Via arXiv

A Survey on Post-training of Large Language Models

Mar 08, 2025
Via arXiv

Magma: A Foundation Model for Multimodal AI Agents

Feb 18, 2025
Via arXiv

On Memory Construction and Retrieval for Personalized Conversational Agents

Feb 08, 2025
Via arXiv

Compositional Generalization Across Distributional Shifts with Sparse Tree Operations

Dec 18, 2024
Via arXiv

TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies

Dec 13, 2024
Via arXiv

SCBench: A KV Cache-Centric Analysis of Long-Context Methods

Dec 13, 2024
Via arXiv

OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation

Dec 12, 2024
Via arXiv

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion

Dec 05, 2024
Via arXiv