Xin Luna Dong

ConfQA: Answer Only If You Are Confident

Jun 08, 2025

Proactive Assistant Dialogue Generation from Streaming Egocentric Videos

Jun 06, 2025

Large Language Models, Knowledge Graphs and Search Engines: A Crossroads for Answering Users' Questions

Jan 12, 2025

VisualLens: Personalization through Visual History

Nov 25, 2024

Aligning Generalisation Between Humans and Machines

Nov 23, 2024

Are Large Language Models a Good Replacement of Taxonomies?

Jun 17, 2024

CRAG -- Comprehensive RAG Benchmark

Jun 07, 2024

SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM

Mar 07, 2024

Large Language Models as Zero-shot Dialogue State Tracker through Function Calling

Feb 16, 2024

Lumos: Empowering Multimodal LLMs with Scene Text Recognition

Feb 12, 2024