Kexin Huang

TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets

Jun 30, 2024
Via arXiv

ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models

Jun 24, 2024
Via arXiv

AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval

Jun 18, 2024
Via arXiv

Optimizing Large Model Training through Overlapped Activation Recomputation

Jun 13, 2024
Via arXiv

MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models

Jun 11, 2024
Via arXiv

STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases

Apr 19, 2024
Via arXiv

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Jan 29, 2024
Via arXiv

Relational Deep Learning: Graph Representation Learning on Relational Databases

Dec 07, 2023
Via arXiv

Fake Alignment: Are LLMs Really Aligned Well?

Nov 14, 2023
Via arXiv

Flames: Benchmarking Value Alignment of Chinese Large Language Models

Nov 12, 2023
Via arXiv