Xiaodong Liu

Model Tells Itself Where to Attend: Faithfulness Meets Automatic Attention Steering

Sep 16, 2024
Via arXiv

Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning

Aug 26, 2024
Via arXiv

Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts

Jul 12, 2024
Via arXiv

DefSent+: Improving sentence embeddings of language models by projecting definition sentences into a quasi-isotropic or isotropic vector space of unlimited dictionary entries

May 25, 2024
Via arXiv

SWEA: Changing Factual Knowledge in Large Language Models via Subject Word Embedding Altering

Jan 31, 2024
Via arXiv

Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning

Jan 25, 2024
Via arXiv

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Jan 05, 2024
Via arXiv

Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs

Nov 03, 2023
Via arXiv

Automatic Hallucination Assessment for Aligned Large Language Models via Transferable Adversarial Attacks

Oct 19, 2023
Via arXiv

Fast-ELECTRA for Efficient Pre-training

Oct 11, 2023
Via arXiv