
Xiaodong Liu

Efficient Long Sequence Modeling via State Space Augmented Transformer

Dec 15, 2022

AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning

Nov 02, 2022

AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers

Oct 14, 2022

Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering

Oct 11, 2022

PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection

Sep 06, 2022

Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models

Aug 30, 2022

AdaMix: Mixture-of-Adapter for Parameter-efficient Tuning of Large Language Models

May 24, 2022

Visually-Augmented Language Modeling

May 20, 2022

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals

Apr 16, 2022

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer

Mar 28, 2022