Mostafa Dehghani

End-to-End Spatio-Temporal Action Localisation with Video Transformers

Apr 24, 2023
Via arXiv

Scaling Vision Transformers to 22 Billion Parameters

Feb 10, 2023
Via arXiv

Dual PatchNorm

Feb 06, 2023
Via arXiv

Adaptive Computation with Elastic Input Sequence

Jan 30, 2023
Via arXiv

DSI++: Updating Transformer Memory with New Documents

Dec 19, 2022
Via arXiv

Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints

Dec 09, 2022
Via arXiv

Automated Deep Aberration Detection from Chromosome Karyotype Images

Nov 30, 2022
Via arXiv

Scaling Instruction-Finetuned Language Models

Oct 20, 2022
Via arXiv

Transcending Scaling Laws with 0.1% Extra Compute

Oct 20, 2022
Via arXiv

$Λ$-DARTS: Mitigating Performance Collapse by Harmonizing Operation Selection among Cells

Oct 14, 2022
Via arXiv