Bingbin Liu

TinyGSM: achieving >80% on GSM8k with small language models

Dec 14, 2023

Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars

Dec 03, 2023

Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation

Jun 01, 2023

Exposing Attention Glitches with Flip-Flop Language Modeling

Jun 01, 2023

Transformers Learn Shortcuts to Automata

Oct 19, 2022

Masked prediction tasks: a parameter identifiability view

Feb 18, 2022

Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation

Oct 21, 2021

Contrastive learning of strong-mixing continuous-time stochastic processes

Mar 03, 2021

Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction

Feb 20, 2020

Learning to Decompose and Disentangle Representations for Video Prediction

Oct 17, 2018