Konrad Staniszewski

Analysing The Impact of Sequence Composition on Language Model Pre-Training

Feb 21, 2024
Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Miłoś, Yuxiang Wu, Pasquale Minervini

Structured Packing in LLM Training Improves Long Context Utilization

Jan 02, 2024
Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur, Henryk Michalewski, Łukasz Kuciński, Piotr Miłoś

Focused Transformer: Contrastive Training for Context Scaling

Jul 06, 2023
Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś
