Sebastian Jaszczur
Scaling Laws for Fine-Grained Mixture of Experts

Feb 12, 2024
Jakub Krajewski, Jan Ludziejewski, Kamil Adamczewski, Maciej Pióro, Michał Krutul, Szymon Antoniak, Kamil Ciebiera, Krystian Król, Tomasz Odrzygóźdź, Piotr Sankowski, Marek Cygan, Sebastian Jaszczur


MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

Jan 08, 2024
Maciej Pióro, Kamil Ciebiera, Krystian Król, Jan Ludziejewski, Sebastian Jaszczur


Structured Packing in LLM Training Improves Long Context Utilization

Jan 02, 2024
Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur, Henryk Michalewski, Łukasz Kuciński, Piotr Miłoś


Mixture of Tokens: Efficient LLMs through Cross-Example Aggregation

Oct 24, 2023
Szymon Antoniak, Sebastian Jaszczur, Michał Krutul, Maciej Pióro, Jakub Krajewski, Jan Ludziejewski, Tomasz Odrzygóźdź, Marek Cygan


Sparse is Enough in Scaling Transformers

Nov 24, 2021
Sebastian Jaszczur, Aakanksha Chowdhery, Afroz Mohiuddin, Łukasz Kaiser, Wojciech Gajewski, Henryk Michalewski, Jonni Kanerva


Neural heuristics for SAT solving

May 27, 2020
Sebastian Jaszczur, Michał Łuszczyk, Henryk Michalewski
