Shai Shalev-Shwartz

Jamba: A Hybrid Transformer-Mamba Language Model

Mar 28, 2024
Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz, Omri Abend, Raz Alon, Tomer Asida, Amir Bergman, Roman Glozman, Michael Gokhman, Avashalom Manevich, Nir Ratner, Noam Rozen, Erez Shwartz, Mor Zusman, Yoav Shoham

Managing AI Risks in an Era of Rapid Progress

Oct 26, 2023
Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila McIlraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner, Sören Mindermann

SubTuning: Efficient Finetuning for Multi-Task Learning

Feb 14, 2023
Gal Kaplun, Andrey Gurevich, Tal Swisa, Mazor David, Shai Shalev-Shwartz, Eran Malach

MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning

May 01, 2022
Ehud Karpas, Omri Abend, Yonatan Belinkov, Barak Lenz, Opher Lieber, Nir Ratner, Yoav Shoham, Hofit Bata, Yoav Levine, Kevin Leyton-Brown, Dor Muhlgay, Noam Rozen, Erez Schwartz, Gal Shachaf, Shai Shalev-Shwartz, Amnon Shashua, Moshe Tenenholtz

Standing on the Shoulders of Giant Frozen Language Models

Apr 21, 2022
Yoav Levine, Itay Dalmedigos, Ori Ram, Yoel Zeldes, Daniel Jannai, Dor Muhlgay, Yoni Osin, Opher Lieber, Barak Lenz, Shai Shalev-Shwartz, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham

Knowledge Distillation: Bad Models Can Be Good Role Models

Mar 28, 2022
Gal Kaplun, Eran Malach, Preetum Nakkiran, Shai Shalev-Shwartz

The Connection Between Approximation, Depth Separation and Learnability in Neural Networks

Jan 31, 2021
Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir

Computational Separation Between Convolutional and Fully-Connected Networks

Oct 03, 2020
Eran Malach, Shai Shalev-Shwartz

When Hardness of Approximation Meets Hardness of Learning

Aug 23, 2020
Eran Malach, Shai Shalev-Shwartz
