Shuming Ma

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Feb 27, 2024
Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei

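The "1.58 bits" in the title is the information content of a ternary weight: each weight takes one of three values {-1, 0, +1}, and log2(3) ≈ 1.585. A minimal sketch of an absmean-style ternary quantizer in that spirit (the function name and epsilon are illustrative, not taken from the paper):

    import numpy as np

    def ternary_quantize(W, eps=1e-6):
        # Scale by the mean absolute value of the weight matrix, then
        # round every entry to the nearest value in {-1, 0, +1}.
        gamma = np.abs(W).mean() + eps
        W_q = np.clip(np.round(W / gamma), -1.0, 1.0)
        return W_q, gamma  # gamma is kept to rescale outputs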

When an Image is Worth 1,024 x 1,024 Words: A Case Study in Computational Pathology
Dec 06, 2023
Wenhui Wang, Shuming Ma, Hanwen Xu, Naoto Usuyama, Jiayu Ding, Hoifung Poon, Furu Wei

Auto-ICL: In-Context Learning without Human Supervision
Nov 15, 2023
Jinghan Yang, Shuming Ma, Furu Wei

BitNet: Scaling 1-bit Transformers for Large Language Models
Oct 17, 2023
Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Fan Yang, Ruiping Wang, Yi Wu, Furu Wei

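BitNet's core move is swapping the Transformer's linear projections for 1-bit ones: weights are binarized to {-1, +1} with a real-valued scale so the binary matrix still approximates the original. A rough sketch of the weight binarization alone (the paper's BitLinear layer also quantizes activations and adds normalization, which this omits):

    import numpy as np

    def binarize_weights(W):
        # Center the weights, take their sign, and keep a scalar scale
        # alpha so that alpha * sign(W - mean) approximates W.
        Wc = W - W.mean()
        alpha = np.abs(Wc).mean()
        return alpha * np.sign(Wc)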

Kosmos-2.5: A Multimodal Literate Model
Sep 20, 2023
Tengchao Lv, Yupan Huang, Jingye Chen, Lei Cui, Shuming Ma, Yaoyao Chang, Shaohan Huang, Wenhui Wang, Li Dong, Weiyao Luo, Shaoxiang Wu, Guoxin Wang, Cha Zhang, Furu Wei

Retentive Network: A Successor to Transformer for Large Language Models
Aug 09, 2023
Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei

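Retention, the attention replacement proposed here, has an equivalent recurrent form, which is what makes inference cost constant per token: a per-head state matrix is decayed and updated once per step. A minimal single-head sketch of that recurrence (the decay value is illustrative, and the position-dependent rotation is omitted):

    import numpy as np

    def recurrent_retention(Q, K, V, gamma=0.9):
        # Q, K: (seq_len, d_k); V: (seq_len, d_v).
        # State S accumulates outer products k v^T with decay gamma;
        # the output at step n is q_n @ S_n.
        S = np.zeros((Q.shape[1], V.shape[1]))
        out = []
        for q, k, v in zip(Q, K, V):
            S = gamma * S + np.outer(k, v)
            out.append(q @ S)
        return np.stack(out)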

LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jul 19, 2023
Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei

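LongNet's mechanism is dilated attention: the sequence is cut into segments and each query attends only to a dilated subsample of its segment, with segment length and dilation growing together so cost stays near-linear in sequence length. A toy sketch of the index selection only (segment and dilation values are illustrative):

    def dilated_indices(seq_len, segment=8, dilation=2):
        # Within each segment, keep every `dilation`-th position;
        # attention is then computed only among the kept positions.
        return [list(range(start, min(start + segment, seq_len), dilation))
                for start in range(0, seq_len, segment)]

    # 16 tokens, segments of 8, dilation 2:
    print(dilated_indices(16))  # [[0, 2, 4, 6], [8, 10, 12, 14]]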

Kosmos-2: Grounding Multimodal Large Language Models to the World
Jul 13, 2023
Zhiliang Peng, Wenhui Wang, Li Dong, Yaru Hao, Shaohan Huang, Shuming Ma, Furu Wei
