Yuanzhi Li

Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts

Sep 02, 2024
Via arXiv

Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

Aug 29, 2024
Via arXiv

Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

Jul 29, 2024
Via arXiv

How Does Overparameterization Affect Features?

Jul 01, 2024
Via arXiv

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Apr 23, 2024
Via arXiv

AgentKit: Flow Engineering with Graphs, not Coding

Apr 17, 2024
Via arXiv

VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?

Apr 09, 2024
Via arXiv

Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws

Apr 08, 2024
Via arXiv

Role of Locality and Weight Sharing in Image-Based Tasks: A Sample Complexity Separation between CNNs, LCNs, and FCNs

Mar 23, 2024
Via arXiv

Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning

Mar 01, 2024
Via arXiv