Yikang Shen

LaMAGIC2: Advanced Circuit Formulations for Language Model-Based Analog Topology Generation

Jun 11, 2025
Via arXiv

FlashFormer: Whole-Model Kernels for Efficient Low-Batch Inference

May 28, 2025
Via arXiv

PaTH Attention: Position Encoding via Accumulating Householder Transformations

May 22, 2025
Via arXiv

Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning

Apr 04, 2025
Via arXiv

Stick-breaking Attention

Oct 23, 2024
Via arXiv

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Aug 23, 2024
Via arXiv

FlexAttention for Efficient High-Resolution Vision-Language Models

Jul 29, 2024
Via arXiv

Scaling Granite Code Models to 128K Context

Jul 18, 2024
Via arXiv

The infrastructure powering IBM's Gen AI model development

Jul 07, 2024
Via arXiv

Octo-planner: On-device Language Model for Planner-Action Agents

Jun 26, 2024
Via arXiv